15 April 2010

The inevitable R backlash

A blog post by Dr. AnnaMaria De Mars is floating around in which she speculates on what the "next big thing" is going to be. Apparently, it is data visualization and the analysis of unstructured data, but not R:

Contrary to what some people seem to think, R is definitely not the next big thing, either. I am always surprised when people ask me why I think that, because to my mind it is obvious...I know that R is free and I am actually a Unix fan and think Open Source software is a great idea. However, for me personally and for most users, both individual and organizational, the much greater cost of software is the time it takes to install it, maintain it, learn it and document it. On that, R is an epic fail. It does NOT fit with the way the vast majority of people in the world use computers. The vast majority of people are NOT programmers. They are used to looking at things and clicking on things.

(I am not sure how a "non-programmer" is going to be able to analyze unstructured data or create wonderful visualizations, but that is beside the point.)

I am sympathetic to the ease-of-use argument, or the "not everyone is a programmer" argument. That argument has become quite heated in the debate over the Apple iPad in the last few months. The iPad succeeds by simplifying the act of content consumption, which is fantastic.

The act of content creation is more fickle and has always required special tools, and running statistical analyses falls firmly into content creation. While it is true that most people are not programmers, it is also true that most people are not creating statistical content. Being able to program gives you an agility in data analysis that large statistical software packages cannot provide. They move too slowly.

R's core functionality moves fairly slowly as well, but it gives you the tools you need to implement basically any algorithm or any statistical model. This is leading to a lot of innovation by small groups of users, who create packages to fill voids. R feels more like a programming language than a unified piece of software (libraries! command-line!), but this is what makes it flexible.

And if we are being honest with ourselves, there is a fundamental fact: point-and-click interfaces do not promote replicability. A script records every step of an analysis, while a sequence of clicks usually leaves nothing for someone else to rerun. This might be fine in the private sector; I am not sure. But in the academic world, being able to replicate a finding is crucial.

Posted by Matt Blackwell at 8:27 AM