
9 November 2011

Studies that withhold replication data are more likely to have errors

We already knew that scholars who provide replication data get cited more. Now we know that they are also more likely to be right! Paper by Wicherts, Bakker, and Molenaar here. Blog post by Gelman here.

The authors requested replication data from 49 psychology studies. Amazingly, many of the original authors did not comply, even though they were explicitly under contract with the journals to provide the data.

1) Papers whose authors withheld data had more reporting errors, meaning that the reported p-value differed from the correct p-value as recalculated from the coefficient and standard error reported in the paper. I'd really like to think that these were all just innocent typos, but: in seven papers, these errors reversed findings. None of those seven authors shared their data.
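The consistency check behind this finding is easy to reproduce yourself: recompute the two-sided p-value from the reported coefficient and standard error, and compare it with the reported p-value. A minimal sketch, assuming a normal (z) test statistic; the numbers below are hypothetical illustrations, not figures from the paper:

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value for a normal test statistic z = coef / se."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

# Hypothetical example: suppose a paper reports coef = 0.15, se = 0.08,
# and p = 0.04. Recomputing gives roughly p = 0.06 -- a reporting error,
# and one that flips significance at the 0.05 level.
print(two_sided_p(0.15, 0.08))
```

This is exactly the kind of discrepancy that can be caught from the published tables alone, with no access to the underlying data.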

2) Papers whose authors withheld data tended to have larger p-values, meaning that their results were not as "strong" in some sense. This interpretation tortures the idea of the p-value a little bit, but it certainly reflects how many researchers think about p-values. It's striking that researchers with "weaker" results were less likely to provide data. It also suggests that researchers who get a range of p-values from different, plausible models tend to report the p-value just below 0.05 rather than the one just above. But then, we already knew that.

This is frightening, not least because most of these were lab experiments, where we tend to think that the results are less sensitive to analyst manipulation because of strong design. Also, these are only the problems that were obvious without access to the replication data.

Most responses to this study include appeals for better data-sharing standards, but I don't think that's necessary. As long as we know which authors provide replication data and which don't, we can all update accordingly.

Posted by Richard Nielsen at 7:48 PM