29 September 2006
With the 2006 election coming up soon, here are a couple of blogs that might appeal to both the political junkie and the methods geek in all of us. Political Arithmetik, a blog by Charles Franklin from Wisconsin, is full of cool graphs that illustrate the power of simple visualization and non-parametric techniques, something that we spend a lot of time talking about in the introductory methods courses in the Gov Department. (On a side note, I think that plots like this of presidential approval poll results, which you find on his site and others, have to be one of the best tools for illustrating sampling variability to students who are new to statistics.) Professor Franklin also contributes to another good polling blog, Mystery Pollster, run by pollster Mark Blumenthal. It just moved to a new site, which now has lots of state-level polling data for upcoming races. All in all, plenty of good stuff to distract you from the "serious" work of arguing about causal inference, etc.
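Sampling variability of this kind is easy to demonstrate with a quick simulation: hold the "true" approval rate fixed, draw many independent polls from it, and watch the estimates bounce around even though nothing about opinion has changed. Here is a minimal sketch in Python; the approval rate, poll size, and number of polls are illustrative assumptions, not figures from any of the sites above.

```python
import random

def simulate_polls(true_approval=0.40, n_respondents=600, n_polls=20, seed=1):
    """Simulate repeated approval polls of the same fixed electorate.

    Each poll asks n_respondents independent voters; the spread of the
    resulting estimates is pure sampling variability.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_polls):
        approvals = sum(rng.random() < true_approval for _ in range(n_respondents))
        estimates.append(approvals / n_respondents)
    return estimates

polls = simulate_polls()
print(min(polls), max(polls))  # the polls scatter around the true 0.40
```

With 600 respondents, each poll's standard error is about two percentage points, so the estimates routinely wander a few points on either side of 40% even though the underlying approval rate never moves.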
28 September 2006
In a 1986 JASA article, Paul Holland reported that he and Don Rubin had once made up the motto, “NO CAUSATION WITHOUT MANIPULATION.” The idea is that even in an observational study, causal inference cannot proceed unless and until the quantitative analyst identifies an intervention that hypothetically could be implemented (although Professor Holland accepts the idea that the manipulation may never be carried out for physical or ethical reasons). The idea of studying the causal effect of things that we as human beings could never influence is incoherent because such things could never be the subject of a randomized experiment.
My question: do we really adhere to this principle? Take the one causal link established via observational studies that pretty much everyone (even Professor Freedman, see below) agrees on: smoking causes lung cancer. Has anyone ever bothered to specify what manipulation to make people smoke is being contemplated? Aren’t we pretty sure it wouldn’t matter how we intervened, i.e., that however it happens that people smoke, those who smoke get lung cancer at a higher rate? (It might matter what they smoke, how much they smoke, perhaps even where and when, but what got them started and what keeps them at it?) If folks agree with me on this, what’s left of Professor Holland’s maxim?
Paul W. Holland, Statistics and Causal Inference, 81 J. Am. Stat. Ass’n 945, 959 (1986)
David Freedman, From Association to Causation: Some Remarks on the History of Statistics, 14 Stat. Sci. 243, 253 (1999)
27 September 2006
Here's something new to pick at, in addition to methods problems: coding issues. A recent Science (August 18, 2006, pages 979-982) article by Bruce Dohrenwend and colleagues reported on revised estimates of post-traumatic stress disorders of Vietnam veterans. See here for an NYT article. The new study indicates that some 18.7% of Vietnam veterans developed diagnosable post-traumatic stress, compared with earlier estimates of 30.9%. The difference comes mainly from using revised measures of diagnosis and exposure to combat for a subset of the individuals covered in the original data source, the 1988 National Vietnam Veterans' Readjustment Study (NVVRS). The authors added military records to come up with the new measures.
Given the political and financial importance (the military has a budget for mental health), this is quite a difference. One critical issue pointed out by the Science article is that the original study did not adequately control for veterans who had been diagnosed with mental health problems before being sent to combat. Just looking at the overall rates after combat is not a great study design. But this also makes me wonder about how the data were collected in the first place. Maybe the most disabled veterans didn’t reply to the survey, or were in such a state of illness that they couldn’t (or had died of related illnesses). The NVVRS is supposedly representative, but this would be an interesting point to examine.
This article also illustrates how important data, measures, and coding are in social science research these days. It seems that taking these issues more seriously should be part of the academic and policy process, just like replication should be (see here and here for some discussion of this issue). While study and sample design are under much scrutiny these days, there is still little discussion of sensitivity to coding and data. Given the difference they can make, this should change.
26 September 2006
I'm a little late into the game with this, but it's interesting enough that I'll post anyway. Several folks have commented on this paper by Gerber and Malhotra (which they linked to) about publication bias in political science. G&M looked at how many articles were published with significant (p<0.05) vs. non-significant results, and found -- not surprisingly -- that there were more papers with significant results than would be predicted by chance; and, secondly, that many of the significant results were suspiciously close to 0.05.
I guess this is indeed "publication bias" in the sense of "there is something causing articles with different statistical significance to be published differentially." But I just can't see this as something to be worried about. Why?
Well, first of all, there's plenty of good reason to be wary of publishing null results. I can't speak for political science, but in psychology, a result can be non-significant for many, many more boring reasons than that there is genuinely no effect. (And I can't imagine why this would be different in poli sci.) For instance, suppose you want to prove that there is no relation between 12-month-olds' abilities in task A and task B. It's not sufficient to show a null result. Maybe your sample size wasn't large enough. Maybe you're not actually succeeding in measuring their abilities in one or both of the tasks (this is notoriously difficult with babies, but it's no picnic with adults either). Maybe A and B are related, but the relation is mediated by some other factor that you happen to have controlled for. Et cetera. Now, this is not to say that no null results are meaningful or that null results should never be published, but a researcher -- quite rightly -- needs to do a lot more work to make a null result pass the smell test. And so it's a good thing, not a bad thing, that there are fewer null results published.
Secondly, I'm not even worried about the large number of studies that come in just barely significant. Maybe I'm young and naive, but I think it's probably less an indication of fudged data than a reflection of (quite reasonable) resource allocation. Take those same 12-month-old babies. If I get significant results with N=12, then I'm not going to run more babies just to make the results more significant. Since, rightly or wrongly, the gold standard is the p<0.05 threshold (which is another debate entirely), it makes little sense to waste time and other resources running superfluous subjects. Similarly, if I've run, say, 16 babies and my result is almost at p<0.05, I'm not going to stop; I'll run 4 more. Obviously there is an upper limit on the number of subjects, but -- given the essential arbitrariness of the 0.05 value -- I can't see this as a bad thing either.
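That stopping rule can itself be simulated, which shows mechanically how final p-values would pile up near 0.05 without anyone fudging data. Here is a toy sketch in Python, using a one-sample binomial test with a normal approximation rather than the infant-task correlations above; the effect size, sample sizes, and "almost significant" window are all my assumptions.

```python
import math
import random

def p_value(successes, n, p0=0.5):
    """Two-sided p-value from a normal approximation to a binomial test."""
    z = (successes / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def two_stage_study(rng, true_p=0.6, n1=16, n2=4, alpha=0.05):
    """Run n1 subjects; if the result is 'almost significant', add n2 more."""
    successes = sum(rng.random() < true_p for _ in range(n1))
    p = p_value(successes, n1)
    if alpha <= p < 0.15:  # close but not quite: run a few more subjects
        successes += sum(rng.random() < true_p for _ in range(n2))
        p = p_value(successes, n1 + n2)
    return p

rng = random.Random(7)
pvals = [two_stage_study(rng) for _ in range(10_000)]
near = sum(0.01 <= p < 0.05 for p in pvals) / len(pvals)
print(near)  # share of studies landing just under the threshold
```

The "run a few more if you're close" rule converts a slice of near misses into results just under the threshold, which is one way a literature could end up clustered just below 0.05 without any misconduct at all.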
This week the Applied Statistics Workshop will present a talk by Ben Hansen, Assistant Professor of Statistics at the University of Michigan. Professor Hansen graduated from Harvard College, magna cum laude, with a degree in Mathematics and Philosophy. He went on to win a Fulbright Fellowship to study philosophy at the University of Oslo, Norway, after which he earned his Ph.D. in Logic and Methodology of Science at the University of California, Berkeley.
Professor Hansen’s primary research interests involve causal inference in comparative studies, particularly observational studies in the social sciences. His publications appear in the Journal of Computational and Graphical Statistics, Bernoulli, Journal of the American Statistical Association, and Statistics and Probability Letters. He is currently working on providing methods for statistical adjustment that enable researchers to mount focused, specific analogies of their observational studies to randomized experiments.
Professor Hansen will present a talk entitled "Covariate balance in simple, stratified and clustered comparative studies." The working paper that accompanies the talk is available from the course website. The presentation will be at noon on Wednesday, September 27, in Room N354, CGIS North, 1737 Cambridge St. Lunch will be provided.
If you missed the workshop’s first meeting, you should check out the abstract of Jake Bowers’ talk, “Fixing Broken Experiments: A Proposal to Bolster the Case for Ignorability Using Subclassification and Full Matching”.
25 September 2006
In the next few weeks, the number of articles posted to this site is set to increase, partly because school's back in session and partly because we've recruited some new authors for the committee. This is a good thing in general. However, I know I work best on a deadline, so I tend to post more when the flow of articles is slower, and less when a lot of articles are being posted by the other authors.
To bring this back to the realm of science: am I taking the position of an economic free rider (or "freeloader", if you prefer) if I tend to post less frequently than other authors, or is someone in my position merely acting as a balancer, keeping the overall flow stable?
As for the "art", I doubt that this observation is opera-worthy, but it does tend to happen a lot in social situations I've seen -- certainly in an early episode of Seinfeld, where George wanted to split a cab but not have to pay for it because they "were going that way anyway".
24 September 2006
Last Fall I counted 51 faculty methods jobs posted in political science. I paid close attention because I was on a relevant search committee. This was particularly interesting because the equilibrium in past years was about five or so. Right now there are 39 methods jobs posted (subtracting non-tenure-track positions). Now some of these are listed under multiple fields, but one has to presume that listing the ad on the methods page is a signal.
Apparently we have US News and World Report to thank for fundamentally changing the labor market by making methodology the fifth "official" field of the discipline. A number of (non-methodologist) colleagues believed that I must be exaggerating since an order of magnitude difference seems ridiculous. Actually, it turns out that I was underestimating as Jan Box-Steffensmeier (president of the Society for Political Methodology and the APSA methods section) recently got a count of 61 from the APSA. I think their definition was a little broader than mine (perhaps including formal theory and research methods jobs at undergraduate-only institutions).
So an interesting question is how quickly does supply catch up to demand here? My theory is that it will occur rather slowly since the lead time for methods training seems to be longer than the lead time for other subfields. This is obviously good news for graduate students going on the market soon in this area. I'm curious about other opinions, but I think that this is a real change for the subfield.
19 September 2006
Andrew Fernandes, a fellow Canadian expat and PhD student at NC State, responded to my earlier request for advice on exploring a Dirichlet-type simplex.
Among other places, the idea is presented in the Wikipedia entry for Simplex. He suggests perturbing the cumulative sums of the parameter vector and then sorting the perturbed sums back into order, which yields a time-reversible proposal. This has the advantage of sending at most one parameter below zero (as opposed to an independent perturbation of each parameter, which can send many below zero), and of not pinning a high-valued parameter in place the way a standard Dirichlet proposal does.
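Here is a minimal sketch of that proposal as I understand it, in Python; the Gaussian jitter scale and the reject-when-out-of-bounds handling are my own assumptions, not part of the original suggestion.

```python
import random

def propose_simplex(p, scale=0.05, rng=random):
    """Propose a new point on the probability simplex by jittering the
    cumulative sums of p and sorting them back into order (per Andrew
    Fernandes's suggestion).

    Returns None when a perturbed break point leaves (0, 1), in which
    case a Metropolis sampler would simply reject the move.
    """
    # Interior break points of the stick: c_1 < c_2 < ... < c_{K-1}
    cuts = []
    total = 0.0
    for x in p[:-1]:
        total += x
        cuts.append(total)
    # Jitter each break point, then restore the ordering by sorting.
    jittered = sorted(c + rng.gauss(0.0, scale) for c in cuts)
    if jittered[0] <= 0.0 or jittered[-1] >= 1.0:
        return None  # at most one coordinate would go negative; reject
    cuts = [0.0] + jittered + [1.0]
    return [b - a for a, b in zip(cuts, cuts[1:])]

rng = random.Random(3)
current = [0.25, 0.25, 0.25, 0.25]
prop = None
while prop is None:
    prop = propose_simplex(current, rng=rng)
print(prop, sum(prop))  # components are positive and sum to one
```

Because the sort guarantees the break points stay ordered, all interior components of the proposal are non-negative by construction, and (as the post notes) the move is time-reversible, so it can serve as a random-walk step inside a Metropolis-Hastings sampler with out-of-bounds draws treated as rejections.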
15 September 2006
Ah, the beginning of fall term -- bringing with it the first anniversary of this blog (yay!), a return to our daily posting schedule (starting soon), and a question for you, our readers:
Do you have any feedback for us? Specifically, are there topics, issues, or themes you would like us to cover more (or less) than we do? Would you like to see more discussion of specific content and papers? More posts on higher-level, recurring issues in each of our fields (or across fields)? More musings about teaching, academia, or the sociology of science? Obviously the main factor in what we write about comes down to our whims and interests, but it's always nice to write things that people actually want to read.
In my specific case, I know that I try not to blog about many cognitive science and psychology topics that I think about if they aren't directly related to statistics or statistical methods in some way: I fear that it wouldn't be of interest to readers who come here for a blog about "Social Science Statistics". However, maybe I've been needlessly restrictive...?
So, readers, what are your opinions?
10 September 2006
The semester is about to start, which means it is math camp time at the Government Department. The very first topic is usually an introduction to dimensions, starting from R1 (lines), to R2 (planes), to R3 (three-dimensional space), to R4 (space plus time). Here is a nice flash animation (click on “imagining ten dimensions” on the left) that takes you a step further, from zero to ten dimensions in less than 5 minutes (including cool visual and acoustic effects). It doesn’t necessarily become more graspable as you ascend ... :-)