7 November 2009
A friend recently pointed me to a 2007 New Republic article in which the author, Noam Scheiber, argues that the "Freakonomics" phenomenon is lamentable because it represents a trend toward research in which clever identification strategies are prized over attempts to answer what Scheiber calls "truly deep questions." Although two years have passed and a second Levitt and Dubner book has since been published, the article caught my attention because I have been considering a related issue of late. We are all well aware of how difficult it is to make causal inferences in the social sciences, so it is not surprising that researchers are drawn to settings in which some source of exogenous variation allows for identification of the influence of a specific causal factor. In fact, progress on those "truly deep questions" depends in part on this type of work. However, a focus on clean identification has some potentially negative implications. Scheiber names one: answering questions of peripheral interest. A second, which is of greater concern to me, is concentrating on population subgroups that may or may not be of scientific interest in and of themselves and that, in either case, are unable to provide direct insights into broader population dynamics.
Thanks to Imbens and Angrist, we know that even when it is not possible to identify the population average effect of a "treatment" (i.e., causal factor of interest) on a given outcome, it is often possible to identify a "local average treatment effect," that is, the average effect of a treatment for the subpopulation whose treatment status is affected by changes in the exogenous regressor. This subpopulation is composed of so-called "compliers," who will take the treatment when assigned to take it and will not take it when not assigned. Sometimes this subpopulation is of scientific or policy interest (for example, we may be interested in knowing the effect of additional schooling on earnings for those students who might drop out of high school but for compulsory education laws). Oftentimes, it is not. In contrast, the broader population and the portion of the population that receives treatment are almost always of interest. These groups are certainly policy-relevant (it would be misleading to project the effect of a drug on public health based only on the drug's effect amongst those who were induced to take the drug) and they are needed to generate "stylized facts" that help us organize our understanding of the social world. (Also, these groups can be observed, whereas compliers generally cannot be identified individually, because we never see what a given person would have done under the other assignment.)
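To make the estimand concrete, here is the usual statement of the result for the simplest case of a binary instrument. The notation is my own shorthand rather than anything taken from the articles discussed above: Z is the exogenous assignment, D(z) is the treatment a person would take under assignment z, and Y(d) is the outcome under treatment status d. The equality holds under the standard assumptions that Z is as good as randomly assigned, affects Y only through D, never pushes anyone away from treatment (monotonicity), and moves treatment for at least some people.

```latex
% Assumed notation: Z_i = binary instrument, D_i = treatment received,
% Y_i = observed outcome, D_i(z) and Y_i(d) = potential treatments and outcomes.
\[
\tau_{\mathrm{LATE}}
  = E\bigl[\,Y_i(1) - Y_i(0) \mid D_i(1) > D_i(0)\,\bigr]
  = \frac{E[\,Y_i \mid Z_i = 1\,] - E[\,Y_i \mid Z_i = 0\,]}
         {E[\,D_i \mid Z_i = 1\,] - E[\,D_i \mid Z_i = 0\,]}
\]
```

The conditioning event D(1) > D(0) is what defines a complier. Nothing on the right-hand side speaks to people whose treatment status does not respond to Z.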
Unfortunately, when treatment effects are heterogeneous, the identified local average effect does not provide direct information about the wider population. This is problematic since treatment effects are likely to be heterogeneous in social science applications. In fact, this heterogeneity is one of the reasons why identifying causal effects is so difficult (individuals' self-selection into a treatment status based in part on anticipated treatment effects induces endogeneity problems).
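A small simulation may help fix ideas. What follows is only a sketch under assumptions I have made up for illustration: the population shares of never-takers, compliers, and always-takers, the effect sizes attached to each group, and the function name `simulate` are all hypothetical. Always-takers are given the largest gains, to mimic self-selection on anticipated effects. The instrumental-variables (Wald) ratio is then compared with the average effect in the whole population and among the treated.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Compare the Wald/IV estimate with the population and treated averages.

    All shares and effect sizes below are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Latent types: 0 = never-taker, 1 = complier, 2 = always-taker.
    types = rng.choice(3, size=n, p=[0.4, 0.3, 0.3])

    # Heterogeneous treatment effects that differ by type.
    effect = np.array([0.5, 1.0, 3.0])[types]

    # Randomly assigned binary instrument.
    z = rng.integers(0, 2, size=n)

    # Treatment actually taken: always-takers take it regardless,
    # compliers follow the assignment, never-takers never take it.
    d = np.where(types == 2, 1, np.where(types == 1, z, 0))

    # Observed outcome: untreated baseline plus the effect if treated.
    y0 = rng.normal(0.0, 1.0, size=n)
    y = y0 + effect * d

    # Wald estimator: reduced form divided by first stage.
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    wald = reduced_form / first_stage

    ate = effect.mean()          # average effect in the whole population
    att = effect[d == 1].mean()  # average effect among those actually treated
    return wald, ate, att


if __name__ == "__main__":
    wald, ate, att = simulate()
    print(f"Wald (LATE) estimate: {wald:.2f}")
    print(f"Population average effect: {ate:.2f}")
    print(f"Average effect on the treated: {att:.2f}")
```

By construction, the complier effect here is 1.0 while the population average is 0.4(0.5) + 0.3(1.0) + 0.3(3.0) = 1.4, so the instrument recovers the former and is silent about the latter. If all three groups had the same effect, the three numbers would coincide, which is why heterogeneity is the crux of the problem.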
A number of demographers have discussed the problem of extrapolating local average treatment effect estimates to the broader population. Greg Duncan, in his presidential address to the Population Association of America, stated that although causal inference is "often facilitated by eschewing full population representation in favor of an examination of an exceedingly small but strategically selected portion of a general population with the 'right kind' of variation in the key independent variable of interest.... a population-based understanding of causal effects should be our principal goal." Robert Moffitt writes that although "some type of implicit weighting is needed" to help us understand how to trade off internal and external validity, "this problem has not really been addressed in the applied research community." Some researchers have suggested using bounds for average treatment effects that are not point-identified (for example, Manski). Of course, the usefulness of bounding techniques depends on the tightness of the bounds, which in turn depends on what assumptions we are willing to impose - and it is exactly scholars' discomfort with prevailing assumptions (e.g., lack of correlation between the error and the treatment indicator) that drove the current focus on non-representative population subgroups. It seems to me that there is still work to be done to connect subpopulation causal estimates to broader population trends. I would be interested to hear of work in this area that you think is promising.