Can matching solve endogeneity?

I get asked this question from time to time, but when I got asked it several times on Friday, I guessed that something had gone down.

What went down was Chris Blattman offering a rant (his description, not mine) about the "cardinal sin of matching" -- the belief that matching can single-handedly solve endogeneity problems. Most of the questions I got went something like "Chris says matching can't help with endogeneity. You say it can. What gives?"

First, let me say that I agree with most of Chris' rant and I think that his blog post should be required reading for anyone using matching right now. There are too many people out there who think that matching is a magical method that fixes endogeneity automatically. It's not, and reading Chris' discussion should be the first step in a 12-step process for those of us who have drunk the matching Kool-Aid too hard.

Now for the statistics:

Basically, matching can solve your endogeneity/selection/confounding problem if you can measure the variables that influence treatment assignment. That is a big "if" and measurement is the key here. Matching is generally a pretty smart way to condition on observables, but it doesn't buy you anything if you believe that there are unobserved variables that systematically influence treatment assignment. Thus, if you think your regression is biased because of unobservables, then matching by itself won't help you. What you really need to do is go out and measure the unobserved confounders and condition on them.
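To make the "big if" concrete, here is a minimal simulation sketch, not anything from Chris' post. Everything in it is an assumption chosen for illustration: the variable names (x for an observed confounder, u for an unobserved one), the coefficients, the true treatment effect of 1.0, and the use of exact matching on a discrete x as a stand-in for matching in general.

```python
import numpy as np

TRUE_EFFECT = 1.0  # assumed constant treatment effect, for illustration only


def simulate(n, u_affects_treatment, seed=0):
    """Draw a hypothetical dataset with one observed and one unobserved variable."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 3, size=n)  # observed confounder, values 0, 1, 2
    u = rng.integers(0, 2, size=n)  # unobserved variable, values 0, 1
    # Treatment probability always depends on x, and on u only if requested.
    p = 0.2 + 0.2 * x + (0.2 * u if u_affects_treatment else 0.0)
    t = rng.random(n) < p  # boolean treatment indicator
    # Both x and u shift the outcome.
    y = TRUE_EFFECT * t + 2.0 * x + 2.0 * u + rng.normal(size=n)
    return x, t, y


def naive_diff(t, y):
    """Raw difference in means between treated and control units."""
    return y[t].mean() - y[~t].mean()


def exact_match_att(x, t, y):
    """Exact matching on x: within-stratum differences, weighted by treated counts."""
    total, weight = 0.0, 0
    for value in np.unique(x):
        treated = (x == value) & t
        control = (x == value) & ~t
        if treated.sum() == 0 or control.sum() == 0:
            continue  # no overlap in this stratum, so it cannot be matched
        total += treated.sum() * (y[treated].mean() - y[control].mean())
        weight += treated.sum()
    return total / weight


for hidden in (False, True):
    x, t, y = simulate(200_000, u_affects_treatment=hidden, seed=1)
    print(f"u affects treatment: {hidden}")
    print(f"  naive difference : {naive_diff(t, y):.2f}")
    print(f"  matched on x     : {exact_match_att(x, t, y):.2f}")
```

In this setup, when only x drives treatment, the matched estimate should land close to the true effect of 1.0 while the naive difference does not. Once u also drives treatment, matching on x alone stays biased, and nothing short of measuring u and conditioning on it fixes that.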

In the end, I think that people who like matching methods (and other conditioning methods) tend to believe that most confounders can be measured (perhaps with a lot of hard work) and that there aren't a lot of lurking unobservables. In contrast, people I talk to who are skeptical of matching almost always argue that there will always be problematic unobservables lurking no matter how hard you try to measure them. In general, these types of people prefer instrumental variables approaches (and tend to be economists rather than statisticians, interestingly enough).

Fair enough -- there may be lurking unobservables. Frankly, there's no way to get empirical traction on how many lurking unobservables are out there (by definition), so I think it comes down to subjective beliefs about the nature of the world. But what always gets me is that the same people who tell me that lurking unobservables are everywhere tend to be fairly comfortable making the types of exclusion restrictions that make IV approaches work. The crazy thing is that, just like matching, these restrictions rely on assumptions about unobservable causal pathways. The claim that an instrumental variable is valid is the claim that there are no unobserved (or observed) variables linking the instrument to the outcome except through the path of the instrumented variable. So it always puzzles me that the same people who think that lurking unobservables are everywhere in matching somehow think that all these lurking unobservables go away as soon as you call something an instrument and try to defend it as exogenous.
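The exclusion restriction can be illustrated the same way. The sketch below is a toy linear simulation under assumed values: an instrument z, an unobserved confounder u, a true effect of 1.0, a first-stage coefficient of 0.5, and a hypothetical `direct_effect` parameter for a path from z to the outcome that bypasses the treatment. It uses the simple Wald (ratio-of-covariances) IV estimator rather than any particular applied specification.

```python
import numpy as np


def wald_iv(n, direct_effect, first_stage=0.5, seed=0):
    """Wald IV estimate in a toy linear model (all coefficients are assumptions).

    direct_effect is the size of a path from the instrument z to the outcome y
    that does not go through the treatment t. A valid instrument has
    direct_effect == 0; anything else violates the exclusion restriction.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument
    u = rng.normal(size=n)  # unobserved confounder of t and y
    t = first_stage * z + u + rng.normal(size=n)
    y = 1.0 * t + 2.0 * u + direct_effect * z + rng.normal(size=n)
    # Wald estimator: cov(z, y) / cov(z, t)
    return np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]


print(f"valid instrument    : {wald_iv(200_000, direct_effect=0.0):.2f}")
print(f"exclusion violated  : {wald_iv(200_000, direct_effect=0.2):.2f}")
```

In this linear setup the estimator converges to the true effect plus `direct_effect / first_stage`, so even a small unobserved path from the instrument to the outcome gets scaled up when the first stage is weak.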

I'm pretty skeptical of most observational IV approaches -- unless you flipped the coins yourself or you can really tell me a plausible story about how nature flipped coins, I probably won't believe your instrument. So why am I falling into the reverse trap: believing that unobservables are more likely to undermine IV than conditioning approaches? Maybe I'm just wrong here and I need to become an even more extreme skeptic of most empirical research than I already am. But my sense is that the conditions for an IV to be valid are more knife-edge than the ignorability assumptions. Perhaps that's wishful thinking.

But wishful thinking aside, matching can help solve endogeneity problems if you can measure the variables that influence selection (and if there happens to be sufficient overlap, yadda, yadda). All those people out there who make blanket statements like "matching can't solve endogeneity" are either making the assumption that there are always lurking confounders or else they are just plain wrong.

Posted by Richard Nielsen at October 29, 2010 10:30 AM