Beyond Standard Errors, Part I: What Makes an Inference Prone to Survive Rosenbaum-Type Sensitivity Tests?

Jens Hainmueller

Stimulated by the lectures in Statistics 214 (Causal Inference in the Biomedical and Social Sciences), Holger Kern and I have been thinking about Rosenbaum-type tests for sensitivity to hidden bias. Hidden bias is pervasive in observational settings and these sensitivity tests are a tool to deal with it. When done with your inference, it seems constructive to replace the usual qualitative statement that hidden bias “may be a problem��? with a precise quantitative statement like “in order to account for my estimated effect, a hidden bias has to be of magnitude X.��? No?

Imagine you are (once again) estimating the causal effect of smoking on cancer and you have successfully adjusted for differences in observed covariates. Then you estimate the “causal��? effect of smoking and you’re done. But wait a minute. Maybe subjects who appear similar in terms of their observed covariates actually differ in terms of important unmeasured covariates. Maybe there exists a smoking gene that causes cancer and makes people smoke. Did you achieve balance on the smoking gene? You have no clue. Are your results sensitive to this hidden bias? How big must the hidden bias be to account for your findings? Again, you have no clue (and so neither does the reader of your article).

Enter stage Rosenbaum type sensitivity tests. These come in different forms but the basic idea is similar in all of them. We have a measure, call it (for lack of latex in the blog) R, which gives the degree to which your particular smoking study may deviate from a study that’s free of hidden bias. You assume that two subjects with the same X may nonetheless differ in terms of some unobserved covariates, so that one subject has an odds of receiving the treatment that is up to Gamma ≥ 1 times greater than the odds for another subject.. So, for example, Gamma=1 would mean your study is indeed free of hidden bias (like a big randomized experiment), and Gamma=4 means that two subjects who are similar on their observed X can differ on unobservables such that one could be four times as likely as the other to receive treatment.

The key idea of the sensitivity test is to specify different values of Gamma and check if the inferences change. If your results break down at Gamma values just above 1 already, this is bad new. We probably should not trust your findings, because the difference in outcome data you found may not be caused by your treatment but may instead be due to an unobserved covariate that you did not adjust for. But if the inferences hold at big values of Gamma, let’s say 7, then your results seems very robust to hidden bias. (That’s what happened in the smoking on cancer case btw). Sensitivity tests allow you to shift the burden of proof back to the critics who laments about hidden bias: Please, Mr. Knows it all, go and find me this “magic omitted variable��? which is so extremely imbalanced but strongly related to treatment assignment that it is driving my results.

More on this subject in a subsequent post.

Posted by Jens Hainmueller at 4:24 AM