19 February 2009

Uncertainty Estimates and the Current Population Survey

The Annual Social and Economic Supplement (ASEC) to the Current Population Survey (CPS) is among the most widely used and influential data sets in the social sciences and in policymaking. For example, the much-cited figure of 45 million uninsured is a CPS estimate; Title I education funding is allocated using the CPS; and state outlays for the State Children's Health Insurance Program are also determined using the survey.

From the perspective of the social scientist, the CPS is a key research tool because of its large sample size (roughly 60,000 households) and because it is typically released publicly about 5-6 months after the survey is initially fielded. One major drawback, however, is that, unlike other major national surveys (the SIPP, the MEPS, and the NHIS, to name a few), the public release of the CPS data does not include the design variables needed to compute correct standard errors under the complex survey design. Instead, the CPS releases a series of adjustment factors for specific population subgroups (e.g., by race, income group, or state) that can be applied to uncertainty estimates. This approach breaks down in the case of regression: which adjustment factors does one use if the regression contains a rich array of covariates? As a result, much research using the CPS (which appears quite often in economics and health services research journals) proceeds either under the assumption of simple random sampling or with robust standard errors. These studies therefore likely understate their uncertainty, casting some doubt on their conclusions.

So what is the applied researcher to do? One simple method of approximation (suggested to me once by Alan Zaslavsky) is to exploit the fact that the CPS uses monthly rotation groups, each of which effectively replicates the CPS survey design. That is, one could produce a separate estimate for each monthly rotation group and use the variation across these estimates to approximate the uncertainty that comes from the survey design.
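As a rough illustration of the rotation-group idea, here is a minimal Python sketch. It assumes each record carries an outcome, a survey weight, and a rotation-group label; the function name and argument names are hypothetical and are not CPS variable names. It treats the groups as independent replicates and uses the standard random-groups variance formula, which is only an approximation to the design-based variance.

```python
from collections import defaultdict
from math import sqrt


def random_groups_se(values, weights, groups):
    """Weighted mean per group, then the random-groups standard error.

    values, weights, groups are parallel sequences; `groups` holds a
    rotation-group label for each record (hypothetical input layout).
    """
    num = defaultdict(float)  # sum of weight * value within each group
    den = defaultdict(float)  # sum of weights within each group
    for y, w, g in zip(values, weights, groups):
        num[g] += w * y
        den[g] += w

    # One weighted estimate per rotation group.
    estimates = [num[g] / den[g] for g in sorted(num)]
    k = len(estimates)
    if k < 2:
        raise ValueError("need at least two groups to estimate a variance")

    # Overall estimate: simple average of the group-level estimates.
    center = sum(estimates) / k
    # Variance of that average, treating groups as independent replicates.
    variance = sum((e - center) ** 2 for e in estimates) / (k * (k - 1))
    return center, sqrt(variance)


# Toy data: an indicator (e.g. uninsured = 1) in four made-up groups.
toy_values = [1, 0, 1, 1, 0, 0, 1, 0]
toy_weights = [1.0] * 8
toy_groups = [1, 1, 2, 2, 3, 3, 4, 4]
rg_estimate, rg_se = random_groups_se(toy_values, toy_weights, toy_groups)
```

With only a handful of rotation groups, the resulting standard error is itself noisy, so it is best read as a rough check rather than a precise figure.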

An alternative method (described in Davern et al., Inquiry 43(3), 2006) is to construct synthetic stratum and primary sampling unit (PSU) variables using available information in the survey (e.g., metropolitan statistical area, state, and household identifiers). In that article, the authors compared this synthetic method to the internal census files (which do have the complex survey design variables) by computing the ratio of the standard error from the synthetic method to the standard error from the internal census file. In general, the ratios were on the order of 0.75 to 0.85, which is closer to the internal estimates than the ratios of about 0.5-0.6 they found under the assumption of simple random sampling (i.e., making no adjustment for survey design) and using robust standard errors.
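To show how synthetic stratum and PSU labels would feed into a standard error, here is a minimal Python sketch of the usual with-replacement Taylor linearization for a weighted mean. It is not the procedure from Davern et al.; how the labels are built from geography and household identifiers is left to the caller, and all names below are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def linearized_se(values, weights, strata, psus):
    """Weighted mean and its linearized SE given stratum and PSU labels."""
    total_w = sum(weights)
    ratio = sum(w * y for y, w in zip(values, weights)) / total_w

    # Sum each record's linearized contribution within its PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for y, w, h, j in zip(values, weights, strata, psus):
        psu_totals[h][j] += w * (y - ratio) / total_w

    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            # Assumption: single-PSU strata are skipped (they add zero).
            continue
        mean_h = sum(totals.values()) / n_h
        spread = sum((t - mean_h) ** 2 for t in totals.values())
        variance += n_h / (n_h - 1) * spread
    return ratio, sqrt(variance)


# Toy data: four records, one stratum, two made-up PSUs.
toy_y = [1, 1, 0, 0]
toy_w = [1.0, 1.0, 1.0, 1.0]
toy_strata = ["A", "A", "A", "A"]
toy_psus = ["p1", "p1", "p2", "p2"]

lin_estimate, se_clustered = linearized_se(toy_y, toy_w, toy_strata, toy_psus)
# Same data with every record as its own PSU (no clustering).
_, se_unclustered = linearized_se(toy_y, toy_w, toy_strata, ["a", "b", "c", "d"])
```

In this toy case the clustered standard error is larger than the unclustered one because outcomes are identical within each PSU, which is the kind of understatement that ignoring the design can produce.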

Posted by John Graves at 8:23 AM