13 April 2006

Data from China: Land of Plenty? (I)

Sebastian Bauhoff

While the media keep proclaiming that this century belongs to China, many researchers are getting excited about new opportunities for data collection and data access. For the past few decades, many development researchers have focused on India because of its regional variation and good infrastructure for surveys. It seems that China now holds a similar promise, and could provide an interesting comparison to India.

I recently started collecting information on China (here); below are some highlights. If you know of more surveys, do let me know.

Probably the best-known micro-survey at this point is the China Health and Nutrition Survey (CHNS), a panel with rounds in 1989, 1991, 1993, 1997, 2000, and 2004 (the 2006 wave is funded) that covers more than 4,000 households in 9 provinces. Though this is an amazing dataset, using it is not always easy. For example, there are problems linking individuals over time. New longitudinal master files are continuously released, but the fixes are sometimes hard to integrate into ongoing projects (the IDs are mixed up). There also seem to be some inconsistencies in the recording, especially in earlier rounds and for some key variables such as education. The best waves seem to be those of 1997 and 2000.
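For anyone facing this kind of linking problem, one rough sanity check is to look for IDs whose supposedly time-invariant characteristics change between waves. Below is a minimal Python sketch of that idea. The field names (`person_id`, `birth_year`, `sex`) and the toy records are made up for illustration and are not the actual CHNS variable names or data.

```python
from collections import defaultdict


def find_inconsistent_ids(records, id_key="person_id",
                          invariant_keys=("birth_year", "sex")):
    """Return IDs whose time-invariant fields take more than one value.

    `records` is an iterable of dicts, one per person-wave observation.
    The key names are hypothetical placeholders, not real survey variables.
    """
    # seen[person][field] = set of distinct values observed across waves
    seen = defaultdict(lambda: defaultdict(set))
    for row in records:
        for key in invariant_keys:
            value = row.get(key)
            if value is not None:  # skip missing values
                seen[row[id_key]][key].add(value)

    # Keep only the people (and fields) with conflicting values
    suspect = {}
    for pid, fields in seen.items():
        conflicts = {k: sorted(v, key=str) for k, v in fields.items() if len(v) > 1}
        if conflicts:
            suspect[pid] = conflicts
    return suspect


# Toy person-wave records (invented, not real survey data)
records = [
    {"person_id": "A1", "wave": 1997, "birth_year": 1960, "sex": "F"},
    {"person_id": "A1", "wave": 2000, "birth_year": 1960, "sex": "F"},
    {"person_id": "B2", "wave": 1997, "birth_year": 1975, "sex": "M"},
    {"person_id": "B2", "wave": 2000, "birth_year": 1948, "sex": "M"},
]

suspect = find_inconsistent_ids(records)
print(suspect)
```

A flagged ID does not prove that two different people share an identifier (it could also be a recording error), but it gives a short list of cases to check by hand before merging waves.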

There is also a World Bank Living Standards Measurement Study (LSMS) for China. That survey used standardized (internationally comparable?) questionnaires and was conducted in 780 households and 31 villages in 1996/7. For those interested in earlier periods, there is commercial data at the China Population Information and Research Center, which has mainly census-based data starting from 1982. The census itself is also available electronically now (with GIS maps), but there is a lively debate about how reliable the figures are and whether the definitions of key measures changed over time. Still, it should be good for basic cross-sectional analysis.

Posted by Sebastian Bauhoff at 6:00 AM