1 October 2007

The Changing Evidence Base of Political Science Research

Kay Schlozman, Norman Nie, and I are preparing an edited volume in honor of Sidney Verba. The volume is entitled Political Science: What Should We Know? What Should They Know?. Instead of the usual 10 or so chapters representing something other than each contributor's best work, we invited 100 scholars to write about 1,000 words each, basically one idea (similar to a blog entry), addressing one or both of the questions in the title. I include a draft of mine below. Comments welcome.

The Changing Evidence Base of Political Science Research

I believe the evidence base of political science and the related social sciences is beginning an underappreciated but historic change. As a result, our knowledge of and practical solutions for problems of government and politics will begin to grow at an enormous rate, if we are ready.

For the last half-century, we have learned about human populations primarily through sample surveys taken every few years, end-of-period government statistics, and in-depth studies of particular places, people, or events. These sources of information have served us well but, as is widely known, are limited: Survey research produces occasional snapshots of random selections of isolated individuals from unknown geographic locations, and increasing cell phone use and growing levels of nonresponse are eroding its scientific foundation. Aggregate government statistics are valuable, but in many countries are of dubious validity and are reported only with intentionally limited resolution or after obscuring valuable information. One-off in-depth studies are highly informative but for the most part do not scale, are not representative, and do not measure long-term change.

In the next half-century, these existing data collection mechanisms will surely continue to be used and improved (such as with inexpensive web surveys, if the problems with their representativeness can be addressed), but they will be supplemented by the profusion of massive databases already becoming available in many areas. Some produce extensive or continuous time information on individual political behavior and its causes, such as those based on text sources (via automated information extraction from blogs, emails, speeches, government reports, and other web sources), electoral activity (via ballot images, precinct-level results, and individual-level registration, primary participation, and campaign contribution data), commercial activity (through every credit card and real estate transaction and via product RFIDs), geographic location (from the cell phones people carry or the toll booths they pass through with Fastlane or EZPass transponders), health information (through digital medical records, hospital admissions, and accelerometers and other devices being included in cell phones), and others. Parts of the biological sciences are now effectively becoming social sciences, as developments in genomics, proteomics, metabolomics, and brain imaging produce huge numbers of person-level variables. Satellite imagery is increasing in scope, resolution, and availability. The internet is spawning numerous ways for individuals to interact, such as through social networking sites, social bookmarking, comments on blogs, participating in product reviews, and entering virtual worlds, all of which are possibilities for observation and experimentation.
(Ensuring privacy and protection of personal information during the analyses to be conducted with this information will require considerable effort, care, and new work in research ethics, but should not be markedly more difficult than the now routine medical research involving experiments on human subjects with drugs and surgical procedures of unknown safety and efficacy.)

The analogue-to-digital transformation of numerous devices people own makes them work better, faster, and less expensively, but also enables each one to produce data in domains not previously accessible via systematic analysis. This includes everything from real-time changes in the web of contacts among people in society (the bluetooth in your cell phone knows whether other people are nearby!) to records kept of individuals' web clicking, searches, and advertising clickthroughs. Partly as a result of new technology, governmental bureaucracies are improving their record keeping by moving from paper to electronic databases, many of which are increasingly available to researchers. Some governmental policies are furthering these changes by requiring more data collection, such as through the "No Child Left Behind Act" in education and the proliferation of randomized policy experiments. All these changes are being supplemented by the replication movement in academia that encourages or requires social scientists to share data we have created with other researchers.

These data put numerous advances within our reach for the first time. Instead of trying to extract information from a few thousand activists' opinions about politics every two years, in the necessarily artificial conversation initiated by a survey interview, we can use new methods to mine the tens of millions of political opinions expressed daily in published blogs. Instead of studying the effects of context and interactions among people by asking respondents to recall the frequency and nature of their social contacts, we now have the ability to obtain a continuous record of all phone calls, emails, text messages, and in-person contacts among a much larger group. In place of dubious or nonexistent governmental statistics to study economic development or population spread in Africa, we can use satellite pictures of human-generated light at night or networks of roads and other infrastructure measured from space during the day. The number, extent, and variety of questions we can address are considerable and increasing fast.

If we can tackle the substantial privacy issues, build more powerful and more widely applicable theories with observable implications in these new forms of data, help create informatics techniques to ensure that the data are accessible and preserved, and develop new statistical methods adapted to the new types of data, political science can make more dramatic progress than ever before. The challenge before us as a profession, before each of us as researchers, and before the broader community of social scientists, is to prepare for the collection and analysis of these new data sources, to unlock the secrets they hold, and to use this new information to better understand and ameliorate the major problems that affect society and the well-being of human populations.

original PDF version

Posted by Gary King at 8:05 AM