The Changing Evidence Base of Political Science Research
I believe the evidence base of political science and the related social sciences is beginning an underappreciated but historic change. As a result, our knowledge of and practical solutions for problems of government and politics will begin to grow at an enormous rate --- if we are ready.
For the last half-century, we have learned about human populations primarily through sample surveys taken every few years, end-of-period government statistics, and in-depth studies of particular places, people, or events. These sources of information have served us well but, as is widely known, are limited: Survey research produces occasional snapshots of random selections of isolated individuals from unknown geographic locations, and the increases in cell phone use and growing levels of nonresponse are crumbling its scientific foundation. Aggregate government statistics are valuable, but in many countries are of dubious validity and are reported only with intentionally limited resolution or after obscuring valuable information. One-off in-depth studies are highly informative but for the most part do not scale, are not representative, and do not measure long-term change.
In the next half-century, these existing data collection mechanisms will surely continue to be used and improved --- such as with inexpensive web surveys, if the problems with their representativeness can be addressed --- but they will be supplemented by the profusion of massive data bases already becoming available in many areas. Some produce extensive or continuous time information on individual political behavior and its causes, such as based on text sources (via automated information extraction from blogs, emails, speeches, government reports, and other web sources), electoral activity (via ballot images, precinct-level results, and individual-level registration, primary participation, and campaign contribution data), commercial activity (through every credit card and real estate transaction and via product RFIDs), geographic location (by carrying cell phones or passing through toll booths with Fastlane or EZPass transponders), health information (through digital medical records, hospital admittances, and accelerometers and other devices being included in cell phones), and others. Parts of the biological sciences are now effectively becoming social sciences, as developments in genomics, proteomics, metabolomics, and brain imaging produce huge numbers of person-level variables. Satellite imagery is increasing in scope, resolution, and availability. The internet is spawning numerous ways for individuals to interact, such as through social networking sites, social bookmarking, comments on blogs, participating in product reviews, and entering virtual worlds, all of which are possibilities for observation and experimentation. (Ensuring privacy and protection of personal information during the analyses to be conducted with this information will require considerable effort, care, and new work in research ethics, but should not be markedly more difficult than the now routine medical research involving experiments on human subjects with drugs and surgical procedures of unknown safety and efficacy.)
The analogue-to-digital transformation of numerous devices people own makes them work better, faster, and less expensively, but also enables each one to produce data in domains not previously accessible via systematic analysis. This includes everything from real-time changes in the web of contacts among people in society (the Bluetooth in your cell phone knows whether other people are nearby!) to records kept of individuals' web clicking, searches, and advertising clickthroughs. Partly as a result of new technology, governmental bureaucracies are improving their record keeping by moving from paper to electronic data bases, many of which are increasingly available to researchers. Some governmental policies are furthering these changes by requiring more data collection, such as the "No Child Left Behind Act" in education and via the proliferation of randomized policy experiments. All these changes are being supplemented by the replication movement in academia that encourages or requires social scientists to share data we have created with other researchers.
These data put numerous advances within our reach for the first time. Instead of trying to extract information from a few thousand activists' opinions about politics every two years, in the necessarily artificial conversation initiated by a survey interview, we can use new methods to mine the tens of millions of political opinions expressed daily in published blogs. Instead of studying the effects of context and interactions among people by asking respondents to recall the frequency and nature of their social contacts, we now have the ability to obtain a continuous record of all phone calls, emails, text messages, and in-person contacts among a much larger group. In place of dubious or nonexistent governmental statistics to study economic development or population spread in Africa, we can use satellite pictures of human-generated light at night or networks of roads and other infrastructure measured from space during the day. The number, extent, and variety of questions we can address are considerable and increasing fast.
If we can tackle the substantial privacy issues, build more powerful and more widely applicable theories with observable implications in these new forms of data, help create informatics techniques to ensure that the data are accessible and preserved, and develop new statistical methods adapted to the new types of data, political science can make more dramatic progress than ever before. The challenge before us as a profession, before each of us as researchers, and before the broader community of social scientists, is to prepare for the collection and analysis of these new data sources, to unlock the secrets they hold, and to use this new information to better understand and ameliorate the major problems that affect society and the well-being of human populations.
Posted by Gary King at October 1, 2007 8:05 AM