I read with some interest a recent NYTimes article about how cities are increasingly making reams of municipal data public. The article noted a growing trend among US municipalities toward making public data available and easy to digest. Among the cities taking the lead are San Francisco (with its DataSF website), New York (Data Mine), and Washington DC (D.C. Data Catalog). The Federal Government also hosts its own data site, Data.gov.
This trend doesn't seem to be limited to just US governments.
Over in the UK, Gordon Brown's government is hard at work on a new data site, data.gov.uk, which it hopes to launch early in 2010. In fact, the prime minister just today delivered a speech in which he extolled the virtues of data availability:
Releasing data can and must unleash the innovation and entrepreneurship at which Britain excels - one of the most powerful forces of change we can harness.
When, for example, figures on London's most dangerous roads for cyclists were published, an online map detailing where accidents happened was produced almost immediately to help cyclists avoid blackspots and reduce the numbers injured.
And after data on dentists went live, an iphone application was created to show people where the nearest surgery was to their current location.
And from April next year ordnance survey will open up information about administrative boundaries, postcode areas and mid-scale mapping.
All of this will be available for free commercial re-use, enabling people for the first time to take the material and easily turn it into applications, like fix my street or the postcode paper.
For social scientists, having access to more data is never a bad thing. More importantly, access to this otherwise mundane data may lessen our dependence on (notoriously unreliable) public opinion surveys. Instead of asking people how much they feel crime is affecting their particular neighborhood, we could measure it using the data provided by DataSF, data.gov.uk, and others. Instead of asking people how reliable or safe their local hospitals are, we'll be able to measure it using the same resources.
My point here is that it's often more useful for social scientists to observe how people actually behave than to ask them how they would behave.
Of course, all this depends on having access to data in its rawest form. From just the quick look I took at some of the websites, I saw a lot of data in processed form (for example, available only through iPhone apps or through summary statistics in PDF form). This kind of processing makes things more accessible to the casual data consumer, but vastly less useful for the social scientist ready and willing to do her own data analysis.
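To make the raw-versus-processed point concrete, here is a minimal sketch of what record-level data allows. Everything in it is an assumption for illustration: the records are made up, and the column names (incident_id, neighborhood, category) are hypothetical, not the schema of DataSF or any other real portal. The idea is simply that with one row per incident, a researcher can aggregate along whatever dimension her question requires, which a fixed PDF summary would not allow.

```python
import csv
import io
from collections import Counter

# Made-up incident records for illustration only. The column names are
# assumptions, not the schema of any real city data portal.
RAW_CSV = """incident_id,neighborhood,category
1,North,theft
2,North,assault
3,South,theft
4,North,theft
5,South,vandalism
"""


def incidents_by_neighborhood(csv_text, category=None):
    """Count incidents per neighborhood, optionally for a single category."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if category is None or row["category"] == category:
            counts[row["neighborhood"]] += 1
    return counts


# The same raw file answers two different questions.
all_counts = incidents_by_neighborhood(RAW_CSV)
theft_counts = incidents_by_neighborhood(RAW_CSV, category="theft")
print(dict(all_counts))
print(dict(theft_counts))
```

A summary table that reported only citywide totals could answer neither question; the raw rows answer both, along with any others the researcher thinks of later.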
It will also be interesting to see how governments, private companies, and academic institutions work together (or fail to work together) to make data available. Will Google step in to provide a search engine for these databases? Will governments make their data available on something like IQSS's Dataverse? In general, what's the best way to make data available both to researchers and to the public?
It seems like an exciting time for data availability. If folks have other thoughts on this -- or leads or tips on other municipalities or governments increasingly making their data available -- I'd be keen to hear them.
Posted by Maya Sen at December 7, 2009 12:42 PM