Amazon Mechanical Turk for Data Entry Tasks

January 16, 2009

Yesterday I tried using Amazon's Mechanical Turk service for the first time to save myself from some data collection drudgery. I found it fascinating. For the right kind of task, and with a little bit of setup effort, it can drastically reduce the cost and hassle of getting good data compared to other methods (such as using research assistants).

Quick background on Mechanical Turk (MTurk): The service acts as a marketplace for jobs that can be done quickly over a web interface. "Requesters" (like me) submit tasks and specify how much they will pay for an acceptable response; "Workers" (known commonly as "Turkers") browse submitted tasks and choose ones to complete. A Requester could ask for all sorts of things (e.g. write me a publishable paper), but because you can't do much to filter the Turkers and they aren't paid for unacceptable work, the system works best for tasks that can be done quickly and in a fairly objective way. The canonical tasks described in the documentation are discrete, bite-sized tasks that could almost be done by a computer -- indicating whether a person appears in a photo, for example. Amazon bills the service as "Artificial Artificial Intelligence," because to the Requester it seems as if a very smart computer were solving the problem (while in fact it's a person). This is also the idea behind the name of the service, a reference to an 18th-century chess-playing automaton, known as The Turk, that actually had a person inside.

The task I had was to find the full text of a bunch of proposals from meeting agendas that were posted online. I had the URLs of the agendas and a brief description of each proposal, and I faced the task of looking up each one. I could almost automate the task (and was sorely tempted), but it would require coding time and manual error checking. I decided to try MTurk.

The ideal data collection task on MTurk is the common situation where you have a spreadsheet with a bunch of columns and you need someone to go through and do something pretty rote to fill out another column. That was my situation: for every proposal I had a column with the URL and a column with a summary of what was proposed, and I wanted someone to fill in the "full text" column. To do a task like this, you need to design a template that applies to each row in the spreadsheet, indicating how the data from the existing columns should appear and where the Turker should enter the data for the missing column. Then you upload the spreadsheet and a separate task is created for each row in the spreadsheet. If everything looks good you post the tasks and watch the data roll in.

To provide a little more detail: Once you sign up to be a Requester at the MTurk website, you start the process of designing your "HIT" (Human Intelligence Task). MTurk provides a number of templates to get you started. The easiest approach is to pick the "Blank Template," which is very poorly named, because the "Blank Template" is in fact full of various elements you might need in your HIT; just cut out the stuff you don't need and edit the rest. (Here it helps to know some HTML, but for most tasks you can probably get by without knowing much.) The key thing is that when you place a variable in the template (e.g. ${party_id}), it will be filled by an entry from your spreadsheet, based on the spreadsheet's column names. So a very simple HIT would be a template that says

Is this sentence offensive? ${sentence}

followed by buttons for "yes" and "no" (which you can get right from the "Blank Template"). If you then upload a CSV with a column entitled "sentence" and 100 rows, you will generate 100 HITs, one for each sentence.
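To make the row-by-row substitution concrete, here is a minimal Python sketch of the idea. It is not how MTurk itself is implemented; it just mimics the behavior described above using Python's standard string.Template, which happens to use the same ${name} syntax. The function name render_hits and the sample sentences are made up for illustration.

```python
import csv
import io
from string import Template

# Illustrative stand-in for a HIT template; a real MTurk template is an HTML form.
HIT_TEMPLATE = Template("Is this sentence offensive? ${sentence}")


def render_hits(csv_text):
    """Return one rendered HIT per CSV row, filling variables by column name."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [HIT_TEMPLATE.substitute(row) for row in rows]


# Hypothetical two-row spreadsheet with a single "sentence" column.
sample_csv = "sentence\nThe weather is nice today.\nYou are a fool.\n"
hits = render_hits(sample_csv)
for hit in hits:
    print(hit)
```

If the template names a variable that has no matching column, substitute raises a KeyError, which is a cheap way to catch a misnamed column before uploading anything.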

It was pretty quick for me to set up my HIT template, upload a CSV, and post my HITs.

Then the real fun begins. Within two minutes the first responses started coming in; I think the whole job (26 searches -- just a pilot) was done in about 20 minutes. (And prices are low on MTurk -- it cost me $3.80.) I had each task done by two different Turkers as a check for quality, and there was perfect agreement.
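For a bigger job you would not want to check agreement by eye. Here is a rough Python sketch of one way to do it, assuming the results have already been read into (HIT id, worker id, answer) tuples; that layout, the function name agreement_report, and the sample data are my own assumptions, not the format of MTurk's results file.

```python
from collections import defaultdict


def agreement_report(responses):
    """Group answers by HIT and report which HITs got conflicting answers.

    responses: iterable of (hit_id, worker_id, answer) tuples (assumed layout).
    Returns (share of HITs with full agreement, list of disputed hit_ids).
    """
    answers_by_hit = defaultdict(list)
    for hit_id, _worker_id, answer in responses:
        # Normalize lightly so whitespace and case do not count as disagreement.
        answers_by_hit[hit_id].append(answer.strip().lower())

    disputed = [hit for hit, answers in answers_by_hit.items()
                if len(set(answers)) > 1]
    rate = 1 - len(disputed) / len(answers_by_hit)
    return rate, disputed


# Hypothetical results: two workers per HIT, one disagreement.
sample = [
    ("hit1", "workerA", "Proposal text one"),
    ("hit1", "workerB", "proposal text one "),
    ("hit2", "workerA", "Proposal text two"),
    ("hit2", "workerC", "Something else"),
]
rate, disputed = agreement_report(sample)
print(rate, disputed)
```

Exact matching after light normalization is a crude test for free-text answers; it suits cases like mine, where both Turkers should paste the same passage, and the disputed list tells you which rows to check by hand.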

One big question people have is, "Who are these people who do rote work for so little?" You might think it was all people in developing countries, but it turns out that a large majority are bored Americans. There's some pretty interesting information out there about Turkers, largely from Panos Ipeirotis's blog (a good source on all things MTurk in fact). Most relevant for understanding Turkers is a survey of Turkers he conducted via (of course) MTurk. For $0.10, Turkers were asked to write why they complete tasks on MTurk. The responses are here. My takeaway was that people do MTurk HITs to make a little money when they're bored, as an alternative to watching TV or playing games. One man's drudgery is another man's entertainment -- beautiful.

Posted by Andy Eggers at January 16, 2009 9:49 AM