November 13, 2005

Distributed outsourcing

A couple of weeks ago, Amazon Web Services introduced the Mechanical Turk, which inverts the usual relationship in interactive computing by providing "a web services API for computers to integrate Artificial Artificial Intelligence directly into their processing by making requests of humans". The name is a reference to Wolfgang von Kempelen's Turk, an 18th-century chess automaton which pretended to be a sort of clockwork computer, but in fact incorporated a small, hidden, human player.

Amazon's Mechanical Turk (whose welcome page is here) is "currently experiencing extremely heavy traffic", so it's going to be hard to actually sign up to

Complete simple tasks that people do better than computers. And, get paid for it. Choose from thousands of tasks, control when you work, and decide how much you earn.

I presume that most of the current traffic is rubbernecking, but in principle, this sort of thing could turn into a new kind of labor exchange, in which a large pool of workers can connect with a large number of (small or large) tasks.

Of local interest, this kind of labor exchange can be an efficient way to create training data for machine translation, speech recognition and various sorts of pattern-recognition and pattern-classification systems. There are obvious issues of training and quality control, but there are equally obvious solutions. Some colleagues at Johns Hopkins used a similar technique on a small scale a few years ago, to get translations done for a pilot project on machine translation in a language for which little parallel text was available. The main problem in extending their (quite effective) experiment was the issue of tax and employment regulations: it's not so easy for an American organization to pay a large number of individuals from around the world without running afoul of various IRS regulations. As I understand it, the issues in payment for services are different in this respect from the issues in auction or sales sites like eBay. I wonder how Amazon deals with this problem.

I also wonder how Amazon prevents this from being used for the most obvious single application, namely helping spammers circumvent captchas.

A post at Bitporters media gives one Turk-worker's experience:

So four days and 505 HITs later I'm sitting at a cool $6.84. Note: More than half of my HITs are still in the pending state. I'm getting pretty quick at cracking these off, with my tabletPC I'm down to < 5-10 seconds per HIT. At 3 cents a hit, I'm not really sure if this is a waste of time or not, for now I'm just going to do enough to buy a book I've been wanting.

If you can really keep up 10 HITs per minute at $.03 per HIT, that's $18/hour, which would appeal to a lot of people, especially for a job that you can do from any location, whenever and for however long you like. I wonder whether the numbers really do work out that way -- this person claims to have worked for 30 minutes at a rate of $10.20/hour, minus (an unknown number of) disallowed HITs -- but in any event, a system like this will presumably bring supply and demand into a world-wide equilibrium at a rate of pay that reflects the value that employers put on the product, and the value that workers with the relevant skills put on their time. If employment regulations actually permit such a marketplace to develop...
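For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are my own inventions for illustration, not anything from Amazon's API, and the calculation assumes a flat $0.03 per HIT with no disallowed or pending HITs, which is almost certainly too optimistic.

```python
# Back-of-the-envelope check of the pay rates discussed above.
# Assumptions (not from Amazon): flat pay per HIT, every HIT approved.

PAY_PER_HIT = 0.03  # dollars, as quoted by the Turk worker


def hourly_rate(hits_per_minute: float, pay_per_hit: float = PAY_PER_HIT) -> float:
    """Gross dollars per hour at a steady pace of hits_per_minute."""
    return hits_per_minute * 60 * pay_per_hit


def hits_per_minute_from_seconds(seconds_per_hit: float) -> float:
    """Convert a time per HIT in seconds into a pace in HITs per minute."""
    return 60 / seconds_per_hit


def seconds_per_hit_from_rate(dollars_per_hour: float,
                              pay_per_hit: float = PAY_PER_HIT) -> float:
    """Time per HIT in seconds implied by a given hourly rate."""
    hits_per_hour = dollars_per_hour / pay_per_hit
    return 3600 / hits_per_hour


if __name__ == "__main__":
    # The 10-HITs-per-minute case from the paragraph above.
    print(round(hourly_rate(10), 2))

    # The worker's stated range of 5 to 10 seconds per HIT.
    for seconds in (5, 10):
        pace = hits_per_minute_from_seconds(seconds)
        print(seconds, round(hourly_rate(pace), 2))

    # The pace implied by the claimed $10.20/hour.
    print(round(seconds_per_hit_from_rate(10.20), 1))
```

Under those assumptions, the worker's stated 5-to-10-second range brackets the $18/hour figure, and the claimed $10.20/hour corresponds to a bit more than 10 seconds per HIT, i.e. the slow end of that range, before any disallowed HITs are subtracted.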

Posted by Mark Liberman at November 13, 2005 09:56 AM