March 22nd, 2010

breaking bad

A revolution in machine translation

crowleycrow (John Crowley) posts about Google Translate, and it blows my mind.

Full explanation in the New York Times here.

I've always thought that natural language translation is the test case and driver for AI development, and translation software has always been crap, making me think that we are miles away from AI. We still are miles away from AI, but the translation problem is now suddenly much closer to a solution.
Google Translate is a statistical machine translation system, which means that it doesn’t try to unpick or understand anything. Instead of taking a sentence to pieces and then rebuilding it in the “target” tongue as the older machine translators do, Google Translate looks for similar sentences in already translated texts somewhere out there on the Web. Having found the most likely existing match through... statistical reckoning ... Google Translate coughs it up, raw or, if necessary, lightly cooked. That’s how it simulates — but only simulates — what we suppose goes on in a translator’s head.

Excellent. This is the way forward. It's distributing the task of translation to multiple human thinkers, and harvesting their intelligence. It's just like the way social networking produces useful filtering.
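
To make the quoted description concrete, here is a toy sketch in Python of the "find the closest already-translated sentence" idea. Everything in it is an assumption for illustration: the tiny hand-made corpus, the function names, and the use of plain string similarity. As far as I understand it, a real statistical system works with probabilities over phrases learned from huge amounts of translated text, not whole-sentence lookup, so this is not how Google Translate is actually built.

```python
from difflib import SequenceMatcher

# Hypothetical parallel corpus: sentences that humans have already translated.
# A real system would draw on millions of such pairs harvested from the Web.
PARALLEL_CORPUS = [
    ("where is the train station", "où est la gare"),
    ("i would like a coffee", "je voudrais un café"),
    ("how much does this cost", "combien ça coûte"),
]


def similarity(a, b):
    """Return a rough 0..1 similarity score between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def translate(sentence, corpus=PARALLEL_CORPUS):
    """Return the human translation of the most similar known sentence.

    No parsing or understanding happens here: the function only reuses
    work that a human translator has already done.
    """
    best_source, best_target = max(
        corpus, key=lambda pair: similarity(sentence, pair[0])
    )
    return best_target, similarity(sentence, best_source)


if __name__ == "__main__":
    text, score = translate("Where is the train station?")
    print(text, round(score, 2))
```

The point of the sketch is that all the intelligence sits in the corpus: add more human-translated pairs and the output improves without the program getting any smarter.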

Translation is a very important issue for me, as is the harnessing of distributed human social intelligence. I've got to go to work now, but I couldn't resist posting about this.