July 04, 2007

The State of Natural Language Processing as Revealed by Google

A few minutes ago I googled for information about the incident in 2002 in which the Saudi religious police prevented the rescue of students from a burning girls' school in Mecca because they were not (by the standards of misogynist psychopaths) fully dressed, with the result that 15 of them died. (In fact, it seems it was even worse: they reportedly forced some girls who had escaped back into the burning school.) I used the query "Saudi Arabia girls school fire". In addition to the regular results, which were useful, I got two sponsored links, which Google chooses by means of Natural Language Processing techniques. One of them was for firefighter training; the other was a link to a dating site headed "Hot Dubai Women".

Addendum: as one reader has pointed out, Google allows for both exact matches and "broad matches". The dating site link is presumably the result of a broad match and does not reflect the full capability of Google's matching algorithm. It is still hilarious, though in exceedingly bad taste.

Further addendum: I didn't notice the bad pun in "Hot Dubai Women" being a "broad match" until a reader pointed it out. Honest.

Posted by Bill Poser at July 4, 2007 05:13 AM