February 09, 2004

"sex pro" (not)

A Language Log piece on recent Sapir-Whorf research is (at this moment) the top-ranked Yahoo search result for "sex pro." This underlines something that we all know: as useful as information-retrieval systems are, they still have a problem with precision (which is what IR researchers call the proportion of things found that are relevant). Either that, or I need to work on my headline writing.
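For readers who'd like the definition spelled out, here is a minimal sketch of precision as a calculation. The function name, the page labels, and the relevance judgments are all made up for illustration; nothing here comes from Yahoo's actual results.

```python
def precision(retrieved, relevant):
    """Proportion of the retrieved items that are relevant."""
    retrieved = list(retrieved)
    if not retrieved:
        # Nothing was returned; treat precision as 0 by convention here.
        return 0.0
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    return hits / len(retrieved)


# Hypothetical ranked results for one query (labels are invented).
results = ["page-a", "language-log-post", "page-b", "page-c"]
# Hypothetical judgments of which pages the searcher actually wanted.
judged_relevant = {"page-a", "page-b", "page-c"}

print(precision(results, judged_relevant))  # 3 of 4 results are relevant: 0.75
```

In this toy case, one irrelevant result among four gives a precision of 0.75, wherever in the ranking the miss happens to sit.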

Just for the record, I know about Yahoo's current results for "sex pro" because some internet pilgrim found our site by following the link from Yahoo, and this was duly noted in our server log, which I check from time to time. These days, I'm happy to say, most of the (roughly 400) folks a day who come to our site via search engines probably find what they're looking for. Last night's 27 pilgrims interested in talking parrots, the 30 who wanted to know something about emo, and even the 22 who asked about "wedding vowels" all probably went away with something useful to them. Well, I'm not sure about the substantial group who asked about "emo girls" and similar things, but I'll give us the benefit of the doubt there.
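Counts like these can be pulled out of the referrer field of a server log. What follows is only a rough sketch, not the script actually used for this site: it assumes the referrer URLs have already been extracted from the log, and that the search engine puts the query text in a URL parameter named "q" or "p" (an assumption; the parameter name varies from engine to engine). The sample URLs are invented.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed names of the URL parameter that carries the query text.
QUERY_PARAMS = ("q", "p")


def search_query(referrer):
    """Return the normalized search query in a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            # parse_qs decodes "+" and %-escapes; lowercase to merge variants.
            return params[name][0].strip().lower()
    return None


def tally_queries(referrers):
    """Count how many visits each search query brought in."""
    counts = Counter()
    for referrer in referrers:
        query = search_query(referrer)
        if query:
            counts[query] += 1
    return counts


# Invented referrers standing in for one night's log entries.
sample = [
    "http://search.example.com/search?p=talking+parrots",
    "http://search.example.com/search?p=Talking+Parrots",
    "http://search.example.org/search?q=wedding+vowels",
    "http://www.example.net/some/page.html",
]

for query, count in tally_queries(sample).most_common():
    print(count, query)
```

Referrers with no recognizable query parameter, like the last sample entry, are simply skipped, so ordinary links from other sites don't show up in the tally.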

The logs of a site like ours are not a very good measure of the precision of the search engines. People can usually tell from the short context presented with each suggestion whether it's a hit or a miss, so most misses never turn into a visit that we could log (and often it's pretty hard to figure out from the query text what the searcher was really looking for). But there were certainly several other crystal-clear false positives in last night's crop, such as the folks who found our site as a result of queries like "how to say things about sex in other languages". We haven't provided any relevant information here, but judging from the level of interest, there's a market for a book by that title.

Posted by Mark Liberman at February 9, 2004 09:07 AM