January 25, 2008

Linguification extinct?

Just a word to thank the journalistic community at large for apparently giving up the practice of linguification, which used to puzzle me so much. I did catch someone on NPR saying in November 2006 that "when you say recount you know the word Florida can't be far behind" (totally false, of course: recount gets 4.8 million Google hits, while {recount Florida} gets only 166,000, so that means over 96% of the time a page with recount on it does not have Florida on it anywhere [but see below]). Then 2006 ended, and 2007 was blissfully free of linguifications as far as my recollections are concerned. Language Log apparently did not have to mention the word at all during last year. The term bloomed in July 2006, and flourished, and then like a fragile flower it was gone before that year ended, and with it the phenomenon to which it referred. No more occurrences of this strange trope — writing a demonstrably false claim about linguistic occurrences as a perverted way of expressing a (possibly true) claim about non-linguistic matters — were ever seen again. We have vanquished the practice.

Unless... Why do I have a sinking feeling that people are now going to start mailing my Gmail account with new sightings, and thus disappoint me in this matter? I should keep my big mouth shut (if that metaphor is appropriate for a blogger; maybe it would be better to say, I should keep my big paws off the keyboard).

I'm not actually serious about the above Google figures showing anything. For one thing, the counts should have been done in 2006, or better still between November 2000 and the end of 2001. But in addition, a couple of correspondents point out that Google counts on this topic are mathematically incoherent to an unusual degree. Steve Hansen, trying to distinguish the noun meaning "repeat of tabulation" from the verb meaning "narrate", found that "a recount" gets 669,000 g-hits and {"a recount" Florida} gets 215,000, and notes that this means {"a recount" Florida} gets more g-hits than {recount Florida}, which simply cannot be right (every page with "a recount" on it should be a page with "recount" on it). I have replicated that result. And Tim McKenzie, thinking that the verb would not be spelled with a hyphen but the noun sometimes would be, did Google searches for {re-count} and {re-count Florida}. The former got 803,000 hits, but the latter got 1,290,000. Tim remarks:

So, the astonishing claim on NPR seems to have been an understatement; 160% of pages that mention the word "re-count" also mention the word "Florida".

None of this makes any numerical sense. There really is a lot we don't know about the Google search and tabulation algorithms.
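For anyone who wants to check the arithmetic, here is a rough sketch in Python. The hit counts are the ones quoted in this post; the function names and the list of query pairs are my own illustration and have nothing to do with how Google actually tabulates anything. The only principle assumed is that a narrower query should never report more pages than a broader one that contains it.

```python
# Hit counts as quoted in the post. These are Google's estimates,
# not exact tallies, which is rather the point.
HITS = {
    "recount": 4_800_000,
    "recount Florida": 166_000,
    '"a recount"': 669_000,
    '"a recount" Florida': 215_000,
    "re-count": 803_000,
    "re-count Florida": 1_290_000,
}


def share(narrow, broad):
    """Fraction of the broad query's hits that the narrow query claims."""
    return HITS[narrow] / HITS[broad]


def is_coherent(narrow, broad):
    """A narrower query cannot sensibly report more hits than a broader one."""
    return HITS[narrow] <= HITS[broad]


# (narrow, broad) pairs: every page matching the first should also match
# the second. This pairing is an assumption made for illustration.
SUBSET_PAIRS = [
    ("recount Florida", "recount"),
    ('"a recount" Florida', '"a recount"'),
    ('"a recount" Florida', "recount Florida"),
    ("re-count Florida", "re-count"),
]

if __name__ == "__main__":
    for narrow, broad in SUBSET_PAIRS:
        verdict = "ok" if is_coherent(narrow, broad) else "INCOHERENT"
        print(f"{narrow} / {broad}: {share(narrow, broad):.1%} ({verdict})")
```

Any pair flagged as incoherent has a share above 100%, which is the impossibility Steve and Tim noticed.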

Posted by Geoffrey K. Pullum at January 25, 2008 07:03 AM