February 07, 2008

Romney can't compete with Senator Moccasin

Today's big political news story was Mitt Romney's announcement that he was suspending his presidential campaign. When a major event like this occurs, everyone's anxious to get the news out quickly, so it's tailor-made for... the Cupertino effect! Once again, Google News gives us an early report off the Associated Press wire with an embarrassing spell-checker error, and then leaves it online long after corrected feeds have gone out:

This is a transformation of McCain's name already predicted in this space, in Mark Liberman's Jan. 12 post reporting on what happened when Cody Boisclair ran the names of presidential candidates through the built-in spell-checker on Mac OS X. But McCain appears correctly throughout the rest of the AP article, so there hasn't been a global substitution of the name. And McCain is actually included in recent Microsoft Office speller dictionaries (unlike Huckabee and, until very recently, Obama), so it's fair to assume this isn't the result of a correct spelling being misrecognized.

Instead, it appears to be a case of a spelling error being miscorrected into a different word on the spell-checker's list of suggestions. My first guess was that the reporter typed Moccain here, since that's just one missing letter away from moccasin. As Thierry Fontenelle of the Microsoft Natural Language Group explained, deletion of one character is a pretty small "edit distance," so moccasin is an obvious suggestion. But Steve Chrisomalis, who was one of two readers (along with Paul Justice) to email me about this, has a better candidate: Maccain, since his version of MS Word gives moccasin as the first choice and McCain as the second.
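For readers who want to check the arithmetic behind "edit distance," here is a minimal Python sketch of the standard Levenshtein measure: the fewest single-character insertions, deletions, and substitutions needed to turn one string into another. The function name `edit_distance` and the list of candidate typos are illustrative choices, and comparing in lowercase is an assumption. Nothing here is meant to represent how Microsoft's speller actually ranks its suggestions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (case-sensitive)."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b; start with the empty prefix of a.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical typos discussed in this post, compared
# case-insensitively (an assumption, not the speller's behavior).
typos = ["Moccain", "Maccain", "McCasin"]
targets = ["moccasin", "McCain"]

for typo in typos:
    for target in targets:
        d = edit_distance(typo.lower(), target.lower())
        print(f"{typo} -> {target}: {d}")
```

Ignoring case, Moccain is one edit from moccasin, while Maccain is one edit from McCain and two from moccasin. Plain edit distance alone would therefore not explain a speller ranking moccasin first for Maccain, so real spell-checkers presumably weigh other factors as well.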

As usual, this is not intended to mock the hard-working reporters and editors who fall into spell-checker traps. Do not judge a journalist until you've walked a mile in his McCains moccasins.

[Update: Our good friend Thierry Fontenelle of Microsoft writes in:

I took your text and ran it through Office/Word 2007 (our latest version). As you can see below, McCain appears as the first suggestion of Maccain. But even if it did not, as in your reader's version, apparently, I really wonder why people do not read the suggestions before accepting them... I'm afraid there is very little we (i.e. the people who create these tools) can do about that...

And Bob Hay's got another theory:

You suggest "Maccain" as a possible source of the correction to "moccasin". Confusing "Mac" and "Mc" is certainly possible, as both are common in names. But this seems somewhat unlikely to me, since the name was spelled correctly in the rest of the article. Additionally, capitalization of the second "C" is relevant, "Maccain" (2 errors) versus "MacCain" (1 error). My spell check (Word 2004 for Mac) gives "moccasin" for "Maccain" but properly gives "McCain" for "MacCain".
My theory is that the original mistake was "McCasin". It's an easy keystroke error to make since the "a" and "s" are next to each other on the keyboard. Only one letter away, and my spell check gives "moccasin" as the first suggestion regardless of capitalization.
]

Posted by Benjamin Zimmer at February 7, 2008 08:39 PM