March 31, 2006

Does it matter what linguists think?

Of course it does, at least when the topic is language, but I'm having some trouble figuring out what Michael Kinsley means by raising the issue today in his musings on "The Twilight of Objectivity":

Would it be the end of the world if American newspapers abandoned the cult of objectivity? In intellectual fields other than journalism, the notion of an objective reality that words are capable of describing has been going ever more deeply out of fashion for decades. Maybe it doesn't matter what linguists think. But even within journalism, there are reassuring models of what a post-objective press might look like. [emphasis added]

Is Kinsley using linguistics to represent those fields where the notion of objective reality has become unfashionable? I hope not, because linguists' failure to embrace PoMo attitudes has left them sadly estranged from academic humanists. If anything, my colleagues are all too confident that objective reality not only exists, but is firmly in their grasp.

Or does Kinsley assume that the field of linguistics bears overall responsibility for judging the connections between words and things, so that the last slim thread tying journalists to the pretense of rationality is a concern for how linguists will judge their work? If true, this would certainly favor my campaign to ensure that every civilized person is taught the basic concepts and techniques of linguistic analysis -- but other disciplines also have a role to play in the game of truth and consequences, and I'm sure Kinsley knows that.

No, I'm afraid that I can't make any sense at all of the progression of ideas in the paragraph that I've quoted. This worries me, though perhaps it shouldn't. So if you think that you can reassure me by explaining what Kinsley was getting at, please let me know, and I'll tell the world.

[A web search does suggest that this is not the only case where Kinsley has drafted linguists into rhetorical service; for example:

(link) Linguists note that the question, "Who lied in George Bush's State of the Union speech" bears a certain resemblance to the famous conundrum, "Who is buried in Grant's Tomb?" They speculate that the two questions may have parallel answers.

But that one I understand.]

The straw dog of amnesty

The debate over a Senate bill to legalize illegal immigrants has devolved into squabbling over the word "amnesty," Dana Milbank reports in today's Washington Post. Opponents of the bill are doing their best to invoke the scary "A-word" whenever possible, while supporters insist that the bill promotes not "amnesty" but "earned citizenship." One vocal opponent, Sen. Jeff Sessions (R-Ala.), bulldozed through any pesky semantic nuances:

"In every sense of what people mean by amnesty, it's amnesty. If it's not amnesty, it's the same thing as amnesty. That's just what it is. ... By any standard definition of the word 'amnesty,' this bill has it. I know that's a loaded word, and I don't want to be playing around demagogically with the word 'amnesty.' "

I'm still trying to wrap my head around the exquisite logic of "If it's not amnesty, it's the same thing as amnesty." I suppose he means that even if the bill doesn't fit some narrow legalistic definition of "amnesty," it's still close enough to the popular conception of the word. But then after taking this populist, Colbertesque approach to language, he changes his tune and appeals to the "standard definition of the word." (I'm surprised that Sessions, wryly identified by Milbank as "the Alabama lexicographer," didn't pull out a dictionary to provide a "standard definition.") And after all that, he has the temerity to claim that he's not "playing around demagogically with the word 'amnesty.'"

Sessions' colleague in the House, Rep. Steve King (R-Iowa), went so far as to equate "amnesty" with another notorious A-word:

"Anybody that votes for an amnesty bill deserves to be branded with a scarlet letter, 'A' for amnesty, and they need to pay for it at the ballot box in November."

There's nothing like references to puritanical persecutors to rally the public to your cause! Meanwhile, a more temperate opponent of the bill, Sen. Judd Gregg (R-N.H.), unwittingly provided his own literary allusion:

"I think that's a straw dog, to be very honest with you, this argument of amnesty. The debate is misfocused in some ways when the word 'amnesty' becomes the hot button, nomenclature versus the more substantive question."

Gregg's measured response is undercut a bit by his odd use of "straw dog." Granted, the expression avoids the gender specificity of "straw man" without resorting to the awkwardness of "straw person." But I've never heard "straw dog" used as a replacement for "straw man" (straw man's best friend?). It's possible that he was mentally juggling "straw man" with some canine idiom, like "that dog won't hunt" or "that's a dog of an argument." More likely, though, there was interference from the title of Sam Peckinpah's 1971 film Straw Dogs. Peckinpah borrowed his cryptic title from a passage in Lao Tzu's Tao Te Ching:

Heaven and earth are ruthless, and treat the myriad creatures as straw dogs;
the sage is ruthless, and treats the people as straw dogs.

D.C. Lau's translation of Tao Te Ching (Penguin Classics edition) explains in a footnote that "straw dogs were treated with the greatest deference before they were used as an offering, only to be discarded and trampled upon as soon as they had served their purpose." So perhaps Senator Gregg's comment contains a hidden Taoist message. Opponents of the immigration bill will use the "amnesty" argument as a sacrificial offering to the American public. Then as soon as they have defeated their foes, the argument will be instantly disposed of. Sounds like a keen insight into the transience of opportunistic political demagoguery.

[Update: See Languagehat for further discussion of the Tao Te Ching passage.]

An ambiguity of editing

In the Letters section of the April 10, 2006 issue of The Nation, the editors wrote:

"A Letter to the American Left" (Feb. 27) by Bernard-Henri Levy (aka BHL) ignited a firestorm of (mostly) angry mail. Readers described Levy's writing as "inane," "tripe," "blather," "windbaggery," "merde," "condescending and crass," "shortsighted" and "best left to line the bird cage." They called Levy a "fashionable lightweight," "motor-mouth," "posturing populist," "narcissistic" and an "arrogant," "self-promoting" "rock-star philosopher." Kevin Beavers of Freiburg, Germany, suggests that BHL has been "bumbling about America more like Inspector Clouseau than Tocqueville." Many wrote to inform Levy that Susan Sontag is dead (an ambiguity of translation caused some to mistake his complimentary invocation of Sontag as referring to her in the present tense). [emphasis added]

Jan Freeman emailed to ask what the "ambiguity of translation" might have been: "was it something about the choice of possible tenses, the difference in the way French and English express that kind of generalization, what?"

The phrase "ambiguity of translation" suggests that something in the source language could have been construed in two different ways, and the translator made the wrong choice, rendering the phrase in a way that gives a mistaken impression of the writer's intended meaning. But after a bit of investigation, I've concluded that this is not what happened here. Rather, both the French original and the original translation into English were unambiguous. The problem was introduced in the subsequent editing of the English version. Thus it would be more accurate to say that the culprit was "an ambiguity of editing".

Here's how the passage in question appeared in the version of Bernard-Henri Levy's "A Letter to the American Left" as printed in the February 27, 2006 issue of The Nation:

And Guantánamo? And Abu Ghraib? And the special prisons in Central Europe, those areas where the rule of law no longer applies? I know, of course, that the press has denounced them. I know you have journalists who, in a matter of days, accomplished what our French press still hasn't finished forty years after our Algerian War. But since when does the press excuse citizens from their political duties? Why haven't we heard from more intellectuals like Susan Sontag--or even Gore Vidal and Tony Kushner (with whom I disagree on most other grounds) on this vexed and vital issue? And what should we make of that handful of individuals who, after September 11, launched the debate about the circumstances in which torture might suddenly be justified? [emphasis added]

BHL was referring to Sontag's essay "Regarding the Torture of Others" (NYT Magazine, May 23, 2004), and asking why more intellectuals haven't followed her example. But the sentence is indeed ambiguous, as we can see more clearly if we leave out the aside about Vidal and Kushner:

Why haven't we heard from more intellectuals like Susan Sontag on this vexed and vital issue?

There are two possible construals of the relationship between Sontag and the sets of intellectuals heard from or not. The choices are brought out by the different continuations below:

(a) Why haven't we heard from more intellectuals like Susan Sontag, who has been the only one to say anything?
(b) Why haven't we heard from more intellectuals like Susan Sontag, who has been uncharacteristically silent?

I think that both continuations are linguistically plausible. I originally thought that this might reflect a sort of restrictive vs. non-restrictive interpretation of the "like NP" phrase. But a note from John Cowan has persuaded me that this is a vagueness intrinsic to phrases that describe a class by giving a characteristic member. He offers the examples

(a) Birds like the American robin are very common in the Eastern United States.
(b) Birds like the American robin are very common in Siberia.

where example (a) suggests that the exemplar is included, while (b) suggests that it's not. These interpretive options continue to exist in phrases such as "robin-like birds".

Apparently many people read BHL's reference to Sontag in the inclusive sense -- "why haven't we heard from Susan Sontag and the rest of the intellectuals like her?" -- and were moved to respond "Because she's dead, you dope!" or "We have, you ignoramus!", depending on whether or not they remembered her 2004 NYT magazine piece.

But no such interpretation was plausible for BHL's original French version of the crucial sentence:

Pourquoi, depuis Susan Sontag, n’entend-on pas davantage les clercs sur le sujet?

An interlinear gloss:

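Pourquoi,  depuis  Susan Sontag,  n’entend-on pas    davantage  les clercs    sur le sujet?
why        since   Susan Sontag   does one not hear  more       the scholars  on the subject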

And the same is true of the original (unedited) English translation, which is idiomatic but close to the original:

Why haven’t we heard from more scholars, post-Susan Sontag, on the subject?

And here is that unedited sentence in the context of the rest of the original English translation of the paragraph:

And Guantánamo? And Abu Ghraib? And the special prisons in central Europe, those areas where the law doesn’t apply, that are unworthy of a democracy? I know, of course, that the press has denounced them. I know that you have journalists who in a few days did work that our French press still hasn’t finished doing forty years after our Algerian War. But since when does the press excuse citizens from their duties? Why haven’t we heard from more scholars, post-Susan Sontag, on the subject? And what should we think about those individuals who, after September 11th, in Dissent and elsewhere, launched the pointless debate about the circumstances in which recourse to torture might suddenly be justified?

As you can see, the language of this paragraph was modified quite a bit on its way to the pages of The Nation. Some of the changes are minor: "their duties" became "their political duties". Others are a bit more consequential: "the pointless debate" became "the debate", and the reference to Dissent was dropped.

The only major addition is the reference to Vidal and Kushner, and I think this addition suggests where the misreadable phrase in the final version came from. BHL's original sentence, though complimenting Sontag, suggested falsely that other American intellectuals have been silent about the practice of torture by American forces in the GWOT. In fact, many Americans, scholars or otherwise, have been pretty noisy on this subject -- including some on the right, like Andrew Sullivan and John McCain. The noisier voices on the left include Gore Vidal, a contributing editor of The Nation, and Tony Kushner, an editorial board member. I surmise that (some of) these facts were brought to BHL's attention, and he agreed to add Vidal and Kushner's names to Sontag's -- along with the otherwise-gratuitous parenthetical swipe "with whom I disagree on most other grounds".

The exchange between The Nation's editors and BHL was perhaps a bit strained, and the crucial sentence needed major surgery to incorporate both the two new names and the parenthetical disclaimer. In this difficult process, the phrase became awkward to the point of ambiguity.

And I may be reading too much into the textual tea leaves here, but in the Letters section of the April 10 issue, the editors seem to have been enjoying themselves as they summarize their readers' evaluations of BHL's remarks:

"inane," "tripe," "blather," "windbaggery," "merde," "condescending and crass," "shortsighted" ... "best left to line the bird cage" ... "fashionable lightweight," "motor-mouth," "posturing populist," "narcissistic" ... "arrogant," "self-promoting" "rock-star philosopher" ... "bumbling about America more like Inspector Clouseau than Tocqueville."

[An ironic postscript... Thomas Lifson at The American Thinker was one of those who misunderstood the sentence we've been analyzing. But his interpretation is sadly off the mark:

Clueless liberal intellectuals never fail to astound us with their ignorance. Frenchman Bernard-Henri Levy is famous enough as a philosopher and left wing activist to be known by his initials BHL in chic French intellectual circles, as well as among American Upper West Siders like Katrina vanden Heuvel, editor and publisher of The Nation magazine.

Ignorant pretentious verbiage is the norm on those pages, of course. But usually the lefties keep up with gossip about each other. But The Nation has just published a column (translated from the French) entitled “A Letter to the American Left” by BHL in which the noted member of the nouveau philosophes lists assorted American outrages and laments:

“Why haven’t we heard from more intellectuals like Susan Sontag…?”

Apparently BHL and his editor vanden Heuvel haven’t been doing much reading lately. Susan Sontag died in December, 2004. Her obituary made all the big newspapers, from the LA Times to The Guardian.

Someone has got to break the sad news to these two intellectual giants.

The sad thing is not that Lifson misunderstood the sentence, it's that he sees Lévy and The Nation as indistinguishable "liberal ... left wing activist[s]". In fact Lévy is about as close as France gets to a neo-conservative; and it's clear that he is politically not at all in tune with either The Nation's editors or its readers. I'm not sure whether this mistake is due to Lifson's perspective -- Los Angeles and Beijing are indistinguishable when viewed from Beta Centauri -- or due to his lack of curiosity about political positions that differ even slightly from his own.

This reminds me of something by Glenn Greenwald that Andrew Sullivan recently quoted:

"It used to be the case that in order to be considered a 'liberal' or someone "of the Left," one had to actually ascribe [sic] to liberal views on the important policy issues of the day ...

Now, in order to be considered a "liberal," only one thing is required – a failure to pledge blind loyalty to George W. Bush."

The same misunderstanding of the Sontag sentence appears in Doug Ireland's attack on BHL from the left, though of course Ireland does not use the same terminology.

As for the implicit dispute between BHL and The Nation, it's tempting to interpret it in terms of Adam Gopnik's riff on fact checkers vs. theory checkers. But I'll resist the temptation.]


March 30, 2006

What part of speech is "the"?

Bill Poser notes another Sign of the Apocalypse (Education Division):

Of all people (other than professional linguists), you'd think that high-school English teachers would know basic grammar, but they don't. In the past few years, two of my friends in British Columbia have retreaded as secondary school teachers. Both of them majored in English as undergrads and are now teaching English or mostly English. One of them took a couple of linguistics courses as well as a year of Carrier language, so she actually knows something about grammar. When she did her practicum, she reported in dismay that one of the regular English teachers was teaching that "the" is an adjective and was not to be persuaded otherwise. The knowledge of the other English teachers was no better, but such questions didn't arise because they never expressed themselves on grammatical questions at all.

The problem with "the" (and many other items) is that the school tradition about parts of speech is so desperately impoverished. 

It's the Big Four (noun, verb, adjective, adverb) and the Little Two or Three (preposition, conjunction, sometimes pronoun), plus an appendage (interjection).  Everything has to fit in here somewhere, and since the parts of speech are defined semantically in this tradition, "the" just has to be an adjective, because it's a kind of modifier.  What else could it be?  (If you have pronoun as a part of speech, that would be a very clever answer, but you're going to have a lot of trouble convincing non-linguists of that.)

In fact, if you look up "the" in the OED, it's labeled "dem. a." (demonstrative adjective).  Ok, this is then glossed "(def. article)", but the main part-of-speech classification is as a kind of adjective.  Hey, this is THE dictionary of English, the boss man of English dictionaries.

A scholar of Chinese literature asked me last week about English "much" (as in "much difficulty"): what part of speech was it?  I said that it was a kind of determiner, more specifically a quantity determiner.  She looked despairing.  Coming to her aid, a scholar of medieval religion explained (genially, I should add) that "in terms ordinary people know, that would be an adjective".  (At the Stanford Humanities Center, we believe deeply in cross-disciplinary conversations, but these require a certain amount of negotiation.)  And yes, in those terms, he was right; any kind of modifier of a noun counts as an adjective.  (A position that quickly leads you into all sorts of problems.  What about the first word of noun-noun compounds, like "Christmas cookie"?  "Christmas" here is functioning as a modifier, so it should be classified as an adjective, but it sure looks a hell of a lot like a noun.)

The deeper problem is the school tradition itself.  It's a TRADITION, after all -- a system devised in the past and treated as a kind of dogma in the present.  The idea that you could DISCOVER what the parts of speech in some language are, that this is (in principle) an empirical question, is foreign to this way of thinking.  Even stranger is the idea that there could be a whole lot of them, some of them subtypes of others, and some of them overlapping.  Still stranger is the idea that though the parts of speech of one language will usually correspond very roughly to those of another, there can be considerable differences.  But linguists are here -- and have been for a very long time -- to tell you that you should take these ideas seriously.

In the meantime, a linguist who proposes to introduce, say, the technical term determiner for a class of pre-adjectival modifiers in English that includes the articles, demonstratives, quantifiers, possessives, and more is likely to be seen as UNDERMINING tradition, casting off the sureties of the past in favor of fashionable jargon.  Evil, obfuscationist linguist! 

zwicky at-sign csli period stanford period edu

If the researchers for a BBC-commissioned study can only find 28 rude words in British English, then they're really not looking very hard. Fortunately, whenever the next study is conducted, the BBC will be able to exploit an in-house resource cataloging vernacular obscenities.

BBC America recently started up a "British American Dictionary" on its website to help us poor colonials understand the argot of such shows as Little Britain and Footballers' Wives. The maintainers of the dictionary fielded submissions from users (much like Merriam-Webster's Open Dictionary), and visitors to the site have approached the task with gusto. According to a recent article in the Scotsman, the dictionary already tops 2,500 terms, broken down by category and region. "As with all dictionaries," the Scotsman reports, "the most well-thumbed sections doubtless relate to sex and insulting words." (Can an online dictionary be well-thumbed? Well-clicked, maybe?)

Sad to say, the "insulting words" and "sex" categories are still on the meager side, currently containing only 36 and 19 terms respectively. But that's still way ahead of the 28 items ranked for rudeness in Andrea Millwood-Hargrave's report. And the moderators of the dictionary are no doubt withholding the really offensive stuff, since even in the posted entries the F-word, the S-word, and the C-word have been daintily asterisked. (This despite the site's wry disclaimer: "Please note: the dictionary contains many words that are not used in the politest of conversations. If you search for insulting words or browse through our collection of insulting words, you may be insulted.")

I eagerly await the study that analyzes the relative rudeness of bampot, berk, div, gobshite, knob jockey, tosser, and twonk. (For connoisseurs of such verbiage, let me recommend the excellent new edition of Cassell's Dictionary of Slang, edited by Jonathon Green. According to Cassell's, twonk is a blend of twat and plonker, dating to the 1980s.)

March 29, 2006

Delete expletives

While we're on the topic of standards in public discourse: Ben Goldacre at badscience has recently posted the BBC's "ranked list of rudeness". He originally saw it at a meeting with "the editorial policy/legal people at the BBC", when he asked them "which swear words am I allowed to use?"

The list was treated as sensitive information by the BBC EP/L people --

"... the offending piece of paper was physically removed from my hand (I think they had the idea that I would scan it, post it on my blog, and write an article about it).

... I mentioned this to someone else from the BBC at a party recently: she sent me a copy this morning, and as you can see, I have indeed scanned it and posted it on my blog. Disappointingly the list turned out to be from a report which is freely available in the public domain ... "

Don't you hate when that happens? Specifically, the source is a December 2000 report by Andrea Millwood-Hargrave, "Delete expletives?", which identifies itself as "Research undertaken jointly by the Advertising Standards Authority, British Broadcasting Corporation, Broadcasting Standards Commission and the Independent Television Commission". It's more recent and more scholarly than George Carlin's famous 1973 "Filthy Words" monologue, but not as funny. Also, three of Carlin's "seven words ... that will curve your spine, grow hair on your hands and maybe, even bring us, God help us, peace without honor" are unexpectedly missing from Millwood-Hargrave's list of 28, as are three of the four that he added in a later installment. I'm not sure whether this is because of differences between British and American norms, or simply due to limited survey funds.

This raises again the difficult question of how to inculcate and enforce linguistic taboos without violating them. Cultures down the eons have managed this, but it's not always an easy trick.

Posted by Mark Liberman at 11:19 AM

March 28, 2006

Why comics avoid the name "Clint"

According to Joseph Gerharz, at least one of these "has been floating about for years on the net, decades in print. I first saw it back in 1982 at a comic con; and was told there was an actual policy against using the name Clint in comics after this was published."

[Don Porges writes:

I've heard "never name your character Clint Flicker". I am sure I will be but one of many to point out that there is at least one semi-major Marvel character named Clint; the archer Hawkeye is/was Clint Barton (since the late 1960s). He is currently deceased, although I think I read something that implied that his status could change soon. (Look, it's _comics_.) Although in the "alternate" Marvel continuity, the "Ultimate" line, he's alive.

By the way, this becomes less of an issue with the advent of mixed-case lettering, computer lettering, and better ink and paper to avoid the letters bleeding into one another.]


[And Eric Bakovic writes:

Sometime between 1984 and 1987 there was a magazine cover story on Clint Eastwood, and in big block letters on the cover it said "CLINT!" I saw it in a magazine rack at my school, and the very bottom of the letters were obscured in such a way that the relevant visual effect was accentuated. (I believe it was Newsweek ... it wasn't Time, because I just checked their back issue covers; Newsweek's website doesn't provide any, but Time has 'em all the way back to 1923.)

Of course I remember all this detail because I was somewhere between 13 and 16 at the time, and naturally found it hilarious. I may have even gotten kicked out of class for laughing out loud or pointing it out or something.]


[Mike Albaugh notes that "the label for the candy 'Flicks' was changed from all upper-case to mixed-case in about 1971".]

The ludic impulses of science writers

I've noted before that in certain domains writers are inclined towards all kinds of playing with language: in advertising, headlines for feature stories, and titles of porn flicks (mentioned here), and also generally in science writing for a popular audience (mentioned here), among other places.  In those postings from last October, I focused on playful variations on formulaic language, but formulaic language often appears "straight" (though in such a way as to call attention to itself), and there's a lot of phonological playfulness -- rhyme, alliteration, assonance, transposition, etc.

In the April 2006 Scientific American, an article by George Musser starts out (on p. 18) with a bang: a self-conscious bit of formulaic language in the head, and then a wonderful phonological transposition in the second sentence (which is followed by a pretty good metaphor).

The head:

The Check Is in the Mail

This is a kind of quotation, of one of three Dubious Reassurances in a joke.  (In the version I know best, it's the only one that's not sexual in content.  Instead, it's about money.)  As the subhead makes clear, the story is about sending money, though not by checks in the mail (so the quotation calls attention to itself):

Does the money migrants send home do any good?

On to the text:

If there is any political issue that could use a dose of scientific rigor, it is migration.  U.S. immigration policy is widely regarded as a total mess, the European melting pot produces pelting mobs, and all over the world tall fences have been constructed to keep facts from entering the debate.

As a sometime student of speech errors, I really appreciated the Spooneristic "melting pot... pelting mobs".  As a sometime student of imperfect rhymes and imperfect puns, I appreciated the phonological transposition even more: not the simple "melting pot... pelting Mott (/mat/)" (which would of course make no sense in this context), but the imperfect transposition in "mobs" /mabz/.  Cute.

Then comes the reference to tall fences being erected, which at first we're likely to take to be about the literal tall fences going up along various borders -- surely they're being alluded to -- but then turns out to be (metaphorically) about barriers excluding information.

Well, the whole performance gave me a moment of pleasure.  Maybe I'm just easily entertained.

The adjective "the"

Geoff laments that Senate aides or Boston Globe columnists don't know their basic lexical categories, but it's actually worse than that. Of all people (other than professional linguists), you'd think that high-school English teachers would know basic grammar, but they don't. In the past few years, two of my friends in British Columbia have retreaded as secondary school teachers. Both of them majored in English as undergrads and are now teaching English or mostly English. One of them took a couple of linguistics courses as well as a year of Carrier language, so she actually knows something about grammar. When she did her practicum, she reported in dismay that one of the regular English teachers was teaching that "the" is an adjective and was not to be persuaded otherwise. The knowledge of the other English teachers was no better, but such questions didn't arise because they never expressed themselves on grammatical questions at all.

My other friend had had no linguistics and in fact had to take a Structure of English course in order to get into the teacher training course. She took it as a distance course, where the students interact with each other and the instructor via email and a wiki. I was often unable to help her when questions arose because her course materials were so vague as to make it unclear what was intended. When I could work out what they wanted, I often found that it was outright wrong. From everything I've heard, British Columbia is not particularly backward in this respect.

Posted by Bill Poser at 02:04 PM

Carrier Scrabble

That Dakota Scrabble set looks pretty nifty, but I have to point out that the idea of playing Scrabble in endangered languages has a longer history. Back in 1994 we did this for Carrier, and yes, we recalculated the letter frequencies and values for Carrier. The program we used to do this can be obtained here. Since we didn't have any money to speak of and didn't want to fuss with getting permission from the owners of the game, we made our sets by buying English Scrabble sets and printing labels with the letters and their values on sheets of sticky-backed paper, which we used to re-label the tiles from the English Scrabble sets.

For a short time I was Carrier Scrabble champion, though I'm afraid that this was due to the fact that I knew how to play rather than to my superior knowledge of the language. The Yinka Dene Language Institute web site has a page explaining what we did.

Posted by Bill Poser at 12:45 PM

Uptalk is not HRT

Ben Zimmer recently pointed me to an interesting article by Stefanie Marsh in the Times of London about "The rise of the interrogatory statement" (3/28/2006), which cites a couple of Language Log posts. These posts connected Marsh to some sociolinguistic research (published in the International Journal of Corpus Linguistics in October of 2005) that she probably wouldn't have found otherwise, and thereby influenced her conclusions.

It's nice to see this example of a weblog mediating between the scientific literature and the popular press, and so I'm going to add another attempt to influence the Anglosphere's ongoing virtual conversation about what Marsh's headline calls "the interrogatory statement". "Uptalk", invented by a journalist in 1993, is a good term for the practice of ending assertions with rising pitch. "High rising terminal" or "HRT", invented by linguists, is a bad term, making a false claim about the phonetics of the phenomenon. It should be abandoned.

The Wikipedia article on "High rising terminal" (to which a search for "uptalk" redirects you, alas) says:

Towards the end of the statement (the terminal), the intonation starts high and rises.

This is empirically false for many instances of uptalk whose pitch tracks I've examined. Uptalk often starts low, at the bottom of the speaker's range. I believe that the "high rising" idea came out of a contested 1990s theory of intonational meaning, which posited a qualitative distinction between high rises and low rises, and assigned uptalk to the category of "high rise" for theory-internal rather than empirical reasons. It's also possible that some geographical variants of uptalk are really high rising in general, though I haven't seen any careful studies that support this conclusion.

There are some pitch tracks and audio clips showing low-rising uptalk in the Language Log posts that Marsh cites, here and here. These are not examples of stereotypical Moon Unit Zappa uptalking -- (part of) the point of those two posts was to document the use of final rises on assertions in contexts where most people don't notice them. I don't have sound clips of more stereotypical uptalk at hand, but I'll dig some up and present clips and pitch tracks in a later post. For now, I'll just give some more terminological history and a few comments on Marsh's Times column.

The term "uptalk" was coined by James Gorman in an On Language column "Like, uptalk?":

I used to speak in a regular voice. I was able to assert, demand, question. Then I started teaching. At a university? And my students had this rising intonation thing? It was particularly noticeable on telephone messages. "Hello? Professor Gorman? This is Albert? From feature writing?"

I had no idea that a change in the "intonation contour" of a sentence, as linguists put it, could be as contagious as the common cold. But before long I noticed a Jekyll-and-Hyde transformation in my own speech. I first heard it when I myself was leaving a message. "This is Jim Gorman? I'm doing an article on Klingon? The language? From 'Star Trek'?" I realized then that I was unwittingly, unwillingly speaking uptalk.

I was, like, appalled?

Rising intonations at the end of a sentence or phrase are not new. In many languages, a "phrase final rise" indicates a question. Some Irish, English and Southern American dialects use rises all the time. Their use at the end of a declarative statement may date back in America to the 17th century.

Nonetheless, we are seeing, well, hearing, something different. Uptalk, under various names, has been noted on this newspaper's Op-Ed page and on National Public Radio. Cynthia McLemore, a University of Pennsylvania linguist who knows as much about uptalk as anyone, says the frequency and repetition of rises mark a new phenomenon. And although uptalk has been most common among teen-agers, in particular young women, it seems to be spreading. Says McLemore, "What's going on now in America looks like a dialect shift." In other words, what is happening may be a basic change in the way Americans talk.

[NYT 8/15/1993]

Gorman's column is smart and accurate as well as funny. He expresses the stigma of uptalk, but removes the nastiness by being appalled at his own usage. And uptalk is an excellent coinage: short, clear and memorable. It prospered in the linguistic marketplace, as it deserved to do, and got an entry in the fourth edition of the American Heritage Dictionary:

A manner of speaking in which declarative sentences are uttered with a rising intonation as though they were questions.

Many people are still just as appalled by uptalk as Gorman was. Stefanie Marsh's article begins:

My sister lives in Los Angeles, and has picked up this irritating verbal tic, “uptalk”, which means that she uses an interrogative tone even when making statements such as: “I never want to talk to you again (?)” In the old days I could pretend to listen to her on the phone while actually reading a book — I would do this by keeping one ear on her intonation and lobbing a well-placed “in principle, I would say yes”, after every one of her high notes. These days when I do that she sighs and says flatly: “It wasn’t a question, Stefanie.”

Ms. Marsh was moved to write because Jason Horowitz's mean-spirited take-down of "City Girl Squawk" in the New York Observer mentioned uptalk. She shares his irritation with this "verbal tic" as well as his low opinion of its female vectors:

But uptalk has spread far beyond California and the dur-brained Valley Girls who are supposed to have invented it. An article in last week’s New York Observer confirms that “high-rise terminals” have infected the East Coast, while psychology professors writing in the Toronto-based Globe and Mail talk of an “epidemic” in Canada. We won’t even talk about Australia.

Note the repeated infectious disease metaphors. The low status of the people whose usage is noticed -- students, women, benighted colonials -- is an interesting example of what Arnold Zwicky has called the "out-group illusion":

... people pay attention selectively to members of groups they don't see themselves as belonging to and so locate [novel linguistic] phenomena as characteristics of these groups.

Since non-uptalkers often think that uptalk rises are really meant as questions, and assign low status to the people they associate uptalk with, it's natural to conclude that uptalk must be a signal of self-doubt and need for reassurance. As Marsh explains

...the view held by experts was that uptalk was a symptom of self-doubt: framing your statements as questions was thought to indicate a desire for approval. Research by DiResta Communications in 2001 found that uptalk was “destroying the credibility of millions of professionals who are unknowingly falling victim to this increasingly common form of speech”. DiResta claimed that uptalk was the result of having either foreign parents or low self-esteem. The bottom line was that nobody could take you seriously as a boss when you pronounced “You’re fired!” as “You’re fired?”

Similar theories were commonly expressed during the original flurry of uptalk discussion in the early 1990s. But thanks to the internet, contrary (and I think more accurate) opinions are now more easily accessible. Marsh continues:

New studies show that people who use uptalk are not insecure wallflowers but powerful speakers who like getting their own way: teachers, talk-show hosts, politicians and facetious shop assistants.

Mark Liberman, a phonetician at the University of Pennsylvania, who has been monitoring George W. Bush’s speeches on his fascinating weblog Language Log, points out that the President has started peppering his Iraq speeches with HRTs. Why? Not, apparently, because Bush’s confidence is failing him. Rather, it has more to do with an aggressive need to direct conversation. Liberman quotes from a linguistics paper published last year in which scientists counted the number of HRTs used in real-life conversations: “In four business meetings . . . the chairs (sic) used rise tones almost three times more often than the other participants did.

“In conversations between academic supervisors and their supervisees, the supervisors used rise tones almost seven times more often than the supervisees. So maybe the problem with ‘Valley Girls’ and other youth of the past couple of decades,” continues Liberman, “is really that they’re, like, totally self-confident and socially aggressive?” This news seems to have percolated down to primary schools ages ago. Parents: you are being had.

[Update: Kathe Burt writes (from Oregon):

I have friends who talk this way, and it seems to me that they *do* expect me to say "uh-huh" or something else vaguely positive when they pause after the rise in tone. If I don't, they think I'm not listening.

As I understand it, uptalk is often (intended and understood) as an invitation for the interlocutor at least to signal attention and perhaps also to assent.

The key thing is that "uptalk" is not signaling a question, in the literal sense of a request for information about the truth of the proposition being presented; nor does it (usually) mean that someone with low self-confidence is making a plea for reassurance. Rather, the studies suggest that it's usually someone who feels in control of the interaction and is inviting a response, as evidence that the interlocutor is going along.

But there are quite a few reasons for final rises in (most forms of) English: the initial if-clause of a conditional or the first option of an exclusive disjunction is often rising; lists may be presented with rises on their non-final members; and of course yes-no questions are stereotypically performed with final rises.

Some dialects apparently use final rises as the default option, or at least much more often than speakers of other dialects expect. This is apparently true of Belfast English, for example -- and something similar has apparently been happening with the world-wide spread of uptalk over the past couple of decades, at least in the sense that some people have come to use final rises much more often. It's possible that the thin edge of the uptalk wedge, so to speak, has been the "are you with me?" rise. Pretty much all English speakers use this sometimes, or at least can do so if they choose to. But if someone chooses to do this almost all the time, then its force fades with repetition, and perhaps in some cases becomes almost totally bleached out.

It's also worth mentioning that the form of final rises can vary a lot. The starting point and ending point can move around, with respect to the speaker's pitch range and also with respect to the immediately preceding material. The rate of rising and the alignment with the words of the message also vary. It remains unclear, in my opinion, which aspects of this variation are phonetic dimensions that speakers can choose to deploy to a greater or lesser degree, like the choice of how fast or loud to talk, and which aspects involve qualitatively different alternatives, like the choice between two different words. So if you're interested in prosody, communication and dialect variation, this is a great topic to study.]

Posted by Mark Liberman at 10:30 AM

Dangerously incompetence

From the Boston Sunday Globe last weekend, March 19 (sorry, I'm behind with the blogging), on page A8, from Nina J. Easton's politics column "The Briefing", about the Democrats deciding to start a new campaign of using the phrase "dangerous incompetence" when talking about President Bush:

The one-time amateur boxer [Senate Democratic Leader Harry Reid] was eager to give the incompetence label "an edge," said a Reid aide, Jim Manley, and "dangerous" was the winning adverb...

Winning adverb? It's an adjective! What the hell is going on in a culture where someone can hold a job as either a Senate aide or a Globe columnist (we can't tell which from the above quote, because we don't know whether Nina Easton was quoting or paraphrasing) without knowing the basic lexical categories of grammar from one another, and a man can get a reputation as a business writer when he can't do fractions? Is it just us at Language Log Plaza who think this is almost unbelievable?

(Mark my words, some harrumphing old fool is going to email me now about how I should have said "Is it just we at Language Log Plaza..." That would certainly be a possibility in formal style. I'm chatting with you now in informal Standard English. Informal isn't wrong. Informal is friendly. And it isn't non-standard. "I don't got none" is non-standard. "Is it just us" is normal. Trust me. Why would I lie to you about these things when I could easily have written "Is it just we" if I had decided to go formal? I didn't go formal because you're a Language Log reader, so you're like family.)

Incidentally, the adverb is the word that Senator Debbie Stabenow put up beside her on a big red sign saying "DANGEROUSLY INCOMPETENT" during a speech in the Senate after the campaign was launched. Adverbs can be used as pre-head modifiers in adjective phrases, you see. She got that right. Only she made the mistake of wearing a bright red suit in just the same color as the sign, so when she stood beside it and the Republicans saw what it looked like... Oh, dear, the Democrats just can't do this image and media stuff, can they? And having George Lakoff advise them won't help.

They should hire me to lead them. "Listen," I'd say: "Number one, learn to distinguish adjectives from adverbs. Just get it right. Noun, verb, adjective, adverb, preposition... learn those for a start. Don't be like Jon Stewart. Don't make us look ignorant. And number two, if you're going to illustrate a speech by displaying a big sign saying ‘DANGEROUSLY INCOMPETENT’ or ‘DIMWITTED GOOFBALL’ or ‘INTERN GROPER’ or ‘MENDACIOUS PONTIFICATING OLD WINDBAG’ or anything, that's fine, but just don't get photographed beside it, OK? Now let's go fight for a better future for America. And hey: be careful out there."

Posted by Geoffrey K. Pullum at 09:52 AM

Dakota Scrabble, anyone?

Via Patrick Hall's Blogamundo comes news of Scrabble being used to promote the learning of Dakota Sioux. Patrick wonders if the makers of the game "rebuilt distribution and points on the tiles to correspond to Dakota frequencies." I don't know about that, but the original article about the debut of Dakota Scrabble (shortened for the AP wire) notes that the tiles have all the appropriate orthographic markings, as each set is hand-crafted by tribal members. Organizers got backing from Hasbro to make the Dakota version of Scrabble (along with the 207-page Official Dakota Scrabble Dictionary), and they also received support from the Association on American Indian Affairs and Sisseton-Wahpeton College. The AAIA and the college have supported previous efforts to revitalize the language, such as the recording of Dakota rap songs last year.

[Update 3/29/06: Tammy DeCoteau of the AAIA posted a comment on Blogamundo saying that the distributions and point values of letters were based on the words in the Official Dakota Scrabble Dictionary. So that answers Heidi Harley's question about whether letter distributions were determined by corpus frequency or lexicon frequency.]
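The difference between lexicon frequency and corpus frequency can be made concrete with a small sketch. The Python below is purely illustrative: the words and token counts are made-up English stand-ins rather than Dakota data, and the function name `letter_frequencies` is hypothetical. It also treats every character as one letter, which a real orthography with digraphs or diacritics might not allow. Counting each dictionary word once gives the lexicon frequency; weighting each word by how often it occurs in running text gives the corpus frequency.

```python
from collections import Counter

def letter_frequencies(word_counts):
    """Return relative letter frequencies, weighting each word by its count."""
    totals = Counter()
    for word, count in word_counts.items():
        for letter in word:
            totals[letter] += count
    grand_total = sum(totals.values())
    return {letter: n / grand_total for letter, n in totals.items()}

# Hypothetical toy data: a tiny "dictionary" with invented token counts.
corpus_counts = {"the": 50, "cat": 3, "hat": 2}

# Lexicon frequency: every dictionary entry counts exactly once.
lexicon_freq = letter_frequencies({word: 1 for word in corpus_counts})
# Corpus frequency: each word counts as often as it occurs in text.
corpus_freq = letter_frequencies(corpus_counts)
```

In this toy example the letter "e" is far more prominent by corpus frequency than by lexicon frequency, because it occurs only in the one very common word. A tile set based on the dictionary, as described above, follows the first calculation.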

March 27, 2006

Of silos and stovepipes

The Mar. 27 Wall Street Journal has an article filling readers in on the very latest business buzzwords (available online here, but only for subscribers). Among such jargon as delayering (firing managers) and Sox (shorthand for the Sarbanes-Oxley corporate governance act) is one morphologically interesting item:

Another current buzzword, "unsiloing," mangles the noun silo to make an important but simple point: Managers must cooperate across departments and functions, share resources and cross-sell products to boost the bottom line.

I'm afraid I don't quite see how the coinage unsiloing "mangles" the root form silo. Is the problem with turning a noun into a verb? Yes, yes, we all know Calvin's dictum that "verbing weirds language," but noun-to-verb enallage is so commonplace in corporate-speak — from leveraging to impacting to solutioning — that I'd think another example wouldn't even raise an eyebrow at the Journal.

Perhaps it's the creation of a verb with the un- prefix that is seen as somehow hurtful to poor silo. But the formation un-X, meaning 'to remove X from (something/someone),' has been a source of lexical innovation in English for many centuries: think of unbandage, uncap, uncloak, uncrown, unmask, unpeople, and so forth. Typically this type of un-X formation mirrors another noun-derived verb X, meaning 'to put X in/on (something/someone),' though it's also possible to create un-X neologisms without a preexisting positive form. The Oxford English Dictionary entry for un- includes such literary examples as unblouse from James Joyce ("Miss bronze unbloused her neck") and undogcollar from Dylan Thomas ("Esau,.. undogcollared because of his little weakness, was scythed to the bone one harvest by mistake").

In the case of unsilo, there is indeed a positive form without the un- already in use among management types, though the two verb forms do not always reflect the semantic symmetry of such lexical pairs as mask/unmask, people/unpeople, etc. The verb silo, along with the participial adjective siloed and the verbal noun siloing, relies on the metaphorical sense of silo as a rigidly defined "vertical" organizational structure or communication channel, also known by another metaphor, stovepipe. (See Grant Barrett's Double-Tongued Word Wrester entry for further silo nuances.) The earliest example I've found for the verbal noun siloing is from Thriving on Chaos: Handbook for a Management Revolution, a 1987 book by Tom Peters (the 1988 paperback edition is searchable on Amazon and Google Book Search):

Rip apart a badly developed project and you will unfailingly find 75 percent of the slippage attributable to (1) "siloing," or sending memos and minutes up and down vertical organizational "silos" or "stovepipes" for decisions...

Note that this use of siloing doesn't mean 'putting a silo in (something)' or 'putting (something) in a silo' (the OED has non-metaphorical uses of transitive silo, as in "siloing grass," back to 1883). Rather, siloing here is better defined as 'communicating up and down an organizational silo.' This sense closely emulates the development of the verb stovepipe, which the New Oxford American Dictionary defines as 'transmit (information) directly through levels of a hierarchy.' We have heard much in recent years about the "stovepiping" of faulty intelligence directly to the White House from Bush loyalists, as in Seymour Hersh's 2003 New Yorker piece, "The Stovepipe."

Organizational silos and stovepipes are almost always discussed in disparaging terms, as hindrances to the efficient management of a company (or a school, or a health-care system, or a presidential administration). So it's not surprising that a term has developed for the breaking down of the silo structure in order to promote "horizontal" cooperation across an organization. Unstovepiping is a bit unwieldy, so that leaves unsiloing. (One could also imagine desiloing, like the dewatering of New Orleans, though that doesn't seem to have caught on.) Similarly, corporate leaders would prefer their organizations to be unsiloed rather than siloed, to use the participial adjective forms. Here are a few recent citations of the verb from the Factiva database:

Wall Street & Technology, 1 June 2004
There seems to be no shortage of trends and technologies to focus on, from algorithmic trading to unsiloing the institution.

Delaney Report, 26 Sep 2005
Agencies siloed these disciplines, while some unsiloed them, putting them all under one roof in a bid for greater efficiency, control, profitability and a higher level of cross-disciplinary teamwork.

Oxford Industries Earnings Conference Call, 6 Oct 2005
I think we've actually tried to unsilo the business with this strategy, and really bring the common merchandising, sourcing and back ends of what was the individual slack company and the individual shirt company into one group.

Getting rid of those nasty silos and stovepipes is no doubt easier said than done. But one can visualize management consultants spreading the gospel of unsiloing, perhaps with PowerPoint demonstrations that make the silos magically disappear from organizational charts. The Wall Street Journal quotes David D'Alessandro, former CEO of John Hancock Financial Services, as saying that unsiloing is especially attractive to new CEOs: "Suddenly they're in charge and they want everyone to play together nicely in the sandbox." That's some impressive metaphor-muddling there: everybody out of the silo and into the sandbox!

[Update: See Semantic Compositions for further thoughts on the silo/stovepipe metaphor, as well as the competing metaphor of the pipeline.]

Lexical drift (2)

Here's another example of surprising and idiosyncratic drift in word meaning. I've been reading quite a few of the Alex Delaware novels by Jonathan Kellerman — police procedurals about hunting down murderers in Los Angeles. Psychologist Alex Delaware and his best friend the gay detective Milo Sturgis are always having long and complex discussions about how much the current evidence favors this or that suspect. And Milo will often say, for example, "So, you like the husband now?" — meaning (and I have no idea how it was that I could see this instantly), "So, you now favor the hypothesis that the husband is the murderer that we seek?" The verb like has taken on a new sense where A likes B means "A favors the hypothesis that B is the culprit." See how that works? Maybe the new sense will catch on more widely, maybe it will be limited (or is limited) to police talk, maybe it will never spread much; we don't know, and we can't predict.

I have a feeling (although my knowledge of police procedurals and police talk is quite limited) that I may have encountered similar instances elsewhere over the past years or decades, and that is more likely to be right than wrong (beware the recency illusion). In fact [update over the following few hours] email is already piling up: Jan Freeman tells me that this use of like was common on NYPD Blue; Adrian Riskin thinks it may go back at least as far as Damon Runyon; and James Nye and Douglas Davidson and Steven Keiser and Nathalie Hecker and all sorts of people point out that a very similar usage is found in sports (Which horse do you like in the fifth?; Who do you like for the Superbowl?), and that was very probably the seed from which the police usage grew. Final suspect as winner of a tournament of evidence, like meaning "favor as the probable final pick."

It is just possible that I may have said this before, but perhaps it will bear repeating: word meanings are imperceptibly shifting all the time in surprising ways. It's like continental drift, only less predictable.

Posted by Geoffrey K. Pullum at 09:47 AM

Lexical drift (1)

Kalim Kassam pointed out to me an article in The Guardian with an interestingly extended sense of the word gaydar. The topic is the goth subculture:

Louise (she prefers not to give her surname) works in credit risk in Leeds. Aged 34, she got into goth music 17 years ago and now has tickets for the upcoming Sisters of Mercy tour. She reckons about "four or five people" at her workplace are former goths. "There's a kind of gaydar that lets you spot them."

You see how that works? The extension is from "subtle, quasi-extrasensory ability to detect that someone is a member of the gay subculture" to "subtle, quasi-extrasensory ability to detect that someone is a member of a subculture". That's a fairly typical example of the weird and wonderful processes of lexical semantic change.

It seems new, but Kalim is fairly sure he's heard similar instances before, and he's very likely to be right. He's aware of the recency illusion.

Word meanings are imperceptibly shifting all the time in surprising ways. It's like continental drift, only less predictable.

Posted by Geoffrey K. Pullum at 09:44 AM

March 26, 2006

The Origin of Redskin

The controversy over the Washington Redskins trademark has attracted considerable attention, here and elsewhere. We have had quite a few previous posts about this. It began with a petition by seven American Indian activists led by Suzan Harjo in 1992 to the Trademark Trial and Appeal Board of the US Department of Commerce requesting cancellation of the trademark on the grounds that the word redskin

was and is a pejorative, derogatory, denigrating, offensive, scandalous, contemptuous, disreputable, disparaging and racist designation for a Native American person

In 1998 the Trademark Trial and Appeal Board decided in favor of the petitioners and cancelled the trademark. Pro Football, Inc. appealed to the United States District Court, which in 2003 overturned the decision of the Trademark Trial and Appeal Board and reinstated the trademark. It gave several grounds for its decision:

  • that there was an absence of evidence that the term redskin is disparaging in the particular context of the name of the sports team;
  • that the TTB did not sufficiently articulate its inferences and explain how it decided between competing pieces of evidence. In particular, the District Court was critical of the fact that the TTB ruled on the basis "of the entirety of the evidence" but did not review that evidence in any detail and made few findings of fact;
  • that the petitioners' claim was barred by the doctrine of laches, which provides that a right or claim should not be enforced if the long delay in asserting it puts the respondent at an unreasonable disadvantage. In this case, the Court held that opposition to the mark should have been asserted when the mark was issued in 1967 or shortly thereafter and that the delay of twenty-five years was unreasonable.

The case was appealed to the Court of Appeal for the District of Columbia Circuit. In its 2005 decision, the Court of Appeal held that the doctrine of laches did not in principle bar the suit of one of the petitioners, Mateo Romero, the youngest, because he was only one year old in 1967 when the trademark was registered. (In US federal law, the clock for laches starts when the petitioner reaches the age of 18.) It therefore returned the case to the District Court for further consideration of whether laches should bar the suit on the part of Mateo Romero.¹ The Court of Appeal did not address the question of whether there was sufficient evidence that redskin is disparaging in the context of the name of the sports team because there is no need to decide that question if the suit is barred by laches.²

Although the main topic I want to discuss is a linguistic one, I've reviewed the legal history because I think that much of the discussion of the case has been rather misleading. To a large extent the decisions of the courts have focussed on the "technicality" of laches, not on the question of whether redskin is disparaging. The District Court did not simply ignore overwhelming evidence as some commentators suggest. Indeed, even in its holdings on the disparagement issue, the District Court's criticisms of the TTB were that it did not sufficiently address the question of whether redskin is disparaging in the context of the name and that the TTB did not make sufficient findings of fact. And in overturning the District Court, the Court of Appeal made no judgment whatever as to whether redskin is disparaging. Its decision dealt exclusively with laches. In short, the decisions of the courts have been concerned largely with technical questions, not with the linguistic issues.

I think that it is well established that redskin is taken by most people today to be disparaging. What is more interesting is whether it has always been so, as Harjo et al., as well as various others, claim. One interesting piece of evidence is the origin of the name Washington Redskins. In 1933, George Preston Marshall, the owner of the team, which was then located in Boston, renamed it the Boston Redskins in honor of the head coach, William "Lone Star" Dietz, an American Indian.³ When the team moved to Washington in 1937 it was renamed the Washington Redskins. George Marshall clearly did not consider the name disparaging.

The term redskin of course goes much farther back than 1933. The details of this history have recently been explored by Ives Goddard of the Smithsonian Institution, in a paper conveniently available on-line. Some of the evidence is available in greater detail on Goddard's web site. You can read speeches by the Meskwaki chief Black Thunder and the Omaha chief Big Elk in which the expression redskin is used, and early nineteenth century examples of the Meskwaki usage of terms meaning redskin and whiteskin.

I won't review the evidence in detail because Goddard's paper is short enough and accessible enough that if you are interested you should read it yourself. I'll just summarize it. Goddard shows that the term redskin is a translation from Native American languages of a term used by Native Americans for themselves. Harjo's claim that it "had its origins in the practice of presenting bloody red skins and scalps as proof of Indian kill for bounty payments" is unsupported by any evidence.⁴ The term entered popular usage via the novels of James Fenimore Cooper. In the early- to mid-nineteenth century the term was neutral, not pejorative, and indeed was often used in contexts in which whites spoke of Indians in positive terms. Goddard concludes:

Cooper's use of redskin as a Native American in-group term was entirely authentic, reflecting both the accurate perception of the Indian self-image and the evolving respect among whites for the Indians' distinct cultural perspective, whatever its prospects. The descent of this word into obloquy is a phenomenon of more recent times.

The response to Goddard's paper is disappointing. Other than reiterating the unsubstantiated and implausible theory that the term owes its origin to scalping, Harjo and others have merely waved their hands, asserting that as Indians they know differently without presenting any evidence whatsoever. A typical example is found in this Native Village article, which quotes Harjo as follows:

I'm very familiar with white men who uphold the judicious speech of white men. Europeans were not using high-minded language. [To them] we were only human when it came to territory, land cessions and whose side you were on.

The only point here that even resembles an argument is the bald assertion that Europeans never spoke of Indians other than disparagingly. This is not true. Evidence to the contrary is explicitly cited by Goddard. What is more disturbing is that Harjo's primary response to Goddard is ad hominem: that as a white man what he says is not credible. Whether he is white, red, or green is of course utterly irrelevant, as thinking people have known since at least the Middle Ages. Goddard presents his evidence in detail, with citations to the original sources. You can evaluate it yourself, and you need not rely on his statements of fact but can, if you are willing to devote some time and effort, check out the sources yourself. Furthermore, without the slightest evidence Harjo imputes to Goddard not merely bias but racism, a charge which, based, as her own words reveal, entirely on racial stereotyping, merely reflects back on herself.

So, there you have it. On the one hand an utterly unsubstantiated and implausible theory advocated by Suzan Harjo, who exhibits no knowledge of the history of English usage of redskin, of American Indian languages, or of the early history of relations between Indians and Europeans. On the other hand a detailed account with numerous explicit citations to original documents by Ives Goddard, who has dedicated his entire life to the study of American Indian languages and the documentation thereof. It is always possible that some new evidence will be brought to bear, but for the present I don't think that there can be any ambiguity as to which is the more credible account.


¹ The District Court held that Romero's suit was not barred by laches simply as a matter of the length of time that had elapsed, since the cancellation petition was filed only seven years from the date of his majority, but might nonetheless be barred by laches if the delay of seven years put Pro Football at an unreasonable disadvantage. For this reason it is important to understand that laches is distinct from the statute of limitations. A suit is barred by the statute of limitations if there is legislation setting such a time limit. In contrast, laches is an equitable doctrine and is based on the principle that too long a delay is unfair to the respondent, not on any particular time limit.

² Similarly, because it overturned the TTB's decision on other, non-constitutional, grounds, the District Court never addressed Pro Football's arguments that section 2(a) of the Lanham Act, under which Harjo et al. sued, is an unconstitutional violation of the First Amendment right of free speech and the Fifth Amendment right of due process.

³ Harjo et al. question this story of the origin of the name, but as the Circuit Court noted (p. 13, footnote 6), they provide no evidence whatever to the contrary and give no convincing reason to disbelieve the primary source, a newspaper article presenting the account by Marshall's grand-daughter. Some authors have also claimed that Dietz was not an American Indian. The articles cited, however, do not cite their sources, so it is difficult to evaluate their claims. It is, however, undisputed that Dietz presented himself as an American Indian and that George Marshall publicly presented him as one. George Marshall surely thought that Dietz was an American Indian, which is really what counts here.

⁴ A point that has not, as far as I know, been mentioned in this context is that scalps or other body parts presented as evidence of kills would not, in general, have been red. As I can attest from personal experience with the processing of animals killed by hunters, mammalian blood is bright red when fresh but darkens quickly as it oxidizes. When dried it retains a dark red tinge if thin but in any thickness is black. Under most circumstances bounty hunters did not present their trophies for payment until days or weeks after the kill, by which time the blood would have been more black than red. The suggestion that such trophies would give a primary impression of red is due either to a false idea that they would usually have been presented when fresh or to a lack of familiarity with dried blood. A further difficulty with Harjo's hypothesis is that, although whites did indeed collect Indian trophies as evidence of kills, the popular image of scalping was and is that it was an activity engaged in primarily by Indians who mutilated the corpses of their white victims. There was therefore no reason to associate bloody trophies, red or not, with Indians. If anything, the association would have been with the white victims of scalping.

Posted by Bill Poser at 06:42 PM

Big hair in the blogosphere

In the April Vanity Fair there's an article by Marie Brenner, "Lies and Consequences: Sixteen Words That Changed the World", which features an interesting quote from Judith Miller. Here's the quote in context:

Back from Iraq in June 2003, Miller realized that she was losing her authority. She had worked with Bill Keller for years, and she admired his reporting. He was a fellow Pulitzer winner. But now Keller was in a sensitive situation. Miller would have to be reeled in. "You are radioactive," she says he told her. "You can see it on the blogs." "Why do you give a shit about the blogs?," Miller remembered asking Keller. "They do not know anything." (Keller responds: "I'm pretty sure I never said any such thing.")

Miller later told me, "The bloggers were without editing, without a way for people to understand what was good, what was well reported—to distinguish between the straight and the slanderous. Things would get instantly picked up, magnified, and volumized.… I was appalled, not by the blogs—that would be like getting appalled at the Industrial Revolution—but by my colleagues, who believed what they read on the blogs."

This being Language Log, the issue is not the invention of pretexts for war, but rather the invention of uses for words. Specifically: volumized.

The word volumize immediately makes me think of hair-care products, and the OED agrees:

2.b. Of a product or styling technique: to enhance the thickness of or give body to (hair or eyelashes).

1991 Los Angeles Times Mag. 3 Feb. (Mag.) 6/3 Magic shampoos that don't just deep-clean -- they restructure, humidify, bodify and/or volumize.
Canad. Living Feb. 83 (advt.) Fluffs up and volumizes natural curls or permed hair.
Evening Chron. (Newcastle) (Nexis) 15 Jan. 4 She applied one coat of architecture mascara and then some natural fibres to lengthen and volumise the lashes.

Checking for "volumized" on the web, I mostly find things like the advice on the "physique stylezone", which tells me that "to maximize volume" I should

Start by using Physique Volumizing Shampoo and Conditioner in the shower to begin creating dynamic air pockets between the strands of your hair. Follow with Physique Volumizing Hair Spray to add even more volume to your styles.

The evolutionary psychologists will blame it all on Darwin, I think, observing that "big hair" is a flamboyant symbol of health and reproductive vigor, communicating that the organism has plenty of biological resources to spare for a spectacular 'do. I suppose they have a point, since the hair-care industry makes a lot of money by helping us transform our hair so as to send the right message, and I don't think I've ever seen an advertisement for a product that promises to make hair seem thin or sparse.

Anthropologists will tell us a similar story, but about cultural rather than genetic evolution. And they've got a point, too, since "big hair" is an important value in some (sub-)cultures and not in others. There are several recent books on the subject: "Big Hair" by James Innes-Smith; "Big Hair: A Journey into the Transformation of Self", by Grant McCracken; and "Bald in the Land of Big Hair", by Joni Rodgers, which tells us that "the land of big hair" is Houston, Texas. I can't claim any expertise in geo-socio-cosmetology, but I do have the impression that big hair is a sunbelt rather than snowbelt kind of thing.

All this is background to an observation about creative metaphors. I don't think I've ever heard "volumized" used before with respect to stories or ideas; but from Miller's quote I immediately knew what it would mean to "volumize" a story by circulating it in the blogosphere, and I imagine that you got the same idea that I did. The obvious meaning: a thin story was being fluffed and teased and generally puffed out with "dynamic air pockets" to seem bigger than it was.

But I think there's a regional and social tinge here as well. In the red states -- some of them anyhow -- big hair is valued among the dominant social classes. Judith Miller's pictures suggest that she's not a big hair type of person, and I believe that this reflects the values of her peers. In New York City and the rest of the northeast, big hair is a working-class or ethnic style. Thus a discussion of "Black Cadillacs, Big Hair, and Pinkie Rings: Dressing (and Speaking) the Part for The Sopranos" defines "big hair" as

A popular hairstyle for women—think teased out, dyed, and hair-sprayed until the hair reaches several inches in height. Considered very sexy and feminine by certain females living in North Jersey and Long Island (note: big hair is not complete without very long fake fingernails lacquered and painted with designs at least once a week).

(For European readers, New Jersey is to New York City roughly as Belgium is to France -- a place that can get a laugh just by being mentioned. I say this as someone who lived in New Jersey for 15 years.)

Thus volumizing is not only a form of false presentation, it's a tacky, provincial, lower-class form of false presentation. As you'd expect from those unedited red-state bloggers...

It wouldn't be right to leave this topic without observing that verbs in -ize have been a traditional bugbear for prescriptivists. Edwin Newman used to rail against hospitalize, and the Plain English Campaign is still making a fuss over prioritize. This adds another tinge of ironic disdain to Miller's complaint.

That's a lot of meaning to pack into one little neologism. You may think I'm overinterpreting, but in my opinion, this is just the usual poetics of everyday talk.

Anyhow, Miller's communicative virtuosity was in vain. Brenner's article continues:

But it wasn't just the blogs. By then a platoon of reporters, including Seymour Hersh in The New Yorker, had pounced on the issue of the missing W.M.D. Soon the criticism rose to a critical mass. Miller and the Times's Baghdad-bureau chief, John Burns, another Pulitzer winner, had had an acidic exchange over Chalabi, one of her longtime sources, which was picked up by The Washington Post's Howard Kurtz. "If reporters who live by their sources were obliged to die by their sources, New York Times reporter Judith Miller would be stinking up her family tomb right now," Slate's Jack Shafer would later comment.

Posted by Mark Liberman at 03:38 PM

March 25, 2006

Don't look back: the old geezers may be gaining on you

I've been reading a book called Ageism: Stereotyping and Prejudice Against Older Persons, edited by Todd D. Nelson (MIT Press 2004). I confess that this topic is important to me because a few years ago I graduated from middle age to the Old Guy category myself. Most of the chapters are written by psychologists who have been studying ageism for the past 30 years, but I was especially interested in chapter six, "Ageism in the Workplace: A Communication Perspective," by Robert McCann and Howard Giles, because it's the only chapter that deals with language. Well, it deals with words anyway.

Forensic linguists don't tend to share much about the cases they've worked on, but I think it's safe to say that we haven't had very many age discrimination lawsuits, even though the major evidence in such cases is likely to be found in written documents. I once worked on a case brought by an executive in one of America's huge corporations. He was dumped while still in his early 50s. Oddly enough, his employment history showed only very positive annual employee evaluations. Nothing bad was reported, nor would anybody above him on the corporate ladder explain to him why he was dismissed. So he took the matter to a lawyer who then asked me to help him find evidence of age bias.

One huge problem was that the company's president made it a practice never to write memos, letters or anything else that could be used to show discrimination. There were also precious few things written by the lower layers of executives. It's hard to know if there is a Department of Shredding at that company but one is allowed to have one's suspicions. I did manage to find some journal articles that quoted the company's president and in them I found a number of the same or similar expressions used by employers in past age discrimination cases. These included inflexible, rigid, cautious, cranky, old goat, long in the tooth, unadaptable, out of touch, close minded, and focused on the past. Naturally, these expressions helped win judgments against employers.

I discovered most of these and a few others through electronic searches of about a hundred journal articles that had been written about the company. Fourteen of the articles contained quotes attributed to the president and other high-level executives. I also located 4 speeches that the president had given at various college business schools, but I could locate only one internal company memo outlining the type of person the company wanted to fill an open executive position. From all of these I isolated the following words and expressions used to describe a candidate that the company considered an undesirable manager: old, close to the vest, anchor of the past, yesterday's manager, weak, long years of service, sat in the right chairs, plateaued out, ruled by tradition, cautious, slow, conformist, organization man, rooted in yesterday, tied to the past, bureaucratic, faint of heart, and, believe it or not, loyal.

In contrast, attributes that the company considered desirable in an executive included the following: a gut person, adapts to change, tomorrow's leader, lean, agile, fast, able to grow, creative, driven, tries the new, fresh, impatient, and irreverent.

Obviously I used these expressions in my report. After all the expert witness reports on both sides were submitted, the case was settled without going to trial. The judgment was sealed so I'll never know for sure what the plaintiff got out of it but the lawyers I worked with told me that they were very happy with the result.

Ageism is indeed shaped by language. Nothing magical happens at age 65. I am one of those cautious, slow, loyal old goats myself, so I was heartened by one of the statements McCann and Giles made in their chapter: "There is little change in intellectual function for individuals throughout adulthood except in matters pertaining to speed and reaction time...brain activity in healthy people in their 80s is comparable to that of people in their 20s."

Yet employers' common reluctance to hire and retain workers over the age of 55 remains, usually justified by the mental demands of work, a supposed inability to cope with change, and resistance to new technology. Age discrimination in the workplace can be a treasure trove of linguistic research. But watch out, you fresh, agile, lean, driven young people! If the true value of an older worker is ever fully recognized, Satchel Paige's advice may be relevant: "Don't look back, somebody might be gaining on you."

Op-Ed arithmetic at the NYT

In a 3/25/2006 NYT Op-Ed piece "A Fine that Fits the Crime", Gary Weiss describes a recent $250-million settlement agreement between the SEC and Bear Stearns:

When compared to what most people would define as "income," the penalty shrinks to a hair under one-fifteenth (one-14.6th, to be exact) of Bear Stearns's gross revenue of $3.6 billion during the first quarter.

But 1/15 is .0666... while 1/14.6 is .0685... And 250,000,000/3,600,000,000 is actually exactly 1/14.4, which is 0.0694... So the penalty is not "a hair under one-fifteenth of Bear Stearns's gross revenue", but rather 4.2% greater than one-fifteenth of Bear Stearns's gross revenue, at least given the figures cited.
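For readers who want to check the fractions themselves, here is a minimal sketch in Python. It uses only the two figures cited in the Op-Ed (a $250 million penalty and $3.6 billion in quarterly revenue); the variable names are mine, chosen for illustration.

```python
from fractions import Fraction

# Figures as cited in the Op-Ed.
penalty = 250_000_000
revenue = 3_600_000_000

share = Fraction(penalty, revenue)            # exact ratio, reduces to 5/72
denominator = float(1 / share)                # the N in "one-Nth"
excess = float(share / Fraction(1, 15) - 1)   # relative amount above 1/15

print(f"penalty / revenue = {float(share):.4f} (one-{denominator:.1f}th)")
print(f"1/15   = {1 / 15:.4f}")
print(f"1/14.6 = {1 / 14.6:.4f}")
print(f"excess over one-fifteenth = {excess:.1%}")
```

Using exact fractions rather than floats for the ratio keeps the comparison with one-fifteenth free of rounding noise.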

This leaves us to wonder. Did Weiss make a verbal or mathematical slip? Or is he interpreting "under" to refer only to the denominator of the fraction? Either way, he needs a better editor, given that he's writing popular books about financial matters.

Normally I wouldn't mention such an inconsequential slip, but my bedtime reading last night happened to include Judith Miller as quoted in the April Vanity Fair:

The bloggers were without editing, without a way for people to understand what was good, what was well reported -- to distinguish between the straight and the slanderous. Things would get instantly picked up, magnified, and volumized ... I was appalled, not by the blogs -- that would be like getting appalled at the Industrial Revolution -- but by my colleagues, who believed what they read on the blogs.

Funny, that's just how I feel about most science reporting in the popular press. Bloggers might not have editors, but most of us can handle fractions.

[Update 3/28/2006: John Bauman writes

I just wanted to note that 1/14.6 is exactly 250,000,000/3,650,000,000. I suspect that Weiss just used Banker's rounding on the original number he was given (and didn't inform us of this fact).

Banker's rounding is "rounding to the nearest even number" (a value exactly halfway between two candidates goes to the one ending in an even digit), so the idea here is that Weiss (or one of his sources or assistants or editors) calculated the fraction from 3,650,000,000 but rounded that figure to 3,600,000,000 for the text. That would explain the "14.6", but not the "under". ]
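Bauman's conjecture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $3.65 billion starting figure is his guess, not a number from the Op-Ed, and the helper name `round_half_even_billions` and the choice of rounding to the nearest $100 million are mine.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from fractions import Fraction


def round_half_even_billions(amount: int) -> int:
    """Round a dollar amount to the nearest $100 million, ties going to the even digit."""
    billions = Decimal(amount) / Decimal(1_000_000_000)
    rounded = billions.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)
    return int(rounded * 1_000_000_000)


conjectured_revenue = 3_650_000_000   # hypothetical unrounded figure
print(round_half_even_billions(conjectured_revenue))   # tie: 3.65 goes to 3.6
print(round_half_even_billions(3_750_000_000))         # tie: 3.75 goes to 3.8

# The unrounded figure reproduces the "one-14.6th" in the Op-Ed exactly.
print(Fraction(250_000_000, conjectured_revenue) == 1 / Fraction("14.6"))
```

The `decimal` module is used here because binary floats cannot represent 3.65 exactly, which would make the tie-breaking rule unreliable.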

Posted by Mark Liberman at 02:21 PM

Another solution to the person/people problem

It's been a while (Ben Zimmer here, with links back to postings by Mark Liberman and Geoff Pullum) since we reflected on the plural of person: persons or people?  Now Opal Eleanor Armstrong Zwicky, aged 2, offers a fresh solution, which ruthlessly applies logic to the problem and also (in the spirit of prescriptivism) clarifies stylistic choices by just discarding one of the alternatives.

For some time Opal had no plural marking by suffixes (she might, at least occasionally, have been marking plurality via the number word two, as many children do, so that "two dog" just meant "dogs" and could be used for large groups of dogs; it's often hard to tell what a small child's intentions are, however).  Then, suddenly, plural suffixes rushed in, along with possessive suffixes, and on 3/22/06 her mother, Elizabeth Daingerfield Zwicky, reported the following dialogue:

Opal: Peoples!
Me: Are there? I didn't see them.
Opal: One people.
Me: Oh, there was just one person? No wonder I missed them.
Opal: No person! One people!

You can see the wheels of logic turning here.  Where she had "lotsa people" before, she now needs a plural form, which would of course be "peoples".  For her, this is not a double plural (the way it seems to us), but just an ordinary plural, like "dogs".  Ok, nobody around her says "peoples", they all say "people" for this meaning, but she has logic on her side, and anyway, as prescriptivists sometimes tell us, even if a whole lot of people use some form they could still all be wrong wrong wrong.

Then, once you have "peoples", it follows, as must the night the day, that the singular is "people".  Hence, "one people".  Ok, once again, nobody around her says "one  people" -- she's now managed to have ALL occurrences of the lexical item people diverge from the adult standard -- but Logic Ruulz.

What about the lexical item person?  We don't have records of the speech of people around her, but I'll bet that plural "persons" is extremely rare in this speech, and that singular "person" is not particularly frequent (think about how often you really need this form in everyday conversation, when more specific nouns are available, as are indefinite pronouns like someone).  On the other hand, occurrences of "people" abound.  [Brett Reynolds reports, by e-mail on 3/25/06, that in the British National Corpus, spoken register, "people" is the #1 most common noun, with "person" at #59, and "persons" not in the top 1000.  I'd imagine that "person" is even rarer in speech directed at young children than it is in speech in general.]  So, Opal reasons: I have "people(s)" available for reference to human beings, why do I need "person(s)"?  That would just be pointless stylistic variation, like having both "that" and "which" available for marking restrictive relative clauses.  DISCARD ONE OF THE ALTERNATIVES.  In particular, discard the less frequent alternative.  Hence, "No person! One people!"

Eventually, she will come to appreciate the fact that what rules in these situations is not Logic but Society.  She will have to bow to the common usage of those around her.  Or, to put it another way, it will be time for her to join a community, rather than inventing language on her own.  [Update on 3/30/06: "peoples" has already vanished in favor of "people"; I suspect that "one people" has a very brief future.]

Simpsonological linguistics

Last week, to celebrate the one-year anniversary of her blog HeiDeas, Heidi Harley posted a long list of linguistically relevant Simpsons jokes. All of us who teach introductory linguistics are in her debt, and at more advanced levels, researchers will be scanning this post for classical allusions to use in future scientific and scholarly publications. The headings that Heidi provides for the jokes are also a good test of how well you know the field's terminology and literature: thus "Context-dependent reference of tense (cf. Partee 1973 ...)":

Marge and Homer getting into bed:
M: Did you close the gate?
H: Yeah.
Through the open window comes the sound of the gate slamming in the wind. Marge looks at Homer. Homer looks fake-surprised and says,
H: Oh, you mean tonight!

Posted by Mark Liberman at 09:08 AM

March 24, 2006

The aperiodic song of the communications engineer

A recent paper giving an information-theoretic analysis of humpback whale song has been getting a lot of press (e.g. here, here, here). Let's leave the dissection of the MSM coverage as an exercise for the reader, and go directly to the paper's abstract (slightly abbreviated):

The structure of humpback whale (Megaptera novaeangliae) songs was examined using information theory techniques. The song is an ordered sequence of individual sound elements separated by gaps of silence. Song samples were converted into sequences of discrete symbols by both human and automated classifiers. This paper analyzes the song structure in these symbol sequences using information entropy estimators and autocorrelation estimators. [...] The results provide quantitative evidence consistent with the hierarchical structure proposed for these songs by Payne and McVay [Science 173, 587–597 (1971)]. Specifically, this analysis demonstrates that: (1) There is a strong structural constraint, or syntax, in the generation of the songs, and (2) the structural constraints exhibit periodicities with periods of 6–8 and 180–400 units. [...]

[Ryuji Suzuki, John R. Buck and Peter L. Tyack, "Information entropy of humpback whale song", The Journal of the Acoustical Society of America, Vol. 119, No. 3, pp. 1849–1866, March 2006.]

Here's the paper's description of what humpback whale songs are like:

Most humpback whales have an annual migratory pattern, breeding in subtropical latitudes during winter, and migrating to high latitude waters to feed in the summer. The vocalizations produced by humpback whales during these feeding and breeding seasons differ greatly. The feeding calls involve a few simple sound patterns produced in simple sequences (D'Vincent et al., 1985), whereas whales produce complex songs during the breeding season. The term song is used in animals, such as songbirds and whales, to describe an acoustic signal that involves a wide variety of sounds repeated in a specific sequence.

Humpback songs consist of a sequence of discrete sound elements, called units, that are separated by silence. Each song contains a complicated series of more than 12 different units. These units cover a wide frequency range (30–3000 Hz), and consist of both modulated tones and pulse trains. Payne and McVay (1971) proposed a hierarchical structure for humpback song. A song is a sequence of themes, where a theme consists of a phrase, or very similar phrases, repeated several times. A phrase is a sequence of several units. The song is repeated many times with considerable accuracy to make a song session. The reported range of song duration is from 7 to 30 min (Payne and McVay, 1971) or from 6 to 35 min (Winn and Winn, 1978). Winn and Winn (1978) also reported the maximum duration of observed song sessions to be 22 h. Throughout this paper, we use song length to indicate the number of units in a song, and song duration to indicate the number of minutes the song lasts.

This is a serious and interesting paper, deserving fuller discussion. But as a first step, I figured that turnabout is fair play. Imagining myself a cetacean researcher investigating the structure of naked ape (Homo sapiens) vocal displays, I applied one of Suzuki et al.'s techniques to their own paper.

Specifically, I applied their "discrete sequence correlation" analysis to their Discussion section. The basic idea here is to line up a sequence against itself at various lags, like so:

  . . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
. . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
lag = 1, number of equal elements = 0 of 14

    . . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
. . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
lag = 2, number of equal elements = 0 of 13

      . . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
. . . 1 2 3 1 2 3 4 1 2 3 1 2 3 4 1 . . .
lag = 3, number of equal elements = 6 of 12

and so on. (I'll spare you the formula, which is their equation (27)). At lags corresponding to the periodicities of the sequence -- if any -- similar units will line up, and the "correlation" (here just the length-normalized count of equal elements) will be higher. At other lags, the corresponding units will be out of phase, so to speak, and the "correlation" will be lower.
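The lining-up procedure just described can be sketched in Python as follows. This is not the paper's equation (27), only the simplified length-normalized count of equal elements; the function names are mine, and the toy sequence is the one from the illustration.

```python
def matches_at_lag(seq, lag):
    """Return (equal, compared) for the sequence aligned against itself at `lag`."""
    pairs = list(zip(seq, seq[lag:]))   # zip stops where the shifted copy runs out
    equal = sum(1 for a, b in pairs if a == b)
    return equal, len(pairs)


def discrete_autocorrelation(seq, max_lag):
    """Length-normalized count of equal elements for each lag from 1 to max_lag."""
    result = {}
    for lag in range(1, max_lag + 1):
        equal, compared = matches_at_lag(seq, lag)
        result[lag] = equal / compared if compared else 0.0
    return result


toy = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1]
for lag in (1, 2, 3, 7):
    equal, compared = matches_at_lag(toy, lag)
    print(f"lag = {lag}: {equal} of {compared} equal")
```

Since the toy sequence repeats every seven elements, lag 7 matches at every compared position. The same functions accept any list of comparable symbols, so a text can be analyzed by passing in its list of lowercased words.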

The "periods of 6–8 and 180–400 units" that they found in the whale song show up clearly in their autocorrelation plots. The 6-8 element periods, presumably corresponding to Payne's "phrases", show up in the shorter-time plots:

The 180-400 unit periods show up in the plots of longer-time-scale patterns:

The comparable plots for their own discussion section are much less regularly structured:

So as cetacean scientists, should we conclude that English is less regularly patterned than Humpback? Well, sort of. My comparison is rather unfair, since Humpback only has a dozen or so different "sound elements", whereas English has hundreds of thousands of qualitatively distinct wordforms. For this reason alone, the structure of English is less likely to show up in this simple sort of serial cross-correlation analysis. It might work a bit better to do the serial cross-correlation on strings of English parts of speech, or perhaps on strings of vectors in a "latent semantic analysis" subspace, or something of the sort. But however we do it, the fact will remain that Humpback song is, well, song. As Suzuki et al. explain:

A song is a sequence of themes, where a theme consists of a phrase, or very similar phrases, repeated several times. A phrase is a sequence of several units. The song is repeated many times with considerable accuracy to make a song session.

While we all know people whose discourse gives this impression, the fact is that even the most boring and repetitive among us produces less rigidly patterned rhetoric than this.

Suzuki et al. make the point that the structure of these songs (the 6-8-unit phrases, the 180-400-unit groupings) means that they are not generated by a low-order Markov process. On the other hand, such "songs" might be generated by various simple (non-stationary) generalizations of Markov processes, such as the hierarchical Markov random fields used in image segmentation, or the "hierarchical Dirichlet processes" recently discussed by Teh et al. And models of this kind have an obvious interpretation in terms of neurological pattern generators. (For those to whom this is opaque, consider it a promise to come back and explain -- if only to myself -- when I have a bit of free time...)

I first heard Roger Payne talk about humpback whale song back in the mid 1970s, and I was convinced by his talk that these displays exhibit hierarchical (though probably not recursive) structure. It's nice to see this confirmed by various quantitative means, but the confirmation is not a surprise, in my opinion. The aspect of humpback vocalizations that I found most interesting in Payne's talk, and still find most interesting today, is the social evolution of the song patterns. Here's how Suzuki et al. describe this process:

All whales in a population are singing the same or very similar songs at a given time, although whales within hearing range do not coordinate to sing the same part of the song at the same time. The songs within a population gradually change over time, so that after several singing seasons few elements of the song have been preserved (Payne et al., 1983). Several reviewers believe that the speed and pervasiveness of this change indicates that singing whales must learn each sound unit and the sequence order that make up a full song (Janik and Slater, 1997; Tyack and Sayigh, 1997). Guinee and Payne (1988) suggested that this song evolution presents a difficult learning and memory task. They proposed that humpback whales increase the redundancy of parts of phrases between adjacent themes as a mnemonic aid, and they found that this redundancy was more common in songs with larger numbers of themes where more material would have to be remembered.

This cultural evolution of humpback song has not been quantitatively analyzed or formally modeled, as far as I know: I hope that the authors of this paper (or others) will go on to do so. I've heard that the U.S. Navy has released long-term passive sonar recordings from its SOSUS program; along with recordings from NOAA and other sources, these presumably include thousands of hours of humpback whale song recorded over several decades.

[Here's an audio clip of a fragment of one of these songs.]

[As an example of how a genuine human song would show structure somewhat more like that of a humpback whale song, here's a plot of the discrete sequence autocorrelation of the lyrics to the children's song "Skip to my Lou":

Of course, this is an unusually repetitive song -- for more normal sorts of lyrics, you'd need a more abstract representation to see the periodicity.]

Posted by Mark Liberman at 06:54 AM

March 23, 2006

I found my snowclone in Palo Alto

Continuing our discussion of what's a snowclone and what's just a lot of playful allusion to and creative variation on some original (most recently: X-back Mountain?), I take up the legend on a flyer for the Artfibers Yarn Millshop in San Francisco, presented to me in Palo Alto on 3/21/06 by several friends who'd just been there:

I found my yarn in San Francisco.

I think there is a snowclone here, Left in San Francisco, of the form "I left my X in San Francisco" and based on the song title "I Left My Heart in San Francisco".  But I also think that the Artfibers sentence is not an instance of it, but a bit of fresh play on the song title.  Formulaic variants and novel variants can coexist.

The song has lyrics by Douglas Cross and was made famous in a recording by Tony Bennett (a chart hit in 1962), but was also recorded notably by Frank Sinatra.   A Google web search on

"I left my" "in San Francisco" -heart

on 3/21/06 pulled up ca. 42,300 hits.  I looked at the first hundred and excluded those that seemed ONLY to be saying that the writer had left something in San Francisco (many were reporting such an event but also pretty clearly alluding to the song).  There were 36 different fillers for the "heart" slot, which is a pretty impressive diversity:

    Fillers phonologically similar to heart: hearts [an arcane reference involving Cthulhu], art, fart, hat, cart, harp, parts, hearth, Hart (9)

    Other organs or body parts: stomach, neck, calves, ribs, nervous system, liver, schnozze, brain, ankle (9)

    Other song titles: blues [Buddy Guy, "I Left My Blues in San Francisco"], gun [Butt Trumpet, "I Left My Gun in San Francisco"] (2)

    Others: wallet, Duracell, mini, iPod, lunch, phone, pug, signs, plot continuity, Jell-O, passport, majority, shoes, MTV, blog, coke (16)

There are some examples with subjects other than "I", most of them personal pronouns (plus a modest number of examples with subject "Tony Bennett", "Tony", or "Bennett"); almost all of these, however, have "heart(s)" in the object slot.  So these look like occasional independent variations on the song title.  Most of the variation is in the object, rather than the subject.

The Artfibers slogan varies both the verb and the object.  When you try verbs other than "left", however, the diversity of objects disappears.  The first hundred hits for "I lost my X in San Francisco" have only three different objects in relevant examples: heart [of course], dart [phonologically similar to heart], mind [arguably an organ].  There are also some examples with the preposition to instead of in: "I lost my heart to San Francisco".  Finally, with the verb "found", the first hundred hits have only two different objects in relevant examples: heart [of course; also mostly in the book title I Found My Heart in San Francisco], art [phonologically similar to heart].  ("Yarn" didn't happen to come up in the first hundred hits.  Note that yarn is phonologically similar to heart.)

It looks like we have a formula "I left my X in San Francisco", with considerable diversity in what occurs in the X slot; about half the examples are neither phonologically nor semantically related to the original.  The formula allows a certain amount of play with the subject.  But examples with the verb varied show little diversity in their objects, and stick to objects that are phonologically or semantically related to heart.  These look like novel plays on the song title, not instances of a formula.

There is, however, a second formula based on the song title: "I left my heart in X", where X is a location.  There's a huge diversity in locations; of the first TWENTY hits for

"I left my heart in" -"San Francisco"

(which yielded ca. 110,000 raw webhits), there were at least fifteen different locations: Cincinnati, Hawaii, Nova Scotia, Boston, Amalfi, New Brunswick, San Antonio, Texas, Limbe, Iran, Isla Vista, Chicago, Costa Rica, Paradise, Europe.  We pretty clearly have a second snowclone based on the song title.
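For anyone who wants to repeat this sort of tally on their own collection of hits, here is a minimal sketch of the two templates as regular expressions. It is only an illustration: the sample strings are made up from fillers mentioned above rather than copied from actual search results, and the names OBJECT_SLOT, LOCATION_SLOT, slot_fillers and SAMPLE_HITS are my own inventions. The location pattern assumes that a location is a run of capitalized words, which is a rough heuristic and not a real place-name recognizer.

```python
import re
from collections import Counter

# Template 1: "I left my X in San Francisco" (the object slot varies).
OBJECT_SLOT = re.compile(r"\bI left my (?P<x>.+?) in San Francisco\b", re.IGNORECASE)

# Template 2: "I left my heart in X" (the location slot varies).
# Assumption: a location is one or more capitalized words.
LOCATION_SLOT = re.compile(
    r"\bI left my heart in (?P<x>[A-Z][\w.-]*(?: [A-Z][\w.-]*)*)"
)

def slot_fillers(pattern, texts):
    """Count the lowercased fillers of the named group 'x' across texts."""
    counts = Counter()
    for text in texts:
        for match in pattern.finditer(text):
            counts[match.group("x").strip().lower()] += 1
    return counts

# Hypothetical hits, constructed for illustration only.
SAMPLE_HITS = [
    "I left my wallet in San Francisco",
    "I Left My iPod in San Francisco",
    "I left my heart in San Francisco",
    "I left my heart in Cincinnati",
    "I left my heart in Nova Scotia, and I want it back",
    "I left my heart in Costa Rica",
]

objects = slot_fillers(OBJECT_SLOT, SAMPLE_HITS)
locations = slot_fillers(LOCATION_SLOT, SAMPLE_HITS)
# Mirror the -"San Francisco" exclusion in the search above.
locations.pop("san francisco", None)

print(sorted(objects))
print(sorted(locations))
```

The diversity of the fillers in each slot is the thing to look at: many unrelated fillers suggest a formula, while a handful of fillers that all resemble "heart" suggest one-off plays on the title.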

It won't work to vary both the object and the location, of course, to get something like "I left my jazz in Cincinnati".  The result will just be interpreted literally.  You might possibly get away with "I found my snowclone in Palo Alto", because the object "snowclone" is a big ol' flag and because Palo Alto is prosodically similar to San Francisco and names a city in the Bay Area. Well, maybe I might get away with it.

What's the problem?

The New York Times is running an article about Kathleen Troia McFarland, a potential Republican challenger to Senator Hillary Clinton, whose claims to fame are that she was the highest ranking woman in the Pentagon in the Reagan administration and that she drafted Reagan's Star Wars speech.

But interviews with former Reagan administration officials and a review of documents show her claims were not entirely accurate. Though she helped write the "Star Wars" speech, its most famous passage - the one that announced the anti-ballistic missile program - was actually written by the president himself and his top national security advisers, according to two senior advisers to Mr. Reagan and a review of the literature and news articles of the period.

I'm having trouble figuring out where the Times thinks the problem lies. Ms. McFarland said that she drafted the speech, not that she wrote every bit of the final version. Her claim is entirely consistent with the accounts cited by the Times to the effect that the announcement of the Star Wars program was written by Reagan himself and a group of advisors that did not include Ms. McFarland. Do neither the author of the article nor the Times editorial staff understand the meaning of the word draft?

Ms. McFarland's politics are not mine - in a race between her and Hillary Clinton I would no doubt vote for the latter - but I see no hint of dishonesty in her claim.

Addendum 2006-03-23:

A couple of people have pointed out that for some people "draft" as a verb has a sense in which it can be used without the implication that what one drafts is a preliminary version. lists this sense after the sense in which I use the word. People who have only this sense of draft might indeed take Ms. McFarland's claim to be dishonest, but surely, before accusing someone of dishonesty, one should make sure that it isn't a matter of one's own misunderstanding. And it still means that neither the author of the article nor the Times editorial staff knows the older and standard meaning of "draft".

The other point that has been raised is this: if she wasn't trying to claim credit for the "Star Wars" passage that she in fact did not write, why did she mention that she drafted the "Star Wars Speech"? It seems perfectly possible that her intention was merely to point out that she was sufficiently important and trusted to have drafted the speech that became the Star Wars speech. How else would she refer to it? It doesn't have another name by which most people would recognize it. And her reference to it was on her CV, not a place where one could expect a detailed explanation of what speeches she worked on and exactly what her role was.

Posted by Bill Poser at 06:26 AM

March 22, 2006

Today's Linguistics 101 essay assignment

From a letter to the editor in today's (3/22/06) New York Times (p. A26), from Darwin L. Brown of Atlanta, reacting to a story on the deepening plight of black men in the U.S.  (Brown describes himself as "a married, hard-working black man who is a devoted father of two"):

    Opportunities for black men have never been greater.  It is incumbent upon all of us, regardless of racial background, to take advantage of them.
    Unfortunately, thousands of our black men have failed at this and subsequently descend into being absent fathers, poorly educated, poorly trained and unproductive workers.
    Contrary to what many would have us believe, the people responsible for this are not whites, President Bush or the Republican Party.
    Actually, it is those black men who value conception over fatherhood, Ebonics over proper English language and pocket money over building wealth who are to blame for their own downfall.

Your assignment:

Write an essay (of no more than 200 words) either (a) explaining why Ebonics appears on this list (along with sexual promiscuity and lack of thrift), or (b) arguing that it probably doesn't belong there.  For extra credit, do both.  If this is the sort of thing that entertains you, for a little bit of extra credit, explain why that last sentence might make you long for the serial comma.

Do not send your essays to me; submit them to your section instructor.  I am merely a magisterial presence, writing from a suite in the ivory tower of Language Log Plaza.

[Addendum: In e-mail on 3/24/06, Chuck Smith notes that the letter-writer appears to be saying that we should all take advantage of opportunities for black men.  This can't be what he meant; presumably he meant that we should all take advantage of the opportunities available to us.]

Vanishing slurs

While assembling a set of performances of (and take-offs on) Cole Porter's song "Let's Do It (Let's Fall in Love)" -- for an iTunes playlist -- I came across two from the early 40s, one sung by Mary Martin (with Ray Sinatra's orchestra), the other by Billie Holiday (with Eddie Heywood's orchestra), in which the first line of the first verse (after an intro) has the arresting words "Chinks do it, Japs do it" ("Up in Lapland little Lapps do it") in it.  (Martin sings Porter's original "And that's why Chinks do it, Japs do it", but Holiday gets right to the Chinks.)

Needless to say, these ethnic slurs rapidly vanished from the song, usually in favor of "Birds do it, bees do it" ("Even educated fleas do it").

The bees and fleas have been moved up from verse 4, the insect verse (verse 1 features nationalities, verse 2 birds, verse 3 sea creatures, verse 5 higher animals): "Locusts in trees do it, bees do it, / Even highly educated fleas do it".  Starting with locusts would be pretty lame, but the bees in the second half of the line naturally suggest that there should be birds in the first half.  Sometimes the fleas are highly educated, but most of the time they're just (normally) educated.

The song is from the show "Paris" (1928).  I'm getting the words from the 1971 volume Cole (edited by Robert Kimball, with a biographical essay by Brendan Gill), pp. 88-9.  Singers vary the words in various ways, skip verses, take them in different orders, keep the (cutesy) intro or cut it, but most of the performances around stick surprisingly close to the words I have (including the wonderful line "Lithuanians and Letts do it").  The exception is, famously, Noël Coward, who kept the tune and the repeated line "Let's do it, let's fall in love", but (from his first cabaret performances of it in 1955) constantly varied everything else, supplying new, gayer and more topical, versions ("Even Liberace, we assume, does it") from one occasion to the next.

The song is of course one giant double entendre, in which "fall in love" refers to what is euphemistically called "making love".   This is clear in a couple of places: "Folks in Siam do it, / Think of Siamese twins" (verse 1) and "Why ask if shad do it? / Waiter, bring me shad roe" (verse 3) and "Sweet guinea-pigs do it, / Buy a couple and wait" (verse 5).  Word is that people -- among them, Porter himself, for friends -- were making the verses bluer from the moment the words I have were made public.  There's even a report (in a review of the 2004 movie De-Lovely) that the original had the line "Roosters with a doodle and a cock do it" in it; this would have replaced one of the lines in the verse 2 couplet "Penguins in flocks, on the rocks, do it, / Even little cuckoos, in their clocks, do it".  That seems pretty coarse for Porter in public.  (On the other hand there's the couplet from verse 4 "Moths in your rugs do it, / What's the use of moth-balls?")

[Aaron Dinkin now reports that The Complete Lyrics of Cole Porter (1984, edited by Robert Kimball), which tries to be both authentic and uncensored, and tries to list variants of lyrics, lacks the doodle/cock line.  So it's unlikely to have been published, though he might have written it.]

Porter has been cleaned up on at least one other occasion.  In "I Get a Kick Out of You" from Anything Goes (1934), he wrote

Some get a kick from cocaine.
I'm sure that if I took even one sniff
That would bore me terrific'ly too.

I hear rumors that the rich, famous, and socially advantaged among us sometimes dabble in cocaine these days as then, but we no longer sing about it playfully, just as decent folk no longer use epithets for the Chinese and Japanese.  Well, not in public or in places where we might get quoted.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:09 PM

Further thoughts on "The Affect"

As Mark Liberman notes, the recent New York Observer article on "The Affect" presents "a very mixed bag of phenomena" supposedly characterizing the speech of young upper-middle-class New York women, including "bits of Valley Girl, a few undissolved lumps of Larchmont Lockjaw, and a generous sprinkling of generally female-associated stylistic features." The grab-bag approach is evident from the very beginning, with the headline reading: "City Girl Squawk: It's Like So Bad— It. Really. Sucks?" The illustrative sentence starts off using like as a discourse particle and so as an intensifier (though in the text of the article like is only remarked upon as a quotative and so is not mentioned at all). Then the punctuation of "It. Really. Sucks?" is supposed to cue the reader to a particular intonational pattern: the periods between words suggest an emphatic prosody (as in "Worst. Episode. Ever."), while the italicization of "sucks" and the final question mark are intended to code the rising intonation commonly known as "uptalk."

The writer Jason Horowitz initially posits "The Affect" as a pattern that is specific to a certain type of New York woman ("what really identified this New Yorker was her voice"), but by the next paragraph he's hedging the localization, referring to "a distinct group of young women in the American Northeast." And by the time he gets around to expert testimony from linguists, he's given up on geographical specificity. Bert Vaux focuses instead on gender and age distinctions cutting across regions: "Some people think of it as a young-girls shift. The main features are taking place in many, many parts of the country." Vaux is further quoted in paraphrase:

Mr. Vaux, who conducted a survey of American dialects while teaching at Harvard, said that while the voice heard increasingly in New York was distinctive, it was not particular to the region.

Trying to pinpoint what made it unique, Mr. Vaux crossed off nasality, which he says is what humans always mistakenly identify as different in foreign speech. He overlooked "like," for which he said the speakers of Sanskrit also had a penchant. While he also emphasized that his was not a rigorous scientific assessment of the new speech, he noted that the accent involved speaking with a higher average position of the tongue dorsum and perhaps a slightly different configuration of the laryngeal muscles, yielding a slightly creakier voice than is normal in other accents.

The comment about Sanskrit is perplexing; most likely Horowitz is poorly paraphrasing a point Vaux was making about similarities between English like and the Sanskrit quotative particle iti. Fortunately, the next paraphrase gives some actual phonetic description rather than the impressionistic generalizations pervading the article. Vaux describes the type of phonation known as "creaky voice," which some dialectologists have been observing in young American women in many regions of the country. This is likely what Horowitz means when he says young New York women's "long, whiney vowels" have a "touch of an early-morning grumble." Gawker's comment on the Observer article refers to "the hoarse whine of Parliaments," using a typical characterization of creaky voice as the raspiness of a cigarette smoker (though it should be noted that not all users of creaky phonation are smokers — they may just sound that way).

As for "uptalk," that is neither a regional phenomenon nor a new one. Cynthia McLemore and others have been investigating final rises among young American women for more than a decade and a half now (see here and here for further discussion). Most trace the intonational pattern to "Valley Girl English" as popularized by the likes of Moon Unit Zappa in the early '80s, though it probably could be detected among speakers from southern California long before that. Some of the other phonetic characteristics that Horowitz is gesturing towards also probably have a Californian provenance, particularly the shifting and lengthening of vowels as studied by Penelope Eckert. The popularization of like as a quotative or discourse marker has also been traced to California English.

Despite the lack of descriptive precision in the article, it serves at least as a vague indication of what counts as stigmatized these days in American youth speech, particularly patterns coded as feminine and therefore frivolous. Again we have a popular media account attempting to instill linguistic anxiety in some readers (and instilling a sense of linguistic superiority among others). But never fear: as Horowitz describes, speech therapists run "accent elimination programs" to train speakers away from stigmatized patterns. If this article gets widely circulated, business will probably be booming in New York.

Posted by Benjamin Zimmer at 01:26 PM

The Affect: Sociolinguistic speculation at the NYO

Lane Greene sent in a link to Jason Horowitz's 3/27/2006 New York Observer piece about the "City Girl Squawk" of "proudly upper-middle-class girls who love nothing more than to linger on a vowel".

It's an interesting story, whose premise is that "a distinct group of young women in the American Northeast are speaking with warped syllables that are a linguistic love song to their own exclusive milieu". Horowitz dubs this mode of speech "The Affect", apparently on the theory that it's affected, in the sense of being artificially adopted as a symbol of upper-middle-class identity.

There's been a recent flurry of media interest in the "Northern Cities Shift", a set of sound changes taking place in a belt of American cities from Rochester to Chicago. The NCS has been extensively documented by sociolinguists like Bill Labov and his students. For his discussion of the Affect, Horowitz consulted linguists Bert Vaux, John Singler, and Walt Wolfram as well as Bill Labov, and he provides plenty of evocative examples. It's nice to see so much recent popular interest in linguistic variation and change.

And Horowitz might be on to something with his "Affect". However, the examples in his article are a very mixed bag of phenomena, ranging from some fragments of the Northern Cities Shift itself (his example of "ob-juhk-tion" for objection) to bits of Valley Girl, a few undissolved lumps of Larchmont Lockjaw, and a generous sprinkling of generally female-associated stylistic features, such as the modulations of duration and pitch sometimes rendered by typographical devices like italics and repeated letters.

Often, it's hard to figure out what linguistic features he's really talking about. He gives his examples in an exaggerated sort of "eye dialect" whose interpretation is often pretty clear ("ob-juhk-tion" for objection) but is sometimes baffling. What is "li-yike" for like supposed to mean, for example? Just a drawn-out diphthong? or is it really supposed to correspond to something like IPA [laiˈjaik] ? I think it must be the drawn-out diphthong, because the re-articulated version doesn't sound like anything I've ever heard in English except as a speech error. But in that case, Horowitz's spelling doesn't make much sense -- so maybe he means something different entirely that I haven't thought of.

All too many of his examples are effective not because they highlight special features of pronunciation or word choice or even speech style, but rather because of the content and context of what's said:

They can turn any item on a menu into an ancient Greek’s ritual lament (Stooooohhhleee owwrindge and taaaahnick!). They can separate emphasis from meaning, transforming the most straight-faced declarations into squeaky questions (“I haiiiiight haaaahr soooh maaaahch?”).

To be sure that we don't miss the point, Horowitz clues us in with phrases like "whining sorority girls" and "stigma of stupidity, juvenilia and shallowness". The linguistic characteristics are framed by details of dress, appearance and behavior, and described with evaluative words like "lazy" and "grumble":

“I laaaaahv a diiiiivey baaaaaahr,” said a girl with a voice that could crack the ice in her vodka tonic. It was her third drink. She was sitting with a friend at Duke’s (the “divey bar”) on 19th Street off Park Avenue South, wearing a periwinkle scarf around her neck and zebra-print shoes on her feet. She was in her late 20’s, had thick, dark eyebrows and straight, shiny brown hair worn in a long ponytail. She looked like a million other girls in New York: attractive but not pretty, stringy but not skinny, smart but not all that intelligent. [...]

More than the pearls or the diamond-stud earrings, what really identified this New Yorker was her voice: those long, whiney vowels; that touch of an early-morning grumble; that lazy, whistling “s” and glottal stop that hushes the “t,” even in such cherished words as “bachelorette.”

I spend a lot of time among upwardly-mobile college-age Americans, both male and female, and I recognize some of the ways of talking that Horowitz sketches. But it seems to me that he's talking about a diffuse collection of features, with many different sources and a complex pattern of connections to geography, class, gender and communicative intent. I also suspect that his description is subject to the "seductive effects of selective attention" that Arnold Zwicky has repeatedly warned us about: "the Recency Illusion (if you've noticed something only recently, you believe that it in fact originated recently) and the Frequency Illusion (once you notice a phenomenon, you believe that it happens a whole lot)." And especially the "out-group illusion":

... people pay attention selectively to members of groups they don't see themselves as belonging to and so locate phenomena as characteristics of these groups.

Arnold's advice is worth quoting again, in bold face and big letters:

The point here is that your impressions are unreliable; you need to find out what the facts are.

But this takes more time than a journalist usually has, and all that extra time might just spoil a good story -- a story that lets everyone feel superior to all those uppity young women in the bars off Park Avenue South.

Posted by Mark Liberman at 12:23 PM

March 21, 2006

It's X's world, we just live in it

Eric Bakovic is understandably puzzled by a sentence in Vanity Fair about Steve Jobs: "It's Steve's gadget-centric world which we just live in". Probing Google for "we just live in" will alert us to the fact that this is an allusion to the snowclone "It's X's world, we just live in it". The first three pages of hits give us these values for X:

Albert [Einstein], Google, a dog, doc [Searls], Sub Pop, Monsanto, Cory, Fellini, Microsoft, Dave Winer, Sinatra, Bill Gates, Choire Sicha, a pet, Sandy, Nancy Grace...

However, I'll confess that I don't know what the origin of this phrase is. [But now I do -- see below.]

The Wikipedia article on the iPod cites a Fortune article titled "It's iPod's Revolution: We Just Live in It" (from August 2005?), which makes even less prima facie sense than the Vanity Fair sentence does. When someone writes something that seems to make no sense at all, these days, a snowclone allusion is at least as likely as a typo or spellcheckism.

[A bit more web search turned up the fact that in the 1993 movie "The Saint of Fort Washington", Jerry, a homeless squeegee man, says to a driver "Hey, thanks a lot. It's your world; we just live in it." I don't know whether this is the original source, or just another allusion.]

[Karl Hagen suggests that the original was Dean Martin on Frank Sinatra: "It's Frank's world, we just live in it". Some supporting evidence: Michael Sragow in Salon, 11/11/1999, "Being Charlie Kaufman":

To me, "Being John Malkovich" is the satiric ne plus ultra of the headline that became addictive to slick-magazine editors in the '90s: "It's Frank's World, We Just Live In It." (The headline has been used for everyone from Sinatra to Regis Philbin.) Kaufman conceded that what he was doing "is sort of parallel to that. A lot of it comes from the idea of not wanting to be yourself and being envious of other people. There is for sure the idea of looking out in the world and feeling you don't deserve to be there. How do you come to feel you have as much right as anyone else to be on this planet, when you have a barrage of information telling you that you don't have a right to be here, or that you have to change yourself to be allowed to be here? I took each character and on an instinctive level explored how they would react to that anxiety."

I've certainly seen the pattern, though I didn't know the source. However, for me the phrase always suggested the arrogance and egotism of X, not the low self-regard of the writer.]

[And I barely beat Ben Zimmer to the punch -- here's his post, written mere seconds after mine, and much better informed, as you would expect:

Eric Bakovic was stumped by this quote from a recent Vanity Fair article on Steve Jobs, wondering what the "just" was doing in there:

Except that one day in the near recent past everybody woke up and found out that while all the geniuses were blathering on about content this and content that, the media culture had, in fact, come to be dominated by machines. It's Steve's gadget-centric world which we just live in.

Chalk this one up to failed snowcloning. The Vanity Fair writer was clumsily attempting a variation on the old snowclone:

It's X's world -- We just live in it.

This expression has long been associated with Frank Sinatra. In discussion about the snowclone on the American Dialect Society mailing list last year, I found it in use with reference to Sinatra as far back as January 1964, in a newspaper column by Earl Wilson quoting Dean Martin from a few months earlier:

Reno Evening Gazette, January 4, 1964, p. 10
When Dean, Frank and their buddy Sammy Davis Jr. appeared at the Las Vegas Sands' 11th anniversary, Dean bowed to Frank and said, "It's your world, Frank; I just live in it."

In the ADS thread Wilson Gray recalled hearing a similar expression in the mid-50's, so it was most likely floating around in various forms long before Dino began using it Sinatra-centrically. The snowclone would be revived again in the early to mid-'90s, when it found its way into pro basketball circles. From there it was popularized by ESPN announcers (Larry Horn called it one of SportsCenter's "tropes in residence"). As used by announcers like Stuart Scott regarding stars like Charles Barkley, the snowclone appeared most frequently with the progressive form of "live" ("...I'm/we're just livin' in it"), in keeping with African American English verbal patterns.

So where did the Vanity Fair version of the snowclone go wrong? It would have been more recognizable if the typical snowclone template had been used: "It's Steve's (gadget-centric) world; we just live in it."  But for some reason the writer felt compelled to work this into an awkward relative-clause construction: "It's Steve's gadget-centric world which we just live in." Nobody says it this way: a Google search on "It's X's world which we just live in" finds quotes from the Vanity Fair article and nothing else. Considering the enormous range of usage encompassed by Google, that's a very bad sign. It's possible that this was a revision made by an editor unfamiliar with the snowclone, but it has been rendered utterly un-idiomatic.

[Ben Zimmer]


Posted by Mark Liberman at 08:07 PM

How's this for ambiguity?

I love gadgets, especially ones made by Apple. So of course I enjoyed reading Michael Wolff's article on Steve Jobs in this month's Vanity Fair, which (in part) attempts to explain the course of Jobs' career as being due to his single-minded belief that "[t]he medium is the message" -- in other words: content, schmontent.

Yes, I often find excuses to use my Apple gadgets in situations where I probably don't need to use them, or I invent reasons for wanting to buy a new Apple gadget that I probably don't need. My latest not-really-necessary-but-I-want-it-anyway gadget is the new iPod Hi-Fi. I can't justify the price at this point, but I've been reading occasional reviews just in case I can be convinced. Today I read this positive-but-not-terribly-informative review, and I was struck by this paragraph (emphasis added):

So what do you need to spend all that money for? Sound is one thing: The IPod Hi-Fi does sound, well, like a high-fidelity unit. The sound can fill a room or even overfill it, if you crank up the volume sufficiently. It is rich, the bass is deep and the treble trills quite nicely. Some who've heard my evaluation unit complained about a lack of "midrange" sound; I didn't notice any.

Did the reviewer not notice any "midrange" sound, or did he not notice any lack of "midrange" sound? This is a critical matter about which we're left guessing, though presumably it's the latter, or else he would have noted that he agreed with the "[s]ome who've heard [his] evaluation unit".

(But really, can we trust this reviewer to really hear anything? Why the scare-quotes around "midrange", but not around "bass" or "treble" -- are the latter somehow more discrete? Also, describing the bass as "deep" is hardly informative, and I don't know that I've ever thought of treble in terms of "trills". But anyway.)

Back to the Vanity Fair article: I'm stuck on the bolded line in the following paragraph. What's the "just" doing here?

Except that one day in the near recent past everybody woke up and found out that while all the geniuses were blathering on about content this and content that, the media culture had, in fact, come to be dominated by machines. It's Steve's gadget-centric world which we just live in.

Just now? Just live? Either way, I can't make sense of it. Must be Spring.

[ Comments? ]

Posted by Eric Bakovic at 05:15 PM

Not nearly hard enough

Via the Comics Curmudgeon:

Posted by Mark Liberman at 01:12 PM

March 20, 2006

It's hard

Elmore Leonard's got a blog now, though it mostly seems to be reviews and news clips posted by an assistant. But "again by popular demand" on March 14 was Elmore Leonard's Ten Rules of Writing. We've featured this list before, especially his Third and Fourth Rules: "Avoiding Rape and Adverbs"; "Self-exposure at the NY Times"; "The Sins of Dialogue Attribution"; "Overpermissive Quotatives: Grammar Change or Thesaurusizing?"; "What can you Bret Easton Ellis to that?"; "Love, adverbially".

Leonard's "most important rule, one that sums up the 10", is

If it sounds like writing, I rewrite it.

Or, if proper usage gets in the way, it may have to go. I can’t allow what we learned in English composition to disrupt the sound and rhythm of the narrative. It’s my attempt to remain invisible, not distract the reader from the story with obvious writing. (Joseph Conrad said something about words getting in the way of what you want to say.)

There's a quote from Stick that illustrates this instead of explaining it. At the beginning of chapter 23, Ernest Stickley Jr., 42 years old, born in Norman OK, raised in Detroit, sits down to write the plot summary for a movie. This is part of a scam designed to pry loose some money that's morally owed to him by a drug dealer, but the plot function doesn't really matter.

Stick wrote on tablet paper: Although Buck and Charlie are famous and experienced trafficers dealers, they are able to get be believed to be government agents, because of all the confusion there is among the different state and U.S. law enforcement groups that are falling all over each other and or not telling each other what they are doing in their work of trying to stop the trafficing dealing in controlled substances and apprehend the alleged . . .


It was hard.

Why didn't he just say: Since none of the feds know what the fuck they are doing, they believe that Buck and Charlie are . . .

Cornell came out of his bedroom, sleep in his eyes.

"You doing, writing to your mama?"

Posted by Mark Liberman at 04:49 PM

March 19, 2006

Ayn Rand psychologizes a trope

There's more wrong with Ayn Rand's attribution of significance to the fact that Americans (supposedly only Americans) speak of "making" money than Mark points out in his post on the comparative etymology of the money-creation metaphor, in which he shows that the trope goes back at least as far as Latin.  I would never attempt to improve on Mark's scholarship;  my objection is orthogonal to his.  I suggest that if it were true that Americans were the first and only people to use a verb meaning 'create' in the sense of 'accumulate' when the object is money, that observation would signify little or nothing about Americans or anything else.  Rand is here committing the common error of psychologizing a trope.  To psychologize a trope is to assume that if some people talk about A as if they thought about it in terms of B, then those people think about A in terms of B.

If you worry you may perhaps be a crypto-trope-psychologizer, here's an exercise.  The French expressions for kite (that you fly) and bat (the flying mammal) are, respectively, cerf volant lit. 'flying deer' and chauve souris lit. 'bald mouse'. Ask some French speaker if they think about kites as flying deer or about bats as bald mice.  My experience is that French speakers can be stunned at the images aroused by the literal interpretations of these expressions.  If people talk about A as if they thought about it as B it is of course possible that they really do think about A as B, but a possibility is not a necessity.  There are cases where the expression "dead metaphor" appears to be pretty darn apt.  The "making money" trope is a promising candidate.

Posted by Paul Kay at 11:01 PM

Breaking the law

Michael Crichton, in today's New York Times, discusses the latest shenanigans in the strange world of patent and trademark law. To the average Joe, it seems truly amazing that people can own parts of our language and require the rest of us to pay royalties if we use them. Okay, maybe it's okay to protect names of products and to give patent protection to important procedures and formulae, but now a company called Metabolite wants to get royalty fees from everyone who says, "Elevated homocysteine is linked to B-12 deficiency, so doctors should test homocysteine levels to see whether the patient needs vitamins." Well, I'm sure not going to say it. Oops, I just did. For that matter, so did Michael Crichton and for all I know, anyone who forwards this post also may be in deep do-do.

On the brighter side, maybe Metabolite's proposal has possibilities for stopping some of the strange usage advice offered by the likes of Strunk and White. If they would charge a royalty for every time someone cites their questionable sentences, it might encourage people to stop quoting them and, who knows, maybe even to stop following them.   For example, we could then get rid of the following S&W pronouncements:

  • In formal writing, the future tense requires shall for the first person, will for the second and third. [Royalty fee: $1,000]
  • That is the defining, or restrictive pronoun, which the nondefining, or nonrestrictive. [Royalty fee: $2,000]

The prospects are unlimited here.

Posted by Roger Shuy at 09:56 PM

Open source translation

The Joint Reserve Intelligence Center of the US Foreign Military Studies Office has set up a web site from which anyone can download documents and transcripts of audio recordings from Iraq. According to the information on the Operation Iraqi Freedom Documents site itself and the press release issued by the Office of the Director of National Intelligence, the purpose is to make these materials accessible to the public.

Another view is that they are doing this in the hope that people who know Arabic will translate them and post the results, thereby alleviating the severe shortage of translators available to the US government. According to this Boston Globe article, some people are already doing this. This argument is undermined to some extent by the fact that they are only posting documents that contain no classified material and whose publication will not be harmful to innocent people, which means that they have all been read first. (Besides, won't it be a lot of work to exclude translations by gay people?) Yet another view is that they are releasing documents carefully chosen to support the Bush administration's case for war. A more paranoid interpretation would be that the site is a honeypot, meant to identify people who can read Arabic so that they can be detained and interrogated as suspected terrorists.

Posted by Bill Poser at 06:28 PM

Heralds of Resource Sharing

A note from Dave Farber points me to an amazing 1972 documentary about ARPAnet, "Computing Networks: the Heralds of Resource Sharing", by Steven King. (No, not that Stephen King -- at least I don't think so...)

My favorite parts are the comments by J.C.R. Licklider, one of the most important intellectuals of the second half of the 20th century. I've transcribed a few of them below.

About collaboration over the net:

The thing that makes the computer communication network special is that it puts the workers -- the tea- the team members who are geographically distributed -- in touch not only with one another, but with the information base with which they work all the time, so that when they get to developing plans, the blueprints (as it were) don't have to be copied and sent all around the country. The blueprints come out of the database and appear on everybody's scopes, and the correlation, the coordination of the activity, is essentially right there in the computer network itself.

About networked access to digital libraries:

Right now it's possible to buy for about a million dollars, an information store that will hold the equivalent of about a hundred thousand books. So one can store- one can buy the store for a book for about the same amount as he can buy the book, so that if everyone had a display console, in his home and in his office, he could be reading from electronically stored information instead of from a book, and the difference is, he could have access to anything he wanted to read instead of just what was in- within reach. Well, it turns out to be surprisingly inexpensive, if you get wideband transmission facilities, to send the stuff right when it has to be read, instead of sending it to a local bookstore or a local library in the hope that it might be read.

(Note the cost of an "information store" large enough for 100,000 books is now about $100, depending on how many illustrations there are.)

The processing and distribution technology, and the storage technology, are going to make it possible to get over onto a new technological base for intellectual efforts, before our ponderous social processes will let us. I think more people ought to get in there and think about the social processes.

A lot of the digital world that you and I live in has been led "from below". Some ordinary citizen, without any special authority or resources, invents something that spreads because others like it, adopt it and develop it further. Well-known examples include Tim Berners-Lee's URL/html/browser package and Paul Ginsparg's arXiv. But there are a few examples of visionary leaders who provided money and other help "from above" as well as ideas, and Licklider was eminent among them.

There isn't any real need to change things just for the sake of- of changing, but I tend to believe that things are going to be considerably better for a lot of people when and if we ever get changed over to an essentially electronic base. And I- it's just fundamental that if one wants to deal with information, you ought to deal with information, and not with the paper it's written on.

Posted by Mark Liberman at 10:12 AM

March 18, 2006

The Perils of Fieldwork

Since Geoff is in the mood for humour, here's a joke I heard a long time ago.

A graduate student went to New Guinea to do research for his dissertation. He found a guide to take him to the remote village whose language he hoped to study. About noon on the second day they began to hear loud, continuous drumming. Disturbed, he asked the guide: “What are those drums?”. The guide replied: “Drums OK, but VERY BAD when they stop.”

The linguist calmed down a little at this, and things went along uneventfully for a while. Then, suddenly, the drums stopped! Worried, the linguist asked:

“The drums have stopped. What happens now?”

The guide crouched down, covered his head with his hands and said: “Bass solo”.

Posted by Bill Poser at 02:44 PM

Linguist jokes (3)

I was walking across campus with a friend and we came upon half a dozen theoretical linguists committing unprovoked physical assault on a defenseless prescriptivist. My friend was shocked. She said: "Aren't you going to help?"

I said, "No; six should be enough."

All right, all right, this joke is a shameless, reprehensible ripoff, adapted from a fairly well known mother-in-law joke by comedian Peter Kay, with an ugly undertone of violence, and I should be ashamed of myself for telling it and I am. But I try to post one linguist joke a year, whether you need it or not (others are here and here, and I guess you could say also here, sort of), and they are not easy to come up with, and this is what I had on hand.

Anyway, if you smirked, let me just say... thank you for smirking. Hey, that would be a good film title.

Posted by Geoffrey K. Pullum at 02:01 PM

Labov on the Northern Cities Shift

If you'd like to hear Bill Labov talking about the Northern Cities Shift and other matters, you can hear an hour-long interview with Marty Moss-Coane on the 3/14/2006 edition of WHYY's Radio Times. A RealAudio stream is here, or you can get it in mp3 form here (though the podcast link may expire in a few days, I'm not sure).

Posted by Mark Liberman at 12:44 AM

March 17, 2006

It's not hard out here for a cliche

Matt Hutson sent in a link to documentation at for an(other ?) emergent Oscar insta-snowclone: "It's hard out (t)here for a(n) X". (For those who just returned from a couple of months in another galaxy, the precipitating event is this.)

Posted by Mark Liberman at 07:22 PM

Spreading like wildlife

John Kroll drew my attention to a curious ecological metaphor in an article by Gene Sloan in USA Today, 3/16/2006, "Strings can be pulled on hotel-review sites":

Take the Westin New York at Times Square, which recently saw a surge in requests for corner rooms with king beds. The rooms have great views.

"I couldn't figure out how these guests knew to specifically request such a room until I saw it on tripadvisor," says Karen Colliton-Thomson, head of sales and marketing. "Word spread like wildlife on the site." [emphasis added]

The usual expression, of course, is "spread like wildfire". "Spread like wildlife" is an interesting metaphor, though. Wildlife certainly can spread uncontrollably, but the word wildlife seems to be more evocative of endangered species and habitat destruction than of out-of-control pests like zebra mussels and cane toads. The substitution of "wildlife" for "wildfire" may just be a malapropism, but the substitution also makes its own kind of sense -- which malaprops mostly don't -- so that "spread like wildlife" shades off towards the eggcorn category. (The eggcorn database already has "spread like wildflowers"...)

And that old devil attributional abduction raises its head again here. Perhaps Ms. Colliton-Thomson actually said this, in which case Gene Sloan might not have noticed, or might have decided not to correct it. On the other hand, maybe she said "wildfire" and he transcribed "wildlife", either as a mental error or because he actually thinks that "spread like wildlife" is the standard phrase. It's also possible that an editor made the change, though this seems less likely. Finally, this could be some sort of spellcheckism; but in a brief experiment, I wasn't able to get MSWord to suggest "wildlife" for any likely mistypings of "wildfire".

John Kroll also pointed out that there are a hundred or so other hits for {"spread|spreading|spreads like wildlife"}, suggesting either that the psychological slip/misconstrual is a recurrent one, or else that there's a spellcheck stimulus out there that I didn't find.
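For anyone who wants to run the same kind of count over their own pile of text, here is a minimal sketch of the search pattern as a regular expression. The function name `tally_variants` is made up for illustration, and the second sentence of the sample is invented; only the first comes from the USA Today quote above.

```python
import re

# Inflected forms of "spread", followed by "like" and one of the competing nouns.
PATTERN = re.compile(
    r"\b(spread|spreads|spreading)\s+like\s+(wildlife|wildfire|wildflowers)\b",
    re.IGNORECASE,
)

def tally_variants(text):
    """Count how often each 'spread like X' variant occurs in text."""
    counts = {}
    for match in PATTERN.finditer(text):
        noun = match.group(2).lower()
        counts[noun] = counts.get(noun, 0) + 1
    return counts

# First sentence is from the article; the second is an invented example.
sample = "Word spread like wildlife on the site. Rumors were spreading like wildfire."
print(tally_variants(sample))
```

A tally like this says nothing about why the substitution happens, of course; it only separates the variants so that their relative frequencies can be compared.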

Posted by Mark Liberman at 03:35 PM

The N-Word in the News Again

A March 12, 2006 article in the Denver Post (see here) describes a recent criminal case in which a 22-year-old Black mechanical engineering student was walking in downtown Boulder with a female friend when a 38-year-old Hispanic man with a record of 40 prior arrests saw the couple and yelled at them from his car, using the n-word and a string of profanities. The Black student yelled back: "If you have something to say, get out and say it." After the Hispanic man proceeded to break the student's jaw, he was arrested and charged with second-degree assault and ethnic intimidation.

The case went to trial and the jury  found the attacker guilty of assault but acquitted him on the hate-crime charge. The 12 juror panel was made up of mostly people in their 20s who reasoned that "nigger" was meant as an insult but not as a racial slur. They reported that younger people familiar with the hip-hop culture portrayed these days in film and urban music may not be aware of the racial implication and history of the term.

The jurors also gave some attention to the pronunciation of the word as "nigga," which they believed supportive of their argument. But this interpretation doesn't fly with the U.S. Patent and Trademark Office,  which, for example, has twice rejected actor Damon Wayans' attempts to trademark "Nigga" for his proposed  hip-hop line of clothing. An act of Congress prohibits people from registering a word that is scandalous or that disparages a particular  group. The trademark office has a host of dead applications that tried to use the n-word.

I'm left with some confusion about what counts as racism these days. Sure, there is a pop culture that uses this term perhaps more benignly than us old codgers can understand or appreciate. But at what point does the origin and continued broad meaning of an insulting, racial term become socially acceptable? To whom? By whom? Under what circumstances and context? And does dropping the final "r" in "nigga" really make it okay? Discourse context, register, phonetics, sociolinguistic variation, semantics, and the processes of language change all suggest themselves as relevant to this case. Maybe a linguist could have been helpful here.

Posted by Roger Shuy at 12:58 PM

Instilling linguistic anxiety in Raachester

For a scholarly work with the formidable list price of $620, the Atlas of North American English (by William Labov, Sharon Ash and Charles Boberg) has been getting some nice press attention since its launch earlier this year. Last month NPR's "All Things Considered" featured a thoughtful interview with Labov, and now the Atlas has inspired a dialectological road trip by Tim Sultan, a New York Times travel writer.

When Sultan approached Labov about using ANAE as a travel guide, Labov warmed to the idea, suggesting that he head northwest from New York City straight into the Inland North. The "Inland North" is Labov's designation for the dialect region that surrounds the Great Lakes in urban areas from western New York State to southeastern Wisconsin. The region is largely defined by the Northern Cities Shift, a set of vowel changes that makes "Dan" sound to others like "Deeean," "Don" like "Dan," "Dawn" like "Don," and so forth. (That cursory description hardly does the vowel shift justice; see here, here, and here for more discussion.)

Sultan followed Labov's advice and headed off for the area around Rochester — or, if you prefer, "Ratchester" or "Raachester," using pronunciation spellings that seek to approximate the fronting (and lengthening?) of the vowel in the first syllable according to the Northern Cities Shift. Sultan had no problem eliciting the shifted vowels from residents of the region, which Labov chalks up to a general state of dialectal unself-consciousness among speakers in the Inland North:

"Nobody with the Chicago-Rochester dialect makes a fuss about it," Professor Labov said. "They aren't as self-conscious or aware of it. Give a New Yorker or a Southerner a piece of paper with a word on it and ask them to say it, they'll start sweating."

Well, there's one way to get folks in Rochester feeling anxious about their dialect differences — more stories like this one from the Rochester Democrat and Chronicle, in which ANAE is tackled by a writer lacking Sultan's journalistic acumen:

Vowels speak volumes among 'funny-talking' Raachesterians

We may not know it, we may deny it and we might even be embarrassed about it, but a Pennsylvania linguist insists we talk funny in Raachester.

William Labov, a linguistics professor at the University of Pennsylvania in Philadelphia, calls our dialect the "northern city shift," claiming we say our vowels a bit more oddly than other parts of the country.

Needless to say, it's highly doubtful that Labov said that Rochester residents "talk funny," or that they say their vowels "a bit more oddly" than other speakers. But that's the way dialectal differences tend to get characterized in popular perception, so that's how Labov was apprehended by the Rochester reporter. Note also the accompanying sidebar that consoles readers: "Rochester isn't alone in having a dialect. Here are some ways other people talk." In common usage, terms like "dialect" or "accent" very often get equated with "talking funny," i.e., stigmatized divergence from perceived norms of pronunciation and usage. But works of dialectology like ANAE make no such value judgments on language variation. It's a shame, then, when reporters only hear what they want to hear.

Posted by Benjamin Zimmer at 12:57 AM

March 16, 2006

"My parents, Ayn Rand and God"

With reference to my recent post on Ayn Rand's foray into comparative ethnographic lexicography, Ben Zimmer points out another connection between linguistics and the founder of Objectivism: her role in an (apparently apocryphal) argument for the serial comma.

Well, her name's role, at least. The crucial example is

This book is dedicated to my parents, Ayn Rand and God.

which was intended to mean "...dedicated to my parents, to Ayn Rand, and to God", but can all too easily be read as "...dedicated to my parents, [who are] Ayn Rand and God". The idea is that a serial comma ("my parents, Ayn Rand, and God") would have spared us all from blasphemous thoughts.

The full story is here, where Vicki Rosenzweig writes:

I heard of this example from Teresa Nielsen Hayden. When I asked her about it, she referred me to Jon Singer. Jon referred me to a former co-worker of his, to whom I sent an email that began, approximately, "You don't know me, but I'm a friend of Jon Singer's (I know, isn't everybody?)..." He sent me a friendly reply, explaining that he had never actually seen the book in question, only a copy of the dedication page, and that he no longer had an audit trail on this. For all I know, someone put it together as a joke and sent copies around. It almost doesn't matter: the example is so perfect that mere existence could not possibly add anything to it.

And Eric Bakovic comments:

When I first saw Mark's "Ayn Rand, Linguist" post, I thought it'd be (or just contain) something about Rand's lesser-known book Anthem. For those unfamiliar, this is a 1984-sort of story in which the people speak a variety of English devoid of first person singular pronouns & agreement and, of course, have to use the first person plural ("collective"?) pronouns & agreement instead. In the end, the protagonists escape and find an old library in the middle of some forest or something, where there are books with the magical word 'I'. And so the protagonists are saved (from themselves).

Through some inexplicable oversight in my education, I've missed this particular work; there's a Project Gutenberg e-text, so I've got no excuse anymore.

As we've often observed before, the 20th century was deeply attached to the notion that the central properties of a culture are to be found in the distribution of items in its lexicon and morphology. Here's another example I happen to have read recently, from Ursula K. Le Guin's Rocannon's World:

They were a boastful race, the Angyar: vengeful, overweening, obstinate, illiterate, and lacking any first-person forms for the verb "to be unable." There were no gods in their legends, only heroes.

A Randian crew. And note the serial comma.

[Update: Steve at Language Hat wrote in to remind me that he blogged the "Ayn Rand and God" story back in August of 2003. In the same post, he cited another serial-comma example acquired by way of Teresa Nielsen Hayden:

"The 'God and Ayn Rand' serial comma thing is possibly apocryphal, but there's one along the same lines that Rob Hansen spotted in the TV listings of The Times: Planet Ustinov - Monday, C4, 8pm By train, plane and sedan chair, Peter Ustinov retraces a journey made by Mark Twain a century ago. The highlights of his global tour include encounters with Nelson Mandela, an 800-year-old demigod and a dildo collector."

Hat added in his note to me

Of course, I don't know of any way to check the accuracy of the alleged TV listing, so we basically have to take Rob Hansen's word for it, but it's a great quote.

Yes, as Vicki observed, in such cases mere existence is superfluous... ]

Posted by Mark Liberman at 06:29 PM

printing a list of all english words

The GNOME project has just released GNOME 2.14. The release notes include a section on performance improvements which begins with:

Figure 1. GNOME Terminal performance improvements between GNOME 2.12 and 2.14. Time taken is the time to print a list of all English words to the screen. [emphasis mine]

which refers to this image:

That's so impressive you know it can't be right: there are infinitely many English words so they can't be printed in finite time. If this isn't obvious to you, the classic examples are kinship terms like "great-grandmother" and "great-grandson", which can be continued indefinitely:


Unfortunately, they don't make clear what list they actually used, though a good guess would be the list found on Unix systems in /usr/dict/words or /usr/share/dict/words, which usually is a list of only 45,425 words though there is a longer variant containing 235,882 words. Both, being finite, are infinitely shorter than a list of infinite length. Fortunately, the quality of the GNOME project doesn't depend on its theoretical linguistic sophistication. (I say "theoretical" because GNOME now supports 45 languages.)
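To make the contrast concrete, here is a small sketch, not anything the GNOME team is known to have used. The names `great_terms` and `count_words` are invented for illustration, and the default path is only the guess made above; it may not exist on a given system, in which case the count comes back as None.

```python
import os
from itertools import count, islice

def great_terms(base="grandmother"):
    """Yield great-grandmother, great-great-grandmother, ... without ever stopping."""
    for n in count(1):
        yield "great-" * n + base

def count_words(path="/usr/share/dict/words"):
    """Return the number of non-empty lines in a word list, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8", errors="replace") as f:
        return sum(1 for line in f if line.strip())

# The generator never runs out, so we can only ever look at a finite slice of it.
first_three = list(islice(great_terms(), 3))
print(first_three)

# The word list, by contrast, has a definite length (system-dependent, possibly None).
print(count_words())
```

Printing the whole of `great_terms()` would never finish, which is the point: any benchmark that terminates was run on a finite list.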

Posted by Bill Poser at 05:32 PM

Brokeback generalization in context

Jack Carroll reports from Cyprus:

... in the commentary to a fashion show review on Euronews one collection of women's wear was described as having "topical Brokeback chic."

This before the actual clothes were shown.  What do you think was Brokeback about them?

Yes, Western wear: cowboy hats, cowboy-inspired slacks, and cowboy boots.  Absent here is any reference whatsoever to the characters or plot of "Brokeback Mountain".  In this context all that matters is the clothes, just the clothes, ma'am.

March 15, 2006

Ayn Rand, linguist?

Joe Gordon sent in a link to a blog post by TBoggs commenting on a quote from Ayn Rand's Atlas Shrugged:

"If you ask me to name the proudest distinction of Americans, I would choose--because it contains all the others--the fact that they were the people who created the phrase 'to make money.' No other language or nation had ever used these words before; men had always thought of wealth as a static quantity--to be seized, begged, inherited, shared, looted or obtained as a favor. Americans were the first to understand that wealth has to be created. The words 'to make money' hold the essence of human morality."

Joe's question is of course a linguistic one: "Is that even close to accurate?  Is 'make money' such a rare phrase?"

I don't know how to answer the statistical question: what fraction of history's money-using cultures have described the accumulation of wealth by using a verb that can also mean "create" or "construct"? But it's easy to discover that we Americans were not the first to do so.

If Ms. Rand had looked up make in the OED she would have discovered a pre-Columbian citation:

1472 R. CALLE in Paston Lett. (1976) II. 356, I truste be Ester to make of the leeste l marke.

(I believe that in modern orthography this would be "I trust by Easter to make of the least one mark".) The letter's context makes it clear that Richard Calle was not writing to Margery Paston about a counterfeiting scheme. The quantity expression "1 marke of money" is antique, but the sense in which Calle anticipated "making" that amount seems contemporary enough.

If she had looked up pecunia in Lewis and Short, she would have learned that the Romans used the analogous locution:

I. property, riches, wealth (cf.: divitiae, res, bona, etc.).
I. In gen. ... pecuniam facere, to accumulate property, Cic. Div. 1, 49, 111

Lewis and Short choose "accumulate" as an English gloss, which Rand (or her mouthpiece Francisco d'Anconia) might have pounced on as an example of the old-fashioned thinking that sees wealth as a conserved quantity like gold, rather than one that can be created from nothing, like poetry. But L&S are citing a thoroughly capitalistic (or at least market-oriented) passage in Cicero's De Divinatione:

XLIX 109 ... Quos prudentes possumus dicere, id est providentes, divinos nullo modo possumus, non plus quam Milesium Thalem, qui, ut obiurgatores suos convinceret ostenderetque etiam philosophum, si ei commodum esset, pecuniam facere posse, omnem oleam, ante quam florere coepisset, in agro Milesio coemisse dicitur.

In Falconer's translation:

Such men we may call 'foresighted' - that is, 'able to foresee the future'; but we can no more apply the term 'divine' to them than we can apply it to Thales of Miletus who, as the story goes, in order to confound his critics and thereby show that even a philosopher, if he sees fit, can make money, bought up the entire olive crop in the district of Miletus before it had begun to bloom.

If Ms. Rand had looked up faire in Dictionnaire de L'Académie française, 8th Edition (1932-5), she would have found that

FAIRE signifie aussi Amasser, assembler, mettre ensemble, en parlant d'Argent ou des autres choses dont on a besoin de se pourvoir. Il tâche de se faire quelque argent.

FAIRE also means to Amass, assemble, put together, in speaking of Money or of other things that one needs to provide for oneself. He tries to make himself some money.

I don't know for sure that this usage in French is an old one, but given that the corresponding expression existed in Latin, I think it's a good bet. (And Rand's Francisco d'Anconia would again have pointed triumphantly to the academicians' assumption that this sense of faire is equivalent to "assemble" rather than "create" -- but that tells us about the views of the dictionary writers, and not necessarily those of the users of the language.)

I don't have time to look into Greek, Sanskrit, Hebrew and so on, but we've established that a few minutes in the reference section of a good public library would have allowed Ms. Rand to debunk her little linguistic homily. I don't know whether she would have cared: as always with such arguments from etymology (see here and here), the point is not philological but metaphysical. At least Rand's fable has the merit that if the claimed facts of language were true, they would be relevant to the argument.

On a related topic, I think I owe John Powers and Cullen Murphy an apology. This is not because I criticized them for lexicographical non-feasance -- they were guilty as charged. But I was wrong to imply a falling off from previously higher standards of scholarship:

Can't anybody use a dictionary anymore? I enjoy a good curmudgeonly rant about how English is going to the dogs these days, I really do. But why can't the journalists who crank out such screeds check their lexical prejudices against a good dictionary or two?

Since Atlas Shrugged was published in 1957, Rand's lexicographical insouciance antedates theirs by almost half a century.

[Update: Rob Groves writes that

... there exists an adjective in Greek, Plutopoio/s (πλουτοποιός) which the full Liddell and Scott 9th ed. cite as wealth-creating and show it being applied to nouns like te/chne (τέχνη)=skill, talent etc. and chre^ma (χρῆμα) =good, possession, etc.

Here's the link -- the transliteration used at Perseus is plouto-poios. ]

[Update 1/27/2007 -- John Cowan writes:

it's clear that Richard Calle in 1472 said "at the leeste L marke", not "at the leeste 1 marke"; in other words, "at least 50 marks". Wikipedia says that a mark after the Norman Conquest was 2/3 of a pound sterling, so he's talking about 33.33 pounds sterling.

How much is that? Following the method at, which takes into account the Phelps Brown-Hopkins consumables index value for 1472 of 104 and the December 2006 U.S. Consumer Price Index of 201.8, we get about $35,000 in current U.S. money. Definitely "making money".

In any case, a little knowledge of the history of economics would have told Rand that the counterpart to the "Spanish theory of value" (paraphrased by a Spaniard of my acquaintance as "Let's bring lots of gold and silver into the country, and we'll ALL get rich!") was the "English theory of value" (that wealth is a product of laboring), not the "American theory of value". But perhaps to point this out is to engage in a battle of wits with the unarmed.

Wow. ]
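The first step of the arithmetic in John Cowan's note is easy to check. The sketch below covers only the marks-to-pounds conversion, taking the 2/3 rate from the note as given; the function name `marks_to_pounds` is invented for illustration. The later conversion to modern dollars depends on a method whose reference is missing here, so it is not reproduced.

```python
from fractions import Fraction

# Rate quoted in the note: one mark = 2/3 of a pound sterling (an assumption taken from the text).
MARK_IN_POUNDS = Fraction(2, 3)

def marks_to_pounds(marks):
    """Convert a sum in marks to pounds sterling as an exact fraction."""
    return marks * MARK_IN_POUNDS

pounds = marks_to_pounds(50)  # "L marke" read as the Roman numeral 50
print(pounds, round(float(pounds), 2))
```

Using exact fractions keeps the 33.33 figure honest: the sum is 100/3 pounds, and the decimal is only a rounding of it.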

Posted by Mark Liberman at 12:07 PM

The Recency Principle Lives

One of the things that seems to characterize many of the tape-recorded, surreptitious undercover conversations I've analyzed in court cases over the years is the recency principle. Often when cops bring up something that might incriminate a suspect, they quickly switch to a different, benign topic or two before the suspect even has a chance to respond to the potentially incriminating one. Most of the time suspects answer the most recent topic on the table. Unless they are very alert, the cop's initial bad topic often gets lost in the ebb and flow of conversation. People tend to respond to the most recent one. I call this the "hit and run" strategy of undercover cops (Creating Language Crimes, Oxford, 2006). In cases where tapes are the evidence, even if the target makes no comment at all about the bad stuff, it's on the tape for later listeners such as juries to hear. They don't know about the recency principle. All they know is that some incriminating information is there and what's on the tape looks very bad for the suspect.

Unless this strategy is exposed by a linguist, the undercover cop's use of the recency principle works pretty well to bring convictions in criminal cases. Only recently have I thought much about how it works in politics and world events. Now it occurs to me that it may be going on right now before our very eyes. As of today at least, the recent flap over control of the US ports is the current centerpiece for judging federal incompetence. It causes us to concentrate less on the increasingly long series of other recent political flaps, such as the Cheney hunting event, which in turn made us concentrate less on the illegal wiretapping issue, which made us concentrate less on the Katrina incompetence, which made us concentrate less on the prison atrocities, which made us concentrate less on going to war in the first place. Makes me wonder if our government is on to the way the recency principle works. It's hard to believe that our bungling leaders deliberately create new goofs to make us respond only to the most recent ones and cause us to forget the original incompetence that started it all. But it bears an eerie similarity to what cops do in undercover tape cases--the good old "hit and run strategy," only on a much larger scale.

Posted by Roger Shuy at 11:47 AM

March 14, 2006

The perils of semiotic speculation

During the recent controversy over the Danish cartoons, it's struck me how many people think that the events are messages of some sort, though they rarely agree about who has been communicating what to whom. The riots and embassy attacks have been characterized as a message from Syria to France, or from Saudi Arabia to the U.S., or from Iran to Europe, or from the Islamic "street" to the west as a whole. And perhaps the cartoons were a message from the Danish establishment to its Muslim residents, or from the European right to the European left, or ...

All of this semiotic speculation reminded me of Michael Dibdin's detective novel Blood Rain, in which Italian detective Aurelio Zen is sent from Rome to Sicily to coordinate with anti-Mafia activities -- or perhaps to spy for his superiors on his colleagues, or perhaps just to get him out of the way. The book's theme is that every event -- every death, in particular, since this is a detective story -- is a message. But for people who think that way, of course, more messages are received than have been sent. [Warning: plot spoilers ahead.]

Zen's daughter happens to be in the same Sicilian city, working as an IT consultant to install and test some new software for the anti-mafia task force. He meets her in a café:

Zen took a sip of the scalding coffee, which jolted his head back briefly, then pulled over the copy of the newspaper which the woman had been reading. DEATH CHAMBER WAGON TRACED TO PALERMO, read the headline. Aurelio Zen tapped the paper three times with the index finger of his left hand.

"So?" he asked, catching his companion's eyes.

The woman made a gesture with both hands, as though weighing a sack of some loose but heavy substance such as flour or salt.

"Not here," she said.

The headline refers to a badly decomposed corpse found in a locked railway car. Zen is perplexed by his daughter's concern with being overheard by strangers.

"If you start thinking like that, you'll go mad."

"And if you don't, you'll get killed."

Zen snorted.

"Don't flatter yourself, Carla. Neither of us is going to get killed. We're not important enough."

"Not to be a threat, no. But we're important enough to be a message."

She pointed to the newspaper.

"Like him."

Corinna Nunziatella is a judge assigned to anti-mafia cases, who strikes up a friendship with Zen's daughter. A bit later in the book, she reads

... the transcript of an interview between a magistrate in Palermo and a pentito, one of the former members of Cosa Nostra who had agreed to collaborate with the authorities in return for them and their families being buried and rebirthed in the government's witness protection programme, safe from the vengeance of those they had betrayed.

Sometimes, yes, but normally we just kill them. It's quicker and cheaper. Saves a lot of effort. When you kill someone, you also send a message. Maybe even many messages.

Even contradictionary messages?
Especially those. But it has to be done right. There's an art to the thing. Because there's no such thing as a messageless death, are you with me?

In other words, if a message doesn't exist, someone will invent one.
Exactly. So you have to make sure that some message comes through loud and clear. Otherwise the communication can get fouled up. And when that happens . . .

When the messages start going astray, there's no rhyme or reason any more. No one knows what's going on, so everyone's extra edgy. Mistakes happen, and those mistakes breed others. Before you know where you are, you have another clan war on your hands.

So these executions have to be correctly performed. It's a sort of ritual theatre, in other words, like the priest consecrating the host. What's the matter?
Look, I'm trying to cooperate, all right? We're different men with different objectives, but I respect you just as you respect me.

Of course.
So no more jokes about the holy mass, please.

I apologize. To go back to what we were discussing, can you give me an example of such a message?
There are so many. But I'll mention a recent one.

Just to show that, even though for your own protection you're in solitary confinement down at the Ucciardone prison, you're still in touch.
Why would you take me seriously if you thought you were dealing with someone whose clock stopped when he got picked up? Anyway, the thing I'm thinking of is that body they found in a train near Catania.

The Limina case.
Only it wasn't the Limina kid at all, is what I've heard.

Who, then?
Some sneak thief who was picked up operating on protected turf. He'd been warned before, but he had more balls than brains. They were going to waste him in an alley somewhere, but then someone had a brighter idea. The thief looked quite a bit like Tonino Limina. Same age, same height and build, same colour hair. The Limina clan have been making themselves a bit of a nuisance on this side of the island, so a warning seemed in order. They shut the thief up in a freight car on a train bound from Palermo to Catania, with a label with 'Limina' scrawled on it. One message delivered and one undesirable disposed of. A perfect solution.

But the Liminas explicitly denied that the murdered man was their son. Obviously they knew that Tonino was still alive. So the message was pointless.
No message is pointless. Maybe in this case it wasn't the young Limina. Next time, who knows?

Ironically, Zen's daughter will later be killed, not to send a message, but instead because she has become a threat. Nunziatella will be blown up in the same car, and everyone assumes that the bomb was meant for the judge, to send a message to the anti-mafia task force.

In fact, this is all backwards. The body in the railroad car really was Tonino Limina, not a thief who looks like him. And all of the murders were done by the Rome-based components of the anti-mafia task force. Why? Zen's daughter was killed because she had stumbled on some potentially incriminating information in the computer system. Tonino Limina and Coruna Nunziatella were killed to stir up trouble among the clans, and thereby to send a message to the Italian government, that the task force is still needed.

In Blood Rain, everyone's elaborate interpretation of the semiotics of events is nearly always completely wrong. As the conventions of the genre dictate, there really is a communicative agent and a communicative intent behind the novel's events, even though the agent and the intent are not what everyone thinks they are. In real life, messages can be received in the absence of any well-defined communicative intent at all -- though I have no idea whether or not this was recently the case in Copenhagen, Damascus and Beirut.

In another cafe scene in Blood Rain, a local cop named Baccio Sinico is explaining one of the fine points of Sicilian communications technology to Zen, who sums the lesson up as follows:

"Fine, so this Spada, whose name isn't Spada, makes a living by passing on messages in a way that is also a message in itself. Am I right?"

"Bravo," said Sinico with a curt nod. "You're starting to understand."

"All I understand is that I don't understand a damn thing."

"You'd be surprised how many people don't even understand that, dottore."

Posted by Mark Liberman at 02:44 PM

March 13, 2006

Voice confused with tense at the Economist

An anonymous reviewer in The Economist [print edition, 11 March 2006, p. 77], where writers are always anonymous unless there is a special reason to reveal their identity, says of the new novel Company that the author Max Barry "is a master of short sentences and the passive tense." Passive tense? The anonymity is a blessing here, since it serves to protect the reviewer from public shaming. I don't really relish the role of pedant, and I can guess what the writer meant; but the passive involves a voice contrast; it has absolutely nothing in common with tense. I am astonished, all over again, at how educated people can commit blunders as extreme as this one in print, and editors don't even notice.

People normally follow Strunk and White obediently in complaining about use of the passive, but it often turns out that they cannot actually identify passives to save their godforsaken lives, as Language Log has remarked so many times before (see this post and this one and this other one and yet one more here and still another one here, for example). In the case at hand today, the passive is apparently being approved of rather than condemned — but by a writer who cannot even tell it from a tense.

I know, you're going to say, why should we care? Well, the view I take is that if we are going to pay any serious attention to the formal properties of prose in our language, which is exactly what this reviewer is trying to do, we have to have a coherent system for talking about it; we can't just talk uninterpretable nonsense.

Tense is an inflectional category of verbs that has time reference as its primary semantic function. English has two orthogonal tense contrasts: the primary one is present vs. preterite (compare writes with wrote), and the secondary one, marked with the past participle preceded by the auxiliary verb have, contrasts non-perfect with perfect. The have can be in either present or preterite, so we get both writes / has written and wrote / had written. Subjects and objects are unaffected by tense changes.

In the passive voice, on the other hand, the semantics of the subject is assigned differently. The active voice, as in Mary wrote a letter, has the agent role associated with the syntactic subject (in this case Mary, denoting the person who did the writing). In the passive counterpart, A letter was written, the subject (a letter) is what would have otherwise been expressed as the direct object, had active voice been used.

The problem of people confusing voice with tense is not a huge danger for the future history of the world; but for heaven's sake, if people have absolutely no idea how to use technical terminology of grammar, why do they try, even when writing for print publication? Why do they imagine they can just guess at random and put their unchecked guess in The Economist? And why do the editors let them? I don't know. I'm at a loss for words. So I'll just fall back on repeating a splendid piece of rant that Mark posted a while ago on Language Log, in a depressingly similar context:

I hate this role of correcting elementary errors of linguistic analysis, or questioning unthinking prescriptions that are logically incoherent, factually wrong and promptly disobeyed by the prescriber. Historians aren't constantly confronted with people who carry on self-confidently about the rule against adultery in the sixth amendment to the Declamation of Independence, as written by Benjamin Hamilton. Computer scientists aren't always having to correct people who make bold assertions about the value of Objectivist Programming, as examplified in the HCNL entities stored in Relaxational Databases. The trouble is, most people are much more ignorant about language than they are about history or computer science, but they reckon that because they can talk and read and write, their opinions about talking and reading and writing are as well informed as anybody's. And since I have DNA, I'm entitled to carry on at length about genetics without bothering to learn anything about it. Not.

Right. What Mark said. Amen.

Posted by Geoffrey K. Pullum at 09:48 PM

Love, adverbially

Daniel Handler, better known to kids everywhere as Lemony Snicket, apparently doesn't agree with adverb-haters Elmore Leonard and Stephen King ("The road to hell is paved with adverbs," King once wrote). Mark Mr. Handler down as an authorial adverbophile. In fact, he's gone so far as to write a work of (adult) fiction called, simply, Adverbs: A Novel.

Naming a novel after a part of speech is just asking for trouble, since it inevitably will lead to metalinguistic confusion among reviewers who should know better. Here's a muddled passage from the Booklist review posted on Amazon (hat tip to yendi, aka Adam Lipkin, and his fellow Livejournalers):

The 16 intersecting stories (each headed by an adverb modifying the noun love) display a cadre of couplings: gay, straight, platonic, perverse.

On first read, the reviewer seems to imply that the intersecting stories are titled with adjectives modifying the noun "love": "gay," "straight," "platonic," "perverse." This would be bad news for Handler, since it would mean he should have called his work Adjectives: A Novel. But that's just a use-mention ambiguity, as those adjectives are simply describing the "cadre of couplings" in the book. The Booklist reviewer goes on to give some of the actual story headings, and they are indeed adverbs: "Obviously," "Symbolically," "Soundly," and "Frigidly." Publishers Weekly also mentions "Briefly" and "Truly," and a review of 4 Adverbs (a stage play derived from the book) adds four more: "Arguably," "Particularly," "Naturally," and "Wrongly."

So how could the reviewer think that these adverbs are "modifying the noun 'love'"? As we all learned in grammar classes, adverbs modify adjectives, verbs, or other adverbs, while adjectives modify nouns. But that's not the whole story. First of all, an adverb can modify a phrase, a clause, or even a whole sentence. And secondly, we often encounter adverbs used in an elliptical fashion as in Handler's headings, where it's unclear what exactly is being modified. Consider the Tennessee Williams play (and Joseph Mankiewicz movie) Suddenly, Last Summer. The title alludes to something happening "suddenly, last summer," but the suddenly-occurring event is left slyly unstated. (The tagline for the movie spoiled the allusivity: "Suddenly, last summer, Cathy knew she was being used for something evil!")

Then there was the triple-adverb movie title from 1991, Truly, Madly, Deeply (which has also served as the title to a hit song by the Australian pop duo Savage Garden). This turns out to refer to a nauseating game played by two lovebirds in the movie, who try to top each other by adding adverbs to "I love you" ("I really love you," "I really, truly love you," "I really, truly, madly love you," "I really, truly, madly, deeply love you," etc.). Or consider another romantic comedy title particularly relevant to the confusion over Handler's adverbs: Love Actually. As with Truly, Madly, Deeply, this is an allusive reference to dialogue within the movie — Hugh Grant as the Prime Minister gives a speech with the line, "If you look for it, I've got a sneaky feeling you'll find that love actually is all around." (According to IMDb, the working title for the film was Love Actually Is All Around, but it got shortened to Love Actually on final release.)

This is not to let the Booklist reviewer off the hook. Even in the case of Love Actually, it's not quite accurate to say that the adverb "actually" modifies "love"; rather, it's modifying an entire clause whose predicate remains in absentia (" all around"). Yet it's important to note that we have no problem digesting the movie title without knowing the fuller version. We're particularly adept at dealing with disembodied adverbs when they are of the type that can modify an entire sentence or independent clause. So-called "sentence adverbs" (or "disjunctive adverbs") — such as happily, fortunately, amazingly, frankly, strictly, thankfully, regretfully, surprisingly, clearly, basically, notably, and actually — can get tacked on to the beginning or end of a clause or sentence (sometimes occurring in a longer adverbial phrase, as in "amazingly enough" or "not surprisingly"). A sentence adverb can also occur medially, bracketed by an intonation pattern in speech that is usually represented by commas in writing (though not always, as demonstrated by "Love actually is all around"). One elliptical use for sentence adverbs is in answering yes/no questions (or more generally, expressing agreement or disagreement with an interlocutor's assertion): think of adverbs like absolutely, positively, definitely, obviously, or naturally when used as an emphatic response. Lou Costello once got stuck in a notorious use-mention dilemma involving Bud Abbott's adverbial response to "Who's on first?": "Naturally!"

So, unsurprisingly, Handler's story headings are mostly words that could serve as sentence adverbs, like "Obviously," "Briefly," "Arguably," "Particularly," "Naturally," and "Truly." (If Handler had really wanted to court controversy, he could have included the sentence adverb that rubs so many people the wrong way: hopefully.) A few of the titular adverbs serve some other elliptical purpose, such as "Symbolically," "Soundly," and "Frigidly." Without having read the book, I can only guess that these adverbs are intended to describe how the characters love. This may indeed be the source of the Booklist confusion. Perhaps the writer initially said that the adverbs "modify 'love'" with the idea that they modify the verb "love." But then an officious editor inserted the word "noun" before "love." It wouldn't be the first time that an editor got flummoxed trying to label parts of speech.

Posted by Benjamin Zimmer at 05:41 PM

Snowclone Mountain?

As Mark Liberman just noted, the coining of the playful "Spongeback Mountain" (blending two titles, "Brokeback Mountain" for a movie about two cowboys in love and lust, and "SpongeBob SquarePants" for a children's television show about two undersea characters that a few critics have seen as in a homoerotically tinged relationship) has a snowclonish feel to it.  Perhaps we're seeing the birth of an X-back Mountain snowclone.

Well, I don't think we're there yet.  "Spongeback Mountain" is just one of a number of variations on "Brokeback Mountain" that people have come up with since the movie opened last year, and only a few of these variations are of the form "X-back Mountain".  It looks like we're still in the playful allusion stage, as I argued for the Eye Guy figure a while back, in a posting titled "Critical tone for a new snowclone" (the title exemplifying a very distant variant on the original, "Queer Eye for the Straight Guy").  The history of a snowclone involves two separate points at which expressions are fixed in form:  a first fixing, in an expression with little variation, and then after a period of creative variations on this formula, a second fixing as a relatively invariant prefab template with open slots, that is, a snowclone.  But not every expression makes it to the second fixing; only some finish the snowcloning process.

As for "Brokeback Mountain", there are three variants that can arise from people's (unconsciously) reshaping brokeback to something that makes more sense to them:  in order of descending frequency, "Brokenback Mountain", "Bareback Mountain", and "Breakback Mountain".  The first and third represent "corrections" of the nonstandard broke, and the second looks like an eggcorn, especially for someone who knows even the barest outline of the plot of the movie.

"Bareback Mountain" also has a large number of occurrences that are clearly deliberate plays on "Brokeback Mountain".  And it's of the form "X-back Mountain".  As are "Buttback Mountain" and "Boneback Mountain" (also with a sexual allusion), for each of which there's a handful of jocular uses.

But most of the variants I've found vary something other than the "Broke" slot, or vary something in addition to this slot.

First, there's the "Mountain" slot.  This lends itself to the obvious sexual allusion, the imperfect pun "Mounting" or the perfect pun "Mountin'".  Thousands of occurrences of "Brokeback Mountin'" and hundreds of "Brokeback Mounting".  The "X-back Mountain" versions with broke "corrected" also occur with puns on mountain: a few of "Brokenback Mounting", one jesting "Brokenback Mountin'"; and a few of "Breakback Mounting".  And, of course, a HUGE number of "Bareback Mounting" and "Bareback Mountin'", with the sexual reference in both words of the title.

[Added 3/14/06: Over on soc.motss, Gwendolyn Alden Dean has suggested an RCMP movie "Brokeback Mounties".  There are a modest number of web hits.]

There is also at least one variant in which the whole first word is varied: "Backdoor Mountain".  This preserves the prosodic structure of brokeback, and its initial /b/, and pounds in an allusion to anal intercourse.  Hundreds of occurrences, plus a dozen or so each of the doubly sexual "Backdoor Mounting" and "Backdoor Mountin'".

Then there's the "back" slot to vary, usually with break in the "Broke" slot.  With reference to the movie: a hundred or so each of "Breakthrough Mountain" and "Breakdance Mountain", plus small numbers of "Breakup Mountain", "Breakfront Mountain" (once with reference to Will and Grace), "Breakout Mountain", and "Breakbone Mountain".  Plus a hundred or so occurrences of "Brokedown Mountain".

I don't see a snowclone here.  I just see people playing with every part of the movie title they can, especially in ways that add sexual reference.

Now, if you look back at my postings last year on snowclones, you'll see a lot of stuff (especially in October) about distinguishing snowclones from other kinds of formulaic language.  Here's how I now see the development:

Pre-formula stage: an idea is expressed in various ways, say "what one person likes, another person detests", "things that please some people repel others", etc.  All of these expressions are understood literally, require no special knowledge (beyond knowledge of the language) to understand, and can be created on the spot.

First fixing: somebody produces an especially apt way of expressing the idea, uses an effective metaphor, or devises a memorable title or name.  This expression, which is essentially fixed in form, then spreads, and gains currency as a cliché, catch phrase, proverb, quotation, or well-known title or name.  "One man's meat is another man's poison", for instance.  Or the movie title "Brokeback Mountain".

Variation on the fixed expression: the fixed expression may quickly extend by developing open slots, or by playful allusion to it (via puns or other variations of it).  "One man's Mede is another man's Persian", for instance.  In many cases, every part of the fixed expression that can be varied for effect is, by somebody or other.

Snowcloning (the second fixing): these variants become (relatively) fixed as formulas with open slots in them, as in "One man's X is another man's Y".  It's still possible to play creatively with the expression (just as we can play creatively with idioms), but most occurrences of variants will fit the template.  I don't think we're there yet with the "Brokeback Mountain" variants, and I suspect that (as with the Eye Guy variants) we'll never get there; once the movie and the television show recede from the front stage of popular culture, these variants will be seen as quaint relics of the past.

I don't want to give the impression that the four stages can be crisply distinguished.  As with all historical changes, things proceed differently for different people; the resources of earlier stages will usually still be available at later ones; and there will be some borderline cases.  For instance, as I said last May in "An Avalanchlet of Snowclones":

... the line between clichés, some of which can have open slots (the wonderful world of X, as in the wonderful world of snowclones...), and the somewhat more complex classic snowclones, like the X have N words for Y (which gave the genus its name), is not at all clear.  Probably it's like the line between idioms and constructions: there are pretty clear examples at the extremes (the idiom by and large, the construction Subject Auxiliary Inversion), but [there's also] a range of intermediate types, with varying degrees and kinds of freedom as to what can fill the slots in the pattern and with varying degrees of semantic and pragmatic specialization.

But it isn't all fuzziness.  Sometimes you can feel pretty sure that things are in stage 3 for pretty much everybody.

Posted by Arnold Zwicky at 02:54 PM

Interpretation, Translation and the Mess in Our Courts

Bill Poser's post on the problems of court interpretation is a stark reminder of how much linguists need to do in the matter of justice for speakers of languages other than that of the courts (see here and here). A few linguists have addressed the issues of court interpretation and how critical it can be. Susan Berk-Seligson's The Bilingual Courtroom (Chicago, 1990) and Sandra Hale's The Discourse of Court Interpreting (John Benjamins, 2004) leap to mind but there are others as well. One recent book related to court interpretation of deaf people is Language and the Law in Deaf Communities, edited by Ceil Lucas (Gallaudet University Press, 2002). It shows how the problems Poser mentions for non-English speakers are even greater when the deaf take the witness stand.

A few years ago I encountered this problem in a case in which English transcripts of Spanish speakers in undercover tape recordings were used as evidence. The prosecution provided only an English translation of the tapes. Even with my limited knowledge of Spanish I could tell that they were dead wrong in several crucial places. I pointed this out to the defense attorney who then commissioned a translation of his own. Not surprisingly, it showed that the government's translation was badly in error. Naively perhaps, the judge then ruled that the two opposing translators should get together and try to agree on a single, accurate translation. This effort failed miserably, of course, and the judge finally ruled that both translations could be used at trial. His decision was hardly Solomonic but it was better than many of the usual judicial rulings about the use of foreign languages at trial.  

But this type of solution is usually not available in the case of trial testimony, when on-the-spot interpretation is required. Last I heard, in such cases there is commonly no transcription of the original foreign language used by the witness, thereby leaving no record that can be checked for accuracy of the interpretation. So it has to be accurate on the first try. The system simply skips this important phase and goes directly to the interpreter's version of events which, as Bill points out, becomes the official court record. This can be an unmitigated disaster in terms of justice.

I recently had the opportunity to hear government translation officials describe how the thousands of backlogged intercepted communications in US intelligence offices are made available for security analysis. Without embarrassment they told me that the translators select tapes in order to discover what might be important for national security purposes and then provide English translations of those passages that they consider important. At least two crucial steps are missed entirely here. First, translators are not intelligence analysts and cannot possibly know what is "important for national security purposes" and what is not. Second, this process offers no way to determine whether the translations of those passages are accurate because they don't make a transcription of the original language that can be checked or verified for translation accuracy. No time for this, they explain, and leave it at that.

Needless to say, linguists still have a lot of work to do in educating the fields of law and government about language, interpreting and translation.

X-back mountain

As Arnold Zwicky observes, Brokeback X hardly counts as a snowclone, except in the limited sense that any similar case would be, where we generalize a modifier from a specific referent in order to evoke a cloud of associated qualities. That's an everyday step in the evolution of words, an instance of what the Greeks named for us as metonymy -- with a small twist, since it's applied to half of a two-word title. But Chris Burrows sends in an example taking off from the movie title in a direction that is more canonically snowclonish: "Spongeback Mountain!"

Posted by Mark Liberman at 10:28 AM

What is the negative voice of authority?

During the NPR Marketplace show of 3/10/2006, Tom Bedore starts off a segment this way:

Currently set at eight point two trillion dollars, our new debt ceiling could approach nine trillion dollars. [audio link]

Although Marketplace is a business show, and Bedore is a comedian, his segment is actually about linguistics.

In particular, it's about the phonetics of rhetoric, as Bedore immediately explains:

And I said "new debt ceiling could approach nine trillion dollars" in a manner that emphasizes
I understand the significance of the situation,
and that it is bad.
Well, I have no idea what I'm talking about, and just assume nine trillion dollars in debt is bad.
But it might be a great thing.
I have no idea.

So what exactly is the "manner" that emphasizes his (pretended) understanding of the situation's gravity?

To help you think about it, here's a waveform and pitch track of "new debt ceiling could approach nine trillion dollars", with a link to the .wav file:

Send me your analysis, and I'll summarize what I get, and give my own opinion.

You don't need to accept Bedore's account of what his "manner" communicates. His performance of the phrase is certainly marked in several ways, and he describes insightfully what he's communicating when he performs it; but you should start with an open mind about what the performance itself contributes, independent of the content and context of the material performed.

During the rest of Bedore's piece, he repeatedly adopts an authoritative voice, and then immediately subverts it. As he says after his next round of self-exposure:

Some of you thought that I was making sense until I pointed out I had no idea
what I was talking about.
And I pointed out that I had no idea what I was talking about only because
I'm not a politician.

Or, he might have said, a radio journalist. Or a blogger, or a consultant, or a teacher, or just about anyone except a comedian. His generalization:

It's too easy for people to sound like they know what they're talking about
and not know what they're talking about.

All too true, and just as true about the phonetics of rhetoric as about any other topic. So if you happen to know what various experts have to say about the form and meaning of prosodic patterns like those in Bedore's performance, feel free to disagree.

Posted by Mark Liberman at 07:17 AM

March 12, 2006

Pretty girl visiting your town

More evidence of the utter brain-dead character of the techniques that are being deployed to harvest email addresses from the lonely and the stupid: Here's the text of a spam email I recently received (on my birthday, in fact, though I think that was coincidence):

From Wed Mar 8 04:27:03 2006
Date: Thu, 09 Mar 2006 07:29:16 +0200
From: "Carlos"
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5)
Gecko/20031013 Thunderbird/0.3
Subject: spending time with you

Hope I am not writing to wrong address. I am nice, prettya looking girl. I am planning on visiting yoaur town this month. Can we meet each other in person? Message me back at

You are writing to the right address, Carlos (I suppressed the To: line above, lest it be harvested by yet more spammers). But don't you have a rather unusual name for a girl? Or is your sister using your account on a remailer in Finland to trawl for guys in the Boston area?

By the way, you have two spelling mistakes and two syntactic errors (anarthrous singular NPs) in the message. If you will permit a word to the wise for when you visit Cambridge later this month: guys around here — at Harvard and MIT — may expect a higher level of literacy than this, even from a pretty girl like you.

Posted by Geoffrey K. Pullum at 01:11 PM

March 11, 2006

A pirated Barbie-ism

Veteran Wikipedian Leflyman sends along an intriguing early variation on the popular expression attributed to Teen Talk Barbie: Math is hard, let's go shopping! (later snowcloned into X is hard, let's go shopping!). To recap my earlier discussion, the Teen Talk Barbie doll was introduced by Mattel in 1992 and came programmed with certain sayings, including "Math class is tough" and "Want to go shopping? Okay, meet me at the mall." After protests from female educators, Mattel removed "Math class is tough" from the selection of utterances. But Barbie's vacuous math-talk lived on in the popular consciousness, grafted onto her shopping-talk in the compact form of "Math is hard, let's go shopping!" Usenet newsgroup participants were using this canonical version by May 1994 and began snowcloning it (replacing "math" with other tough stuff) in early 1996. The snowclone was firmly established by 1997.

Meanwhile, as Leflyman has discovered, the expression was germinating in the minds of video game designers.  In August 1995 Jonathan Ackley and Larry Ahern set to work on "The Curse of Monkey Island," the third game of the popular "Monkey Island" series from LucasArts. As with previous installments, the video game centered on the villainous shape-shifting pirate LeChuck, whose incarnations included a joke-cracking toy doll. When the game was finally released in November 1997, it included this line from the LeChuck doll: "Arrr! Math be hard! Let's go shopping!"

This doesn't quite rise to the level of a snowclone, since the original expression isn't used as a template with a key word like "math" replaced with something else. Rather, the variation comically shifts the saying into a different register, namely pirate talk. As we all know, pirates love using the interjection arrr. (For recent evidence, see this Saturday Night Live sketch about a convention of hyper-rhotic pirates and their keynote speaker, Peter Sarsgaard.) As Mark Liberman's post on arrr revealed, we owe this stereotypical bit of pirate-ese to Robert Newton, who portrayed Long John Silver in the Disney version of Treasure Island. Newton himself may have modeled his hyper-rhotic speech on maritime pidgin English dating back to the 17th and 18th centuries, with roots in the southwest of England. And we also know that pirates love using be as an invariant copula. According to Geoffrey Nathan in the update to Mark's post, invariant use of be in pirate talk may also represent a fossilized remnant of maritime pidgin English. So there's nothing better than "Arrr! Math be hard! Let's go shopping!" for a Barbie-aping pirate doll to say. (Putting incongruous Barbie talk into the mouth of a pirate doll is also likely an homage to the notorious Barbie Liberation Organization.)

The appearance of variations on the Barbie-ism in Usenet newsgroups like soc.motss and in the dialogue to a popular video game suggests that it was spreading in many different directions in the mid-'90s. Just in case there was any confusion about my original post, I did not mean to imply that the first known Usenet citation for Math is hard, let's go shopping! — from Nick Fitch on soc.motss in May 1994 — was the baptismal usage from which all others flowed. (Fitch himself has said that he's sure the expression was already in use when he posted it on soc.motss.) Rather, both Fitch and the video game designers represent different vectors for the expression, as they say in the field of urban folklore. As early adapters, they may have been influential in spreading the Barbie-ism in memetic fashion, but neither vector was solely responsible for its circulation. It's useful to return to a point made by Arnold Zwicky in his post on historical snowclonology: documenting the spread of a snowclone is not the same as discovering its origin. We will probably never know who first said Math is hard, let's go shopping!, or who first thought up X is hard, let's go shopping! as a generalizable template. But that doesn't mean we can't use the resources at our disposal, like searchable databases of online conversations, to discern the various formative pathways of the meme.

Posted by Benjamin Zimmer at 02:55 PM


People love to count words. I mean, exactly.

I'm sympathetic, and have some numbers of my own to contribute. Last night and this morning, in the process of working towards some corpus-based lexicography in Bengali, I wgot 12,328 .htm files from a journalistic site, containing 2,771,625 Bengali word tokens representing 91,407 distinct Bengali wordform types, with the commonest 16,848 wordforms giving 95% coverage of the tokens in the collection.

When I think back to what things were like even 15 or 20 years ago, this amazes me. The whole business required the following steps, taking between one and two minutes of my time in total:

(1) Find a promising source of Bengali text, by searching Google with a Unicode string for a common Bengali word. I used আমাদের, selected from the BBC News Bengali page (it's not the BBC site whose statistics I report -- the Beeb only keeps a few thousand words of text there, as far as I was able to determine). Time: 30 seconds.

(2) Use wget to download all the html pages from one (promising-looking journalistic) site. This was a one-line command on my desktop linux box. Time to issue the command: 10 seconds. (The download took several hours, but I was asleep at the time.)

(3) Find all the Bengali words in the downloaded material. This required me to write a short shell script

for n in `find . -name "*htm*" -print`
do
  getbengali <$n
done | hist >BengaliHarvest1.hist

which calls a short perl program to extract Bengali word tokens from an arbitrary file:

use utf8;
binmode(STDOUT, ":utf8");
binmode(STDIN, ":utf8");
my $n; my $state = 0;
# 0 = start
# 1 = within Bengali
# 2 = within non-initial non-Bengali
while (<>) {
  while ($_ =~ s/^(.)//) {
    $n = ord($1);
    if($n >= 0x981 && $n <= 0x9FA){ # Bengali unicode range
      $state = 1; print("$1");
    }
    elsif($state == 1){ # start of non-Bengali
      $state = 2; print("\n");
    }
  }
}

I'll confess that it took me a while to learn about binmode() and ord() -- my first few tries at writing the perl program to pull out Bengali words failed mysteriously, and I actually had to skim the documentation on how to deal with utf8 in perl. However, I went through that learning process a week ago, so I already had the "getbengali" program in hand -- as well as a simple program to compute lexical histograms, written many years ago -- and therefore all I had to do was write the little shell script. Time this morning needed: 30 seconds. (Well, that was my time -- the script ran while I was preparing and eating a bowl of oatmeal.)
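For readers who would like to try the counting step themselves, here is a rough Python sketch of the same idea: pull out runs of characters in the Bengali range, build a wordform histogram, and ask how many of the commonest types are needed to reach a given token coverage. It is not the "getbengali" or "hist" program described above; the function names and the tiny sample string are assumptions made for illustration only.

```python
from collections import Counter

# Same code-point range as the perl program above.
BENGALI_START, BENGALI_END = 0x981, 0x9FA


def bengali_tokens(text):
    """Yield maximal runs of characters that fall in the Bengali range."""
    run = []
    for ch in text:
        if BENGALI_START <= ord(ch) <= BENGALI_END:
            run.append(ch)
        elif run:
            yield "".join(run)
            run = []
    if run:
        yield "".join(run)


def types_for_coverage(counts, target=0.95):
    """Number of most-frequent types needed to cover `target` of all tokens."""
    total = sum(counts.values())
    covered = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return rank
    return len(counts)


# Hypothetical sample standing in for one downloaded page.
sample = "আমাদের দেশ, আমাদের ভাষা <b>আমাদের</b> দেশ"
counts = Counter(bengali_tokens(sample))
print(sum(counts.values()), "tokens,", len(counts), "types")
print(types_for_coverage(counts, 0.95), "types give 95% coverage")
```

On a real collection the same two numbers (token count and type count) fall out of the histogram, and the coverage figure is just a running sum over the types sorted by frequency.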

The total amount of my time needed to obtain a 2.7M-word corpus of Bengali and compute the frequencies of wordforms in it: about a minute and a half. I invested another minute or so in creating and browsing a concordance. I imagine that somewhere, Hugo of St. Cher is smiling and shaking his head. The 500 Dominican monks who worked for him may be signaling with a different body part.

[For those of you who are not text encoding geeks, or perhaps not even all that interested in computers, here's a fun story. (The rest of you may enjoy it too.)

The encoding known today as UTF-8 was invented by Ken Thompson. It was born during the evening hours of 1992-09-02 in a New Jersey diner, where he designed it in the presence of Rob Pike on a placemat (see Rob Pike’s UTF-8 history). It replaced an earlier attempt to design a FSS/UTF (file system safe UCS transformation format) that was circulated in an X/Open working document in August 1992 by Gary Miller (IBM), Greger Leijonhufvud and John Entenmann (SMI) as a replacement for the division-heavy UTF-1 encoding from the first edition of ISO 10646-1. By the end of the first week of September 1992, Pike and Thompson had turned AT&T Bell Lab’s Plan 9 into the world’s first operating system to use UTF-8.

All too often, we neglect the continued important role of scraps of paper in technological history.]

[I should also add that I don't know any Bengali. How I came to be involved in a bit of Bengali corpus-based lexicography is another story.]

[And I hope it goes without saying that counting words so precisely is actually very silly, at least with reference to questions like "how many words did English borrow from Greek?", or "how many words are there in English?" ...]

[Update: Ben Zimmer points to this quote from My Big Fat Greek Wedding:

Give me a word, any word, and I show you that the root of that word is Greek.


Posted by Mark Liberman at 08:37 AM

Civics lesson

Remember that survey about Americans' knowledge of the five members of the Simpsons family vs. their knowledge of the "five freedoms" guaranteed by the First Amendment? ("Counting Freedoms, Simpsons and Percentages"; "Freedom of Speech: More Famous than Bart Simpson".) I wasn't the only one who thought that the press coverage told us more about the art of American public relations than about the state of American civics instruction. Carl Bialik, the WSJ's "Numbers Guy", picked up the problem with the survey question in his column today:

The survey boiled [the First Amendment] down to five freedoms: freedom of speech, freedom of religion, freedom of the press, freedom of assembly and freedom to petition for redress of grievances.

But as the blog Language Log pointed out, "The wording of the First Amendment only mention two 'freedoms' as such (speech and press), plus two 'rights' (assembly and petition); and religion gets mentioned twice (no establishment of it, no prohibition of it)," but the survey only counted it as one freedom."

Jamin Raskin, professor of constitutional law at American University, told me he considers the First Amendment to contain six rights: He treats free exercise of religion and the establishment clause separately. "It's clear as matter of constitutional doctrine that establishment doctrine and exercise clause are different thing," he told me.

Carl also picked up on the way the survey numbers were spun:

What's more, the details of the survey (which was conducted for the museum by market-research firm Synovate) aren't quite as dramatic as the headline. Yes, hardly anyone knew all five of the freedoms, at least as the survey defined them. But 69% of respondents got freedom of speech, arguably one of the more important provisions of the First Amendment. (By comparison, the most familiar Simpson family member, Bart, was named by 61% of those surveyed.)

All in all, I think this is an excellent topic for a civics lesson -- in the need to discount the rhetoric of survey-spinning!

Posted by Mark Liberman at 08:07 AM

Engrish explained

Illustrations of fractured English, particularly from East Asian countries, get passed around quite a lot online. There are even entire websites devoted to collecting absurd examples. The most notable focuses on Japan, where English is frequently used as a design element in advertising regardless of whether the words make much sense contextually. Others revel in poorly translated English as it appears on hotel signs, menus, and the like. (One well-circulated compilation originally appeared in Richard Lederer's 1987 book Anguished English.) Such collections tend to get tiresome -- even when not explicitly racist, they nonetheless partake in a long xenophobic tradition of ridiculing the English usage of non-native speakers. Belittling the pidginized English of speakers from East Asia has an especially checkered past in American dialect humor.

Every once in a while, though, there is a presentation of "Engrish" that both amuses and enlightens. Jon Rahoi, an American living in mainland China, posted scans from an exceedingly bizarre restaurant menu — so bizarre that a commenter accused Rahoi of forging the whole thing with Photoshop. But "an anonymous professor of China studies" came to Rahoi's defense by demonstrating exactly how one evocative menu item ("Benumbed hot vegetables fries fuck silk") could have reasonably ended up that way through dictionary-aided word-for-word translation.

Take #1313, "Benumbed hot vegetables fries fuck silk." It should read "Hot and spicy garlic greens stir-fried with shredded dried tofu." However, the mangled version above is not as mangled as it seems: it's a literal word-by-word translation, with some cases where the translator chose the wrong one of two meanings of a word.

First two characters: "ma la" meaning hot and spicy, but literally "numbingly spicy" -- it means a kind of Sichuan spice that mixes chilies with Sichuan peppercorn or prickly ash. The latter tends to numb the mouth. "Benumbed hot" is a decent, if ungrammatical, literal translation.

Next two: "jiu cai," the top greens of a fragrant-flowering garlic. There's no good English translation, so "vegetables" is just fine.

Next one: "chao," meaning stir-fried, quite reasonably rendered as "fries" (should be "fried," but that's a distinction English makes and Chinese doesn't).

Finally: "gan si" meaning shredded dried tofu, but literally translated as "dry silk." The problem here is that the word "gan" means both "to dry" and "to do," and the latter meaning has come to mean "to fuck." Unfortunately, the recent proliferation of Colloquial English dictionaries in China means people choose the vulgar translation way too often, on the grounds that it's colloquial. Last summer I was in a spiffy modern supermarket in Taiyuan whose dried-foods aisle was helpfully labeled "Assorted Fuck." The word "si" meaning "silk floss" is used in cooking to refer to anything that's been julienned -- very thin pommes frites are sold as "potato silk," for instance. The fact that it's tofu is just understood (sheets of dried tofu shredded into julienne) -- if it were dried anything else it would say so.

For more along these lines, see these scans of menus from Shanghai and discussion in the comments section. Some of the same mistranslations appear, such as 干 'dry' getting transmogrified into fuck. At least one of these menus evidently had its text fed directly into Babelfish or a similar automated translator, which is a surefire recipe for disaster.

(In the interest of equal time, I'd also like to recommend a blog previously discussed here by Mark Liberman: Hanzi Smatter, "dedicated to the misuse of Chinese characters in Western culture.")

[More on this here.]

Posted by Benjamin Zimmer at 12:49 AM

March 10, 2006

Best. Snowclone. EVAR.

In response to Ben Zimmer's comment on my 3/9/2006 post Respect, Philip Brooks wrote to observe

Regarding the "Best * Ever" snowclone you mentioned recently on Language Log: I sometimes see or hear "Best * EVAR" (usually in caps like that when written, or yelled if verbalized, with the "ar" sounding the same as in the word "mar"). People also often write periods after each word for emphasis. I think it may have started with the cartoon Invader Zim, or at least that's the first place remember hearing it.

I don't know one way or the other about the Invader Zim attribution -- that's a piece of pop culture I managed somehow to miss, and the current Invader Zim wikipedia entry doesn't include the string "evar". However, {"best * EVAR"} gets 343,000 Google hits, so this should be a fertile field for future scholarship. Some values from the first couple of pages: game, joke, YTMND, emksaplation, webcomic, controllar, phonescams, prezzunt, birfday, emoticon, computar, ...

I'm not sure whether the "-ar" orthography started as an in-group dialect reference, or whether people have been adjusting their pronunciation under the influence of this spelling, but for several years I've been hearing occasional youth-culture emphatic forms of words like "ever", "never", "over", etc., in which the vowel of the last syllable (though still rhotic) is lowered and backed. In terms of my own dialect, that's away from the vowel of stir and towards the vowel of star.

Note by the way that the Wikipedia entry on Typographical error gives

teh best thign evar!!1!one!1!!

as a characteristic example of an error caused by careless typing, or more precisely an imitation of such an error, used to ridicule hackers with poor keyboard control (where the digit 1's are caused by letting the shift key up too early while repeating exclamation points, and the "one" is introduced by a critic in order to draw attention to this mistake).

[Ben Zimmer points out that in Clueless ("Jane Austen's Emma meets Beverley Hills 90210"), the character Amber says "whatevar" accompanied by the "W" hand sign.]

[Arthaey Angosii writes that he doesn't know anything more about the origins of -ar-ism, but can contribute some information about its radiation into another culture:

I often see it written in l337: "ev4r" returns 9,420 ghits. This is still half as many as "ev3r" with 20,800 ghits. But my l337-senses ;) tell me that the 4 looks more l337 than the 3.


Posted by Mark Liberman at 01:28 PM

March 09, 2006

More brokeback generalizations

Mark Liberman has been on the alert for new uses of the adjective brokeback that are derived from the movie title Brokeback Mountain. The range of meanings turns out to be very broad, involving rapid expansion along familiar paths of semantic extension, almost right to the end of the road: words that start out attributing specific properties that are negatively evaluated in the culture often end up being usable as generic disparagements or insults, unmoored in the minds of many speakers from those specific properties. Gay is a familiar example; although (the relevant part of) the path starts with an attribution of homosexuality, a fair number of people are now also using it as an adjective merely conveying negative opinion.

Mark reports a suggestion that Brokeback X is some kind of snowclone.  Well, it's certainly a generalization and it alludes to the movie, but I'm reluctant to call it a snowclone.  It's just the use of the (piece of a) proper name brokeback as an adjective with a meaning that is in some way related to the content of the movie. 

Brokeback in the fictional geographical name is presumably just a variant of brokenback -- there are several Brokenback Mountains in the U.S., and a Brokenback Mountain Range in Australia -- which in turn is a variant of brokenbacked or broken-backed, an adjective that, according to the OED, is attested since ca. 1400, originally just meaning 'having a  broken back', but eventually taking on various transferred and figurative meanings.  On 1/24/06 Carol Crompton reported on ADS-L that the version of the folksong "Liza Jane" that she learned as a child included the line "Brokeback mule, I'm bound to ride" -- which would appear to have brokeback meaning 'swaybacked', or possibly just 'worthless'.  In the movie the mountain in question has double peaks.

It may be that the movie has encouraged the return of 'broken, worthless' as a meaning for brokeback.  That's one way of seeing the Danny Schechter headline "Brokeback Media" that Mark noted -- as just conveying that the media are broken.

One set of semantic extensions of brokeback turning on the content of the movie really uses a lot of that content, in particular men with secret gay lives. There's the use that Mark reported from a recent New York Times piece: brokeback marriage referring to a marriage involving such a man. And there's a use reported on soc.motss recently by Jed Davis: brokeback Mormon to refer to a Mormon who is such a man -- this with reference to a stage performance "Confessions of a Mormon Boy", about the life of a married gay Mormon man (who leaves his marriage).

Another (putative) use turns on the fact that the two men in the movie are, in the eyes of the rest of the world, heterosexual and are also, apparently, fishing buddies (though, in fact, no fish get caught on those fishing trips).  Out of this we get a use reported in the "Slang" feature on the "Know + Tell" page (p. 51: "Numbers, nomenclature, and news for the conspicuously clued-in") in the January/February 2006 issue of Details magazine  (GQ for young, hip, and fashion-conscious metrosexuals and gay men):

adj.  Descriptor for any activity performed together by two heterosexual men (e.g. brokeback brunching, brokeback shopping, etc.).  PROVENANCE: Suburban cineplexes.  USAGE: "Where's Bob?"  "Oh, he's out brokeback bowling with Dale."

Brokeback Bowling (Alley) -- "Love is a strike of nature" -- is one of the many lampoons of Brokeback Mountain (whose advertising proclaims "Love is a force of nature!"), which clutters up a search on "brokeback bowling", but as far as I can tell, the rest of the hits trace back to the Details piece, so I'm somewhat suspicious of this meaning. It also doesn't seem like a particularly useful extension.  In the usage example, "brokeback" contributes very little, since "with Dale" specifies that Bob is bowling with another man (presumably, the sexual orientations of the two men are already known to the speakers, and in any case are irrelevant to their participation in bowling); "He's out bowling with Dale" would do just as well, and is shorter.  In "He's out brokeback bowling", "brokeback" contributes something, but the sentence is less informative than "He's out bowling with Dale".

In any case, brokeback activity of this sort is not quite the same thing as the "man date" described in the NYT last spring, since "man date" specifically excluded standard guy activities like going to sports events, having a drink together at the neighborhood bar, jogging together, etc.

I suppose brokeback might also be used as a modifier of activities if those involved two GAY men -- an out couple, a closeted couple, or just two friends who happened both to be gay.  But I have no cites.

We are now moving into the arena of sexuality.  Some of the reported uses seem to cover "adopting stereotypical macho behavior to cover up being gay", as Alice Faber put it on ADS-L, 3/1/06.  Or covering up mere effeteness.   Consider this exchange reported by Jesse Sheidlower on ADS-L, 2/13/06:

I was having an (online) conversation with an English friend, who teased me about the supposed Anglophilia I manifest in my dress, so I said, "Well, I'm getting a motorcycle to counter my image as an effete fop," and he replied, "A motorcycle? How brokeback!"

These uses are no longer so closely connected to the movie, since the guys in the movie ARE in fact highly masculine, in most ways (except for that same-sex desire thing); they aren't putting it on.  But you can see how you might get to such meanings.

Of course, we can get the combination of hypermasculinity plus closeted homosexuality without any cover-up intended (as in the movie).  This was the interpretation Indigo Som (ADS-L, 2/6/06) put on a reference to Justin Timberlake's tough-guy character in the movie Alpha Dog as possibly a "brokeback alpha dog".

Jesse Sheidlower had earlier (1/31/06) posted about a related extension.  This time we have the testimony of the original speaker as to what he meant:

The relevant sentence was along the lines of "He got a Hummer? That's so brokeback!". On further questioning, the speaker said that it was used in reference to things that are so exaggeratedly masculine as to call into question the sexuality of the man involved. Thus a man driving a minivan wouldn't be brokeback, but a man driving a Hummer would be. Speaker was a New York-raised late-30s heterosexual man, who hadn't seen the film.

We are now led to a cluster of meanings for brokeback that cover 'unmasculine, unmanly', 'faggy, effeminate', and 'gay, homosexual',  three meaning domains that are tightly connected in the folk mind.  (Words that start out meaning one of these things tend to take on one or both of the others.)  Plus the related meaning domains 'girlie, feminine', 'flamboyant', and 'homoerotic'.  As we've seen already, these six domains are so closely connected to one another in such complex ways that it's often hard to be sure which meaning(s) someone intended by using brokeback, even if you have the context.

Mark reported on several cites that seemed to him to just be conveying 'gay': "Brokeback Gaujiro", "Geek Fu Brokeback Edition", and "Brokeback Bomber"; and from Matthew Hutson, "Brokeback Mohamed", "Brokeback Steelers", and "Brokeback Krypton".  Most or all of these are, in Mark's words, "apparently malicious if not positively defamatory" -- defamatory BECAUSE they attribute homosexuality.

Others are harder to work out.  Here's Geoff Nathan on ADS-L, 1/26/06, with news from the locker room:

I can report that on Wednesday morning I actually heard the use of 'brokeback' in the wild.  While in the men's locker room (really) of our local fitness center another denizen recommended to a third the use of a shaving cream he had learned about from his wife.  But, he assured the guy he was talking to, it wasn't a 'brokeback thing'--it was a men's shaving cream made by a women's face care company.

The ADS-L folks then went into a discussion (still not fully resolved) that I would now describe as being about whether 'gay', 'faggy', 'girlie', or 'unmanly' (or perhaps some combination of these) might have been intended.

More recently (ADS-L, 2/28/06), Ben Zimmer passed on a snarky Defamer posting:

A mysterious organization known only as the Global Language Monitor has released its annual list of the year's most influential "Hollywood words and phrases." Using advanced and sophisticated tracking techniques available to anyone with access to Google, the group has decreed "Brokeback" — that highly evocative cluster of geographical peaks and valleys on the map of the human heart that has quickly turned into yet another synonym for "faggy" — as Hollywood's word of the year.

This one specifically picks out 'faggy' as the meaning.  But the site's url includes the substring


so we're into some mixture of 'faggy' (flamboyantly effeminate presentation of self) and simple 'gay' (having sexual desire for other men).  Oi.

Since we're into flamboyance, Mark cited "Brokeback Baptists" used with reference to "the appearance of a 'flamboyantly heterosexual Baptist theologian'".  We're way far afield now.

Back to one more combination of semantic domains, this time from the world of sports, where insult is cultivated as an art form.  A report from Ben Zimmer (ADS-L, 2/16/06):

At a basketball game between Gonzaga University and St. Mary's College, a Gonzaga booster group chanted "Brokeback! Mountain!" to taunt a St. Mary's player (a photo had circulated online purporting to show the player kissing another man).

We start with an attribution of homosexuality, which is routinely used in a sports context to convey at least unmanliness, usually effeminacy (gay men are sissies) if not actual femininity (gay men are symbolically women).  So "brokeback" conveys general contempt -- badness, worthlessness.  Now we're inches away from uses of brokeback that are as bleached of sexual reference as some uses of gay.  Maybe that's (part of) what's going on in the Schechter headline Mark cited.  Unless the adjective brokeback goes out of fashion real soon, I expect to find some examples that are clearly fully bleached.

Finally, there are attested derived adverbs brokebackly and brokebackingly (Ben Zimmer,  2/15/06) and suffixed adjectives brokebackish and brokebackesque (Mark Peters, 2/15/06).  One of the brokebackish cites comes close to 'homoerotic', although it could be understood as merely 'like Brokeback Mountain': "A friend sent me the 'Brokeback Top Gun' video --- actually, it's clips from the film arranged in a way that makes it look rather Brokebackish."  Almost all of them present the difficulties of interpretation seen in some of the plain brokeback examples above.   Language  change on the hoof!

The December 1 DWIM effect

The damage done by well-intentioned (mis)features of MS Office is not limited to occasional dadafication of EU bureaucratese. According to Barry R Zeeberg, Joseph Riss, David W Kane, Kimberly J Bussey, Edward Uchio, W Marston Linehan, J Carl Barrett and John N Weinstein, "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics", BMC Bioinformatics 2004, 5:80:

When we were beta-testing [two new bioinformatics programs] on microarray data, a frustrating problem occurred repeatedly: Some gene names kept bouncing back as "unknown." A little detective work revealed the reason: ... A default date conversion feature in Excel ... was altering gene names that it considered to look like dates. For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] was being converted to '1-DEC.' Figure 1 lists 30 gene names that suffer an analogous fate.

A worse problem apparently afflicts information from microarray experiments:

There is another default conversion problem for RIKEN clone identifiers of the form nnnnnnnEnn, where n denotes a digit. These identifiers are comprised of the serial number of the plate that contains the library, information on plate status, and the address of the clone. A search ... identified more than 2,000 such identifiers out of a total set of 60,770. For example, the RIKEN identifier "2310009E13" was converted irreversibly to the floating-point number "2.31E+13." A non-expert user might well fail to notice that approximately 3% of the identifiers on a microarray with tens of thousands of genes had been converted to an incorrect form, yet the potential for 2,000 identifiers to be transmogrified without notice is a considerable concern. Most important, these conversions to an internal date representation or floating-point number format are irreversible; the original gene name cannot be recovered.

RIKEN microarrays are systematically affected, but other microarray results are apparently often garbled as well:

The floating-point conversion is not restricted to RIKEN clone identifiers but will affect any clone designation derived from plate coordinates. ... [If plate library references are omitted or numerical], all clones from row E of any plate are converted to floating point numbers by Excel. ... Since 96-well plates contain 8 rows and 12 columns, row E represents 12/96 or 12.5% of the clones on the plate; similarly, 6.25% of clones from 384-well plates would be affected. Most libraries contain hundreds of plates, each of which would be subject to this problem.
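The kind of check that would catch these identifiers before they reach a spreadsheet can be sketched in a few lines of Python. This is a rough heuristic under stated assumptions, not a reproduction of Excel's actual parsing rules: the two regular expressions, the function names, and the control identifier "TP53" are all illustrative choices that do not come from the paper. The last function simply redoes the row-E arithmetic quoted above.

```python
import re

# Assumption: crude patterns for strings a spreadsheet might reinterpret.
# They do not reproduce Excel's real conversion logic.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)
MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC"
DATE_LIKE = re.compile(rf"^({MONTHS})-?\d{{1,2}}$", re.IGNORECASE)


def at_risk(identifier):
    """Return 'float', 'date', or None for a gene or clone identifier."""
    if FLOAT_LIKE.match(identifier):
        return "float"
    if DATE_LIKE.match(identifier):
        return "date"
    return None


def row_share(rows, columns):
    """Fraction of wells on a plate that sit in any single row (e.g. row E)."""
    return columns / (rows * columns)


for name in ["DEC1", "2310009E13", "TP53"]:
    print(name, "->", at_risk(name))

print(row_share(8, 12))    # 96-well plate: 8 rows x 12 columns
print(row_share(16, 24))   # 384-well plate: 16 rows x 24 columns
```

A pre-flight scan like this cannot undo a conversion that has already happened, but it can flag which identifiers should be imported as explicit text.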

If some computer virus or trojan did this sort of damage to the results of thousands of high-cost biomedical experiments, I imagine that we'd see a serious effort to put some people in jail. I'm not suggesting that any similar sort of retribution is appropriate here, but perhaps some rehabilitation would be in order, along the lines suggested below.

There's an acronym from the old days of classic AI, DWIM, standing for "Do What I Mean". The Jargon File explains:

Warren Teitelman originally wrote DWIM to fix his typos and spelling errors, so it was somewhat idiosyncratic to his style, and would often make hash of anyone else's typos if they were stylistically different. Some victims of DWIM thus claimed that the acronym stood for ‘Damn Warren’s Infernal Machine!’.

In one notorious incident, Warren added a DWIM feature to the command interpreter used at Xerox PARC. One day another hacker there typed delete *$ to free up some disk space. (The editor there named backup files by appending $ to the original file name, so he was trying to delete any backup files left over from old editing sessions.) It happened that there weren't any editor backup files, so DWIM helpfully reported *$ not found, assuming you meant 'delete *'. It then started to delete all the files on the disk! The hacker managed to stop it with a Vulcan nerve pinch after only a half dozen or so files were lost.

The disgruntled victim later said he had been sorely tempted to go to Warren's office, tie Warren down in his chair in front of his workstation, and then type delete *$ twice.

DWIM is often suggested in jest as a desired feature for a complex program; it is also occasionally described as the single instruction the ideal computer would have. Back when proofs of program correctness were in vogue, there were also jokes about DWIMC (Do What I Mean, Correctly).

It seems to me that all interactive programs should have a prominently-displayed switch labelled something like DEWITYD, "Do Exactly What I Tell You, Damnit!" (pronounced as "de-witted"). No doubt the results will be wrong (or even disastrous) at least as often as the results of DWIM will be; but at least you'll know exactly who to blame.

[Update: Joshua Fruhlinger, the Comics Curmudgeon, writes:

In my non-comics-mocking life, I'm an editor, and I can tell you that the first thing most editors do with a new install of Office -- especially those of us who work in jargon-heavy fields -- is turn off the auto-correction features. These options are usually buried under several levels of preference menus, but for people who are in the business of writing precisely what they want to write, they must be turned off if life is to be worth living. I am always horrified that they are left on out of the box by default.


Posted by Mark Liberman at 05:51 PM

The Cupertino effect

It turns out that the modern affliction of spellcheckers wreaking havoc on unsuspecting documents has been given a name. Following a tip from commenter Qaminante on Languagehat, I discovered that runaway spellchecking has been dubbed "the Cupertino effect," at least among writers and translators for the European Union. As Qaminante explains, the common misspelling of cooperation as cooperatino leads some spellcheckers to suggest a change to Cupertino. (One EU writer claims that the Cupertino change can even happen to the word cooperation if the word processor's custom dictionary only has the hyphenated form co-operation. However, I find it difficult to believe that many custom dictionaries out there include Cupertino but not unhyphenated cooperation.)

This isn't a concern for users of Microsoft Word in its recent versions, since cooperation is the first suggestion given for cooperatino, not to mention coperatino, coperation, cuperation, coopertion, and various other typos. (You have to go all the way to cuperatino or cupertion before Word will suggest Cupertino.) In fact, if you're using the default Autocorrect settings for Word, it will automatically change cooperatino (and most of the other misspellings) to cooperation before you've even noticed the mistake. Nonetheless, the Cupertino carnage has been substantial, particularly for documents produced by the EU and other international organizations.

Here's a brief sampling of the hundreds of Cupertinos one can find on the ".int" domain used by international groups like the UN, the EU and NATO:

Within the GEIT BG the Cupertino with our Italian comrades proved to be very fruitful. (NATO Stabilisation Force, "Atlas raises the world," 14 May 2003)

The fact that Secretary General Robertson is going to join this session this afternoon in the European Union headquarters gives you already an idea of how close and co-ordinated this Cupertino is and this action will be. (NATO Press Point, 19 Mar. 2001)

Safe blood transfusion services are being addressed in Freetown and Lungi, using WHO RB funds in Cupertino with the Red Cross Society of Sierra Leone and in Bo by MSF/Belgium. (WHO/EHA report on Sierra Leone, 1 May 2000)

Could you tell us how far such policy can go under the euro zone, and specifically where the limits of this Cupertino would be? (European Central Bank press conference, 3 Nov. 1998)

Co-ordination with the World Bank Transport and Trade Facilitation Programme for South East Europe will be particularly important in the area of trade facilitation and shall be conducted through regular review mechanisms and direct Cupertino. (European Agency for Reconstruction, "Focal area: Justice and home affairs")

A consistent and efficient tax reform approach also will facilitate the shoring up broader EU and G-7 support for similar reform strategies -- this in turn would make international Cupertino easier. (European Parliament, "Towards a Re-Orientation of National Energy Policies in the EU? - Germany as a Case Study")

Conservation of the ecological system can be achieved by agreements, Cupertino, compensation (incentives), etc. with landowners or users of an area. ... . It was an interesting feature during this phases that both at the national and local levels, permanent and constructive means of Cupertino were built between the directorates and the civil society. ... Special voluntary management contracts or Cupertino with land users are also an excellent ways to conserve these areas. (Council of Europe, "Report on the establishment of the National Ecological Network and the status of its national programme in Hungary")

And so on and so forth. An expert in the history of word processing could probably trace the origin and spread of this spellchecker scourge. What is Cupertino doing in so many custom dictionaries around the world anyway? Could it have anything to do with the fact that the northern California city of Cupertino is home to the worldwide headquarters of Apple as well as various other Silicon Valley companies? Neither Microsoft (makers of Word) nor the various owners of WordPerfect (Borland, Novell, Corel) have a Cupertino connection, however.

One possible clue comes from "The Macintosh Secret Tricks List" (a list of so-called "easter eggs") circulated among Mac users in January 1993. Someone noticed that if supression (sic) was typed in Microsoft Word 4.0 for the Mac, then the spellchecker would include Cupertino among its suggestions. The contributor speculated that this might be "secret Apple-bashing." I highly doubt that this represents some sort of anti-Apple easter egg, since even current versions of Microsoft Word for Windows will include Cupertino as an alternative to such misspellings as supretion (though no longer supression). The spellchecking algorithm seems to be making suggestions based on the possibility that initial s could actually represent initial c, and that the bigrams re and on could have been accidentally flipped from er and no respectively. But this does establish that Cupertino has been lurking in Microsoft's custom dictionaries since at least 1989 (when Word 4 for Mac was released).
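The flipped-bigram guess can be made concrete with a small sketch. This is not Microsoft's algorithm, which remains unknown here. It is only a toy ranking by "optimal string alignment" distance, a standard edit distance in which swapping two adjacent letters counts as a single edit. The function names and the tiny word lists are assumptions made up for illustration, and the sketch does not model any special treatment of s versus c.

```python
def osa_distance(a, b):
    """Edit distance counting insert, delete, substitute, and adjacent swap as 1 each."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every letter of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every letter of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "no" typed for "on".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word, dictionary, limit=3):
    """Return up to `limit` dictionary words, nearest first (ties broken alphabetically)."""
    ranked = sorted(dictionary, key=lambda entry: (osa_distance(word, entry), entry))
    return ranked[:limit]
```

Under these assumptions supretion sits three edits from cupertino (one substitution and two swaps), while cooperatino is a single swap from cooperation. A word list that lacks unhyphenated cooperation would leave cupertino as the nearest candidate on offer.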

[Update #1: On Languagehat, Qaminante also wrote:

[T]he best [spellcheck error] I ever came across was a report stating that "Albania was very interested in concluding a customs copulation agreement"! The author, who is a native Spanish-speaker, claims she typed "coperation" and got this from spellcheck.

At first I found this a little hard to believe, but searching on the ".int" domain finds at least two instances of improper "copulation":

The Heads of State and Government congratulated SATCC for the crucial role it plays in strengthening copulation and accelerating the implementation of regional programmes in this strategic sector. (Southern African Development Community, Communique from the 1982 SADC Summit)

The Western Balkan countries confirmed their intention to further liberalise trade amongst each other. They requested that they be included in the pan-european system of diagonal copulation, which would benefit trade and economic development. (International Organization for Migration, Foreign Ministers Meeting, 22 Nov. 2004)

Now that's embarrassing.]

[Update #2: Qaminante emails to explain that what was meant in the IOM passage was actually "the pan-European system of diagonal cumulation." So my guess is that copulation in this case emerged as the spellcheckified version of the misspelling comulation.]

[Late update, 10/02/06: For proof that Cupertino could actually be generated by a spellchecker unable to recognize unhyphenated cooperation, see here.]

Posted by Benjamin Zimmer at 04:21 PM


Callimachus at Done with Mirrors represents his fellow Fluffians right back at Mahmoud Ahmadinejad: "Yo, Iran!":

We took it from W.C. Fields, because he was one of us. We're not going to take this from you, Iran.

What'd they do? They left Philadelphia off their nuclear hit list. The bums.

Specifically, C cites

an "Intelligence Summit in Washington, D.C.," at which a knowledgeable military man listed American "cities targeted by Iran, Al Qaida et al for simultaneous nuclear detonation." The reporter caught New York, D.C., L.A., Chicago, Houston "and one I didn't hear," but the speaker noted particularly it was "not Philadelphia." "It was not clear how he knew this or whether he was simply engaging in educated speculation."

Speculation, be damned. My hometown Does. Not. Need. This. Crap. I'll come to the point, President Ahmadinejad: I want us on that list. Yesterday.

Note this instance of an increasingly common technique: using periods and capital letters to encode prosodic emphasis. Aside from this and a few uses of in', C's mock rant is an excellent example of how to give the impression of a local variant of English without a lot of "eye dialect".

[A comment by email from Ben Zimmer:

When did that "periods and capital letters" thing start anyway? I associate it with TV fan forums, most famously in the expression "Best. Episode. Ever." Or "Worst. Episode. Ever." (That expression, credited to Comic Book Guy on The Simpsons, has been snowcloned into "best/worst X ever," as in VH1's show "Best Week Ever.") Prosodically it seems to imply that each word should be treated as an intonation unit, with appropriate pitch and stress at each onset. On ESPN one hears that sort of intonation from Chris Berman in his annoying NFL recaps: "He! Could! Go! All! The! Way!"

Well, I was going to blog about this, but I'll tip my hand briefly here. Since resetting of intonational downdrift generally applies at phrase boundaries, and since ends of phrases are generally lengthened, my guess is that this typographical usage probably started as a way to evoke the kind of emphatic speech in which words are produced slowly and with noticeable vocal effort in a high pitch range. Imagine raising your eyebrows and poking your finger at someone while yelling "my hometown does not need this crap!" However, as with "tsk" and "ugh" and so on, such typographical approximations of non-lexical aspects of speech often give rise to spelling pronunciations, with the evocative allusion given a literal interpretation that then becomes a semi-arbitrary symbol for the (psychological state behind the) nonverbal behavior originally depicted. Or something like that.]

[Update: Callimachus himself wrote in:

I'm the blogger whose post you kindly noted here.

When you write, "Imagine raising your eyebrows and poking your finger at someone while yelling 'my hometown does not need this crap!' " that's pretty close to what I visualize when I see it. That actually was the first time I ever wrote it.

The exact visual image it brings to my mind is a passage from William Wharton's "Birdy," which I read years ago, about the narrator's father, who used to chew you out and punctuate his words with finger jabs to the chest each of which packed the power of a punch.

Punctuation, punch, punctus, p.p. of pungere.]


Posted by Mark Liberman at 02:19 PM

March 08, 2006

Not me

A long speech rendered in translation by one brief sentence may be humorous, as Ben Zimmer discusses in his post about Mel Gibson's antics at the Oscars, but there is a serious side to this. People who speak two languages but have no training as interpreters often produce a greatly reduced summary rather than a translation of what the speaker said. When this happens in court, the consequences may be very serious. I have a friend who even now, after three years in prison, speaks only the most rudimentary English. He was convicted of two counts of murder in a trial in which the interpreter repeatedly produced brief summaries of his testimony. (The appellate court overturned the conviction on other grounds and so never addressed the issue of inadequate interpretation.)

Unfortunately, it happens all too often in court. Here in Canada there are no standards for interpreters for languages other than the official languages, English and French. In the United States, there are standards for interpreters in Federal court, but not in State courts, where the majority of criminal cases are heard. Interpreters may be highly qualified, or they may be the bailiff's sister who took a little Spanish in high school. No one really knows how often this leads to miscarriages of justice, in part because appeals on these grounds are very difficult: appellate courts normally consider only the written record of the trial, and the written record contains only the English translation of the testimony, not what was actually said.

[Addendum: a reader informs me that in Vancouver the courts require a certificate from Vancouver Community College or the equivalent. This is not the case in the north.]

Posted by Bill Poser at 07:59 PM

No Axe Attack

Here in Canada an item that has been much in the news is an ambush Saturday in Afghanistan in which a Canadian soldier, Lieutenant Trevor Greene, was badly wounded by a blow to the head. Every news account that I have found, including this one from the Vancouver Sun, this one from CTV news, and this one from the Toronto Globe and Mail, describes the incident as an "axe attack". Some articles, such as this one in the Toronto Star, are accompanied by this photograph of the weapon:

Do you see the problem? The tool shown is not an axe. It seems clear to me that the head is at right angles to the shaft, not in line with it. That makes it an adze. The contrast is evident below. The one on the left is an adze, the one on the right is an axe.


Either my eyes deceive me or not a single one of the many journalists and editors who have worked on this story knows the difference between an axe and an adze. Personally, I find this much more worrisome than the alleged, or even real, grammatical errors the language pundits worry about. Whether or not you split your infinitives makes no difference, but if people don't know an axe from an adze, civilization really is coming to an end.

[Addendum: A reader writes to say that she agrees that the implement is not an axe but thinks that it is a mattock, not an adze. Below is a nice photo that I (well, actually, Google) found of a man wielding a pick-mattock. The pick is to the rear - the mattock blade is to the fore. I considered whether it might be a mattock, but rejected that hypothesis on the grounds that it looks too short. In my experience, mattocks are longer, like pickaxes. But maybe it's a short mattock. Or maybe it is longer than it looks. The photo doesn't show the whole thing so it's hard to tell. I don't know how Afghans use their tools. Anyhow, we're agreed that it isn't an axe.]

a man swinging a pick-mattock

Posted by Bill Poser at 07:27 PM

A Hearty Second for Richard Grant White

Over the years I've collected a number of antique grammars of English. I say this guardedly, because all grammars of English appear to be antique these days, with the notable exception, of course, of Huddleston and Pullum's Cambridge Grammar of the English Language (note: we newly hired should always flatter the corporate high mucky-mucks). These ancient books reflect the thoughts and biases of self-proclaimed experts, and they make fascinating reading, as Arnold Zwicky reminds us in his recent Language Log post. (See here.)

Apparently he and I share a deep appreciation for Richard Grant White's 1870 book, Words and Their Uses. In his post, Arnold referred us to White's chapter on the present progressive, which is only one of the delightfully modern rants in the book. What's really cool, though, is to read how White rails against the same sort of language advice that we see in the current popular press. To whet your appetite to read White, if you haven't already, here are a few samples:

  • From the Preface:
    "It is from the man who knows just enough to be anxious to square his sentences by the line and plummet of grammar and dictionary that his mother tongue suffers most grievous injury."
  • From Chapter IX, Grammar English and Latin, page 258:
    "When, at last, it dawned on the pedagogues that English was a language ... and they set themselves to giving rules for the art of writing and speaking correctly, they attempted to form these rules upon the models furnished by the Latin language. And what wonder? — for those were the only rules they knew. From this heterogeneous union sprang that hybrid monster known as English grammar, before whose fruitless loins we have sacrificed, for nearly three hundred years, our children and the strangers within our gates."
  • From Chapter X, The Grammarless Tongue, page 274:
    "But the truth of this matter is, that in the rules given in the books called English Grammars, some are absurd, and the most are superfluous."
  • (Commenting on the Greek language),  p.277:
    "Its complication, so far from being an element of its power, is a sign of rudeness, and a remnant of barbarism; that the Greek and Latin authors were great, not by reason of the verbal forms and the grammatical structure of their languages, but in spite of them. Our mother tongue, in freeing itself from these, has only cast aside the trammels of strength and disguises of beauty."

As Arnold says, the book is a delicious read and it does indeed suggest that a new book should be written about his work. As an antique myself, I second this idea.

Posted by Roger Shuy at 10:51 AM

Another Trent Reznor Award nominee

Another nomination for the Trent Reznor awards has come in over the transom at Language Log Plaza, home of the Academy of Linguistic Arts and Sciences. The author is Rosie DiManno, a columnist for the Toronto Star:

Earlier, already changed into his suit, Lindros had stepped right into the showers, there to have a private word with Tie Domi, who on this evening had been feted for a thousand games in the NHL, his mother, so disapproving of how he played the sport, endlessly worried about her son, finally lured into attending a game, Domi offering his own tender tribute to a father long deceased.
(Headline "Long faces on Leafs tell story of season gone horribly wrong; Rosie DiManno says team, after losses to Buffalo and Sens, and loss of Lindros, have look of a beaten crew"; Toronto Star. Toronto, Ont.: Mar 5, 2006. pg. B.03; archive link; maybe also here.)

The first two times this prestigious award was suggested, we convened a panel of judges and made the decision on the spot. However, the nominations are starting to come in thick and fast, so in order to maintain the high standards that our readers expect, we'll hold a vote and an awards ceremony for the 2006 Reznors at some point early in 2007.

[Virtual nomination from a post in Deadspin, forwarded by Ben Zimmer.]

Posted by Mark Liberman at 10:26 AM

March 07, 2006

Collocation provocation

When Crash upset Brokeback Mountain at the Academy Awards, the media and entertainment blog Gawker added fuel to the anti-Crash fire by claiming, "Google Can't Hide Its Oscar Disappointment." They point out that searching with Google for "i'm really glad crash won" prompts the question "Did you mean: 'i'm really glad trash won'?" At first this might seem like a case of Googlebombing, where a group of people (say, in this case, bitter Brokeback fans) try to skew Google's search returns by linking to pages with particular keywords. But Googlebombing only affects the ranking of pages, not the search engine's "Did you mean..." suggestions. (One enterprising Googlebomber did manage to get the top result for "french military victories" to link to a spoof page asking if you meant "french military defeats," but that suggestion never actually showed up on Google's result page.)

So why does Google ask if you're "really glad trash won"? As noted here before, Helpful Google sometimes moves in mysterious ways. But this case isn't so mysterious, since it looks like Helpful Google's algorithm is simply relying on collocational frequencies. It notices that "glad crash" isn't a common search return, so it looks for near misses on the assumption that the original query is a misspelling. And changing just one letter yields "glad trash," a much more common collocation thanks to GLAD trash bags. The same change from "crash" to "trash" occurs with various other search strings like "glad crash is" and "glad crash has," though Helpful Google demurs with plain old "glad crash."
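A minimal sketch of that kind of collocation-driven correction follows. It assumes the suggestion comes from comparing how often adjacent word pairs occur and trying one-letter changes. Nothing here is Google's actual method. The counts are invented purely for illustration, and names such as `BIGRAM_COUNTS`, `did_you_mean`, and the `ratio` threshold are hypothetical.

```python
import string

# Hypothetical pair frequencies, made up for this example only.
BIGRAM_COUNTS = {
    ("glad", "trash"): 5000,
    ("glad", "crash"): 3,
    ("car", "crash"): 4000,
    ("car", "trash"): 10,
}


def one_letter_variants(word):
    """Yield every string that differs from `word` in exactly one letter."""
    for i, original in enumerate(word):
        for letter in string.ascii_lowercase:
            if letter != original:
                yield word[:i] + letter + word[i + 1:]


def did_you_mean(prev_word, word, counts, ratio=100):
    """Suggest a one-letter variant of `word` that pairs far more often with `prev_word`."""
    base = counts.get((prev_word, word), 0)
    best, best_count = None, base
    for candidate in one_letter_variants(word):
        seen = counts.get((prev_word, candidate), 0)
        if seen > best_count:
            best, best_count = candidate, seen
    # Only speak up when the alternative is overwhelmingly more common.
    if best is not None and best_count >= ratio * max(base, 1):
        return best
    return None
```

With these made-up counts, "glad crash" draws the suggestion "trash" while "car crash" draws none. The sketch does not reproduce the detail that Google stays quiet for the bare two-word query.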

Google's algorithms have been mistakenly ascribed intentionality in another recent case. The Times of London reports:

Google has been asked to explain why the name of the Premiership footballer Ashley Cole has been linked to the word "gay" in internet search results.
Lawyers acting for the Arsenal and England defender want the internet company to disclose why typing his name into the search engine generates "See results for: ashley cole gay". ...
Graham Shear, solicitor for Mr Cole, said that he is interested in the origin of Google's decision to display the "gay" results alongside general searches for his client.
He said: "I am keen to find out whether the decision to automatically include the term 'gay' to the keyword 'Ashley Cole' was an editorial decision or one made by a computer based on the volume of searches for 'Ashley Cole' linked to the word 'gay'.

The dispute hinges on a new feature from Helpful Google — now Extra Helpful Google! — as Search Engine Watch explains (with an illustrative screenshot):

This is an example of the middle-of-the-page query refinement that Google's been testing over the past several months, as we wrote about back in August.
In particular, what seems to be happening is that Google is performing "clustering," a long-standing technique of grouping pages on a similar topic together. In other words, its sees there are lots of pages about "ashley cole" along with a subgroup of those on the topic of "ashley cole gay."
That there might be a subgroup like this isn't surprising. Cole is currently suing newspapers The Sun and The News Of The World over allegations they printed that he is gay. Those allegations have fueled discussion on the web, leading to a subgroup of pages on this topic.

And it's precisely the tabloid allegations of Cole's gay relations that prompted his solicitor to go after Google. As Mr. Shear told the Times, "I would be interested in when and what prompted this and whether the process started since we launched the cases against the News of the World and The Sun or before." He implies, bizarrely, that Google might somehow be colluding with the tabloids to tarnish his client as gay (apparently a fate worse than death for British footballers).

Leaving aside the not-too-subtle homophobia underlying the solicitor's request, it is indeed curious why <ashley + cole + gay> should be suggested, when it's not even the most common subsearch of three terms where two of them are <ashley> and <cole>. Blogger Chromatius notes that the Google toolbar autofills the most common search keywords, and typing <ashley + cole + g...> doesn't automatically suggest "gay" or rank it as the most common choice of keyword. (Cole and his solicitor should be happy to know that the first g-word suggested is "girlfriend." And the g-word listed by the toolbar with the highest number of results is "gallery.") Once again, Search Engine Watch enlightens the ways of Google. The suggestion <ashley + cole + gay> shows up in the middle-of-the-page refinement not because it's the subsearch with the most results but because it's now the most commonly queried subsearch:

Why bring up this particular topic when something like "ashley cole" cars comes up with more matches (60,100 of them)? That brings me back to search volume. If Google's noticing that there are a lot of queries on a particular subtopic (ashley cole gay) related to the main topic (ashley cole) plus a significant number of pages on that topic, that might cause this refinement to kick in.

So apparently it's the controversy itself that is to blame. Lots of people have heard the allegations and are entering <ashley + cole + gay> into Google, which in turn is triggering the search engine to suggest that particular combination of keywords as a refinement of <ashley + cole>. I have a funny feeling that once the hacker types figure out the algorithm for the new feature, they'll use it for a new round of Googlebombing. All they would need to do is set up a "bot" that enters a particular query — let's say <george + bush + nincompoop> — over and over again, and eventually that will be the suggested refinement for <george + bush>. It might take people doing it from multiple IP addresses for this to work, but given the great success Googlebombers have already had, I'm sure it's just a matter of time before Extra Helpful Google is brought low.
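To see why multiple IP addresses might matter, here is a rough sketch of a volume-based refinement picker. It assumes the feature favors whichever longer query is entered by the most distinct sources, which is only a guess at how such a system could discount a single repetitive bot. The function name `suggest_refinement`, the `min_sources` parameter, and the (source, query) log format are all hypothetical.

```python
def suggest_refinement(base, query_log, min_sources=1):
    """Pick the extension of `base` that was queried from the most distinct sources.

    `query_log` is an iterable of (source, query) pairs, where `source`
    stands in for something like an IP address.
    """
    base_terms = base.lower().split()
    sources_by_query = {}
    for source, query in query_log:
        terms = query.lower().split()
        # Keep only queries that start with the base terms and add more.
        is_extension = (
            len(terms) > len(base_terms)
            and terms[: len(base_terms)] == base_terms
        )
        if is_extension:
            sources_by_query.setdefault(" ".join(terms), set()).add(source)
    # Volume here means distinct sources, so one bot counts once.
    volume = {
        query: len(sources)
        for query, sources in sources_by_query.items()
        if len(sources) >= min_sources
    }
    if not volume:
        return None
    # Sorting first makes ties resolve alphabetically.
    return max(sorted(volume), key=volume.get)
```

Under this assumption a lone bot repeating one query a hundred times moves nothing, while a handful of cooperating machines can outvote the genuine searchers.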

Posted by Benjamin Zimmer at 09:26 PM

Brokeback generalizations

You've got to be quick in this business. A couple of days ago, Matthew Hutson wrote suggesting that "I believe we are witnessing the birth of a new snowclone: Brokeback X". I wrote back asking for examples, and Matthew responded yesterday with a few (see below), so I put it on my to-blog list. But today I see that we got scooped by Wonkette -- who else? -- in a post observing that Brokeback Mountain "may have lost at the Oscars, but it’s winning the battle of spawning irritating yet irresistible neologisms", and linking to today's NYT article by Katy Butler, "Many Couples Must Negotiate Terms of 'Brokeback' Marriages".

And there I was feeling so good about catching up on grading homework, writing a talk for next week's workshop in Tokyo, and setting up a bake-off of pitch trackers. That's the trouble with having a day job, in this competitive high-stakes field of lingua-blogging. Well, competitive, at least.

There's another reason that I didn't rush into blogging Matthew's examples. They're quite a bit more, um, offensive than the NYT neologism: "Brokeback Mohamed", "Brokeback Steelers" (from 1/31/2006), and "Brokeback Krypton" (from way back on 12/5/2005). The NYT coinage "Brokeback marriage" does some real linguistic work, since it evokes issues that were central to the movie. In the three examples that Matthew cited, the Brokeback reference basically just indexes "gay". And in all three cases, the motivations are apparently malicious if not positively defamatory.

Some of the many other recent cases where "brokeback" is used to mean "gay" include "Brokeback Guajiro", "Geek Fu Brokeback Edition", "Brokeback Bomber". But sometimes the connection is looser: "Brokeback Baptists" links to a post about the appearance of a "flamboyantly heterosexual Baptist theologian, Dr. R. Albert Mohler, Jr." discussing B.M. on Larry King. And I can't figure out why the heck AlterNet and/or Danny Schechter decided to headline his 1/3/2006 column "Brokeback Media" -- it's got nothing to do with sexual orientation and doesn't mention the B.M. movie -- maybe they mean to imply that the straw of citizen journalism has broken the media camel's back?

Posted by Mark Liberman at 06:13 PM

The agenbite of Onion wit

On 2/27/2006, The Onion mock-quoted Howard Dean:

"Some rising stars with leadership potential like [Sen. Barack] Obama (D-IL) and [New York State Attorney General Eliot] Spitzer have emerged, but don't worry: We've still got some infight left in us," Democratic National Committee Chairman Howard Dean said. "Over the last decade, we've found a reliably losing formula, and we're sticking with it." [emphasis added]

This isn't the first time that infight has been substituted for fight in an idiomatic phrase: The New Yorker ran a 5/3/2004 interview about information-hoarding among U.S. intelligence agencies under the headline "Fighting the Good Infight". However, this technique of subversive substitution is a staple of Onion humor. The same Onion article attributes another example to Ted Kennedy:

"Don't lose faithlessness, Democrats," Kennedy said. "The next election is ours to lose. To those who say we can't, I say: Remember Michael Dukakis. Remember Al Gore. Remember John Kerry." [emphasis added]

And on today's online Onion front page is the headline "Kennedy Center To Dishonor Gilbert Gottfried".

Most Onion humor seems to be based on ironic displacement at one level or another. Another 2/27/2006 story, "Rotation Of Earth Plunges Entire North American Continent Into Darkness", starts with the familiar journalistic genre of disaster stories, and substitutes the concept of sunset and the normal social facts of night for the precipitating event. This is one of several standard Onionish ways to insert a mundane event or situation into the framework of a newspaper story presupposing a certain level of importance. An especially common method is to present quirky but banal interviews with stereotypical ordinary people: "Local Teen 'Definitely' Going To Burning Man Next Year", "Area Woman To Celebrate Quiet Women's History Month At Home This Year"; "Vegetarian Can't Bring Self To Eat IHOP's Funny Face Pancakes". A slightly different type of alliaceous displacement is to swap deprecated individuals or groups into heroic frames from popular culture, as in "Modern-Day John Henry Dies Trying To Out-Spreadsheet Excel 11.0" or "Bob Marley Rises From Grave To Free Frat Boys From Bonds Of Oppression". Among the Onion's many other translocative techniques are the substitution of political news into the genre of music reviews, as in "Latest Bin Laden Tape For Completists Only", or into a family ("Bush Hides U.S. Report Card In Sock Drawer") or personal context ("NASA Completely Forgot Probe Was Returning Today").

In the infight, faithlessness and dishonor examples, the displacement is more concrete and smaller in scale: one of the words in an idiomatic phrase is replaced with a morphologically related alternative. With the key word removed, the phrase may not be salient enough to work as a snowclone. Certainly "Don't lose X" is pretty thin as a generalization of "Don't lose faith". But substituting the related item faithlessness succeeds in evoking -- and subverting -- the original. The resulting semi-snowclone is not funny in itself, but it reinforces the article's overall theme that today's Democrats, against the odds, are likely to blow the opportunity handed to them by the incompetence and venality of their opponents.

Posted by Mark Liberman at 09:21 AM

March 06, 2006

Mel's Mayan mischief

On last night's Oscar broadcast, we finally learned what Mel Gibson meant when he said he wanted to make Mayan languages "cool again." That's what was reported when news first got out that his latest movie venture, Apocalypto, would be a historical epic shot on location in Mexico with a cast of locals speaking only Yucatec Maya, the indigenous language of the Yucatán Peninsula. In the silly pre-ceremony video segment last night, Mel contributed some footage showing him and his cast members speaking in Yucatec Maya with humorous (?) English subtitles.

If you didn't get to see the video that kicked off the ceremonies, you can watch it on YouTube (the bit with Gibson starts about a minute and a half into the clip). The announcer introduces a series of would-be hosts, but everyone (including previous hosts Billy Crystal, Chris Rock, Steve Martin, Whoopi Goldberg, and David Letterman) begs off with various excuses. Then we hear, "Ladies and gentlemen, Mr. Mel Gibson!" and we see Mel (no longer sporting his scary beard) at the shooting location for Apocalypto, walking through what appears to be a limestone quarry. Behind him is a line of male cast members wearing loincloths, their skin covered in limestone powder. If you caught the cryptic wordless trailer for the movie, you know the look.

So the joke, such as it is, has Gibson speaking at length in Yucatec Maya, but the subtitles simply say "Not... me." (The long-speech-with-short-subtitles gag was already getting tired when Mike Myers did it with Cantonese in Wayne's World.) Then the line of Mayan men speak in unison, with a subtitle reading "Not us!" The bit ends with the head of a black panther popping inexplicably into the screen and Mel and his cast members running away screaming. (The panther was in the trailer too, though see here about the implausibility of using this type of wildcat in the film.)

I'm in no position to judge Gibson's Yucatec proficiency, but it does appear that he was trying to say something meaningful (as opposed to, say, the natives' gibberish in the original King Kong). After perusing some online resources in Yucatec Maya, I've been able to determine that Mel's last line — repeated by the cast members behind him — is "Ma tene," which does indeed mean "Not me." You can find the phrase used in a transcribed folk tale, with English translation, accompanying "A Grammar of the Yucatecan Mayan Language" by David and Alejandra Bolles. (The tale is of "Juan Thul, The Trickster Rabbit," a character strikingly like Br'er Rabbit of the Uncle Remus stories.)

It's not too surprising that Gibson would strive for serious use of Yucatec Maya even in a light-hearted sketch for the Oscars. Last Friday, Time published a story in its online edition where Gibson revealed that he would be speaking Maya on the Oscar broadcast. (This is just a teaser for a longer article in a forthcoming issue of Time, which got an "exclusive peek" at the filming of Apocalypto.) The online piece notes "the obvious care that has been taken with costumes, sets and the dialect-correct language," which "suggests the kind of cultural attention filmdom has rarely if ever accorded the Mayas, who were the Greeks of the New World." (That's a line attributed to the archeologist Sylvanus Morley.)

Mel may be dutifully learning the "dialect-correct language" for the film, but that won't stop Yucatec Maya from being exoticized whenever it gets mentioned in the English-language press. As before, the latest round of media attention has painted Yucatec as both "ancient" (see here, here, and here) and "obscure" (see here, here, and here), when it is actually neither. It's a living language of about a million speakers, but it's getting treated as if it were some sort of quaint museum piece. (Granted, Gibson may be trying to approximate an archaic version of Yucatec Maya, since his film takes place 600 years ago, but so far we've had no indication that the language used on the set is anything but modern.) The supposed "ancientness" of the film's language fulfills a specific need for media commentators: to portray Gibson as something of a kook who keeps making films in weird dead languages like Aramaic. But from what I can tell, Gibson is fully playing into the media's exoticization of the language — and Mayan cultures in general — for his own purposes, whatever they may be. Why else would he shoot his little Oscar piece, completely decontextualized from anything that might explain why he was speaking that odd-sounding language in front of those odd-looking people?

(Oh, and did I mention the movie has human sacrifices?)

[Update #1: Apropos of too-terse Maya translations, Steve of Language Hat sends this along:

I thought I'd pass on this bit which struck me yesterday when I was finishing Nelson Reed's fascinating 1964 book The Caste War of Yucatan (the war began in 1847-48 and tailed off for decades); the author is making a trip to Yucatan in 1959 to round out his research for the book and interview anyone who might have personal knowledge of the events of the early part of the century, and he's met an old gent named Don Norberto Yeh:

"My next question, Did he remember the time of General Bravo [who conquered the independent Maya 1899-1912], brought a long, explosive diatribe (and Maya can be very explosive) which was translated, 'The Señor says Yes.'"

Well, Mel Gibson has said that he's done extensive reading on Mayan history, so who knows — maybe he lifted his joke from Nelson Reed!

Mark Liberman adds a similar anecdote involving English-to-French translation:

Almost 20 years ago, when we were aligning the two languages of the Canadian Hansards for use in MT research, we found that where the English version had a multiple-paragraph eulogy of some local dignitary, read into the record by his representative -- "Beloved husband, supportive father, founder of this, pillar of that, etc., etc." -- the French version would often have just four words: "M. ___ est mort."

I don't recall having seen the same thing going the other way, which may have been because there were simply fewer eulogies to translate from French to English, or perhaps because the Eng->Fr translation service was less stressed overall.]

[Update #2: Language Hat now has a post up about this, in which he mentions another too-terse-translation bit: the Suntory scene in Lost in Translation ("Is that everything? It seemed like he said quite a bit more than that").]

Posted by Benjamin Zimmer at 10:10 AM

Tough language, woman

Berke Breathed's Opus strip for 2/5/2006 illustrates the two-cultures theory...

Here's the text version. Opus and Steve are sitting at a little round table in a coffee shop, drinking brewed coffee from paper cups.

Opus: Steve, ever notice how women sit in little groups and get all intense and chatty and whispery?
Steve Dallas: They're talking woman.
Opus: Oo. Tough language, woman.
Steve Dallas: I talk woman.
Opus: No way.
Steve Dallas: Absolutely.
Opus: You just know a few nouns.
Steve Dallas: I talk perfect woman.
Opus: Betcha couldn't pass as a native.
Steve Dallas: [taking out a bill] Five bucks.
Opus: [putting his bill on the table] Five bucks.
Steve Dallas: [at the next table, where three women are sipping frappuccino-like drinks] Hi ladies. I'm feeling unfulfilled, my mother's a controlling shrew and my butt's growing wider than Rhode Island. You too?
Opus: [to Steve, who is back in his chair with one frappuccino poured over his head, another in his face, and the third one stuck down the front of his pants] That was pidgin woman.

For the corresponding image in current American popular culture of women's stereotype of the language of men, we can turn to a passage in Prairie Home Companion's "Guy Noir" sketch for 2/18/2006. Guy has been hired to help judge the Ausgesprechen Pride of Milwaukee Music Contest. Garrison Keillor is playing Guy Noir; Tim Russell plays Mr. Olson, the president of the Ausgesprechen Brewing Company; Sue Scott plays Miss Hattendorf, the president of his competitor, Blanche Beer. [You can listen to the whole thing here: Miss Hattendorf's section starts at about 8:08, to which this link should take you directly; and the fragment below starts about here.]

Miss Hattendorf: ... look at his new ad campaign —"Ausgesprechen Beer. Not just for breakfast any more." Huh? Boy oh boy. Huh?
Mr. Olson: It's the most popular beer in Milwaukee.
Miss Hattendorf: Yeah, yeah, yeah. Popular among men with low IQs. His beer contains testosterone.
Mr. Olson: It does not.
Miss Hattendorf: Yes it does. They lace it with male hormones, so after you've had two or three cans of Ausgesprechen, even if you're a, you know, graduate student in quantum field theory, well you stand by the bar and you go [imitates male vocal tract by lowering larynx and loosening lower face] "hi! hey! [etc.]"

Note that the script on the Prairie Home Companion site refers to Miss Scott's imitation of typical men's speech as "male honking".

[The idea that (modern American) men and women come from different cultures is well established in the public mind. Different versions of this idea have been commonplace for hundreds of years, though some of the particular characteristics stereotypically associated with the male and female approaches to things have changed over time. There's enough social separation of the sexes to make these ideas plausible, quite apart from any biological effects, but it seems to be very hard to separate truth from stereotype in areas with as much emotional resonance as this one.]

Posted by Mark Liberman at 07:36 AM

Hatching out a question of usage

Jim Gordon suggests battening the hashes against the expression "hatch-marked" in a March 4 NYT story by Kirk Johnson, "Out of Old Mines' Muck Rises New Reclamation Model for West":

The property is hatch-marked by miles of unmarked and unmapped trails carved by generations of backcountry users at a time when no owner was around to say boo. [emphasis added]

Certainly the various forms of hash mark are commoner than the corresponding forms of hatch mark, as these MSN Search counts indicate:


Total mhits for hatch: 8,255; total mhits for hash: 155,429. That's a 19-to-1 victory for hash. And hash mark makes the dictionary, while hatch mark doesn't (at least that's the situation in the AHD and the OED). On the other hand, hash(-)marked only beats out hatch(-)marked by 1212 to 783 (roughly 1.5 to 1) -- and both are rare -- suggesting that others are uncertain about this as well.

I wonder if deep down, Johnson might have meant to use the standard collocation "cross-hatched" rather than either "hatch-marked" or "hash-marked":

  +hatch +hash +hatched +hashed

Total mhits for hatch: 152,985; total mhits for hash: 1,133. That's 135-to-1 for hatch in this case.
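For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the three ratios from the totals quoted in this post. The dictionary keys and the helper name `ratio_to_one` are my own labels, not anything from MSN Search.

```python
# Hit totals as quoted in the post; the labels are illustrative only.
totals = {
    "hash mark vs. hatch mark": (155_429, 8_255),
    "hash(-)marked vs. hatch(-)marked": (1_212, 783),
    "cross-hatch vs. cross-hash": (152_985, 1_133),
}

def ratio_to_one(winner: int, loser: int) -> float:
    """Return how many hits the commoner form gets per hit of the rarer form."""
    return winner / loser

for label, (winner, loser) in totals.items():
    print(f"{label}: {ratio_to_one(winner, loser):.1f} to 1")
```

The first and third ratios come out near 19 and 135 to 1, while the middle one is only about 1.5 to 1, which is the point of the comparison.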

However, the kind of hatch in question is "An engraved line or stroke; esp. one of those by which shading is represented in an engraving" (the OED's gloss for hatch3), and so the compositional meaning of hatch-marked "marked by strokes" should be perfectly acceptable in the context of Johnson's sentence. Taking the other side, though, we can observe that this meaning of hatch is rather obscure, and hash mark and cross hatch(ed) are idiomatic collocations, so that many readers like Jim are going to suspect that Johnson's "hatch-marked" was a malapropism (or perhaps an eggcorn, since the sound is very close and the sense is also appropriate). In the end, this may be another Hobbesian choice.

And ironically, both hash and hatch have (I think) the same etymological source, which according to the OED is:

[a. F. hache (12thc. in Littré) = Sp. hacha, It. accia: -- OHG. *happja, whence hęppa, MHG. hepe scythe, bill, sickle.]

In the case of "hash mark" and "hash sign", there was apparently a recent transition from hatch -- "altered by popular etymology", as the OED delicately puts it.

So if Johnson is putting the hatch back in hash mark, perhaps he's not making a hatch of it after all: that's just how the old hatchet sometimes crumbles.

[Curiously, hack (though similar in sound and also in core meaning) seems to have a different source:

[Early ME. hack-en, repr. OE. *haccian (whence tó-haccian to hack in pieces): -- Common WGer. *hakkôn: cf. OFris. to-hakia, MHG., MLG., MDu., G. hacken, mod.Du. hakken.]]


Posted by Mark Liberman at 06:47 AM

March 05, 2006

Stumbling across the linguistic divide

Via Language Hat comes another tale of spellchecking run amok. The Recorder presents for our amusement these passages from an opening brief by Santa Cruz practitioner Arthur Dudley to San Francisco's 1st District Court of Appeal:

An appropriate instruction limiting the judge's criminal liability in such a prosecution must be given sea sponge explaining that certain acts or omissions by themselves are not sufficient to support a conviction.

It is well settled that a trial court must instruct sea sponge on any defense, including a mistake of fact defense.

In five different places in the brief, Dudley's spellchecker put "sea sponge" in the place of the phrase sua sponte, Latin for "of one's own accord, voluntarily." Dudley corrected the brief after his own client (a former judge) asked for an explanation. But word got around to local attorneys in Santa Cruz, who now kid Dudley about the "sea sponge duty to instruct."

Dudley shouldn't feel too bad, though, since cross-linguistic spellchecking goofs are not uncommon even in the copy-edited prose of major newspapers.

Regret The Error is the go-to site for all manner of journalistic schadenfreude. Last November, RTE picked up this correction from the Denver Post:

Because of an editor's error, a sentence on page 8D on Tuesday in a story about Rockies prospect Hector Gomez buying a bus was changed from "On the back he put 'Los Peloteros' which in Spanish means 'The Ballplayers'" to "he put 'Los Plotters' which in Spanish means 'The Pallbearers.'"

The spellchecker-enabled substitution of peloteros with plotters is easy enough to understand, but how in the world did ballplayers transform into pallbearers? I don't think a spellchecker can be blamed for that one, since no common misspelling of ballplayers will yield pallbearers when run through Microsoft Word's spellcheck dictionary (as when acquainted is misspelled as aquainted and gets changed to aquatinted). Rather, this seems like an auditory error of some sort. I can imagine somebody in the newsroom yelling out, "What does peloteros mean in Spanish?" — then a Spanish speaker yells out ballplayers, which is misheard as pallbearers.
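To make the intuition a bit more concrete, here is a rough sketch that uses plain Levenshtein edit distance as a stand-in for "how far apart two spellings are". This is an assumption for illustration only: I don't know how Microsoft Word actually ranks its suggestions, and the function name `edit_distance` is mine.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

pairs = [
    ("aquainted", "acquainted"),    # the intended word
    ("aquainted", "aquatinted"),    # the wrong "correction"
    ("peloteros", "plotters"),
    ("ballplayers", "pallbearers"),
]
for source, target in pairs:
    print(source, "->", target, edit_distance(source, target))
```

On this crude measure, aquainted sits one edit away from both acquainted and aquatinted, which is how a wrong suggestion can win. Ballplayers and pallbearers are four edits apart, and ballplayers is a correctly spelled word that a checker would have no reason to flag in the first place.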

The Denver Post has run into other spellchecker problems recently, not just when copy-editing foreign terms. Proper names are also a big stumbling block for spellcheckers — for instance, when the New York Times changed DeMeco (first name of University of Alabama linebacker DeMeco Ryans) to Demerol, or when the Rocky Mountain News changed the corporate name of Leucadia to La-De-Da. On its Nov. 30, 2005 editorial page, the Denver Post managed a trifecta of name glitches:

Last week, Michael Scanning, an associate of lobbyist Jack Brimful and before that an aide to DeLay, pleaded guilty to conspiring to bribe Ohio Rep. Bob Hey and other public officials in an investigation that centers on Brimful's activities.

Whoops... just replace Scanning with Scanlon, Brimful with Abramoff, and Hey with Ney. The Denver Post was self-effacing in its correction:

One sympathetic journalism expert said yesterday that spellcheck can be an editor's enemy, "as Voldemort is to Harry Potter."
Or as our spellchecker would have it, "as Voltmeter is to Harry Potter."
A lesson learned.

But even outdoing that triple error is a bollixed article recalled by Hard News author Seth Mnookin in an interview with Regret The Error. Before writing for Newsweek and Vanity Fair, Mnookin put in his time at the Palm Beach Post, where one of his articles was the victim of a massive spellchecking debacle involving both foreign words and proper names:

I was waking up early on Saturday to go visit some friends in New York. So I got up at 4 in the morning to go to the airport [and looked at the paper]. The story had gone through a spellchecker and since it was about a sculpture from Canada there were all these French Canadian words. The entire article was gibberish. Every single name — not just the name of the sculpture, but the name of the place it was coming from and everything — was just tuned [sic!] into gibberish.

I looked up the article on the Nexis database, and the damage was about as bad as Mnookin remembers:

"Gaps in Boca statue's cost hardly a joke" by Seth Mnookin
Palm Beach Post, October 17, 1998, p. 1A
The gallery owners who won a city endorsement to put a 7-foot-tall sculpture of a jester downtown, apparently misled officials about the cost of the art and their potential profit in the deal.
Owner Richard Lepanto, who runs Kitty's Gallery on East Palmetto Park Road with Claire Fontana, said he expects to make more than $50,000 if fund-raising efforts are successful.
In August, he told the city's Community Redevelopment Agency that the bronze sculpture would cost $265,000. He said he planned to raise twice that from local donors and donate half of everything raised to the American Heart Association.
But a gallery in Quebec that sells the same sculpture — Benevento by Canadian artist Nicole Tallinn — said the piece sells for $95,000 in Canadian dollars, or about $62,000 in American currency. ...
Tallinn, reached in Canada, said the piece sold for $95,000 Canadian when she first sold it several years ago. However, a bank manager at La Case Popular du Vex Quebec, who bought the first copy of Benevento in 1995, said the piece cost less than $75,000 Canadian, or less than $49,000 U.S. ...
''Of course it's going to cost more in Boca Raton than in Canada,'' said Sylvia Morin, a Deerfield Beach resident who is working with Kitty's Gallery.

Here is the redfaced correction the Palm Beach Post ran the next day:

Because of an editor's error in using a computer spelling program, names in a story about a jester sculpture in Boca Raton were misspelled in Saturday's Palm Beach Post. Kutty's Gallery owners are Richard Lapointe and Claire Fontaine. The name of the sculpture is Bienvenue and its artist is Nicole Taillon. A resident working with Kutty's Gallery is Sylvie Morin.

One can see some of the limitations of the Palm Beach Post's spellcheck dictionary: good with European place names (Benevento in Italy, Tallinn in Estonia), not so good with French greetings (bienvenue). But the printed correction didn't even mention one glaring cross-linguistic blunder: "La Case Popular du Vex Quebec," which is evidently a spellcheckified version of "La Caisse Populaire du Vieux-Québec." A caisse populaire is a form of credit union found throughout French Canada, while Vieux-Québec ("Old Québec") is the name of Québec City's historic district.

Quelle horreur. In comparison, Dudley got off relatively easily with his sea sponges.

Posted by Benjamin Zimmer at 09:55 PM


David Beaver recently posted about the price of his weight in gold, under a title ("... and the value of nothing") that alludes to one of Oscar Wilde's many bon mots:

LORD DARLINGTON. What cynics you fellows are!
CECIL GRAHAM. What is a cynic? [Sitting on the back of the sofa.]
LORD DARLINGTON. A man who knows the price of everything and the value of nothing.
CECIL GRAHAM. And a sentimentalist, my dear Darlington, is a man who sees an absurd value in everything, and doesn't know the market price of any single thing.
[Lady Windermere's Fan, Act 3]

This is widely quoted in various monologic forms (e.g. "A cynic is a man who..."), but David's post may be the first time that the quotation has been applied to a linguist, at least in print.

In fact, it's striking how rarely this little zinger is used as a phrasal template, of the kind that we've taken to calling snowclones. It starts with all the advantages: rich, thin, elegant, famous, parallel, memorable. Google suggests that thousands of web pages have succumbed to the temptation to apply this witticism to economists rather than to cynics, and also to accountants and a few other money-related professions. But at that point, our collective creativity seems to have stalled. "A cynic is a man who knows the price of everything and the value of nothing" is apparently a noclone: often quoted, rarely adapted.

Among the few exceptions that I've been able to find, there's a little clump where various movie-industry statistics are substituted for price:

Were he alive today, Oscar Wilde would describe a movie buff as a man who knows the weekly gross of everything and the value of nothing. [link]

A new figure emerged: the movie buff, who knows the credits of everything and the value of nothing. [link]

And there are a couple of puns on price:

One unintended side-effect of buying an offbeat car is that it encourages one’s friends to make outrageous puns. I’ve already been asked if I will become ‘Prius sensitive’. And someone has even adapted Oscar Wilde’s crack about a cynic being “someone who knows the prius of everything and the value of nothing”! It’s tough being an innovative consumer. Sigh. [link]

But compared to other patterns that we've looked for, such as "X is the dark matter of Y", or "in X, Y Z you", this one is rare.

The most obvious dimensions of generalization would be the blanks in:

___ is [someone] who knows the ___ of everything and the ___ of nothing.

In the first blank, it's natural to substitute economists and others concerned with prices, but apparently not much else. In the second blank, it's natural to substitute other measures that can be opposed to price, but not many substitutions show up. For example, the various controversies about psychological testing don't seem to have inspired anyone to write about those who know "the score(s) of everyone and the value(s) of no one", or anything similar that might be caught by the pattern {"scores of * and the values"}.

And I haven't found any full-out treatments at all, along the lines of "An X is someone who knows the Y of everything and the Z of nothing". I guess the relationship X:Y:Z in this case is just too specialized, or perhaps too abstract. Try completing sentences like "a computer scientist is someone who knows the __ of everything and the __ of nothing", or even "a lawyer is ..." It's hard to come up with any completions that aren't totally lame. But send me counterexamples, whether found or concocted, and I'll post them.
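For those who would rather hunt mechanically, here is a minimal sketch of the template as a regular expression with named slots, roughly in the spirit of the wildcard search pattern above. The pattern, the function names and the choice of test sentences are my assumptions; it only catches the strict "knows the Y of everything and the Z of nothing" frame and will miss looser variants.

```python
import re

# Y and Z are the two open slots; "knows?" also admits plural subjects.
TEMPLATE = re.compile(
    r"knows?\s+the\s+(?P<y>[\w' -]+?)\s+of\s+everything"
    r"\s+and\s+the\s+(?P<z>[\w' -]+?)\s+of\s+nothing",
    re.IGNORECASE,
)

def find_slots(text: str) -> list[tuple[str, str]]:
    """Return the (Y, Z) fillers for every instance of the frame in text."""
    return [(m.group("y"), m.group("z")) for m in TEMPLATE.finditer(text)]

def is_adaptation(slots: tuple[str, str]) -> bool:
    """True when the fillers differ from Wilde's original price/value pair."""
    return tuple(s.lower() for s in slots) != ("price", "value")

samples = [
    "A cynic is a man who knows the price of everything and the value of nothing.",
    "a movie buff as a man who knows the weekly gross of everything and the value of nothing",
    "someone who knows the prius of everything and the value of nothing",
]
for sentence in samples:
    for slots in find_slots(sentence):
        print(slots, "adapted" if is_adaptation(slots) else "original")
```

Run over a pile of text, something like this would separate straight quotations from genuine adaptations, though deciding whether an adaptation is lame remains a job for humans.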

[Update: Several readers -- Grzegorz Chrupała was the first, and others included Will Fitzgerald and Bill Findlay -- have reminded me of a long-established and widely quoted variant: "Lisp programmers know the value of everything and the cost of nothing", due to Alan Perlis. JS Bangs explains: In Lisp, every statement returns a value, but Lisp tends to be very costly in terms of processor power and memory. This is a complete snowclone, with all three spots given over to something different than their original, although the general cost/value pun remains.

With respect, I think that "pun" is not quite the right word here. In a variation of this, Don Porges suggests that "a computer scientist knows the address of everything and the value of nothing. Or maybe that's a non-optimizing compiler."

Adrian Morgan observes that the simpler phrase "An X is someone who Y everything and Z nothing" is more productive of variants, such as

An internist is someone who knows everything and does nothing. A surgeon is someone who does everything and knows nothing. A psychiatrist is someone who knows nothing and does nothing. A pathologist is someone who knows everything and does everything too late.

Adrian also points to a penumbra of more loosely related cases, which probably aren't intended to evoke Wilde at all, such as this Oysterband lyric:

The shadow of the Pharaohs freezes up the earth
They know what the price is, they don't know what it's worth

In stricter imitation of the cynic quote, Adrian suggests a candidate of his own: "A materialist is someone who knows the cause of everything and the reason for nothing".

And Bruce Rusk suggests that

Isn't Wilde working with the resonance of "everything ... nothing," which is often used to form parallel phrases? Things like "knows everything but understands nothing."

Quick LION check:

Wilde, The Critic as Artist (1891--the year before Lady Windermere's Fan was first produced, presumably written the same year?): "Conversation should touch everything, but should concentrate itself on nothing."

Melville, Typee: For my own part, although hardly a day passed while I remained upon the island that I did not witness some religious ceremony or other, it was very much like seeing a parcel of "Freemasons" making secret signs to each other; I saw everything, but could comprehend nothing.

At what level would the snowclone lie?

As Bruce observes, the pattern here is just the parallel contrast of everything/nothing as objects of two semantically-related verbs with the same subject. Bruce also suggested that linguistics offers many opportunities, such as

"A phonetician is someone who knows the pronunciation of everything and the meaning of nothing."
Replace with grammar, derivation, etc.


Posted by Mark Liberman at 08:43 AM

Pioneers of word rage

A few months ago Mark Liberman remarked on a phenomenon that seems peculiar to the English-speaking tradition: "word rage" — that is, disgust over non-normative language use accompanied by imagined physical harm to the transgressor. A classic example is the reaction of Henry Higgins to Eliza Doolittle in My Fair Lady: "By rights she should be taken out and hung / For the cold-blooded murder of the English tongue." I've been keeping an eye peeled for historical cases of linguistic fury, and I've come across a couple of humorous progenitors for today's word-rageaholics.

The first example is from the British writer Samuel Butler (1835-1902), best known as the author of the satirical novel Erewhon. In 1875 Butler wrote the poem "A Psalm of Montreal" after a visit to that city. Like Erewhon, published a few years earlier, the poem lampoons the affectations of Victorian society. Here is a description from Betty Bednarski, in an article about a French translation of the verse:

Butler's poem is a comment on the prudish atmosphere that prevails in late 19th-Century Montreal, where, shoved away in a back room of the Natural History Museum and gathering dust, its offending piece of private anatomy facing the wall, he has found a plaster cast of a famous Greek statue. He is informed by an employee of the museum, a stuffer of owls and such, that the Discobolus — as it is called — is not fit for public view, since it has no "pants" to wear. Butler makes fun of the prudery, the lack of culture, the indifference to beauty, and the Canadian use of the word "pants."

The museum taxidermist, who has boasted that his brother-in-law is the haberdasher to one Mr. Spurgeon (evidently an important man in Montreal society), is rebuffed by Butler with scriptural wrath:

Then I said, "O brother-in-law to Mr. Spurgeon's haberdasher,
Who seasonest also the skins of Canadian owls,
Thou callest trousers 'pants', whereas I call them 'trousers',
Therefore thou art in hell-fire and may the Lord pity thee!"
O God! O Montreal!

For Butler, the taxidermist's Ashcroftesque insistence on concealing the statue is matched only by his ridiculous North American use of the word "pants." The Wikipedia entry for "trousers" helpfully explains the trans-Atlantic distinction:

In North American English, pants is the general category term, and trousers refers, often more formally, specifically to tailored garments with a waistband and (typically) belt-loops and a fly-front. For instance, informal elastic-waist knitted garments would never be called trousers in America.
In British English, trousers is the general category term, and pants refers to underwear (in America, called underwear, underpants or panties to distinguish them from other pants that are worn on the outside).

It's safe to say that Butler wasn't seriously condemning the museum worker to hell-fire for his impertinent colonial usage of "pants." Rather, he seems to be saying: "So you're offended by a nude statue? Well, I'm offended by your use of the word 'pants'! Which is the greater sin?" The image of hell-fire (and the mock-Biblical style of the "Psalm" in general) only serves to give the taxidermist a taste of his own pious medicine.

Butler doesn't quite threaten physical harm, instead sardonically implying that his Canadian interlocutor is doomed to suffer everlasting misery in the hereafter. But we can find depictions of actual usage-inspired violence in American cartoons from a century ago.

The year was 1906, and President Theodore Roosevelt had turned his reforming energies to... spelling. An advocate of simplifying English orthography, Roosevelt issued an executive order that required the Government Printing Office to adopt 300 reformed spellings, as recommended by the Simplified Spelling Board (an organization that included the likes of Andrew Carnegie, William James, and Mark Twain). This peremptory imposition of spelling reform was greeted with widespread scorn, both in Congress and around the nation. Roosevelt ended up being forced to withdraw the order, and his political miscalculation effectively killed the simplified spelling movement.

The vehemence of the backlash against Roosevelt's spelling reforms is vividly illustrated by the comic strip "The Outbursts of Everett True" by A.D. Condo and J.W. Raper (thanks to Josh Fruhlinger at The Comics Curmudgeon for the link).

It's worth noting that Everett True's savage pummeling of the unfortunate spelling reformer is entirely in character. As explained on the Barnacle Press site where these cartoons are archived, most strips follow a simple formula: in the first panel, Everett is annoyed by someone, and in the second, he beats the person up. He's just as likely to assault a tiresome song-plugger, a smart-alecky policeman, or a lascivious ankle-ogler. The preface to a collection of strips calls Everett "a living protest against the incarnate irritants that are with us always." So the spelling reformer was just one of the many "incarnate irritants" of the day, but the strip implies that readers would share Everett's orthographic indignation, even if they weren't driven to acts of violence.

Cartoonists also depicted Roosevelt himself in violent scenes during the spelling-reform fiasco. A video clip from a public television documentary series called "Children of the Code" describes Roosevelt's attempted reforms, with commentary from John A. Gable, executive director of the Theodore Roosevelt Association. Gable presents a 1906 cartoon from the popular magazine Collier's in which Teddy, sporting a half-academic, half-Rough Rider ensemble, shoots up the King's English in the form of a dictionary. Meanwhile, the ghosts of Dr. Johnson, Shakespeare, and Chaucer look on in horror.

Posted by Benjamin Zimmer at 12:32 AM

March 04, 2006

... and the value of nothing

Someone very dear to me told me yesterday I was worth my weight in gold. Naturally, I had to check it out. At time of utterance, that was $1,130,000.
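For readers who want to check their own valuation, here is a small sketch of the conversion. The only hard fact in it is that a troy ounce is about 31.1 grams; the spot price of $565 per troy ounce is a hypothetical figure I have plugged in for illustration, not one taken from this post, and the function names are mine.

```python
GRAMS_PER_TROY_OUNCE = 31.1034768  # definition of the troy ounce

def gold_value_usd(mass_kg: float, usd_per_troy_oz: float) -> float:
    """Value in dollars of mass_kg of gold at the given spot price."""
    return mass_kg * 1000 / GRAMS_PER_TROY_OUNCE * usd_per_troy_oz

def implied_mass_kg(value_usd: float, usd_per_troy_oz: float) -> float:
    """Mass in kilograms that would be worth value_usd at the given price."""
    return value_usd / usd_per_troy_oz * GRAMS_PER_TROY_OUNCE / 1000

HYPOTHETICAL_PRICE = 565.0  # assumed USD per troy ounce, illustration only
print(f"{implied_mass_kg(1_130_000, HYPOTHETICAL_PRICE):.1f} kg")
```

Swap in the day's actual price and your own mass to taste.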

Posted by David Beaver at 11:38 AM

March 03, 2006

The entire United States wept

I recently came across an article in the Mainichi Daily News describing how Tokyo police officers have compiled a glossary of juvenile jargon to help them decipher what Japanese teenagers are saying. Teen slang in Japan involves a great deal of wordplay, such as by shortening words and reversing their syllables (not unlike the veiled slang of many youth cultures, from French verlan to Indonesian prokem). Nouns are easily verbed by suffixing a clipped form of a word with the syllable -ru, as in famiru meaning 'go to a family restaurant.' The article reproduces some examples from the police glossary (originally appearing in the magazine Weekly Playboy), and as one would expect from ingroup slang many of the terms are cryptically allusive. Perhaps the most cryptic entry is the following:

Zenbei ga naita — literally, "the entire United States wept." Means nothing important.

Can you guess why that phrase might imply triviality?

Give up? The Mainichi writer explains:

One might be moved to wonder how the above expression could possibly take on such an unrelated meaning. After checking the blogs, your reporter came up with this explanation: When many U.S. films open in Japan, they are accompanied by posters claiming that American viewers were moved to tears. But such films have little emotional impact on viewers here. So Japanese filmgoers have learned, apparently, to disregard such promotional claims as largely meaningless.

At first that struck me as wonderfully bizarre, but then I realized Americans often do something similar, sarcastically reframing the clichéd blurbs of movie or theater critics. One old one is "I laughed, I cried, it became a part of me!" Or: "It's the feel-good hit of the summer!" (Or for those who came of age in the mid-'80s, there's the line from the Saturday Night Live mock commercial for the hypnotist's show: "It was better than Cats!") Granted, none of these stock phrases has come to mean "nothing important," but the associative thinking behind the Japanese expression isn't all that far-fetched.

A side note: I stumbled upon this article via an exceptionally roundabout cruise through the blogosphere. How roundabout, you ask? A blog feed picked up Erin O'Connor's link to my recent snowclone post. Then clicking on Erin's tag for "snowclones" led to a post on cognoshanty about using Kai Carver's fun Google app as a snowclone accumulator. Comments on the cognoshanty post touched on the remodeling of clichés in other languages, in which context Kai mentioned zenbei ga naita. Kai found out about that by following a link on the pioneering Robot Wisdom blog to iMomus (the blog of performance artist Nick Currie, aka Momus), which in turn linked to Japundit, which linked, finally, to the Mainichi article. Whew!

By the way, the cognoshanty folks helpfully point out that Jorn Barger (illustrious coiner of the word "weblog") has been collecting snowclonish expressions on Robot Wisdom since the very early days of blogging, under the rubric MemeWatch / ClicheWatch. As noted on the Wikipedia entry for "snowclone," the first such expression catalogued by Robot Wisdom was back on Nov. 2, 1998: "The X formerly known as Y-not-Prince." This surely deserves special recognition in the annals of snowclonology — perhaps a fancy display case prominently exhibited in the Snowclone Hall of Fame (still under construction).

[Update: Anticipating Kai Carver's Gooph, Will Fitzgerald came up with his own snowclone/meme searcher based on Robot Wisdom's MemeWatch back in September 2003.]

Posted by Benjamin Zimmer at 01:57 PM

Freedom of speech: more famous than Bart Simpson

OK, all you spinmeisters out there, listen up. This is a small trick, but it's a good trick, and there are plenty more where this came from.

Suppose that your thing is X-ology, and you want to emphasize how ignorant people are about X, in order to publicize your efforts to educate them. So you do a survey where you ask a simple and important question that is actually tricky and confusing: say, what are the freedoms guaranteed by the First Amendment to the U.S. Constitution? Or what are the capitals of the states in the Pacific time zone? Or what are the features of a passive sentence in English? You ask a thousand people, and the results come out something like this:

[Table: Number of correct answers | Number of [respondents] | Percentage | Cumulative]

Of course, we're talking about the recent First Amendment survey, and the percentages in the third column are the numbers reported in the press materials provided by the McCormick Tribune Freedom Museum. (I created the other numbers to correspond to those percentages.) Now, how to present this?

73% could name at least one freedom, but 73% sounds like a really good number for a poll or survey. Almost 3/4 of the people surveyed "got it". Headline: Only 73% Can Name 1st Amendment Freedom. No, that's not going to work. What to do?

There are plenty of low percentages in the table, but we need a "good bad number", so to speak. You could say that only 1 in 1,000 could name all five -- but Americans don't expect perfection on tests, and such an extreme result risks focusing attention on the weirdness of the question. No, we need a number around 30% to symbolize the pitiful state of Americans' knowledge. (When W's approval rating reached 34% in some recent polls, we saw headlines like "poll-axed" and "all-time low".)

However, we also need a statistic that has a catchy and persuasive description. We could say that only 9% got three or more answers right; but 9% is suspiciously low, and anyhow, three or more seems to be an arbitrary choice. The only number that doesn't sound arbitrary is one -- but we already saw that 73% got one or more answers right. The 29% that got two or more right is a good percentage, but again, two or more seems like an arbitrary and therefore meaningless threshold. So to frame the result in a way that makes it seem natural and meaningful, how about "more than one"?

Bingo: the first sentence of the press release reads:

A new McCormick Tribune Freedom Museum survey finds that only about one in four Americans (28 percent) are able to name more than one of the five fundamental freedoms granted to them by the First Amendment to the U.S. Constitution.

[Never mind the 28%/29% difference -- there's a round-off issue in here somewhere.]

And here's the cool part. It's easy to confuse "more than one" with "one or more"; and "one or more" is a lot commoner:

[Table: search counts for "one or more of the" vs. "more than one of the" on Google, Yahoo, MSN and LDC News]

So you can hope that even some excellent journalists, like Peabody Award-winning Robin Young of NPR's Here and Now, will report your results like this:

... a new survey by the soon to open McCormick Tribune Freedom Museum in Chicago found that only twenty eight per cent of those asked were able to name *one* of the freedoms, yet fifty two percent could name at least *two* of the members of the Simpsons family.

Even if the press repeats your phrase "more than one" correctly, a large fraction of the public will make the same mistake that Robin Young did.
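To see how far apart the two phrasings are, here is a minimal Python sketch. It uses only the cumulative percentages quoted in this post (73%, 29% and 9%); the names AT_LEAST, one_or_more, more_than_one and exactly are made up for illustration and come from nowhere in the museum's materials.

```python
# Percent of respondents naming at least k freedoms, as quoted in this post.
# Only the thresholds mentioned in the text are included.
AT_LEAST = {1: 73, 2: 29, 3: 9}


def one_or_more(at_least):
    """'One or more' is the k >= 1 threshold."""
    return at_least[1]


def more_than_one(at_least):
    """For whole-number counts, 'more than one' means k > 1, i.e. k >= 2."""
    return at_least[2]


def exactly(k, at_least):
    """Percent naming exactly k: those with at least k, minus those with at least k + 1."""
    return at_least[k] - at_least[k + 1]


print(one_or_more(AT_LEAST))        # 73
print(more_than_one(AT_LEAST))      # 29
print(exactly(1, AT_LEAST))         # 44
print(100 - one_or_more(AT_LEAST))  # 27, the share naming none
```

Swapping one phrase for the other moves the headline number by 44 points, which is exactly the share of respondents who named one freedom and no more.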

Of course, the museum's real genius here was to compare the first amendment poll results with the results of a question about members of the Simpsons family. But some careful spinning was still needed! The results stacked up as follows:

[Table: Freedoms vs. Simpsons results, with rows 0 of 5; 1 of 5 / 1 or more; 2 of 5 / 2 or more; 3 of 5 / 3 or more; 4 of 5 / 4 or more; 5 of 5 / all 5]

Some possible headlines that we didn't see:

Don't have a cow, man: 30% more draw a blank on the Simpsons than on the Constitution [35% vs. 27%]
Survey: 73% can name a First Amendment freedom / Only 65% can name a Simpson

Some of the other results would also have lent themselves to reverse-spin headlines. In unprompted free recall, 70% of respondents came up with "freedom of speech" as a right guaranteed by the first amendment. The best-remembered member of the Simpsons was Bart, who was named by 61% of respondents. But we didn't see this headline either:

Freedom of speech: more famous than Bart Simpson!

And the most shocking statistics in the report, in my opinion: only 51% identified Homer, and only 43% identified Marge. Barely half of Americans can remember that Homer is a Simpson?!? Fewer than half can remember Marge?!? I mean, talk about burying the lede. An emergency educational initiative in Simpsonology is clearly required.

[All kidding aside, I'm solidly in favor of the McCormick Tribune Freedom Museum's initiative, to the extent that I understand it. In fact, since we Americans have such a short, inspired and well-written constitution, and since I think that memorization of canonical texts is a Good Thing, I'd be in favor of encouraging everyone to memorize all 7600 words of the constitution and its amendments. But the museum's presentation of its poll was a classic example of the rhetoric of public relations -- not the most dishonest one ever seen, but not overly scrupulous either -- and the press swallowed it hook, line and sinker. Or maybe the press was a willing collaborator in spinning the issue for the public; I'm not sure.]

Posted by Mark Liberman at 08:24 AM

Playing for the Dominican, skiing in Czech, working in Saudi

New York Mets pitcher Pedro Martinez is nursing a sore toe, so he has opted against playing for the Dominican Republic in this month's inaugural World Baseball Classic. The AP quotes Pedro's reasoning:

"It would be totally unfair to the Dominican. I haven't even thrown a breaking ball yet. ... I actually reported early to try to do something. That was my main goal, to play for the Dominican. They know I would like to be there, but I cannot do it."

Pedro's shortening of "the Dominican Republic" to "the Dominican" is not particularly unusual among Dominicans living and working in the United States. Coverage of the leadup to the WBC has included discussions by players and reporters alike about who exactly would be "playing for the Dominican." And in case there was any confusion over whether "the Dominican" is elliptical for "the Dominican team," there are also many references to people going "to the Dominican" or doing things "in the Dominican," so it's definitely a toponym. Think of it as the converse of President Bush's "Great British" problem: it's what happens when a country has a straightforward toponymic adjective in English but lacks a one-word nominal form. (Well, Dominicans do sometimes use a single-word toponym: Quisqueya or Kiskeya, from the Taíno name for the island of Hispaniola, but that's not widely known among non-Dominicans.)

I can think of two other cases where a nation's toponymic adjective gets informally pressed into service as a noun as well (though neither uses the word "the" as in "the Dominican"). One is "Czech," which according to a recent article in the Prague Daily Monitor has been suggested as a one-word English designation for the Czech Republic. For instance, during the Winter Olympics, players for the national hockey team wore the word "Czech" on their uniforms. That's a symptom of the Czech Republic's difficulties in finding a snappy toponym ever since the official breakup of Czechoslovakia thirteen years ago. Even in the Czech language, the officially sanctioned appellation Česko has had decidedly mixed success, though it seems to have gained more acceptance over the past few years. Meanwhile, the English equivalent selected by the Czech Terminological Commission in 1993, "Czechia," has never really caught on. Neither has "Czechlands" or "Czecho," two other alternatives mentioned in the Prague Daily Monitor article. So it's not too surprising that "Czech" should step into the gap (even though the Wikipedia entry for "Names of the Czech Republic" sternly says that those who use "Czech" as the English name for the country do so "wrongly"). Michael Farris, a longtime resident of Poland, reluctantly admitted on the sci.lang newsgroup that this is his preferred informal usage:

My very inelegant usage is:

the Czech Republic (formal)
Czech (informal)
as in:
I haven't been much in Czech, just Prague.
They went to Czech to go skiing.
I'm not crazy about it but I've heard others use it as well.
If I saw/heard more people using Czechia I would too, but til then ...

The other case that springs to mind is "Saudi," sometimes used as shorthand for "Saudi Arabia." This is a bit different from stripping "Republic" from "(the) Dominican/Czech Republic," as "Arabia" is itself a perfectly serviceable toponym prefixed by an adjective derived from the ruling House of Saud. But using "Arabia" as a one-word name for "Saudi Arabia" doesn't really fly, since that usually refers to the entire peninsula. (Conflating "Saudi Arabia" with "Arabia" would annoy Yemenis, Omanis, Emiratis, Qataris, Bahrainis, and Kuwaitis about as much as calling the U.S. "America" bugs Canadians. On the other hand, opponents of the House of Saud often refer only to "Arabia" so as to delegitimize the Saudi regime.) The use of "Saudi" for "Saudi Arabia" seems to have been popularized by military types and expats working there, though it has spread beyond these circles, for instance to headline writers. Here are some recent examples indicating its common use in headlines both locally in the Gulf states and internationally:

  • "28 survivors of ferry tragedy land in Saudi" (The Peninsula [Qatar]/AFP, 2/5/06)
  • "Be patient, UK PM's wife tells women in Saudi" (Gulf Times [Qatar], 2/13/06)
  • "No inflation pressure in Saudi, says minister" (TradeArabia [UAE], 2/16/06)
  • "NRK on death row set to return from Saudi" (The Peninsula [Qatar], 2/19/06)
  • "Pinoy workers seek immediate repatriation from Saudi" (Sun Star [Philippines], 2/19/06)
  • "Two Indians held in Saudi to be freed" (The Hindu [India], 2/22/06)
  • "Al-Qaida threatens more attacks in Saudi" (Houston Chronicle/AP, 2/25/06)
  • "Oil disaster averted in Saudi" (UPI, 2/27/06)

The article from the Philippines about Pinoy (i.e., Filipino) migrant workers in Saudi Arabia is one of many such examples. As an informal label, "Saudi" can evidently have an even broader denotation among Filipino workers. I came across a 1997 article in Asiaweek that says that "Saudi" is "the overseas workers' term for all Arab states."

By far the largest proportion of online discussion about "going to Saudi" or coming "back from Saudi" derives from Americans who have worked in the Gulf, particularly those stationed there in the military. I would surmise that the usage was first popularized during Desert Storm, as it has the feel of military shorthand. Even before the war when Operation Desert Shield was getting underway, the toponym "Saudi" cropped up in military use: an Army Times headline from Nov. 26, 1990 reads, "We're Going to Saudi."

[Update #1: Dane Bell reports that he has heard his mother using "Saudi" for "Saudi Arabia" for at least the last 20 years when describing her life there 40 years ago. So expatriate usage of "Saudi" clearly extends back long before the Gulf War. Stephen Jones corroborates this, recalling that "Saudi" was in standard use among expats before the war. Graham Curran can date it back to 1974 from personal experience.]

[Update #2: Comments continue to pour in...

From Ben Sadock:

I myself have been hurting for a way to refer to the Czech Republic since 1993, and I've noticed that a lot of people seem to have settled on Czechoslovakia even when they know better. Others have started saying "Czech Republic" with no article, a la Ukraine, (which I also have trouble saying; I feel like I'm making fun of Slavs by omitting articles.) In any case, this is an interesting phenomenon. I think your Google searches do a decent job of showing that people (Dominicans among them) are referring to the country, and not just the team as "the Dominican," but they aren't doing so due to a lack of alternatives (or the obscurity of Quisqueya). I live near a large Dominican community, and they in fact do have a casual way of referring to their erstwhile home: the D. R., modelled perhaps on P. R. for Puerto Rico.
Me, I'm looking forward to the day when we start calling the Netherlands "the Nethers."

From Richard Hershberger:

I am, in my spare time, a 19th century baseball history geek. In your discussion of "playing for the Dominican" you discuss, and dismiss, the possibility that this is an elliptical form of "playing for the Dominican team". You are undoubtedly correct. It would be virtually unheard of in present-day English. But as a possibly interesting aside, in the 19th century that would have been a standard usage (substituting "club" for "team").
A common pattern of club names was, for example, "The Athletic Base Ball Club". In journalistic usage a player might be said to be playing "for the Athletics" as in modern usage, but also might be "playing for the Athletic Club" or the elliptical form "playing for the Athletic". Club names were routinely in the singular, though the plural might be used as appropriate.
Nowadays this usage is unheard of. The modern Oakland team uses both "Athletics" and "A's" but never "Athletic". I wonder if that form still appears in legal contexts, but I don't know. Modern team names are almost always in the plural, and when something other than a count noun is used (e.g. the NBA's "Jazz") this is criticized.

From Luis Rodrigo Gallardo Cruz:

> Conflating "Saudi Arabia" with "Arabia" would annoy Yemenis, Omanis,
> Emiratis, Qataris, Bahrainis, and Kuwaitis about as much as calling the
> U.S. "America" bugs Canadians.

Hey! It bugs us LatinAmericans too, you know!

From Jacob Lubliner:

Actually, Dominicans do use, among themselves, a simple (if not exactly single-word) toponym for their country; it's Santo Domingo. True, it's the name of the capital, but so are México, Guatemala and Panamá. The adjunct "City" is an anglicism, and its equivalent "Ciudad de ..." is used only rarely; people resort to other tricks for distinguishing capital from country.
The Czechs' problem with naming their country, in their own language or in others, seems to stem from the fact that Čechý (and its adjective český) historically refers only to that part of the Czech lands that non-Slavs know as "Bohemia" (or some version thereof). In Russian, Bohemia is called Chekhiya, which creates a bit of a problem for English Czechia, but Tschechien seems to have become well established in German. I try to use "Czechia" as much as possible, raised eyebrows be damned.
Now, as far as I know, in informal Arabic Saudi Arabia is usually called simply as-Sa'udiyeh, a form that has the advantage of being either an adjective or a noun. But whether Yemenis etc. would be annoyed by SA being called simply "Arabia" is questionable; I haven't heard of Algerians, Tunisians or Mauritanians complaining about Morocco's Arabic name, which is al-Maghrib. As I'm sure you know, synecdoche is quite common in toponymy.

From Andrew Gray:

I note the discussion on countries referred to by only part of their name. As a datapoint, I'm Scottish, and used to referring to Ireland as "the Republic".
This usage, I think, is picked up from my grandmother, who was brought up in Belfast just after the south became independent; I can't remember offhand if my other relatives there use it. I wonder if this just represents another case of the Saudi situation - the name has a geographic and a political part, but shortening it to the geographical name would be either confusing or inappropriate, so the political one gets used when people need a shorthand.
An interesting thought, this matter of terminology. Is it - or will it be - common in West Africa for people to be "from Ivoire"?

Posted by Benjamin Zimmer at 12:35 AM

March 02, 2006

Tracking snowclones is hard. Let's go shopping!

Michael Kaplan recently offered up a snowclone in need of investigation:

X is hard. Let's go shopping!

He was unsure of its origin, but a commenter on his blog wrote:

"Math is hard, let's go shopping" came from a talking Barbie doll. Mattel got a lot of heat for that one (people believed it made the science and math gap between males and females worse). Later models of the Barbie had the phrase deleted.

Well, it's a bit more complicated than that. Since talking dolls seem to lend themselves to the embellishments of urban folklore, I thought I'd track down the origins of this putative Barbie-ism.

It all started in 1992, when Mattel Inc. rolled out Teen Talk Barbie, the first talking Barbie doll in two decades. Each Teen Talk Barbie could say four phrases, selected from a total pool of 270 randomly programmed sayings. The doll was unveiled at the American International Toy Fair in February of '92 with great fanfare. The Washington Post reported on the unveiling with barely concealed sarcasm:

So here we are - a curtain rises, a stage appears and it's TEEN TALK BARBIE!
"You look great!" Barbie says. "Want to go shopping? Okay, meet me at the mall."
A thin, leggy blond woman (one of scores of such creatures brought in for the fair) introduces the chatty new doll and all her new clothing. The woman, with her hair and enthusiasm, is Barbie. She is holding a Barbie. The two Barbies are having a conversation, except the doll is doing most of the talking.
"Let's have a campfire!" says the doll. "We could take a Hawaiian vacation! Wouldn't you love to be a lifeguard?"

The doll began to be sold to the general public in July, and on Sep. 25 the Wall Street Journal broke the news of some serious objections about one of Teen Talk Barbie's utterances:

Barbie is a troublemaker.
The latest incarnation of the popular doll is posing a bit of a marketing problem for maker Mattel. Specifically, the Teen Talk Barbie Doll quips (among other remarks) that "Math class is tough." That has angered some women educators, who say they are already fighting to sustain schoolgirls' interest in math and science.
"If Barbie gives the message that math is tough, Barbie could be turning off girls to math and science, and that's a mistake," says Anne Bryant, executive director of the American Association of University Women, a research group that issued a report this year titled, "How Schools Shortchange Girls." The report found that, although girls and boys both do well in math prior to sixth grade, boys are more likely to score better and take higher-level courses after that time.
Judy Blitch, chairwoman of the education department at Wesleyan College for women in Macon, Ga., says she and a dozen students called Mattel's 800 number to protest. "We were concerned," says Prof. Blitch, who bought the doll for her five-year-old daughter. "We need to do whatever we can to encourage girls in that area. Barbie represents an important part of girls' lives and can influence thinking."
But Mattel mailed Prof. Blitch a reply that didn't address the issue. It doesn't plan to change the doll, which sells for about $25 and says four phrases. There are 270 sayings randomly programmed into different dolls, such as "Let's have a beach party" and "Don't be late for school." Says a spokeswoman: "I don't think math was chosen because it's more or less difficult. You can also get a doll that says, 'I'm studying to be a doctor.'"

Once the story was picked up by news outlets nationwide, however, Mattel soon caved in to public pressure. As the New York Times reported on Oct. 21, Mattel decided to alter the computer chip in Teen Talk Barbie to remove "Math class is tough" from the programmed selection of sayings.

So Barbie did talk vacuously about the difficulty of math ("Math class is tough") and the fun of shopping ("Want to go shopping? Okay, meet me at the mall"), but she never put the two sentiments together in one utterance. (I don't know if she ever said the exact words "Let's go shopping," either. More likely that was a confusion with another toy marketed to girls in the early '90s: the Pressman Toy Corporation's board game, "Let's Go Shopping.")

That might have been the end of things, but around Christmastime of 1993 came word of a shadowy organization known as the Barbie Liberation Organization, dedicated to switching computer chips between Teen Talk Barbie and her gender-stereotype opposite, Talking Duke G.I. Joe. (More about the BLO here, though the page inaccurately dates its founding to 1989.) The New York Times detailed the BLO's activities in a Dec. 31, 1993 article:

Your son tears the wrapping paper off his fierce new "Talking Duke" G. I. Joe doll and eagerly presses the talk button. Out comes a painfully chirpy voice that sounds astonishingly like Barbie's saying, "Let's go shopping!"
Does your son:
A) Furiously vaporize the doll with his own phaser rifle?
B) Go shopping with Joe?
C) Say: "Mom, I suspect we're the lucky victims of an elaborate nationwide publicity stunt designed to ridicule sexual stereotyping in children's toys. This barbaric little action figure you gave me may turn out to be a valuable collector's item."
If the answer is C, your son may be a collector's item himself, for he has correctly divined the latest socially conscious news media prank to hit the nation's toy stores.
For the last several months, a group of performance artists based in the East Village of Manhattan has been buying Talking Dukes and "Teen Talk" Barbies, which cost $40 to $50 each, painstakingly swapping their voice boxes and then, with the aid of cohorts, replacing dolls on the shelves of toy stores in at least two states.
The group, which asserts it has surgically altered 300 dolls, says its aim is to startle the public into thinking about the Stone Age-world view that the dolls reflect.
The result is a mutant colony of Barbies-on-steroids who roar things like "Attack!" "Vengeance is mine!" and "Eat lead, Cobra!" The emasculated G. I. Joe's, meanwhile, twitter, "Will we ever have enough clothes?" and "Let's plan our dream wedding!"

Publicity for the BLO's campaign of doll terror thrust Teen Talk Barbie back into the news and also resurrected the "Math class is tough" controversy from the previous year. Further proof that Teen Talk Barbie's inanities had struck a chord in the pop-culture consciousness came when The Simpsons aired an episode on Feb. 17, 1994 titled "Lisa vs. Malibu Stacy." In the episode eight-year-old Lisa launches a campaign against Barbie-esque Malibu Stacy when a talking version of the doll is released. Here are some of the things Malibu Stacy says (thanks to The Simpsons Archive):

I wish they taught shopping in school.
Let's bake some cookies for the boys!
Don't ask me; I'm just a girl <giggle> <giggle>.
Now let's forget our troubles with a big bowl of strawberry ice cream.
Thinking too much gives you wrinkles.
My name is Stacy, but you can call me <two-note wolf whistle>.

A few months later, the canonical form of the pseudo-Barbie-ism "Math is hard, let's go shopping" finally showed up in Usenet discussion, when Nick Fitch used it on May 27, 1994 in the gay and lesbian newsgroup soc.motss. But it took another couple of years for the expression to be snowcloned by Usenetters. The earliest example I've found is in a Jan. 30, 1996 Usenet post ("College Bowl is hard ... let's go shopping!"). On Apr. 25 of that year, one contributor to soc.women.lesbian-and-bi who was stuck on a particular bit of phrasing made the tongue-in-cheek comment, "Language is hard. Let's go shopping." (Another contributor responded, "Shopping is hard. Let's go water skiing.") But 1997 was when the snowclone started to achieve popularity on Usenet, particularly on soc.motss where it became an in-joke for some of the regulars (among the items placed in the snowclone slot that year were "lit crit," "sociology and metaphysics," "situational ethics," and "sacrifice").

It wouldn't take long for the snowclone to move beyond newsgroups like soc.motss and into wider online usage. Nowadays just about anything can fill the slot of "math," from property to navigation to democracy to development theory to the programming language LISP. Michael Kaplan's post that set off this investigation stuck address formats in the slot. And the fact that Kaplan used it without being quite sure of its origins suggests that the snowclone has moved into a new phase, where the template can spread without being considered a direct allusion to something. Rather, use of the snowclone can now simply allude to other instances of the snowclone. How very meta.

[Update, 3/11/06: Barbie's thoughts on math and shopping also cropped up in a mid-'90s video game. See here for details.]

Posted by Benjamin Zimmer at 03:42 PM

March 01, 2006

On not emerging unscathed

Soon after ABC News posted a transcript of Elizabeth Vargas's breezy (and often vapid) interview with President Bush, bloggers were quickly picking apart every "um" and "ah" faithfully rendered by the network's transcribers. Bush was slammed on the usual grounds of disfluency: he's "incapable of speaking in complete sentences," he displays a "deteriorating ability to speak English," and so forth. Rather than piling on, I'd like to examine something peculiar uttered by Vargas, not Bush. It's a case of overnegation followed by a self-repair that only makes matters more confusing.

In one of the interview's more serious moments, Vargas asks about how Dick Cheney has been doing since his hunting accident. The transcript reads:

VARGAS: Do you think it's changed him?

BUSH: Um, I'm confident it changed him some how, you know. I, I think it shook him, and any time you get shaken like that, it's gotta have some effect on you.

VARGAS: He called it one of the worst days of his life. I don't think you can endure something like that without emerging unscathed, or changed.

Vargas starts off her sentence with a negative construction: "I don't think you can endure something like that..." The opening is then offset by introducing another negative, "without..." (This is not unlike the rhetorical figure of litotes: the expression of an affirmative by negating its contrary.) Those two negatives carry her along into yet another one: "emerging unscathed." But this is one negation too far, since she means to say that Cheney couldn't endure the experience and emerge unscathed. Perhaps mindful that she has negated herself into a corner, Vargas adds a positive alternative, "...or changed," as a kind of self-repair. Unfortunately, though, she ends up conjoining a negative ("unscathed") with a positive ("changed"), treating the two terms not as antonyms but as somehow equivalent. She does not emerge from that muddled sentence unscathed, but Bush carries on without missing a beat ("Yeah, yeah, exactly...").

It's easy to see how Vargas ended up mired in her troublesome overnegation. She falls back on an idiomatic phrase, "emerging unscathed," even though what she needs is its opposite. But as with many idioms, there's no easy way to negate it ("emerging ununscathed"? "emerging scathed"?). So instead she performs a Porky-Pig-style substitution and switches out of the vexing idiom, though the switch happens rather too late to be effective.

A crystallized negative form without an obvious positive alternative like "(emerge/escape) unscathed" seems particularly prone to overnegation. Here are a few more examples from an online search:

No one enters the zone of his imagination without emerging unscathed. (Bad Subjects)

So much of the landscape around us has changed, and so have we, since it's nearly impossible to experience college without emerging unscathed or unaltered. (Franklin & Marshall College Dispatch)

Not without escaping unscathed, working alongside the General is Emil Blonsky, a man bent on destroying all specimens spawned of gamma rays. (Epinions)

What Sher dramatizes would indicate the virtual impossibility of emerging without being psychologically unscathed. (Wolf Entertainment Guide)

No man would go through that kind of mixed paradise and hell without escaping unscathed. (Rory V. Pascual)

No country in the world will be able to flatten Israel without itself going unscathed. (soc.culture.usa)

I found myself rereading these examples several times to make sure that they were actually cases of overnegation. Without self-conscious monitoring, such constructions happily remain in our heads, ununpacked.

Posted by Benjamin Zimmer at 11:51 PM

Counting freedoms, Simpsons and percentages

[More here on the way the survey numbers were spun.]

According to a piece that aired today on NPR's Here and Now ("Freedom and The Simpsons"):

A new survey conducted by Chicago's McCormick Tribune Freedom Museum, which has yet to open, finds that only 28 percent of Americans are able to name one of the constituational [sic] freedoms, yet 52 percent are able to name at least two Simpsons family members.

If you listen to Robin Young interview Dave Anderson in the cited segment of the show, you'll learn that the "five freedoms" guaranteed by the First Amendment are religion, speech, press, petition and assembly -- and that only 1% of Americans surveyed could name all five of them, while 22% could name Homer, Marge, Bart, Lisa and Maggie.

But counting the Five Freedoms is confusing, after all: the Bill of Rights has 10 amendments, but the First Amendment covers 5 freedoms. And the wording of the First Amendment only mentions 2 "freedoms" as such (speech and press), plus 2 "rights" (assembly and petition); and religion gets mentioned twice (no establishment of it, no prohibition of it), but only counts as one "freedom":

Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the government for a redress of grievances.

Then there are all sorts of other freedom-y things in the other nine amendments in the Bill of Rights, like the right to bear arms, the freedom from unreasonable searches and seizures, freedom from forced self-incrimination, freedom from cruel and unusual punishments. And then there are later amendments that have a lot to do with freedom, like the abolition of slavery in the 13th amendment. I bet that lots of people started listing things like those, and then realized that there are a lot more than five of them, and that some of them are more rights than freedoms, and ...

No, it's not surprising that Americans have trouble with the arithmetic of freedom, although it would be great if everyone could reel off religion, speech, press, petition and assembly with just as much facility as they can name Homer, Marge, Bart, Lisa and Maggie. A more surprising problem with arithmetic came in this passage:

Robin Young: Well, a- and when you say one percent, you mean you talked to a thousand people, only one in a thousand people knew all of the five freedoms?
Dave Anderson: That's correct.

[Update: There's a bigger numerical problem with this report: the museum's survey actually says that "72% were able to name at least one of these rights correctly, [but] this fell to only 28% who could name two or more"; Robin Young renders this as "only 28% of those asked were able to name *one* of the freedoms, yet 52% could name at least *two* of the members of the Simpsons family".]

OK, this is kind of unfair, and I vacillated before posting about it. Robin Young co-hosts an hour-long radio show every day, day in and day out. It's a fine show, one that I often listen to. And everybody says dumb things from time to time, especially me. Several times a semester, someone comes up to me after a lecture and points out that I said something that didn't make any sense at all, and I realize that I swapped two words around, left out a not, or said log when I meant exp, or something like that. So I'm reluctant to beat up on Robin Young for defining "one percent" as "one in a thousand".

But what about Dave Anderson? Was he just disposed to say "that's correct" no matter what nonsense Robin Young attributed to him? Or is he also somewhat confused about the meaning of the statistics that he's peddling?

And Here and Now is not a live show. Presumably several people heard this piece before it aired. Does the fact that this blooper got through tell us something about the quality of editing (or the degree of mathematical literacy) at NPR? No, let's just chalk it up to the fact that everybody makes mistakes, especially when speaking ex tempore.

[Wow. The confusion is spread wider and deeper than I thought. According to the UPI wire service story,

More jarring is that 22 percent of those polled can name all five characters -- Homer, Marge, Bart, Lisa and Maggie -- but just 1-in-1,000 people surveyed -- 0.01 percent -- were able to name all five freedoms.
The random telephone survey of 1,000 adults was conducted in January, and has a 3-point margin of error.

So this story makes an order-of-magnitude error in the opposite direction ("1-in-1,000" is given as "0.01 percent", though it's actually 0.1 percent) -- and also reports a rate of "1-in-1,000" for a survey that has a 3-point margin of error.

The press release from the McCormick Tribune Freedom Museum says "just one in 1,000 people surveyed (.1 percent)". So it's not their fault, except for featuring a 0.1 percent value from a survey with a nominal plus or minus 3.0 percent margin of error.

A longer form of the survey results is here. A quick read shows that there should be some very embarrassed people at Here and Now. The actual survey results were that 72% of the respondents could name at least one first amendment freedom, not 28%. The figure of 28% was for naming at least two of the five. The press release says that "only about one in four Americans (28 percent) are able to name more than one of the five fundamental freedoms granted to them by the First Amendment to the U.S. Constitution". This is misleading and tendentious writing, characteristic of the way that PR people spin the presentation of facts to promote the interests of their clients. In this case, the press (at least someone at Here and Now) swallowed it hook, line and sinker, and wound up mis-stating the true percentage by more than a factor of two (28% rather than 72%).

And the document confirms that most of the people surveyed were able to list a large number of constitutionally guaranteed rights and freedoms -- the right to bear arms, the right to a trial by jury, and so on. They're just not keeping track of which rights and freedoms are covered by the First Amendment as opposed to other provisions. It's hardly fair for members of a profession that thinks 1 in 1,000 is 1 percent (or maybe 0.01 percent), and that renders 72% as 28% when summarizing a document right in front of their face, to get all indignant about this public "ignorance". ]
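For anyone who wants to check the arithmetic in the note above, here is a minimal sketch. The function names are mine, and the margin-of-error formula is the textbook one for a simple random sample at roughly 95% confidence, which is an assumption: the survey's own method for arriving at its "3-point" figure isn't stated.

```python
import math

def as_percent(count, total):
    """Convert 'count out of total' into a percentage."""
    return 100.0 * count / total

def margin_of_error_points(n, p=0.5, z=1.96):
    """Textbook margin of error, in percentage points, for a simple
    random sample of size n (assumed method, not the survey's own).
    p=0.5 is the worst case; z=1.96 corresponds to about 95% confidence."""
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

one_in_thousand = as_percent(1, 1000)   # 0.1 percent, not 1 and not 0.01
moe = margin_of_error_points(1000)      # roughly 3.1 points
misstatement = 72 / 28                  # how far off "28%" is from "72%"

print(f"1 in 1,000 = {one_in_thousand:.2f} percent")
print(f"margin of error for n=1000: about {moe:.1f} points")
print(f"72% vs 28%: a factor of {misstatement:.2f}")
```

Under those assumptions the nominal margin of error comes out near 3 points, about thirty times the size of the 0.1 percent figure being featured, which rests on a single respondent out of 1,000.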

Posted by Mark Liberman at 08:27 PM

Crazy talk

Tim Leonard recently pointed me to Joel Spolsky's use of the phrase crazy talk. Evaluating applications for summer internships, Joel wrote:

We did get a lot more good applications. Really good applications. Not just kids from Indiana. Students from all over. Illinois. Missouri. Well, OK, maybe not Missouri. Missouri is crazy talk.

Reading this, Tim "wondered what 'x is crazy talk' was taken from". A web search didn't uncover the answer, but he learned that folks over at the Joel on Software Discussion Group have been talking about the same thing. One participant suggests that "It sounds like dialog from an old western." Another agrees that "The way Joel phrased it, it seems like a reference to something.  We're just curious what it's a reference to."

Tim was mainly curious about how we know that a phrase like this is somehow special -- a snowclone or an idiom -- without knowing its source and perhaps without ever having heard it before. For now, I won't try to add anything to what I've written on related topics here, here, here, and here. Those posts are mostly about our hyper-sensitivity to infrequent but striking usages. As I point out, this sensitivity is sometimes an appropriate generalization from a very small number of examples, perhaps as few as one. (Amazon's "statistically improbable phrases" feature aims to imitate this ability.) Arnold Zwicky has called this the Frequency Illusion in the cases where we interpret this feeling quantitatively, concluding that some individual or group uses a word or phrase or construction "all the time" or "hundreds of times", when in fact the real count may be in the low single digits.

The cited posts don't give a satisfactory answer to Tim's question, but at least they connect it to a number of other puzzling phenomena. (If this makes you feel better, as it does me, is it an example of the "Generalization Illusion"?) For now, let's pursue the specific case on the table: crazy talk.

This is certainly an expression that I've heard before. So have many others: Google yields 514,000 hits for {"crazy talk"}, Yahoo gives 523,00 and MSN gives 63,425. These are not enormous numbers -- Google claims 632,00 hits for {"language log"} -- but the phrase is definitely out there. Everything2 has an entry for the phrase "that's just crazy talk", explaining that it's "A useful all-around phrase when the person you're talking to just doesn't make sense. Or else needs to be shut up with a more polite phrase than 'mind your own business'".

It's pretty clear that "crazy talk" is a fixed expression, what people sometimes call a collocation. This is a bit different from an idiom, in that the meaning is pretty much transparent as a function of the meanings of the words that make it up -- unlike "red herring" or "kick the bucket" -- but the phrase is still favored over available alternatives. Thus it would have been surprising, I think, for Joel to have written "Missouri is crazy words". In contrast, we usually say or write "fighting words", not "fighting talk":

               __ words    __ talk
crazy __         49,700    514,000
fighting __     673,000    252,000
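The counts in the table above can be boiled down to a couple of ratios. This is only a rough sketch: the names are mine, the ratios are a crude stand-in for a proper association measure, and raw search-engine hit counts are notoriously unreliable.

```python
# Web hit counts copied from the table above.
hits = {
    ("crazy", "words"): 49_700,
    ("crazy", "talk"): 514_000,
    ("fighting", "words"): 673_000,
    ("fighting", "talk"): 252_000,
}

def preference(modifier, preferred, other):
    """How many times more often 'modifier preferred' turns up
    than 'modifier other' in the counts above."""
    return hits[(modifier, preferred)] / hits[(modifier, other)]

crazy_talk_pref = preference("crazy", "talk", "words")
fighting_words_pref = preference("fighting", "words", "talk")

# Cross-product (odds) ratio for the 2x2 table: values well above 1
# mean the two adjectives pull toward different nouns.
odds_ratio = crazy_talk_pref * fighting_words_pref

print(f"crazy talk vs. crazy words: {crazy_talk_pref:.1f} to 1")
print(f"fighting words vs. fighting talk: {fighting_words_pref:.1f} to 1")
print(f"odds ratio: {odds_ratio:.1f}")
```

On these numbers "crazy talk" outruns "crazy words" by about ten to one, while "fighting words" leads "fighting talk" by less than three to one.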

We can also observe that "crazy talk" takes first-word stress (at least in my idiolect), like some other Adjective+Noun fixed expressions such as blackboard, bluebell, high school, as well as cases where the apparent adjective may be doing duty for a noun, such as medical student, etc.

Whatever its analysis, where did the phrase "crazy talk" come from? There are a couple of recent sources that might have been the trigger for some people.

One is William Wyler's 1961 movie The Children's Hour, adapted from Lillian Hellman's 1934 play of the same name. One version of the crucial passage is this:

Martha: There's always been something wrong. Always, just as long as I can remember. But I never knew what it was until all this happened.
Karen: Stop it Martha! Stop this crazy talk!
Martha: You're afraid of hearing it, but I'm more afraid than you.
Karen: I won't listen to you!
Martha: No! You've got to know. I've got to tell you. I can't keep it to myself any longer. I'm guilty!
Karen: You're guilty of nothing!

(The phrase "crazy talk" doesn't occur in the original stage play.)

A more recent source is a 2000 episode of The Simpsons, Bart to the Future. Bart has snuck into an Indian casino, but the casino security men capture him and bring him to the manager's office.

Manager:   If you want to see the future, throw a treasured personal item onto the fire.
           [Bart tosses a small object, which explodes with a bang]
           Not a firecracker!
Bart:      Hey, I bought it from a guy on your reservation.
Manager:   That's Crazy Talk.
Bart:      No, it's true.
Manager:   No, I know, that's my brother, Crazy Talk.  We're all a little worried about him.

But the phrase goes back beyond Bart and Karen. Here is some dialogue from Martha Gellhorn's 1948 novel Point of No Return (p. 296):

"He says that he regrets he didn't have a machine gun instead of a jeep. He also says that he is sorry the war is over because if the war were not over he could volunteer to operate a flame thrower. He said that he only killed one German in the war; being a jeep driver he didn't have a chance at them. That's the sort of thing he says."

"That's crazy talk."

"He doesn't sound crazy."

And Cecil B. De Mille's 1908 vaudeville play "The Royal Mounted" includes this haunting passage:

No! I only want a fair chance. I love you on the square and I want you ter gi' me a show. There ain't no one else, is there?
(Moves away from him. Crosses R.)
No, there's no one else. I've got to fight you alone.
Now that's crazy talk. I ain't fightin' you and I don't want to.
(Crossing to her at R.C.)
I'll make it easy as I can for you, Rosa---if you'll only give me the chance---but there ain't ter be no one else, is there?
Not unless someone comes out of the wilderness---and that's not likely.

And skipping many late 19th century citations, there's Josiah D. Canning's 1838 Epistle to a Brother in Virginia:

73 ... the sick I found
74 Rose in delirium around,
75 And many, too, within the bound
76 Of an hour's walk;
77 Their ravings in my ear did sound
78 Like crazy talk.

And ten years earlier, there's an anonymous (?) story titled Andrew Cleaves, published in 1828 in The Atheneum; or, Spirit of the English Magazines:

She wished, when their time came, they might lie half as quiet in their graves as old Andrew did in his, for all their nonsensical crazy talk about his walking o' nights.

Without trying to track the expression any further back, I conclude that crazy talk is a collocation without any particular source, though it may have been re-seeded from time to time by striking examples in popular culture.

[Note that Neil Postman's 1976 book Crazy Talk, Stupid Talk established crazy talk as a philosophical term of art, for a "form of collectivized nonsense" that "requires that we be mystified, suspend critical judgment, accept premises without question, and (frequently) abandon entirely the idea that language ought to be connected with reality", even though it "usually puts forward a point of view that is considered virtuous and progressive". Some discussion and quotes can be found here. I believe that Postman's usage is at best loosely connected to the ordinary-language meaning of the term, which seems to me to center around the literal meaning "talk that is crazy, talk that doesn't make sense, nonsense", and includes most examples of what Postman would assign to his contrasting category of "stupid talk".]

[As someone whose father was born and raised in St. Joseph, Missouri, I sympathize with those who were offended by Joel Spolsky's implication that Missouri is an unlikely source for good internship applications. But this is Language Log, not Regional Stereotypes Log.]

Posted by Mark Liberman at 08:55 AM

The revolving door has my head spinning

Speaking of the transformation of Senator John Thune (Rep., S. Dakota) from K Street lobbyist to senator, Sheryl Gay Stolberg writes in the 2/28/2006 New York Times, “It might be said that John Thune went through the revolving door – backward… Mr. Thune’s experience has put a spotlight on what some experts call ‘the reverse revolving door.’”

It’s easy to see what Ms. Stolberg intends by the “reverse revolving door” because we’re familiar with the revolving door as a characterization of the frequent passage from government official to lobbyist. What’s less apparent is why the trope works in the first place. It’s in the essence of a revolving door to permit simultaneous traffic in both directions. So what on earth could a reverse revolving door be? (Switching from clockwise to counter-clockwise might be momentarily off-putting, but it wouldn’t affect the direction(s) of traffic flow.) Once we notice the non-fit of visual image to intended meaning, it’s hard to fathom how the expression revolving door ever got started as a way to denote the one-way traffic from officialdom to K Street. It looks like this may be just another of those colorful tropes in which the physical image doesn’t match the intended concept, like falling between the cracks or back to back. Does anyone or anything normally follow another of the same kind facing backwards? And even if so, what about the more recent back to back to back (126,000 Google hits)? Visualize that, if you can.

We’re also familiar with tropes that break loose from originally sound moorings. And perhaps we can think of them as being related to those just considered – ones that have initially shaky physical foundations, like the revolving door. My favorite involves the French expressions faire long feu and ne pas faire long feu, both of which mean the same thing: ‘to not last, to fizzle out’. The story goes that the original expression was faire long feu ‘burn for a long time’ and evoked a fuse that burns slowly and goes out before igniting the payload. But the image apparently switched, for some, from the defective fuse to the non-occurrence of the explosion, leading to the introduction of the negation. The non-negated, and presumably earlier (and “correct”), version retains about a seven to one advantage in Google hits.

The (ne pas) faire long feu phenomenon seems related in turn to the Mondegreen/nominal egg phenomenon (and laid him on the green misheard as and Lady Mondegreen; an arm and a leg misheard as a nominal egg), in which a bit of a lyric or other familiar phrase is given a new parse. [These reshapings of familiar words and phrases have come to be called "eggcorns", and have their own on-line database and Wikipedia entry.] My personal favorite is something I heard a caller to a talk show say, “I’m an utter incomplete fool. I mean – I’m not even a complete fool!” I wouldn’t have known how the speaker intended the first sentence to be parsed if she hadn’t uttered the second one.

What this rambling may add up to, if anything, is that l’arbitraire du signe and the iconicity principles are always present and always working at cross purposes. To put it briefly, if unscientifically, it’s as if the signs are always trying to get more arbitrary and the people are always trying to make them less so.

Posted by Paul Kay at 06:25 AM