August 31, 2007

For glue the sex rubber mat

Would you buy a data switch whose assembly instructions include the phrase "For glue the sex rubber mat"? If your answer is "yes", [insert joke about Senate Republicans].

The datasheet that shipped with a Cisco switch, reported by Jake Vinson at Worse Than Failure, provides yet another example of translation from the Chinese by dictionary look-up. Some of the items are terms that might be English in some alternative universe, like "commutation machine" in place of "switch". Others are amusing mistakes that we've seen before, like "sex" in place of "type", which turned "one-time-type item" (i.e. "disposable item") into "a time sex thing" on an aisle sign at the Beijing Century Mart. Here it's "for glue the sex rubber mat", which I surmise might be a rubber-type pad to be attached in one of the machine's configurations:

(What do you suppose those "cones" are? "Modules", maybe?)

This next section features another old friend, in the phrase "a fast ether lord fucking net ascending". This learned discussion by Professor Victor Mair will give you a clue about how the f-word snuck in there -- and I suspect that the evocative "fast ether lord" is just a prosaic old "fast ethernet controller":

A larger sample of the datasheet is here:

Can it really be true that this is shipped with a Cisco product? It's understandable to see this kind of thing on a label or menu in a restaurant; it's more surprising to see it in the aisles of a hypermarket; but you'd think that a big-time multinational technology company would pay a little more attention. Richard Eng needs to point his Murciélago northwards, right away!

If you happen to have a copy of the Chinese original of this data sheet (unfortunately Jake Vinson's post doesn't specify the model number), please let me know.

[Update -- a Google search for "give cones change the machine" turns up a number of things, including this. But "glue the sex rubber mat" comes up empty.]

[Update #2 -- Mel Wilson has a better idea about those "give cones":

It seems as though the product is one of a family of ethernet hubs or switches, with variants having 5, 8, 16 or 24 ports; so

5/ 8/ 16/ 24 give cones change the machine

would refer to the switch with the appropriate number of ports. 'Change' might have meant something like 'version', and 'give cones' somehow comes from 'sockets' or 'ports'.

I imagine the product comes with a sheet of pre-punched adhesive rubber feet to stick on the bottom.

Yes, but only for 16/24 give cones change the machine...

Seriously, I'm sure that Mel is right -- "16/24 give cones change the machine" must mean "the 16/24-port version of the machine".

Florian Weimer adds another piece of evidence:

I think the phrase "16/24 give cones change the machine" means "16 or 24 Ethernet ports". The smaller models with just 5 or 8 ports have got external power supplies; only the larger ones need an AC power cable.

Indeed -- add the insight that "change" means "version (of)" and the found poetry becomes a sensible and prosaic observation. ]

[Update #3 -- Brent Eades tries a clever experiment:

Use Google Translate, translate 'For glue the sex rubber mat' into Chinese (Simplified.)

Take the Chinese characters returned, then translate them back into English.

We get: "For adhesion of the rubber pad"

As for "give cones change the machine", repeating the above steps gives us "To change the tube machine". This is surprising; I would have thought Cisco had discovered semiconductors by now.

Finally, "fast ether lord fucking net ascending" gives us "Fast Ethernet main net or hell". Also surprising. And worrisome. ]


[Update 1/10/2008 - Lewis Jardine writes:

I think you've misread something: DailyWTF describes it as a 'Crisco switch' - i.e. a cheap generic one (Crisco being a brand of cooking oil). The same joke as when Homer goes to buy a 'Panaphonics' TV.

Maybe so -- in any case it has always bothered me that the scans did not include any brand indication. But just for the record, Crisco was basically and originally a brand of solid vegetable shortening, a lard substitute made by hydrogenating vegetable oil.]

Posted by Mark Liberman at 06:47 PM

Own, pone, poon, pun, pwone, whatever

Christopher Rhoads ("What Did U $@y? Online Language Finds Its Voice", WSJ, 8/23/2007) describes the straw in the wind that broke the camel's back. Or is it the finger in the dike, for want of which the shoe was lost?

The central question is how to pronounce the characteristic typographical quirks of leetspeak:

Jarett Cale, the 29-year-old star of an Internet video series called "Pure Pwnage," enunciates the title "pure own-age." This is correct since "pwn" was originally a typo, he argues, and sounds "a lot cooler." But many of the show's fans, which he estimates at around three million, prefer to say pone-age, he acknowledges. Others pronounce it poon, puh-own, pun or pwone.

"I think we're probably losing the war," says Mr. Cale, whose character on the show, Jeremy, likes to wear a black T-shirt with the inscription, "I pwn n00bs." (That, for the uninitiated, means "I own newbies," or amateurs.)

Robert Hartwell Fiske is wheeled out for his traditional cameo, guarding the moon from wolves:

"There used to be a time when people cared about how they spoke and wrote," laments Robert Hartwell Fiske, who has written or edited several books on proper English usage, including one on overused words titled "The Dimwit's Dictionary."

But dude! Jarett cares! And he's losing the war for the traditional pronunciation of "pwnage", while you mutter to yourself on the sidelines...

In fact, the article offers several examples of the strength of social norms, even if they're not exactly the norms that Fiske prefers:

"I pone you, you're going down dude, lawl!" is how Johnathan Wendel says he likes to taunt opponents in person at online gaming tournaments. Pone is how he pronounces "pwn," and lawl is how "LOL" usually sounds when spoken. Mr. Wendel, 26 years old, has earned more than $500,000 in recent years by winning championships in Internet games like Quake 3 and Alien vs. Predator 2. His screen name is Fatal1ty.


Mr. Wendel ... says he makes a point of using proper capitalization and punctuation in his online missives during competition. "It's always a last resort," says Mr. Wendel. "If you lose you can say, 'At least I can spell.'"

I'm hoping that he's expressing disdain for noobs who write "pone" instead of "pwn", and so on. I mean, a society is generally as lax as its language. A typical symptom of degeneracy is Rhoads's WSJ article itself, which includes quite a bit of dubious sound-influenced spelling:

In an episode of the animated TV show "South Park," one of the characters shouted during an online game, "Looks like you're about to get poned, yeah!" Another character later marveled, "That was such an uber-ponage."

From the point of view of substantive linguistic description, my favorite part of the article deals with the evolving form, meaning and sound of "teh":

Those who utter the term "teh" are also split. A common online misspelling of "the," "teh" has come to mean "very" when placed in front of an adjective -- such as "tehcool" for "very cool." Some pronounce it tuh, others tay.

However, this seems to be incomplete and perhaps also partly false, at least according to the description in the Wikipedia entry for "teh", which includes examples with verbs ("this is teh suck") and proper names ("teh Jeremy").

For more information about Pure Pwnage language and lifestyles, you might check out an episode.

[Update -- Alexis Grant writes:

I found your material on "teh" in "Own, pone, poon, pun, pwone, whatever" pretty interesting. One thing that stuck out at me was the quoted material 'such as "tehcool" for "very cool." ' No one I know would ever write "tehcool" without a space intentionally. (Actually, I don't know many people who use it at all anymore, but I used to. Now my friends make fun of me for sometimes using it.) So I wondered if that's a typo in LL, a typo in the article, or just a confusion of the author.

I was once referred to with the definite article in front of my name, and I took it as meaning that I was a notable and important instance of the name. I wouldn't say "teh cool" means "very cool", but something more like "an instance of cool that has high cool value". I wonder what units cool value has...

But I don't think Wikipedia has it right either: "that is best" and "that is the best" aren't related in the way that they seem to want them to be, and I don't think "teh [adjective]" really means that something is, e.g. the lamest, or that a person is the coolest (teh cool), but more "the essence of lame" or "the essence of cool".

Of course, it's hard for either Wikipedia or the WSJ to really pin down something that's still so much in flux, so I guess they should be excused for their confusions.

The string "tehcool" with no spaces is how it's rendered in the WSJ article. I don't know if that was a typo or Christopher Rhoads's idea of how to spell it, but the WSJ is generally quite carefully edited, so I'd guess it was his choice. ]

Posted by Mark Liberman at 06:42 AM

August 30, 2007

Is Speaking Arabic Suspicious?

According to news reports, an American Airlines flight was delayed for several hours in San Diego Tuesday night when worried passengers overheard other passengers speaking Arabic. Ironically, the men speaking Arabic were returning to their homes in Detroit after training Marines at Camp Pendleton destined for Iraq. From a linguistic point of view, I guess the silver lining is that the worried passengers were able to identify Arabic.

This unfortunate incident reveals a common confusion of language, ethnicity, and religion. On the one hand, most speakers of Arabic are not terrorist material. On the other hand, potential Islamic terrorists speak many languages in addition to Arabic, including Urdu, Punjabi, Pashto, Persian, Indonesian, Malay, Turkish, Chechen, French, and English. A terrorist group with any intelligence will use people who do not attract attention to themselves.

It isn't surprising that hearing Arabic might make a passenger wonder, but you'd think that a moment's thought would overcome the impulse to notify the authorities in the absence of additional, more suspicious, behaviour. Indeed, insofar as the systems for detecting weapons and explosives work, there isn't really any reason not to allow people who actually do have terrorist sympathies to fly. As long as you know he has no weapons or explosives, you could let Osama bin Laden on the plane. I'd rather have him as a seatmate than many others - at least he doesn't drink.

Posted by Bill Poser at 02:07 PM


This morning Mark posted a paragraph from Russell which read in part:

We are given to understand that a Patagonian can understand you if you say 'I am going to fish in the lake behind the western hill', but that he cannot understand the word 'fish' by itself. (This instance is imaginary, but it represents the sort of thing that is asserted.)

I wonder if someone had been telling Russell about verb roots in polysynthetic languages, which often are what linguists call 'bound', i.e. cannot stand on their own as well-formed words. Rather, they must be inflected with one or more affixes to be well-formed utterances of the language.

Consider, for example, this description of Inuktitut verb stems from a translation of Elke Nowak's book Transforming the Images: Ergativity and Transitivity in Inuktitut, which allows a preview of itself to be nicely Googled up, hooray:

Inuktitut verbs can be formally described as nuclei that attain the status of a free-standing form by the addition of an inflectional ending containing information as to person, number, valence and mood. ...
(3) tukisi-paanga
. 'if s/he understood me'

The stem tukisi- means 'understand', all right, but it can't be uttered on its own; it's not a word. It might not be quite appropriate to say that an Inuktitut speaker wouldn't understand tukisi- on its own, but they certainly wouldn't take it as a meaningful utterance.

Even in English we have several roots with this property of needing an affix or two to make them independently utterable. In our case, they're mostly roots that came into English from a Romance language, but they're now a major subsystem in the vocabulary. Consider electr- for example. It shows up in electr-ic and electr-on and other words derived from those, but it is itself a unit of English morphology with a form and meaning. Or similarly the stem in anxi-ous and anxi-ety, or feroc-ious and feroc-ity, or frivol-ous and frivol-ity.

Those are just to give you the idea -- in some languages, all the roots and stems have this property of needing additional endings to make them possible utterances. The consequence is that, of course, you might not be able to say 'fish' by itself, without saying "my fish" or "I'm going fishing" or "Would you like to fish?"

Russell is right then to wonder what infants acquiring such languages do. In particular, do they invent uninflected forms of these bound roots to use as 'baby talk'? In fact, judging from this abstract, they don't. I don't have time to do any more investigating right now but I'd guess that the kids might use some frequent or default suffix as a generic way to produce morphologically well-formed utterances until they've mastered the full complex system. I'll double-check on that and get back to you.

Update: Pursuant to a comment on the crosspost from Ricky, I went to check on frivol- -- I'd forgotten that it had a free use! It's a back-formation, though, so at least it used to be a bound root. Check out the OED's rather severe usage note:

Posted by Heidi Harley at 12:11 PM

Memetic mutation and traumatic release

Today's Stone Soup deals with a case of linguistic abilities unlocked by trauma: Max has fallen out of a tree on his head, and he suddenly jumps from babytalk to hyper-formal adult sentences:

This reminds me of a story that I heard long ago, about Thomas Babington Macaulay. Robert Herrick and Lindsay Todd Damon, Composition and Rhetoric for Schools, 1905, give this version in a (sample passage titled) "Theme on Macaulay":

Even in his early childhood Macaulay showed evidences of his coming greatness. At the age of four he had learned to read, and at five he had read the entire Bible. As early as this, too, he began to use those long words which are so much in evidence in his writings. Illustrative of this fact is the following anecdote. While he was dining one day with his father and mother at the house of a neighbor the servant upset a cup of coffee on his legs. On his hostess's inquiry as to whether he was hurt, the young Thomas immediately replied: "Madam, the agony is somewhat abated."

The story is presented in a very different way in Randall Jarrell's 1954 novel Pictures from an Institution. Young Derek doesn't speak, but only growls, and his mother consults a specialist, who tells her that

... "talking is simply a matter of maturation," but advised, as a precaution, "a thorough physical check-up." He went on to say that Carlyle's first words, delivered at the age of three, had been what ails wee Jock? -- that Lord Macaulay's first words had not come until he was four; a lady had spilt hot tea on him at a party, and he had said, drawing back a little from her solicitous caress: Thank you, Madam, the agony is somewhat abated.

Dinner has become a party, and coffee has become (more stereotypically British) tea. And the quotation, no longer the rhetorical fruits of precocious Bible-reading, is now the first utterance of a late talker, who needed the stimulus of scalding to unlock his tongue.

Jarrell's version is the one that I heard at some point in my childhood. I recall that this story was brought out by adults in exactly the context described in Jarrell's novel, as a way of offering support and encouragement to parents of late talkers. I don't know whether I heard people repeating something they had read in Jarrell's novel, or whether Jarrell's novel reproduces a piece of pre-existing intellectual culture.

In any case, I doubt that scalding and concussion ever actually increase mental abilities, but there's something intuitively plausible about such stories, perhaps because of the analogy to awakening from sleep.

[Alejandro Satz writes:

I remembered that I had read the story of Carlyle's and Macaulay's first words, almost in the exact words that you copy from Jarrell, in Bertrand Russell's book An Outline of Philosophy. I checked it finding the book in Google Books and making a search of "Carlyle"; the stories come up in page 40. I reckon Jarrell might have copied it from there, or else both him and Russell copied from the same source, because the two anecdotes are there and with very similar wording. Russell's book is from 1927.

In fact, the passage has some independent interest:

Certain philosophers who have a prejudice against analysis contend that the sentence comes first and the single word later. In this connection they always allude to the language of the Patagonians, which their opponents, of course, do not know. We are given to understand that a Patagonian can understand you if you say 'I am going to fish in the lake behind the western hill', but that he cannot understand the word 'fish' by itself. (This instance is imaginary, but it represents the sort of thing that is asserted.) Now it may be that Patagonians are peculiar -- indeed they must be, or they would not choose to live in Patagonia. But certainly infants in civilised countries do not behave in this way, with the exception of Thomas Carlyle and Lord Macaulay. The former never spoke before the age of three, when, hearing his younger brother cry, he said 'What ails wee Jock?' Lord Macaulay 'learned in suffering what he taught in song', for, having spilt a cup of hot tea over himself at a party, he began his career as a talker by saying to his hostess, after a time, 'Thank you, Madam, the agony is abated.' These, however, are facts about biographers, not about the beginnings of speech in infancy. In all children that have been carefully observed, sentences come much later than single words.

The wider context is a theory of meaning based on conditioned reflexes -- I had completely forgotten this aspect of Russell.

Anyhow, I like his observation that "These ... are facts about biographers, not about the beginnings of speech in infancy." And this leaves me wondering about the origin and progress of the biographical tidbit about Macaulay's burn. ]

[Jan Freeman writes:

I dimly recall from college days that the "Stone Soup"/Carlyle storyline -- the hero who is mute as a toddler then suddenly acquires eloquence -- is a folktale motif as well as a later urban legend. Beowulf, maybe? (I'm on deadline, and recovering from a computer catastrophe, or I would do further research.)

But I clipped "Stone Soup" for another reason. The first dialogue balloon illustrates a common agreement problem that baffles many of my peeve-conscious correspondents.

"Max is one of those kids who doesn't talk much . . . but has it all in their heads."

Many people want "one" to govern the following verb, instead of "kids" -- and that makes the rest of the sentence difficult ("has it all in their heads"). But "one of those kids who don't talk much . . . but have it all in their heads," though grammatically correct and simpler, sounds wrong to many people. Can't remember if Language Log has covered this, but it drives my readers wild, whichever version they prefer.

See "One of those who" (6/7/2006) and "One of those people that care(s)" (11/21/2006). ]

[Ray Girvan observes that the Vietnamese hero Giong is a good example of the folktale motif of a mute toddler who suddenly acquires eloquence.]

[Jesse Sheidlower writes:

A favorite literary late-speaker is Charles Wallace Murry, the genius hero of Madeleine L'Engle's A Wrinkle in Time. We are told that he was completely silent until the age of four, when he began to speak in complete sentences, with "none of the usual baby preliminaries." (From memory, but I always found that line memorable.) Later he uses some hard word, is challenged on its meaning, and responds by quoting the definition in the Concise Oxford Dictionary (which definition he disparages). This at the age of five. ]


[Douglas Davidson writes:

A Carlyle/Macaulay-type anecdote is also told of Einstein, e.g. Reuben Hersh et al., The Mathematical Experience, p. 188 (footnote):

Otto Neugebauer told the writer the following legend about Einstein. It seems that when Einstein was a young boy he was a late talker and naturally his parents were worried. Finally, one day at supper, he broke into speech with the words "Die Suppe ist zu heiss." (The soup is too hot.) His parents were greatly relieved, but asked him why he hadn't spoken up to that time. The answer came back: "Bisher war Alles in Ordnung." (Until now everything was in order.)

Allan Wechsler sent in the same anecdote, and adds:

Without a decent research library I can't figure out how reliable the story is, nor who the real source is [...] Perhaps some young intern at Language Log Plaza could be detailed to ascertain the truth of this legend?

I knew there was something missing around here!]

[With respect to the Macaulay quotation "Madam, the agony is somewhat abated", Thomas Thurman writes:

I first encountered this anecdote in a book by the English writer Frank Muir, who added "I think the temptation to spill coffee on such a child must have been quite strong." ]


[8/31/2007 -- it looks like the thread is going to go on for a while:


[Kathleen Burt writes:

It may be that words always precede sentences, but the child may practice in secret as it were. My second daughter's first noticed speech was the sentence "Here comes Joe on his motorcycle." And sure enough here he came. On his motorcycle in fact. My daughter was under a year old, and could not stand up unaided. We are pretty sure she didn't know she was overheard, and was practicing speech. In later years we knew her as a child who insisted on learning everything by herself, with as little input from her elders as possible.

Max's newfound eloquence didn't entirely surprise me, and I have heard the other examples before. I have to say that the example involving Carlyle strikes me as better founded, because "What ails wee Jock?" is the sort of household utterance that would be heard by people who were well-positioned to hear the child's first speech. Something said at a dinner party might be widely reported by people who had no real idea how well the child was able to speak, and family comments that it wasn't really the first thing he'd ever said might not be heard by those who were spreading the story.

In some cases, an infant may be treating a phrase in a holistic way. At a time when my oldest son was using only single words, and not all that many of them, he came out with the phrase "what's going on here?" It was in an entirely appropriate context, but I've always assumed that he'd learned this phrase as a unit, glossed as an expression of disapproving amazement. But I admit that "Here comes Joe on his motorcycle" is a less plausible candidate for that sort of interpretation.

]

Posted by Mark Liberman at 07:52 AM

If you're a fan of cute headlines

Here's your daily dose: "Love triangle kidnap pampernaut preps wingnut defence", The Register, 8/29/2007.

But "wingnut" usually has a political dimension that's missing here -- the story makes it clear that Nowak's defense team is contemplating a plea of temporary insanity, not temporary Republicanism. (Though the way things have been going recently...)

Is frequent use bleaching the politics out of the term wingnut in general, or was The Register's editor just seduced by the "kidnap pampernaut"/"wingnut defense" chiasmus? In any case, the wordplay inspired by the other recent national adult-diaper story has been spiritless in comparison.

[hat tip to David Donnell]

[Askash Mehendale thinks it was the Atlantic Ocean that did the bleaching:

I would suggest a third explanation: that over on this side of the pond (as I am, and the Reg staff mostly are), 'wingnut' means only one of:

- a particular type of nut (as in nut-and-bolt), with protruding wings that make it easier to turn
- a person with protruding ears (by association with the above)
- an insane or merely strange person

In my experience, the last arrived from AmE (if that's where it came from) without any political connotation.

One piece of evidence for this spin on the word is the Microsoft/Peter Jackson joint venture Wingnut Interactive, located in New Zealand (and thus subject to an even larger dose of salt-water bleaching). Certainly there's no evidence that Jackson has right-wing politics that he feels strongly enough about to put into his games-studio name.

However, I don't see a lot of other evidence of wingnut used simply to mean "insane or strange". ]

[Boyd D. Garrett Sr. writes:

During my tenure in the Navy from the mid-70s through the mid-90s, I worked quite a bit with colleagues from the Air Force. We frequently referred to them as "wingnuts," roughly meaning "those insane or strange people who are often organized into Wings" (a major organizational division used by the US Air Force: see the Wikipedia entry). I seem to recall the term being used generically outside any military reference, but I can't provide any citations, unfortunately.

I don't know when this term came to be popularly applied to "right-wingers," but I don't recall hearing it used that way until this century. ]


[George Amis writes (from Santa Cruz):

There's a famous Santa Cruz surfer called Wingnut (his real name is Robert Weaver). He appeared in a surfing film called Endless Summer II. I'd be surprised if he has much interest in politics. The Wingnut apparently refers to his wild and crazed behavior on and off the waves. ]


Posted by Mark Liberman at 06:13 AM

"UAT Instructor Creates Cuneiform and Hieroglyphic Translator"

An announcement at Marketwire claims that Joe McCormack, an instructor at the University of Advancing Technology (warning: awful Flash website with unreadable text and high buzzword density), has created a program that translates English into ancient Egyptian, Akkadian, and Sumerian. The story has made it to Slashdot, and the Marketwire story reports that McCormack is "talking with museums and institutions to garner further exposure".

Let's nip this one in the bud, before the BBC picks it up, and before any ill-informed museums or schools start using this thing. It doesn't work. I don't know much Akkadian or Sumerian, but I do have a fair knowledge of Egyptian, enough to test this translator. If you enter single words, it will often return a reasonable result, though in a number of cases the result is not what I would consider the usual word or spelling. The system doesn't seem to have a very large vocabulary. Quite a few of the words I tried were missing, including both ordinary words like "silent" and names of major Egyptian gods. If, however, you enter sentences, the result is invariably gibberish. This system has no knowledge of Egyptian grammar. The word order is all wrong. The verbs are not inflected. For possessed forms of nouns, which are formed with suffixes in Egyptian, some of the affixes are wrong, and in all cases that I tested they are in the wrong position.

This system is not a translator. It is a crude lexical lookup system, basically just a dictionary. It is probably fun to play with if you don't know the languages, but it is way over-hyped, and if used by schools and museums will be terribly misleading to people who would actually like to learn about these ancient languages.
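
To make the distinction concrete, here is a minimal Python sketch of what a crude lexical lookup amounts to. It is an illustration only, not a description of how McCormack's program is actually built: the function name `lookup_translate` is hypothetical, and the "target" tokens in the toy lexicon are invented placeholders, not real Egyptian, Akkadian, or Sumerian words.

```python
# Toy lexicon. The right-hand tokens are invented placeholders,
# NOT real words of any ancient language.
TOY_LEXICON = {
    "the": "",             # assume the article has no separate target word
    "scribe": "WORD_SCRIBE",
    "sees": "WORD_SEE",    # one bare form; no inflection is ever applied
    "his": "SUFFIX_HIS",   # stands in for a possessive suffix
    "house": "WORD_HOUSE",
}


def lookup_translate(sentence, lexicon):
    """Replace each English word with its lexicon entry, in source order.

    Nothing here knows about grammar: word order is copied from English,
    verbs are not inflected, and a possessive that should be a suffix on
    the noun is emitted as a separate token in front of it.
    """
    output = []
    for word in sentence.lower().split():
        # Words missing from the lexicon are flagged rather than guessed.
        output.append(lexicon.get(word, f"<{word}?>"))
    return " ".join(token for token in output if token)


print(lookup_translate("The scribe sees his house", TOY_LEXICON))
print(lookup_translate("silent", TOY_LEXICON))
```

For single words the output can look reasonable, but for a sentence the placeholder for "his" lands before "house" instead of being attached after it, and the whole string keeps English word order, which is the kind of failure described above.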

Posted by Bill Poser at 03:15 AM

August 29, 2007

Intelligence analysis: football vs. law

I was impressed by a Washington Post article that describes the painstaking labor facing today's college football coaches. The University of Maryland's coach, for example, refers to tapes as his "textbooks." Coaches fill their off-field time analyzing tapes. A new industry of videotaping practice sessions and games has created jobs called "video coordinator" and at least big-time teams are now cooperating with each other by sharing their tapes with opposing coaches. Football tape libraries are springing into existence. This is a far cry from the old days, when teams were forced to have scouts armed with clipboards gather useful intelligence for their upcoming games. But this is Language Log and there has to be a reason why I'm talking about football. So here it is. I see a relationship between taping naturally occurring physical activity (football) and taping naturally occurring language, which may be even more fleeting and harder to capture for later analysis.

Sociolinguists who study language in its natural context know full well how difficult this task can be. We used to audiotape interviews and conversations with people (calling them 'subjects' sounds demeaning somehow). After we got the sample we wanted, we spent hours and hours doing serious analytical work on them. Every sound, morpheme, word, phrase, clause, pause, and false start was a potentially important indicator of information about such things as a speaker's regional background, social status, education, race, gender, age, and attitude. As technology developed, we turned to videotapes because they provided even more information, such as the distance between speakers, non-verbal signals, and other things.

Taking advantage of the same technology, law enforcement began using audiotapes in undercover sting operations in the late seventies and continued to do so for the decades following, finding that this practice provided more convincing evidence of the willingness (sometimes unwillingness) of suspects to commit language crimes. As the technology advanced, these agencies also began to use videotapes. They too found that this was a lot of work. But their main problem was less in gathering intelligence than in analyzing it.

Football coaches have a distinct advantage over law enforcement officers and lawyers. Like linguists, coaches are trained to be experts in their jobs and they are willing to spend night and day analyzing their own and their opponents' skills and strategies. The big difference between analyzing football tapes and analyzing law enforcement tapes is that the police and prosecutors are not trained experts in the very language that the tapes provide them. But linguists are the real experts in this kind of intelligence analysis, which explains why more and more of them are being called upon to analyze the language evidence in criminal and civil cases.

Posted by Roger Shuy at 02:12 PM

What she should have said

The question: "Recent polls have shown, a fifth of Americans can't locate the U.S. on a world map. Why do you think this is?"

The (famous) answer:

I personally believe
U.S. Americans are unable to do so
because uh some uh
people out there in our nation don't *have* maps
and uh I believe that our ed- education like such as in
South Africa and uh the- the Iraq everywhere like such as and
I believe that they should uh
our education over here in the U.S. should help the U.S. or- or- should help South Africa
and should help the Iraq and the Asian countries
so we will be able to build up our future
((for our children))

But what she should have said was:

That sounds like an urban legend to me -- I don't believe that my fellow Americans are so ignorant. According to the Final Report of the National Geographic-Roper Public Affairs 2006 Geographic Literacy Study, "Nearly all (94%) young Americans can find the United States on the world map".

The last time I checked, 100% minus 94% was 6%, not 20%. So what's your source for the claim that 20% of us don't know where our country is on the map?

I don't know where the pageant organizers got "one fifth" from. My guess is that they just made it up -- or more charitably, one of them dimly misremembered a number from one of the periodic hand-wringing "People are so ignorant of X" press releases distributed by the purveyors of X.

Let's note in passing that the National Geographic gave its study the same shocking-ignorance-is-rife spin in its press release, duly picked up by the press (e.g. "Study: Geography Greek to young Americans", 5/4/2006):

The National Geographic-Roper Public Affairs 2006 Geographic Literacy Study paints a dismal picture of the geographic knowledge of the most recent graduates of the U.S. education system.

"Taken together, these results suggest that young people in the United States ... are unprepared for an increasingly global future," said the study's final report.

"Far too many lack even the most basic skills for navigating the international economy or understanding the relationships among people and places that provide critical context for world events."

Needless to say, the 94% number was not featured. In fact, the key paragraph of that news story is the last one:

The release of the 2006 study coincides with the launch of the National Geographic-led campaign called "My Wonderful World." A statement on the program said it was designed to "inspire parents and educators to give their kids the power of global knowledge."

In other words, it's designed to inspire parents and educators to give the National Geographic the power of their dollars.

(I hasten to add that geographical knowledge is a wonderful and important thing, and everyone needs more of it. But let's be clear about what's going on here...)

[Update -- Michael Kwun thinks this might be another example of inability to deal with simple proportions:

I don't know what really happened, but it could be that a question about geographic failings accidentally demonstrated an innumeracy problem.

If 94% can find the US on a map, that could easily be reported or remembered as 95%.

And that would mean 5% can't find the US on a map. Five percent can morph into 1/5 through a simple error in recollection... or because someone doesn't understand how to translate percentages into fractions.

Or you could turn 5% into 1 in 20... which could then similarly turn into 20%, and then 1/5.]
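The two routes by which 5% could turn into "a fifth" are easy to check. Here is a minimal sketch in Python; the helper name `as_fraction` is only for illustration, and the 94% figure is the one quoted from the National Geographic-Roper report above.

```python
from fractions import Fraction


def as_fraction(percent: int) -> Fraction:
    """Convert a whole-number percentage to an exact fraction."""
    return Fraction(percent, 100)


# Share of young Americans who could NOT find the U.S., given the 94% figure.
cannot_find = 100 - 94

print(cannot_find)               # percentage points who could not find the U.S.
print(as_fraction(cannot_find))  # that share as an exact fraction
print(as_fraction(5))            # the rounded figure, 5%, as a fraction
print(as_fraction(20))           # what "a fifth" would be as a percentage
```

On this arithmetic, 5% is one in twenty, while one in five is 20%, so either slip (5% to 1/5, or "1 in 20" to 20%) inflates the number fourfold.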


Posted by Mark Liberman at 02:12 PM

It's whom

Ubi Soft's Sprung for the Nintendo DS ("A Game Where Everyone Scores") is

... a dating simulation designed to challenge your thinking skills. What do you say to that cute stranger you've just met? Do you know what to say to make them laugh and smile? Will you mess up and make them angry at you?

The reviews on amazon suggest why this game has not been a hit:

I did enjoy this game enough to solve it with both a male and a female. It is quite fun to try to use items on different people to see what will happen, like spraying them with mace.

The concept seems kind of cool at first: you want to fall in love with the main character of the opposite sex. The problem is, everything in this game is completely impossible to predict. If you pick the seemingly best responses, you don't necessarily win.

I havent completed the game, But i intend to if i can get over the part where becky has to get conor to ask her out. If you know how to pass that part, PLEASE post the awnsers on here for the game. i cant even get past the part where brett is on a scavenger hunt, and has to get 2 things, 1. Womens underwear and 2. a pic of a dragon tattoo plz help me thanx

Apparently, as an alternative to spraying cute strangers with mace, you can get prescriptive with them:

(As James Thurber explained many years ago, "it is better to use 'whom' when in doubt, and even better to re-word the statement, and leave out all the relative pronouns, except ad, ante, con, in, inter, ob, post, prae, pro, sub, and super.")

It seems to me that Nintendo and Ubi Soft have missed an opportunity here. They should have hired some really interesting writers to create interactive modules for this game. You could select Alice Munro's version, or Dave Barry's -- or let the DS shift randomly among them.

If you want to explore Sprung further, this thread at Something Awful looks promising.

[hat tip to Caitlin Light]

Posted by Mark Liberman at 06:49 AM

August 28, 2007

Disarming the global road worrier

I thought that Pete Ianace had come up with a clever pun, but he seems to mean it.

I used to be a global road worrier, myself, but then I realized that I can prepare classes, do research and answer email just as well in an airport in city X as in a hotel conference-room in city Y. In fact, the surroundings are usually less distracting, the network is usually less congested, and the situation is spiritually more fulfilling.

After all, "the way of the Tao is to act without thinking of acting, to taste without discerning any flavour, to consider what is small as great, and a few as many, and to react to injury by kindness." What better way to engage these truths than to be stranded in an airport?

Posted by Mark Liberman at 10:28 PM

Political semantics quiz

Rahm Emanuel, the sharp-tongued chair of the House Democratic Caucus, said this about our soon-to-be-ex attorney general: "Alberto Gonzales is the first attorney general who thought the truth, the whole truth and nothing but the truth were three different things".

Translate Emanuel's comment into higher-order predicate calculus. What is the minimum order of predicates that you need to quantify over?

Are these three phrases actually just different names for the same thing? If not, how do you explain the (redundant?) wording of the traditional oath? If the three phrases do name different concepts, give one example of each of the three categories from Gonzales' congressional testimony.

[ Tim Finin, who should be grading the quiz, offered an answer:

Here's my attempt at the first part. I'll leave it to others to see if AGAG followed the traditional oath, which I'll assume is something like: "AGAG promises to tell (1) the truth, (2) the whole truth and (3) nothing but the truth."

(1) If AGAG tells P, then P is true.

All(P) tell(AGAG,P) -> P.

(2) If AGAG tells P, then P is true and there is no sentence R that is not implied by P that is true.

All(P) tell(AGAG,P) -> P ^ ~ Exists(R) R ^ ~(P->R)

(3) Same as #1: If AGAG tells P, then P is true.

All(P) tell(AGAG,P) -> P.

I am assuming what I take to be a standard interpretation of what it means to "tell the truth", i.e., to make statements that are true and not to make any statements that are not true. Of course, the standard interpretation probably also involves saying things that you 'know' to be true, but I don't want to go there tonight.
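For readers who prefer conventional notation, Tim's three formulas can be transcribed roughly as follows. This is only a sketch: the bracketing is an assumption, reading `^` as conjunction, `~` as negation, and letting each quantifier take scope over everything to its right.

```latex
\begin{align*}
(1)\quad & \forall P\,\bigl[\mathrm{tell}(\mathrm{AGAG},P) \rightarrow P\bigr] \\
(2)\quad & \forall P\,\bigl[\mathrm{tell}(\mathrm{AGAG},P) \rightarrow
           \bigl(P \wedge \neg\exists R\,[\,R \wedge \neg(P \rightarrow R)\,]\bigr)\bigr] \\
(3)\quad & \forall P\,\bigl[\mathrm{tell}(\mathrm{AGAG},P) \rightarrow P\bigr]
\end{align*}
```

On this reading, (1) and (3) are literally the same formula, and (2) requires each told proposition to imply every true proposition.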

So Tim thinks that the three phrases are only partly redundant, since the first and the third mean the same thing.

That's what I thought, too -- at first. But after a bit more thinking, I decided that it's not so clear. After all, if I promise to eat the pie, the whole pie, and nothing but the pie, all three clauses seem to be independent. ]

[Simon Cauchi brings a different semantic tradition to bear, according to which telling the truth is indeed like eating the pie:

"The truth" means just that. It's not qualified in any way.
"The whole truth" expressly disavows suppressio veri.
"Nothing but the truth" expressly disavows suggestio falsi.]


[Barbara Partee comments:

Only time for a quick response: I think the best way to see them all as non-redundant (and to make the "whole truth" one fulfillable!) is to think of them all in the context of answering questions.

That limits the domain of potentially relevant propositions -- I'm not taking time to work this out carefully -- we're about to leave for a few days getaway -- but it narrows down the first one so that it's not about everything you say but about giving a true answer to the question. And 'whole truth' doesn't then mean that you have to give an answer that specifies all the true propositions about the whole actual world (which Tim's requires), but just a complete answer rather than a partial answer. And then the 'nothing but' is also no longer redundant, I guess, because it means you don't add on anything false.

Wait, or does it mean that you don't add anything irrelevant? You don't add anything that isn't implied by a complete true answer to the given question? Hmm, nothing irrelevant or just nothing that's both irrelevant and false? (A clever answerer can still make hay with implicatures -- I don't think the rule says anything about not implicating anything false, only about not actually saying anything false.)]


[Rory Turnbull writes:

After reading your Language Log post on the semantics of "the truth, the whole truth, and nothing but the truth", it struck me that it's really just an oath to obey Gricean conversational maxims.
The truth - maxim of quality
The whole truth - maxim of quantity
Nothing but the truth - maxim of relation]


Posted by Mark Liberman at 08:39 PM

Syntax under pressure

According to the Doonesbury site's feature "Say What?" today, Lauren Caitlin Upton, the reigning Miss South Carolina, was recently asked on TV why so many Americans can't find their own country on a map, and her impromptu reply, dutifully transcribed by various sources (though not yet checked against the original recording by Language Log staff), was:

I personally believe that U.S. Americans are unable to do so because some people out there in our nation don't have maps, and I believe that our education like such as South Africa and the Iraq everywhere like such as, and I believe that they should... our education over here in the U.S. should help the U.S. or should help South Africa and should help the Iraq and the Asian countries so we will be able to build up our future.

Those who enjoy laughing at stereotypically pretty young women (yes, Miss Upton does appear to be blonde) for stereotypically lacking intelligence will get a few giggles out of this one. And they will probably not reflect on whether they themselves have ever sounded similarly stupid when speaking spontaneously under pressure and under lights, in response to an unexpected question under circumstances that made them feel they were expected to talk.

There can be no doubt that the young woman in question had no idea what she was going to say, except that she knew she should try to mention maps and name a country or two and sound sort of interested in foreign affairs and education. But I have a feeling I have occasionally blundered around in similar manner myself when faced after a conference presentation with a question I simply had no answer to.

Normally one can just say nothing, or "I have no idea", if one has no idea. But there are some circumstances in which all the attention is on you and you feel you have to provide some talk: being on TV, press conferences, classes where you're the teacher, prime minister's question time, job interviews, parole board hearings, question periods, and so on.

The one syntactic peculiarity (as opposed to the general fluency meltdown) that caught my eye was "the Iraq", which occurred twice. But this is a tricky topic (no wonder so many Chinese, Japanese, Korean, and Russian speakers are utterly baffled over when to use the definite article and when not). What Miss Upton was struggling with was the question of whether Iraq is a strong or a weak proper name. Weak proper names need the definite article. Strong ones don't. You may be clear about the difference, but large numbers of people think that Language Log is a weak proper name (needing the), as we see from our mail every day. (It is not true. Language Log is a strong proper name: we do not prefix it with the definite article.)

The matter is not trivial or straightforward. As noted in The Cambridge Grammar of the English Language, pp. 517ff, there are a number of countries that have two different ways of being referred to, strong and weak, the weak forms being a bit more common in Britain and tending to be replaced by the strong forms:

strong          weak
(no article)    (with article)
Argentina       the Argentine
Ukraine         the Ukraine
Yemen           the Yemen
Lebanon         the Lebanon
Holland         the Netherlands

There are some generalizations, but also many exceptions. Cities, boroughs, and regions are usually strong (like Amsterdam or New York or North Africa or Antarctica) but a few are weak (like the Hague or the Bronx or the Maghreb or the Antarctic). And remarkably, to a rough approximation at least, numerical freeway names are weak proper names in Southern California ("Get on the 55") but strong proper names in Northern California ("Take 17 South").

Don't laugh too hard at poor Miss Upton until you've successfully answered a few geography quiz questions under TV lights, that's what I'm saying.

Addendum: For those interested in checking the text, this blog post has a link to the video (it's on YouTube of course), and offers the following slightly different transcript (slightly more disfluent; and it agrees on *the Iraq):

I personally believe that U.S. Americans are unable to do so because, uhmmm, some people out there in our nation don't have maps and uh, I believe that our, I, education like such as uh, South Africa, and uh, the Iraq, everywhere like such as, and I believe that they should, uhhh, our education over here in the US should help the US, uh, should help South Africa, it should help the Iraq and the Asian countries so we will be able to build up our future, for us.

Posted by Geoffrey K. Pullum at 05:58 AM

Reuters says guilty of elliptical headlines

When the news hit the wires last Friday that Atlanta Falcons quarterback Michael Vick was pleading guilty to charges involving illegal dogfighting, the Reuters headline read:

NFL's Vick says guilty in dogfighting case

Which of these three possible readings would you suppose is the one that the headline writer intended?

a) The verb say(s) functions as a more informal substitute for plead(s), since plead can be followed by the adjectival complement guilty to mean 'enter a guilty plea.'

b) The verb say(s) is quotative, with guilty as a direct quotation complement (or at least a partial one). This draws on our cultural understanding that saying "guilty" in court proceedings is elliptical for the declarative speech act "I am guilty" (i.e., "I am entering a guilty plea"). Such a reading would imply that the headline writer neglected to include quotation marks around the word guilty. (Compare the recent New York Post headline, "Cousin Vinny wiseguy says 'guilty' to go free.")

c) The verb say(s) is reportative, with guilty as an indirect quotation complement (or at least a partial one). In this type of ellipsis, "Vick says guilty" is journalistic shorthand not for "Vick says 'I am guilty'" but for "Vick says (that) he is guilty."

Attentive Language Log readers will recognize that the third reading is the most plausible one, given the source of the headline. As Arnold Zwicky explained a month ago, Reuters headlines are very often of the form "X say(s) C," where C is a complement clause missing a subject.

The headline that set the Language Log water cooler buzzing last month was: "Taliban say kill Korean hostage, set new deadline" (July 25). To help illuminate this construction, Barbara Partee and I dug up a large number of Reuters headlines taking the form "X say(s) find Y," as in "Scientists say find gene for child cancer syndrome" or "Statoil says found oil northwest of Shetlands." In such cases, the complement clause for say(s) is a finite VP with the subject omitted. If we were to restore the missing subject in each of these headlines, it would be a third-person pronoun coreferring with the antecedent X: "Scientists say (that) they find gene," "Statoil says (that) it found oil," and so forth.

Now it turns out that the predicates of these subjectless clauses don't necessarily have to take the form of a finite VP like "kill Korean hostage" or "find gene." Reuters headline writers are routinely writing headlines of the form Subject say(s) Predicative with the intended interpretation of Subject say(s) (that) Subject is/are Predicative. In ordinary English, such reported-speech constructions look like the following, with a predicate consisting of a finite form of be plus AP, PP, participial VP, or NP:

1. Joan says (that) she is enthusiastic about the team. (be + AP)
2. Joan says (that) she is on the board of directors. (be + PP)
3. Joan says (that) she is getting fed up with his shenanigans. (be + VP with V-ing head = Present Participle)
4. Joan says (that) she is protected against bankruptcy. (be + VP with V-ed/-en head = Passive Participle)
5. Joan says (that) she is a finalist for the big award. (be + NP)

In the peculiar register of Reuters headlinese, we would lose the subject of each complement clause (she) along with the copular verb be:

1a. Joan says enthusiastic about the team.
2a. Joan says on the board of directors.
3a. Joan says getting fed up with his shenanigans.
4a. Joan says protected against bankruptcy.
5a. Joan says a finalist for the big award.
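The mapping from sentences 1-5 to headlines 1a-5a is mechanical enough to sketch in a few lines of Python. This is only an illustration, not a claim about how Reuters editors actually work: the function name `reuterize` and the regular expression are assumptions, and the pattern handles only third-person pronoun subjects followed by present-tense is/are.

```python
import re

# Hypothetical pattern: "say" or "says", an optional "that" (bare or in
# parentheses), a third-person pronoun, and a present-tense copula.
COMPLEMENT_SUBJECT = re.compile(
    r"\b(says?) (?:\(that\) |that )?(?:he|she|it|they) (?:is|are) "
)


def reuterize(sentence: str) -> str:
    """Delete the complement clause's subject and copula after say(s)."""
    return COMPLEMENT_SUBJECT.sub(r"\1 ", sentence, count=1)


ORDINARY = [
    "Joan says (that) she is enthusiastic about the team.",
    "Joan says (that) she is on the board of directors.",
    "Joan says (that) she is getting fed up with his shenanigans.",
    "Joan says (that) she is protected against bankruptcy.",
    "Joan says (that) she is a finalist for the big award.",
]

for sentence in ORDINARY:
    print(reuterize(sentence))
```

A sentence with no copula in the complement, such as "Statoil says (that) it found oil", passes through unchanged, since this sketch only covers the predicative cases.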

Copula deletion is nothing unusual in elliptical headlinese, as in these recent examples from the Associated Press: "Gonzales a lesson in cronyism," "Wilson in good condition, hospital says," and "Idaho senator arrested in airport." To my eyes and ears, it's more than a little weird to lose the copula and the implied subject of the complement clause (even if the subject is just a pronominal placeholder). But that's just what the Reuters headline writers do on a regular basis. I went through the Reuters archive and collected five examples of each predicate type numbered above, beginning with the adjectival type exemplified by the Vick headline. Most of the types are easily adduced by looking at headlines for the month of August.

  1. AP
    • Malaysia says well able to handle external shocks (Aug. 23)
    • Automakers say open to sharing parts (Aug. 23)
    • Premier Foods says set to cut 430 jobs in Britain (Aug. 22)
    • UK says confident will be cleared of foot and mouth (Aug. 21)
    • US parents say wary after China product recall (Aug. 15)

  2. PP
    • Lufthansa says on course for 2007 profit boost (Aug. 23)
    • Quest Software says not in compliance with Nasdaq (Aug. 21)
    • Hamilton says not at war with team mate Alonso (Aug. 9)
    • Al-Rajhi says on prowl for foreign expansion (Aug. 6)
    • National Health says still in talks with "third parties" (Aug. 6)

  3. Pres Part
    • Bear rivals say courting prime broker clients (Aug. 24)
    • Foot Locker says not providing EPS forecast (Aug. 23)
    • Pride says working with Pemex to assess rigs in Dean (Aug. 22)
    • Formosa Plastics says mulling China stainless mill (Aug. 22)
    • Albertson's says recalling some green beans (Aug. 3)

  4. Pass Part
    • French police say beaten in Guinea over immigrants (Aug. 23)
    • Thai PM says not worried about large "No" vote (Aug. 22)
    • US navy chief says reassured during China visit (Aug. 21)
    • Alitalia says contacted by group of investors (Aug. 21)
    • Casino operators say insulated from credit crisis (Aug. 16)

  5. NP (These are the hardest to search for in the Reuters archive, so I had to dip into the Factiva database for older examples. I've also given the first sentence of each article so that the meaning of the headline is made clear.)
    • Areva says a National Grid preferred partner (Oct. 24, 2006)
      Areva, the world's biggest maker of nuclear reactors, said on Tuesday it had been selected with its alliance members as one of six preferred partners by Britain's National Grid for contracts worth over 4 billion euros.
    • United (UU.L) says a preferred pick for water deal (Jan. 31, 2003)
      UK water and electricity provider United Utilities Plc said on Friday it had been named as a preferred bidder for a contract with state-owned Scottish Water for a 1.8 billion pound ($3 billion) investment scheme.
    • Edison says a long-term holder of Contact (Jan. 29, 2001)
      Edison Mission remains a long-term investor in New Zealand's Contact Energy, Edison's nominee on the board, Bob Edgell, said on Tuesday.
    • National Foods says a buyer, not target (Nov. 14, 2000)
      Australian dairy group National Foods Ltd said on Wednesday it was a buyer rather than a takeover target in the domestic industry's rationalisation.
    • Emulex says a victim of hoax press release (Aug. 25, 2000)
      Emulex Corp. on Friday became the victim of one of the most far-reaching hoaxes to hit the U.S. stock market, causing the data networking equipment maker's stock to plummet more than 50 percent.

There's one additional type of predicate specific to headlinese: the infinitival "to VP," understood as a simple future or as shorthand for 'be ready/set/prepared to VP.' This construction is routinely used when reporting on corporations or other institutions that make projections for the future (earnings forecasts and the like). Not surprisingly, subjectless complements of this type abound in the Reuters archive, though whether these also involve copula deletion depends on how you interpret the tricky headline infinitival:

    • AnnTaylor keeps outlook; says to target boomers (Aug. 24)
    • PetroChina says to lay pipeline to Turkmenistan (Aug. 23)
    • Citrix Systems says to double Asia revenue by 2012 (Aug. 23)
    • E*Trade says to triple non-U.S. revenue by 2010 (Aug. 22)
    • Coventry Health says to offer $300 mln senior notes (Aug. 22)

When I ran these Reuterisms past the keen eye of Arnold Zwicky, he wondered if it was possible for Reuters headlines to omit the subject in the complement clause for say without losing the copula. As Arnold notes, copula deletion is "customary in headlines, in fact, except when the head writer needs material to fill a line, but it isn't obligatory." As it turns out, copula deletion isn't obligatory even in the special Reuters case of Subject say(s) Predicative. Here are recent examples of the five predicate types with the copula intact but the subject missing:

    1. AP: ABN AMRO reports Q2, says is neutral on bids (July 30)
    2. PP: Abcam says is in talks with potential offerors (July 27)
    3. Pres Part: CIT says is looking at student loan stock deals (Apr. 5)
    4. Pass Part: Arab-led Darfur rebels say are victimized too (Aug. 20)
    5. NP: Hardliner Nikolic says is 'no danger to Serbia' (May 8)

Since the headline writers of newswires like Reuters are not concerned with the formal limitations of newspaper column widths, I doubt that the choice to keep or lose the copula is a matter of "filling a line." (It's generally not a concern for the online media outlets that reproduce the headlines, either.) Rather, it appears to be more of a stylistic choice, up to the judgment of a given headline writer or editor. And the Reuters practice of deleting the subject of the complement clause for say doesn't appear to be irrevocable "house style" either, since it's possible to find such headlines as "Bush, Karzai say they are aligned against Taliban" (Aug. 6) or "Freed Indian doctor says he is victim of conspiracy" (July 30). Still, the Subject say(s) Predicative flavor of ellipticality seems peculiar to the world of Reuterese. I wonder if it's enshrined in their style guides or if new staffers simply pick up this mannerism from their older colleagues.

All of this reminds me of another quirk of journalistic style: Time Magazine's much-maligned inverted syntax, famously ridiculed in 1936 by The New Yorker's Wolcott Gibbs: "Backward ran sentences until reeled the mind... Where it all will end, knows God!" Well, it did all end, just a few months ago. According to the New York Times, "the last remnants of Time’s signature syntax" were finally banished by editorial fiat in March, just 84 years after Henry Luce founded the magazine. Reuters, there's hope for you yet!

Posted by Benjamin Zimmer at 01:09 AM

August 27, 2007

Prepositional cannibalism at Google

A nice sighting by Tim McKenzie:

I have recently started getting a trickle of spam to my previously safe gmail address. Google helpfully (and accurately) filters the spam for me, but I often check, just to make sure they got it right. Anyway, I got one today that Google apparently thought was a phishing scam, so they warned me "This message may not be from whom it claims to be. Beware of following any links in it or of providing the sender with any personal information." [emphasis added]

Fowler named this phenomenon "cannibalism" ("That words should devour their own kind is a sad fact"), as explained here.

As Tim pointed out, the impulse to omit the second preposition may be connected with the anxiety about prepositions in relative clauses that I discussed recently.

[Update -- Sridhar Ramesh contributed a link to this cartoon, with prepositional cannibalism at the bottom of the second panel:


Posted by Mark Liberman at 07:55 AM

Intimate IM

Zits continues to explore the effects of technology on human communication:

Somewhere, there's probably a clay tablet that makes a similar observation about the effects of cuneiform.

Posted by Mark Liberman at 07:31 AM

Spreading the brain-sex gospel

This letter from Charles Raymond arrived yesterday. I've posted it in its entirety, with his permission. He mentions some earlier Language Log posts on Brizendine, Sax and Gurian: a list of links is here.

I am a teacher in an independent school near San Francisco, and an occasional reader of the Language Log. I want to draw your attention to the rapid spread of workshops and talks pushing the "sex-difference" gospel about brain development in the private school world.

I completely understand your effort to counter Brizendine, since the public impact of her bestseller is substantial. I noticed your discussions of Sax and Gurian, but I am very curious if you or anyone you know has followed JoAnn Deak, author of Girls Will Be Girls (her next title, according to her website, will be The Brain Matters). Her writing is in the "expert-advice" genre, rather than the pernicious popular-pseudo-science mode of Brizendine. Deak, however, has built a wide following on the speaker circuit for educators, and she spreads virtually the same line, and drapes herself with the same "backed-up-by-the-latest-science" mantle. She spoke for three hours at our opening faculty meeting, and the presentation was impressive.

Her connections run deep with the schools for girls, for whom she advocates on the basis that biologically determined brain differences are the bedrock reason why many girls need unisex educational settings (completely in line with Sax, the difference being that she has a much larger existing "market" for her services). But she has built her "educational advisor" credentials much more widely than that: she speaks at conferences attended by heads of schools, and apparently receives many invitations to speak at individual schools.

I won't go into the detail -- you are familiar with the pattern -- but I will say that I laughed out loud when I read your description of Deena Skolnick's study ("Distracted by the brain", 6/6/2007). Deak's presentation could be exhibit A for that phenomenon. She was very entertaining, but "neuro-speak" was the hammer in her hand. So much of what she had to say was good and true when it came to practical advice, but the assumed authority of her voice was explicitly her willingness to absorb and accept the NewTrue science, and then to digest and translate that for us as the new gospel of pedagogical practice. Her acceptance of the science ("undeniable facts") was, by the way, really posed as a very "manly" exercise, if I may say so, in putting aside what she would "rather" have believed about men and women. In order to be true to our kids, we need to be "hard-boiled" about our biology. That rhetorical approach was so incredibly disarming and yet also intellectually intimidating at the same time.

I will attach some links on this subject below, but please do let me know of any other communication you might have had about her. I am trying to put together some materials to counter this sort of thing in my professional community, but I'm afraid it's like fighting the tide. The very day after her talk, an email went around notifying us all about a talk by Brizendine at a local bookstore. The appetites were whetted, so why not go to the source?

I should add that I support the modest growth of unisex education, but not because I believe that the biologically determined differences in cognitive development or learning styles are so salient that we need to redesign our institutions and practices around them. There are sufficient cultural and psychological arguments for different school settings. It's too bad some people feel the need to appeal to false or exaggerated biological determinism, and in a way that I believe may ultimately harm the cause of gender equity.

Some links:

Deak's own website makes it clear that she is making the rounds of independent schools.

The National Coalition of Girls' Schools endorses a similar view in a booklet that prominently cites Deak.

And individual girls' schools feature her name in various ways.

Here is Brizendine herself giving a talk quite similar to Deak's at the 2006 NAPSG meeting (the National Association of Principals of Schools for Girls).

This Canadian article (Anne Marie Owens, "Boys' Brains are From Mars", National Post, 5/10/2003) pairs Deak with Sax, and it gives an excellent description of what her rap is like.

[Guest post by Charles Raymond]

[Comment by Mark Liberman -- I found one passage from that National Post article especially striking:

The brain maps are intriguing, but what is more compelling is the sea change that must have taken place to allow these experts to show such slides in benign cookie-cutter conference rooms, pointing out the differences with clinical detachment, without any hint of confrontation or controversy. Doesn't talk of brain differences conjure up images of Canada's own Philippe Rushton and his racial brain delineations? Substitute blacks for boys in these talks and surely these speakers would have been drummed out of such a respectable gathering.

But there is no hint of quackery here. Here's an Ohio feminist with A-list credentials, a female psychologist specializing in girls' empowerment, and she's using the gendered brain maps. Here's a Maryland pediatrician, whose articles have been published in leading academic journals and who is a known champion of advancing boys in school, and he is using the gendered brain maps, too.

These were the ideas that thousands of educators from North America's most elite private schools were talking about by the end of last month's conference of the National Association of Independent Schools: What should we make of the science-based differences between boys' and girls' brains? What are the practical implications for educating boys and educating girls? Do traditional single-sex schools need to reinvent themselves to take advantage of this new knowledge? Can co-ed schools continue to make the case for mixed-sex programming in the face of this scientific evidence?

This explicitly makes the connection to old-fashioned racist anthropology and physiology, and to the recent work on sociobiology of racial differences by Rushton and others. That work has been extremely controversial, and has made relatively little headway in the culture at large, as far as I can see. In particular, it's unimaginable that today's American educational elite would enthusiastically host a set of speakers promoting the view that biologically-determined differences between the races, in abilities and interests and learning style and even perceptual sensitivity, are so great that racially-segregated education is the only way to treat each group as it needs to be treated. "Scientific" support for this conclusion would be minutely examined and criticized, and independent of the science, the conclusion would be resisted on ethical and political grounds, and its proponents would be ignored if not shunned.

The amazing thing is how far the brain-sex movement has gotten while raising hardly any controversy at all, despite its generally shoddy, misinterpreted or even fabricated scientific foundations, and its affinities with 19th-century gender stereotypes. (If you're not clear about this, read the links here.) I believe that this is because the brain-sex movement has learned to make its pitch in ways that appeal to feminist prejudices as well as patriarchal ones.]

Posted by Mark Liberman at 06:43 AM

August 26, 2007

In awe of bologna and doritos

Eric Gorski, "Hispanic congregations adding English services to the mix", AP, 8/25/2007:

On Sundays at La Casa del Carpintero, or the Carpenter’s House, they’ve raised twin yellow banners for churchgoers that read “Welcome” and “Bienvenidos.”

As a complement to the regular 11:30 a.m. Spanish service at the independent Pentecostal church, where they’ve worshipped Papi for years, there’s now a 9:30 a.m. English one where the faithful praise God the Father.

While churches from every imaginable tradition have been adding Spanish services to meet the needs of new immigrants, an increasing number of Hispanic ethnic congregations are going the other way – starting English services.

It’s an effort to meet the demands of second- and third-generation Hispanics, keep families together and reach non-Latinos.

As Geoff Nunberg wrote, in an exchange about Samuel Huntington's Foreign Policy article "The Hispanic Challenge" ("Nativism clings to life at 100 or 101", 6/24/2004):

English is too useful and important to imagine that any immigrant group would be willing to turn its back on it in order to maintain a marginal, ghettoized existence.

Gorski's article makes this point explicitly:

... as the children of immigrants grow up, churches are recognizing that it’s either bolster Spanish with English or give up on the future.

and supports it with examples:

Walter Rubio was born and raised in Guatemala and moved to the United States when he was 12, in awe of bologna and Doritos. Now raising his own family, Rubio attends English services at the Carpenter’s House.

“It’s simple,” said the 35-year-old construction worker. “My son and my daughter, they lean more toward English. If they understand it better, they get a better blessing.”

One thing that seems a bit different from other immigrant communities whose language patterns over the generations I've observed:

Some second- and third-generation Latinos prefer Spanish as their language of worship. When a group of young adults lingered after the Spanish service at the Carpenter’s House, their small talk was in English, not Spanish.

“We grew up going to Spanish services,” said Abdiel Quiles, 28. “It just feels like home.”

[Karen Davis writes:

Might a preference for Spanish as "language of worship" be similar to the longing among some Catholics for Latin masses: a combination of "you're not supposed to understand" and "this is the way it was done when I was a child"? Old Church Slavonic isn't even a form of Russian, yet it's clung to in the Russian Orthodox Church out of tradition.

I had a similar thought. But apparently these people do understand the Spanish services, though they prefer English for their own use. A traditionalist impulse must be part of their motivation, though, as oddly as that seems to sit with the immediacy of pentecostal culture.]

Posted by Mark Liberman at 08:03 AM

August 25, 2007

Another year of taboo avoidance

It's been almost a year since I assembled an omnibus posting (10/9/06) on taboo avoidance, and the mail has been piling up alarmingly.  And now that I've posted briefly on taboo avoidance and plain speaking in the New York Times, it's time to look at the rest of the media, blogs, and the like. 

It's dangerous to try to discern general trends in these things, but my impression is that while we might be living in "the Golden Age of taboo avoidance" (as Ben Zimmer has put it to me), we're also seeing some publications guardedly moving towards somewhat greater openness.  The result is, of course, confusion and inconsistency.  And, in the case of automatic asterisking/bleeping programs, downright absurdity.

Here's a sampling of things that came to me over the past year that haven't been blogged on here.  It doesn't pretend to be a complete survey of taboo avoidance (or use) during the year; it merely illustrates a variety of approaches to the problem.  (The examples are grouped into rough categories for exposition; no serious analysis is intended by this categorization.)

At this point, after two years on the Taboo Desk at LLP, I would like to take a vacation from tracking taboo avoidance, so that I can attend to some other topics.

1.  Approaches to use/avoidance. 

Some publications generally go for avoidance.  Some, like the Guardian, the New Yorker, and the Economist, are on record as using words like cunt and fuck where they are newsworthy (almost always in quotations).  So you get the Economist, back on 2/28/02, in reporting a minor transport-ministry scandal, publishing the revelatory quote:

"We're all fucked. I'm fucked. You're fucked. The whole department is fucked. It's been the biggest cock-up ever and we are all completely fucked."

Lane Greene, who relayed this to me on 7/4/06, noted that the Guardian used cunt considerably more often than one might have expected, sometimes in the writer's own voice, as in this column by Peter Tatchell:

Swapping gossip with the girlfriend of a man who was previously my long-term lover, we agree he was definitely aroused by both the male and female form; equally delighted and sexually voracious with a cock or a cunt.

Nevertheless, the impulse towards modesty, or presumed modesty, is strong these days.

2.  Circumlocution, paraphrase, and allusion.

The Wall Street Journal (like the New York Times) generally avoids asterisks and the like, in favor of work-arounds, sometimes elaborate and coy ones, as in this article (of 8/18/06) about the movie Snakes on a Plane:

Also, the filmmakers added new scenes to the film, including one where Mr. [Samuel L.] Jackson's character delivers an exclamation similar to one a sound-alike had uttered in a fan trailer. In it, Mr. Jackson repeatedly uses an Oedipal expletive to describe both the snakes and the plane.

(Thanks to Jake Seliger.  More Snakes on a Plane material from Ben Zimmer here.)

Even more tortured is Leah Garchik's

The opening conversational line, scrawled on the work, was the three-word declaration 'Men are (donkey-apertures), ...

from her San Francisco Chronicle column of 10/17/06 (pointer from Ned Deily).  Figuring this one out is a lot like solving a crossword puzzle -- and it doesn't really work in British English.

Indirect allusion can get too indirect.  Here's Gawker's complaint (passed on by Matthew Hutson on 10/10/06) about advice from a Washington Post editor asking that writers on the paper find ways to avoid "the N-word", which the editor found "almost cutesy", recommending instead something like "a well-known racial epithet":

Oh, but that's no fun -- putting it like that could mean any racial epithet, and there are just so many to choose from!

(Compare the Times's reporting in the Isaiah Washington affair, where the offending expression -- faggot -- was alluded to as "the remark" and "the slur".)

Equally indirect is the avoidance of jackass on Fox and Friends, as reported by Jon Lighter on ADS-L 8/8/07:

In an oblique allusion to the MTV show Jackass, anchor Steve Doocy remarked a few minutes ago that Johnny Knoxville hosted a show "with a very inappropriate name."  The name was not uttered.

In his 10/18/06 "On Language" column in the Chicago Tribune, Nathan Bierma found himself trying to explain IM chat slang without crossing over the paper's language lines.  It turned out that the people he interviewed didn't always get the message:

ROTFLMAO: About one in three students recognized this as "rolling on the floor laughing my [uh, arm] off" (and they knew what replaces "arm"). A few only knew "ROTFL," and a few only knew "LMAO."

WTH: About half identified this as "what the [heck]?" though several said they prefer "WTF," ending with a different four-letter word.

Jane Acheson wrote on 10/24/06 to comment on the "hilariously circumspect" approach the Boston Globe takes to dubious vocabulary, citing sport stories with

I don't give a [care].

The Yankees don't [inhale excessively].

in them (the second of which took her some time to interpret).  I noted that you get rather a lot of hits for {"don't give a [care]"} and {"don't [inhale excessively]"}.

Even more obscure allusion: Jon Lighter reported on ADS-L 11/1/06 that Fox and Friends had just rebuked Barbra Streisand for using "the firetruck word" in public.  Jon noted that Google has cites back to 1999.  The expression goes back to the Turtle Club question, "What word begins with the letter F and ends with CK?"

Also on ADS-L, Barry Popik (11/9/06) quoted a 1960 Dallas Morning News article about a Texas stew known variously as "son-of-a-blankety-blank stew", "S.O.B. stew", and "Gentleman-from Odessa" or "Gent-from-Odessa stew".  The writer, Frank X. Tolbert, explained that the first two of these names used "an unfriendly term in which it is implied that the man receiving the insult has canine ancestry on the distaff side", and cited an informant explaining that the last two derived from the reputation of the town of Odessa:

The town had something of a reputation for hell-raising. People from Sweetwater to El Paso generally agreed that what passed for a gentleman in Odessa would be the equivalent of what was called a son-of-a-blankety-blank in more civilized prairie towns.

3.  Euphemisms and technical vocabulary. 

Note Bierma's report of hell as "[heck]", above.

In a startling development, effing (originally a kind of euphemistic abbreviation) is now regarded in some quarters as intemperate language.  From a 10/26/06 House of Commons debate (pointer from Dery Earnshaw):

Mr. Osborne: If he cannot accept that, surely the current Secretary of State for Work and Pensions is right: the Chancellor will make an "effing awful" Prime Minister?

Mr. Speaker: Order. Will the hon. Gentleman withdraw that remark? We must have temperate language in this House. I do not care what is said outside.

Mr. Osborne: I of course unreservedly withdraw the quote from the Secretary of State for Work and Pensions.

Assorted inventive euphemisms of the freakin' etc. type:

UncleJam89 wants you to funkin' read tonight. (Jon Lighter, ADS-L 2/24/07, from an eBay site)

This is fargin war!  (Scot LaFaive, ADS-L 2/24/07, from the movie Johnny Dangerously)

Technical terminology functions as a substitute in an AP story of 12/13/06 that Eric Jusino pointed me to in the Central Utah Daily Herald:

A Washington State University assistant professor who used a vulgar racial term during a heated political dispute with Republican students was "immature" and "thoughtless," but his actions did not constitute discrimination, a new university report concludes.

... During the dispute, [Dan] Ryder said [John] Streamas, an assistant professor of comparative ethnic studies, called him a "white (solid waste)-bag."

College Republicans demanded that Streamas be fired. University President V. Lane Rawlins said Streamas would be reprimanded, but not fired.

For the verb, rather than the noun, here's a quote from a This Modern World cartoon by Tom Tomorrow, which I saw in the Funny Times of 2/07; TT is reviewing the events of 2006:

Sept. 24: N.I.E. acknowledges that Iraq war has increased threat of terrorism.

The report also notes that the Pope is Catholic and bears defecate in the woods.

Then in February came the great hoohaa episode, in which this nursery word (or one of many variants: hoo-ha, hoo-hoo, ha-ha, etc.) is used to refer to the vagina.  The appearance of "The Hoohaa Monologues" on the marquee of a Florida theater was noted in the NYT:

What's in a Name? Controversy
Published: February 12, 2007                          

Under ordinary circumstances, the opening of "The Hoohaa Monologues" on Thursday at the Atlantic Theaters in Atlantic Beach, Fla., near Jacksonville, would not attract much attention. But "The Hoohaa Monologues" by any other name is Eve Ensler's Obie Award-winning, internationally performed play, "The Vagina Monologues."

Last week, after a complaint from a passing driver who became upset because her niece had seen "vagina" on the theater marquee, Bryce Pfanenstiel of the Atlantic said, "We decided we would just use child slang for it," reported. Down came "The Vagina Monologues." Up went "The Hoohaa Monologues."

But two days later, on Thursday, in response to a demand from the organizers of the production, the original title was restored. The organizers are a group of Florida Coastal School of Law students who insisted that the original title be displayed because they had rights to the play only if they refused any censorship.

"Vagina is the essence of a woman," said an organizer, Elissa Saavedra, "and if you're going to suppress the name, then you're suppressing us as women." All proceeds are to go to charity.

Ben Zimmer was the first to post this on ADS-L, and then other posters reported occurrences of one or another variant on the TV show Ellen; in the movie Boys on the Side; in the Pussycat Dolls' song "Beep"; on South Park; on Grey's Anatomy; and from their own childhoods.  Eventually, people began to report other, non-vaginal, uses: the MAD Magazine interjection (of astonishment or triumph) hoo-hah, and the noun meaning 'fuss, to-do', in particular.

4.  "[expletive]" and related locutions. 

seems to be in fashion these days.  Susan Harrelson wrote me on 10/21/06 to report a competition between asterisking and (automatic) beep:

People posting on the IMDB message boards, even experienced users who know that taboo words will be rendered as "*beep*," may be surprised to discover the following:
F*cking = *beep*
F**king = F**king
I know I was.

But bleep lives on, as in this Variety story of 10/26/06 (passed on by Victor Steinbok) about NBC vs. the Dixie Chicks:

The national spot shows a clip of Bush authorizing troops to fight in Iraq, then cuts to a clip of Maines' comment. Next is a clip of the president saying publicly that the Dixie Chicks shouldn't have their feelings hurt if people don't want to buy their records anymore. The final frame shows Maines saying that Bush is a "real dumb (bleep)."

And then there's the audio bleep, turning up in extraordinary places.  Here's a report from the Telegraph of 1/27/07:

'Bleep' bless you ma'am: censor goes too far
By Catherine Elsworth in Los Angeles
Last Updated: 2:05am GMT 27/01/2007

An over-zealous censor bleeped out all references to God when editing an in-flight version of the Oscar-nominated film The Queen.

The operator had been told to remove all profanities when preparing a version for several commercial airlines.

When one of the characters addresses the Queen, played by Helen Mirren, passengers aboard certain Delta and Air New Zealand flights heard: "(Bleep) bless you, ma'am" rather than "God".

Jeff Klein, president of Jaguar Distribution, which supplied the airlines, said the removal of God in seven instances was a mistake by an employee who had taken his instructions too literally. The films have been replaced with unedited copies.

(From Ben Zimmer; longer story here.  Australian version of the story reported by Matthew Duggan.)

Automated BLEEP insertion (see the discussion of automated asterisking below) has reached new depths.  As Ben Zimmer posted to the ADS-L on 8/11/07:

The sports blog Deadspin has often ridiculed the censorship of user comments on, e.g. (link) (link).

Earlier today there was a post about similar censorship on, where "BLEEP" is inserted in place of offending words: (link)

Referencing censored bits here: (link)

"The San Francisco Giants trade pitchers Joe Nathan, Francisco Liriano, and BLEEP  Bonser to Twins for A.J. Pierzynski." [censoring "Boof" Bonser]

"They were tradinBLEEP oung player who had put up some nice numbers, but wasn't projected to be a star..." [censoring the letters "g a y" in "tradinG A Young player"]

Chris Waigl then discovered that

On the Foxsports page, the "BLEEP"s are hyperlinks which lead to a page entitled "About Censoring": encourages our users to express themselves on their blogs, story comments, or message boards. We don't want to slow down your game when you're dishing on your favorite teams and players.

At the same time, we recognize that not everyone out there loves a potty mouth. So if there's an obvious bad word on a blog, story comment, or message board post, we'll try to censor it.

Feeling brave, mature, and adult-ish? Or just want to get in touch with your inner sailor? You can choose to have FOX Sports do nothing, and leave all those R-rated words alone. If you do, you may see some coarse language from time to time in the community. Don't say we didn't warn you!

*Would you like to automatically censor content you view?*

Below this are two buttons. I clicked "No, don't censor" and now get the unbleeped pages.

Leaving aside the inane rules for censored strings, which are far from catching only "obvious bad words", even if you believe in such a thing, there are two remarkable things about this:

- They unapologetically call it "censorship"
- They offer "censorship" as a customer service feature
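The "tradinBLEEP oung" result suggests a filter that matches letter sequences without regard to word boundaries or intervening spaces. Here is a minimal Python sketch of how such a filter might behave, set beside a whole-word version. The blocklist, the function names, and the matching rule are all my assumptions for illustration; the actual filter's word list and logic are not known, and this sketch does not reproduce the exact spacing seen in the censored text.

```python
import re

# Hypothetical blocklist; the real filter's word list is not known.
BLOCKLIST = ["gay", "boof"]


def naive_bleep(text, blocklist=BLOCKLIST):
    """Replace any blocklisted letter sequence with BLEEP, ignoring case,
    word boundaries, and whitespace between the letters."""
    for word in blocklist:
        # Allow optional whitespace between consecutive letters of the word.
        pattern = r"\s*".join(re.escape(ch) for ch in word)
        text = re.sub(pattern, "BLEEP", text, flags=re.IGNORECASE)
    return text


def word_bleep(text, blocklist=BLOCKLIST):
    """Replace only whole words that appear on the blocklist."""
    for word in blocklist:
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, "BLEEP", text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    sample = "They were trading a young player"
    print(naive_bleep(sample))  # They were tradinBLEEPoung player
    print(word_bleep(sample))   # They were trading a young player
    print(word_bleep("Boof Bonser"))  # BLEEP Bonser
```

Note that even the whole-word version still bleeps a proper name like "Boof" if the string is on the list, so word boundaries fix only the "trading a young" kind of false positive.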

An extreme version of the "[expletive]" strategy is just to use empty square brackets, as in these excerpts from the redacted transcript of a taped conversation between Bob Woodward and Richard Armitage, in evidence at the Scooter Libby trial:

WOODWARD: ...What's Scowcroft up to?
ARMITAGE: [       ] Scowcroft is looking into
  the yellowcake thing.

... We've got our documents on it.  We're clean as a
  [         ] whistle. 

There's a lot more, but you get the drift.  (Pointer from Ben Zimmer, 2/13/07.  There's also redacted audio.)

A related usage (total omission) was reported by Scot LaFaive on ADS-L 2/24/07, re "the horse you rode in on":

I did a search of Google Books and Google for antedates but couldn't find any. I did find two interesting notes on the phrase. For one, it seems that the phrase "fuck/screw you and the horse you rode in on" is often clipped to "and the horse you rode in on," allowing for it to be used in proper company.

5.  "The x-word".

Check out Geoff Pullum's wonderful rant of 2/2/07 on the use of "the x-word" (for various values of x) on the NPR Talk of the Nation show.

6.  Asterisks and other avoidance characters.

Chris Waigl wrote about a conspicuous use of piss on television, reported without avoidance in the Guardian News Blog of 8/25/06:

Weather presenter Joanne Malin has hit the headlines for describing conditions in the way the rest of us do when, live on Central TV, she said it was "pissing it down".

but asterisked in the Daily Mirror article that same day:

TV NEWSGIRL Joanne Malin feared a downpour of complaints after accidentally blurting out that it was "p***ing it down" during a weather report.

But rather than a storm of protests, hundreds of viewers emailed to say they had not seen anything so funny for ages.

Chris (e-mail of 8/28/06) observed that the British tabloids (like the Daily Mirror) print

a surprisingly large amount of asterisked taboo words. Many are used gratuitously. Which fits: the asterisks draw even more attention to them, and serve the purpose of titillating the prudish reader. (I once spent 5 min just figuring out that an instance of "b******" stood, most likely, for "bastard". Without the asterisks, the word wouldn't have caught my attention at all.)

Here's an astonishing asterisking from columnist Kathleen Parker on 9/6/06 (hat tip to Victor Steinbok) -- astonishing because it defends the use of strong language while failing to reproduce that language:

The five-year anniversary of the Sept. 11 terrorist attacks has produced a peculiar concern--whether rescuers used proper language in the midst of mind-numbing horror and chaos. Apparently, firefighters were prompted to use profanity, a fact that some Americans now find too offensive for prime time... Usually, I'm in favor of strict enforcement of decency standards... However, there's a clear difference between gratuitous profanity contrived by unimaginative writers and the spontaneous language of real-life horror... Can anyone really imagine seeing what those firefighters saw--first one plane, then another--and saying, 'Goodness gracious, what rare deed is this?' When 'What the ---' more accurately captures the moment?

Another one from Chris Waigl, from a Guardian blog:

Tom may be gloomy, but Bono is a pr@ck. That's the difference.
Posted by Pete23 on October 17, 2006 03:05 PM.

Oh Bono, you stupid prack!

And one passed on by Ben Zimmer (8/22/07), from (8/21), with reference to Sen. Ben Nelson of Nebraska:

"This will shut that f---er up," [Sen. Tom Coburn's communications director John] Hart stated in an Aug. 1 e-mail sent from his Senate account to several of his colleagues. "I can't wait to send an In Case You Missed It to Nebraska press that will be forwarded to a--face."

Then there's automated asterisking, which I wrote about here in 2006: here, here, here, and here.  On 8/8/07 on ADS-L, Wilson Gray re-discovered the wonders of automatic asterisking on iTunes, citing the tune "El P***ycat".  Joel Shaver followed up (8/11) with an oddity from the Pandora website:

Longtime fans who were mystified by Chris Thile's experimental 2004 solo release Deceiver may c**k their collective heads in dismay, but those who appreciate the group's searing musicianship, orgasmic harmonies, and genre-bending arrangements will no doubt wear out their copies of Why Should the Fire Die? within the first month of ownership.  ( ~ James Christopher Monger, All Music Guide)

Some were dubious that this should be taken seriously, but Chris Waigl pointed to the identical asterisking on (parts of) the Royal Society for the Protection of Birds site that I posted about a while back.
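Results like "El P***ycat" and "c**k their collective heads" are what you would expect from a program that masks the interior letters of any listed string wherever it occurs, even inside a longer word or in an innocent sense. A minimal Python sketch of that behavior follows. The blocklist and the function names are hypothetical, and I have no knowledge of how the iTunes or Pandora filters are actually implemented.

```python
import re

# Hypothetical blocklist, chosen only to reproduce the two examples above.
BLOCKLIST = ["cock", "pussy"]


def mask(match):
    """Keep the first and last letters of the match; star out the rest."""
    s = match.group(0)
    return s[0] + "*" * (len(s) - 2) + s[-1]


def auto_asterisk(text, blocklist=BLOCKLIST):
    """Mask every occurrence of a blocklisted string, ignoring case and
    ignoring whether it is a whole word or part of a longer one."""
    for word in blocklist:
        text = re.sub(re.escape(word), mask, text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    print(auto_asterisk("El Pussycat"))  # El P***ycat
    print(auto_asterisk("may cock their collective heads"))  # may c**k their collective heads
```

Because the matching pays no attention to meaning or to word boundaries, the verb cock and the title Pussycat get the same treatment as the taboo uses.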

7.  Finally: say it with a look.

Matthew Stuckwisch (10/31/06) passed on some dialogue from the TV show Everybody Hates Chris the week before:

ROCHELLE: Hello Louise, how ya doin'?

LOUISE: Keep your nasty little nappy-headed son away from my grand-daughter.  That's how I'm doing.

CHRIS AS NARRATOR: That look means all seven of the words you can't say on television.  (pause) Because this is a family show, all she can say is this...

ROCHELLE: Excuse me?

The show has often played on the idea of a single expression representing a complex sentence, and even once featured an entire conversation (interpreted by Chris as Narrator) in facial expressionese.

The expression in question is known as "cut-eye".  See, for example, John R. Rickford and Angela E. Rickford, "Cut-Eye" and "Suck-Teeth": African Words and Gestures in New World Guise (Journal of American Folklore 1976), reprinted in John Rickford's African American Vernacular English (Blackwell, 1999).

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 04:54 PM

Prepositional anxiety and Voldemort's wand

It seems that He Who Must Not Be Named feels that prepositions are The Part Of Speech That Must Not Occur In A Relative Clause. At least, that would explain why, on p. 655 of Harry Potter and the Deathly Hallows, Voldemort says: "My wand of yew did everything of which I asked it, Severus, except to kill Harry Potter."

Meg Wilson, who sent this in, added "Since I'm not a linguist, I'd be interested to see a Language Log analysis of exactly what is so horrible about this sentence."

This horrible sentence is the sad result of more than three centuries of superstitious dread. In 1672, John Dryden objected to the lines

The mawes, and dens of beasts could not receiue
The bodies, that those soules were frighted from;

That's the passage from Ben Jonson's Catiline that Dryden cited in arguing that the poets of earlier generations, like Jonson and Shakespeare, were inferior to himself, due to their failure to obey various grammatical "rules" that he invented. (For the details, see "An internet pilgrim's guide to stranded prepositions", "Hot Dryden-on-Jonson Action", and "Forgive me, awful Poet".)

Dryden was concerned with cases where a relative pronoun is the object of a preposition, and he promoted the idea of putting the preposition next to the relative pronoun rather than in its basic location in the clause (e.g. "the women to which I have spoken" rather than "the women (that/which) I have spoken to"). Some grammatically-naive people have generalized this preference into a global aversion to prepositions at the end of any clause at all -- even intransitive prepositions like "I was fed up". But Voldemort's sentence shows that the superstition has been escalated to a new level.

In case it's not obvious to you, here's what's going on. Simplifying it a bit, Voldemort's sentence, rendered in Heavy English with square brackets marking the relative clause, is something like:

my wand did everything [ such that I asked that thing of it ]

The normal way to render this in Standard English would be one of these:

my wand did everything [ which I asked * of it ]
my wand did everything [ that I asked * of it ]
my wand did everything [ __ I asked * of it ]

where '*' marks the canonical location of the (variable corresponding to the) relativized noun phrase.

But apparently that inoffensive little of -- even though it has no connection to the relative pronoun, and is just sitting peacefully next to its proper object "it" -- frightened J.K. Rowling, who decided to sweep it up and stick it next to which in the complementizer position. Or perhaps it was some anonymous copy editor who performed this little grammatical incorrection.

Then again, Rowling may have consciously placed a hypercorrection in Voldemort's mouth, in order to tell us something about his character. That would be a more charitable interpretation.

Posted by Mark Liberman at 11:48 AM

Deep-sea diving tutor god turns cartoonist

This was sent in by Randy Alexander, who used Stripgenerator to make it:

Posted by Mark Liberman at 09:16 AM

Liquid syntax

Bruce Lin took a picture of this sign on a coin-operated video game machine in a pub in Cambridge, UK, and sent in a link. His first thought was that this is a case of syllepsis, but then he wasn't sure. I don't think that it's an example of syllepsis in the classical sense, though it does share the trait of coordinating items that aren't strictly parallel. In any case, we can certainly count it as an example of the broader category of WTF coordination. But as Bruce observes, it's hard to classify this one, because "the wording doesn't fully make sense", even if we give the sign-maker a pass on coordinating unlike items.

If the sign read


then it would be a standard example of what linguists call "right node raising". This term was coined because in place of a structure like this (in tree form)

we get one with a single copy of the noun phrase "this machine" -- on the right -- which is not embedded as nodes 5 and 8 inside the two prepositional phrases 4 and 7, but instead is placed as node 9, raising it to a higher level in the structure, parallel to the disjunction "bring liquids to __ or place liquids on __" .

But to get Bruce's pub sign, we've somehow got to get rid of the first copy of "liquids". And I can't figure out what made the sign writer think that this was a plausible thing to do. Maybe too many liquids brought to, and placed in, his or her mouth. (If you have another theory -- or better yet, some more data -- please let me know.)

I have to admit, though, the sign writer's solution is a lot better than flushing the second copy of "liquids". In other words, I'm claiming that the choice


though flagrantly ungrammatical, is somewhat easier to make sense of than


Well, maybe not. Of course, something like


or just plain


would actually be grammatical, and would leave space for


or some more specific threat.

But signs are tough.

[Update -- Kilian Hekhuis suggests that the sign might have started out as one of the kinds of incompletely-parallel coordination that we've batted back and forth with Neal Whitman. Then the writer "fixed" it.

In reply to your article "liquid syntax", I think it may be likely the sign-writer originally had created a sign saying "do not bring or place liquids on this machine" (in analogy to, say, "do not prepare or eat food here"), then realized 'bring' needed a different preposition than 'on', and stuck 'to' after it, yielding "do not bring to or place liquids on this machine", perhaps regarding 'to' and 'on' as parts of the verbs instead of the prepositional phrases 'to the machine' and 'on the machine' (analogous to e.g. "prepare and finish up your food").]


Posted by Mark Liberman at 07:12 AM

"X and its enemies"

The phrasal template "X and its enemies" seems to be especially well adapted for use in the titles of books. I wonder whether any other snowclone can claim the names of so many: there were at least five "X-and-its-enemies" clones published in 2006, and at least three so far in 2007.

I also wonder where this pattern originally came from, and how it got established. The most influential exemplar, I think, has been Karl Popper's 1945 The Open Society and its Enemies, but it was by no means the first. A couple of quick web searches located a dozen earlier works titled on this pattern, and I doubt that I've found them all. I don't think that Popper explained what inspired his title, but Charlotte Franken Haldane's 1927 Motherhood and its Enemies might have been in the back of his mind. Or perhaps in the front of it, I don't know.

The best known recent example -- at least the best-selling one -- has probably been Virginia Postrel's 1999 The Future and its Enemies. But there were at least two the year before, in 1998, and two others in 1999, and then none in 2000 and one each in 2001, 2002 and 2003.

Here's a partial list of "X and its Enemies" books, in chronological order:

Delta, Indigo and its Enemies; or, Facts on Both Sides, 1861
Leroy Foote, Christian liberty and its enemies: A book for youth, 1868
Benjamin H. Hill, The union and its enemies, 1879
John Nietner, The coffee tree and its enemies: Being observations on the natural history of the enemies of the coffee tree in Ceylon, 1880
J.S. Hunter, The Green Bug and Its Enemies: A Study in Insect Parasitism, 1904
Carl Gottfried Hartman & Lewis Bradley Bibb, The Human Body and Its Enemies: A Textbook of Hygiene, Sanitation and Physiology, 1914
William O'Brien, Sinn Fein and its Enemies, 1917
John Fremont Wilber, Progress and its Enemies: Showing the fallacy of the single-tax theory, and some other enemies of progress, 1918
William Ernest Hocking, Morale and its Enemies, 1918
William Jennings Bryan, The Bible and Its Enemies, 1921
Charlotte Franken Haldane, Motherhood and its Enemies, 1927
Fred Richard Marvin, Our Government and its Enemies, 1932
Karl Popper, The Open Society and its Enemies, 1945
Ludwig von Mises, The Free Market and Its Enemies, 1951
George Kateb, Utopia and its Enemies, 1963
A.P. Thornton, The Imperial Idea and Its Enemies, 1968
Philip A. Kuhn, Rebellion and its Enemies in Late Imperial China, 1970
R.A. Adeleye, Power and diplomacy in Northern Nigeria, 1804-1906: The Sokoto Caliphate and its Enemies, 1971
John O'Sullivan, The Draft and Its Enemies: A Documentary History, 1974
Thomas Ford Hoult, Social Justice and Its Enemies, 1975
Thomas Molnar, Authority and its Enemies, 1976
Peter Paret, The Berlin Secession: Modernism and Its Enemies in Imperial Germany, 1980
Paul Oliver & Ian Davis, "Dunroamin: The Suburban Semi and its Enemies", 1981
Christopher Green, Cubism and its Enemies, 1987
Edward Alexander, The Jewish Idea and Its Enemies, 1988
William R. Tonso, The Gun Culture and Its Enemies, 1990
Asger Jorn, Open Creation and Its Enemies, 1994
John Honey, Language is Power: The Story of Standard English and Its Enemies, 1997
Kenneth Mills, Idolatry and Its Enemies: Colonial Andean Religion and Extirpation, 1640-1750, 1997
Peter Hart, The I.R.A. and its Enemies, 1998
David Watson, Against the Megamachine: Essays on Empire and its Enemies, 1998
Eunan O'Halpin, Defending Ireland: the Irish State and its Enemies Since 1922, 1999
Virginia Postrel, The Future and its Enemies: The Growing Conflict Over Creativity, Enterprise and Progress, 1999
Max Skidmore & Max J. Skidmore, Social Security and Its Enemies, 1999
Ken Auletta, World War 3.0: Microsoft and Its Enemies, 2001
Nicola Di Cosmo, Ancient China and its Enemies, 2002
William Coleman, Economics and its Enemies, 2003
Lee Harris, Civilization and its Enemies, 2004
Alexander De Waal, Islamism and its Enemies in the Horn of Africa, 2004
Daniel Cohen, Globalization and its Enemies, 2006
Roslyn Weiss, The Socratic Paradox and Its Enemies, 2006
Robin Cohen, Migration and Its Enemies, 2006
Cor Struik, The Independent Spirit and its Enemies, 2006
Edward Friedman & Sung Chull Kim, Regional Cooperation and Its Enemies in Northeast Asia, 2006
Jane Duckett & William L. Miller, The Open Economy and its Enemies: Public Attitudes in East Asia and Eastern Europe, 2007
John K. Wilson, Patriotic Correctness: Academic Freedom and its Enemies, 2007
Marcus Collins, The Permissive Society and Its Enemies: Sixties British Culture, 2007

This snowclone of dialectical idealism appears to be even more popular in the titles of articles -- here's a small sample of the values of X that have been used at least once so far:

the passion, innovation, property, independence, the dream, the secular society, amalgamation, national sovereignty, population, moral purity, freedom, the enlightenment, internet gambling, migration, Israel, literacy, liberal education, algebra, Ambien, ...

[Update -- Robin Shannon points out the resonance with "X and its discontents". In this case, the most famous exemplar is Freud's 1929 Civilization and its discontents, whose German version was Das Unbehagen in der Kultur ("the uneasiness in culture").]

[And Dan Tobias points out that X considered harmful is another example of a phrasal template with a long history of backgrounded use that was brought to the level of conscious cliché-making by a famous exemplar.]

[Zeno from Halfway There writes:

I immediately thought of William F. Buckley's book (written with Brent Bozell) McCarthy and His Enemies, from 1954, whose title is a variation on your theme.]


[Emmanuel Maria Dammerer writes:

While reading your recent post on “X and its enemies”, I immediately thought of another “intellectual snowclone” based on Kant’s “Critique of Practical Reason” and “Critique of Pure Reason”: Major uses of the structure include Sartre’s “Critique of Dialectical Reason” or Sloterdijk’s “Critique of Cynical Reason”, but there seem to be dozens of examples in both German and English (and possibly other languages, given the impact of Kant).

Robin Shannon’s example of “X and its discontents” has also led to snowclonification of the original title “Das Unbehagen in der Kultur” in two ways: It was altered to “Das Behagen in der Unkultur”, “Das Unbehagen in der Unkultur” and even “Das Behagen in der Kultur” by various authors. Also, there is the snowclone “Das Unbehagen in der X”, where X generally, but not always, is a compound word ending in “-kultur”.]


[Jeff Erickson writes:

Your long list of "X and its enemies" examples immediately reminded me of my favorite mathematics lecture title: Bill Thurston's "Hyperbolic Geometry and its Friends".

There are a few books titled "X and its friends", but not many in comparison to the enemies list.]

[Nancy Friedman writes:

I very much enjoyed "X and Its Enemies" in Language Log. I've been making a small study of book-title snowclones myself (originally for a presentation to a client, later because it became a mini-obsession). I agree with you about the durability of "X and Its Enemies" and would propose a runner-up: "The End of X."

"The End of X" is a subset of the larger title snowclone "The X of Y" (The Audacity of Hope, The Elements of Style, The Grapes of Wrath, etc.). I'm sure my search isn't exhaustive, but I came up with 43 "The End of X" titles (list includes some fiction titles):

The End of Medicine
The End of Faith
The End of Poverty
The End of Iraq
The End of Oil
The End of Days
The End of Education
The End of History
The End of Religion
The End of War
The End of Fashion
The End of Reform
The End of Words
The End of Memory
The End of Food
The End of Suffering
The End of America
The End of Work
The End of Art
The End of Print
The End of Science
The End of Liberalism
The End of Biblical Studies
The End of Homework
The End of Racism
The End of California
The End of Sorrow
The End of Eternity (Isaac Asimov)
The End of Southern Exceptionalism
The End of American Exceptionalism
The End of Ancient Christianity
The End of Time
The End of Software
The End of Hardware
The End of Diets
The End of Harry Potter?
The End of Beauty (poetry)
The End of Fossil Energy
The End of Early Music
The End of Barbary Terror
The End of Certainty
The End of Human Rights
The End of Laissez-Faire

I've identified two major sub-subsets of "The End of X." The first is "The End of the X," e.g.

The End of the European Era: 1890 to the Present
The End of the Old Order
The End of the World
The End of the Line
The End of the Battle
The End of the Alphabet (novel)
...and of course The End of the Affair

Then there's "The End of X As We Know It," which probably owes a lot to Clinton's "The end of welfare as we know it":

The End of Government As We Know It
The End of Marketing As We Know It
The End of Stress As We Know It
The End of Advertising As We Know It
The End of Capitalism (As We Know It)

For another book-title snowclone (I think), see Mike Pope on noun titles.

Several older readers have pointed to the earlier roots of "the end of X as we know it", especially REM's 1987 "It's the End of the World as We Know It (And I Feel Fine)". Of course, phrases like this were clichés for many decades before that. The earliest example in the New York Times archive appears to be in a book review by Isaac Anderson, "The New Mystery Stories", that ran on July 3, 1938, where it is clearly already a hackneyed phrase:

At last we have Simon Templar, otherwise known as the Saint, in the role of a hero without a single new stain on his already considerably spotted reputation. Almost single-handed, Simon averts a war that might easily have spelled the end of civilization as we know it -- and all because he listens to a radio broadcast and then goes to a fire.

A decade later, on August 30, 1948, we get a story by Robert Plumb, "Engineer Says Vast Polar Ice Cap Could Tip Earth Over at Any Time", which threatens not just civilization but all of the world:

The end of the world as we know it now can be put off with a $10,000,000 engineering project, Hugh Auchincloss Brown said today in the comfortable old-fashioned library of his home here.

The 69-year-old electrical engineer, who has been practicing his profession for nearly half a century, gave warning of a horrendous fate in store for the earth. The rapidly increasing weight of the large Antarctic ice cap can tip the globe over at any time "just as you might roll a pumpkin over so that a frosted side would thaw out in the sun," he reported.

The article further explains that

... a thirty-five-year study of the earth .. has convinced him that the globe goes through a similar gyration every 8,000 years to compensate for the unbalancing effect of heavy ice formation at the poles. The present "epoch" is up and the earth is due to tumble like a run-down top. [...]

The New York area may find itself at the bottom of thirteen miles of muddy ocean water, along with most of the civilized world.

Mr. Brown recommends using atom bombs to blow chunks off the Antarctic ice cap, thus saving the civilized world.  You'll doubtless want to read the whole thing. And you can check out Mr. Brown's original writings here. (I hope that none of my complaints about the present state of science journalism have given any of you the impression that things used to be better.)

Returning, reluctantly, to the relatively boring topic of phrasal history, I can also cite The Friends' Intelligencer for Sept. 24, 1904, where the eschatology is of the more traditional sort:

It had long been a superstitious belief and fear of Western Christendom that the end of the first thousand years after Christ would mark also the end of the world as we know it.


[Andy Hollandbeck writes:

You may never see the end of these.

Another phrasal template that is popular for titles is "The Rise and Fall of X," with X being the Third Reich, Ziggy Stardust, ECW, Great Powers, Adolf Hitler, Stuey Ungar, Athens, an Empire, Legs Diamond, the American Teenager, the Slasher Film, the Plantation Complex, a Dictator, Heidi Fleiss, California's Radical Prison Movement, English, Fred A. Leuchter, Jr., and Alexandria.

My favorite, though, is "The Rise and Fall of Dinosaurs According to Creation." It's a children's book about dinosaurs that "contains continuous Biblical references revealing God's marvelous hand in Creation and supportive evidence from the fossil record."

It's published by the Creation Evidences Museum.]


Posted by Mark Liberman at 07:01 AM

August 24, 2007

The allure of eggcorns

It came up a few days ago on the American Dialect Society mailing list, but I had to see it to believe it. In the September issue of the women's magazine Allure (with Britney Spears on the cover) eggcorns and other language errors share the same page with a picture of Jessica Simpson and Eva Longoria. Unfortunately this doesn't involve Mmes. Simpson and Longoria discussing pre-Madonna or any other celebrity eggcorns. That's a missed opportunity, since Jessica Simpson obviously knows a thing or two about linguistic gaffes. Rather, it's all part of Allure's "Insider's Guide to Communication." (See the page scan after the jump.)

The photo of Jessica and Eva consulting an Internet-ready Sidekick adorns an interview with the authors of Send: The Essential Guide to Email for Office and Home, presumably because the readers of Allure need to be reassured that starlets read email too. (And they're probably a bit more photogenic than authors David Shipley and Will Schwalbe.) The mention of eggcorns, meanwhile, is in a separate article about Michael Erard's new book, Um... Slips, Stumbles, and Verbal Blunders, and What They Mean, which has a whole chapter on eggcorns and other speech errors collected by our own blunder maven Arnold Zwicky.

As befits a "guide to communication" in a glossy magazine chiefly devoted to beauty tips, Erard's book is treated as if it's in the self-help genre, alerting readers to the "Bad Words" that they should avoid, from eggcorns to disfluencies to mondegreens. The first sentence sounds the alarm: "Linguists estimate a verbal slip — everything from a stray 'um' to a mangled phrase — happens once every 4.4 seconds." (This refers to the pioneering research done in the 1950s by the Yale psychologist George Mahl, who measured the rate of "speech disturbances" in interviews with patients.) Needless to say, the Allure writer doesn't treat eggcorns in the Language-Loggian fashion as "tiny little poems, a symptom of human intelligence and creativity." Indeed, the whole thrust of Erard's argument that verbal blunders "provide a window into what humans really are" is conveniently ignored.

Then again, we should perhaps be grateful that eggcornology is getting even cursory attention from a mainstream magazine like Allure. In Mark Liberman's LSA talk on "the future of linguistics," he imagined a time when mass-media treatment of linguistics could be found in the supermarket checkout line, much as psychological research is handled by Psychology Today. (That magazine, by the way, ran a piece on eggcorns last year, which was laudable except for the misspelling of Geoffrey Pullum's name.) Perhaps the coming era envisioned by Mark is closer at hand than we realize.

Posted by Benjamin Zimmer at 12:56 PM

Dinner with the deep-sea divers

Back in November of 2006, Stephen J. Dubner, the journalist half of the Freakonomics blogging team, explained to his readers why he finds "Economist-Speak" problematic:

I try to keep up with the current economics literature, which means reading quite a few papers and a whole lot of abstracts. Most of the literature isn't very interesting or meaningful to me (this is simply a matter of preference); and some of it might be interesting or meaningful but I am unable to tell. Why? Because the language of economists is often -- not always, certainly, but often -- deeply obtuse.

It's a bit less clear what the topical economics hook was for his difficulties in understanding David Beaver's paper on the interaction of factive verbs and implicatures, but I suspect that he was mostly motivated by the desire to find an occasion to repeat a witty put-down:

The above paragraph reminded me ... [of] a comment once made by a grouchy New York Times writer discussing another New York Times writer who had just received a promotion: "He writes as if he were badly translated from the Croatian."

This reminds me, in turn, of a story Sylvain Bromberger once told me, to the effect that he had chosen a problem to investigate, and explored a particular hypothesis about it, all in order to be able to write a paper in which he could make use of a certain witticism that had occurred to him.

I suspect that this was not really true, but it made a good story. In any case, Sylvain's enterprise inspires me to quote from the start of a paper by Dubner's co-author, Steven D. Levitt, about the economics of asymmetric communication:

The standard principal-agent model neglects the potentially important role of information transmission from agent to principal. We study optimal incentive contracts when the agent has a private signal of the likelihood of the project's success. We show that the principal can costlessly extract this signal if and only if this does not lead her to intervene in the project in any way that will influence its outcome. Intervention undermines incentives by weakening the link between the agent's initial effort and the project's outcome. If possible, the principal commits not to cancel some projects with negative expected payoffs.

And this quotation -- which, like the passage from David's paper, is straightforward to those who know the terminology and the conceptual history, but baffling to outsiders who merely know the ordinary meanings of English words -- sets up my chance to quote a clever put-down of obtuse academic writing, from p. 190 of Michael Ignatieff's Isaiah Berlin: A Life:

In letters to E.H. Carr and Alan Bullock, Isaiah wrote scathingly about the positivist pedantry of American social science. He probably had not read much of it, but this did not prevent him from remarking that American academics wrote with all the grace of a deep-sea diver sitting down to a dinner party.

[Update -- Matthew Rankine writes:

Surely when Dubner describes academic writing as obtuse (which you later repeat), he means to say abstruse? I asked a question about this over on Metafilter a while ago, and it seems to be popping up all over the place recently.

Yes, I think this was a malapropism, though one where there is some overlap in meaning. The OED gives obtuse the sense

2. fig. a. Annoyingly unperceptive or slow to understand; stupid; insensitive. Also, of a remark, action, etc.: exhibiting dullness, stupidity or insensitivity; clumsy, unsubtle.

which might almost fit the case to some extent, depending on what he meant. However, Ken Wilson wrote that

Using obtuse as a rough synonym of abstruse is Nonstandard, and you should avoid it.

and Paul Brians also flags it as an error:

When you mean to criticize something for being needlessly complex or baffling, the word you need is not “obtuse,” but “abstruse.”

So yes, Dubner is yet another victim of the Harman/Skitt/McKean Law of Prescriptive Retaliation, or perhaps of an amendment affecting those who complain about bad writing.]

Posted by Mark Liberman at 07:00 AM

Giuliani's lisp?

According to Peter J. Boyer, "Mayberry Man", New Yorker, 8/20/2007:

It became clear at the start of Giuliani's political career that his courtroom talents -- the ability to break a witness on the stand, for example -- were not especially useful in the task of charming voters. In Giuliani's delivery style, there was no trace of the natural politician. Apart from mechanical liabilities -- including a lateral lisp that produces a slushy "s" sound -- Giuliani was impaired by a native harshness that proved resistant to the remedies of his political advisors.

It's nice to see a magazine writer using a term of art from speech pathology, where "lateral lisp" is defined this way:

Lateral lisps are not found in typical speech development. The tongue position for a lateral lisp is very close to the normal position for /l/ and the sound is made with the air-flow directed over the sides of the tongue. Because of the way it sounds, this sort of lisp is sometimes referred to as a 'slushy ess' or a 'slushy lisp'. A lateral lisp often sounds 'wet' or 'spitty'.

Unlike interdental and dentalised lisps, lateral lisps are not characteristic of normal development. An SLP assessment is indicated for anyone with a lateral lisp.

But I hadn't noticed this problem in Giuliani's speech, so I figured, maybe he got therapy for it. In search of the "before" condition, I found a Women's Coalition for Giuliani event said to be from 11/3/89, viewed 167,000 times on YouTube here. (I can't vouch for the date, but he certainly looks much younger). Here's how it starts:

There must be public funding
for abortions for poor women.
We cannot deny any woman the right
to make her own decision
about abortion
because she lacks resources.
I have also stated that I disagree
with President Bush's veto last week
of public funding for abortion.

There are plenty of /s/ and /z/ sounds there, so give a listen:

The audio quality is not the best, but maybe there's a little slushiness in his sibilants. On the other hand, it's subtle at best, and maybe I'm just primed by Boyer's description.

Or perhaps I'm just not adequately tuned in to the zeitgeist. A bit of internet searching reveals that the lisp is part of the standard story about Giuliani:

(Time magazine, 6/24/2001) Rare is the day that Giuliani's name does not appear in the papers. He is media savvy, not overtly calculating. He loves to talk (he does so with unselfconscious self-absorption), to expatiate in professorial detail (with the slightest hint of a lisp). He is also a modern haiku master who can distill a complicated answer into a crisp, 15-second sound bite.

Various sources, mostly blog commenters: I just don't see Giuliani and his lisp and his New York pedigree and his multiple marriages resonating with the red staters in a positive way.
  [A snarky YouTube question]
  Giuliani is simply George W. Bush with a lisp.
  Wow! From Tweedle Dee to someone with a Lisp
  "without 9/11, giuliani is a little man with a lisp"
  Jimmy Breslin was never fooled by Rudy the Lisp.
  So if Rudy and Ron do debate, expect Rudy to speak with a British accent (he's got the effeminate lisp down pat) and invite Paul to tea afterwards.
  The man's lisp alone disqualifies him as a real candidate.
  A physically small man possessed of both a lisp and a comb-over, Rudy Giuliani is in almost every way the sort of candidate Republican primary voters have been raised to hate...
  We really don't need a lisp talking elmer fudd look alike running this country!
  Even Karl Rove can't "Reaganize" Giuliani. He is a bald, squirrelly-looking New Yorker with a lisp.

[Update -- Marc Pelletier writes:

I think you forgot an important factor:

In the mouths of American commentators, "X has a lisp" seems to translate as "X doesn't sound like me/favored group does" and is almost unfailingly derogatory.

The examples you cited, and almost all of those I Googled myself, interpret freely as one of "X seems to have (mild) mental retardation", "X appears to be part of the wrong social class" or (supreme insult) "X sounds foreign".

I don't think the phonetics or speech physiology even enters into it.

It's not just Americans.

And the meaning of "nasal" is analogous -- for most people, it seems to mean "sounds different in a way that annoys me".]

Posted by Mark Liberman at 06:58 AM

August 23, 2007

Tutor Gods

According to Jonathan Cheng, ("In Hong Kong, Flashy Test Tutors Gain Icon Status", WSJ, 8/14/2007):

When Richard Eng isn't teaching English grammar to high-school students, he might be cruising around Hong Kong in his Lamborghini Murciélago. Or in Paris, on one of his seasonal shopping sprees. [...]

...a popular [English-language] tutor might teach 100 students in a single lesson, each paying as much as $12.50 to be there. So a tutor working 40 hours could gross $50,000 in a week. "It's a big business," says Ken Ng, a well-known tutor god. "That's why I'm driving my second Ferrari."

The motivation?

Hong Kong parents are often desperate to help their children succeed in this city's pressure-cooker public-examination system, which determines students' college-worthiness. That explains why many are willing to pay handsomely for extracurricular help. Mr. Eng and others like him have made a lucrative business out of tapping that demand. They use flashy, aggressive marketing tactics that have transformed them into scholastic pop stars -- "tutor gods," as they're known in Cantonese.

[via Victor Mair, who comments: "It's important to know that this is going on in HK, because what happens in HK presages what will happen in the PRC. "]

[And by the way, Murciélago may be the name of an Italian car, but it's a Spanish word -- meaning "bat", in the flying mammal sense. And in case you're wondering, as I did, why Lamborghini would name a hot car after a small nocturnal insectivore, generally (if unfairly) regarded as slightly creepy, they didn't. They named their car after a bull, who was named (in Spanish) "bat".]

Posted by Mark Liberman at 04:24 PM

The NYT transgresses

As far as we can tell here at Language Log Plaza, the New York Times broke new ground yesterday, when it printed the word shit in a quotation from someone other than the President of the United States.  Well, piece-of-shit, but surely that counts as an instance of the word shit.  The expression occurs in "an anonymous, invective-laced phone message" (audio available here) left for Bernard Spitzer, the father of New York Gov. Eliot Spitzer; the story begins on page 1, and eventually quotes some of the nasty stuff:

In the message, the caller says, referring to a potential subpoena: "There is not a goddamn thing your phony, psycho, piece-of-shit son can do about it. Bernie, your phony loans are about to catch up with you. You will be forced to tell the truth and the fact that your son's a pathological liar will be known to all."

[But wait! Here's the Times a month ago (7/22), quoting Rudy Giuliani saying bullshit, back in 1992: "A block away from City Hall, Mr. Giuliani gave a fiery address, twice calling Mr. Dinkins's proposal "bullshit." The crowd cheered. Mr. Giuliani was jubilant."]

For some time now, we've been tracking the NYT's handling of taboo vocabulary.  The paper's policy is not to print dirty words, even in quotations where they might be relevant, and also not to use asterisking, "[expletive]", "the F word", or other standard avoidance techniques, preferring instead to allude indirectly to the banned words (or to omit the material entirely).  However, over the years the paper has relaxed its policy, allowing shit when the President says it -- first, in 1974, from Richard Nixon (Abe Rosenthal at the time: "We'll only take shit from the President"); then in 1976, attributed to a fictionalized version of Lyndon Johnson; and, more recently, from George W. Bush (details here and here).  The paper has also had to relax its policy on avoidance, in order to refer at all to book titles and the like (Harry Frankfurt's book On Bull_ _ _ _).

But now the Presidential Shit Privilege has been relaxed.  What next?

[Addendum 8/25/07: Grant Barrett writes to report that the NYT "City Room" blog of 8/23 has a note from its standards editor about this latest four-letter word: "We rarely permit the use of profanity in our columns, even in quotations. We made a rare exception in this case because we felt that readers would more easily understand why the Spitzers were so upset about the message if they knew what the language was."]

Meanwhile, here are some examples of NYT shit-avoidance (of several different styles) that we haven't previously blogged:

From Brett Reynolds, 11/21/06: in the "Science Times" section that day, "The Best Science Show on Television?":

'This is where we blow stuff up.' Jamie Hyneman -- who, to be honest, did not actually use the word 'stuff' -- stood...'

Meghan O'Rourke, review of Up Is Up But So Is Down, Book Review of 11/19/06, p. 22:

... Most of the art may have been" -- insert four-letter word here -- "but it was a glorious time.

Elizabeth Weil, Magazine story "The Needle and the Damage Done", 2/11/07, p. 50:

One member described the execution team's training by saying: "Training? We don't have training, really." A nurse responsible for mixing the drugs, when asked how much she knew about the anesthetic, said: "I don't study. I just do the job. I don't want to know about it." Another team member dismissed mistakes by saying that "[expletive] does happen."

On another taboo front, the NYT walked a fine line in a review of the show "Nigger Wetback Chink", 6/9/07:  it quoted all three of the ethnic slurs in comments from the actors, but gave the title as "N*W*C*".  (From Ben Zimmer, who got it from Grant Barrett.)

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:08 PM

Syllepsis Today

Craig Daniel, quoting a friend on AIM:

"i've gotta get my glasses fixed and a haircut but we can definitely do stuff this week"

Steven Taschuk sent in this, from metafilter:

I haven't seen Brazil sinnce [sic] I was a kid and Battle of Algiers at all.

Steven comments:

The question of why this sentence is so jarring (I recklessly assume that you too find it so) struck me as something Language Log might take an interest in.

Certainly -- see here and here and here -- but unless you'd been reading Language Log over the years, how could you tell that we've written about this topic many times? Well, by searching for the term syllepsis, coined by rhetoricians a couple of millennia ago. Providing a handle for text search is a modern bi-product of the old human propensity to invent names for things.

[Update -- Marcus Hum writes:

Speaking of syllepsis, I've been meaning to send in the following example to Language Log for examination for some time now. It comes from the cartoon "Futurama". The speaker is a referee/ring announcer introducing a menacing robot in an Ultimate Robot Fighting bout:

  And in this corner, from and made of Parts Unknown: The Clearcutter!

(The script for the entire episode is here.)

Clearly, this was constructed deliberately for humorous effect like the joke headline previously cited on LL: "Teacher who starred in porn movie a decade ago wants forgiveness, it harder, faster, OH GOD YES".]


Posted by Mark Liberman at 07:41 AM

Eugene Volokh re-discovers eggcorns

And invites his readers to contribute.

Here's a thought: someone might add an optional "eggcorn alert" to AbiWord and OpenOffice and similar tools.

Each hit could be provided with a link to the relevant entry in the Eggcorn Database, so that errant authors could evaluate the evidence themselves.

In other eggcornological news, Jan Freeman recently contributed a compelling analysis of this comic strip's use of the term bi-products (" Mr. Boffo lays an eggcorn", Boston Globe, 8/15/2007):

If you think about it, "biproducts" might have been a more logical name than "eggcorns" for these little poems of misapprehension. But "eggcorn" is better, I think: in metonymy, evocativeness trumps logic.

Posted by Mark Liberman at 07:03 AM

August 22, 2007

Philosophy of Mind Reading Group forming

Just got an email with this title from the Cognitive Science email list here at the University of Arizona. I was really excited at first because I thought they meant

But then I read the rest of the email and I realized they just meant

Ah well! I expect it'll still be interesting.

If they'd only sent the invitation as a sound file, this would never have happened. They would have said this,

not this

and I'd have understood right away--that's what the Compound Stress Rule of English can do for you (see, e.g., the second para here).

(Hat tip to Andrew Carnie.)

Posted by Heidi Harley at 03:30 PM

Innumeracy Cannot Be Overestimated

The innumeracy that Mark has discovered among public relations professionals just scratches the surface. Some years ago I was staying with friends who are schoolteachers. On the desk in the spare bedroom was a handout, five or six pages long, from a workshop that a visiting reading specialist had presented. The subject of the handout was how to calculate the error rate on a test of reading ability. This is not hard. You take the total points possible and subtract the points received for correct answers, yielding the points incorrect. Dividing this by the total points possible yields the fraction incorrect. Multiplying this by 100 yields the error rate as a percentage. One subtraction, one division, one multiplication. Something that with perhaps minor variations (such as computing the percent correct rather than the error rate) surely most teachers have done many times. The math involved is, I believe, typically covered in North American schools in grade 5 (when students are around 10 years old).
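For concreteness, here is the whole procedure as a minimal Python sketch. The function and argument names are my own, chosen for illustration; they come from neither the handout nor any particular grading tool.

```python
def error_rate_percent(total_points, points_correct):
    """Return the error rate on a test as a percentage.

    Hypothetical helper for illustration: the names are assumptions,
    and the steps are the three described in the paragraph above.
    """
    if total_points <= 0:
        raise ValueError("total_points must be positive")
    points_incorrect = total_points - points_correct       # one subtraction
    fraction_incorrect = points_incorrect / total_points   # one division
    return fraction_incorrect * 100                        # one multiplication


# A student who earns 60 of 80 possible points has missed a quarter of them.
print(error_rate_percent(80, 60))
```

The percent-correct variant mentioned above is just 100 minus this result.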

It should perhaps be shocking that the presenter felt it necessary to prepare a handout on such a topic, but what is REALLY shocking is that the description of the procedure in the handout was wrong, and that the error could not be attributed to a simple typographical error.

The appalling level of mathematical knowledge on the part of a very large fraction of those who do not specialize in a field with a strong mathematical orientation helps to explain why otherwise intelligent people are so readily deceived by such things as claims about language and gender or crank hypotheses in historical linguistics.

Update: Posts on related topics can be found here, here, and here.

Posted by Bill Poser at 01:29 PM

This could explain a lot

Among the most important critters in the ecology of science journalism are the "public relations" or "communications" specialists who write the press releases for universities, corporations, scientific societies and so forth. As far as I can tell, nearly all the science stories that make the news are picked from among the press releases that flow through conduits like EurekAlert. And in most cases, what's published or broadcast features the general spin and also the specific details -- quotes and numbers -- provided by the press releases.

The people who write the PR materials have a hard job. They need to take complicated results, full of background assumptions and layers of caveats, and present them in a form that non-specialist journalists will understand, and will find interesting enough to choose from out of the flood of competing alternatives.

The PR people are not specialists themselves, and in many cases may not have a lot of scientific or mathematical background. But I was still surprised to see this, on Dan Santow's excellent Word Wise blog ("Writing tips for public relations professionals"):

I just came across an online tool that could help writers (who are not known for their mathematical skills - or is it just me?) that’s really worthwhile (especially if you work in financial communications). It’s called the percent change calculator and all you do is enter a number, then the number it changed to and it tells you the percent change. No long division, no embarrassing yourself by asking colleagues if they know how to figure out the percent change from 745 to 13 (98.3 percent decrease), no feeling really old and decrepit. It’s divine.

I gather from this that there are educated, intelligent and otherwise skilled adults who are not sure how to turn two numbers into a percentage change, and that some of them are working as public relations professionals. I'm not trying to be snarky, I'm just truly and sincerely surprised.
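For anyone who would rather not depend on the online tool, the calculation amounts to one subtraction, one division and one multiplication. Here is a minimal Python sketch; the function name is mine, and I am assuming the usual convention that the change is measured relative to the starting number.

```python
def percent_change(old, new):
    """Percent change from old to new, relative to old (negative means a decrease)."""
    if old == 0:
        raise ValueError("percent change from zero is undefined")
    return (new - old) / old * 100


# The example from the Word Wise post: from 745 to 13.
change = percent_change(745, 13)
direction = "decrease" if change < 0 else "increase"
print(f"{abs(change):.1f} percent {direction}")  # prints: 98.3 percent decrease
```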

[Update -- Ed Keer is surprised that I'm surprised:

Let's see if I can help Mark understand. First, in my experience most writers in PR, advertising, or communications have humanities educations. Remember back in college there were all those people taking English courses and their parents would say, "What are you going to do with that?" Well, that's what they're doing with that. And second, if the last time you were asked to figure out a percentage change was 20 years ago, when you were more focused on the cute potential mate in the seat across the way, you might be a little rusty.

But, but, percentages are taught in the 5th grade, more or less, well before most students are distracted by mating opportunities. And they play a role in a number of everyday activities, like tipping and sales discounts and the like (though I admit that those can easily be faked). So this seems unfair and disrespectful to English majors, who are generally smart people. I've known quite a few English majors who could calculate percentages better than I can.

And Holly Cordner, who is a college student "studying English" (well, and also "information systems"), agrees:

If you can’t even do elementary math, how did you get through your formal education? How did you get a college degree? No one expects journalists to be the next Isaac Newton or anything, but they should at least be math- and science-literate enough to be able to recognize blatant errors in either calculations or reasoning.

But Zeno at Halfway There has some hair-raising anecdotes about people in positions of authority who demonstrate complete failure to understand elementary percentages and related concepts. And Dick Margulis blogged some serious innumeracy in the 1/22/2007 issue of the New Yorker (though percentages are not crucially involved, just an apparent inability to detect a factor-of-a-thousand error in the sum of a set of numbers given as components of total energy expenditures in the U.S.).

When you add this to Bill Poser's anecdote about a case in which working teachers were felt to need instruction in how to calculate the percentage error in students' test results, I guess that I have to stop being surprised. There's a prima facie case that there are some high-functioning adults who are seriously percentage-impaired.

If this is true -- and it would be nice to have some real evidence about how many people are affected, not just suggestive stories -- then we've got a major disability on our hands that has somehow escaped the process of identity politics. Where are the lobbyists, counselors and support groups for the percentage-impaired? Where are the stories about the effect on our GNP of the financial-communications professionals who worry about "embarrassing yourself by asking colleagues if they know how to figure out [a] percent change"? Seriously, I worry that people with this problem are embarrassed to admit it and to seek help. In some cases, they may not even realize that they need help. But they do. ]

Posted by Mark Liberman at 10:19 AM

A Little Date Puzzle

One of my books, published during the Second World War, is dated 2603. Where was the book published?

(Answer below the fold.)

The book was published in Japan. Most books published in Japan at this time, at least the majority of those I have seen, are dated in the usual Japanese dating system, in terms of the year of the reign of the Emperor. The Emperor during the Second World War is known in the West by his given name Hirohito (裕仁), but in Japanese he is known as the Showa Emperor (昭和天皇), and it is this name that is used for dates. In Japanese there are very few circumstances in which it is appropriate to address or refer to a man whom one has not known since childhood by his given name. To refer to the Emperor by his given name is so rude that it offends me, and I am neither Japanese nor particularly sympathetic to monarchy.

The Showa Emperor's reign began in 1926. The year 1943, in which this book was published, is therefore Showa 18 (昭和 十八). This book, ironically, one in English, is dated from 660 BCE, the beginning of the reign of the semi-mythological Emperor Jimmu (神武天皇), the first Emperor of Japan. 660 + 1943 = 2603.

The present Emperor, referred to as Tenno Heika (天皇陛下) "His Majesty the Emperor", will be known after his passing as the Heisei (平成) Emperor, and this is also the name by which the current era is known. Since he ascended the throne in 1989, the current year is Heisei 19 (平成十九).
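The arithmetic behind both dating systems is simple enough to sketch in a few lines of Python. This is only an illustration: the table holds just the two eras mentioned above, the function names are mine, and the sketch checks neither when an era ended nor the exact day within the year on which a reign began.

```python
# First Gregorian year of each era mentioned in the post (year 1 of the era).
ERA_START = {"Showa": 1926, "Heisei": 1989}


def era_year(era, gregorian_year):
    """Year within a reign era; the accession year counts as year 1."""
    start = ERA_START[era]
    if gregorian_year < start:
        raise ValueError(f"{gregorian_year} precedes the {era} era")
    return gregorian_year - start + 1


def imperial_year(gregorian_year):
    """Year counted from 660 BCE, the traditional accession of Emperor Jimmu (CE years only)."""
    return gregorian_year + 660


print(era_year("Showa", 1943))   # prints 18
print(era_year("Heisei", 2007))  # prints 19
print(imperial_year(1943))       # prints 2603
```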

Posted by Bill Poser at 01:34 AM

August 21, 2007

Sve je semantika

When I got back from my summer travels last week I learned that NYT journalist Stephen Dubner, coauthor with Steven Levitt of the wonderful book Freakonomics, had happened on something I'd written. This is what he wrote in the NYT's Freakonomics blog:

Stephen J. Dubner
It's All Semantics
For reasons that may not make sense to anyone else, I recently performed a Google search for "They Might Be Giants" and "Belly Button." This was the second hit: a paper by a Stanford linguist named David Beaver (that's not an aptonym, is it?) called "Have You Noticed That Your Belly Button Lint Colour Is Related to the Colour of Your Clothing?" Here is the abstract:
Karttunen identified a class of semi-factive verbs. This was erroneous, but enlightening. Stalnaker and Gazdar explained Karttunen's data as involving cancellation of presuppositions as a result of pragmatic reasoning, an account reformulated by van der Sandt. In this paper I present a large number of naturally occurring examples bearing on the question of how factive verbs interact with implicatures, and show that many of these examples are problematic for existing accounts. I end by presenting suggestive evidence involving the relation between presupposition and information structure.
I love living in a society that values this kind of research. But I also think it is funnier than Woody Allen's best writing. The above paragraph reminded me a bit of some earlier economics papers I discussed, as well as a comment once made by a grouchy New York Times writer discussing another New York Times writer who had just received a promotion: "He writes as if he were badly translated from the Croatian." If anyone can translate the abstract above out of the Croatian, and additionally tell me how it relates to belly buttons, I'd be most obliged.

(Side note: I'm now a University of Texas linguist, not a Stanford linguist. A better weblink for me would be this one. And there's more about my non-aptonymic name here.)

My mother thought it was wonderful to be mentioned in such a widely read forum. I am a little bemused by the piece. My wife says it's nothing to be proud of. And the jury of public opinion, as rendered in the voluminous comments thread to Dubner's post, is hung as regards whether I deserve to live or not.

Anyhow, two reasonable requests from Dubner. First, to translate out of the Croatian. And second to explain the belly button link.

My Croatian is not good, which is why the original has until now been suppressed, but, modulo font rendering issues, here it is:

Karttunen identificiran razred od polu - tvornica glagol. Ovaj je kriv , ali rasvijetliti. Stalnaker i Gazdar objasniti Karttunen's podaci kao uklju?u?i poništenje od predmnijevanje kao rezultat od pragmati?an zaklju?ivanje , ra?un reformulated mimo kombi der Pijesak. In ovaj papir Ja prisutan velik broj od prirodno dvokratan primjer nošenje na pitanje kako tvornica glagol djelovati me?usobno sa implicatures , i pokazivanje taj mnogobrojan od te primjer jesu problemati?an za sadašnjost ra?uni. Ja kraj mimo prisutan sugestivan jasno?a uklju?u?i povezivati se izme?u predmnijevanje i obavijest struktura.

My own translation into English is obviously inadequate. That's what caused the problem in the first place. So instead, here is InterTran's translation into Mr. Dubner's favorite language:

Karttunen identified learner with semi factory verb. This had wry, limit lamp. Stalnaker plus Matron unfold Karttunen's data as a including that undoing with presumption as a upshot with pragmatic chain of reasoning, bill reformulated past van der Sand. In this paper I present swarm with truly occurring Primakov wear at an issue of how factory verb interact from an implicatures, plus show this many with these Primakov are problematic for present bill. I end past present suggestive intelligibility including that relate to betwixt presumption plus notice legality.

No doubt this is exactly what Dubner sought, a variant abstract which is completely free of the lingua-speak which I use to prevent myself being understood. The new version uses presumption for presupposition, factory for factive, and Primakov for the technical term examples.

Now the second request: how does the paper relate to belly buttons? Well, not at all, really. The title, as one commenter observed, is just an example I came across. It was taken from a survey used in a groundbreaking piece of navel gazing research by one Dr. Karl: he won an IgNobel prize for it. One question on this survey, Have you noticed that your belly button lint colour is related to the colour of your clothing?, appears to take for granted (presuppose) that the respondent's belly button lint is related to his/her clothing color. Why? Because the question includes the verb notice, and this is one of many verbs which commonly come along with a presumption that the stuff sitting next to it is a fact. That's why we call them factive verbs. Or, at least we did. It turns out that we should call them factory verbs. What I found curious was that nothing about lint color was actually presumed in the survey: it was apparently intended to be neutral as to whether there was any connection between belly button lint and clothing. Indeed, while the belly button research turned up many strong results ("It seems as though the Snail Trail has something to do with BBL levels"), there was no clear connection established with clothing color. From Bellybutton Lint - The Results:

About 37% of people with BBL said that the colour of their BBL was related to the colour of their clothing. About half of these people had blue BBL. Most people wear various shades of blue. But we really can't explain why some people consistently have BBL in a colour that is not present in their clothing.

What I was wondering about in that paper Dubner found was this: under what conditions does someone using a factive factory verb take the stuff sitting next to it for granted. I found that no existing theory covered the ground, and speculated that... well, I'm afraid I'm boring you: if you really want to know, you can ask me, or read the paper.

It seems that my unfortunate predilection for difficult words combined with the title of my paper to get me into an awful and very public mess. In fact, Dubner has taught linguists the world over two valuable lessons. We must only use easy words, and we must be very careful in our choice of Primakov.

[Hat tip to Arnold Zwicky and Karin Golde for pointing me to the Dubner piece.]

Posted by David Beaver at 11:38 PM

Explaining the systems slowdown

Walt Bettinger, the president and chief operating officer of the Charles Schwab brokerage, sent out a personally signed message to customers on August 18 in which the following was the main content:

On August 16, you may have experienced difficulty accessing Schwab online or through our phone centers as a result of our systems slowdown. We want to apologize and let you know what happened.

In the process of expanding system capacity earlier that morning, a systems slowdown occurred. As a result, if you tried to access Schwab online or via the phone, you may have experienced slower-than-normal service, or in some cases, may have been unable to access our services at all.

So let me summarize this for Language Log readers, in case you didn't get that. There was a systems slowdown, you see, and the explanation turns out to be that while system capacity was being expanded a systems slowdown occurred, with the result that people found (indeed, you yourself may have found) that the systems slowed down. So that's the news from the Department of Redundancy Department at Charles Schwab. It makes you wonder about the people who draft letters for chief operating officers, doesn't it?

Seriously, there really is a linguistic incoherence here, and to at least some extent I can point out to you where it lies. The phrase our systems slowdown in the first paragraph is a definite singular noun phrase, so it is being presupposed that there is a unique and identifiable systems slowdown that you already know about. But a systems slowdown occurred, in the second paragraph, has an indefinite noun phrase as subject, appropriate only under the assumption that no identifiable systems slowdown has been present in the discourse so far. So while expecting an explanation, we progress backwards in our information state, from being presupposed to be aware of the systems slowdown to being presupposed not to have known about it. Not ungrammatical; just utterly incompetent at the level of discourse design and paragraph content.

Posted by Geoffrey K. Pullum at 11:43 AM

That was a stupidity, certainly

In their neverending quest for fresh sources of filter-fooling text, spammers have apparently started to mine Esperanto translations of Russian science fiction novels.

Gene Buckley sent in this example that arrived in his inbox (and mine) a few days ago, having succeeded in fooling the local spam filters:

Subject: Tio estis stultajxo, certe, kvankam kauxzita de katastrofaj sociaj movigxoj de la dua kvarono de la antauxa jarcento.

Gene notes:

Based on what I learned more than 20 years ago (and the apparent convention that "x" means "accent previous letter"), it says "That was a stupidity, certainly, although caused by catastrophic social movements of the second quarter of the previous century." Properly, <Tio estis stultaĵo, certe, kvankam kaŭzita de katastrofaj sociaj moviĝoj de la dua kvarono de la antaŭa jarcento.>
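The "x" convention that Gene describes is mechanical enough to undo in a few lines. Here is a minimal Python sketch that maps each letter-plus-x pair back to the accented letter; the function name and table name are mine, and the sketch assumes that every "x" following c, g, h, j, s or u is such a marker.

```python
import re

# The six Esperanto letters that take a diacritic, keyed by their base letter.
X_SYSTEM = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}


def from_x_system(text):
    """Replace x-convention digraphs (cx, gx, hx, jx, sx, ux) with accented letters."""
    def restore(match):
        base = match.group(1)
        accented = X_SYSTEM[base.lower()]
        # Keep the case of the base letter.
        return accented.upper() if base.isupper() else accented

    return re.sub(r"([cghjsuCGHJSU])[xX]", restore, text)


subject = ("Tio estis stultajxo, certe, kvankam kauxzita de katastrofaj "
           "sociaj movigxoj de la dua kvarono de la antauxa jarcento.")
print(from_x_system(subject))
```

Run on the subject line above, this yields the properly accented sentence that Gene gives.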

A quick web search finds the source, apparently the fourth chapter of the (translated) novel Gravitavio «Carido» by Vjacheslav Rybakov.

The message was sent to a local mailing list that we both subscribe to. I'm puzzled about why it got through the spam filters on our mail server, because the body of the note uses one of the usual recent text-hiding techniques, starting with the headline:


and continuing with a screenful of stuff like this

N*o_t o_n'l'y d,o'e_s t_h,i,s f+i.r*m h.a.v-e fun,damental's,
b+u_t getti*ng t'h+i*s oppor*tun_ity at t'h,e righ_t t*i.m+e*,
righ t befor'e t.h'e is w_h.a_t m+akes t'h,i+s d-e.a+l so swee*t!
T-h-i,s a grea+t opp*ortu_nity to at leas,t dou ble up!

I thought that the current generation of spam filters looked at character n-grams, among other features -- surely this style of substitution should be easily identifiable by that technique -- if you know exactly what weakness the spammers are exploiting here, please tell me.
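To make the puzzle concrete, here is a rough Python sketch of how easily this particular disguise can be detected and undone. It is only an illustration under my own assumptions: the set of inserted characters is guessed from the sample above, the function names are mine, and I make no claim that any real spam filter works this way.

```python
import re

# Runs of the punctuation seen in the sample, when wedged between two letters.
INTRA_WORD_JUNK = re.compile(r"(?<=[A-Za-z])[*_',.+\-]+(?=[A-Za-z])")


def strip_intra_word_junk(text):
    """Remove punctuation runs that sit between two letters."""
    return INTRA_WORD_JUNK.sub("", text)


def junk_rate(text):
    """Intra-word punctuation runs per letter: a crude obfuscation score."""
    letters = sum(ch.isalpha() for ch in text)
    if letters == 0:
        return 0.0
    return len(INTRA_WORD_JUNK.findall(text)) / letters


sample = "N*o_t o_n'l'y d,o'e_s t_h,i,s f+i.r*m h.a.v-e fun,damental's,"
print(strip_intra_word_junk(sample))
print(junk_rate(sample) > junk_rate("This is a normal sentence, isn't it?"))
```

The cleanup is lossy, since it also removes legitimate apostrophes and hyphens inside ordinary words, so the score is probably the more useful half of the sketch.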

Posted by Mark Liberman at 05:45 AM

August 20, 2007

"Don't say 'tin' to Rebecca, you know how it upsets her"

Chris Pauls wrote to point out that word aversion, "like pretty much every other subject, has found its way into Monty Python's Flying Circus".

Here's the sketch:

and a transcript is here.

I've certainly seen that sketch, but I guess it never occurred to me that the word-aversion part of it was any more true to life than the part of it about shooting caribou on the lawn or wiring up a fakir as a remote control.

[Update -- Jake Schneider writes:

In terms of your Python post, don't forget the much more famous example of Monty Python's playful approach to word aversion--The Knights of Nee's inability to stomach the word "it." Of course, in that example, the word involved is far too common to be remotely realistic.

How could I have forgotten? Here's the transcript:

Knight of Ni: Then, when you have found the shrubbery, you must cut down the
              mightiest tree in the forest...
              Wiiiiiithh....  A HERRING!
(minor music)
Arthur:       We shall do no such thing!
Knight of Ni: Oh, please!
Arthur:       Cut down a tree with a herring?  It can't be done!
Knights of Ni: AAugh!  AAAAAH!  Oww!! (writhe in pain)
Knight of Ni: Don't say that word!
Arthur:       What word?
Knight of Ni: I cannot tell; suffice to say, it is one of the words the
              Knights of Ni cannot hear!
Arthur:       How can we *not* say the word if you don't tell us what it is?!
(Knights of Ni are in pain again)
Knight of Ni: Ahhhh! 'E said it again!
Arthur:       What, "is"?
Knight of Ni: No, not "is"!  You wouldn't get very far in life not saying
              "is".
Bedevere:     My liege!  It's Sir Robin!
Sir Robin and his minstrels "ride" up.
Minstrels (singing):  He's sacking it in, and packing it up,
                      and sneaking away, and buggering up,
                      And chickening out, and pissing a pole...
Arthur:       Sir Robin!
Robin:        My liege!  It's good to see you!
Knight of Ni: Now *'e* said the word!
Arthur:       Surely you've not given up the quest for the Holy Grail!
Minstrels, by way of answering:
                      He's sneaking away, and buggering up,
Robin:         Shut Up!
               No no, no, far from it!
Knight of Ni:  'E said the word again!
Robin:         ...I was...looking for it...
Knights of Ni: AAAAAAAuugh!
Robin:         uh, here--here in this...forest.
Arthur:        No, it is far from this place.
Knight of Ni:  Aaaaaaugh!  Stop saying the word!!!!
Arthur:        (getting really annoyed with the Knights of Ni) OH, STOP IT!!
Knight of Ni:  Ow!  He said it again!
Arthur:        Patsy!  (motions all of his party to move on)
Knight of Ni:  Wait!  I said it!  I said it!
               Oh!  I've said it again!
               And there again...that's three hits!
Arthur, Bedevere, and Sir Robin ride off with the minstrels and Patsy.


[Update #2: Philip Downey sends a link to The Frantics "Dirty Words" skit.]

[And Keith Ivey reminds us that you're not supposed to say "mattress" to Mr. Lambert. ]

Posted by Mark Liberman at 11:42 AM

Ask Language Log: The moist panties phenomenon

Greg Sabin wrote:

I'm writing to ask you about a certain word association quirk that seems to affect my wife and a few other women that I know. The issue centers around the word "moist." Both my wife and a close friend (also female) cannot stand the word, either written or spoken. (As you can imagine, this makes watching cooking shows rather difficult.) They are totally fine with "moisturizer," but cringe and shudder at "moist," or even "moisten." Another female friend has a similar aversion to "suckle."

So two questions: 1. Is this a phenomenon with which you are familiar? Have there been any studies about this type of "word aversion?" and 2. Is this an issue that is more likely to affect women (since I know of no men who have similar aversions)?

I've never come across it before, but a little web searching suggests that this sort of thing is pretty common. Many people cite a reaction to moist, along with some other specific words such as panties. I haven't found any systematic studies -- if you know of any, please tell me. Nor do I know whether there are sex differences. The web search found mainly female reactions, but that may be because of the way I searched.

Note that we are NOT talking about the kinds of "word rage" often discussed here, where people get angry at jargon or slang associated with a despised group, or upset because a word or phrase is felt to be incorrectly used, or annoyed at language that they perceive as redundant, or overly complicated, or pretentious, or a cliché, or trendy, or politically incorrect. Rather, these are cases where someone finds a word "revolting", "ugly", "disgusting" in itself.

We're also not talking about ethnic slurs, or words that are sexually or religiously taboo or offensive (though some of the words do have sexual associations, even if weak ones).

Sometimes people say that they "hate" these words, but this seems less the angry kind of "hate", and more the "cringe", "shudder", "shiver", "gives me the willies" kind.

(The discussions that I've found on the web don't distinguish this "moist words" reaction from anger and annoyance at jargon, slang, word misuse, rudeness and taboo violations -- the quotes below are selected and edited so as to focus on the topic of interest here.)

Thus Lisa at Put Zee Candle Beck wrote ("Word aversion", 12/1/2006):

There is one word that I hate above all others. If I come across it, I must immediately declare my hatred of it to anyone who is there to listen. If there’s no one around, I’ll resort to primal arghing and hit the page where the word resides.

The word is . . . hardscrabble.

I don’t have a logical reason for hating this word. I haven’t had a traumatic experience with it in the past.

I simply find it revolting. It’s ugly.

Among the reactions from her commenters:

Anonymous: Luggage. Can't stand that word. Luggage. It just feels gross.
Raul: I hate the word pugilist.
grudge girl: Tissue. *shiver* It just gives me the willies.
Anonymous: my girlfriend's sister hates the words "moist fist" used together. we think it says something about her...
Steph: Panties. It always has a sort of creepy pedophila connection for some reason or other. Undies is fine. Thong, okay, just never panties. Ugh.

And back on 6/2/2003, in the perfect world forum, "kismet" wrote:

You know they're out there, words that make you need a cool cloth to your head when someone says them aloud. Words that can ruin enjoyment of a decent novel.

Share your (moist, creamy) instruments of torture here.


Obviously, I hate moist. And creamy.
I'm not too fond of fleshy, either.

Among the 1,694 responses:

Elizabeth Barrett: moist and panties. Either separately or in conjunction. Blech.
Em T.: My mother hated gut. Would not let us say it, as if it were the worst word in English.
Maizie B.: goosepimple
kismet: Oh, I hate panties too! Everybody I know has to refer to them as underthings.
VanPear: I guess it's two words, but mother's milk squicks me terribly.
susan b.: I hate "chunk" and "chunky". I also hate "wedge". "Cut into wedges". "serve with a wedge of cheese". Ick. I also hate "moist". And I dislike "meal".
Eustacia: Baffle. Squab. Cornucopia.
Reggae Junkie: Big toe. Navel. Armpit. Lunch meat. insert. bra strap
Harri P.: Boob. Panties. Swimsuit.
kismet: Giggle. Hate hate HATE giggle. With the concentrated hatred of a thousand hate filled suns.
BlueBirthday: I hate the word "moist" so much I can hardly type it.
sohcahtoa: I hate "gig", "motif", and "whimsy". No rational reason, just hate them.
Diana Barry: "Navel" and "furtive." Ugh.
Cathy Georges: Also clabber and squall. And plumbago.

On 8/13/2007 in the cincymoms forum, "UndomesticGoddess125" asked:

Is there any phrase or word in the English vernacular that just makes your hair stand on end? ... I really hate the word "egg". It feels weird in my mouth.

Among the responses:

babsmama: the words moist and yeast, not sure why, but I cringe every time I hear those.
gwbjdk: The word Crudd (sp?), in place of dirt, and the word Chunks. It just sounds so gross. My husband will use them sometimes, just to see me give that yucky face I always give when hearing them.
Chrissy341: I don't really have any, but my sister hates the words moist and the word panties (not together lol). So whenever I have cake I always tell her how moist it is so #1 she grimices (sp?) and #2 more cake for me!!!
WestChesterMommy: I hate the word "moist" too! How odd.
mamadejjm: "moist" (yuk - hate even typing it). "p_ssy" (put a U in the blank). "Panties" (just ick).
2OBoys: Moist. YUCK. Don't know why.
Kimberlydawn: I absolutely hate the word panties. I don't like to read it or hear anyone say it.

Some more random anti-moistness:

In this interview with J.T. Ellison, she says

EE: What is your least favorite sound? Or word?
JT: I cringe every time I hear the word 'moist'. I just don't like it. It's icky.

On 12/18/2002, Michelle Howey asked "What would you like to rename? What words bother you?" and got these responses, among others:

Megan: The word “moist” bothers me. It’s funny, ‘cause on Jeopardy the other day, they had the topic “Moist Things.” I guess David Letterman had a top 10 of topics that would never be on Jeopardy, and just to spite him, Alex Trebeck had it on there. Or the writers on the show did, rather.
I hearby cancel the word moist. You can now only use the word ‘damp.’
Other words I hate: 1) creamy 2) catsup (it’s fuckin’ ketchup, ok?) 3) goiter
becca: megan, it’s funny that you said moist, because i can’t stand that word either. can’t even say it.
mihow: I actually wrote about hating the word “moist” before on here. I started to today and then erased it for fear of repeating myself.
freakgirl: “slacks.” I HATE that word.
tobyjoe: i swear i have known at least 5 girls in my life, aside from present company, who cringe at the words “moist” and “panties”
i dare you to find a guy who cringes at those two words (invalid if you add the words “of Oprah”)

And the mother lode of moist aversion -- on 7/8/2007, Heather Hunter at This Fish Needs a Bicycle interviewing her "favorite commenter", Mike from Chicago:

Q: What's your least favorite word (mine is a tie between 'fudge' and 'moist.' Though, 'panties' is pretty excruciating, too) ...
A: That's easy. Least favorite word: conduit.

Among the comments:

ladykatya: My sister also hates the words 'moist' and 'panties'. I, however, find it HILARIOUS to walk behind her at the mall whispering over and over "moist panties. moist panties. moist panties". She turns this lovely shade of red as she gets more upset with me. ;)
Sunshine: Hey Fish, 'moist' (eeewyuck!) is one of my cringe words - I have a personal vendetta against it....and everytime I say 'panties' I frown.
mel: i too hate the word moist. almost as much as i hate the word fondle.
the other amy: I hate the word moist too. And squat. And sprinkle. All disgusting on their own, but really awful combined.
DLB: Moist is pretty bad, and was my cringe word for along time, then my ex introduced me to the word "chum".
Ms. Tabitha: I love that you hate the word "Moist." I have a few friends like that, and the rest of us can't help but say it with every opportunity.
Kristin: Ditto the word moist, I also strongly dislike the word cream.
mindy: Moist is totally my least favorite word! I didn't know we had so much in common. (I just physically cringed when I wrote it, as I did when I read it above) My guy friends, and pretty much any aquaintance I have knows this, too, and they use every opportunity they have to use it in a sentence.
kristen: add me to the 'i hate moist and panties and moist panties' club...

Questioned about her choices, Eustacia at the perfect world explained:

I really just don't like those words. I don't dislike their meanings, but phonetically they conjure up all sort of unpleasant textures. 'Baffle' sounds fibrous and tough like a mat of hair, and 'Cornucopia' is too much like 'corpulent'. I won't get started on 'squab'.

I probably need counseling for this, don't I?

This reminds me somewhat of the way that people talk about synesthesia -- I wonder whether there's any connection.

Summing up the linguistic side of this word-willies phenomenon, we observe that some people develop a strong aversion to certain words, without any obvious reason. The words in question are not taboo in the culture at large. Women seem to be more likely to have this reaction, though perhaps they are just more likely to talk and write about it. Sounds and sound associations may play a role (the diphthong usually spelled 'oi', certain consonant clusters, etc.); semantic associations may play a role (slimy textures, lower-body garments like panties and slacks); but the process seems pretty random and erratic, also hitting on random-seeming words like hardscrabble, baffle and tissue. Nevertheless, certain specific words (such as moist and panties in English) seem to be frequent victims. This lexical specificity could be because the process is more deterministic than it seems, or because of cultural transmission that doesn't reach the threshold of creating new lexical taboos, but does create a widely-shared aversion to particular words well above chance levels.

[Several readers wrote to draw my attention to the moist v. used item on an internet gender test -- here's the first of these notes to arrive, from bexquisite:

I enjoyed reading your LanguageLog post on gender-based aversions to certain words and was interested to note that according to's Gender Test, which requires users to answer 50 multiple choice questions, which they claim will predict their gender, women who have taken their test (8,028,608 people in total have taken the test) did seem to find the word moist grosser than the word used, relative to men.


For instance, here is a live stat from the test:

Which word is grosser?
#27      Moist   Used
Men      48%     52%
Women    56%     44%

Most of the other questions are similarly random (another stat they cite is that men are less likely than women to realise that clams are alive, for example) and, for the record, the test incorrectly predicted me as a man (though with only 4% certainty so I guess they think I'm pretty borderline).

And Jessica Pease commented

I remember, back when the quiz first made the internet rounds that it sparked a discussion on several forums/blogs (I couldn't tell from the site how old the quiz was, but it claims to have been taken over eight million times, and, as I recall, it's pretty old in internet time) as to this phenomenon. I wonder if that's an inadvertent cause of the current moist dislike (which I personally can't remember hearing about previous to the quiz/discussion, not that that means anything), or simply another symptom.

The quiz, by the way, pegs me as statistically a man, even though I'd identify myself as a woman.


[Update #2 -- Mrs. Chili wrote:

It's a cultural reference, and I'm not sure it'd be helpful to your inquiry, but there's a show called Dead Like Me (I think it runs on the Sci Fi channel) where a character has an aversion to the word "moist."

The show is about a young woman who is killed in a freakish accident and becomes a grim reaper - someone who collects the souls of the soon-to-be departed just prior to whatever freakish accident will claim their lives. As the main character is adjusting to her new position, she visits her house and, in one episode, actually enters the house and rearranges some refrigerator magnets to spell the word. Her mother sees it later in the episode and seems caught between that initial reaction of being angry at being baited (she and her now dead daughter didn't get along so well in the end) and wondering just how in the hell the word got there in the first place, given that the kid who would have done the letter arranging is dead.

I don't know if that's helpful at all, but it was the first thing that came to mind when I read the article about women having an aversion to the word "moist." Just as an aside, I'm a woman, and that word doesn't faze me at all.

Wow, cringe words are such a cultural commonplace that SciFi-channel writers have a ghost rearrange refrigerator-magnet letters to spell one out -- and I'd never heard of them!]

[Update #3: David Donnell writes:

My ex-wife hated the word "britches".

A girlfriend years before hated the word "wholesome".

And my mom hates the word "crud".

Sure seems like a female thing--anecdotally at least.

And John Cowan writes:

I ran into this once long ago when I was trying to adjust an application that my team had written for a user's preferences. The user told me what was inconvenient about the app, and I cheerily replied, "Okay, we'll just tweak that for you." She was highly offended, and I couldn't figure out why; I tried to switch to "modify" or "adjust", but "tweak" still slipped out once or twice.

Later, I told my wife the story: she suggested that the problem was the association of "tweak" with "nipple(s)"; I couldn't think of anything better either. Nowadays, of course, [tweak UI] and [tweak Firefox] get far more ghits than [tweak nipple].]


[Update #4 -- Jill Lundquist reports a cross-language reaction:

I didn't think I had a reaction to "moist" until I ran into it in a Mandarin Chinese context (I've studied Mandarin for several years). A friend of mine's name includes the character 润 (run4 in pinyin), which translates as "moist". I was mortified learning her name, and found myself wanting to cover my face with my hands, amazed that it didn't sound as lewd in Chinese as it did to me. Fortunately I enjoy learning cross-cultural differences. This is rather like having to learn that 'thick' connotes generosity rather than stupidity, only more embarrassing.]


[Update #5 -- Ben Zimmer notes that on 5/17/2007 SarahJane at The Rain in my Purse posted about "conjuring gross and beautiful things", and observed that "People hate moist, but have no reaction to mist, or hoist or joist." ]

[Update #6 -- Neven Morgan, who is male, writes:

Excellent article today. I just thought I'd add that while I've hated the word "moist" for the longest time, there's a fouler word:


It has the same mouth feel as "moist", yet it's somehow worse. Must be that "ntm".

I don't think that there are any words that give me this sort of reaction, other than by association with their meanings -- but similarly, I don't see letters and numbers as colors, though I know that some people do.]

[Update #7 -- Lance Nathan wrote to draw our attention to a 1999 Salon article "The Name Game" (from the height of the dotcom bubble), which includes this passage:

It seems that when Altman and Manning presented the name Jamcracker to a client recently, the reception was not everything they had hoped for.
"I put the name up in front of their creative people," Manning says. "There were a couple of women sitting in. One of them got up and said, 'Oh, that's disgusting.' Another said, 'This is really sick.' I said, 'Excuse me, what are you talking about?' They said, 'We can't explain it, but that name is just creeping us out. We don't know what it is, but could you take it off the wall, please?'" Manning remains mystified by the incident. "There's apparently some strange, uncomfortable meaning attached to it in the minds of some women," he says. "God knows what that could be."

Lance points out that

this didn't stop from taking the name. (I hope it's working for them. I have no idea what they do.)]


[Mary Louise Ray writes:

My nominee for ickiest word:

"Pus." To me, no other word in the English language more accurately evokes what it describes. I have always thought of it as some kind of variant of onomatopoeia. It doesn't reproduce a putrid (another icky word) sound in word form, but it certainly reproduces a putrid feeling.

And no, hearing/saying "infection" doesn't bother me in the least bit. Slathering Neosporin on my two boys' cuts and scrapes to get rid of infection doesn't bother me. The word "infection" evokes the same image in my mind, but it doesn't make bile crawl up my esophagus in the same manner as "pus." Gawd. What an awful word. I have actually banned my husband and children from saying the word around me.

Thanks for letting me vent; I have hated that word for more than 30 years.

And a psycholinguist friend writes:

I just quickly polled my sons on words they don't like, and the answers startled me.

M__, age 11: "rough" and "coarse"
Z___, age 8: "hubbub"

Interestingly, neither of them could explain why. There's obviously something semantic going on. Z___ knows what "hubbub" means, but, when pressed, he said it just sounds bad.

Daniel Ginsburg writes:

Today's post about word aversion reminded me of when I was very young. At about age 2, I felt a distinct dislike for certain words. For example, I didn't like people to call me a "boy;" I strongly preferred "guy." While adults probably thought I was trying to seem mature, in reality it was because of phonetics: I didn't like that /oj/ diphthong, and looked for near-synonyms with long vowels. I also preferred "game" to "toy." Short vowels were not as bad as /oj/, but still problematic, and this led me to dislike my own name (Daniel) and go through a series of assumed or invented names with long vowel sounds.

By age 4 I didn't really care about these things any more, and in retrospect found my various name changes to be embarrassing.

I don't know if this is a common developmental stage, but I thought it might be interesting to you after today's post (and the Monty Python clip).

Jay Cummings points out that

Panties is a diminutive, and so might be associated with the childlike female image, as opposed to a more generic term. The film Anatomy of a Murder introduces the word, with much courtroom laughter, as the only truly descriptive term of a specific item. In the movie it was clearly considered to be much more unmentionable than "underwear", and apparently "underpants" was not an option for the court. This was odd to me, but at the time of the movie, "underpants" seemed not to be a description appropriate to women's garments, though the woman had in fact worn pants, and "underwear" would not be specific.

Some additional comments can be found here, here and here. ]

[Ryan Jordan points out that recent episodes of the CBS show How I Met Your Mother have featured a plot line in which Lily, one of the main characters, is bothered, to the point of phobia, by the word "moist". Here's part of the recap from Give Me My Remote:

Robin was complaining that her face was dry and needed some moisturizer, and Ted was so kind as to oblige with some that he had in their bathroom.
(Lesson: Lily HATES the word “moist” – it made me crack up laughing, I have a friend who has the same aversion and same cringing reaction to that word and I love using it. And well, *ahem* apparently a someone else around the world has problem with that word too?, so, Lily is not alone)

Here's what Michell Heller at TV Guide had to say:

Lily’s hatred of the word “moist” was just plain eerie, because I’ve had the same problem for as long as I can remember (I also hate “cheese” and “oily”); I almost thought the writers were spying on me for a moment there.

If I'm following the recaps correctly, there's another episode in which Barney puts on a one-man play that leads with many repetitions of the word "moist".]

[More on moist:

Don't say 'tin' to Rebecca, you know how it upsets her
The long moist tail
Morning mailbag
From cringe to offense]


Posted by Mark Liberman at 05:45 AM

August 19, 2007

And why "without sauce"?

Because I've posted from time to time on the neuroscience of sex differences, several readers have sent me links to this passage from Barbara Ehrenreich's 8/2/2007 blog post "Opportunities in Abstinence Training":

Most people, though, require a bit of training to get into the abstinence training business, so I went to the website of WAIT Training to look at the sample curriculum for an abstinence course. The suggested syllabus contained a lot about love, marriage and STD’s—none of it terribly technical – until I got to the part about how to explain the difference between the sexes, where the following demonstration was suggested:

Bring to class frozen waffles and a bowl of spaghetti noodles without sauce. Using these as visual aides, explain how research has found that men’s brains are more like the waffle, in that their design allows them to more easily compartmentalize information. Women’s minds, on the other hand are more interrelated due to increased brain connectors.

Maybe my spaghetti brain wasn’t up to this challenge, but it did seem to imply that sex would involve a mixing of waffles and pasta, possibly with maple syrup for lubrication. Disgusting, yes, but no doubt a surefire recipe for abstinence.

The WAIT training manual in question is here, and the waffles-v.-spaghetti passage is on page 197, under the heading "Make an impact".

The manual gives as a source “Man’s World, Women’s World? Brain Studies Point to Differences” by Gina Kolata, New York Times, Feb. 28, 1995. There was nothing in Kolata's story about waffles or spaghetti, with or without syrup and sauce. And there was nothing in it about effect sizes, either, though there was a flurry of caveats from quoted scientists, like Sally Shaywitz's "We have to be very, very careful". The WAITers seem to understand that concept better with respect to adolescent sexuality than with respect to neuroscience.

Kolata's article does cite some not-so-careful assertions that have since been refuted, for example:

Several years ago, Dr. Witelson reported that women have a larger corpus callosum, the tangle of fibers that run down the center of the brain and enable the two hemispheres to communicate.

Two years later, in 1997, there was a pretty definitive refutation of that idea by K.M. Bishop and D. Wahlsten, "Sex Differences in the Human Corpus Callosum: Myth or Reality?", Neuroscience & Biobehavioral Reviews, 21(5) 581-601, 1997:

It has been claimed that the human corpus callosum shows sex differences, and in particular that the splenium (the posterior portion) is larger in women than in men. Data collected before 1910 from cadavers indicate that, on average, males have larger brains than females and that the average size of their corpus callosum is larger. A meta-analysis of 49 studies published since 1980 reveals no significant sex difference in the size or shape of the splenium of the corpus callosum, whether or not an appropriate adjustment is made for brain size using analysis of covariance or linear regression. It is argued that a simple ratio of corpus callosum size to whole brain size is not an appropriate way to analyse the data and can create a false impression of a sex difference in the corpus callosum. The recent studies, most of which used magnetic resonance imaging (MRI), confirm the earlier findings of larger average brain size and overall corpus callosum size for males. The widespread belief that women have a larger splenium than men and consequently think differently is untenable. Causes of and means to avoid such a false impression in future research are discussed.

There are some differences in the average characteristics of male vs. female brain anatomy and physiology, but all of them involve highly overlapped distributions and small effect sizes. It's not like frozen waffles vs. sauceless spaghetti -- it's more like Eggo vs. Pillsbury frozen waffles, or Barilla vs. Buitoni spaghetti.

If a WAIT trainer compared a bowl of Barilla spaghetti to a bowl of Buitoni spaghetti in order to demonstrate the difference between male and female brains, this would make a more scientifically valid impact. Here's a suggestion for an even more impactful exercise -- combine a handful of each type out of the box, cook them together, drain, serve -- and ask the class to sort the strands by brand.

Posted by Mark Liberman at 06:25 PM

Peeveblogging marches on

Nearly two years ago, in one of my first posts on Language Log, I introduced the term "peeveblogging" to describe "weblogs slavishly devoted to particular points of grammar, punctuation, or usage." My two exhibits at the time were "Literally, A Web Log" (excoriating the not-so-literal usage of literally) and "Apostrophe Abuse" (bane of the greengrocer and comrade-in-arms to Lynne Truss). Now, thanks to Nancy Friedman at Away with Words, here are two more like-minded peeveblogs.

The "blog" of "unnecessary" quotation marks (Making Fun of Bad Punctuation Since 2005) is exactly as advertised: a collection of signs, ads, and articles festooned with random inverted commas. An example: The "last" person to leave "dont forget" to close the door. Blogkeeper Bethany comments: "While I'm sure they appreciate the reminder, I wonder how you know if you are the 'last' person or not?"

Along the same lines, lowercase L asks the plaintive question, "Ever notice hand-written signs with letters in all-caps, except for the letter L? It looks like an uppercase i ... WHY DO PEOPlE WRITE lIKE THIS?"

A full analysis of these language-related gripes would require situating them in a larger blogospheric context. Bloggers these days play the "naming and shaming" game with a wide array of social artifacts, from customized license plates to passive-aggressive notes to crummy church signs. The targets of these snarkers tend to be various public displays of language use in graphic form. Thanks in large part to the prevalence of cellphone cameras, such displays can now be easily tracked down "in the wild" and shared with online communities of sympathizers. It's all part of the complex and ever-evolving ecology of peevology.

Posted by Benjamin Zimmer at 01:41 PM

I mean, you know

Matthew Hutson writes:

Sometimes I wonder if there are underlying personality differences between people who punctuate (litter?) their speech with "you know" versus those who use "I mean" more frequently. Any hunch on that?

I don't have any hunches, and I don't know any studies about correlations between personality dimensions and choice of fillers, though I'll ask around.

However, we might be able to infer something from demographic variables. Since LDC Online lets me do database queries over boolean combinations of text strings and demographic categories, this is a perfect topic for a Breakfast Experiment™.

In the 14,137 conversations (26,151,602 words) of the LDC conversational telephone speech corpus, the frequency of "you know" compared to "I mean" appears to increase with increasing age:

  "you know" "I mean" "you know"/"I mean"

On the other hand, the frequency of "you know" relative to "I mean" appears to decrease with more years of formal education:

  "you know" "I mean" "you know"/"I mean"
High school

And there's also a slight tendency for the relative frequency of "you know" to be higher among women than among men:

  "you know" "I mean" "you know"/"I mean"

You could spin out a theory that greater use of "I mean" means greater involvement with self as opposed to others, and that age makes people less self-involved, but education makes them more self-involved, and men are somewhat more self-involved than women.

But this would be even more tenuous than such explanations generally are, since the demographic variables in this collection of conversations are not orthogonal -- in other words, the age categories are not balanced by education and sex, and the educational categories are not balanced by age and sex, and the sex categories are not balanced for age and education. So you'd at least want to do some sort of multiple regression, and I don't have time for that this morning (because I'd need to re-scan the underlying data to get the raw materials).
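To see why unbalanced categories matter, here is a toy cross-tabulation in Python. It is not a regression, and the counts are invented purely for illustration (they are not from the LDC corpus); the cell layout and the function names are likewise just assumptions for the sketch. Within each education level the ratio of "you know" to "I mean" is the same for both age groups, yet pooling by age alone makes it look as if the ratio rises with age, simply because the older group is mostly in the less-educated cell.

```python
# Hypothetical counts, invented for illustration only (NOT LDC data).
# Key: (age, education) -> (count of "you know", count of "I mean")
cells = {
    ("young", "less"): (40, 10),
    ("young", "more"): (200, 100),
    ("old", "less"): (400, 100),
    ("old", "more"): (20, 10),
}


def ratio(you_know, i_mean):
    """Ratio of "you know" to "I mean" for one group."""
    return you_know / i_mean


def marginal_ratio(cells, position, level):
    """Pool every cell whose key has `level` at `position`, then take the ratio."""
    you_know = sum(c[0] for key, c in cells.items() if key[position] == level)
    i_mean = sum(c[1] for key, c in cells.items() if key[position] == level)
    return ratio(you_know, i_mean)


# Ratio by age alone, ignoring education.
by_age = {age: marginal_ratio(cells, 0, age) for age in ("young", "old")}

# Ratio within each (age, education) cell.
within = {key: ratio(*counts) for key, counts in cells.items()}

print("pooled by age:", by_age)
print("within cells: ", within)
```

A multiple regression would do the same job more systematically, estimating each variable's effect while holding the others fixed.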

Posted by Mark Liberman at 07:53 AM

August 18, 2007

News flash: the biggest users of "like totally" are middle-aged men

Yesterday's post, "Like Totally", cited Iyeiri et al.'s finding that men use discourse-particle like more than women in a range of professional settings. They also found the expected pattern of more use of discourse-particle like in settings that are more casual, less formal, less rehearsed, etc. Several readers put these two things together and made the connection to the well-known tendency for men to use less formal speech styles, other things equal -- see for example the facts about g-dropping in English. Some readers went further, and speculated that professional settings might interact in a special way with sex differences in the use of discourse-particle like, since it's not only stigmatized but also female-associated. For example, Bryn LaFollette wrote:

I wonder if the disparity shown in this data might in fact be a result of the well-known stereotype being a negative influence on female speakers due to the public or group setting. That is to say, are female speakers in each of the settings described holding back from using like due to the worry, consciously or not, of fulfilling the stereotype and thereby being taken less seriously, whereas the male speakers are not similarly constrained?

This could well be true -- though the observed sex differences in like usage were not larger than reported differences in stigmatized forms that are not stereotypically sex-associated.

I thought I'd take a look at sex and age differences in use of discourse-particle like in a different sort of corpus, namely the collection of transcribed telephone conversations from the LDC, which includes 15,672 conversational sides where the speaker is female and 12,571 conversational sides where the speaker is male. The uses of like -- 184,184 for the women and 156,799 for the men -- are not tagged according to part of speech or function, and I don't have time this morning to select a sample and tag it, so I used some proxies.

I started by looking at the frequency of some strings like "like totally", "like really", "like wow". Here are the raw counts:

  Men Women
"like totally"
"like really"
"like wow"

Normalizing by conversation, men used "like totally" once per 137 conversations (on average), while women used "like totally" once per 141 conversations; men used "like really" once per 22 conversations, women once per 21 conversations; men used "like wow" once per 59 conversations, women once per 45 conversations.

Normalizing by "like", men used "like totally" once per 1,704 overall instances of "like", while women used "like totally" once per 1,659 instances of "like"; men used "like really" once per 274 uses of "like", women once per 244; men used "like wow" once per 704 uses of "like", women once per 531.
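For anyone who wants to check this sort of "once per N" arithmetic, here is a minimal Python sketch. It uses only the totals and rates quoted in this post; the function names are my own, and the implied counts it prints are rough back-calculations from the rounded rates, not reported figures.

```python
def once_per(total, hits):
    """Return N such that the string occurs about once per N units."""
    return total / hits


def implied_hits(total, per):
    """Back out the approximate raw count from a 'once per N' figure."""
    return total / per


# Overall counts of "like" quoted in the post.
LIKE_TOTAL = {"men": 156_799, "women": 184_184}

# "like totally" occurs once per this many instances of "like" (as quoted).
LIKE_TOTALLY_PER = {"men": 1_704, "women": 1_659}

for sex in ("men", "women"):
    hits = implied_hits(LIKE_TOTAL[sex], LIKE_TOTALLY_PER[sex])
    print(f"{sex}: roughly {hits:.0f} instances of 'like totally'")
```

The same `once_per` function handles the per-conversation normalization, given a count of conversations in place of the count of "like".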

These results suggest rough parity, with perhaps very slightly more frequent use of (these cases of) discourse-particle like by women -- though men still used "like totally" slightly more often, at least on a per-conversation basis!

Looking quickly at the effects of age, it seems that discourse-particle like is associated with middle-aged as well as young people (in these conversations, recorded mostly around 2003). The numbers in the table below are overall counts for "like" for the specified group, divided by counts for each of the cited strings. Thus people aged 20-39 used "like totally" once per 1,648 uses of "like", and people aged 40-59 used "like totally" once per 1,619 uses of "like", whereas people aged 60 and over used "like totally" only once in 3,646 uses of all sorts of "like".

  20-39 40-59 60-69
"like totally" 1,648 1,619 3,646
"like really" 192 187 892
"like wow" 695 538 2,675

Finally, one weird statistic, for which I have no explanation. (This is no longer about discourse-particle like, but rather about the conjunction-ish usage; but it's too strange to leave out, though it's not really relevant.)

In this collection, women used the expression "like I say" 545 times, and the expression "like I said" 2,264 times. But men used "like I say" 563 times, and "like I said" 1,302 times.

For women, the ratio of "like I said" to "like I say" was 4.15 to 1, whereas for men, it was only 2.3 to 1. Why are women more likely to use the past tense in this expression -- or men more likely to use a timeless generic?
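As a quick arithmetic check, a few lines of Python recompute the two ratios from the counts quoted above. The dictionary layout and the function name are just illustrative choices.

```python
# Counts of each expression, as quoted in the post.
counts = {
    "women": {"like I say": 545, "like I said": 2264},
    "men": {"like I say": 563, "like I said": 1302},
}


def past_to_present(c):
    """Ratio of past-tense 'like I said' to present-tense 'like I say'."""
    return c["like I said"] / c["like I say"]


ratios = {sex: past_to_present(c) for sex, c in counts.items()}

for sex, r in ratios.items():
    print(f"{sex}: {r:.2f} to 1")
```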

Posted by Mark Liberman at 06:47 AM

August 17, 2007

Like totally presidential

Listening to President Bush's press conference last Thursday, Patrick McCormick was surprised by the sentence quoted below (as per the transcript at

It surprised me, frankly, because the impression you get from people who are reporting out of Iraq is that it's like totally dysfunctional -- that's what your -- I guess your kind of -- your friend or whoever you talked to is implying.

Patrick's reaction:

It struck me as unusual to hear an old fogey like George Bush use "like, totally". So I'm curious, is it unusual for people over age fifty -- old conservatives in particular -- to use that construction? Do you suppose he acquired it from younger staff?

Over age sixty, even -- W was born 7/6/1946.

[Update -- I've added some actual facts at the bottom of the post. If you want the executive summary, it's that

  • Adults were using discourse-particle like fairly frequently in professional settings, even a decade ago;
  • Contrary to stereotype, men apparently use it more than women do.]


First, let's get some transcriptional issues out of the way.

Here's the audio from the phrase in question, first in context and then by itself:

Here's my transcript (which is the same as the official one, except for false starts and filled pauses):

I've- I've- it surprised me, frankly, because
the impression you get from
people who are reporting out of uh
uh Iraq is that
it's like totally dysfunctional.
That's what your- your- I- I guess your kind of-
your friend or your- whoever you talk to is implying

Second, I'd like to point out that this is not an isolated case. On July 19, 2007, President Bush said:

And the reason I say that, it just shows how difficult it is to do what some assume can be done, which is, like, totally seal off the border.

And on April 2, 2004:

I mean, I think it's a wonderful story about a mom and a wife who, instead of getting, like totally distraught with the circumstances, says, I'm going to go back to school.

So it seems that W is a regular user of "like totally". But I don't think this is surprising for someone W's age. That's because I say things like that myself, and I'm only a couple of years younger than he is. And I think I have a reasonable story to tell about why this is.

High-frequency use of totally is now associated with Valley Girl modes of speech, especially when it's used as an emphatic adverb meaning something like "definitely", e.g. "omg, that's totally going on my myspace". But as an occasional modifier of gradable predicates (like dysfunctional), totally has been totally unsurprising for hundreds of years. Thus Jane Austen, who was fond of the word, wrote in Emma:

Harriet Smith has some first-rate qualities, which Mrs. Elton is totally without. An unpretending, single-minded, artless girl --- infinitely to be preferred by any man of sense and taste to such a woman as Mrs. Elton.

As for the use of like as a particle "to express a possible unspecified minor nonequivalence of what is said and what is meant", as Muffy Siegel quotes Lawrence Schourup as putting it, it's also now associated with ValSpeak, but it started with the beat generation in the 1950s ("it's like nowheresville, man") and continued with hippies in the 1960s ("it's like psychedelic, man"). I'd guess that Moon Unit Zappa's generation got this feature from their parents, not the other way around.

So maybe W has been primed by the twins to use "like totally" -- but it's also plausible that he picked up like and totally like totally naturally, as a youth in the 1950s and 1960s, just like the rest of us did.

[Another possibility is that W meant to use the young-female associations of "like totally" to tag Iraq nay-sayers as unmanly. But the other quoted examples suggest that this is like, unnecessarily subtle.]

[Topher Cooper comments:

Seems to me that at least since the 60's there has been more and more use of "like" to mean something other than "to express a possible unspecified minor nonequivalence of what is said and what is meant" and more to simply add emphasis or draw attention to the word it is modifying. If a "hippie" said "That's, like, cool, man!" they (ok, we) were definitely describing "that" as "cool" not as something "very similar to cool." But it isn't really what I understand the term "intensifier" to mean either -- e.g., "it" isn't necessarily "very cool". There's more of a sense of "look here, the coolness of 'it' is somehow surprising or notable" but not necessarily in degree.

I also think that something that got missed in your analysis was what the "like" modifies. The usage that came in with ValSpeak treats "like totally" as a unit. So when Moon Unit said that something was "like totally grody" the parsing might almost have been "(like totally) grody." But Bush's phrase would almost certainly have been "like (totally dysfunctional)".

I apologize for giving an oversimplified if not entirely false impression of Muffy Siegel's excellent paper on the meaning of like -- she explicitly asserts that the current usage is different from "the much older (and perhaps fictional ...) 'beatnik' use". But she doesn't offer an argument for this differentiation, and I'm not entirely convinced. At least, there are a range of cases where older and newer uses (including entirely standard uses) overlap.]

[OK, I thought I'd better try to find out what the facts are. Luckily, someone has actually looked into things empirically, rather than just evaluating their own stereotype-tinged impressions, as I did.

According to Yoko Iyeiri et al., "Gender and Style: the Discourse Particle like in the Corpus of Spoken Professional American English", English Corpus Studies 12, 2005:

We have investigated the Corpus of Spoken Professional American English and found that the discourse particle like is attested in the exploratory talk of the national meetings of mathematics tests and reading tests, both held in the 1990s, to a noticeable extent. By contrast, the expository talk of White House press conferences and faculty meetings of the University of North Carolina provides far fewer examples of the discourse particle like. As for gender differences, the same item is more frequently employed by male speakers. This result does not necessarily support the generally accepted view, which argues that it is a characteristic feature of young female speech.

Iyeiri et al. identify the like of examples like "right so I'm just like below the allotments just now" with the old parenthetical use, which the OED calls "dialectal and vulgar", and traces back to Fanny Burney's Evelina (1778):

Father grew quite uneasy, like, for fear of his Lordship's taking offense.

They argue that "the discourse particle like is not meaningless or expletive, although it is not always an easy task to describe the meaning", and observe that "on the whole, it seems to convey the feelings of approximation and illustration".

Here's their figure showing the frequency per 10,000 words of the discourse particle like by men and women across four of the settings sampled in the Corpus of Spoken Professional American English, which "includes transcripts of conversations of various types occurring between 1994 and 1998", and "consists primarily of short interchanges by approximately 400 speakers that are centered on professional activities broadly tied to academics and politics, including academic politics". The four settings are WH (White House press conferences), FM (faculty meetings at UNC), CM (conferences on mathematics tests) and CR (conferences on reading tests):

You can read their preprint for more information and discussion, but two things are clear:

  • American adults were using discourse-particle like fairly frequently in professional settings, even a decade ago;
  • Contrary to stereotype, men use it more than women do.]


Posted by Mark Liberman at 08:21 AM

August 16, 2007

[Sic] news from Nature

Nature, the "International weekly journal of science", is updating its mission statement. Below, I've reprinted the entire announcement ("Men [sic]", Nature 448, 728, 16 August 2007), because ... well, just because.

Our 1869 mission statement is out of date.

It was 1833 when the English polymath William Whewell first coined the word 'scientist'. Over subsequent decades, the word gradually replaced such commonly used terms as 'natural philosophers' and 'men of science'.

By the middle of the nineteenth century, this last phrase was already out of date: pioneering women such as Mary Fairfax Somerville and Caroline Herschel were proving their worth as astronomers, mathematicians, botanists and palaeontologists.

The original mission statement of this journal, first printed in Nature's second issue on 11 November 1869, was therefore running behind the times when it referred to "Scientific men" -- even though, to be fair, the word 'scientist' did not enter general circulation until the end of the nineteenth century. In other respects it is well worded -- which is why we print it every week in the Table of Contents.

The statement expresses two purposes for this publication. The first is "to place before the general public the grand results of Scientific Work and Scientific Discovery ; and to urge the claims of Science to a more general recognition in Education and in Daily Life". Today this is as important as it has ever been -- although members of the public have important considerations to lay before scientists, and Nature reflects them also.

The second thrust was expressed as follows: "to aid Scientific men themselves, by giving early information of all advances made in any branch of Natural knowledge throughout the world, and by affording them an opportunity of discussing the various Scientific questions which arise from time to time."

In printing the statement verbatim every week as we have done, making it clear when it originated, we have hitherto assumed that readers will excuse the wording in the interests of historical integrity. But feedback from readers of both sexes indicates that the phrase, even when cited as a product of its time, causes displeasure. Such signals have been occasional but persistent, and a response is required.

There is a convention within the English language by which writers quoting text can indicate their view that a particular phrase is inappropriate. That is to insert sic, a Latin word meaning 'thus', after the phrase -- in effect expressing the sentiment 'alas, dear reader, this is what was said'.

This is what we will do in the mission statement from now on. The small, belated change takes place against the vast backdrop of a scientific world where the upper echelons of academia, academies and prestigious awards are still numerically greatly dominated by men, and where outright discrimination can still rear its ugly head (see page 749). In this context, the insertion of a Latin word in a couple of paragraphs may be a tiny step: but it is at least one in the right direction.

[Via The Chronicle of Higher Education's News Blog ("Is This What They Mean by Evolution?", 8/15/2007), via Helen Davies]

Posted by Mark Liberman at 02:21 PM


One of the stories on All Things Considered yesterday was "What can satellites do for domestic spying?", in which "John Pike, director and founder of, talks with Robert Siegel about the capability of satellites that will be used by Homeland Security to spy within the United States."

The end of the interview gives us an unusual example of the subtle social sanctioning of morphological innovation. Or is it just appreciative emphasis?

Robert Siegel: When we see what's done with uh say closed-circuit television imagery, in- in Britain,
*after* uh crimes have been committed,
it seems that all those images are very useful to go back and see what happened, that is retrospectively.
Would that also be a use of satellite imagery, as opposed to real-time predictive # uses?
John Pike: Probably # not, because these uh cameras in the spy satellites are point and click.
You have to consciously decide, "I'm going to acquire an image of this particular area".
Maybe happenstantially you're going to find some information that would be
relevant, but I wouldn't count on # it.
Robert Siegel: "Happenstantially".
John Pike: Yes.
Robert Siegel: Thank you very much for ...

According to the OED, happenstance is an "amalgam of HAPPEN(ING + CIRCUM)STANCE", coined in America in the late 19th century. The two earliest citations are:

1897 Outing (U.S.) XXX. 557/1, I guess it was just a 'happenstance'.
1911 Dialect Notes III. 544 Happenchance, happenstance, happening, circumstance. Used facetiously. Blend-formations.

The OED has no entry for happenstantially, but there are more than 300 Google hits. The word also occurs 22 times in sources indexed by Google Scholar. These mostly seem to mean "according to circumstantial accident", as in a book review by Jeannette Mungo, "Empowering the past, confronting the future: the Duna people of Papua New Guinea", Journal of the Royal Anthropological Institute 13 (2), 521–522, 2007:

The authors' idea of flexible groups -- cognitive descent lines that are deployed happenstantially in response to the on the ground situation of descendants ...

But sometimes the meaning seems to drift in less expected directions (W.A. Greene, "A Kernighan-Lin local improvement heuristic that softens several hard problems in genetic algorithms", The 2003 Congress on Evolutionary Computation, v. 2, pages 1006-1013):

In this section we experiment with four problems which have been touted in the literature as being ones especially difficult for genetic algorithms. Happenstantially, earlier researchers who worked on these problems asserted that niching was necessary for solving them. Our experiments will demonstrate, in particular, that our Kernighan-Lin local improvement heuristic can replace and indeed improve upon niching as the key for solving these problems.

The NYT does not seem to have countenanced happenstantially, but has printed happenstantial several times, including Carol Vogel, "The Barnes stays local in selecting its leader", NYT 8/8/2006:

Mr. Rudenstine said it was only ''happenstantial'' that Mr. Gillman led a Philadelphia institution. He said that the Barnes's search was international in scope, with more than 130 candidates expressing interest. ''We started out with no geographical ideas,'' Mr. Rudenstine said.

And Holland Cotter, "Bill Rice, 74, Downtown Artist, Actor and Impresario, Dies", 1/29/2006:

Mr. Rice was recently described by the photographer Larry Mitchell as ''the last Bohemian,'' chronically but contentedly short of money, interested only in happenstantial fame, rarely traveling more than a few blocks from his home.

The Irish Times allowed the same word in Douglas Kennedy's 11/12/2005 book review:

The Year of Magical Thinking is Didion's attempt to sort through the psychic debris of these personal disasters; to try to understand the terrible happenstantial nature of things.

And Google finds more than 1,000 examples of happenstantial, which is apparently the thin edge of the happenstantially wedge.

Thus John Pike deserves neither credit nor blame for coining happenstantially, but Robert Siegel's double-take was strikingly effective. Without it, I don't think I would have noticed the word. Now it's going to be hard to avoid having it creep into my own vocabulary.

[Update -- Ben Zimmer writes:

Though it doesn't show up in an archive search on the New York Times site, I found an example of "happenstantially" via ProQuest in the NYT Book Review of July 1, 1970. Reviewing novels by Carol Evan and Katherine Dunn, John Leonard writes: "Here, then, are two first novels, both of them accomplished, and both of them written, happenstantially, by women."]


[Update #2 -- Bruce Rusk points out that Google Books has quite a few hits for happenstantially and even more for happenstantial.

Although Google Books' dates (and other metadata) are very unreliable, one citation appears to be from a chapter "Dimensioning in Associative Memory", by Benjamin F. Cheydleur of the Philco Corporation, pp. 55-92 of a 1963 work entitled "The Augmentation of Man's Intellect by Machine" (Vistas in Information Handling, vol. 1), edited by Paul William Howerton and David C. Weeks.

"Whatever the device, it must function precisely as a 'content-addressable' memory, i.e., as a memory which has cells that are addressable by the presentation of a part of the code happenstantially contained in the cell, rather than by the physical location code ('address') of the cell."

This is a somewhat famous book, since Douglas Engelbart has a chapter in it under the title "A Conceptual Framework for the Augmentation of Man's Intellect by Machine".

And at least the 1983 edition of Laurence Urdang's Synonym Finder has happenstantial in the list of synonyms for ironic (!).

It may be there in the 1961 edition as well.]

Posted by Mark Liberman at 09:11 AM

August 15, 2007

The sound of surprise

A guest post by Cynthia McLemore.

David Brooks met an East Texas truck driver at a diner in Virginia, and describes him like this:

I don't know what came first, the mystique of trucking or the country music songs that defined the mystique, but this trucker had been captured by the ethos early on and had never let it go. He wore the right boots and clothes. He had a flat, never-surprised way of talking. He didn't smile or try to ingratiate.

Now that's the guy you want to sit next to on a commuter train. He's not likely to distract you from your Harper's with a bunch of attention-grabbing pitch peaks aimed at some wireless target:

Well we were TALKing about going to SPAMalot, but we....

And I bet he wouldn't be as likely to mistake the close connection between two cell phones for the real physical distance between one in LA and one in NY, and shout:

LISten, CHARlie, I MADE reserVAtions for us at the ContiNENtal.

Maybe he wouldn't even raise his voice -- volume and pitch -- in anger:


Because when you've traveled mile after endless mile of flat gray road, maybe you know that every little bump eventually resolves into more of the same. His son is probably a very calm person. Instead of

Bud, MOVE! That's a SCORpion, honey, brush it OFF!

He'd probably just lean over and wordlessly flick it away.

My model of traditional American macho is a Texas variety -- farmers, engineers, and, yes, my grandfather worked as a cowboy. Macho male storytellers in Texas can surprise you in the middle of a yarn with a big bunch of intonational activity -- flat, flat, flat, WILD, WILD, WILD, whew, The End -- that kind of thing. I've heard recordings of men from rural West Texas, men in boots and cowboy hats, using the same intonational structure for narratives as the sorority girls whose speech I studied in Austin. Of course it's hard to see the similarity until you look at the pitch tracks.

Truck drivers encounter hills, valleys, overpasses, underpasses, sharp turns, and winding, winding roads, right? After the third or fourth time you meet a variation, I guess it's pointless to get excited about it, play with it in your sound structures, analogize to your thoughts and feelings. Then again, the mind is a restless, tricky, symbol-making thing.

One of the best surprises I got as a grad student was from an African Fulbright scholar who'd recently arrived in Austin. He was puzzled. He'd lost something precious to him, he said, and when he told the Fulbright office director, she said:

Oh NO! you DIDn't!

Struggling with the new language and culture, he'd stripped away the pitch peaks and heard:

oh no, you didn't.

Why would she contradict him? Did the intonation convey surprise? But why did she negate what he said?

My Fulbright friend was delighted to delve into these issues, buoyant in his new cultural immersion, and thrilled to discover that someone you hardly know would get into your head and express what they think you're thinking or feeling -- i.e. empathize, feel your pain, try to hold off the sting of your loss. We just sat there loving language for a while after we'd traveled down that winding road of shared thought.

Maybe truck drivers with flat intonation liven up at the end, when they get where they're going? The dog! Brooks said that guy travels with a dog. Maybe it barks and whimpers and warbles and wails.... maybe it even sings to C&W.

[Guest post by Cynthia McLemore]

Posted by Mark Liberman at 02:37 PM

When bad interaction happens to good people

About a year ago, puzzled by the language in a software-update dialog box, Geoff Pullum wrote ("If you can answer this, you are not paying attention", 7/10/2006):

Producing language that other people will be able to understand involves not just having a picture in your mind of the scenario and designing a nice-looking (and policy-compliant) dialog box that you feel represents your view of it. You have to deploy a shared linguistic system, according to established rules, using lexemes of known meaning, to present that picture to others in a way that will work for them. You have to consider whether there are other ways of viewing the situation at hand. You have to examine the wording you have chosen to see if it has ambiguities or unclarities. You have to put yourself in the place of a person who did not work with the developers of the operating system, someone who sees your dialog box without the benefit of any prior experience with the way you conceptualize things, and you have to ask yourself whether they would understand what to do.

I've recently encountered a web application that exhibits the most intense and systematic violation of this advice that I've ever seen. Unfortunately, it's an interface that I'm in some sense responsible for.

Allow me to hold you with my glittering eye, like the Ancient Mariner, and tell the tale. (Though if you're impatient, you can evade the backstory and go directly to the software description...)

The ship was cheered, the harbour cleared, and the university where I work installed a new computer application that will make it possible to keep much better track of maintenance and repair activities. Better information on the history of individual repair requests will be available faster, and planners will have access to a whole new world of summaries organized by time, location, type of work, and so on.

This system, known as Maximus FacilityFocus, does many other things besides, and is highly buzzword-compliant, as the company's web site explains:

FacilityFocus® allows your organization to improve, automate and integrate all of your existing facility management, asset management and maintenance operations. Based on open, object-oriented development technologies, FacilityFocus is Web-enabled and offers compatibility with multiple operating systems, multiple databases and multiple platforms.

MAXIMUS has developed an unparalleled level of depth and breadth of coverage in FacilityFocus, allowing our customers to confidently manage millions of square feet of buildings and associated assets. From work management and equipment maintenance, to inventory control and lease and contractor management, to linking with your ERP solution, FacilityFocus is designed as a flexible solution to work the way you do. To help you, as the asset manager, control budgets while in the midst of change, be it technological, managerial, or customer-driven.

The "Web-enabled" part means that the whole thing is available via web interfaces to all members of the university community, with each person's level of access to inputs and outputs controlled through the university's central Kerberos authentication system. Individuals can enter repair requests directly, and monitor the progress of their requests, without having to wait on the phone to talk to someone in an operations center attempting to translate their requests and questions into some combination of interaction with filing cabinets full of paper forms and commands to some clunky old DOS-era application querying an archaic database.

Truly, this is all a big step forward. What's not to like?

Before I get to that, let me mention that one of my jobs is Faculty Director of my university's residential system, known as "College Houses and Academic Services". There are about 7,000 students, faculty and staff who live in these residences -- including me -- and every week, a certain number of us need to report a leaking faucet or a faulty light switch or a broken window. We used to do this by calling the people in the facility operations center, or by interacting with a web application that basically just sent an email to one of the same people, who would interact with the back-end systems to get the needed work taken care of.

But now, we're asking everyone to enter their (non-emergency) work requests via the web interface to FacilityFocus. And that web interface is a wonderful example of what can go wrong when a designer fails to heed Geoff's advice to "put yourself in the place of a person who did not work with the developers of the operating system, someone who sees your dialog box without the benefit of any prior experience with the way you conceptualize things, and ... ask yourself whether they would understand what to do".

I don't really know what the design process was in this case, but I suspect that the planners at Penn trusted the designers from Maximus, and the folks at Maximus who designed the interface were planning for use by building managers and accountants and other staff, to the extent that they were thinking about users at all and not just web-enabling the basic structure and function of their back-end database.

A few days ago, I got my first look at FacilityFocus. In a couple of weeks, thousands of undergraduates are going to start trying to use it. And when they run aground, as many of them probably will, they're going to complain to (people who will complain to) me. So I've written an underground guide, "The Legend of FacilityFocus", which uses the metaphor of beating a fantasy adventure game like The Legend of Zelda, and offers tips and tricks to make it to the final screen. If you want to know what I think (some of) the problems of the Maximus interface are, read the guide and you'll get the idea.

But adventure-game interaction is really the wrong metaphor. The designers of good adventure games have an excellent idea of what their target users are like, and they've carefully planned and tested for their users' reactions to each display and each event in the game. The obscurity and difficulty of the interaction is carefully crafted to be suspenseful, entertaining -- and eventually overcome. In contrast, an interface like FacilityFocus seems to be "mind blind". The obscurity and difficulty of the interaction is a random result of an apparent failure to try to model user reactions at all.

I should say that the people at Penn who worked with Maximus to install this system are not only excellent managers but also excellent communicators, both one-on-one and in public presentations. But there's something about the indirect nature of communication through a software interface that seems to decouple good communicators from their normal ability to craft messages with the audience's reactions in mind. When the interface comes out of a set of committees spanning several organizations, the indirection is even greater, I guess, and the stage is set for a web-enabled interactive shipwreck.

Posted by Mark Liberman at 09:51 AM

August 14, 2007

Limiting diversity: One negative too far

Bob Lieblich forwarded this quote from Andrew Sullivan's blog ("Thanks", 8/13/2007), thanking the bloggers who substituted for Sullivan while he was away on vacation:

A moment of sincere and deep thanks to my guest-bloggers this past week. Aaron insisted I stay off the web for the week, and, amazingly, I did, so reading the week's Dish today in one sitting was a great pleasure. Thanks: Liz, Bruce, Stephen and Eric. Enough diversity for a crackling debate at times, but not too much for incoherence. [emphasis added]

This follows the general recipe for overnegation: two or more negatives (one of them within a word), and a meaning that deals with regions on a semantic scale (here the effects of diversity of opinion in various quantities ranging from inadequate to excessive).

To help you think through this particular case, here's an example where everything works:

...enough breeze for sailing, but not too much for comfortable conversation and sight-seeing...

To reframe this phrase along the lines of Sullivan's attempt, we could change it to:

...enough breeze for sailing, but not too much for blowing the roofs off of houses...

Or we could edit Sullivan's phrase so as to limit the cited diversity in a logically correct way:

Enough diversity for a crackling debate at times, but not so much as to lead to incoherence.

Anyhow, Andrew Sullivan knew what he meant, and you probably did too. Our poor old monkey brains are not quite evolved enough for this stuff yet: Multiplex Negatio Ferblondiat.

[Update -- John Cowan writes:

This example reminds me of what James Thurber said Harold Ross said (according to other sources, to Robert Benchley): "I don't want you to think I'm not incoherent". Thurber characterized this overnegation as "[Ross's] limited vocabulary got tangled up in his fluency."]


Posted by Mark Liberman at 08:36 AM

Present tense in advice and predictions

Kathy Jolowicz forwarded this item from a words forum she frequents:

Suppose person #1 says that he's soon going to Athens and will visit the Acropolis. Person #2 replies: "That's great! But you want to go early in the day before the heat builds ... and before the tourists overrun the place."

Easier question: Is person #2's line supposed to be "You want to ..." or "You'll want to ..."? (Or does it work either way?)

Harder question: What kind of a statement is #2's statement anyway? It's not a prediction (I guess). It just doesn't seem to fit into the normal pattern of past, present, and future for verbs. (As in "You went early," "You're going early" and "You will go early", where a person's going early is some event that's located in the past, the present, or the future.)

It may confuse things that the example uses "you want" in the old sense that refers to requirements rather than desires. The same questions come up if we phrase the same advice in another way, e.g. "It's better to go early in the day" vs. "It'll be better to go early in the day".

And the answer to the first question, it seems to me, is that it depends on what you mean. The "present-tense" versions are generic, i.e. timeless, while the versions with will make a somewhat more time-specific statement. There's not much difference in this case, since the generic statement is implicitly constrained to relevant times, and it's obvious in context that the relevant time is the period of person #1's visit to Greece. But my impression is that the generic versions of such advice are a bit politer, due to being vaguer.

As for the second question, I don't think there's much of a mystery. "Present tense" morphology is routinely used in English, as in many other languages, for generic statements, which are timeless or at least true at all contextually relevant times. And generic statements are often used to give advice or to make advice-like predictions.

Posted by Mark Liberman at 08:34 AM

August 13, 2007

Why Do Canadians Eat Donair?

There is a meat dish which in slightly different forms is widely eaten in the Eastern Mediterranean as well as, in recent years, in many other countries. (See the Wikipedia articles Döner kebab and ドネル ケバブ.) In the United States, it is usually called gyro(s), from Greek γύρος, sometimes pronounced [dʒaiɹow] according to its spelling, with the <s> taken to be the plural morpheme, sometimes [jiɹos] as in Greek. In Canada, the same dish is almost always known as doner, from Turkish döner, also spelled donair. A few restaurants in Ontario seem to call it gyros, but here in British Columbia, and in my experience in Alberta as well, gyros is virtually never used. This is true even in restaurants run by Greeks. A new place specializing in doner just opened here in Prince George. The owners are Greek, but they use the term donair on their menu and even in the name of their restaurant. I have been wondering for a long time why it is that this dish almost always goes by its Greek name in the United States but by its Turkish name in Canada.

One hypothesis that comes to mind is that it has to do with the number of immigrants from the two countries. Perhaps Greeks predominate in the United States but Turks in Canada. That doesn't seem to work. The number of Greeks in Canada is about 215,000, whereas there are only about 25,000 Turks. It is conceivable that relatively more Turks are in the restaurant business, but although I don't have statistics on this, it seems unlikely: I have encountered a lot more Greek restaurants than Turkish restaurants.

I'm guessing, instead, that this is an example of a founder effect, that is, that it is essentially an accident, due to the language used by the first people to introduce and popularize the dish. If the initial introduction is successful and other restaurants imitate it, the term originally used may spread. In the case of Canada, if doner was used first, and Greek restaurants then introduced the dish because they were aware of its popularity in other restaurants, where it was called doner, they may have used doner rather than their own name in order to attract customers already familiar with the dish under its Turkish name.

Such information as I can find on the introduction of doner into Canada supports this hypothesis. According to the History of Donair in Canada web site, this dish was introduced in Canada at Velos Pizza in Bedford, Nova Scotia, which later became a place called King of Donairs. I have no idea how authoritative the history given by these sites is.

If any of our readers know more about the history of doner in Canada, I would be most interested.

Incidentally, the Greek term is actually derived from the Turkish. The earlier Greek term is reported to be ντονέρ [doner]. Greek γύρος "turning" is a calque of a Turkish original that was first borrowed into Greek, then replaced after independence.

[Update: 2007-08-14. One reader writes that doner makes him think of "Donner Party Kebab". The Donner Party story being presumably better known to Americans than Canadians, one could imagine this discouraging the use of the name doner in the US but not in Canada, though I don't think that this is the real causal factor. (The Donner Party was a group of settlers who got bogged down in the snow while attempting to cross the Sierra Nevada into California in 1846-1847 and ended up eating several of their party in order to survive.)]

Posted by Bill Poser at 11:33 PM

August 12, 2007

Is the People's Democratic Socialist Islamic Arab Republic Freer than the Kingdom of Denmark?

For the answer, see Brendan O'Connor's hysterical and insightful post It's All in a Name, with graphs and statistics, but no odds ratios.

Posted by Bill Poser at 11:28 PM

e e cummings and his iPod: Faith vs. WF again

I've been spinning out a series of postings on (several different kinds of) conflicts between faithfulness (Faith: roughly, stick to the original) and well-formedness (WF: roughly, make things fit your system).  Yet another case came up on the American Dialect Society mailing list back in December: must the first letter of a sentence always be capitalized?

Jim Smith asked on 12/14/06:

Although there are obvious and simple ways to avoid this, if pH, e e cummings, or another similar word or phrase is at the beginning of a sentence, is the initial letter capitalized?

and Beverly Flanigan followed up with:

... would you all capitalize "r-lessness" at the beginning of a sentence??  With phonemic slashes [i.e., "/r/-lessness"], maybe, but how about if spelled out, as I've done?

In this case, Faith says to preserve the details of expressions (including initial lower case), while WF says to make the spelling conform to a convention that demands initial upper case.

(Note that "well-formedness" here does not refer to some absolute sense of correctness, but only to conformity to some system -- a variety of language, a style sheet, whatever.)

I posted on Language Log a while back on some cases where points of mechanical style in writing are problematic when material is quoted: order of quotation marks and periods/commas, double vs. single quotes, general lowercasing vs. conventional capitalization, indications of emphasis, the serial comma.  Here, well-formedness generally trumps faithfulness; original schemes for such things are generally converted to the quoter's home scheme.

In many other details, WF will always win: except in very special circumstances, no one attempts to reproduce type fonts, line divisions, or many other details of the physical appearance of text.  Other details -- whether paragraphs are indented or flush left, whether a dash is indicated by one en-dash, two en-dashes in sequence, or one em-dash, and whether the interpolated material is solid with the surrounding text, or separated from it by a space, etc. -- are occasionally preserved, but usually not.

As for the serial comma in coordination, as far as I can tell, its use generally conforms to the quoter's preferred scheme, whether this is anti-serial, with no serial comma (the majority style), or serial (which is my own): anti-serialists quoting people generally remove the commas, while serialists put them in.  This is WF.  A notable exception to this is in titles, which are often quoted faithfully.  The New York Times, which is pretty relentlessly anti-serial, nevertheless seems generally to preserve the commas in titles (of books, in particular).

Now to initial capitalization.  There are (at least) three conventions at issue:

(WF1a)  The first non-quote character of a sentence must be a capitalized letter.

(WF1b)  As a consequence, the first non-quote character cannot be anything other than a letter.

(WF2)  Personal names have capitalization on all their parts.  [with exceptions for some names in "von", "van", "de", etc.]

Expressions like "pH" and "iPod" present immediate challenges to (WF1a).  Unfaithful spellings like "PH" and "IPod" are unacceptable to many -- probably most -- people.  The question is the status of faithful spellings.  Some style manuals are resolute about (WF1a); faithful spellings are unacceptable, and so such expressions must be avoided as the first words of sentences  (you can, however, write "A pH..." or "The iPod...").  If both faithful and unfaithful spellings are unacceptable, we have a STALEMATE between Faith and WF, and the conflict must be avoided in one way or another.

The Chicago Manual of Style (15th ed.) allows unfaithful spellings but prefers avoidance: for "eBay", "iMac", and the like, it says (p. 366):

Chicago recommends either capitalizing the first letter in that position or, better, recasting the sentence so that the name does not appear at the beginning.

CMOS requires that things like

   r-lessness is widespread in the U.K.

be replaced by one of the following:

   R-lessness is widespread in the U.K.  [(WF1a) trumps Faith]

   The phenomenon of r-lessness is widespread in the U.K.  [avoidance]

Indeed, examples with (WF1a) trumping Faith are not hard to find.  From "Among Believers" by A. O. Scott, in the 9/11/05 NYT Magazine, p. 40, about a literary journal whose name is n+1:

... Keith Gessen, who edits n+1 along with Kunkel...

  N+1 is not the first small magazine to come out of this ambivalence...

But spellings in which Faith rules are also easy to find:

iPod is a brand of portable media players designed and marketed by Apple Computer and launched in 2001.  (link)
iPodLinux is currently safe to install on 1st, 2nd, and 3rd generation iPods.  (link)

For Hill in 1821 this is clearly a new innovation as a prestige pronunciation.  r-lessness is thus not probably part ... (William Downes, Language and Society (2nd ed.), p. 158)

[Addendum 8/13/07: Marc Pelletier and Fernando Colina note a context where (WF1a) isn't really an option, and you have to use faithful spellings or avoid the issue: in writing about material in computer languages, since in most computer languages case is meaningful, so that "someFunction()" and "SomeFunction()" are not equivalent.]
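
To make the point about case-sensitive languages concrete, here is a minimal sketch in Python (my choice of language; the addendum names none). The function bodies, the `registry` dictionary, and the `capitalize_initial` helper are all hypothetical illustrations, not anything from the correspondents' messages. It shows that mechanically applying (WF1a) to an identifier produces a different name.

```python
# Two distinct functions whose names differ only in the case of the first letter.
def someFunction():
    return "lower-case version"


def SomeFunction():
    return "upper-case version"


def capitalize_initial(text):
    """Apply (WF1a) mechanically: upper-case the first character only."""
    return text[:1].upper() + text[1:]


# Hypothetical lookup table standing in for "what the language resolves a name to".
registry = {"someFunction": someFunction, "SomeFunction": SomeFunction}

faithful = "someFunction"
conformed = capitalize_initial(faithful)

# The faithful and the conformed spellings pick out different functions.
print(faithful, "->", registry[faithful]())
print(conformed, "->", registry[conformed]())
```

So a writer who capitalizes a sentence-initial identifier has, in effect, cited a different piece of code.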

If you take (WF1a) to be an inviolable constraint, then you're committed to (WF1b) as well, and you can't write things like the following:

4,357 complaints were filed in 2005.

/r/-lessness is widespread in the U.S.

(1) is ungrammatical.

Instead, you have to avoid the offending initial character in one way or another:

Four thousand three hundred fifty-seven complaints...

The phenomenon of /r/-lessness...

Example (1) is ungrammatical.

But, as we've seen, not everyone treats (WF1a) as inviolable, or at least as inviolable in all circumstances.  And some of these people have no problem with some or all of the examples that violate (WF1b).
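
For readers who like their conventions executable, here is a rough sketch of (WF1a) as a check, again in Python. The function name `satisfies_wf1a` and the particular set of characters treated as quotes are my assumptions; the convention itself says only "first non-quote character". Note that anything passing the check automatically satisfies (WF1b), since only letters can be upper case.

```python
# Characters treated as opening quotes for the purposes of this sketch (an assumption).
QUOTE_CHARS = "\"'\u201c\u2018"


def satisfies_wf1a(sentence):
    """Return True if the first non-quote character is an upper-case letter."""
    stripped = sentence.lstrip(QUOTE_CHARS + " ")
    if not stripped:
        return False
    first = stripped[0]
    return first.isalpha() and first.isupper()


examples = [
    "4,357 complaints were filed in 2005.",
    "/r/-lessness is widespread in the U.S.",
    "(1) is ungrammatical.",
    "iPod is a brand of portable media players.",
    "Four thousand three hundred fifty-seven complaints were filed.",
    "R-lessness is widespread in the U.K.",
]

for sentence in examples:
    verdict = "ok" if satisfies_wf1a(sentence) else "violates (WF1a)"
    print(f"{verdict:16} {sentence}")
```

The first four examples fail the check and the last two pass, which matches the judgments a strict (WF1a) style sheet would hand down.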

On to personal names, like "e e cummings".  For these, (WF1a) and (WF2) are both in play, and there are four outcomes of the conflicts between them and Faith.  Since I happen to have collected some cites for "bell hooks" (and since, as many readers have now pointed out to me, the poet himself used conventional capitalization and punctuation in his name), I'll use her name (rather than cummings's/Cummings's) to illustrate the cases.

Winner-takes-all outcomes

Outcome 1.  (WF1a) and (WF2) are both inviolable; Faith loses everywhere.  We get things like the following:

Bell Hooks, who spells her name without capitals, is arguably the most widely published black feminist scholar ever.  (link)

plus sentence-internal occurrences of "Bell Hooks".

Outcome 2.  Faith wins over both (WF1a) and (WF2).  We get things like the following:

bell hooks (born Gloria Jean Watkins on September 25, 1952) is an American intellectual, feminist, and social activist. hooks focuses on the interconnectivity of race, class, and gender and their ability to produce and perpetuate systems of oppression and domination.  (link)

plus sentence-internal occurrences of "bell hooks".

Note the spelling in the url:  Wikipedia's software is committed to initial caps, no matter what those who maintain the site think; see Outcome 3 below.  As a result, the Wikipedia page is labeled "Bell hooks", with the entertaining warning:

The title of this article is incorrect due to technical limitations.  The correct title is bell hooks.

Mixed outcomes

Another possibility is that Faith wins over (WF2) but not (WF1a), which is  inviolable.  There are two solutions:

Outcome 3.  (WF1a) straightforwardly wins over Faith: we get "Bell hooks" initially (as in the wiki page title below, and in most, but not all, of the rest of the page), but "bell hooks" internally (as in the rest of this wiki page).

Bell hooks (link)

On the ADS-L, Bill Mullins (12/14/06) brought up a case similar to "bell hooks":

A former columnist for The Buyer's Guide to Comics Fandom/Comics Buyer's Guide is cat yronwode (pronounced "cat ironwood").  She is also formerly associated with the underground comics houses Kitchen Sink Press and Eclipse Comics.

(Mullins expressed considerable annoyance at people who go "to such great lengths to make their name flout normal conventions of spelling and capitalization".)  John Baker added that the woman in question opts for Outcome 3:

As it happens, cat yronwode herself does not object to her name being capitalized when it begins a sentence.  She posted this on the discussion page for the Wikipedia article about her:  "I use lower case i and lower case name (cat / catherine yronwode) but i do capitalize the first word in a sentence. Some folks who like me think i insist on all refs to my name must be in all lower case, but that is not so. Cat yronwode is my name and my name is cat yronwode -- first letter of the sentence is capped."

Outcome 4.  The conflict between (WF1a) and Faith is a stalemate: no sentence can begin with the woman's name, though "bell hooks" is fine internally.  This solution is not something you can search for, but I'm pretty sure that you could find people with this system.

I have no idea what the relative frequencies of these systems are, or even whether writers are consistent in the way they negotiate conflicts between Faith and WF in different cases.  (I suspect that I'm not consistent: I have no problem with sentence-initial "iPod" and the like, but I'm less happy with sentence-initial "bell hooks" and the like; and I'm not bothered by sentence-initial "(1)" or "/r/-lessness", but I tend to balk at sentence-initial "4,357".)  But it seems likely that all the logical possibilities are out there.

Having read this far, people often write me to ask which system of capitalization is the CORRECT one: what should they DO?  I generally refuse to be directive, on the grounds that whatever you do is likely to annoy someone or another.  As a general matter, conflicts between Faith and WF do not have clearly "correct" solutions -- because Faith and WF are both defensible principles.  The best you can do is use capitalization practices that satisfy you aesthetically and won't annoy too many people in your audience.  If you're writing for publication, of course, an editor or style sheet will probably make the decisions for you.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:09 PM

Standardizing non-standard language vs. careless misquotation

Yesterday, the Washington Post's ombudsman, Deborah Howell, took up the question of quote cleaning: "Quote, Unquote", 8/12/2007:

When you read a quote in The Post, is what's between the quotation marks exactly what the person said? Post policy says it should be, but it ain't necessarily so.

Several readers of an early edition of the July 28 Sports section noticed different versions of the same quote from Redskins running back Clinton Portis in a story by Howard Bryant and a column by Mike Wise. In Bryant's story, Portis said: "I don't know how anybody feels. I don't know how anybody's thinking. I don't know what anyone else is going through. The only thing I know is what's going on in Clinton Portis's life." Wise quoted him as saying: "I don't know how nobody feel, I don't know what nobody think, I don't know what nobody doing, the only thing I know is what's going on in Clinton Portis's life."

According to Howell,

The Post's policy couldn't be clearer: "When we put a source's words inside quotation marks, those exact words should have been uttered in precisely that form."

So Bryant didn't follow the policy, but he said he had never heard of it. To make things worse, Wise's verbatim quote, caught on tape, was changed to agree with Bryant's.

Now, the first thing to say here is that Wise's quote was NOT actually verbatim. Journalists' attempts at quotation almost never are, as we've documented here many, many times (see the bottom of this post for a list of links). It took me just a couple of minutes to go to the Washington Redskins' website, find a video of Clinton Portis' 7/27/2007 media session, and record and transcribe the relevant bit of audio -- and sure enough, Wise got it (somewhat) wrong.

Here's what Portis actually said during the passage in question, lined up with what (Howell said) Wise said he said:

Wise:   I don't know how nobody feel, I don't know what nobody think,
Portis: I don't know how nobody feel, I don't know what nobody thinking,

Wise:   I don't know what nobody doing,         the only thing I know is what's going on in Clinton Portis's life.
Portis: I don't know what nobody going through.     Only thing I know is what's going on in Clinton Portis life

These are small errors, and not many of them, by the spectacularly lax standards of big-time journalistic quotation. By the unforgiving standards that NIST applies to speech-recognition error rate calculations, I believe that we have four word errors here (one substitution, one deletion and two insertions in 31 original words, for a Word Error Rate of 4/31 = 13%).
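For readers who want to check this kind of count themselves, here is a minimal sketch of a word error rate calculation: the smallest number of word substitutions, deletions and insertions needed to turn the reference into the quoted version, divided by the number of reference words. It is not NIST's scoring software. The function names and the normalization (lowercasing, dropping punctuation) are assumptions made for illustration, and the split of the total into substitutions, deletions and insertions depends on the alignment chosen, so the sketch reports only the total.

```python
import re


def tokenize(text):
    # Assumed normalization: lowercase, keep letters, digits and apostrophes,
    # so that punctuation and capitalization are not counted as errors.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_errors(reference, hypothesis):
    """Return (minimum number of word edits, number of reference words)."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    prev = list(range(len(hyp) + 1))  # edits from an empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # reference word dropped (deletion)
                cur[j - 1] + 1,          # extra quoted word (insertion)
                prev[j - 1] + (r != h),  # match, or substitution
            ))
        prev = cur
    return prev[-1], len(ref)


portis = ("I don't know how nobody feel, I don't know what nobody thinking, "
          "I don't know what nobody going through. Only thing I know is "
          "what's going on in Clinton Portis life")
wise = ("I don't know how nobody feel, I don't know what nobody think, "
        "I don't know what nobody doing, the only thing I know is "
        "what's going on in Clinton Portis's life.")

errors, n_ref = word_errors(portis, wise)
print(f"{errors} errors in {n_ref} words: WER = {errors / n_ref:.0%}")
```

On these two strings the sketch finds 4 edits against 31 reference words, which agrees with the total above, although a minimum-edit alignment may divide the errors among the three types differently.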

But what fascinates me here is that Howell didn't bother to (have someone) take ten minutes to check what the verbatim version of the Portis quote actually was.

For the record, here's my transcription of the relevant Clinton Portis Q&A, along with an audio clip so that you can listen for yourself. I've normalized the pronunciation (e.g. "ask" for [aks], "thinking" for "thinkin", "I'm going to" for [ɑɪ.mən], etc.) but I've tried to get the word sequence right.

Q: How hard will it be for you to accomplish that?
I mean, you like to say things, will it be kind of hard to bite your tongue on some things?

A: It won't be hard, I just got to mind my business, you know?
((So that)) I can't fight nobody else battles.
I'm going to mind my own business,
I'm going to keep Clinton Portis out of trouble,
I'm going to keep Clinton Portis focused,
I'm going to keep Clinton Portis on top of his game.
Outside of that, you can't ask me about the next man,
I don't know how nobody feel,
I don't know what nobody thinking,
I don't know what nobody going through.
Only thing I know is what's going on in Clinton Portis life,
and- and that's the only thing I control.

Now, Howell ends up focusing on a fascinating and important question -- whether journalists should change sources' morphology, word choice and sentence structure to conform to the norms of the standard language. Here's what she, Bryant and Wise say about this, according to her:

Bryant, who just left The Post for ESPN, thinks the policy is wrong. "For me, having covered athletes for 15 years, I've always felt conscious and uncomfortable about the differences in class, background and race -- I'm an African American -- and in terms of the people who are doing the speaking and the people who are doing the writing. I really don't like to make people look stupid, especially when I understand what they're saying."

What Bryant did is common among sports journalists, said Emilio Garcia-Ruiz, assistant managing editor for sports. "Sportswriters have been making minor grammatical fixes to athlete's quotes forever. The meaning of what the athlete is saying is not altered, just the grammar. It's rooted in the belief that you shouldn't embarrass someone whose command of grammar is weak. We have told our writers to run quotes verbatim or paraphrase when the grammar is horrific, but some old habits die hard. We will try to do better."

What if television or a tape recording should catch a quote that Bryant changed? "I don't really worry about it," Bryant said. "I am totally convinced -- along racial, class and cultural lines -- that when it comes to white players from the South, reporters instinctively clean up their language. Redskins coach Joe Gibbs, in his own way, can sound as inarticulate as Portis in terms of perfect grammar, so I clean up his language to not embarrass him. I also do it with athletes. What's fair is fair."

Wise disagrees, and he didn't like the fact his verbatim quote was changed without consultation. "I just have a hard time cleaning up anyone's quotes. I just feel it robs people of their personality. And if I'm not capturing who the person is through the rhythm and cadence of their words, I'm not telling the readers who they are. I just feel people need to be portrayed as they sound, irrespective of whether you're an aging white coach or a young black athlete. Otherwise, we run the risk of homogenizing everyone."

But she completely glosses over another point, which is that a significant fraction of the words in journalists' "quotations" are simply wrong, even when they're not trying to standardize a speaker's grammar.

Let me stress that Portis got off relatively lightly. Here are a few of the other ways his answer was quoted in print media, with the (relatively few) mistakes indicated in red:

Joseph White AP (from USA Today, "As camp opens, Portis tries to escape the sting of a lost year", 7/27/2007)

"I'm going to mind my own business," Portis said. "I'm going to keep Clinton Portis out of trouble. I'm going to keep Clinton Portis focused. I'm going to keep Clinton Portis on top of his game. On top of that, you can't ask me about the next man. I don't know how nobody feels. I don't know what nobody's thinking. [...] The only thing I know is what's going on in Clinton Portis' life."

Jim Ducibella, "Portis preaches goal to stay focused", Roanoke Times, 8/2/2007

"I'm going to keep Clinton Portis out of trouble," the Washington Redskins running back vowed last week. "I'm going to keep Clinton Portis focused. I'm going to keep Clinton Portis on top of his game. ... I don't know how anybody feels. I don't know what anybody's thinking. I don't know what anyone's going through. The only thing I know is what's going on in Clinton Portis' life."

Adam Himmelsbach, "Clinton Portis: The chip's back on his shoulder", The Fredericksburg Free Lance, 7/20/2007

"I just have to mind my business," Portis said. "I can't fight no one else's battles. I'll keep Clinton Portis out of trouble. I'll keep Clinton Portis focused and I'll keep Clinton Portis on top of his game. Outside of that, you can't ask me about the next man. I don't know how no one's feeling. [...] I don't know what nobody's going through. All I know is what's going on in Clinton Portis' life."

I agree with the Post's policy, which I believe is pretty much the standard one (though I've got no problem with regularizing spelling and even morphology, and editing out most disfluencies). But I don't believe that I've ever checked a journalist's transcription against a recording and found that it was actually word-for-word accurate, even allowing for editing of this type.

In this case, why should Clinton Portis's "nobody" be replaced with "no one", or his "outside of that" be replaced by "on top of that", or his "going through" be replaced by "doing"? That's not standardizing his language, it's just slipshod transcription.

These are not isolated errors. They're representative of the normal practice of journalism. The superficial issue is that journalists -- as a culture -- don't act as if they care whether quotes are accurate. The deeper issue, in my opinion, is the role that such quotes play in journalistic rhetoric. They're usually not treated as data, facts about the world in need of explanation, but rather as illustrations or expressions of the writer's opinions and conclusions, put into someone else's mouth because the rhetorical norms of the profession require it.

Often, such quotes are made to order by getting sources to answer leading questions, over and over again, and ignoring all of the answers that don't fit the framework that the writer has in mind. In other cases, bits and pieces of quotation are taken out of context and strung together in order to create a meaning that suits the writer's intent (which may or may not have been the speaker's intent).

When you think of quotes from sources in that cultural framework, it's hardly surprising that it's hard to get journalists to pay attention to whether someone said "going through" or "doing".

Some other relevant Language Log posts:

"Journalists' quotations: unsafe in any mood", 5/24/2007
"News and entertainment", 9/11/2006
"'Approximate' quotations can undermine readers' trust in the Times", 8/27/2005
"This time it matters", 8/13/2005
"'Quotations' with a word error rate of 40-60% and more", 7/30/2005
"Ethnography, journalism and interview rituals",
"Bringing journalism into the 21st century", 6/30/2005
"More comments on quotes", 7/1/2005
"Down with journalists!", 6/27/2005
"Ritual questions, ritual answers", 6/25/2005
"Ipsissima vox Rasheedi", 6/24/2005
"What did Rasheed say?", 6/23/2005
"Typography, truth and politics", 9/15/2004

[Update -- Suzette Haden Elgin writes:

I'm writing as someone whose native dialect of English is non-standard [and who has always done her best to present her "scholarly papers" at conferences in militant Ozark English].

And what I want to say -- finally getting to the point, you know how us Ozarkers go on and on and on and never get to the point -- is that the sequence "cleaning up non-standard language" _presupposes_ that non-standard language is dirty. Filthy, even.

And she's absolutely right -- in the original version of this post, I took the "cleaning up" phrase from the WaPo reporter's quote, and used it without even any scare quotes. So I've gone back and changed the wording to use "standardize" and other such expressions instead.

There are lots of these nasty metaphors, according to which non-standard speech and language are not only dirty, but also lawless and diseased. It's hard to avoid using them, because they get to be so bleached out after a while that we forget what they really mean.]

Posted by Mark Liberman at 07:05 AM

August 11, 2007

The Semantics of Pork

A new controversy has arisen over the bill to expand health care for children. Hitherto, the controversy has been that proponents of the bill, roughly speaking, the Democrats, think that as many children as possible should have health care, whereas the opponents, roughly speaking, the Republicans, do not. According to the New York Times, critics of the bill are now claiming, with some reason, that the bill is loaded with pork, that is, with allocations to specific hospitals and projects. The critics point out that Congress has undertaken both to reduce the amount of pork and to be more transparent about it. This bill violates that commitment by not naming the hospitals to receive special funding but instead specifying them indirectly.

For example, the bill provides that, for purposes of Medicare:

any hospital that is co-located in Marinette, Wis., and Menominee, Mich., is deemed to be located in Chicago

Now, as it happens, there is only one hospital, Bay Area Medical Center, that meets this description. Critics say that this is a way of obscuring where the pork is going. The effect of deeming such a hospital to be located in Chicago is to increase the reimbursement rate for physicians, since physicians in Chicago are more highly paid than in outlying areas. Proponents of the bill argue that this allows hospitals in outlying areas to compete for staff with big cities. In theory, then, the abstract formulation of the bill has a virtue: since all hospitals in a given area will be subject to the same conditions, it automatically includes every such hospital, whereas if the hospitals were specified by name, newly created hospitals, or hospitals inadvertently omitted from the bill, would be excluded. As far as I know, no proponent of the bill has actually made this argument. I suspect that it isn't a very strong argument, since the local Congressperson is not likely to forget about the hospitals in his or her district and new hospitals do not spring up very often.

Whatever we make of the politics, this controversy provides a beautiful illustration of an important concept from semantics, namely the distinction between intension and extension. The denotation of a referring expression such as a Noun Phrase may be defined as its extension, that is, as the set of entities to which the expression refers. The extension is simply a list. Another way of defining the denotation of an expression is by means of its intension, that is, the constraints that an entity must satisfy for the expression to apply to it. For example, if the expression is "red balls", I could tell you which balls it refers to by pointing at each of the balls, which would be an extensional definition, or I could explain to you the meaning of "red" and of "ball", which would be an intensional definition and would allow you to identify the red balls by determining which of the entities in the universe are balls and are red.

The distinction between intension and extension is important because people are not always aware of the relationship between the two. The classic example is the assertion: "Hesperus is Phosphorus", which translated from classical terms into modern English is equivalent to: "The Evening Star is the Morning Star." The ancients thought of these two heavenly bodies as distinct, but we now know that they are actually the same entity, namely the planet Venus. If we look only at the extension, it is a mystery that the statement "The Evening Star is the Morning Star." is informative, since it is equivalent to "The planet Venus is the planet Venus.", a tautology. If we look at the intensional meaning, however, we can make sense of this, since what the proposition means is: "The entity with the properties we associate with the Evening Star is the same as the entity that we associate with the Morning Star." Unless and until we know that both are the planet Venus, this is not self-evidently true and is thus informative.

In the case discussed above, the language of the bill describes an intension. In the world as it is, the extension consists only of one entity, the Bay Area Medical Center. Although in this bill the use of intensional descriptions is questionable, in general it can be quite useful because it abstracts away from our ignorance and from changes that we cannot anticipate.
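For the programmers in the audience, the contrast can be sketched in a few lines of Python: the intension is a condition (a function that tests an entity), and the extension is whatever set of entities happens to pass the test in a given state of the world. Everything here apart from Bay Area Medical Center and the two towns named in the bill is invented for illustration, including the `Hospital` class and the two "Hypothetical" hospitals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hospital:
    name: str
    towns: frozenset  # towns in which the hospital is located


# Intension: a condition, modeled loosely on the bill's wording.
def co_located_in_marinette_and_menominee(hospital):
    return {"Marinette, Wis.", "Menominee, Mich."} <= hospital.towns


# Extension: the entities that satisfy the condition in a particular world.
def extension(intension, world):
    return {h.name for h in world if intension(h)}


both_towns = frozenset({"Marinette, Wis.", "Menominee, Mich."})

world_today = [
    Hospital("Bay Area Medical Center", both_towns),
    Hospital("Hypothetical Lakeside Hospital", frozenset({"Chicago, Ill."})),
]

# A made-up later world in which a second co-located hospital has opened.
world_later = world_today + [
    Hospital("Hypothetical Twin Cities Clinic", both_towns),
]

print(extension(co_located_in_marinette_and_menominee, world_today))
print(extension(co_located_in_marinette_and_menominee, world_later))
```

The condition never changes, but the set it picks out does, which is the sense in which an intensional description abstracts away from changes we cannot anticipate. A list of names would have to be amended by hand.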

My favorite example of this distinction, albeit one that isn't quite as perfect, is due to Clint Eastwood. In 1980 he was interviewed during the filming of Bronco Billy. The interviewer said that some would define a Clint Eastwood picture as a violent, ruthless, lawless, and bloody piece of mayhem. He then asked Eastwood how he would define a Clint Eastwood picture. To the interviewer's intensional definition, Eastwood gave the (nearly) extensional response:

To me, what a Clint Eastwood picture is, is one that I'm in.

[The reason that the Eastwood example is not quite as clear as the hospital example is that "one that I'm in" is, if we are very fussy, still intensional. It comes close to being extensional both in that it is a small finite set and in that it is essentially arbitrary.]

Posted by Bill Poser at 06:55 PM

Yet another snowclone omnibus

Living on the slopes of Snowclonia as I do, I'm always in danger of being swept away by an avalanche.  And those snowclones have been piling up for nine months since my last snowclone omnibus.  It's time for some snowclone removal.

By way of introduction, here's a tribute to the current World Champion Snowclone, The New Y:

(Hat tip to Ornithopter.  Note: I am no longer adding examples to my file on The New Y; the case is closed.)

In this posting, I'm going to survey snowclones (well, snowclone candidates) that are new to Language Log since my earlier omnibus posting.  In another posting, I'll comment on some of the older ones.

1.  From Alex Baumans, 11/12/06: "It's X, Jim, but not as we know it" (from Star Trek; there are examples without a name and with names other than Jim).

2 and 3.  From Michael Covarrubias, 11/12/06, two formulas closely related to "As a X, N is a great Y":

The first is the standard "X is to Y what Z is to Q" -- which can be used as a compliment, an insult, or a neutral (though sometimes odd) observation.  E.g. "She is to academics what Olivier is to acting" or "He is to relationships what Gallagher is to watermelons" or "She is to cooking what Stephen King is to writing."  [Since blogged on, with an example from Zippy the Pinhead.]

The second is a simple change in the formula: X is to Y what Z is to Y.  There's usually a partial echo between X and Z.  I first heard it used by Comedian Jeff Ross in a Friars Club Roast of Drew Carey (1998).  He said "Drew Carey is to comedy what Mariah Carey is to comedy."

4.  From Rebecca Egipto, 11/11/06, "I'm in ur Noun V-ing ur Noun" (according to the account here, taking off from "I am in your base killing your d00ds"); mentioned by Mark Liberman back in April, in the lol-lexicography thread.

5.  From Paul Kalmar, 11/11/06: "Save a X, ride a Y".  This seems to have spread from the (catchy) country song "Save a horse, ride a cowboy", in the 2004 recording by Big & Rich (on Horse of a Different Color), though others (e.g. Haley Bonar) have recorded the song.  I found lots of variants:

Save a horse, ride a: skater/feminist/donkey/cowgurl/Hornet/tractor.

[reversed] Save a cowboy, ride a horse.

Save a skateboard, ride a skater.  Save a wave, ride a surfer.

[ride and save reversed] Ride a Llama, save a NukeWorker.

6.  From a correspondent on 11/11/06, "Sufficient unto the X is the Y thereof" (based on "Sufficient unto the day is the evil thereof", Matthew 6:34).  Examples from my correspondent:

Sufficient unto the logic is the rigor thereof.  [grad students in philosophy, about their logical studies]

Sufficient unto the occasion is the idiom thereof.

Sufficient unto  the toaster is the destruction thereof.

Sufficient unto the facts is the paranoia thereof.

Sufficient unto the day is the drivel thereof.

7.  From David Hilberg, 11/14/06: "You can't X your Y and Z it too" (based on "You can't have your cake and eat it too"; yes, I know the proverb doesn't make much sense in this form).  Some examples Hilberg googled up:

You can't capitalize your cake and expense it too.

You can't have your torture and ban it too.

You can't have your pyatiletka and NEP it too. [DH: Whatever that may mean.]

You can't have your cliché and use it too - can you?

Two Eskimos sitting in a kayak were chilly; but when they lit a fire in the craft, it sank, proving that you can't have your kayak and heat it too.

Alas, you can't have your Cab and drink it too. No sooner enjoyed but worthless, it's sometimes even dead on uncorking.

Who says you can't have your data and store it, too?

Who says you can't have your heritage and develop it, too?

Who says you can't have your fUZZ and WAH it too?

"You" "can't" "have" "your" "use" "and" "mention" "it" "too." --Douglas Hofstadter

"You Can't Offer Your Sacrifice and Eat It Too: A Polemical Poem from the Aramaic Text in Demotic Script." Journal of Near Eastern Studies 43 (1984) 89-114.

Sudz Cola - Who says you can't take your bath and drink it too? [DH: Supposedly this is an old saw, though I was never so warned.]

8 and 9.  Two snowclones reported by Mark Peters on the ADS-L, 12/6/06: "Pimp my X" and "Stupid X tricks".

10.  From James Sinclair, 12/28/06: "If X are outlawed, only outlaws will have X" (presumably the original has X = guns).  Sinclair found the following, among many others, in a Google search:

puns, cigarettes, giant pythons, cigars, cluster bombs, chickens, socks, pickles, Beanie Babies, pit bulls, tomatoes, catapults

He was spurred by an article by Patrick Hruby on's Page 2, where Hruby declares:

If dead eels tied to ropes are outlawed, only outlaws will have dead eels tied to ropes.

11.  From Liz Coppock,  12/24/06: "A watched X never Ys" (based on "A watched pot never boils").  Her first example was

A watched file never downloads.

and you can google up many other variants.

12.  From Jason Grafmiller, 1/16/07: "The once and future X" (based on "The once and future king").  First ten examples (with X other than "king") googled up:

web, sun, country, threat, action network, analytic powerhouse, carbohydrate economy, cosmos, nanomachine, Steve Jobs

13.  From Andy Hollandbeck, 1/25/07: "Nothing says X like Y" (based on the Pillsbury advertising slogan "Nothing says lovin' like something from the oven").  Many occurrences with X = lovin', but others can be googled up.  From Hollandbeck himself, "after a particularly lascivious time-out routine by the Pacer Pacemates" (i.e., cheerleaders, in old-fashioned talk):

Nothing says "Go team!" like simulated sex.

14.  Blogged about by Mark Liberman a while back: "X: panacea or Y?"

15.  Joining "X are from Mars, Y are from Venus": "Men are from X, women are from Y", as in

... "in both the United States and Europe, there appears to be a striking discrepancy between the body that men think women like and the body women actually like."  Put another way, men are from Schwarzenegger, and women are from Quetelet.  (Stephen S. Hall, Size Matters, p. 235)

More recently, there's Janet Hyde's wonderful

Men are from North Dakota and women are from South Dakota.

mentioned in Mark Liberman's posting "Men are from ..."

16.  "Are we X yet?" (and its variant "Are we X now?"), which I finally got around to blogging on recently.

17.  From Jon Winokur's Encyclopedia Neurotica, a reference to comedian Richard Lewis's claim to have originated the phrase "the BLANK from Hell/hell" ("the date from hell", "the roommate from hell").  From the Wikipedia page for Lewis (back in December 2006):

This theory is expounded in the Curb Your Enthusiasm episode "The Nanny from Hell". Lewis has petitioned the editors of Bartlett's to be given credit for the coinage, but the editors claim that the phrase was a common idiom prior to Lewis' use of it.

The Yale Book of Quotations (p. 458) credits Lewis warily:

Richard Lewis
U.S. comedian, 1947-

1 [Self-description:] Comedian from hell.

Quoted in Chicago Tribune, 20 Apr. 1986.  Earliest documented example of the expression "from hell" referring to a person.

18.  From Cole Paulson, 2/11/07: "X city".  The example Paulson sent me was

This new girl is random city!  We have nothing in common!

and he also had noted occurrences of "weird city" referring to people.  Both have an adjective X.  Paulson posted his observations to ADS-L on 2/19/07.  Dennis Preston then suggested that "Fat City" was the original, and added that in his perceptual dialectology work in the 1980s he got lots of "N City" nonce names for areas of the U.S.: "Rebel City" (the South), "Eskimo City" (Alaska), "Cowboy City", etc.

I added that I'd been assuming that the original had a noun X; at some point we got "Sin City", and then "Spin City" (the television sitcom) as a take-off on that.  More recently, we get adjective Xs, and the use of the formula has extended from place/region names to predicatives applicable to all sorts of things (or people).  These extensions could be from "N City" examples, or they could have developed from "the Adj City" names (like "the Windy City" for Chicago), with the common-noun construction, having a definite article, turned into an anarthrous proper name.  Or, of course, both.

I suggested that the early uses of "Fat City" were for actual places -- "Los Angeles is Fat City", meaning it's a place of opportunity or success  -- or for metaphorical places, as in "I'm in Fat City now" (note the preposition), meaning I've achieved success.

Finally, I noted "X City" examples with X probably to be analyzed as a verb: "suck city" (city that sucks), "barf city" (a place that makes you want to vomit, or the act of vomiting), "fuck city" (a place where you can get laid, or getting laid).

19.  In the middle of February, I came across an example of "As X falls, so falls X Falls".  The original is Pat Metheny Group's "As Wichita Falls, So Falls Wichita Falls" (first released in September 1980).  There aren't a lot of examples, because of course you need a place name "X Falls" to build on, and there are only so many of these.  But I quickly found cites for:

Seraphim, Idaho, Cuyahoga, St. Anthony, Niagara

20.  Then there's "X for Jesus", blogged about back in March by Mark Liberman under the title "Snowclones for Jesus", which cited Karl Hagen's blog on the topic.

21.  And "X's X", which I posted about in April.

22.  On 4/21/07, James Harbeck posted to the ADS-L about "Step away from the X":

Here's a lately popular idiom, started by "Step away from the vehicle," I'm pretty sure. "Step away from the Blackberry" gets 158 Google hits all by itself. If you search just "step away from the" you get nearly a million hits; some of the ones on the first couple of pages are graffiti, cold medicine, keyboard, spell-checker (yay for that one!), social media release, PC, mousse (with your hands up), shovel, podium, Gatt chart, jokes, computer, and tofu burger.

Larry Horn added:

... crucial too is the intonation, which ideally approaches that of a police officer enunciating clearly and forcefully, possibly through a loudspeaker.  There's often a slight pause between "step" (with unreleased [p], rather than the usual elision with the following vowel) and a slight rise on "away"; someone with a better control of the descriptive terminology could  do a better job of narrowing down exactly what this intonation is.  This has been quite popular in

23.  Then came the lolcats snowclones, blogged on in at least five postings here: 4442, 4485, 4500, 4507, and 4508.

24.  Late in May, the folks on ADS-L mused on a set of formulas involving the prefix Mc-.  One of these, "Xy McXerson" ("Drinky McDrinkerson"), was in my last snowclone omnibus, but there are more, including a somewhat less restricted formula for derogatory invented proper names; Joe Salmon's favorite (5/24/07) was "Drunky McPukeshoes" for Tom DeLay.  Searching the ADS archives for May under the subject line {"Mc-" prefix} will get you the whole convoluted thing.

25.  In July, Mark Liberman took up "X considered harmful" here.

26.  And in August "I am X, hear me Y".

27.  At the beginning of August, Bonnie Taylor-Blake started an ADS-L thread on the quotation "He may be a son of a bitch, but he's our son of a bitch" (attributed to FDR with reference to Nicaraguan dictator Anastasio Somoza, in the wording given by the Yale Book of Quotations (p. 647)).  I then issued a

snowclone alert!  In the first 40 google webhits for {"he may be a * but he's our"} I get 14 different fillers for

He may be a X, but he's our X

(in addition to versions of "son of a bitch"):

jackass, bastard (one hit attributing this to FDR, of Somoza), fool, crook, jerk, lunatic, monster, scorpion, swine, terrorist, sleazebucket, scumbag, butcher, devil
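A wildcard search like the one above only fixes the frame; whether the two slots are filled by the same X still has to be checked. Here is a rough sketch of that check in Python, using a regular expression with a backreference. The pattern, the function name `fillers` and the sample sentences are all assumptions made for illustration, not the search that was actually run.

```python
import re

# "He may be a(n) X, but he's our X": the named group must recur unchanged.
PATTERN = re.compile(
    r"\bhe may be an? (?P<x>[\w' -]+?),? but he's our (?P=x)\b",
    re.IGNORECASE,
)


def fillers(texts):
    """Collect the distinct X fillers, in order of first appearance."""
    found = []
    for text in texts:
        for match in PATTERN.finditer(text):
            filler = match.group("x").lower()
            if filler not in found:
                found.append(filler)
    return found


# Invented sample sentences; the last one has the frame but no repeated X.
samples = [
    "He may be a son of a bitch, but he's our son of a bitch.",
    "He may be a jackass, but he's our jackass.",
    "He may be a crook but he's our crook!",
    "He may be a fool, but he's their problem.",
]
print(fillers(samples))
```

A pattern this strict will miss looser variants (a different noun in the second slot, say), so it undercounts rather than overcounts.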

So much for the "new" snowclones.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 04:23 PM

French spam

I'm receiving more and more spam emails in bad French, full of grammatical errors and poorly chosen words. I wonder whether the Académie Française or the government of France will now devote itself to the suppression of spam?

Posted by Bill Poser at 03:57 PM

The bard of Springfield

Ben Macintyre ("Any word that embiggens the vocabulary is cromulent with me", Times OnLine, 8/11/2007) considers "the role of The Simpsons in the evolution of the English language", and quotes me:

According to Mark Liberman, of the University of Pennsylvania Linguistic Data Consortium: "The Simpsons has apparently taken over from Shakespeare and the Bible as our culture's greatest source of idioms, catchphrases and sundry other textual allusions."

This was the inspiration for the cute illustration heading the piece:

Since I may become famous as the source of this quotation, I'm encouraged to wonder whether it's true.

This all started back in March of 2004, when I mentioned the idea in passing and in a rather non-authoritative tone:

Bert sends along an intriguing example from The Simpsons (By the way, I've been told that The Simpsons has now taken over from Shakespeare and the Bible as the largest single source of quotations and allusions in English-language text. I'm not sure who measured this, or how, or when. Most likely someone just made it up, like 87% of all cited statistics. However, it might well be true...)

I recycled the idea, more boldly but still modified by "apparently", in a post titled "Homeric objects of desire", 1/7/2005:

The Simpsons has apparently taken over from Shakespeare and the Bible as our culture's greatest source of idioms, catch phrases and sundry other textual allusions.

So now that it's too late, I'm asking myself, in my intermittently positivist sort of way, is it true? I'll do some thinking about how to test such a thing, but if you have any suggestions (or even better, any results), let me know.

[Note that Macintyre is also indebted to Ben Zimmer's Language Log post "Meh-ness to society", 6/8/2006; and probably also to the many contributions of the world's foremost expert on Simpson linguistics ("Simpsonolinguistics"? "Simpsolinguistics"?), Heidi Harley.]

Posted by Mark Liberman at 11:34 AM

Etymology as poetry

Before the cold-soup season is over, you should swing by Language Hat's place to check out (or contribute to) the most interesting blog topic of the week: research into the origin of the word gazpacho. "The etymology", as Hat quotes Holt and Barbara explaining, "is wonderful".

In particular, I hope that you're as happy as I am to learn that the two leading scholarly candidates for the source are Spanish caspicias "remainders, worthless things" (suggested by the American Heritage Dictionary), and Greek γαζοφυλάκιον (gaza-phulakion) "treasure-guarder, treasure house" (suggested by the Real Academia Española's Diccionario de la lengua española), and that telling philological objections have been registered against both proposals.

Posted by Mark Liberman at 08:30 AM

Baby videos and Google News comments

A few days ago, an interesting paper came out: F. J. Zimmerman, D.A. Christakis, and A.N. Meltzoff, "Associations between Media Viewing and Language Development in Children under Age 2 Years", The Journal of Pediatrics, article in press, available online 7 August 2007. The study design:

A total of 1008 parents of children age 2 to 24 months, identified by birth certificates, were surveyed by telephone in February 2006. Questions were asked about child and parent demographics, child-parent interactions, and child's viewing of several content types of television and DVDs/videos. Parents were also asked to complete the short form of the MacArthur-Bates Communicative Development Inventory (CDI). The associations between normed CDI scores and media exposure were evaluated using multivariate regression, controlling for parent and child demographics and parent–child interactions.

And the results?

Among infants (age 8 to 16 months), each hour per day of viewing baby DVDs/videos was associated with a 16.99-point decrement in CDI score in a fully adjusted model (95% confidence interval = −26.20 to −7.77). Among toddlers (age 17 to 24 months), there were no significant associations between any type of media exposure and CDI scores. Amount of parental viewing with the child was not significantly associated with CDI scores in either infants or toddlers.

If you've been reading Language Log recently, you might think that I'm going to complain about how this paper was described in the mainstream media. But I'm not -- the coverage was a bit sensationalized, unsurprisingly, but not especially problematic. Instead, I want to point to two interesting and basically good things, one from a big-time new-media source and the other from the paper itself.

Also on August 7, Google News rolled out "Perspectives about the news from people in the news".

We'll be trying out a mechanism for publishing comments from a special subset of readers: those people or organizations who were actual participants in the story in question. Our long-term vision is that any participant will be able to send in their comments, and we'll show them next to the articles about the story. Comments will be published in full, without any edits, but marked as "comments" so readers know it's the individual's perspective, rather than part of a journalist's report.

[...] [W]e're hoping that by adding this feature, we can help enhance the news experience for readers, testing the hypothesis that -- whether they're penguin researchers or presidential candidates-- a personal view can sometimes add a whole new dimension to the story.

In the case of the baby-video study, there are comments by two of the three authors, Frederick J. Zimmerman and Dimitri A. Christakis.

Given how much I've complained about misleading science journalism, it's easy for me to see the upside. At least, researchers will have an enhanced opportunity to reach the public with their own perspective on work that makes it into the popular press.

But there's an interesting and important set of editorial problems here. How will the folks at Google News decide who is one of the "actual participants in the story in question"? There are some people who obviously are in that set, and some who obviously are not, but there's a grey area, where some arbiter will have to decide who's in and who's out. Who will that arbiter be? What criteria will be used? And what sort of accountability or appeal will there be?

And how, on the notoriously fraud-prone internet, will Google's comments editors verify that Dr. Martha D. Penguin-Researcher is who she says she is? I wonder when we'll see the first hoax.

I also want to mention a couple of good things about the Zimmerman et al. study.

First, they focus on the magnitude of the negative effect of "baby videos", not just on the question of whether the effect was "statistically significant". And they express the effect's magnitude in three ways -- (1) as regression results, where the regression coefficients are denominated in intuitively meaningful units, namely percentile increases or decreases in normed CDI scores; (2) as a comparison of the effects of various interventions, including positive ones like reading or telling stories to the children at least once a day; (3) as a translation of the normed-CDI-score effects into more concrete terms, namely the expected difference in vocabulary size.

Second, they present a clear and sensible discussion of what their results might mean, including an explicit discussion of several alternative interpretations other than (their preferred) conclusion that baby videos harm infant language development. It's surprisingly rare to see this kind of frank and honest appraisal of alternative explanations.

(For an example of a published study whose presentation is, well, different from this, see here.)

Here's the table of regression results from Zimmerman et al., followed by part of their discussion:

| Variable | Age 8 to 16 months: Coefficient | [95% CI] | Age 17 to 24 months: Coefficient | [95% CI] |
|---|---|---|---|---|
| Parental interactions | | | | |
| Reading at least once daily | 7.07* | [0.53, 13.60] | 11.72* | [1.86, 21.59] |
| Storytelling at least once daily | 6.47* | [0.23, 12.71] | 7.13† | [−0.11, 14.37] |
| Music listening at least several times weekly | 5.36 | [−1.92, 12.64] | 7.2 | [−2.10, 16.50] |
| Children's media watching time (hours/day) | | | | |
| Baby DVDs/videos | −16.99** | [−26.20, −7.77] | 3.66 | [−4.45, 11.77] |
| Children's educational shows | 1.72 | [−4.42, 7.87] | 2.21 | [−1.74, 6.15] |
| Movies and children's noneducational TV | 6.6 | [−1.81, 15.02] | 2.03 | [−2.78, 6.83] |
| Grownup TV | −1.42 | [−11.57, 8.73] | 2.38 | [−5.68, 10.45] |
| Parental viewing with child | | | | |
| Rarely or about half the time (referent) | | | | |
| Usually or always | 5.57 | [−2.10, 13.23] | 0.39 | [−6.74, 7.52] |
| N/A: no media viewing | −7.70 | [−15.49, 0.08] | 2.65 | [−7.29, 12.60] |
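
For readers who want to check the table for themselves, here is a minimal sketch that takes the infant (age 8 to 16 months) coefficients and confidence intervals as printed above and flags the rows whose 95% interval excludes zero. The function names are made up for illustration, and the standard-error estimate assumes a symmetric normal-approximation interval (z = 1.96), which the paper may or may not have used.

```python
# Coefficients and 95% confidence intervals for infants (age 8 to 16 months),
# copied from the Zimmerman et al. table: (coefficient, CI lower, CI upper).
infant_results = {
    "Reading at least once daily": (7.07, 0.53, 13.60),
    "Storytelling at least once daily": (6.47, 0.23, 12.71),
    "Music listening at least several times weekly": (5.36, -1.92, 12.64),
    "Baby DVDs/videos": (-16.99, -26.20, -7.77),
    "Children's educational shows": (1.72, -4.42, 7.87),
    "Movies and children's noneducational TV": (6.6, -1.81, 15.02),
    "Grownup TV": (-1.42, -11.57, 8.73),
}


def excludes_zero(low, high):
    """True when the whole interval lies on one side of zero."""
    return low > 0 or high < 0


def approx_standard_error(low, high, z=1.96):
    """Back out a standard error from a symmetric interval.

    Assumption (not stated in the post): the interval is
    coefficient +/- z * SE with z = 1.96.
    """
    return (high - low) / (2 * z)


# Rows whose 95% interval does not contain zero.
significant = [
    name
    for name, (_, low, high) in infant_results.items()
    if excludes_zero(low, high)
]

for name in significant:
    coef, low, high = infant_results[name]
    print(f"{name}: {coef:+.2f} (approx. SE {approx_standard_error(low, high):.2f})")
```

The rows flagged this way are the same three that carry significance markers in the infant column of the table: daily reading, daily storytelling, and baby DVDs/videos.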

This analysis reveals a large negative association between viewing of baby DVDs/videos and vocabulary acquisition in children age 8 to 16 months. The 17-point difference associated in the analysis with each hour of baby DVD/video watching corresponds to a difference of about 6 to 8 words for a typical child out of the 90 included on the CDI. There are 3 possible reasons for this association. First, because many baby DVDs/videos are heavily advertised as promoting cognitive, language, and brain development, it is possible that parents who are concerned about their child's language development turn to baby videos for help. If this is indeed the case, then it would be fair to say that the poor language development causes greater viewing of baby DVDs/videos.

A second possible explanation for the association between baby DVD/video viewing and vocabulary is that of residual confounding; that is, other variables (not measured in our data) could lead to both high baby DVD/video watching and slow language development. One possible example to illustrate this would arise if those parents who have their children watch a heavy dose of baby DVDs/videos are those who are less motivated to actively promote their children's language development. We partially controlled for this possibility with the parent-interaction variables, but we cannot capture the quality of these interactions, which surely varies. A second possible source of residual confounding would exist if parents who are inattentive, distracted, or simply pressed for time are more likely to rely on baby DVDs/videos as a babysitter. Such parents also might be less likely to know how many words their children know. Although we attempted to adjust for many social and demographic factors that might confound the observed association, it is possible that this adjustment was incomplete.

Finally, it is possible that heavy viewing of baby DVDs/videos has a deleterious effect on early language development. The first 3 years of life are characterized by rapid brain development, and environmental factors are known to influence how the brain develops. It is plausible that extensive exposure to an absorbing but not developmentally constructive stimulus could affect brain development and language acquisition. Heavy viewing of baby DVDs/videos may constitute such an environmental influence. If so, there are several potential causal mechanisms through which such an effect might occur. The viewing of baby DVDs/videos might crowd out interaction time with adult caregivers in ways not measured here. For example, we did not measure the time parents spend directly talking to their infants, or the nature and quality of this verbal input, which are known to be important factors in early language development. Baby DVDs/videos contain limited language and display a certain combination of formal features (short scenes and flashy screen images), which might not promote vocabulary learning or might lead to habits of mind that actually impede it. Whether these formal features are systematically different than those of the other content types represented here has not been formally studied.

Here's a screenshot of what the Google News "comments" page for the Zimmerman et al. story looks like this morning, since I have no idea how long this story and its associated comments will remain accessible:

[Update -- Disney defends Baby Einstein against the UW press release, and demands a retraction. Disney makes some good points -- the UW press release begins

Despite marketing claims, parents who want to give their infants a boost in learning language probably should limit the amount of time they expose their children to DVD's and videos such as 'Baby Einstein' and 'Brainy Baby'. Rather than helping babies, the over-use of such productions actually may slow down infants eight to 16 months of age when it comes to acquiring vocabulary, according to a new study by researchers at the University of Washington and Seattle Children's Hospital Research Institute.

although the study didn't actually examine the effect of baby videos in an experimental setting, being based only on answers to a survey that didn't ask about any products in particular, and carefully avoids making causal claims. The study didn't deal with anyone's marketing claims at all. One point that Disney fails to make is that all the survey results put together accounted for only 17% of the variance in CDI scores.

However, it would be a lot easier to feel sympathetic to Disney if they had themselves done any experimental validation of their marketing claims about the product that they sell, and in particular any experiments to verify that the product doesn't actually do harm. (I conclude that no such experiments have been done from the fact that the flack doesn't cite them.)

Anyhow, the Google News comments from the study's authors are gone, replaced by a stern note from Baby Einstein's general manager:


Posted by Mark Liberman at 07:34 AM

August 10, 2007

SCO Loses!!!

Three years ago Geoff Pullum and I both wrote about the claims by a company known as SCO that the Linux operating system contains large amounts of code taken from Unix, to which SCO claimed to own the copyrights. In addition to making threats against Linux users, SCO filed a $3 billion lawsuit against IBM. Further complicating matters, Novell announced that it, rather than SCO, owned the Unix copyrights, leading SCO to sue Novell for slander of title.

The outrage over SCO's claims was due largely to the wholly unsupported allegation that Linux had stolen large amounts of proprietary Unix code, for which SCO was unable to produce any convincing evidence. The courts have not yet ruled on that aspect of the case, though the judge in SCO v. IBM has commented on the paucity of evidence that SCO was able to produce. Today, however, Judge Dale Kimball ruled that Novell, not SCO, owns the copyrights. This of course guts SCO's case against IBM as well. The Free Software movement has been vindicated and SCO will soon be bankrupt.

Posted by Bill Poser at 10:07 PM

A bet against speech recognition

From blogger Russell Shaw at ZDNet: Man endures thumb surgery to better enable iPhone use. The subject of the surgery, one Thomas Martel, tells a reporter:

"Sure, the procedure was expensive, but when I think of all the time I save by being able to use modern handhelds so much faster, I really think the surgery will pay for itself in ten to fifteen years."

When I read this story I couldn't help but think back to ten years ago, when Bill Gates was predicting that speech would be a standard part of the PC's human-computer interface within a decade. Blogger Matthew Paul Thomas has a collection of Bill Gates quotes on speech recognition showing how this projection has evolved over time (with links to sources), e.g.

Bill Gates, 1 October 1997: “In this 10-year time frame, I believe that we’ll not only be using the keyboard and the mouse to interact, but during that time we will have perfected speech recognition and speech output well enough that those will become a standard part of the interface.”
Bill Gates, 24 March 1999: “Speech recognition … I don’t think you’ll see dictation as something that most people will use in the next couple of years. The extra processing power, getting the extra memory I think has us on a track to provide that, but for most people, I think it will be more like a five-year time frame before that’s a standard way of interacting.”
Bill Gates, 26 March 2001: “Because of where we’re going with real-time communications, including the instant messaging that will be included in Windows itself, voice annotation, voice communication and speech recognition are becoming mainstream capabilities. And so we believe that virtually all PCs should have that right out of the box.”
Bill Gates, 25 February 2004: “Now, with speech it’s not as easy. Speech is another one that will be solved, and will be solved for a broad range of applications within this decade.”

Speech technology has definitely been improving -- beyond the formal studies in the research community, witness all the interaction we can do these days to get telephone listings, driving directions, movie tickets, etc., not to mention the whole variety of successful speech transcription products out there in the market. But will the technology advance quickly enough to make Mr. Martel regret his choice? Time will tell.

[Hat tip to Greg Schnitzer]

[Update: it turns out that the surgery story was a hoax. Oddly, the newspaper in which it appears says that it "features stories from the paper as well as other news, satire and commentary", but nothing at the site gives any indication of which is which. Nonetheless, it was an interesting opportunity for a ten-year perspective on Bill Gates's predictions...]

Posted by Philip Resnik at 01:48 PM

I am X, hear me Y

According to Allen Salkin, "Be Yourselves, Girls, Order the Rib-Eye", NYT 8/9/2007,

In an earlier era, conventional dating wisdom for women was to eat something at home alone before a date, and then in company order a light dinner to portray oneself as dainty and ladylike. For some women, that is still the practice. "It's better not to have a jalapeño fajita plate, especially on the first date," said Andrea Bey, 28, who sells video surveillance equipment in Irving, Tex., and describes herself as "curvy." [...]

But others, especially those who are thin, say ordering a salad displays an unappealing mousiness.

"It seems wimpy, insipid, childish," said Michelle Heller, 34, a copy editor at TV Guide. "I don't want to be considered vapid and uninteresting."

Ordering meat, on the other hand, is a declarative statement, something along the lines of "I am woman, hear me chew."

This is a snowclone that we've somehow managed to miss.

The original, of course, was Helen Reddy's 1972 hit "I am woman":

I am woman, hear me roar
In numbers too big to ignore
And I know too much to go back an' pretend
'cause I've heard it all before
And I've been down there on the floor
No one's ever gonna keep me down again

Chew is new, I think, but the process of substituting something else for roar probably began soon after the song came out. (Though the earliest example I found in a few seconds of searching was the headline "I am woman, hear me pour breakfast juice", from the Dallas Morning News, August 16, 1990.)

Some substitutions are rhymes: bore, snore, tour (well, in some dialects), Gore, soar, whore, more, pour, war, etc.

But the first few pages of verb-substitutions from a Google search for {"I am woman hear me" -roar} include shop, rock, bitch and moan, walk, despair, set off the airport security detector, kick ass, blubber, meme, campaign, run, ramble, expound, meow, click, whine, moo, stab, whimper, game, laugh, blend, sing, caulk, scream, draw, rhyme, blog, sing torch & twang, purr, shoot -- and on through some 28,000 other pages.

Looking at the other obvious substitution -- "I am X, hear me roar" -- we find X = Peter, worm, GeekGirl, Justin Bonomo, Boobalicious, Kittenwar, protoplasm, Naturezilla, Superwoman, Catwoman, Hobbes, geek, blogger, Monki, Lizmonster, Hellionexciter, lion, mommy, milquetoast, Corolla, Gibbon -- and on through more than 100,000 other pages in the obvious Google search.

And of course there are plenty of double substitutions, like "I am Hannah, hear me croak", "I am Puppy, Hear Me Yap", "I am cow, hear me moo", "I am geek, hear me beep", and so on, further than I'm willing to read.

Posted by Mark Liberman at 07:13 AM

The Word "Islamic Terrorism"

In last Sunday's Republican presidential debate Rudy Giuliani criticized the Democratic candidates for the failure of any of them in the course of four debates to utter "the word 'Islamic terrorism'". There's a very good reason for this: "Islamic terrorism" is not a word; it's a phrase, specifically a Noun Phrase. Opinions vary as to how the Republicans are doing in the War on Terror, but they're definitely losing in the War on Incorrect Grammatical Terminology.

Posted by Bill Poser at 03:44 AM

August 09, 2007

Echoes from the dance of the elephants

A few days ago, I learned that I'm co-author of a chapter in a book whose existence I had previously not suspected, and that as a result, a medium-sized European publishing conglomerate has paid a not-entirely-trivial sum of money to a much larger European publishing conglomerate. This makes me feel, in a small way, like an athlete who learns that he has been traded from one team to another. Except that I don't have to move.

Here's the relevant piece of the email:

Dear Steven Bird and Mark Liberman,

I am delighted to inform you that the 6-volume collection "Corpus Linguistics: Critical Concepts in Linguistics" edited by Wolfgang Teubert and Ramesh Krishnamurthy was published by Routledge in June this year. The publication contains the following article of yours: Steven Bird and Mark Liberman, ‘A formal framework for linguistic annotation’, Speech Communication, 33, 1-2, 2001, pp. 23-60.

In many cases, Routledge contacted only the previous publishers of the article with regard to copyright permission, and the editors are therefore aware that the authors did not receive any royalties from this. In this connection, the editors would just like to inform you that the original publisher of your article [i.e. Elsevier] received 646 pounds for the article from Routledge.

I guess it's fair enough that Taylor & Francis (who own Routledge) paid Elsevier (who own Speech Communication) rather than paying us, since Elsevier never paid us anything either, following the normal political economy of the scientific publishing industry: academics do the research, writing and editing, funded by universities and government research grants, and the publishing conglomerates hold the copyrights and collect the money. (That's in order to reward creativity and to ensure that the authors will remain motivated to dream up new things in the future, you understand.) To add to the fun, this arrived a few days later:

You will be happy to hear that we have managed to persuade Routledge to offer a discount of 40% to the authors of the articles in the 6-volume collection "Corpus Linguistics: Critical Concepts in Linguistics" edited by Wolfgang Teubert and Ramesh Krishnamurthy. You will need to order the collection through Routledge customer services team. The person to contact is Kerry Tobin at XXX@XXX. You need to state that you were a contributor to The Corpus Linguistics: Critical Concepts set, and mention Simon Alexander (Senior Development Editor, Major Works) as having authorised the discount.

I believe this means that instead of paying $1,395 for this work, we're entitled to buy it for a mere $837. (Unless, of course, this is an odds ratio type of discount, in which case the price would be $1304.33 :-)... Before considering the offer further, though, I'll need to find out about taxes and shipping, since I can buy the same work from Amazon for $878.85 with free 2nd-day delivery.
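
The straightforward reading of the discount can be checked in a few lines. This is only a sketch using the three prices quoted above; the variable names are invented for illustration, and taxes and shipping are left as the unknowns they are.

```python
from decimal import Decimal

# Figures quoted in the post.
list_price = Decimal("1395.00")      # Routledge list price for the set
author_discount = Decimal("0.40")    # 40% contributor discount
amazon_price = Decimal("878.85")     # Amazon price, free 2nd-day delivery


def discounted(price, discount):
    """Price after taking `discount` (a fraction of 1) off `price`."""
    return (price * (Decimal("1") - discount)).quantize(Decimal("0.01"))


routledge_price = discounted(list_price, author_discount)

# The Routledge offer only beats Amazon if taxes plus shipping
# come to less than this amount.
max_extra_charges = amazon_price - routledge_price

print(f"Discounted price: ${routledge_price}")
print(f"Room for taxes and shipping: ${max_extra_charges}")
```

So the author discount wins only if Routledge's taxes and shipping together stay under the gap between the two prices.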

Meanwhile, if you're interested in our article, you can read the 1999 technical report version for free (from various sites linked) here, and (depending on your subscription status) the 2001 Speech Communication version here, or (without let or hindrance) here.

This sort of publishing has become a strange ceremonial dance among business conglomerates, the libraries of research universities, and the governments who pay the library costs. It plays almost no role at all in actual scientific and scholarly communication, at least in the fields that I work in.

I doubt that more than a handful of individuals will buy this new Routledge six-volume collection -- if indeed there are any sales to individuals at all. Routledge is counting on the usual suspects, a couple of hundred research libraries worldwide, who feel compelled to buy such publications on behalf of their users. Thus the economic calculation for a publisher like Routledge must be, roughly, "it'll cost us about $50K to print and ship 300 copies of these sets, and we'll collect about $300K, of which we'll have to pay $50K in royalties to other publishers, leaving $200K net". My guess is that in this business, the break-even point is at around 100 copies; 300 copies is a success; and 500 copies is cause to break out the champagne. (All my numbers are invented, but I don't think they're off by more than a factor of two -- if you've got better estimates, tell me about them.)
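
Here is a minimal sketch of that back-of-the-envelope calculation, using only the invented figures quoted above. It makes two assumptions that are not in the text: that printing and shipping scale per copy, and that the $50K in royalties is a fixed cost paid regardless of sales. Any editorial or overhead costs are left out, which is probably why the break-even point it produces comes out lower than the guess of around 100 copies.

```python
import math
from fractions import Fraction

# Invented figures from the paragraph above.
COPIES_PLANNED = 300
REVENUE = 300_000          # collected for 300 sets
PRINT_AND_SHIP = 50_000    # for 300 sets
ROYALTIES = 50_000         # paid to other publishers

# Assumption: revenue and print/ship costs scale linearly per set,
# while royalties are a fixed cost.
price_per_set = Fraction(REVENUE, COPIES_PLANNED)
cost_per_set = Fraction(PRINT_AND_SHIP, COPIES_PLANNED)
margin_per_set = price_per_set - cost_per_set


def net(copies):
    """Net income in dollars for a given number of sets sold."""
    return copies * margin_per_set - ROYALTIES


def break_even_copies():
    """Smallest number of sets at which net income is non-negative."""
    return math.ceil(ROYALTIES / margin_per_set)


for copies in (100, 300, 500):
    print(f"{copies} sets: net ${float(net(copies)):,.0f}")
print(f"Break-even under these assumptions: {break_even_copies()} sets")
```

At 300 sets this reproduces the $200K net in the quoted calculation; the other rows simply extend the same assumptions.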

But the sad thing about all this is that hardly anyone will ever read the printed volumes, which will sit gathering dust on the shelves until eventually they are shipped to off-site storage. That's not because no one is paying attention to the contents -- I can't answer for the rest of this particular publication, but the Bird & Liberman article that they're reprinting has been widely read and cited, and we continue to get interesting feedback from people who are reading it and thinking about it and trying to improve the ideas in it. However, I can't remember the last time that I went to the library to find the text of a recent article or book chapter in paper form -- everything that matters is available online. (That's not true, I hasten to add, for older works, for which I often do rely on the library; and I'm still a big fan of physical books, as a glance around my offices and living quarters will reveal.)

The libraries who buy these publications are mostly, in the end, funded by taxpayers. Certainly in the U.S., the budgets of university research libraries form part of the overhead that universities charge on government research grants (which of course also pay for much if not most of the research whose results are published or reprinted in these volumes). In general, research libraries are wonderful institutions, more than worth what they cost; but the process that we're talking about is driving their costs way up, with little benefit to anyone except the publishing conglomerates.

[Update -- as of 8:30 a.m., I've already gotten a few emails from people who have managed to remain unaware of the Open Access phenomenon. If you're one of them, start here.]

Posted by Mark Liberman at 06:49 AM

August 08, 2007

It's a time sex thing, baby

From Victor Mair (and originally from here), another adventure in Chinese-English translation.

This label shows a disposable coffee cup and a bilingual legend whose English half is "A TIME SEX THING". But it's not from the cover of a racy new novel about coffee-break quickies among over-scheduled young Hong Kong investment bankers. Nor is it from the latest CD by the Shanghai rockers Assembly Line Love Machine. It's not even the lead-in to a shocking tabloid exposé of caffeine-fueled Olympic stopwatch-fetishism in the Beijing elite. No, it's a word-by-word mistranslation into English, apparently without ironic intent, of the distinctly un-erotic Chinese phrase:

一次性 用品   =  YI1CI4XING4 YONG4PIN3

Victor breaks it down for us:

The individual syllables of this phrase signify:

YI1 "one"
CI4 "time; occasion" (as well as many other meanings that are not relevant here)
XING4 "nature, character, quality; sex"
YONG4 "use"
PIN3 "article, product" (as well as many other meanings that are not relevant here).

Putting it together, YI1CI4XING4 is an adjective modifying the noun YONG4PIN3.

YONG4PIN3 means "article for (daily) use," which comes out in the illustrated Chinglish phrase as "thing" (not too bad).

YI1CI4XING4 is much harder, because it is a fairly recent neologism, and not many people are familiar with it, although I've been hearing it occasionally for the last 15 years or so. The translator chose to render each syllable of the trisyllabic adjective YI1CI4XING4 individually: "a time sex" (he / she was thinking "one" = "a"). In fact, YI1CI4XING4 literally means "having the nature of a single use", which is to say, "for single use", or more simply, "disposable".

The correct literal translation of the Chinese phrase should be something like "daily use article for single use." More loosely, one might say simply "disposable cup."

What may perhaps be happening here is that the "translator" (if we can use that term) looked up each morpheme in a Chinese-English dictionary, and picked the shortest and plainest alternative in each case. This is often a good thing to do when you're choosing words in a language that you know, but it can lead a translator straight into disaster.

Then again, perhaps this has turned out to be an unexpected bonanza for the cup manufacturer. And I do think that there's a good rock song in there someplace.

[Tom McGrenery writes:

“What may perhaps be happening here is that the "translator" (if we can use that term) looked up each morpheme in a Chinese-English dictionary, and picked the shortest and plainest alternative in each case.”

I don’t think there’s really any doubt about this. When I worked at a bilingual magazine in Beijing, we used to refer to these as “dictionaryisms” – the phenomenon common to language students (we’ve all done it) wherein you pick either the first translation given in the dictionary or, even worse, ignore the first definition given and plump for the second or third along instead in an attempt to be clever, without knowing properly how to use the words in the other language.

A related tendency is that of memorizing only one given translation for a word and using it in all contexts, which is why you get Chinese English-speakers calling a receptionist a “waiter” (服务员


[Update -- here's a picture of the sign's original context, labelling the disposable-cup aisle at Century Mart:

A Wharton graduate who took Classical Chinese at Penn with Victor, and has worked in China for many years, has this to say about Century Mart:

CenturyMart is part of the recently created gigantic Shanghai retail group 百联. The two major Shanghai supermarket chains are 联华 and 华联. These traditionally are small by today's standards, about 3,000 - 5,000 square meters. They are still all over Shanghai and throughout China. As part of government funny business, these two were joined in a colossus to compete with foreign retailers, though they are still run mostly separately. Each of the two also has convenience store brands, as well as hypermarket brands, 9,000 - 10,000 square meters, like Carrefour, Wal-Mart and my old company's The Home World. Lianhua's hypermarket brand is 世纪联华, while Hualian's is 吉买盛. The organization of the company is a tortured mess of government involvement, cross ownership, and parts listed on the Shanghai and HK stock exchanges.


Posted by Mark Liberman at 07:17 PM


In this morning's mail:

My wife is unable to answer the following questions (this despite her having been awarded a Ph.D in linguistics by your esteemed institution of higher learning); so she suggested I put them to you:

So we're sitting around talking to my mother, a native of Chelsea, MA, and she produces (roughly) the following utterance (during a discussion of efforts in her hometown of Lexington, Massachusetts to reduce class size in local public schools): "Back in my day, public school had classes with thirty to forty students. Howmsoever, the boys wore suits and ties, and were much better behaved [than today's students]." I was startled by her use of "howmsoever," which clearly could have been replaced by "however" -- although I'm fairly certain I've heard her use it before.

I googled "howmsoever" and couldn't come up with a dictionary definition. Indeed, there appear to be no dictionaries that recognize it. But I did find something like 111 written uses of the word, spelled as I've spelled it, and used in nearly all cases in places where "however" could have been used.

The OED has an entry for "howsomever" as, essentially, an archaic version of "however." So I -- by which I mean, "my wife" -- imagine[s] that via metathesis this could have become "howmsoever." But why would this process be occurring today with a word that has largely been out of circulation for a few hundred years? Where is my mother getting this? She, of course, has no idea. I don't recall hearing this from any other member of my family, including my mother's parents. And what about the 111 other folks who are using it, and spelling it, this way?

I know you have better things to do. But, help us please, we're baffled.

In addition to 116 hits for {howmsoever}, I also get 155 for {howmever}, six for {howemsoever}, three for {how'msoever}, and two for {howmsomever}.

Based on this array of results, I speculate wildly that some people have interpreted the -m- of whomever as a marker of formal style (or of free-choice uncertainty, or whatever), and mapped it by analogy onto however, howsoever, howsomever, etc.

All that I can add to this speculation is a literary precedent, from p. 21-22 of Patrick O'Brian's novel The Truelove (Clarissa Oakes in the UK). Jack Aubrey is inspecting his crew, who are mustered at divisions, when the scene shifts to the sick bay (emphasis added):

On to West -- poor noseless West, a victim of the biting frost far south of the Horn -- and his division, the waisters; and as Jack inspected them, so down in the sick-berth one of their number, an elderly seaman named Owen, absent from divisions because of illness, said 'And there I was on Easter Island, gentlemen, with the Proby clawing off the lee shore and me roaring and bawling to my shipmates not to desert me. But they were a hard-hearted set of buggers, and once they had scraped past the headland they put before the wind -- never started a sheet until they crossed the Line, I swear. And did it profit them at all, gentlemen? No, sirs, it did not; for they was all murdered and scalped by Peechokee's people north of Nootka Sound, and their ship was burnt for the iron.'

'How did the Easter Islanders use you?' asked Stephen.

'Oh, pretty well, sir, on the whole; they are not an ill-natured crew, though much given to thieving: and I must admit they ate one another more than was quite right. I am not over-particular, but it makes you uneasy to be passed a man's hand. A slice of what might be anything, I don't say no, when sharp-set; but a hand fair turns your stomach. Howmsoever, we got along well enough. I spoke their language, after a fashion ...'

'How did that come about?' asked Martin.

'Why, sir, it is like the language they speak in Otaheite and other islands, only not so genteel; like the Scotch.'

'You are familiar with the Polynesian, I collect?' asked Stephen.

'Anan, sir?'

'The South Sea language.'

'Bless you, sir, I have been in the Society Islands this many a time; and sailing on the fur-trade so long, to north-west America, when we used to stretch across to the Sandwiches in the winter when the trading was over, I grew quite used to their way of it too. Much the same in New Zealand.'

'Anyone can speak South Seas,' said Philips, the next patient on the starboard side. 'I can speak South Seas. So can Brenton and Scroby and Old Chucks -- anyone that has been in a South Seas whaler.'

O'Brian was linguistically scrupulous, in general (if occasionally prone to invention with respect to culinary obscurities and the details of his own biography), so I think we can take this as evidence that howmsoever is an attested dialect form. I believe that I heard it from time to time where I grew up, in rural Connecticut, though I wouldn't swear to this.

As for how my correspondent's mother came by this form, given the hypothesis that her parents didn't use it, there are several possibilities: she might have gotten it from her peers, or from a babysitter; or she might have re-invented it on her own. Unless she is prone to picking up dialect forms later in life from reading, we can absolve Patrick O'Brian, since The Truelove/Clarissa Oakes was not published until 1992.

[Update -- Aidan Kehoe suggests another etymology:

Looks like a reanalysis of 'how and soever' to me, but then I grew up with Irish English (where, with the exception of one Scot, all the Google results for that come from), and so the reanalysis may have been a historical one from a pre-existing 'howmsoever.'


[Update #2 -- Alaina Sloo writes:

I was curious, so I checked the Dictionary of American Regional English. No howmsoever, but it lists both howsoever and howsomever, and says that howsoever is archaic but howsomever (also howsomebeever, howsom(e)dever, howsum(dever)) is in current usage, "scattered, but chiefly S and C Atl, S Midl." The definition includes examples from 1795 through 1984.


[Update #3 -- Josh Millard writes:

I didn't expect to find much, and got what I expected: a single google hit for 'wheremsoever'.

'The content problem will be *much* harder to solve. Find the best HTML page person you have available (from a pretty pages *and* a usability standpoint), let them re-make your home page (which I *hope* lives at, wheremsoever else it might live, too), and then let the other departments "discover" how much prettier looking it is, and say "hey; can we borrow her?" :-)'

Doesn't look like a coy language joke, and it's a pretty implausible miskey (in QWERTY at least).


Mm, yes. If Language Log had a merchandising department, it would now commission a bumper sticker reading "Another family for free-choice -m-".]

Posted by Mark Liberman at 08:13 AM

[ɬ] on CTV

The CTV news tonight had a segment on the activities in China of a group of Tibetan freedom activists, including Executive Director of Students for a Free Tibet Lhadon Tethong, who is blogging from Beijing. To my amazement, the reporter pronounced her name, correctly, with an initial voiceless lateral fricative [ɬ].

Posted by Bill Poser at 03:22 AM

Pails and flounders

Ever since I got into the eggcorn business, people have been nominating errors as eggcorns, or asking if some error is an eggcorn.  The American Dialect Society mailing list has a thread headed "Eggcorn?" every so often, and I get lots of mail with that header.  Some of these errors are already in the eggcorn database, some are lovely new finds, but others don't seem to me to be eggcorns, for one reason or another.  The latest chapter began on August 3, with an "Eggcorn?" posting from Wilson Gray asking about the following, from MacUpdates comments:

If the author cleans up that one glitch, then I'll make a b-line for his app.

Larry Horn and I saw no semantic motivation for b-line and suggested that it was probably as opaque for the writer as bee-line would have been.  And I launched into yet another discussion of things that aren't eggcorns but resemble them in some respects.  Here's a somewhat spiffed-up version of what I said.

1.  Pails.  B-line exemplifies a fairly common error type, involving a part X of an expression that can be parsed out but can't be easily assigned a meaning: in [bi]-line, line is a recognizable element, but what is [bi]?  The name of the letter B?  The verb be?  The noun bee?  The proper name Bea?  Something unique to this expression (a "cran morph")?

[Before you complain, let me explain that the technical term "cran morph" (from the cran part of cranberry) was coined well before the world was faced with cranapple juice and similar products.]

If you can think of an item pronounced like X, or something close to it, that would seem to contribute some sense to the whole expression, then interpreting the expression as containing that item and spelling the expression accordingly produces an eggcorn (or, of course, gets you the right analysis and spelling).

On the other hand, if you're stumped about the identity of X -- that is, if the larger expression seems irretrievably idiomatic to you -- you can just pick some existing item Y pronounced like X, ideally one of the right sort of category to fit where X occurs (so, for [bi]-line, a noun); you'll probably be biased towards picking a frequent word, or one with a short spelling, or maybe you'll pick one at random.  The result is a type of error I'm now calling a PAIL, after the (very common) spelling "beyond the pail", where the baffling noun pronounced [pel] is taken to represent the everyday noun pail; yes, it doesn't make sense, but then idioms are like that.

(I'm paraphrasing my son-in-law Paul Armstrong here, who used the "pail" spelling in his blog a while back, was astounded to discover that the spelling was supposed to be "pale", and even more astounded to read about the history of the expression.  His actual words: "For better or worse, it's an idiom I picked up and I use it as a whole.  I don't know where I picked up the pail spelling but I considered it an idiom and thus seemingly odd spellings or disjoint meanings are not beyond reason...")

In any case, I take b-line to be a pail rather than an eggcorn.

(Let me stress, as I have before in similar situations, that the stories I told above about eggcorns and pails are stories about the genesis of these errors.  Once the incorrect interpretations and spellings are out there, other people pick them up.  For these later users, these interpretations and spellings are just the way things are, and they either make some sense, in the case of the eggcorns, or they're merely odd idioms, in the case of the pails.)

2. Flounders.  More things that aren't eggcorns.  Back in May, Michael Quinion and I had an exchange about


(an example contributed to Michael's World Wide Words newsletter), with advent for event.  I said at that time (May 24):

I'm inclined to see it as a simple confusion of phonologically and semantically similar words, like flaunt/flout, militate/mitigate, flounder/founder, etc.  (Incidentally, it would be nice to have a technical term for these confusions.  Let me suggest FLOUNDERS.)

[In fact, Geoff Pullum has just posted about the flounder flaunt/flout.  (Try saying "the flounder flaunt/flout" three times fast.)]

Flounders are the counterpart of ordinary classical malapropisms ("ordinary" here means: not of the eggcorn subtype).  In both flounders and -- let me continue this frenzy of naming with yet another term -- PINEAPPLES ("He is the very pineapple [pinnacle] of politeness", from Mrs. Malaprop herself), an incorrect word E is substituted for a phonologically similar word T, but in flounders, the error word E and the target word T also overlap semantically, while in most pineapples E and T are semantically distant (if E is an existing word at all).  Obviously, there's some room here for borderline cases.

Flounders and pineapples as a set (FLOUNDAPPLES?) are distinguished from pails and eggcorns as a set (PAILCORNS?) in that the former involve confusions of wholes, while the latter involve confusions of parts of (at least partially) fixed expressions.

3. Four types.  For those of you who like squares of oppositions, the story so far can be summarized as:

Flounders  --  Pineapples
    |              |
Pails      --  Eggcorns

All four types involve relationships between meaningful elements of some sort, a characteristic that distinguishes them from simple spelling errors like "loose" for lose or "there" for their.  Though writers are often exhorted not to "confuse" expressions like its and it's, there's no confusion going on in such errors: the identities of the expressions involved -- that is, the pairings of pronunciation and meaning that they represent -- are perfectly clear to the writers; their problem is the link between the expressions and their spellings.  The four types above (usually) can be detected through what look like non-standard spellings, but they aren't orthographic errors at root.

It would be nice to have a cute term that picks out these four types as a set, and distinguishes them from simple spelling errors and also from aberrant pronunciations, like "REtart" for "REtard", and aberrant meanings, like ritzy taken to mean 'cheap, trashy' (and some other things to come).  But my clever-terminology machine is worn out for today, so I'll have to resort to something more technical: EXPRESSION-SUBSTITUTION ERRORS, because they involve one form-meaning pairing substituting for another.

4. Esculators.  Among the other things that aren't eggcorns are reanalyses motivated not by semantic considerations but by morphological (or morphophonological) considerations, reanalyses that I've treated in several earlier postings.  Some representative examples:

nucular, perculate, esculate, simular, jubulant, nuptuals
doctorial, pastorial, pectorials, similiar
intravenious, mischievious, grievious, heinious
overature, aperature, fixature, mixature, strucature

To which I can add some morphological re-shapings involving -edly vs. -ably that have excited some discussion on ADS-L over the last few years:

supposably, assumably, reputably [-ably for -edly]
presumedly [-edly for -ably]

These ESCULATORS don't fit into the scheme above.

[Added 8/9/07: And then there are the eggcornesque misquotations (like "trill/toll the ancient Yuletide carol") that I reported on in "Cousin of Eggcorn".]

5. Thinkos vs. typos.  Another reminder that I've issued several times: all this is about "advertent" errors, or what Geoff Nunberg (in Going Nucular) has called "thinkos" (vs. "typos", if we extend that term to include all sorts of inadvertent errors, including Fay/Cutler malapropisms, word retrieval errors based on semantics, inadvertent blends, telescopings, transpositions, omissions, perseverations, anticipations, and more).  When people make advertent errors, they're saying or writing what they intend to; the problem is that what they do isn't in line with what most other people do.  Thinkos are "false knowledge".

A further reminder: the same production can represent different things in different contexts.   One person's general practice can be another person's momentary lapse; "beyond the pail", for instance, can occur as a typo as well as by intention.  In fact, on different occasions the same production could be (at least) an inadvertent error, an advertent error, a dialect form, or a deliberate creation.  In giving examples above, I wasn't claiming that every occurrence of the examples should be categorized the way I categorized them -- only that an appreciable number of them can be.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:48 AM

August 07, 2007

Fog gets in your eyes

Yesterday's Google weather report said:

Fog, 83 degrees, winds west at 13 mph, humidity 16%.

Fog? When I looked outside, for the life of me I couldn't see any fog. That black stuff in the air is smoke from the huge forest fires about 30 miles from here, dangerously close to Language Logger Sally Thomason's summer cabin home.

The reporter's choice of "fog" to report "smoke" made me wonder about the inventory of weather reporting terms. There are some standard ones, like "snow," "sleet," "rain," and "fog." But "fog" doesn't come close to describing the smoke that we have. It's so thick I can't see the nearby mountains from my windows (which I have to keep shut to make breathing easier). It's not "smog" either, because that would require some mixture of fog, but there isn't any sign of that. And "haze" won't quite do either, because this word is far too tame to describe the black stuff overhead around here. A quick check of The Sunday New York Times weather section told me this about "hazy": "Hazy skies are typical of hot, humid weather." Not much humidity here. Never is. Just smoke.

Okay, I think I know what you're going to say. The American Heritage Dictionary's second sense of "fog" reads: "A mass of floating material, such as dust or smoke, that forms an obscuring haze." Merriam Webster's Collegiate doesn't mention smoke in its definition of fog. Perhaps the Random House College Dictionary comes a tad closer, defining "fog" as "a darkened state of the atmosphere," but it doesn't say anything about what darkens it, including smoke from forest fires.

It occurs to me that maybe weather reports don't  mention smoke because smoke really isn't weather. But it still seems odd to call this stuff "fog." I'm going to have to check this out with Foggy the Bear.

Posted by Roger Shuy at 09:41 AM

No thanks, I had a late breakfast

This is surprising, given the cross-cultural sensitivity documented here.

[From Victor Mair]

Posted by Mark Liberman at 08:10 AM

Hand-waving in the Washington Post

According to Rick Weiss, "Gestures Convey Message: Learning in Progress", Washington Post 8/6/2007,

Teachers who use gestures as they explain a concept ... are more successful at getting their ideas across, research has shown. And students who spontaneously gesture as they work through new ideas tend to remember them longer than those who do not move their hands.

Weiss tells us that some new research by Susan Wagner Cook and others shows

... that even abstract gestures can enhance learning. In a classroom, she had some students mimic her sweeping hand motions to emphasize that both sides of an equation must be equal. Other students were simply told to repeat her words: "I want to make one side . . . equal to the other side."

A third group was told to mimic both her movements and words.

Weeks later, the students were quizzed. Those in the two groups that were taught the gestures were three times as likely to solve the equations correctly than were those who had learned only the verbal instructions, she and two colleagues reported in the July 25 issue of the journal Cognition. [emphasis added]

OK, time for a quiz. (Re-read the passage above, and feel free to gesture or not, as you like.)

In a post-test after the instruction, the students in the "Speech" group got an average of 2.0 out of six equations right. How many equations, on average, did the students in the "Gesture" group get right? (Justify your answer.)

If you've been paying attention, your answer will be something like "well, Weiss says that 'those in the two groups that were taught the gestures were three times as likely to solve the equations correctly than those who had learned only the verbal instructions'. And three times 2.0 is 6.0, so the kids in the Gesture group must  have gotten pretty much all the equations right."

Bzzzt! No, sorry, the answer is 2.6 out of six, as you can learn by reading the research report, which is Susan Wagner Cook, Zachary Mitchell, Susan Goldin-Meadow, "Gesturing makes learning last", Cognition, in press.

How likely was this difference to have occurred by chance? Well, you can't tell just from the mean values, but according to the research report, the difference was evaluated as p=0.52, i.e. a difference in sample values of this size is likely to happen by chance about 50% of the time, when there is actually no difference in the underlying behavior being sampled.

What's going on? "Aha", you may be saying to yourself, "maybe it's odds ratios again".

No, that's not it (though it's something related).

The thing is, the effect that Weiss is talking about didn't show up during or immediately after the controlled teaching experiences. Instead, it showed up in a delayed test, four weeks later.

OK, so what was the difference in performance between the Speech and Gesture groups then?

Unfortunately, I don't know. The research report doesn't provide this useful piece of information.

Instead, the experimenters reason as follows:

If children retained the knowledge learned during the math lesson, we should be able to predict their performance on the follow-up test 4 weeks after instruction from their performance on the posttest immediately following instruction. We used a regression model to predict follow-up test performance, using posttest performance and condition (Speech, Gesture, Gesture + Speech) as factors. As is evident in Fig. 2, regression coefficients differed across the three groups (F(2, 78) = 5.79, p = .0045). The unique predictive power was significantly greater for the Gesture (β = .80, t(28) = 7.94, p < .0001) and Gesture + Speech (β = .92, t(23) = 12.00, p < .0001) groups than for the Speech group (β = .33, t(27) = 1.89, p = .069); t(78) = 2.64, p < .01 and t(78) = 3.22, p < .01, respectively.

OK, fair enough. I'm convinced that the use of hand gestures in this experiment improved the correlation between post-test performance and follow-up test performance.  (Weiss's "three times" appears actually to refer to this difference in test-performance correlation, not to any difference in performance.)

The fact that use of hand gestures increased the correlation between post-test scores and follow-up test scores is interesting and meaningful. But I'd rather know how much the use of hand gestures actually improved follow-up test performance, and I'm puzzled that the authors didn't tell us that. Could it be because their experimental manipulation didn't produce a statistically significant difference in the follow-up scores either?

I think this might be true. While the paper doesn't provide the raw test scores (though a small table would have sufficed to do this), there is a scatterplot showing the follow-up scores vs. the post-test scores for each of the three conditions:

Fig. 2. The top panel displays regression lines relating performance on the 4-week follow-up to performance on the immediate posttest. The bottom three panels display scatterplots of the number of problems solved correctly on the follow-up test and the posttest by condition (Gesture + Speech, Gesture, Speech). The size of the dot at each point represents the number of children who fell at that point.

Based on measuring the dots in these plots, I deduce the following table of follow-up scores for the Gesture and Speech conditions (where as in the plot, the number in each cell represents the number of students with each score):

Score      0   1   2   3   4   5   6
Gesture   10   0   0   7   4   5   6
Speech    12   2   0   4   0   6   6

This doesn't quite tally with the other information in the report, which says that there were 30 (rather than 32) students in the Gesture group, and 29 (rather than 30) in the Speech group; but I don't see any other way to get numbers from the graph. (Maybe there is some sort of round-off error resulting from taking proportions and scaling back to integer circle sizes?)

The average follow-up score of the Gesture group in this reconstruction is slightly higher than the average follow-up score of the Speech group -- 3.1 correct out of 6 vs. 2.7 correct out of 6 -- but there's no way the difference is statistically significant. (The p value in a t test is about 0.53 -- but if you think I've got this reconstruction wrong, and especially if you know the actual numbers from this study, please let me know.)
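For anyone who wants to check the arithmetic, here is a minimal Python sketch that rebuilds the individual scores from the reconstructed table above and computes the group means and a two-sample t statistic. The counts are my reading of the dots, not the study's data; the choice of Welch's version of the t test and the normal approximation for the p-value are assumptions made to keep the sketch dependency-free.

```python
import math

# Follow-up score counts as read off the scatterplot (a reconstruction,
# not the study's actual data): index = number of problems correct (0..6).
gesture_counts = [10, 0, 0, 7, 4, 5, 6]
speech_counts = [12, 2, 0, 4, 0, 6, 6]

def expand(counts):
    """Turn a histogram of counts into a flat list of individual scores."""
    return [score for score, n in enumerate(counts) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    se = math.sqrt(sample_var(a) / len(a) + sample_var(b) / len(b))
    return (mean(a) - mean(b)) / se

gesture = expand(gesture_counts)
speech = expand(speech_counts)
t = welch_t(gesture, speech)

# Assumption: with roughly 60 students the t distribution is close to
# normal, so the two-sided p-value is approximated by the normal tail.
p_approx = math.erfc(abs(t) / math.sqrt(2))

print("group sizes:", len(gesture), len(speech))
print("means:", round(mean(gesture), 1), round(mean(speech), 1))
print("t:", round(t, 2), "approximate p:", round(p_approx, 2))
```

Under these assumptions the t statistic is well below 1 and the approximate p-value lands a little above 0.5, in line with the figure quoted above.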

Looking at the scatterplots and the regression lines, you can see what the researchers are getting at. The difference in teaching methods is having some effect, and their comparison of regression coefficients seems to be a plausible way to get at it.

But there's no warrant whatever in the paper for Weiss's interpretation that "Those in the two groups that were taught the gestures were three times as likely to solve the equations correctly than were those who had learned only the verbal instructions". In fact, depending on which test you look at, the difference seems to have been something like 15% to 30% (and not statistically reliable), not 300%.
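The "15% to 30%" range is just the ratio of the group means on each test. A small Python sketch of that arithmetic follows, using the means quoted in this post; the follow-up means come from my scatterplot reconstruction, so treat them as estimates rather than reported results.

```python
# Mean scores (out of six) quoted in the post. The follow-up means come from
# the scatterplot reconstruction, so they are estimates, not reported data.
post_test = {"speech": 2.0, "gesture": 2.6}
follow_up = {"speech": 2.7, "gesture": 3.1}

def relative_gain(scores):
    """Percentage by which the Gesture mean exceeds the Speech mean."""
    return (scores["gesture"] / scores["speech"] - 1) * 100

post_gain = relative_gain(post_test)
follow_gain = relative_gain(follow_up)

# What a literal "three times as likely" would imply for the post-test
# Gesture mean, given the Speech mean of 2.0 out of six.
naive_gesture_mean = 3 * post_test["speech"]

print("post-test gain (%):", round(post_gain))
print("follow-up gain (%):", round(follow_gain))
print("naive 'three times' Gesture mean:", naive_gesture_mean)
```

The naive reading would put the Gesture group at essentially a perfect score, which is nowhere near either observed mean.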

Now, this misunderstanding is Weiss's fault. He should be able to read the paper and understand that it absolutely and categorically doesn't say what he says it says. If he wasn't sure, he should have asked one of the authors when he interviewed them.

But in this case, the authors bear some of the responsibility, for failing to note a very relevant and (apparently) negative result, and leaving out the simple table of raw scores that would have allowed readers to draw their own conclusions.

A table in the form that I gave above, providing all the numbers in the scatter plot, would have had just 3 tests by 3 conditions by 7 outcomes, or 63 cells. The full table of raw scores is just 84 subjects by three test scores = 252 numbers. It could easily have been provided as online supplementary materials, if not as a small-type table in the text itself. I'm surprised that the editors and reviewers at Cognition didn't insist on this, but psychology journals often provide regression or anova results, without giving the data they were based on, or even simple descriptive summary statistics. (I bear some personal responsibility here, since I once did a stint on the editorial board of Cognition, and I never tried to change their policies in this respect.)

So let me add another commandment to the decalogue of scientific rhetoric: Thou Shalt Report Raw Numbers. This is really directed at scientists, not journalists -- but you journalists should ask for the numbers, if you don't see them in the research report. Sometimes the raw numbers are so noisy you can't see the signal, and only sophisticated statistical analysis lets you see what's going on; but when that's true, you need to think about what it means for the functional significance of the effect.

I don't really expect my rhetorical decalogue to have much impact on the way people act, any more than the original one does. We're swimming upstream against the current of human nature, in a system where journalists (and for that matter scientists) are paid to get your attention. But it might have an effect on how people (including the scientists who review papers) interpret what they read.

[Update -- an informative response from Susan Wagner Cook:

I saw your comments on the recent press coverage of our Cognition paper: "Gesturing makes learning last" and thought that I could provide a bit more context. I agree wholeheartedly that the statement in the Washington Post article is a mischaracterization of our findings, and I appreciate this being brought to light. The statement should be qualified as follows: Of children who learned to solve the problems correctly during instructions, those in the two groups that were taught the gestures were three times as likely to solve the equations correctly three weeks later than were those who had learned only the verbal instructions.

With regards to the data analysis, your characterization of the data is right on target. We felt that including the scatterplots provided a descriptive picture of the data, however, perhaps quantifying the relations would have helped avoid misinterpretation. I am not sure what algorithm Excel uses to calculate the circle sizes in bubble plots, and whether it relates to radius, area, or diameter, but the actual data look like this:

Score            0   1   2   3   4   5   6   Mean
Speech          18   1   0   3   0   3   4   1.68
Gesture         15   0   0   4   1   4   6   2.4
Speech+Gesture  12   0   0   1   1   5   6   2.72

In an ANOVA with condition as a factor, these differences are not significant, F(2,81)=1.15, p=.32.

However, this analysis leaves out an important source of known variability on the follow-up, post-test performance, and accounting for this variability in the analysis improves statistical power by decreasing error variance. There should be a strong relation between post-test performance and follow-up, particularly if the experimental manipulation is affecting learning. Children who solve the problem correctly on the follow-up who did not solve the problem correctly during training or on the posttest are likely to have learned in the intervening period, while they are receiving mathematics instruction in their classrooms, not from our instructions.

The regression analysis reveals that children in the two conditions including gesture maintained the learning they showed during our instruction. They are performing comparably on the posttest and the follow-up - maintaining nearly all of the learning they showed during our instruction. In contrast, in the speech conditions, children's follow-up performance is not related to their performance during our instruction, suggesting that whatever correct performance is observed is not related to our instructional manipulation, but rather to factors in the intervening period. Indeed, if the speech manipulation were difficult even for some students who were ready to learn the concept, then we would even expect proportionally more students in this condition to learn in the intervening period, precisely because they could not learn from our instructions.

Thank you for raising some important issues and for your thoughtful attention to our paper!

I guess that Excel must be using relative area (which makes sense, on reflection), whereas I was measuring relative diameter. All the more reason to give a table of the data, which I'm thankful to see!
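With the actual counts in hand, the means and the ANOVA result quoted in the letter above can be rechecked. What follows is a minimal Python sketch of a textbook one-way ANOVA computed directly from the table; it is my own recomputation under that assumption, not the authors' analysis code, and it stops at the F statistic because the standard library has no F distribution for the p-value.

```python
# Follow-up score counts from Susan Wagner Cook's table:
# index = number of problems correct (0..6).
counts = {
    "Speech": [18, 1, 0, 3, 0, 3, 4],
    "Gesture": [15, 0, 0, 4, 1, 4, 6],
    "Speech+Gesture": [12, 0, 0, 1, 1, 5, 6],
}

def expand(hist):
    """Turn a histogram of counts into a flat list of individual scores."""
    return [score for score, n in enumerate(hist) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

groups = {name: expand(hist) for name, hist in counts.items()}
all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)

# One-way ANOVA: between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

for name, g in groups.items():
    print(name, len(g), round(mean(g), 2))
print("F(%d, %d) = %.2f" % (df_between, df_within, f_stat))
```

The degrees of freedom and the F value agree with the letter. The Speech mean computes to about 1.69, a rounding hair away from the 1.68 in the table.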

Some other relevant work: Lindsey E. Richland et al. "Cognitive Supports for Analogies in the Mathematics Classroom", Science 316(5828):1128-1129, May 2007, which suggests that using a scale balance to teach equations is a good idea -- it would be interesting to compare the value of a physical prop like a scale balance with the (less expensive but less concrete) balancing gestures used in this study. ]

Posted by Mark Liberman at 07:27 AM

August 06, 2007

Define this, you nitwits

Snopes, the foremost online repository of urban legends, reports that the following email is making the rounds:

I don't have a 1999 Random House unabridged handy, but I hope this one is true. Received via email:

Disgruntled Former Lexicographer

The following definition was discovered in the 1999 edition of the Random House dictionary. The crafting of this definition was the final assignment of Mr. Del Delhuey, who had been dismissed after 32 years with the company.

Mutton (mut'n), n. [Middle English, from Old French mouton, moton, from Medieval Latin multo, multon-, of Celtic origin.] 1.The flesh of fully grown sheep. 2. A glove with four fingers. 3. Two discharged muons. 4. Seven English tons. 5. One who mutinies. 6. To wear a dog. 7. A fastening device on a mshirt or mblouse. 8. Fuzzy underwear for ladies.

9. A bacteria-resistant amoeba with an attractive do. 10. To throw a boomerang weakly. 11. Any kind of lump. (slang) 12. A hundred mittens. 13. An earthling who has been taken over by an alien. 14. The smallest whole particle in the universe, so small you can hardly see it. 15. A big, nasty cut on the hand. 16. The rantings of a flibbertigibbet. 17. My wife never supported me. 18. It was as though I worked my whole life and it wasn't enough for her. 19. My children think I'm a nerd.

20. In architecture, a bad idea. 21. Define this, you nitwits. 22. To blubber one's finger over the lips while saying, 'bluh.' 23. I would like to take a trip to the seaside, where no one knows me. 24. I would like to be walking on the beach when a beautiful woman passes by. 25. She would stop me and ask me what I did for a living. 26. I would tell her I am a lexicographer. 27. She would say, "Oh, you wild boy." Exactly that, not one word different.

28. Then she would ask me to define our relationship, which at that point would be one minute old. I would demur. But she would say, "Oh please define this second for me right now." 29. I would look at her and say, "Mutton." 30. She would swoon. Because I would say it in a slight Spanish accent, at which I am very good. 31. I would take her hand and she would notice me feeling her wedding ring. I would ask her whom she is married to. She would say, "A big cheese at Random House."

32. I would take her to my motel room, and teach her the meaning of love. 33. I would use the American Heritage, out of spite, and read all the definitions. 34. Then I would read out of the Random House some of my favorites among those that I worked on: "the" (just try it); "blue" (give it a shot, and don't use the word 'nanometer'). 35. I would make love to her according to the O.E.D., sixth definition.

36. We would call room service and order tagliolini without looking it up. 37. I would return her to the beach, and we would say good-bye. 38. Gibberish in e-mail. 39. A reading lamp with a lousy fifteen-watt bulb, like they have in Europe. Also: a. muttonchops: slicing sheep meat with the face. b. muttsam: sheep floating in the sea. c. muttonheads: the Random House people.

As the Snopesters reveal, the above text is actually a humorous piece by Steve Martin, published in the "Shouts & Murmurs" section of the Oct. 11, 1999 New Yorker. It's part of a large genre of urban folklore common to the digital era, in which authored bits of fiction become decontextualized from their sources and get circulated electronically as if they were factual. Sometimes the fiction is intentionally misleading, as in the Belgian video about voices released from the grooves of ancient pottery (an April Fool's prank, as I described in a post last year). In many other cases, a piece of satire loses its satirical context, leading credulous readers to believe that they're enjoying a "strange but true" account of Microsoft error messages in haiku, a calamitous Bangkok piano recital, or Elizabeth Hurley's pubic hair extensions.

Steve Martin's piece works well in this genre, since it poses as the type of apocryphal story we've all heard about: a disgruntled employee leaves behind a concealed bit of sabotage, be it a rattle in a Cadillac or hidden phallic artwork. But Martin moves the sabotage narrative into finely wrought literary terrain, with the sad Mr. Delhuey joining the ranks of such thwarted dreamers as Walter Mitty and J. Alfred Prufrock. (Apparently, no amount of literary flourishes can keep some people from disabling their irony detectors, at least when they're reading their email.)

One line in particular struck the fancy of the writer Diane Ackerman, as she recounts in her book An Alchemy Of Mind:

When I got to definition 6 — "To wear a dog" — I began laughing spasmodically, and again at odd moments throughout the day, whenever the image of wearing a live squirming mutt infiltrated my thoughts. At dinner that evening, over curry, I tried sharing its humor with friends. Only one of three got it, and she started laughing crazily, too, while the others seemed mystified by our apparent stomach cramps and bad taste. On the basis of that single event, I conclude that humor is subjective.

I'm more fond of the later definitions, with the vividly imagined world of those harmless drudges who toil anonymously in the lexicographical trenches. Now that I've joined that world, I can report that most lexicographers aren't all that Prufrockian. Not that they don't dream of finding someone who swoons at the definition of mutton.

(By the way, the sixth definition of the noun love in the Oxford English Dictionary is "the animal instinct between the sexes, and its gratification." Mr. Martin clearly did his homework.)

Posted by Benjamin Zimmer at 06:04 PM

Taking down the hermeneutic lightning rod

In response to my earlier post about -30- as journalistic jargon for "the end" ("February 30", 8/5/2007), Mark Eli Kalderon noted that web search turns up at least 10 distinct stories about the origin of "this obscure piece of markup" -- the traditional closing time of telegraph offices was 3:00; wire service employees were allowed 30 telegrams a day; 30 ems was the maximum length of a linotype line; Judas was paid 30 pieces of silver; the Spartans appointed 30 tyrants to rule Athens; "XXX" was misinterpreted as roman numerals; etc., etc. -- and asked "Why is it a hermeneutic lightning rod?"

I don't know. But I believe that I now know the true history, thanks to email from Boyd D. Garrett Sr. and Charlie Clingen.

The note from Mr. Garrett arrived first:

When I read your post about the use of -30- to indicate the end of a news story, it rang a few Amateur Radio bells in the back of my head.

Back in 1859, Western Union established some standard numeric codes to be used for common telegraphic conventions (you can see the entire list here.) Telegraphy operators then and now have always sought ways to keep transmissions as brief as possible, since telegraphy is a relatively slow and highly manual mode of operation. A few of these codes are still in use today ("73" being the most common, used among Amateur Radio operators to say "best regards").

My speculation is that this telegraphy convention moved into the journalism arena with the advent of "wire stories" sent by telegraphers, so they naturally put the "30" at the end of a news article to clearly indicate the end.

I don't read Language Log enough, but when I do, I always find something that piques my interest, such as your post that generated this response. Keep up the good work!

Well, 73 right back atcha, Mr. Garrett!

Hard on its heels was a note from Mr. Clingen:

According to the ARRL (American Radio Relay League), it was also used by Western Union before the Civil war, and subsequently adopted by the amateur/ham radio folks, although by then the representation had morphed into "SK" in modern International Morse Code. Nowadays it is used as a "prosign", a brief code, embedded in the Morse code text message, and the meaning has remained essentially unchanged -- end of message.

[The ARRL Ham Radio History page says:]

Many of the expressions and procedure signals still in use in radiotelegraph had their origins in the early days of the landline telegraph--long before Marconi sent his letter "S" across the Atlantic.

In sending formal messages by c.w., the first thing a beginner hears is "don't send punctuation. Separate the parts of the address from each other with the prosign AA." This is ironic, because in the American Morse Code the sound didahdidah is a comma and was doubtless the origin of our prosign. Originally, a correctly addressed letter was punctuated with commas following the name and the street address, each of which was (and still is) on a separate line although the commas have been dropped, even in mail addresses on letters. The comma was transmitted by Morse operators and thus, AA came to mean that the receiving operator should "drop down one line" when sent after each part of the address and it is so defined in the operating manuals of the time.

Our familiar prosign SK also had its origin in landline Morse. In the Western Union company's "92 code" used even before the American Civil War, the number 30 meant "the end. No more." It also meant "good night." It so happens that in Landline Morse, 30 is sent didididahdit daaah, the zero being a long dash. Run the 30 together and it has the same sound as SK.

(Hey, a Morse code eggcorn!) While I haven't been able to verify the historical accuracy of the Signal Corps Association and American Radio Relay League pages, they certainly seem authentic. Maybe there was some earlier history to the choice of "30" to mean "no more -- the end", or maybe it was just a random numerical assignment. But the path from 1859 to the journalistic use documented in the 1895 Funk's Standard Dictionary appears to be plain.
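For readers who like to see the dits and dahs lined up, here is a minimal sketch in Python of the "run together" claim. The landline code for 3 and 0 is taken from the passage quoted above, and S and K are the standard International Morse letters. The "=" symbol for the long dash and the helper names are my own notation, and the sketch assumes that a listener hears the long dash as an ordinary dah.

```python
# "." = dit, "-" = dah, "=" = the extra-long dash of landline (American) Morse.
# Landline values follow the quoted passage: 3 is didididahdit, 0 is a long dash.
LANDLINE = {"3": "...-.", "0": "="}
INTERNATIONAL = {"S": "...", "K": "-.-"}


def run_together(chars, table):
    """Concatenate the code elements for chars with no gap between characters."""
    return "".join(table[c] for c in chars)


def by_ear(seq):
    """Assumption: the long dash is heard as an ordinary dah."""
    return seq.replace("=", "-")


thirty = by_ear(run_together("30", LANDLINE))
sk = run_together("SK", INTERNATIONAL)
print(thirty, sk, thirty == sk)
```

Under those assumptions the two sequences come out identical, which is all the passage claims. The sketch says nothing about timing, which is where a real operator's ear would do the work.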

The NYT Corrections: For the Record item said:

Although many no longer use it or even know what it means, some journalists continue to debate its origin. A popular theory is that it was a sign-off code developed by telegraph operators. Another tale is that reporters began signing their articles with “30” to demand a living wage of $30 per week.

Hermeneutics aside, it looks like historical truth in this case is on the side of telegraphy. Didididahdit daaah.

Posted by Mark Liberman at 04:39 PM

Software with a liberal imagination

Amazon has got my number:

Geoffrey Nunberg,
We've noticed that customers who have purchased or rated Sincerity and Authenticity (The Charles Eliot Norton Lectures) by Lionel Trilling have also purchased Historical Linguistics 2005: Selected Papers from the 17th International Conference on Historical Linguistics, Madison, Wisconsin, 31 July - 5 August 2005 (Current Issues in Linguistics Theory) by Joseph C. Salmons. . .

Posted by Geoff Nunberg at 02:04 PM

Flouting facts in The New Yorker

Here's something that surprised me quite a bit: a flagrant malapropism in The New Yorker (see it here, in an article about Avraham ("Avrum") Burg by David Remnick):

One subject that especially infuriated Shavit, and provoked countless letters to the editor, e-mail screeds, and editorial-page rebuttals, was Burg's depiction of the European Union as an almost irresistibly attractive "biblical utopia" and his flouting of the fact that he holds a French passport, because his wife is French-born, and voted in the recent French elections. When Shavit asked Burg if he recommended that all Israelis acquire a second passport, Burg replied, "Whoever can"—a moment of determined cosmopolitanism. Shavit sarcastically called Burg "the prophet of Brussels."

That use of flouting is a very clear case of a famous confusion. Remnick means flaunting.

Flouting something is treating it with contemptuous disregard (often it will be a rule or standard or convention). Flaunting something is displaying it ostentatiously or impudently (often something like wealth or privilege). Confusion between these words is very common (relative to their comparatively low overall frequency of occurrence); MWCDEU provides (as usual) an excellent article surveying the history.

I think (I have no quantitative backup) it is more usual for flaunt to be used where flout was meant, and I can see why there is confusion in that direction: you can boastfully exhibit your contempt for normal standards, and thus flaunt your flouting of them. Webster's actually gives "flout" as one of the meanings of flaunt, citing Louis Untermeyer as having talked about someone having "flaunted the rules", which is exactly the kind of use I am saying I can understand the motivation of.

But that is not relevant here, where flout has been used for flaunt. It is absolutely clear what Remnick means: Burg flaunts the fact of his French nationality and passport and electorally franchised status. He stresses it proudly, and recommends that other Israeli Jews should have a second nationality if they can. In no way does Burg exhibit contempt for his French status; the charge people make against him in Israel is quite the opposite — that he flaunts his Frenchness and flouts the notion of being an Israeli. Remnick simply dipped into his mental lexicon and came up with the wrong word, and there's no possible exculpatory argument (unless you take the view that the linguistic change train has left the station and these lexemes have now merged in educated Standard English, which I do not). And no one in the New Yorker editorial office spotted it. That's quite unusual, in this highly selective and very carefully edited magazine.

Posted by Geoffrey K. Pullum at 12:45 PM

Annals of spam

The most recent (8/6/07) New Yorker has an unsettling piece by Michael Specter on spam, in an "Annals of Technology" category: "Damn Spam: The losing war on junk e-mail" (pp. 36-41).  The title pretty much tells the story: it's an arms race, with both sides evolving, but the spammers seem to be winning.

Bayesian filters try to catch spam by looking at properties of previous spam: looking for the word Viagra, for instance.  Spammers respond by re-spelling the word, as, say, "\ / iagra"; this one looks transparent, but Specter reports (p. 40) that a blogger has estimated that there are over 600 quintillion ways to "spell" Viagra.
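To make the re-spelling trick concrete, here is a minimal sketch in Python of the kind of word-based scoring such a filter might do. Everything in it is an assumption of mine for illustration: the toy messages, the `spamminess` function, the add-one smoothing, and the equal weighting of spam and non-spam. It is not a description of any real filter or of anything in Specter's article.

```python
from collections import Counter

# Hypothetical training data: two known spam messages, two legitimate ones.
spam_msgs = ["buy viagra now", "cheap viagra online"]
ham_msgs = ["meeting notes attached", "buy lunch now"]


def token_counts(msgs):
    """Count whitespace-separated tokens across a list of messages."""
    return Counter(tok for m in msgs for tok in m.split())


spam_counts = token_counts(spam_msgs)
ham_counts = token_counts(ham_msgs)


def spamminess(token):
    """Estimate P(spam | token), with add-one smoothing and equal priors.

    A token never seen in training scores exactly 0.5, i.e. no evidence.
    """
    s = (spam_counts[token] + 1) / (len(spam_msgs) + 2)
    h = (ham_counts[token] + 1) / (len(ham_msgs) + 2)
    return s / (s + h)


print(spamminess("viagra"))                            # seen only in spam
print([spamminess(t) for t in r"\ / iagra".split()])   # re-spelled: all unseen
```

In this toy setup the plain spelling scores well above one half, while every piece of the re-spelled version is new to the filter and so carries no evidence either way. That is the whole point of the re-spelling.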

Another technique that's been around for a while is to bury a few instances of tell-tale words in bizarrely phrased, but comprehensible, text, and to toss in some ordinary text lifted from another source, to throw the Bayesians off the scent, as in these two gems from my mailbox last week (I've removed the "From" and "To" addresses, since these were probably hijacked or forged):

(1) From: ...
Subject: Find out the sex craving all guys have
Date: August 2, 2007 7:39:52 AM PDT
To: ...

Dames always hee-hawed at me and even men did in the urban water closet!
Well, now I laugh at them, because I took M_E_G. ADI. K
for 7 months and now my dick is dreadfully more than civil.
Mr Ducat said that he had no intention of harming the
According to Reuters, police have found no evidence of
Mr. Kenrick and Mr. Smith both denied to disclose how
on Google Video and YouTube. It is a segment taken from

(2) From: ...
Subject: These positions will help you reach your peak
Date: August 3, 2007 2:10:27 AM PDT
To: ...

Cuties always srieked at me and even boys did in the unrestricted toilet!
Well, now I laugh at them, because I took M E _G_A_D_ IK
for 7 months and now my dick is dreadfully preponderant than civil.
Saturn's moon Enceladus taken in 2005, has shown that
Department and the CIA approved of using harsh
represents them, and the courts are closed to public
The Duke of York will leave New Zealand on Thursday 22nd

Each message has only one occurrence of the word dick, and the product name MEGADIK (in all-caps, iconic of bigness), itself a (modest) re-spelling, is expanded via spaces, underlines, and periods in such a way that human readers will have no problem recognizing it, but programs that search for orthographic patterns might be out-foxed.
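Here is a minimal sketch in Python of why a literal search misses these spellings, and of one naive countermeasure: throwing away everything that isn't a letter before searching. The `collapse` helper is hypothetical, and I'm not claiming that any actual filter works this way.

```python
import re


def collapse(text):
    """Drop every non-letter character and upper-case what remains."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


# The two spellings from the messages quoted above.
samples = ["M_E_G. ADI. K", "M E _G_A_D_ IK"]

for s in samples:
    literal_hit = "MEGADIK" in s               # plain substring search
    collapsed_hit = "MEGADIK" in collapse(s)   # search after collapsing
    print(repr(s), literal_hit, collapsed_hit)
```

The literal search fails on both samples and the collapsed search succeeds on both. The catch is that collapsing a whole message also glues innocent neighboring words together, so this trades missed spam for false alarms, which is presumably one reason the arms race is hard.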

As Specter notes, another strategy is to conceal the message one level down, in something other than a text file at top level.  Specter describes the image file strategy, where the message is encoded in an image rather than in text.  I get piles of penis-enlargement spam in image form every day (much of it depicting monstrously large and monstrously ugly penises), and it gets past both the spam filters I currently have in place (one at CSLI, one on my Mac).

The amount of spam sent to me has clearly been increasing; the amount of spam sent to everybody has.  I'm now getting significant amounts of spam in languages other than English: German especially, but also (just in the last week)  Chinese, Japanese, Hebrew, and Spanish.  And recently there's been a flood of two new (to me) types of one-level-down spam: stuff in a "greeting card" from someone -- "a friend", "a family member", etc. -- and a minimalist strategy, involving an e-mail message headed X (for some innocuous word or phrase X) and containing nothing but an attached file named X.pdf or header "alert", file "alert.pdf" or "".  My spam filters are getting better at weeding out the first sort (I fear the legitimate electronic greeting card business may be in for a bad time), but so far they don't catch any of the second sort.

Well, I get a LOT of e-mail that's basically just a .pdf file -- departmental business, research project files, reports to and from students -- so it's hard to see how the junk could be filtered out without looking into the files, and in a sophisticated way.  (I also trade some .zip files -- on Friday, a zipped version of Garner's Modern American Usage, with my current undergraduate intern.)

So for the moment I'm getting a lot of junk I have to expunge by hand.  And I'm not alone.

(Notes: Thanks to Doug Kenter for getting me to think about spam in the first place.  And a warning to readers: this is only a report of personal experience; I am an idiot about spammish details, and I'm not proposing to survey the topic or to keep on top of developments in the worlds of spam dissemination and detection.)

(Note of a more linguistic sort: Specter's article has the noun spam "doubly categorized", used sometimes as a mass noun and sometimes as a count noun, even in the same sentence:

As the Web evolves into an increasingly essential part of American life, the sheer volume of spam [Mass] grows exponentially every year, and so, it would appear, do the sophisticated spams [Count] from a twenty-dollar broadband account each month; at those rates, a penny would pay for fifty thousand pieces of mail.

Double categorization is a wrinkle in the system of Count/Mass assignment in English that I keep putting off for a future posting.  But if you don't mind a moderately technical and abbreviated treatment, you can look at the discussion here.)

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:15 PM

The Gray Lady discovers the Search box

The online Corrections: For the Record page at the New York Times is getting to be more and more fun. Today's entries are not as informative as last week's note about -30-, but they have a certain geekish charm, as the Gray Lady learns to search her own archives:

An article in some copies on Wednesday about Congressional efforts to pass legislation to expand the government’s electronic wiretapping powers misspelled — yet again — the surname of the attorney general of the United States, in three of four references. He is Alberto R. Gonzales, not Gonzalez. (The Times has misspelled Mr. Gonzales’s name in at least 14 articles dating to 2001 when he became White House counsel. This year alone Mr. Gonzales’s name has been misspelled in February and March, and in two articles in April.)

Mr. Gonzales loses the errata-count race to Mr. Willkie:

An article on the Street Scene page in Business Day on Friday, about the law firm Cravath, Swaine & Moore’s entry into bankruptcy law practice, misspelled the name of another law firm that recently lost a bankruptcy specialist. It is Willkie Farr & Gallagher, not Wilkie. (The Times has misspelled the firm’s name in at least 50 articles since 1958. The “Willkie” comes from Wendell L. Willkie, who joined the firm shortly after losing the 1940 presidential election to Franklin D. Roosevelt and remained there until his death in October 1944.)

And both are battered by the misspelled-name champion ("so far", as Homer would add), Mr. Neiman:

An obituary on July 21 of Shirley Slesinger Lasswell, who marketed memorabilia and toys based on A. A. Milne’s children’s books about Winnie the Pooh, misspelled the name of the department store that agreed to let her set up Pooh Corners for children. It is Neiman Marcus, not Nieman Marcus. (The Times has misspelled the company’s name in at least 195 articles since 1930.)

Seriously, the NYT seems to take corrections much more seriously than most other news organizations do. The BBC's efforts, for example, are almost non-existent in comparison.

[Hat tip to Jonathan Falk]

[Update -- fev from HeadsUp: The Blog writes:

Enjoyed your note on corrections today. If you haven't yet, check out the collection the NYT published a few years back, "Kill Duck Before Serving." (One of the authors appears to be the guy who replaced Bob Byrne as chess columnist; go figure.) Good rundown as of a few years ago on the misspelled-name leaderboard. And gems of this nature:

Because of a transcription error, an article about Senator Alfonse M. D'Amato's remarks about Judge Lance A. Ito misquoted the Senator at one point. In his conversation with the radio host Don Imus, he said: "I mean, this is a disgrace. Judge Ito will be well-known." He did not say, "Judge Ito with the wet nose."

I think the Grauniad has a similar book out too, but I haven't read it. Alas.

Enjoyed the "wisdom vs. ignorance" too, by the way. We're still working on the science thing.

Ben Zimmer discussed Kill Duck Before Serving a couple of years ago ("Needling The Times", 10/31/2005). Amazon's Search function turns up some other gems:

March 11, 1975. In yesterday's issue, The New York Times did not report on riots in Milan and the subsequent murder of the lay religious reformer Erlembald. These events took place in 1075, the year given in the dateline under the nameplate on Page 1. The Times regrets both incidents.

November 2, 1968. The New York Times apologizes for imputing the use of the expression "Communist faggots" to Alexander Sacks, Republican-Conservative candidate for Congress in the 23rd Congressional District. [...]
The original copy clearly contained the expression "Communist fronts." Through an error that occurred in the composing room the word "faggots" was used instead of "fronts."

September 17, 1995. A quotation by the late civil rights lawyer William M. Kunstler was rendered incorrectly in some copies. He wrote, "The Kennedy's real immorality has to do with their lack of ethics as political leaders rather than their sexual exploits"; he did not say "immortality."


Posted by Mark Liberman at 09:51 AM

Wikihow on body language

In a recent post I hinted about a future post on body language, not to be confused with bawdy language. For the latter, you can read posts by Mark and Arnold.  Wikihow has two links on body language, one about how to read it and another about how to understand it. I'm not sure what the alleged difference is, because last I heard, reading and understanding seem to have a lot in common.

From the Bureau of Apparently Unresearched Information at Wikihow's  How to read body language, we find the following curious statements about this topic. This piece looks like an example of what Mark called "wisdom vs. ignorance in networked crowds." Warning: the comments in parentheses are mine ...  and I promise to say nothing about "someone" taking a plural pronoun.

  • "The closer that someone is to you, the warmer his or her opinions are of you." (Is this why Mediterraneans are thought to be better lovers? Should we avoid talking on the telephone? Where does bad breath come in?)
  • "Overly tilted heads are a potential sign of sympathy." (Unless there are other reasons for "over-tilting," such as a bad  case of strabismus or simply being perplexed. And just how overly is "overly?")
  • "Lowered heads indicate a reason to hide something." (Some of the time, maybe, but how about when you're tired or when you're trying to remember something?)
  • "Look into their eyes ... you can actually learn ... how to observe behavior to judge whether someone is lying ... it's easy to spot a confident person, they will make prolonged eye contact." (Oh no, not that old eye aversion thing again!)
  • "People who look away ... are thinking about something else." (But have you ever noticed how some males tend to talk to each other while standing side-by-side rather than face-to-face?)
  • "If someone mirrors or mimics your appearance, this is a very genuine sign that they are interested in you." (Well, unless they're making fun of you, of course.)
  • "People with crossed arms are closing themselves to social influence." (Except when it's habitual, or when they're shy and reserved, or when they have a soup stain on their shirt.)
  • "If someone rests their arms behind their neck, they are open to what is being discussed and are interested in listening more." (But how can we be sure that they're not just stretching?)
  • "If someone brushes their hair back ... their thoughts are about something conflicting with yours." (So always keep your hair combed neatly, or use lots of hair spray, or wear a head-band, or shave your head entirely.)
  • "If someone is biting their lip, they are anticipating something." (Or they're nervous, or they have a suddenly itchy lip. And what about prominent overbites?)
  • "A confident person ... will have a strong posture." (Among other things maybe, but what about all those confident slouchers out there?)
  • "If people laugh excessively, they are just trying to wheedle their way into your good graces." (Avoid excessive pleasantness, keep the conversation serious, and don't tell jokes.)
Wikihow adds a few disclaimers at the end:

  • "Unfortunately, there are always exceptions." (Lots of them, I'd guess.)
  • "There are wide cultural differences." (You got that right.)
  • "Don't isolate yourself by constantly examining body language." (Ignore this list?)

    As Mark points out, "the key problem is that no single source ever gives us 'all the information,' much less all the interpretation," and even Wikihow seems to acknowledge this. We have to wonder though whether it's worthwhile to give readers only part of the information, along with some misleading interpretations. There is a growing research literature on non-verbal communication that Wikihow doesn't seem to take note of. And it's not that hard to find. For example, anyone can go to Google, type in "Paul Ekman," and find a bunch of recent research reports written by him and his associates.

    Posted by Roger Shuy at 08:48 AM

    Nerdcore: Runnin with my beta cuz I'm takin chances

    In 2001, as Mary Bucholtz wrote in the Journal of Linguistic Anthropology, California high-school nerds "[employed] a superstandard language variety to reject the youth culture norm of coolness", adopting linguistic and other practices that "ideologically position nerds as hyperwhite by distancing them from the African American underpinnings of European American youth culture."

    This cultural polarization was part of the joke behind "Weird Al" Yankovic's 2006 album Straight Outta Lynwood, especially the track White & Nerdy:

    So it made sense for Benjamin Nugent ("Who's a Nerd, Anyway?", NYT 7/29/2007) to focus on this polarization in promoting his forthcoming book American Nerd: The Story of My People.

    But American culture continues to transcend oppositions, as Alex Williams illustrated in a NYT Fashion & Style article yesterday ("Dungeons, Dragons and Dope Beats", NYT 8/5/2007).

    There was a time that brainy, pimple-cheeked misfits could only work out their frustrations alone, in action-figure-filled bedrooms, blasting through level after level in "shooter" video games like Wolfenstein 3D.

    Then nerdcore came along.

    A largely white subgenre of hip-hop that celebrates the solitary pleasures of science fiction, computers and bad teenage movies, nerdcore is emerging from the shadows of the Internet, where it spent the last half-decade as an in-joke. This do-it-yourself brand of rap, part self-expression and part self-satire, has inspired two documentary films, and its own festival, Nerdapalooza, in California. This month, MC Chris -- otherwise known as Christopher Ward, 31, the son of a finance executive from the affluent Chicago suburb of Libertyville, Ill. -- will attempt an unprecedented nerdcore crossover when he joins mosh-pit-friendly rock acts like New Found Glory and Sum 41 on the Warped Tour.

    Nerdcore is still mostly a genre-bending joke:

    Many nerdcore anthems -- "You Got Asperger's" by MC Frontalot, "Fett's Vette" by MC Chris, "View Source," by Ytcracker ("Eagerly awaiting my macro advances/running with my beta cuz I'm taking chances") -- are as much efforts at comedy as they are attempts at sincere hip-hop.


    From its early days, hip-hop has been an art form born from oppression and marginalization, where performers sought to turn limitations into strengths, and the harshest circumstances yielded the best material. While no one is going to compare life in the high-school computer lab to the streets of the South Bronx -- or certainly, wedgies to racism -- suburban dweebs have their beefs with society, too.

    If historical tragedies can repeat themselves as farce, maybe this cultural farce will someday express a tragic vision. Meanwhile, nerdcore performers like ytcracker are certainly not positioning themselves as hyperwhite, either stylistically or linguistically:

    Posted by Mark Liberman at 07:32 AM

    Into Britain (and Br E)

    Barbara and I have accomplished the major task on our agenda for the summer: moving to the UK from California. We are staying awhile in the North Downs area of Surrey, recuperating from the last month of frantic physical labor. Or labour, I should say, because now comes the language learning task. It's re-learning, actually: we are both moderately well acquainted with British English (BrE) in a passive way. Now we have to re-activate our command. We have to name the last letter of the alphabet in a way that rhymes with shed, not with she; we have to pronounce schedule and its derivatives in a way that begins with sh-, not sk-. We have to call the trunk of a car the boot, and the hood the bonnet, and park in a car park, not a parking lot. Behind us, the area colloquially known as the ass or butt in American English (AmE) is known here as the arse or bum. Mutual funds are (I think) known as unit trusts here, and soccer is called football. Cookies are called biscuits and what Americans call biscuits would probably be described as scones. Nouns denoting collectivities of human beings generally control plural agreement (Ford have announced a profit, West Ham are winning, The government are determined to stamp out foot and mouth). Yet to be decided is whether I will post on Language Log in BrE or in AmE. The matter is before the Language Log House Style Committee right now. They're a bunch of semicolon-flaunting pedantic wankers, of course, but everything depends on which variety they decide to favor. Or favour, as the case may be.

    [Update: Paul Battley has written to tell me that nearly all the above is out of date or slightly inaccurate because of recent Americanization in BrE. Or Americanisation, as the case may be. If this is so (Andrew Clegg says he thinks it isn't, and my Americanisms above are still clearly Americanisms), it may or may not make it easier to fit in. Time will tell. I just don't know yet. But I do know enough to be aware that reference to rubbers and fags is less taboo in BrE than in AmE because of semantic differences (they are merely erasers and cigarettes, respectively) but with fanny the situation is very much the other way round. No fanny packs here. In BrE the fanny is NOT the arse. And only females have them. Don't ask me to explain any further, we're deep in taboo land here.]
    Posted by Geoffrey K. Pullum at 05:26 AM

    August 05, 2007

    Anita Loos is Alive and Living in Brooklyn

    From the "Vows" feature in the Weddings/Celebrations pages of today's New York Times style section:

    It was Mr. Hernandez, along with Brian Brooks, also of IndieWire, who prodded Ms. Torneo to meet Mr. Paladino, who like she, lived in Brooklyn.

    Shades of Anita Loos.

    Posted by Geoff Nunberg at 02:57 PM

    Blunder maven speaks

    Ben Zimmer has reported here on Michael Erard's forthcoming book on speech errors and disfluencies, Um...

    In which I'm featured, in chapter 9, as a "blunder maven".  Some reflections on my life in erroria...

    First, it's an odd experience being a character in a book.  My daughter even plays a role, talking about what it was like growing up in a slip collector's household.  (In another chapter, Jeri Jaeger and her family are featured.)  Michael summarizes some of my work, of course, but he embeds this discussion in more personal details.  Like where my interest in classical malapropisms came from.  My e-mail answer to him in February 2006:

    I started collecting them to use as illustrations and exercises in beginning linguistics courses.  They're entertaining, and they show the relevance of phonological features, prosodic structure, grammatical category, etc. to real life.

    A surprising amount of my research started from teaching introductory courses.

    (There was also the influence and example of my friend Vicki Fromkin, who's an important figure in Michael's book.)

    That last sentence is important.  Though I started my academic career doing more or less "straight" syntax and phonology, plus some morphology and mathematical linguistics, teaching introductory courses turned me to other things as well, very quickly: usage, a lot more morphology, dialects, variation, errors, casual speech, applied linguistics, argumentation and evidence in linguistics, poetic form, style and register, the social side of sociolinguistics, language play, idioms and other formulaic language, and more.

    In any case, I got into speech errors in the 70s and I've been in the neighborhood ever since.

    Second, Michael's website for the book has a lot of stuff not in the printed version, including an extensive bibliography and also photos of some of the characters in the book: in the version that's up today, these are, in (neither chronological nor alphabetical) order:

    Reverend Spooner, Giovanni Morelli, Kermit Schafer, Thomas Edison, Arnold Zwicky, Randy Harvey, Ralph Smedley, Victoria Fromkin, Rudolf Meringer, Sigmund Freud

    Of these, only Randy Harvey (winner of the 2004 Toastmaster's World Championship of Public Speaking) and I are still alive.  I hope that Jeri Jaeger will be added to our company, and maybe Anne Cutler as well.

    Third, Michael's plans for the book moved me to apply to teach a Stanford sophomore seminar on slips of the tongue, which I last taught in 2001.  (I taught graduate seminars on the topic at Ohio State in the 80s, and a course at the 1982 Linguistic Institute, at the University of Maryland, College Park.)  It's on the books for spring quarter 2008.  I hope the students will be entertained by having their instructor be a character in their textbook.

    Fourth, ever since I got into the business of studying both variation and errors, thirty years ago, I've carried a notepad with me to record interesting data on the fly.  The Dreaded Notebook has developed a certain fame of its own.  Though I try to record things surreptitiously, people notice when I whip it out.  So during a faculty meeting in February 2005, when I scribbled something on it, Paul Kiparsky stopped in mid-sentence and demanded in consternation, "What did I say?"  (What he said was

    It's just fishing in the dark.

    which I took to be an inadvertent blend, based on "just a shot in the dark", with "fishing" -- from "just fishing" 'just searching aimlessly', or from "shooting fish in a barrel", or both -- substituted for "a shot".)

    My year at the Stanford Humanities Center (2005-06) was full of such incidents.  The Fellows had lunch together every day, chatting about matters scholarly and personal, so there were many opportunities for me to collect data.  For a while, the Fellows panicked every time my notepad appeared, but eventually they accommodated to it.  Though even then, they were sometimes distressed by it.  At one lunch, a Fellow was talking engagingly about matters medieval when I pulled out the pad.  He stopped dead and asked, "What did I do wrong?"  And I got to tell him that I'd just thought of some things to add to my grocery list.  (The pad serves many purposes.)

    Then there were visitors.  At one point Lani Guinier appeared for lunch, during a visit to give a big Stanford lecture plus a discussion session at the SHC.  The people at our table introduced ourselves, and general conversation ensued.  I thought she was so absorbed in talking to the director of the Center (sitting next to her) that she didn't notice when my pad came out.  But, alas, she did, and she remembered that I was a linguist, and she was freaked out (as she explained later to the Director).  I got to use the example in a talk I gave the next day (she was not in attendance).

    Ok, here's what she said, from my data log at the time:

    [10/31/05, talking about Constance Baker Motley]  Growing up, she was a real hero of mine.

    This would be, according to many authorities, a "dangling modifier", since the subject of grow up is not supplied by the subject of the main clause.  But it's one of several types of examples that I consider to be entirely innocuous.  So Lani shouldn't have been distressed.

    (A few years ago I started doing a series of academic collages -- a framed collection is now on display at the SHC, and many are pasted up on my office door at Stanford -- beginning with an amended version of an Edward Gorey drawing.  It shows Gorey himself, in his celebrated raccoon coat, lurking in an alcove while drawing on a notepad.  The caption I supplied: "Professor Zwicky was never without his notepad, and often collected data in unlikely spots.")

    Fifth, and finally: in many courses I teach on variation and on errors, I have the students keep data logs of their own, which they then submit to me and I comment on.  This is a good exercise in learning to attend to what you hear and read (not an easy thing to do) and in becoming attuned to data of interest in one way or another.  There's a lot of wonderful stuff out there.

    zwicky at-sign csli period stanford period edu

    Posted by Arnold Zwicky at 02:48 PM


    Back in June I posted about the "New Word Open Mic" held at the Dictionary Society of North America's biennial conference in Chicago. At the event, hosted by the effervescent Erin McKean, members of the public were encouraged to present new words and phrases that they felt deserved more widespread usage. There was some press coverage at the time in the Chicago Tribune, and I mentioned the results over on my OUPblog column. Now there's an edited podcast of the Open Mic, thanks to the combined efforts of Grant Barrett (a judge at the event and cohost of the sadly cancelled radio show "A Way With Words" on KPBS) and Charles Hodgson (the voice of Podictionary: The Podcast for Word Lovers). You can hear it either at the KPBS site or at Podictionary. A recommended listen if you want to hear lexicographers and fellow word lovers exploring new coinages like hangry and newsrotica in all their neologiciousness.

    Posted by Benjamin Zimmer at 11:34 AM

    Emotional code

    A typically warm and insightful xkcd from Randall Munroe:

    (Title: "I've just received word that the Emperor has dissolved the MIT computer science program permanently.")

    If you don't Get It, you may be lacking some of the background that Randall assumes. One crucial piece is what Obi-Wan said to Luke in the first Star Wars movie:

    I have something here for you. Your father wanted you to have this when you were old enough, but your uncle wouldn't allow it. He feared you might follow old Obi-Wan on some damn fool idealistic crusade like your father did. It's your father's lightsaber. This is the weapon of a Jedi Knight. Not as clumsy or as random as a blaster, but an elegant weapon for a more civilized age. For over a thousand generations, the Jedi Knights were the guardians of peace and justice in the Old Republic. Before the dark times, before the Empire.

    For the other piece, you could start with Paul Graham's essay "The Roots of Lisp", and especially the code for the implementation of Lisp in itself. [In case you can't read the .ps file that he provides for the essay, a .pdf version is here.]

    Posted by Mark Liberman at 10:16 AM


    In today's Boston Globe, Michael Erard pinch-hits for "The Word" columnist Jan Freeman and gives a preview of his new book, Um... Slips, Stumbles, and Verbal Blunders, and What They Mean. Erard writes:

    It's typical to think of verbal blunders as embarrassing slip-ups that we should avoid. But I've just written a whole book about verbal blunders, and I find them fascinating. Why? Because they're signs of the wild. Not in the sense of rough or savage, but because they're pure and untameable. They provide a window into what humans really are: biological organisms who live in complex groups and have really amazing brains. Blunders of the verbal sort may seem like violations of the order of language, but in fact they're spontaneous eruptions of the qualities that gave us this order in the first place.

    I'm greatly looking forward to reading Um... (Amazon says it will be released on Aug. 21), especially Chapter 9 ("Fun With Slips"), since according to Erard's site it will focus on a favorite Language Log topic: eggcorns, as illuminated by "blunder maven" Arnold Zwicky.

    Posted by Benjamin Zimmer at 10:08 AM

    February 30

    The most informative erratum of the year ("so far", as Homer would add) comes from the New York Times' online "Corrections: For the Record" page of July 30:

    An article on Thursday about the arraignment of three men in the shooting of two New York police officers, one of whom died, misstated the schedule set by a judge for a trial in the case. The trial is expected to begin by February, not by “Feb. 30.” The error occurred when an editor saw the symbol “— 30 —” typed at the bottom of the reporter’s article and combined it with the last word, “February.” It is actually a notation that journalists have used through the years to denote the end of an article. Although many no longer use it or even know what it means, some journalists continue to debate its origin. A popular theory is that it was a sign-off code developed by telegraph operators. Another tale is that reporters began signing their articles with “30” to demand a living wage of $30 per week. Most dictionaries still include the symbol in the definition of thirty, noting that it means “conclusion” or “end of a news story.”

    [For the context of the error, see Michael Brick, "Prosecution Presents Array of Evidence in Killing of Police Officer", 7/26/2007.]

    This sense of thirty was news to me, demonstrating yet again my ignorance of the newspaper business. Here are the OED's citations:

    1895 Funk's Standard Dict., Thirty..among printers and telegraphers, the last sheet, word, or line of copy or of a despatch; the last; the end. 1929 Amer. Speech IV. 290 ‘30’ or ‘Thirty’ indicates the end of a shift or of the day's work, and has come to mean, also, death. 1938 Sun (Baltimore) 20 Jan. 2/8 Newsmen..mourned today at the bier of Edward J. Neil,..who was killed by shrapnel while covering the civil war in Spain. Prominent..was a shield of white carnations with a red-flowered figure ‘30’ -- the traditional ‘good night’ in the lore of the fourth estate. 1941 J. SMILEY Hash House Lingo 58 30, end of anything. 1945 J. O'HARA in New Yorker 27 Jan. 22/3 ‘I say thank you and thirty.’ This last, the word ‘thirty’, is the traditional signing-off signal of the newspaper business. 1973 R. LUDLUM Matlock Paper xxix. 251 The number 30 at the bottom of any news copy meant the story was finished. 1978 G. VIDAL Kalki IV. i. 88 ‘When we know those two things, it's fat thirty time.’ Bruce had obviously been impressed by journalism school.

    [via Nancy Friedman at Away with Words]

    [Marilyn Martin writes:

    During the '30s and '40s there was a nightly radio news broadcast called The Richfield Reporter [sponsored by Richfield Gasoline]. At the conclusion of each fifteen-minute [sic] broadcast, the reporter would always sign off with

    "That's 30 for tonight."

    And John Cowan, commenting at Nancy Friedman's site, suggests another etymology:

    The story I heard (but I don't vouch for it), was that the original form of "- 30 -" was "- XXX - ", and that this was later read, or misread, as a Roman numeral. It indicates, when a story is written in takes, that the story is complete: this is the last take, no more to come.


    [Update #2 -- Dick Margulis writes:

    When typewritten copy was sent to the composing room, in the days of the Linotype, a late-breaking story might be distributed among two or more typesetters. Every sheet but the last had -more- at the bottom, indicating explicitly that it was _not_ the last sheet, and paragraphs were never broken at the end of a page, thus ensuring that lines would not need to be reset. Even today, printed news releases (formatted as double-spaced typed pages) often conform to the same convention, although the "-30-" is usually replaced with "###" to indicate the end of the article.

    I'd never seen the "XXX" hypothesis before, but it certainly makes intuitive sense.

    Mark Eli Kalderon has compiled a collection of additional etymological speculations here, some of them spectacularly fanciful. In the absence of evidence, this sort of thing becomes a sort of large-scale game of Balderdash. Of course, there are theories in which all rational thought is an internal version of this style of post-hoc story-telling... ]

    [Update #3 -- Don Porges points out that "The journalistic meaning of '-30-' used to be well-known enough" that there was a 1959 movie by that name. Starring Jack Webb, yet.]

    [And -- drum roll... -- the envelope, please!]

    Posted by Mark Liberman at 09:22 AM

    August 04, 2007

    Results from ILO5

    Drago Radev, the coach of the U.S. team that has just competed in the 5th International Linguistics Olympiad in St. Petersburg, Russia, reports that his two teams of American high-school students have triumphed in the competition. Fifteen teams of four people each participated in the Olympiad.

    First, USA's Team 2 won the team contest in a tie with one of the Russian teams. The new world co-champions are Rebecca Jacobs of Los Angeles, Michael Gottlieb of Dobbs Ferry, NY, Josh Falk of Pittsburgh, and Anna Tchetchetkine of San Jose.

    Second, Adam Hesterberg of Seattle (a member of USA's Team 1) won the individual contest with a huge point difference in front of everyone else.

    Third, Jeffrey Lim, also a member of USA's Team 1, won a Best Solution award for problem 1.

    The International Linguistics Olympiad is a direct descendant of the Olympiad of Linguistics and Mathematics, which was founded in 1965 in Moscow. High-school students compete by solving linguistics and logic problems based on natural language. The U.S. teams have funding from NSF, Google, and NAACL, as well as private funds from some of the students.

    For a sample of the problems the teams are asked to solve, see Mark's post here.

    A follow-up note via Blackberry from Drago Radev in St. Petersburg:

    Adam had 90 points out of 100. The second highest score was 85.5 from Poland. Third was 78 from Russia. Fourth and fifth were tied at 76.5 from Bulgaria and Russia.

    The Swedish team sang 2 great a cappella songs.

    Adam is also part of the USA math camp. He is going to be a freshman at Princeton next year. The other two seniors on the team are going to Caltech and Cornell.

    Rebecca is 15 and was at the LSA in Stanford earlier this year.

    I was the coach for the two teams and Lori Levin of CMU and Amy Troyani of the gifted high school program in Pittsburgh were also part of the team's entourage.

    This is the first time that a US team has been at the ILO.

    Next year's contest will be on the Black Sea in Bulgaria.

    Posted by Sally Thomason at 03:34 PM

    Getting the name right

    On August 1, the New York Times printed an affecting op-ed piece "Leave Your Name at the Border", about the Anglicization of Mexican names -- pronouncing the names as if they were English, replacing them by English "equivalents" (Connie for Concepción, Raymond for Ramón), or abandoning traditional names in favor of decidedly non-Mexican English names (Ashley, Bradley) -- and its personal and social significance.  The author was the writer Manuel Muñoz, who grew up bilingual in the Central Valley of California.  (His collections of short stories, Zigzagger (2003) and The Faith Healer of Olive Avenue (2007), picture the lives of "Mexicans", as they are referred to locally, there; gay characters play an important role in both books.)

    The Times published the piece with Muñoz's family name spelled correctly, tilde and all.  And, if you try to search the Times site for the writer, you'll have to spell it correctly too: a search on {Manuel Munoz} gets no hits, but {Manuel Muñoz} works just fine.  I was surprised at how scrupulous the Times was -- though, frankly, it would be more helpful if its searches treated n and ñ as equivalent (as Google does).  A glance at Amazon's listings for his books suggests that the site replaces "funny" characters in its headers, uses them variably in text, but treats n and ñ as equivalent in searches.
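    To make the point about equivalence concrete, here is a minimal sketch of one way a search could treat n and ñ as the same letter: decompose each string with Unicode normalization and drop the combining marks before comparing. This is only an illustration. The function names are hypothetical, and nothing here describes how the Times, Google, or Amazon actually implement their searches.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text case-folded, with combining marks (such as the tilde) removed."""
    # NFD splits "ñ" into a plain "n" followed by U+0303 COMBINING TILDE.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so those are dropped.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, candidate: str) -> bool:
    """Hypothetical helper: compare two strings while ignoring diacritics and case."""
    return fold_diacritics(query) == fold_diacritics(candidate)


print(matches("Manuel Munoz", "Manuel Muñoz"))  # True
```

    Folding of this kind is a trade-off: it helps people who cannot type the ñ, at the cost of conflating names that are distinct in Spanish.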

    Muñoz's own site gets it right, of course.  But not in the url, where the ñ has been de-tilded:  Anglicization lives on the web.

    [Addendum: in the face of e-mail complaining that the last sentence is just wrong -- this isn't Anglicization, it's just that the DNS system uses only alphanumeric characters (plus a very few punctuation marks) -- let me clarify things.  The alphabetic characters that count as "alphanumeric" in the DNS system are just the ones used in English, for the obvious reason that it was speakers of English who set the system up; speakers of Spanish, German, French, Finnish, etc. would have made different choices.  The result is that spellings in languages that use a different set of alphabetic characters from English have to be Anglicized in order to be parts of valid addresses.  Anglicization is built into the system.]

    [Later addendum: Michael Wahlster reports: "Since 2004, domain names can have special characters. It is not very useful, though, since it requires the potential visitors to be able to reproduce the special character in the address field. See here."]
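    [As a rough illustration of how such "special character" names coexist with the ASCII-only DNS: the name is converted to an ASCII-compatible form (prefixed with "xn--") before lookup. The sketch below uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules; the hostname is a hypothetical example, not Muñoz's actual address.]

```python
def to_ascii_hostname(hostname: str) -> bytes:
    """Convert an internationalized hostname to its ASCII-compatible form."""
    # Python's built-in "idna" codec encodes each non-ASCII label separately.
    return hostname.encode("idna")


def to_unicode_hostname(ascii_hostname: bytes) -> str:
    """Convert an ASCII-compatible hostname back to its Unicode form."""
    return ascii_hostname.decode("idna")


hostname = "muñoz.example"  # hypothetical domain, for illustration only
ascii_form = to_ascii_hostname(hostname)
print(ascii_form)                       # the ASCII-only form that DNS actually sees
print(to_unicode_hostname(ascii_form))  # the original spelling, recovered
```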

    zwicky at-sign csli period stanford period edu

    Posted by Arnold Zwicky at 01:57 PM

    Vocalization of /l/ in the funny papers

    Here's an un-illustrated (but true) l-vocalization story. I was a graduate student, and my oldest son was three. He was riding his tricycle in the driveway of our apartment in Dorchester, steering around in ever-narrowing circles.

    "Don't lose your balance!", I warned.

    He stopped, looked at me, got off the trike, and walked to the porch, where he found his soccer ball. He carried it to the middle of the driveway, raised it above his head with both hands, and threw it as hard as he could against the ground. He retrieved it and repeated the process, with the same result. He turned to me, put his hands on his hips, and shook his head with an expression of patient resignation.

    "I can't lose my bounce", he told me, and went back to riding around in circles.

    That was [ˈba.wəns], of course.

    Posted by Mark Liberman at 01:28 PM

    Wisdom vs. ignorance in networked crowds

    A fascinating post by Lauren Squires at Polyglot Conspiracy documents and discusses the widespread public disdain for academics in general and linguists in particular ("On the uselessness of linguistics in particular and academia in general", 8/3/2007). Lauren looked at the reaction in blog comments to Benjamin Nugent's mention of Mary Bucholtz's work on nerdiness and racial identity; and she didn't enjoy the experience:

    I was really interested to see how people would interpret her work given the complete lack of important context in the NYT, and how the notions of linguistics and academia would figure into their comments about the piece. Taking to the blogosphere via Technorati and Google blogs searches, I found a bunch of really awful, mean, spiteful, depressing commentary. Almost all of it was completely reliant on Nugent's piece, and hardly any of it reflected an actual reading of Bucholtz's work (you can tell, because half the time the very points commenters bring up are points that Bucholtz covers in her articles, including the 2001 one that Nugent cited). This didn't surprise me - who's going to wade through a 17-page academic article if they're not attempting to get an academic degree or maintain their academic status? - but I was really, really shocked by how hypocritical people are. Because basically they are fast-as-lightning to criticize Bucholtz's work for failing to consider all angles of the issue, but they do so without themselves considering that perhaps the NYT didn't give them all the information.

    It's true, there's something about blogospheric comments (and web-forum and newsgroup postings) that tends to bring "awful, mean, spiteful, depressing commentary" front and center. In these forums, the wisdom of crowds fights with the effects of transient, cost-free and semi-anonymous communication, and it's not often the wisdom that comes out on top.

    Lauren is not the first blogger to get discouraged about blogospheric discussions of science. A couple of years ago, Chris at Mixing Memory wrote:

    As an academic, I have spent a lot of time hiding away in the ivory tower, oblivious to the larger world around me. As a graduate student, especially, I had almost no time to pay any attention to what non-scientists were saying about cognitive science. However, on a fateful day in early 2004, I chose to crawl out of my hole and actually look at what other people were saying. I started reading blogs. And now I want to crawl back in!

    And it's experiences of the same sort that led us here at Language Log to abandon our experiments with comments.

    But all the same, I believe that blogs and other new media are capable of giving birth to better public understanding of research (in linguistics as in all other areas), whereas traditional science journalism is doomed to repeat the doleful patterns of the past.

    In a response to Chris's post, I suggested that the way to make things better is to "Encourage everyone to think about science, and to write about it on the web, whether they know anything about it or not", and to "improve the (professional scientific) literature" by promoting open access not only for all scientific publications but for all the data and programs that they rely on. I claimed that "open intellectual communities intrinsically tend to generate a virtuous cycle", both for professionals and amateurs. ("Raising standards -- by lowering them", 3/7/2005.)

    You can go read what I wrote, if you like, and see if you agree with me that things have been moving in the right direction over the past couple of years. I'm not going to recapitulate the arguments here, other than to note that Chris is still blogging. In the rest of this post, I'd like to explain why I think that traditional media are unlikely to contribute much to the process.

    The key problem is that no single source ever gives us "all the information", much less all the interpretation. The source of motive power here is a generalization of Moglen's Metaphorical Corollary to Faraday's Law, extended from software to scientific understanding:

    ...if you wrap the Internet around every person on the planet and spin the planet, software flows in the network. It's an emergent property of connected human minds that they create things for one another's pleasure and to conquer their uneasy sense of being too alone. The only question to ask is, what's the resistance of the network?

    But journalists have no tradition of providing references or links, and generally don't read primary sources themselves anyway, and operate as if their goal were to be the sole source of information for their readers, not a node in a network of communication and creation. These attitudes and practices are deeply embedded in journalistic culture, and deeply connected to many other aspects of its social and economic context, and reinforced by the traditional attitudes and practices of scientists.

    When I blogged briefly about the NYT Magazine article that Lauren Squires took as her starting point ("Language and identity", 7/29/2007), the main thing that I wanted to do was to present some of Bucholtz's work in her own voice, and to give people a link to her original paper, so they could read it for themselves.

    Unfortunately, this wasn't nearly as easy to do as it should have been. I started with the obvious thing, namely a link to Bucholtz's paper in the online archive of the Journal of Linguistic Anthropology. JLA is "a semiannual publication of the Society for Linguistic Anthropology (SLA), a Section of the American Anthropological Association". JLA is available online via AnthroSource -- if you're accessing it from an institution that has bought a subscription. Otherwise, you need to pay a per-article fee, which I think is generally $12.

    That's less than some journals charge, but it's a lot to pay for electronic access to a few pages of text that you're not sure in advance is worth anything to you at all.

    (Several people complained about this access barrier, both to me by email and in blog posts on the topic. And it turned out that Bucholtz has a pdf of the article on her website, so I redirected the link there. But more often than not, linguists don't put free copies of their articles anywhere -- a simple cultural change in this practice would make an enormous difference in our field's relations outside the academy. ROA and LingBuzz are steps in the right direction.)

    Anyhow, The Times Magazine's editors characteristically didn't bother to put a link to Bucholtz's work in (or next to) Nugent's article. One excuse might be that the archival version is behind the AnthroSource pay wall -- but that's not the real reason, because the NYT doesn't provide links to Open Access journals either. In fact, they generally don't provide links outside their own sites, either because they don't want to seem to endorse other sources, or because they don't want to send traffic outside their walled garden, or because their unconscious ideology is that they should be the sole source of information for their readers.

    Company policies aside, it's clear that most journalists feel that the public is not capable of reading primary sources. They feel that way because they themselves typically don't read primary sources, when they're writing about science or for that matter about other areas of intellectual inquiry. Instead, they generally rely on press releases and on their (often muddled) notes from interviews with experts.

    We've documented the sad results over and over again, as have many other bloggers. This post is already too long, but let me digress anyhow to comment on two small but telling recent cases. (I'll add a grey background, so you can skip these depressing little case studies if you want.)

    One example was documented in my post "A bulletin from the Language Log Early Warning Center", 8/1/2007.  The culprit, Chip Scanlan, has paid his journalistic dues:

    Former reporter (Providence Journal, St. Petersburg Times, Knight Ridder Washington Bureau), author of "Reporting and Writing: Basics for the 21st Century" (Oxford University Press). Co-editor, "America's Best Newspaper Writing" (Bedford/St. Martin's) Edited "Best Newspaper Writing" 1994-2000. Teaches reporting, interviewing, coaching skills, nonfiction narrative, personal essays and deadline storytelling.

    Scanlan, who is obviously smart, hard-working and responsible, now teaches at the Poynter Institute and produces a writing advice column for Poynter Online. His column from 6/28/2007, "Brain Science for Writers: Active verbs move nerve cells, too" tells us that the common advice to "use active verbs" has been "put ... under the gaze of science", by British neuroscientists who have shown that "if my characters kick, kiss or dance, so will my readers' brains".

    Cute, but the original article is not about active verbs but about action words, that is, "words that have a clear semantic relationship to actions, typically action verbs ... or nouns referring to tools ". The stimuli in the study were mostly words that are ambiguously nouns or verbs (such as "kick, kiss, or dance"), and they were presented in isolation, not in either active or passive sentence frames.

    Now, Scanlan was not the author of this misunderstanding. He took it directly from a piece in Science News (Bruce Bower, "The Brain's Word Act: Reading verbs revs up motor cortex areas", Science News, 2/7/2004). I'll bet that Scanlan took the Science News article at face value, and that it never even occurred to him to go read the original research reports.

    And it wouldn't surprise me to find that Bruce Bower in turn was misled by some anonymous PR person who wrote the press release about the original research, either at the MRC lab in Cambridge or at the journal Neuron.

    It's certainly a PR person who was responsible for a striking numerical botch at Science Daily a few days ago.

    According to "Genetic Mutations Linked To Lupus", Science Daily, 8/1/2007:

    The study involved 417 lupus patients from the United Kingdom and Germany. Mutations were found in nine patients with lupus and were absent in 1,712 people without lupus.

    But according to the original research report (Min Ae Lee-Kirsch et al., "Mutations in the gene encoding the 3'-5' DNA exonuclease TREX1 are associated with systemic lupus erythematosus", Nature Genetics, published online 7/29/2007):

    We identified five heterozygous missense changes and one frameshift change in 6/218 individuals with SLE from the UK compared with 0/200 nonsynonymous changes in controls .... In the German SLE cohort, we found four heterozygous missense changes, one frameshift and a single 3' UTR variant in 6/199 affected individuals but only 2/1,512 controls ....

    So Science Daily says that "mutations were found in nine patients with lupus", while the research report says that they found mutations in 12 "affected individuals", 6 in each of two cohorts.

    And Science Daily says that the mutations "were absent in 1,712 people without lupus" -- the research report says that the mutations were found in "2/1,512 controls" in one cohort, and 0/200 in the other, summing to 2/1,712.
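    The pooled totals are easy to check. The short sketch below simply adds up the per-cohort counts quoted from the research report above; the variable and function names are made up for the illustration.

```python
# Counts quoted from the research report: (individuals with mutations, individuals tested).
cohorts = {
    "UK": {"patients": (6, 218), "controls": (0, 200)},
    "Germany": {"patients": (6, 199), "controls": (2, 1512)},
}


def pooled(group: str) -> tuple[int, int]:
    """Sum carriers and totals for one group ("patients" or "controls") across cohorts."""
    carriers = sum(cohort[group][0] for cohort in cohorts.values())
    tested = sum(cohort[group][1] for cohort in cohorts.values())
    return carriers, tested


print(pooled("patients"))  # (12, 417)
print(pooled("controls"))  # (2, 1712)
```

    The denominators agree with the press release (417 patients, 1,712 controls); only the numerators, 12 and 2, differ from its 9 and 0.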

    How did 12 turn into 9 and 2 turn into 0? Carelessness, certainly, but whose? A clue is provided by a note at the bottom of the Science Daily article:

    Note: This story has been adapted from a news release issued by Wake Forest University Baptist Medical Center.

    Adapted, hell -- it was copied word-for-word, as this reprint of the original news release at shows. So far, the rest of the mainstream media haven't picked up this particular story, and maybe they won't, because the press release gives actual mutation counts rather than odds ratios, and also there was some apparently more important biomedical news about lupus during the same period. But what do you want to bet, if the story makes it onto the Reuters or AP newswires, or into BBC News or the NYT, that the numbers reported will be 9 and 0, not 12 and 2?

    OK, back to the main thread...

    According to Andrew Keen's recent screed "Against Open Culture" in Always On,

    ...formal cultural gatekeepers are good. There is massive commercial and intellectual value in the traditional ecosystem of professional scouts, agents, editors, producers, and publicists skilled in the discovery and development of talent. Sure, this may be a system run by elites, but at least it's a meritocratic one.

    Wouldn't it be pretty to think so? The trouble is, the "elites" in question -- at least the elites responsible for informing the public about science -- are collectively ignorant, careless and lazy. Of course, the individual human beings involved are no doubt mostly smart, responsible and hard-working. But the culture they live in doesn't allow them to show it.

    This "traditional ecosystem" of science-related journalism is probably impossible to reform. As bad as its results are, each of its parts is highly adapted to the economic and cultural niche it inhabits. It's fun to make fun of the system's botches, and this helps to correct misinformation and to make people suitably suspicious of the media's coverage of science in general, but it would be naive to expect that sniping from the sidelines can really change the fundamental dynamics of the industry.

    It's better to build an alternative culture of connections between researchers and members of the public, bypassing the "formal cultural gatekeepers" entirely. That culture, already in the process of being born, will not consist of everyone yelling at once, anonymously, in some hyperdemocratic network of web forums. It'll be contentious at times, no doubt, and also complicated, highly structured, and uncomfortably dynamic. But if it works, as Eben Moglen wrote about free software, it'll be because

    ... in the end ... it's just a human thing. Rather like why Figaro sings, why Mozart wrote the music for him to sing to, and why we all make up new words: Because we can. Homo ludens, meet Homo faber. The social condition of global interconnection that we call the Internet makes it possible for all of us to be creative in new and previously undreamed-of ways.

    [Update -- Fev at HeadsUp: The Blog speculates about me: "At a guess, it's an artifact of news routines (did somebody give Mark a Herbert Gans anthology for his birthday or something?) rather than deliberate news slant."

    This confirms my suspicion that Fev is a closet sociologist. In fact, I've never read anything by Herbert Gans, but now I've ordered "Deciding What's News" and "Democracy and the News", and expect to be enlightened. ]

    [Update #2 -- Peter Svensson, a Technology Writer at the Associated Press, writes:

    My friend Ed Keer pointed me to your Aug. 4 Language Log post on science journalism, and while I think it raises worthwhile points, I take exception to the implication that AP reporters don't check primary sources when writing about science.

    Adapted, hell -- it was copied word-for-word, as this reprint of the original news release at shows. So far, the rest of the mainstream media haven't picked up this particular story, and maybe they won't, because the press release gives actual mutation counts rather than odds ratios, and also there was some apparently more important biomedical news about lupus during the same period. But what do you want to bet, if the story makes it onto the Reuters or AP newswires, or into BBC News or the NYT, that the numbers reported will be 9 and 0, not 12 and 2?

    I've participated in our science coverage, and I can tell you it's based on the original papers, plus interviews with the authors. I'd be surprised if the other members of the MSM that you mention don't have the same practices. I'd be happy to put you in touch with one of our current science writers, so you can go straight to the source.

    Touché. This post doesn't document lack of primary-source reference on the part of an AP writer (though you could look here for an apparent example from an earlier post), and it was irresponsible of me to make an insinuation about the AP without citing a specific example to back it up.

    However, I believe that my basic point was valid, independent of whether particular journalists do or don't look at the scientific papers they're reporting about. The choice of what science and technology news to report, and what to say about that news, seems to be substantially determined by a relatively uncritical replication of the press releases issued by journals and scientific societies, universities, and companies. I rarely see, in the AP or anywhere else, a story that looks critically at the content of an original paper, even when the reporter takes a few facts or quotes from such a source. Nor do I often see a case where a story features a research result or a research program whose newsworthiness has not been decided by someone's public-relations department.

    Of course, I don't really know how science journalists work. All I can do is evaluate what they write.]

    Posted by Mark Liberman at 12:14 PM

    August 03, 2007

    Wikihow on lying

    A few days ago I posted about how not to spot a liar. Then yesterday on my Google home page, under the category of "How to of the day," I saw a link called "how to communicate with body language." Being naturally curious, I checked it out and found a number of popular ways to read and understand such things as position of the head, eye aversion, movement of arms and legs, and other clues to deception, which could be the subject of another post. But it also included another link called "how to detect lies" that caught my immediate attention.

    Part of this link repeats the advice found on the body language link, adding that we should pay attention to how people smile, including such things as tightening around the eyes, moving only the muscles around the mouth, how people touch their face, nose, or behind the ear, how they avoid eye contact, how they delay expressing their emotions, and how much they sweat. Ah, if it were only that simple! But what I was really looking for was what the link had to say about verbal language. I found some of the same old stuff about spotting a liar, including being defensive about questions, delaying responses to questions, and:

      • repeating the exact words of a question,
      • not using contractions,
      • avoiding direct statements or answers,
      • speaking excessively in an effort to convince,
      • speaking in a monotone,
      • leaving out pronouns,
      • speaking in muddled sentences,
      • equivocation in non-answers,
      • using humor and sarcasm to avoid the subject,
      • showing discomfort by pausing,
      • and changing the subject quickly.

    It's hard to know where to begin criticizing this list. I leave it to you to decide whether using contractions, pausing, or leaving out pronouns indicates untruthfulness. And, mercy, you'd better not change the subject or produce a muddled sentence. Note that this list shares many clues with Statement Analysis, which I discussed in my earlier post.

    Perhaps the best thing about the list is that it is followed by some disclaimers, including:

    • Just because someone exhibits one or more of these signs does not mean they are lying. The above behaviors should be compared to a person's base (normal) behavior whenever possible.

    • Some of the behaviors of a liar ... also coincide with those of an extremely shy person, who might not be lying at all ... or when the topic is sophisticated or the person is stressed.

    • Eye contact is considered rude in some cultures.

    • Botox, plastic surgery, Autism, or Asperger's syndrome can provide false positives.

    I'm reminded here of those recent full-page, print-too-small-to-read advertisements for pharmaceutical products that promise relief from various illnesses, but are required by law to include a long list of frightening possible side effects -- only in this case, the bad stuff seems to be followed by the possible good.

    Posted by Roger Shuy at 10:15 AM


    Melissa McEwan, writing about the milblogger fuss at TNR:

    And I don't give a rat's ass if Beauchamp is married to someone who works at TNR. Wev.

    "Wev" as an abbreviation for for "whatever" is new to me, but a quick search of weblogs makes me feel out of it. "Wev" is not very common, but it's widely distributed around the world, and it's been out there for a while:

    Is it boring to listen to my stream-of-consciousness? It must be, or I would have more readers! However, this blog is fun, regardless, so wev.

    (Gotta love that paranoid chipmunk...except that it's really a prairie dog, but wev.) .

    Just my opinion though mind you, so wev.

    I've only missed this one day so far, and I could do those tests asleep and I have an A already before the extra credit, so wev.

    (I did this one over a year ago, but wev.)

    HAH. ok wev...moving on.

    But, wev. It's done.

    Kai von Fintel has analyzed the semantics of standard English Whatever, but I'm not sure whether anyone has documented in detail the process whereby traditional whatever evolved into the now-common stand-alone expression.

    The OED "draft additions September 2001" traces it back to 1973:

    whatever, pron. and a.

    * int. colloq. (orig. U.S.). Usually as a response, suggesting the speaker's reluctance to engage or argue, and hence often implying passive acceptance or tacit acquiescence; also used more pointedly to express indifference, indecision, impatience, scepticism, etc.: 'as you wish'; 'if you say so'; 'it makes no difference to me'; 'have it your own way'; 'fine'.

    1973 To our Returned Prisoners of War (U.S. Secretary of Defense, Public Affairs) 10 Whatever, equivalent to 'that's what I meant'. Usually implies boredom with topic or lack of concern for a precise definition of meaning. 1982 San Francisco Examiner 7 May A3 When someone responds 'whatever', he or she seems to be saying 'I'm amenable to anything. I'll defer to you.' But in my experience, when a person says 'whatever', he or she is really saying, 'I don't want to take any responsibility. You do all of the deciding and then I'll pass judgment.' 1986 D. A. DYE Platoon (1987) iii. 21 Feed any of these guys a full-scale briefing..and you'd get the same response: "Yeah, right. Whatever, man, whatever". 1990 G. G. LIDDY Monkey Handlers iv. 53 Levin gave a mirthless smile. 'The Heads from Hell. They wear embroidered signs on the back of vests.'.. 'Colors,' Stone interjected. 'Whatever,' said Levin. 'You'll be able to tell them by it.' 1995 New Yorker 16 Oct. 131/2 You get to the point where it would be foolish to be surprised at anything. A sports bar opens. Then it closes. Whatever. 1998 Village Voice (N.Y.) 21 July 28/1 If someone came running to say he'd just seen Jesus preaching on the steps of the 72nd Street subway stop, most New Yorkers would reply, 'Whatever'. 2000 D. WAUGH in J. Adams et al. Girls' Night In 529 The secretary admitted that the list had been 'temporarily mislaid'. Whatever.

    But there's a somewhat different syntactic form involved in examples with connectives like so and but -- stand-alone whatever (or wev, whatever) is getting re-integrated with propositional force into larger structures, meaning something like "I don't care" or "it doesn't matter" or "that's OK":

    You probably didn't care to hear any of that, but I'm kinda high off the crafting I've been doing this evening, so whatever.
    Hardly a precise way to tune, but then the principal oboe is the boss, so whatever.
    It was his sister-in-law though, so whatever.
    I asked for a Heineken, but whatever.
    It comes from Jason, who admits it might be a little busy, but whatever.
    Toni's rushing me as always but whatever.
    I was gonna say the truth, but whatever.
    The quality is so unbearable. Still, it might not be for you so whatever, but still I'd say give it another chance.
    i dont really care though so whatever if im a loser!
    i was at work later than i planned, but whatever because it's not like i have anywhere to be.
    Okay, you know, I love to sit in a booth at a restaurant, but whatever, since I like to hang out with her and this is the best way to do it.
    After yoga, I decided to go running for half an hour, which was of course the exact opposite of what I should have done, but whatever, because Sam's Mom's and my favorite movie of the last decade, Galaxy Quest was on the television in front of my treadmill.

    Another historical question: was the reduction from "whatever" to "wev" purely phonological (as in the pronunciation of "Worcester"), or did it start as an IM or SMS short form? I'd guess that it's the latter, but I haven't been able to find any evidence.

    [Update -- Topher Cooper writes:

    Unless I'm inserting a (slightly) later idiom for some other single word with identical meaning, I'm pretty sure that this use of "whatever" was a regular bit on "All in the Family". Archie would garble someone's name, confuse facts about their ethnicity, religion, political beliefs etc.. They would correct him with obvious affront, to which Archie would reply -- you guessed it -- with a long suffering "Whatevah!" Don't know when this bit was introduced so I don't know whether it predated the earliest OED citation (IMDB says the show ran from '71 to '79) but I'm willing to bet (a bet I'm unlikely to lose since we are unlikely to have enough evidence to prove anything one way or another) that this spread and popularized the expression.

    I also associate this with "Valley Speak" of the late 70's and the Wikipedia article on "Valspeak" backs this up as a typical expression (spelled there as "what-ever" to imply something of the intonation).

    It's likely that the isolated "whatever" comes from the phrase "whatever you say" which has pretty much identical meaning.

    Archie Bunker, Valley Girl. I like it.]

    [Ben Zimmer writes:

    Most likely "wev" formed through a process of clipping, whatever -> whatev(s) -> wev. "Whatev" and "whatevs" are well-attested online, with the latter owing much of its popularity to, a blog that started up in early 2002. For more on "whatevs" and similar forms popular in the blogosphere (like "obvs"), see this post.

    Certainly there are plenty of uses of "wevs" out there:

    I didnt want to go because Im a whiny pms girl but wevs it was fun.
    I think he is gross, but wevs.
    She'll probably tell Beth and Bob that I'm a horrible person, but Bob doesn't really like me anyway since I'm Catholic so wevs.


    [Lauren Squires writes:

    It seems like Ben's answer is right, but maybe with an additional point: in IM, text, etc., people often break up words with two+ syllables (or compound words) and somehow note the syllable/word break with punctuation marks. Some obvious examples are b/c (because) and w/o (without) (I realize these conventions may precede or have evolved independently of computer-mediated communication, but that's not really relevant here). UrbanDictionary also gives w/e as a form of 'whatever', and it also gives w/ever. So I am guessing that one additional step in Ben's clipping process as outlined is the breaking up of [what] and [ever], and the abbreviation of both (keeping the "ev" as in "wev" rather than coming all the way down to "we" is probably a matter of avoiding ambiguity).


    [Aviad Eilam writes:

    There has in fact been some work on discourse marker "whatever" (I know due to my work on the formal semantics of the Hebrew counterpart to standard "whatever"):

    Blake, Renee, Maryam Bakht-Rofheart, Stefan Benus, Sabrina Cooper, Meredith Josey, Erica Solyom. 1999. "'I have three words for you...': Whatever as a discourse marker." Paper presented at NWAV 1999.

    Also, perhaps some support for the hypothesis that the isolated "whatever" derives from the phrase "whatever you say" is the fact that the Hebrew equivalent is "what you say". Unlike a simple definite description, the latter has to have high pitch/intensity on the wh-word to get the -ever component. Maybe the fact that there is no lexical marker of -ever prevents the form from being reduced, as arguably happened in English.

    Another paper from the same group is "'WHAT-EV-ER': More than just a gendered discourse marker," presented at the International Gender and Language Association (IGALA) Conference in 2000. Unfortunately, as far as I can tell, neither paper is on line or has been published anywhere accessible. It's too bad that the field of linguistics hasn't developed the norms of open-access online archiving that are routine in mathematics and physics. ]

    Posted by Mark Liberman at 07:44 AM


    ... handed on from one generation to another:

    [John Cowan writes: "Gray hair, they say, is hereditary: you get it from your children."]

    Posted by Mark Liberman at 07:04 AM

    August 02, 2007

    AWWW ...

    KPBS, my local public broadcasting station, has just decided to cancel two popular (but apparently too expensive) locally-produced shows. One of these is their radio language call-in show, A Way With Words. (Finally, the word play in the title is making some sense.) You can read all about it in this official press release and in this San Diego Union-Tribune article, the accuracy of which KPBS general manager Doug Myrland vouched for this morning.

    (Some of us -- OK, mostly me -- have ranted about this show or its hosts before; see the collection of links at the end of this post.)

    I suppose it had to happen sometime. The show began in 1998 and was originally hosted by Richard Lederer and Charles Harrington Elster. Harrington Elster (or is it just Elster, or just Harrington? I never know with these multiple surnames) left the show in 2004 due to a contractual dispute, and was replaced by Martha Barnette. Lederer himself retired in 2006 and was replaced by Grant Barrett (who also maintains the American Dialect Society website). The show's producers have long been trying to get the show nationally syndicated, but apparently to no avail; the most they've been able to achieve is to podcast the show and to broadcast it on Wisconsin Public Radio, WFYI-FM in Indianapolis, and WFPL-FM in Louisville, Kentucky (Barnette came to San Diego from Louisville, which may partly explain the latter).

    Co-hosts Barnette & Barrett (there's something about that combo of names, isn't there?) are understandably disappointed by this decision, but from what I've been reading they are confident that the show will land on its feet somewhere. In the meantime, Barrett has put together a blog for the show (where they have already begun to field questions and comments about the cancellation; see also the comments here).

    In closing, I can't resist another jab at Barnette. In the comments section of this post, Barnette replies to a comment from a listener in which there are two glaring misspellings: sence for sense and preceeds for precedes. (The worst part about the second one is that precede is spelled correctly in the comment that Barnette is replying to -- or, as devout listeners of the show tend to prefer, to which Barnette is replying.)

    [ Comments? ]

    Posted by Eric Bakovic at 02:40 PM

    Plus ça change

    A New Yorker cartoon from the late 1940s, illustrating an earlier version of the "talkative women" meme (sorry, I don't have a more specific citation or the identity of the cartoonist):

    The cartoon shows six women talking, as opposed to one man -- but there are four other men waiting in line. So perhaps it's about female social dominance (six to one) rather than female talkativeness (merely six to five, with some other men having perhaps given up and moved on). Or maybe it's about expected conversation length.

    I have very little idea what big-city telephone-booth culture might have been like sixty years ago. I've never commuted via the sort of train station where such long rows of booths could once be found. And I suppose that younger people today are no more familiar with telephone booths than they are with swingletrees or coulters.

    But one thing that remains constant, I think, is that people are much more annoyed when they're inconvenienced by behavior that fits a group stereotype than by the same behavior without the group-stereotype association.

    [Update -- Marc Naimark writes:

    You missed a telling point in the cartoon: the men are all waiting at the phone booth occupied by the man. They know that even the last man in the line will reach an available phone faster in that booth rather than in any of the others occupied by women.

    Well, they believe that, anyway. I was obscurely alluding to that concept when I referred to "expected conversation length". Another obvious feature is that the women are all smiling and talking in an animated way, with body language evocative of engaged communication, while the man in the booth and the men in line are just standing there, with their backs to the viewer.]

    [Ben Zimmer writes:

    I'm reminded of the Simpsons episode where Apu comes with Homer and Marge to the Monstro Mart and advises them which check-out line to use:

    Apu: Let's go to...that line.
    Marge: But that's the longest.
    Apu: Yes, but look: all pathetic single men. Only cash, no chitchat.


    Posted by Mark Liberman at 06:43 AM

    August 01, 2007

    Cousin of eggcorn

    Over on the American Dialect Society mailing list, we've been looking at the verb troll, in meanings similar to the verb trawl, and in passing the Christmas carol "Deck the Halls" was mentioned (as irrelevant to the topic), because of its line

    Troll the ancient Yuletide carol

    Beverly Flanigan reported that she'd always heard the line with trill.  I immediately trotted out the relevant OED2 subentry (which had cites for this use of troll from the 16th century through 1977) and noted that {"trill the ancient Yuletide"} got only one Google webhit, while the version with "troll" got 3,590.  Trill is clearly a reshaping; although there's an OED entry for the singing sense of troll, few people these days are likely to have encountered this sense anywhere except in "Deck the Halls", so that it's not surprising that some people have altered the verb to something that's recognizably musical.

    What we have here is a cousin of the eggcorn, a misquotation that improves a line by replacing an archaic or rare word by a phonologically similar word that makes sense in context.

    (This is entirely beside the point, but I can't hear any part of "Deck the Halls" without calling up the Pogo version, "Deck Us All With Boston Charlie", in which the counterpart to the troll line above is

    Trolley Molly don't love Harold

    Now I've probably given many of you an earworm.)

    Here's the OED's entry for the transitive musical verb troll:

    10. a. trans. To sing (something) in the manner of a round or catch; to sing in a full, rolling voice; to chant merrily or jovially.

    ... Perh. originally fig. from 6 = to sing in succession, as a round or catch (each line being as it were passed on to the next singer).

    The speculation about its source refers to an entry for a now-obsolete sense of troll:

    6. trans. To cause to pass from one to another, hand round among the company present; esp. in phrase to troll the bowl.

    As far as I'm concerned, sense 10 is virtually obsolete itself.  I'm a bit surprised that more people haven't "fixed" the Christmas carol by shifting to trill.

    [Added: or, as Thomas Thurman points out to me, to toll, treating the carol like a bell.  I got 10 webhits for {"toll the ancient Yuletide"}, including this one, from Yahoo Answers, where one helpful poster explains that "it is toll... not troll and it means to tell over and over" and another that "you mean Toll the ancient yuletide carol. it means to say somthing over and over again".  Such responses are very much like the ones you often get from people defending garden-variety eggcorns.  (No hits on {"tell the ancient Yuletide"}, alas.  But Will Fitzgerald tells me that some people have gone one step further and replaced troll by sing; there's a "sing the ancient Yuletide carol" version by the Carpenters.)]

    [Whimsical addendum 8/2/07: My correspondents have been wondering if someone has yet taken the step to syntactic reanalysis, as "Troll, the ancient Yuletide carol", like "Olive, the other reindeer" (Andy Hollenbeck) and "Gladly, the cross-eyed bear" (Larry Horn).]

    In any case, eggcornesque misquotations are not uncommon, though there seems to be no generally accepted name for this specific type of misquotation.  Two well-known cases:

    "Once more unto the breach" altered to "into the breach"

    "All that glisters is not gold" altered to "glitters" or "glistens"

    "Unto" has a slight edge over "into" in the first quotation (13,000 to 96,900 raw webhits), but "glitters" has definitely won the day over "glisters" in the second (258,000 to 11,500, with "glistens" getting a mere 2,550).

    Eggcornesque misquotation can be seen as arising from yet another type of conflict between Faithfulness (in this case, preserve the wording of the original) and Well-Formedness (in this case, make the word choice appropriate to modern English), with Well-Formedness (WF) winning over Faithfulness (Faith) in the misquotation. 

    I first talked explicitly on Language Log about the conflict between the two principles Faith and WF in a discussion of the conventions of punctuation ("Dubious question marks"), with a side excursion into spelling conventions, in particular British Labour vs. American Labor.  A conflict arises when material printed according to one set of conventions is quoted in places where a different set of conventions is in force: Faith says to reproduce the original, WF says to convert it to follow the local conventions.

    As I said in that posting,

    The larger point -- the conflict between faithfulness and well-formedness in linguistic mention -- is a gigantic one.  I originally started a Language Log posting on the topic back during the discussion of taboo words in titles of books and movies, but it quickly bloated up horribly.

    Suppose you want to refer to Harry Frankfurt's 2005 book that was on the best-seller lists for many weeks.  (See the Language Log posting here, with links back to earlier Frankfurt-related postings.)  Faith says to cite it as On Bullshit, but depending on who you are and what context you're writing (or talking) in, local conventions of modesty (a species of WF) might tell you to avoid the taboo word in one way or another.

    More recently, I posted about Faith confronting WF in the spelling of English plurals.  What is the plural of the common noun ducky -- duckys (Faith) or duckies (WF)?  And what is the plural of the proper noun Germany -- Germanys (Faith) or Germanies (WF)?  Both versions occur (in both cases), and frequently.

    I'll have more to say about Faith vs. WF, with several new types of examples, in a while.  For the moment, I'll just point out that eggcornesque misquotations seem to illustrate another type of conflict between the principles.

    zwicky at-sign csli period stanford period  edu

    Posted by Arnold Zwicky at 02:42 PM

    How not to spot a liar

    Well over a year ago, Mark posted about the atomistic way some researchers use hi-tech to detect deception, including the MRI-scanner and voice stress analysis. There may be no end to the ways that law enforcement can misuse such information, especially when it is hawked by those who know only a little about it.

    I leave the technological research on this topic to Mark and to others who know it better than I do. I'm more familiar with the equally atomistic, easy-fix, instructional programs that a few former police officers and others are selling to law enforcement agencies these days. These techniques can be just as susceptible to misuse as the hi-tech ones because they're oversimplified, atomistic, under-researched and can be outright misleading. Law enforcement officers face a constant problem of trying to figure out when a suspect is lying and they constantly hope to find ways to do this. A machine would be nice, but so would a less technological approach. Some of the currently available programs are called SCAN, the Reid Technique, and Statement Analysis, which I described briefly in an earlier post.

    Clearly, it would be helpful if we could tell when the verbal and non-verbal behavior of people could give us the key to deceptive statements. Non-verbal signals got the first attention, loosely based on the very credible and promising research of Paul Ekman and his associates in San Francisco. One of his terms, "leakage," came into vogue very quickly, but it was far removed from Ekman's painstaking and complex research setting and it was translated in a grossly oversimplified manner by the commercial vendors of deception detection. They apparently relied on the theory that if Ekman's analysis of many simultaneous non-verbal clues found in many viewings of many videotapes of many subjects, including visual records from their heads to their feet, can offer some clues to deception, then one or two isolated clues by themselves, regardless of the suspect's culture, age, gender, or ethnicity, must enable the police to do the same thing, even in that emotional, instant setting of a police interrogation. For example, if suspects don't look the cop in the eye, they must be lying, despite the cultural basis that supports eye-aversion in known contexts.

    The most common commercial program for spotting liars is based on the Statement Analysis technique. Suspects write an account of what happened on the day of the event in question. This approach contains no non-verbal clues, of course, but its advocates claim that the following language features can help cops decide whether suspects are being untruthful:

    Making overly detailed statements
    Repeating oneself spontaneously
    Complicating the story unexpectedly
    Giving unusual details
    Providing marginally relevant details
    Giving related external associations
    Displaying subjectivity
    Correcting spontaneously
    Admitting memory loss
    Self-referencing excessively
    Manifesting verbosity
    Pausing excessively
    Using unnecessary connectors
    Using pronoun deviations such as "you" for "I"
    Producing disproportionate amounts of language in the prologue, central action, or epilogue portions of the narrative
    Providing low lexical diversity by means of type-token ratio

    Many questions can be asked about Statement Analysis. It has the advantage of getting a suspect's words on paper before the interrogator has a chance to foul things up. But it's based on little or no credible research that would indicate that these features are diagnostic and it's hard to know what is meant by "overly," "unexpectedly," "unusual," "marginally," "excessively," "unnecessary," "disproportionate," and "low."  These features come most frequently from police officers' reported past experiences about what worked for them in the interviewing process.
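    For readers who want to see what the last feature on the list involves, here is a minimal sketch of a type-token ratio: the number of distinct word forms (types) divided by the total number of words (tokens). The function name, the tokenizing rule, and the sample sentence are illustrative assumptions, not anything taken from the Statement Analysis materials.

```python
import re

def type_token_ratio(text):
    """Return distinct word forms (types) divided by total words (tokens)."""
    # Assumption: a token is a lowercased run of letters or apostrophes.
    # Other tokenizing rules would give somewhat different ratios.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(tokens)) / len(tokens)

# Hypothetical sample statement: 12 tokens, 8 distinct types.
sample = "I went to the store and then I went to the bank"
print(round(type_token_ratio(sample), 2))  # prints 0.67
```

    The ratio generally drops as a text gets longer, because common words keep repeating, so a "low" value means little unless it is compared with texts of similar length.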

    I've heard this approach praised at academic meetings and I've read books and papers on it (are these overly detailed or unusual details?). I guess I'm being overly detailed here (am I spontaneously repeating marginally relevant details?). I wasn't impressed by the papers I've heard (am I displaying subjectivity from my external associations?). I may have forgotten to mention that I may not have been impressed by the speeches either (am I admitting memory loss, correcting spontaneously, and hedging a bit?). Have you noticed how many times I self-referenced here ... and paused ... and used unnecessary connectors ... and produced a disproportionate amount of language in my prologue? I'll let you perform a type-token analysis of my lexical diversity.

    All I can say is that I must be lying.

    Posted by Roger Shuy at 01:31 PM

    A bulletin from the Language Log Early Warning Center

    A perfect storm of linguistic misinformation is brewing. Anti-passive prejudice has merged with the mind-bending power of brain-talk, powered by careless and credulous representatives of the fourth estate, and it's coming your way.

    According to Glenn Abel at Write for Blogs ("Get an active workout", 7/6/2007):

    I worked with a crazy man long ago who counseled me to go through my stories and eliminate all passive verbs. I took the advice and it worked like crazy. [...]

    Now comes scrientific [sic] evidence suggesting that people's brains respond to active verbs by sending signals to the appropriate body part.

    Abel got his "scrientific evidence" from Chip Scanlan, "Brain Science for Writers: Active verbs move nerve cells, too", Poynter Online, 6/28/2007:

    Use active verbs.

    It's a prescription for writing success long promoted by writers and teachers. [...]

    Earlier this week, while researching another story, I discovered a fascinating report that put this time-honored writing technique under the gaze of science.

    "For more than 60 years," a 2004 story in Science News Online reported, "scientists have known that a strip of neural tissue that runs ear-to-ear along the brain's surface orchestrates most voluntary movement, from raising a fork to kicking a ball."

    It turns out this part of the brain also fires up when people silently read certain words, scientists reported in the journal Neuron.

    "They have to be action words -- active verbs," the Science News Online story said, a conclusion certain to cheer wordsmiths.

    Scanlan in turn unearthed his information from Bruce Bower, "The Brain's Word Act: Reading verbs revs up motor cortex areas", Science News, 2/7/2004:

    For more than 60 years, scientists have known that a strip of neural tissue that runs ear-to-ear along the brain's surface orchestrates most voluntary movement, from raising a fork to kicking a ball. A new brain-imaging study has revealed that parts of this so-called motor cortex also respond vigorously as people do nothing more than silently read words.

    Not just any words get those neurons going, however. They have to be action words -- active verbs.

    OK, let's go to the sources. (And you know what we're going to find...)

    That would be two articles in Neuron -- first an editorial and then the research report. The editorial is Victor de Lafuente and Ranulfo Romo, "Language Abilities of Motor Cortex", Neuron 41(2) 178-180, 2004:

    Understanding the meaning of words that relate to a motor action, such as "dance," may need more than the well-known language areas of Broca and Wernicke in the left hemisphere of the brain. In this issue of Neuron, Hauk et al. (2004) report the surprising discovery that the mere reading of action-related words also activates the motor homunculus -- a cortical region of the brain that controls voluntary movements of our different body parts. Remarkably, just the reading of feet-related action words such as "dance" makes this motor homunculus move its feet.


    The results of Hauk et al. (2004) suggest that the cortical network supporting language is not localized in single areas but may involve widely distributed areas, differentially activated according to the semantic content of the word. As we have seen, the process of getting the meaning of a word engages the premotor and primary motor areas. An important question, however, is yet to be answered: is the reading task activating the motor areas just because an action and its name commonly cooccur in time (i.e., they are temporally associated), or is this a functional relation in which the motor areas play an active role in comprehension?

    Interesting stuff indeed -- but you might notice that there is nothing there about "active verbs" -- instead, we're told about "words that relate to a motor action", a category that can (for example) include nouns for tools, nouns for motor actions, and verbs for motor actions in the passive as well as active voice.

    The primary research report is Olaf Hauk, Ingrid Johnsrude and Friedemann Pulvermüller, "Somatotopic Representation of Action Words in Human Motor and Premotor Cortex", Neuron 41(2) 301-307, 2004.

    The strings "active verb" and "passive verb" do not occur in this article.

    The lexeme "verb" occurs only once, in the literature review section:

    When hemodynamic and neurophysiological imaging studies compared words referring to objects with words that have a clear semantic relationship to actions, typically action verbs ... or nouns referring to tools [...], the latter elicited strong frontal activation including premotor cortex, suggesting that the frontal activation might reflect aspects of the action-related meaning of action words [...]. If so, the cortical locus of meaning processing could be, in part, determined by the general neuroscientific principle of Hebbian learning according to which neuronal correlation is mapped onto connection strength [...]. [emphasis added]

    The word "passive" occurs several times, but only in the phrases "passive reading task" and "passive word reading", which refer to the experimental paradigm in which subjects silently read visually-presented words, without taking any action. An example:

    Here we use event-related fMRI to show that action words referring to face, arm, or leg actions (e.g., to lick, pick, or kick), when presented in a passive reading task, differentially activated areas along the motor strip that either were directly adjacent to or overlapped with areas activated by actual movement of the tongue, fingers, or feet.

    The list of stimuli is not given, but we can infer from the examples given that most of the words were ambiguously verbs or nouns (e.g. dance, lick, kick). Some may well have been adjectives. Since the words were presented in isolation, and in their base form, the study did not in any way test whether the response to active verbs was different from the response to passive verbs, or from nouns or adjectives for that matter. All the stimuli were "action words" of ambiguous lexical category, and the only independent variable manipulated was the body part associated with the action in question.

    The whole misunderstanding started because Bower took "action words" to mean "active verbs".

    We'll keep an eye on this one -- my prediction is that it will gradually become part of the standard toolkit of misinformation about writing. The process may be slower because the misinterpreted research report is three years old. But hell, recent popular misinformation about the "emerging science of gender differences" was largely based on research that never took place at all!

    Reading about science in the popular press (or in meta-journalistic sources like Poynter) can be depressing, if you're laboring under the misapprehension that the goal is to understand and evaluate research, and to explain things to the public in a clear and interesting way. From this perspective, what you usually see is a process of progressive misunderstanding, distortion and exaggeration -- and you might conclude that science journalists are too lazy to read the original research reports, or too stupid to understand them, or too cynical and manipulative to care whether their stories bear any particular relationship to the truth.

    But this misses the point, which is not the provision of information, but rather moral uplift and reinforcement of cultural norms.

    Posted by Mark Liberman at 10:16 AM