Language Log: July 2006 Archives

July 31, 2006

Automatic asterisking

My mail box overflows with information about taboo language in the world of rock music, including several suggestions that the odd awarding of asterisks in iTunes music store listings is the result of a script that does the asterisking automatically, and therefore can't block words that have significant non-taboo uses. I resisted this idea at first -- it seemed so, well, stupid, scarcely an effective way of protecting young minds from dangerous content -- but some research shows that this is almost surely so. Here's the system, as it was working today:

1. The search software accepts a taboo word as input, and finds everything with that word in it (yes, there are some occurrences that escape asterisking, as I'll demonstrate below), AND everything with the asterisked version of the word in it. So a search on cock pulls up occurrences of "cock" and also "c**k". The search ignores case ("dick" and "Dick" are the same word), apostrophes, and hyphens. And deals only with whole words, so fuck or fucks won't get you the band The Crucifucks. (Joe Salmons and Monica Macaulay, in their 1988/89 Maledicta piece "Offensive rock band names: A linguistic taxonomy", awarded the prize to Crucifucks. The band is still in print, I see.)

2. Asterisking preserves the number of letters in the word, and the first and last letters. So:

f**k, f***s, f****d, f*****g, f****n, f****n'

3. Asterisking affects the song titles and the album titles, and with (so far as I can tell) perfect consistency -- but, bizarrely, NEVER affects the name of the artist(s). So we get:

Artists: Cock Robin, Cock & Pussy, Goblin Cock, Cock Lorge. Artist: Big Cock. Album by them: "C**k Rock". Song on this album: "Year of the C**k".

Artist: Anal Cunt. Artist: Mary's Cunt. Their song: "Mary's C**t -- Pitbull Pete".

Artist: Holy Fuck. Album by them: "Holy F**k".

Artist: Crazy Penis. Artist: Penis Flytrap. Song by them: "P***s Flytrap | Wait".

Artist: The Piss Drunks. Artist: Piss Ant. Album by them: "P**s Off".

Artist: Nashville Pussy. Album by them: "Let Them Eat P***y".

Artist: Suck It To Ya. Album by Boris the Sprinkler: "S**k". Song by Dude Offline: "You S**k, You Drank My Beer!"

No human intelligence is applied in any of this. "Piss" is asterisked even when it doesn't refer to urine or urinating. Non-fellatial "suck" is asterisked. "Balls" is not asterisked, even in the song title "Blue Balls".

4. The obvious miscreants (and their variants) are asterisked; these words count, in some people's minds anyway, as taboo vocabulary across the board:

asshole, bitch, cocksucker, cunt, fuck, motherfucker, nigga, nigger, piss, shit

Some items get on the list because a high percentage of their uses are sexual:

ass, clit, cock, jism, masturbate, penis, prick, pussy, slut, suck, tit(s), vagina, whore

On the other hand, words that have significant non-sexual uses aren't asterisked:

balls, butt, dick, ho, hoe, jack (off), jerk (off), nuts

Nor are a few words that must have seemed to the iTunes folks to be technical or medical:

anal, anus, phallus, semen, testicles

There's a fine line here between these words that make it through the filter and "masturbate", "penis", and "vagina", which don't. (Another bit of oddness is the fact that "rape" escapes the iTunes filter.)

One wonderful result of all of this is Lil' Kim's track listed in iTunes as "S**k My Dick".
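The search behavior described in point 1 can be sketched in a few lines of Python. This is only a guess at the logic, not Apple's code: the function names are invented for illustration, and treating "ignores apostrophes and hyphens" as simply deleting those characters is an assumption.

```python
import re

def mask(word):
    """Star out everything but the first and last letter."""
    if len(word) < 3:
        return word
    return word[0] + "*" * (len(word) - 2) + word[-1]

def normalize(text):
    """Lower-case the text and drop apostrophes and hyphens (assumed behavior)."""
    return re.sub(r"['-]", "", text.lower())

def matches(query, title):
    """True if the query, plain or asterisked, occurs in the title as a whole word."""
    q = normalize(query)
    targets = {q, mask(q)}
    # Split into whole words, keeping asterisks so that "c**k" stays one word.
    words = re.findall(r"[a-z*]+", normalize(title))
    return any(w in targets for w in words)
```

On this sketch, matches("cock", "Year of the C**k") and matches("Dick", "Dick of Death") both succeed, while matches("fuck", "The Crucifucks") fails, because only whole words are compared.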

[Final disclaimer: I make no claim to completeness in these lists. Update 8/1/06: Though still scratching my head about a world in which "masturbate" is a dirty word, but "rape" is not, I've checked out some more vocabulary. A nice minimal pair: "blow job" is ok, because each word on its own is ok, but "blowjob" turns into "b*****b". "Bullshit" is, of course, out. "Turd" is out, but "crap", "poop", and "fart" are ok. And "fag" and "faggot" are out, but "gay", "queer", "homo", and "dyke" are ok.]
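For the curious, here is a minimal Python sketch of the asterisking itself, as far as it can be inferred from the listings above: keep the first and last letters, star the rest, work word by word, and leave artist names alone. The word list is a small hypothetical sample drawn from this post, and the function names are made up; nothing here is known to be how the iTunes script actually works.

```python
import re

# Hypothetical sample of the filtered vocabulary, taken from words reported above.
TABOO = {"ass", "blowjob", "cock", "cocksucker", "fuck", "fuckin", "piss", "suck"}

def mask_word(word):
    """Keep the first and last letters and the length; star the rest."""
    if len(word) < 3:
        return word
    return word[0] + "*" * (len(word) - 2) + word[-1]

def mask_title(title, taboo=TABOO):
    """Mask each whole word that is on the list, ignoring case."""
    def repl(match):
        word = match.group(0)
        return mask_word(word) if word.lower() in taboo else word
    # Only runs of letters count as words, so a trailing apostrophe survives.
    return re.sub(r"[A-Za-z]+", repl, title)

def display(artist, album, song):
    """Titles are filtered; the artist name passes through untouched."""
    return artist, mask_title(album), mask_title(song)
```

Because the lookup is by whole word, mask_title("blow job") comes back unchanged while mask_title("blowjob") gives "b*****b", and mask_title("Suck My Dick") gives "S**k My Dick", since "dick" is not on the list.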

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:31 PM

Virtues, pleasures and myths

Geoff Pullum's note on F.A.N.B.O.Y.S. — a mnemonic acronym for the (alleged) English "coordinating conjunctions" for, and, nor, but, or, yet, & so — reminded me of the list at the end of the Ladies' and Gentlemen's Guide to Modern English Usage by James Thurber:

You might say: "There is, then, no hard and fast rule?" ("was then" would be better, since "then" refers to what is past). You might better say (or have said): "There was then (or is now) no hard and fast rule?" Only this, that it is better to use "whom" when in doubt, and even better to re-word the statement, and leave out all the relative pronouns, except ad, ante, con, in, inter, ob, post, prae, pro, sub, and super.

I expect that most people who learned Latin the old-fashioned way will recognize that list, and may even remember what it's a list of. (It's prepositions, not relative pronouns -- and similarly, the FANBOYS list is a mixed bag of central coordinators, unusual coordinators, marginal coordinators, and adverbs commonly used as connective adjuncts.) Here's the source of Thurber's list, in Allen & Greenough's New Latin Grammar for Schools and Colleges chapter 370:

370. Many verbs compounded with ad, ante, con, in, inter, ob, post, prae, prō, sub, super, and some with circum, admit the Dative of the indirect object:—
1. “neque enim adsentior eīs” (Lael. 13), for I do not agree with them.
2. “quantum nātūra hominis pecudibus antecēdit” (Off. 1.105), so far as man's nature is superior to brutes.
3. sī sibi ipse cōnsentit (id. 1.5), if he is in accord with himself.
4. “virtūtēs semper voluptātibus inhaerent” (Fin. 1.68), virtues are always connected with pleasures.
5. omnibus negōtiīs nōn interfuit sōlum sed praefuit (id. 1.6), he not only had a hand in all matters, but took the lead in them.
6. “tempestātī obsequī artis est” (Fam. 1.9.21), it is a point of skill to yield to the weather.
7. “nec umquam succumbet inimīcīs” (Deiot. 36), and he will never yield to his foes.
8. “cum et Brūtus cuilibet ducum praeferendus vidērētur et Vatīnius nūllī nōn esset postferendus” (Vell. 2.69), since Brutus seemed worthy of being put before any of the generals and Vatinius deserved to be put after all of them.

... and so on. (Emphasis added.)

I remember this list because James Thurber and I were taught Latin by methods that rewarded us for memorizing such things. Thurber seems to have had mixed emotions about the experience, and so do I.

Here's how it worked. For each class meeting, we were expected to prepare the next chunk of Caesar, Cicero, Virgil, Tacitus or whatever we happened to be struggling through. Then students would be chosen at the teacher's whim, one after another, to read the passage, sentence by sentence, and translate it. After the translation of each sentence came a grammatical interrogation.

For example, if we were reading the part of Cicero's De Finibus Bonorum et Malorum where the business about virtues being connected with pleasures comes from, someone would have to stand up to recite and construe I.68:

Quocirca eodem modo sapiens erit affectus erga amicum, quo in se ipsum, quosque labores propter suam voluptatem susciperet, eosdem suscipiet propter amici voluptatem. quaeque de virtutibus dicta sunt, quem ad modum eae semper voluptatibus inhaererent, eadem de amicitia dicenda sunt. praeclare enim Epicurus his paene verbis: 'Eadem', inquit, 'scientia confirmavit animum, ne quod aut sempiternum aut diuturnum timeret malum, quae perspexit in hoc ipso vitae spatio amicitiae praesidium esse firmissimum.'

Hence the wise will feel the same way about their friends as they do about themselves. They would undertake the same effort to secure their friends' pleasure as to secure their own. And what has been said about the inextricable link between the virtues and pleasure is equally applicable to friendship and pleasure. Epicurus famously put it in pretty much the following words: "The same doctrine that gave our hearts the strength to have no fear of ever-lasting or long-lasting evil, also identified friendship as our firmest protector in the short span of our life." [Translation by Raphael Woolf, published as "On Moral Ends", 2004]

Usually, the teacher would not directly criticize the English version, which the more diligent and less facile students might have memorized from one of the forbidden interlinear translations that we called "trots". (In the UK, I understand that these were called "cribs".) Instead, the teacher would probe the student's understanding of the structure.

For example, he might ask you to "tell us about inhaererent." In response, he'd want the principal parts (inhaereo, inhaesi, inhaesum) and the form in this instance (imperfect subjunctive active 3rd person plural). Then he might ask "why should it be in the subjunctive?", "what is its subject?", etc. Eventually he'd get to voluptatibus. It's the plural ablative or dative of voluptas, sir. Well, make up your mind, which one is it? Um, ablative, I guess. Really? Can you tell us why? Um, uh, ablative of specification, sir. Nice to see that you've finally learned some grammatical terminology, Liberman, but that's not the answer we're looking for here. Anyone else?

And some obnoxious swot who has memorized the whole grammar book (I won't name names) pipes up smugly from the back row: "Sir, verbs compounded with ad, ante, con, in, inter, ob, post, prae, pro, sub, and super admit the Dative of the indirect object." Dative of the indirect weenie, if you ask me.

Now the trick here is that Latin noun forms in -ibus are generally ambiguous between the dative plural and the ablative plural. So to name the case, in any given example, you need to decide what the construction is, and deduce from that whether the dative or the ablative would have been used, if a word had been chosen where you could tell the difference.

The result is an elaborate functional taxonomy of the Latin case system -- the Datives of Agency, of Reference, of Purpose or End, of Service, of Fitness, the Ethical Dative, etc. -- which gave me endless trouble when I was a student. For a long time, I thought this was because I skipped first-year Latin, and was thrust directly into construing Caesar's Gallic Wars, sink or swim. (Similarly, my father, who was color blind, thought for years that he must have been out sick when they taught about colors in school.) And when I tried to memorize the terminology, I always got distracted by the examples. (Did Cicero really say that virtues are always connected with pleasures? No, alas, it turns out that he puts these words into the mouth of Torquatus, and argues against them. Cicero was no Epicurean.)

Content aside, many of the examples struck me as subject to more than one interpretation, falling in the cracks between the case-taxonomy categories. And in fact, the role of voluptatibus in the line adapted from Cicero, virtutes semper voluptatibus inhaerent, is a case (so to speak) in point. Allen & Greenough might cite it as an example of how "verbs compounded with ad, ante, con, in, ... admit the Dative of the indirect object", and Lewis & Short's entry for inhaereo might quote the same (modified) quote in a list of examples of inhaereo with the dative. But Allen & Greenough add that:

In these cases the dative depends not on the preposition, but on the compound verb in its acquired meaning. ... The construction of § 370 is not different in its nature from that of §§ 362, 366, and 367; but the compound verbs make a convenient group.

Convenient for them, maybe. Tracking down the cross-references, we learn that:

362. The Dative of the Indirect Object with the Accusative of the Direct may be used with any transitive verb whose meaning allows (see § 274).
366. The Dative of the Indirect Object may be used with any Intransitive verb whose meaning allows.
367. Many verbs signifying to favor, help, please, trust, and their contraries; also to believe, persuade, command, obey, serve, resist, envy, threaten, pardon, and spare, take the Dative.

It's not clear to me whether inhaerent in this phrase is really an example of an intransitive verb whose "meaning allows" the Dative of the indirect object -- there's no (literal or metaphorical) transfer of substance from virtues to pleasures, for instance. And Lewis & Short start the entry for inhaereo with a bunch of examples where it takes the ablative case:

I.(a). With abl.: sidera suis sedibus inhaerent, Cic. Univ. 10 : animi, qui corporibus non inhaerent, id. Div. 1, 50, 114 : visceribus, id. Tusc. 2, 8, 20 : constantior quam nova collibus arbor, Hor. Epod. 12, 20 : occupati regni finibus, Vell. 2, 129, 3 : prioribus vestigiis, i. e. continues in his former path, Col. 9, 8, 10 : cervice, Ov. M. 11, 403 .

Their second example of inhaereo taking the ablative (animi, qui corporibus non inhaerent = "souls which aren't connected with bodies") strikes me as similar in meaning to the example we started with (virtutes semper voluptatibus inhaerent = "virtues are always connected with pleasures").

Well, if I'd brought all this up in Latin class, Mr. Mansur would have accused me of being a Philadelphia lawyer, and told me to get on with Cicero or sit down. And you're no doubt reacting in a similar way.

I do have a point, though. Brett Reynolds' post on FANBOYS observes that lists and hierarchies of this kind are "myths" that "[give] the faithful a comfortingly simple handhold in a confusing world". I'm sure that this is true -- but such myths can be confusing and even disturbing, not comforting, for those who think about them too seriously. In secondary-school Latin, I was torn between believing that the whole grammatical apparatus was a well-founded logical structure that I might grasp some day, if I applied myself, and seeing it as a mass of half-digested confusion, as the grammarians themselves sometimes seemed to admit:

As the Romans had no such categories as we make, it is impossible to classify all uses of the ablative. The ablative of specification (originally instrumental) is closely akin to that of manner, and shows some resemblance to means and cause. [Allen & Greenough, 418.a.]

It came as a breath of fresh air when I took a linguistics course in college, and learned that modern grammarians offer systematic arguments for their assumptions, categories and analyses, and that they sometimes admit that they are wrong. Well, they more often admit that their colleagues are wrong, but collectively it comes to the same thing.

Posted by Mark Liberman at 10:32 AM

July 30, 2006

On the taboo watch: The rock report

We're staying on the alert here at Language Log Plaza for the way taboo vocabulary is deployed, or avoided, in various settings, from the prim pages of the New York Times to the maximally immodest covers of gay porn magazines. Today we're enjoying a musical interlude, a brief look at taboo language in rock music.

Language issues come up in four places: in lyrics, in song titles, in album titles, and in band names. Song titles, album titles, and band names present some of the same problems as porn magazine covers, since they will be displayed in public, on album covers and on play lists. (All four are on display at concerts, of course.) Rock music is sturdily defiant, so you see people pushing the language line pretty hard in all four places, and eventually in visual materials as well (most famously, with the poster for the Dead Kennedys' album "Frankenchrist").

I'm hoping that somebody's done a more thorough study of taboo topics and language in popular music, because there's a whole lot there and I'm just giving a tiny sample here. [Update 7/31/06: Greg Stump points us to the Sonic Breakdown entry for fuck, where the section "Notable fuck bands" gives a capsule history of this particular word in music.] There's a long pre-rock history of suggestive lyrics: "It ain't the meat, it's the motion", "She got pinched in the As / tor Bar", all those versions of Cole Porter's "Let's Do It", and much much more. Eventually the fuck hits the rock fan, obscenity laws are challenged, ratings systems are proposed, and Tipper Gore enters the arena. Rap/hip-hop has been a scene of obscenity contention for twenty years now, with 2 Live Crew as the most famous early offenders.

Along the way we have people choosing band names that are right up against the line -- the Butthole Surfers -- and then over it, as with this Toronto band, as described by a fan:

Hell Yeah Fuck Yeah - Toronto's best, doing their worst! Toronto rockers HELL YEAH FUCK YEAH are set to unleash their furious brand of high voltage, punk-infused rock & roll on the unsuspecting masses. A band well versed in bullshit and disappointment, H.Y.F.Y. have individually left their mark on the world through highly successful past endeavors including; Project Wyze, Damn 13, Constable Brennan and Canadian punk legends The Almighty Trigger Happy. Despite only forming recently, this pack of rabble rousers have already left a huge impression on the people who have been fortunate enough to see them rip through their fast paced, high energy live show. Highlights include a sold out show at Toronto's legendary Bovine Sex Club, where H.Y.F.Y.'s massive buzz left an enormous line of disappointed fans out in the cold, lined up down the block. A must see? Hell yeah, fuck yeah!! Toronto's worst doin their best! HELL YEAH FUCK YEAH!

[Thanks to Tom Limoncelli for the pointer to the band.] You'll note the initialism "H.Y.F.Y." -- usually given as "HYFY" -- which provides a way to refer to the band without being officially obscene. [Update 7/31/06: A number of correspondents have now nominated the band Anal Cunt for the Bad Taste Palm.]

Album titles follow the same arc, with Gene Simmons's 2004 "Asshole" a recent entrant in the deliberate-offense sweepstakes. (Well, it's Gene Fuckin' Simmons.)

Then there are the song titles. Sometimes songs with taboo lyrics are given neutral titles: Nine Inch Nails's "Closer" and Pansy Division's "Anthem", for instance. This works for public display, but fans often refer to these two songs via their central lines anyway, as "I want to fuck you like an animal" and "We're the buttfuckers of rock and roll", respectively, from:

I want to fuck you like an animal
I want to feel you from the inside

We're the buttfuckers of rock and roll
We wanna sock it to your hole

Pansy Division has also been known to take the route of avoidance by initialism, as in their song "C.S.F.", a gay male re-working of the defiant anthem "Colored Spade" from Hair, which begins:

I'm a cocksucking faggot, a flaming faggot
A fuck bunny, fruitcake, cum superdeli, homo

But mostly Pansy Division just puts those words right out there in the song titles -- "Fuck Buddy", "Cocksucker Club", "Political Asshole", "He Whipped My Ass in Tennis, Then I Fucked His Ass in Bed" -- and prints them on their albums, presenting a problem for Apple's modest iTunes music store. The iTunes store asterisks out the usual suspects: "The C********r Club" (yes, eight asterisks, all in a row), "Political A*****e", "...F****d His...". Remarkably, iTunes also avoids "ass" ("Two Way A*s" and "He Whipped My A*s in Tennis...") and even "slut" ("I'm Gonna Be a S**t"). You can see that their asterisking is systematic: preserve only the first and last letters. But their choice of words to asterisk is puzzling; "ass" and "slut" are out, but "dick" gets by (in "Dick of Death"), and so does "jack off" (in Prince's "Jack U Off", covered by Pansy Division). Once again, "dick" and "cock" seem to be on different sides of the offense boundary.

As for the words, sometimes they seem to be there primarily as a gesture of defiance (as in "C.S.F.") or insult [Update 7/31/06: A Bad Taste Palm to Jimi Lalumia and the Psychotic Frogs, for their thoroughly nasty cover of "Eleanor Rigby", with the refrain line "All you fuckin' people".], but in some cases they're pretty much intrinsic to the content. A lot of rap/hip-hop is about sex, and virtually all of Pansy Division's energetic and cheery songs are, and both naturally use everyday vocabulary for talking about sex. HYFY.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:00 PM

US Continues to Fire Gay Language Specialists

The US military continues to fire badly needed language specialists, including specialists in Arabic, because they are gay. The latest case, described in this news report, indicates that they are actually becoming more aggressive and not abiding by their stated policy of "don't ask, don't tell". Arabic specialist Bleu Copas was dismissed after an eight-month investigation triggered by anonymous emails. Apparently the "War on Terror" justifies patently illegal warrantless wiretapping but not easing up on the war on homosexuals in order to retain people with badly needed skills.

Posted by Bill Poser at 01:29 PM

July 29, 2006

Summer homes and strawberry rhubarb pie

An article about Art Buchwald in the garden section (of all places) of the New York Times (here) describes his love for his "summer home." The rich and famous in large Eastern cities seem to have one of these homes to use as a retreat from work and the hustle-bustle of urban life. It's also a place where friends and relatives can visit. Even though he's now 80 and dying, Buchwald loves getting company. People like Walter Cronkite, Dave Barry and Carly Simon drop by regularly. He chose this summer home as the place to spend the waning days of his life. And he's still writing his syndicated columns. The article made me think about the "summer homes" of linguists and other academics.

Sometimes our salaries don't allow us the luxury of even one home, much less two. But we can use our summer break time to give lectures or teach courses in other places, attend conferences, and mostly do our research and writing--because we love what we do. Some pick nice places to do their research, like the mountains of Montana, where my fellow Language Logger, Sally Thomason, researches the Salish language in the Flathead. Before I moved to Montana I used to spend parts of my summers here, my wife's home state, and I fell in love with this place. When I retired from teaching at Georgetown, we moved here and, like Buchwald, we made it our permanent "summer home." I now do my writing and research here, far from the madding crowd (and any loose gerunds that happen to be running around).

But it was the last quote in this article about Buchwald that hit me between the eyes. His housekeeper broke into his conversation with the interviewer and asked if he'd like some strawberry rhubarb pie:

"Later on today I will probably have some," Mr. Buchwald says. "But at this moment in my life I am so happy I can't do anything different."

He was where he wanted to be, doing what he wanted to do. How many people can say this? But that's what loving your work and doing it in the place you love can do for you. From all I can tell, most linguists dearly love their work and even do it in their spare time. They may not always love living where their jobs are, but they can usually find "summer homes" of sorts to help take care of that need. And they can move to those places when they retire, getting the best out of both work and home. When this happens, like Buchwald, they can be so happy that they can't do anything different. And they can have their strawberry rhubarb pie whenever they want it.

Posted by Roger Shuy at 05:57 PM

Roses of Mohammad for breakfast, elastic loaves for lunch...

Back in January, the edict came down in Tehran that Danish pastries (shirini danmarki) should henceforth be called "roses of Mohammad" (gul-e-muhammadi). At the time, this seemed to be a specific reaction to the Danish cartoon controversy; but it seems that the Farhangestan -- Iran's equivalent of the Académie Française -- has a much longer hit list, including pizza (now to become "elastic loaf") and helicopter (now to be "rotating wing"), and Mahmoud Ahmadinejad has decreed that the changes should be mandatory for official documents, schoolbooks and newspapers. The Reuters article observes that "[t]he words created by the Farhangestan as replacements to European loan words often sound cumbersome or comic to Iranians". None of the English-language news articles so far give the recommended Persian (or if you prefer, Farsi) circumlocutions.

My favorite example of this genre is the attempt some years ago, by one of the Francophone language-policing bodies, to replace "bulldozer" with "tracteur à lame horizontale" (= "horizontal-bladed tractor"). This was so completely unsuccessful that the proposed replacement now has a Google count of zero.

[Hat tip: Gwynn Dujardin.]

[Update -- Abnu at Wordlab pointed me to a paper by Ebrahim Monajemi, "Can ethnic and minority languages survive in the context of global development?", which gives some further details about the Farhangestan organization:

One of the academic cultural centers which tries to keep the Persian language free of alien words is the Academic Center of Persian language and literature or the department of Farhangestan-e-Zaban Va Adab Farsi. This organization has the duty to coin or adapt new words for the non-Persian ones. It consists of 25 Persian language experts and professors who are the final decision-makers. There are several specialized sub-departments such as Engineering, Medicine, Agriculture, Transportation, Military, Economic, and so on, that cooperate with the Center.

Farhangestan-e-Zaban VA Adab Farsi is responsible for coining the new Persian words against New Latin ones using by people or may be used in future. This organization follows the below procedures to coin or adapt appropriate words for the Latin ones. Its main policy is as follows (Farhangestan-e- Zaban-2001):

1. In coining and choosing a new word, Persian phonetic rules and learned speakers’ way of talking and Islamic points of views should be regarded as criterion.
2. Phonetic rules should be obeyed according the Persian way of talking.
3. New words that are found or created should follow the Persian grammatical rules for coining nouns, adjectives, verbs and so on.
4. New words should be chosen or coined out of the most common or frequent words that have been used since 250 AD.
5. New words can be chosen from among the most frequent and common Arabic words, as they are used in Persian.
6. New words can be chosen out of the middle and Old Persian languages
7. There should be only one equivalent in Persian for any of the Latin ones, particularly for technical words.
8. It is not so much necessary to adapt or create new Persian words for those Latin words which have been used internationally and globally.

It's not clear to me why "pizza" wouldn't count as one of "those Latin words which have been used internationally and globally".

Anyhow, searching for {Farhangestan Zaban} turns up quite a bit of other interesting information, including http://www.persianacademy.ir. ]

Posted by Mark Liberman at 02:14 PM

They sure is

Family get-togethers can be tough for little ones -- they want to stay up and have fun with everybody else, but the adults keep trying to calm them down and get them to go to bed. Last night we set up sleeping spots for our niece (2 1/2) and nephew (6 1/2) in the living room, turned down the lights and put a movie on to keep them entertained. Their father says to them: "Lie down and watch the movie." To which my niece replies: "We're am!"

Posted by Eric Bakovic at 01:22 PM

F A N B O Y S ?

Brett Reynolds in the inaugural post of his new blog comments on something that baffled him when he first began college teaching: FANBOYS. I hadn't heard of this before either. FANBOYS is nothing to do with fangirls. Says Brett:

The first time I walked into our writing centre, I noticed that FANBOYS was pasted in large letters across one wall. While many readers may be familiar with FANBOYS, I'd never heard of them, but according to many freshman writing textbooks, FANBOYS is a mnemonic for the co-ordinating conjunctions in English (for, and, nor, but, or, yet, & so).

This is supposed to be a list of words that pattern alike. (Check it out. They do not.) Much of what traditional grammar says about the purported "co-ordinating conjunctions" is a mess, like what it says about the pseudo-class of "conjunctions" generally; The Cambridge Grammar tries to straighten this out. Brett explains some of the more complex reality very nicely, and he also understands what makes an easily memorized oversimplification so seductive: "it gives the faithful a comfortingly simple handhold in a confusing world." It does indeed. A lot of style and grammar guide authors must look at a list of desiderata such as (1) simple, (2) memorizable, and (3) accurate, and think to themselves, two out of three isn't bad.

Posted by Geoffrey K. Pullum at 01:16 PM

James J. Kilpatrick, grammarian

A couple of weeks ago, James J. Kilpatrick opened his column with an astonishing piece of linguistic misanalysis (July 16, 2006, "Even a little ambiguity"):

This was a headline in USA Today on April 28: “Mass Transit Not an Option for All Drivers.”

Did you wince? Roll your eyes? Did you groan? Then you have the soul of a grammarian, and will go to heaven when you die…. There you will lecture the seraphim on the distinction between “all” and “not all,” and you will explain to them that if mass transit is not an option for “all” drivers, it cannot be an option for even one driver.

Neal Whitman at Literal Minded ("If It’s Not for Everyone, It’s Not for Anyone" 7/21/2006) called Kilpatrick to account. As any linguist would, Neal explained the offending ambiguity in terms of the relative semantic scope of the negative not and the quantifier all -- in heavy English rather than predicate calculus, it's the difference between "it's not the case that mass transit is an option for all drivers" and "for all drivers, it's not the case that mass transit is an option."
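For readers who would rather have the predicate calculus than the heavy English, the two readings can be written roughly as follows. The predicate letters are labels chosen here for illustration, not Neal's notation: D(x) for "x is a driver" and O(x) for "mass transit is an option for x".

```latex
\begin{align*}
\text{negation over quantifier:}\quad & \neg\,\forall x\,\bigl(D(x) \rightarrow O(x)\bigr) \\
\text{quantifier over negation:}\quad & \forall x\,\bigl(D(x) \rightarrow \neg\,O(x)\bigr)
\end{align*}
```

The first formula is equivalent to $\exists x\,(D(x) \wedge \neg O(x))$: at least one driver lacks the option, which is what the headline writer meant. The second says that no driver has the option, which is the reading Kilpatrick insists on.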

And as any sensible speaker of English would, Neal observed that Kilpatrick is full of it when he asserts that "if mass transit is not an option for 'all' drivers, it cannot be an option for even one driver".

Neal's evidence was his own linguistic intuition:

[I]n Mass transit [is] not an option for everyone, the most natural reading for me ... is the one the headline writer intended, the one with the wide-scoping negation...

My intuition agrees with Neal's, and we could cite evidence from published studies to the effect that most native speakers of English agree with the two of us rather than with Kilpatrick. But instead, since Kilpatrick is an authoritarian conservative who cares little for the opinion of the vulgar mob, I'll invoke the authority of respected authors over the centuries. When Kilpatrick starts giving his grammar lessons in the streets of paradise, there are going to be a lot of giggles from the better-informed pedestrians.

Anthony Trollope, The Last Chronicle of Barset, vol II, chap. LXXI:

The solution of the mystery was not known to all,---was known on that night only to the very select portion of the aristocracy of Silverbridge to whom it was communicated by Mary Walker or Miss Anne Prettyman.

= it was not the case that the solution of the mystery was known to all
≠ the solution of the mystery was unknown to all

Herman Melville, Mardi and a Voyage Thither, vol. 1, chap. LXXXIV:

But the imperial Marzilla was not for all; gods only could partake; the Kings and demigods of the isles; excluding left-handed descendants of sad rakes of immortals, in old times breaking heads and hearts in Mardi, bequeathing bars-sinister to many mortals, who now in vain might urge a claim to a cup-full of right regal Marzilla.

Joseph Glover Baldwin: The Flush Times of Alabama and Mississippi (1853), "Ovid Bolus, Esq., Attorney at Law and Solicitor in Chancery":

He did not confine himself to mere lingual lying: one tongue was not enough for all the business he had on hand. He acted lies as well. Indeed, sometimes his very silence was a lie.

William Shakespeare, As You Like It, Act 3, scene 5; Rosalind says to Phebe:

But Mistris, know your selfe, downe on your knees
And thanke heauen, fasting, for a good mans loue;
For I must tell you friendly in your eare,
Sell when you can, you are not for all markets:
Cry the man mercy, loue him, take his offer,
Foule is most foule, being foule to be a scoffer.

T.S. Eliot, Choruses from 'The Rock' VI:

24 But the man that is will shadow
25 The man that pretends to be.
26 And the Son of Man was not crucified once for all,
27 The blood of the martyrs not shed once for all,
28 The lives of the Saints not given once for all:
29 But the Son of Man is crucified always
30 And there shall be Martyrs and Saints.

Since Kilpatrick may regard Shakespeare as out of date, and Melville as an untrustworthy Massachusetts liberal, I'll cite an example from a recent source with politically appropriate credentials:

In a sign of the difficulties Mr. Gingrich could face from other Republicans, however, Representative Tom DeLay, the House Majority Whip, immediately denounced the fund. ''Giving the I.M.F. more money is not a panacea for all the troubles that bedevil the Asian economy,'' he said. ''In fact, in many instances, the I.M.F is the problem, not the solution.'' [NYT, "Gingrich Clarifies G.O.P. Stands on Trade", by Alison Mitchell, June 26, 1998]

Or if The Hammer's recent corruption problems retrospectively disqualify him, how about the commissioner of baseball?

"There's no sense kidding ourselves about the ballparks,'' Selig said. "They've been great for the game, but they're not a panacea for all our ills.'' [Houston Chronicle, "Baseball attendance flagging for several reasons", by Richard Justice, 4/16/2003]

Or if political outlook doesn't matter, let's move a few degrees to the left for a quote from Clyde Prestowitz:

Clyde Prestowitz, whose 1988 book, "Trading Places,'' predicted that Japan's government-business partnership would allow it to dominate high technology at America's expense, now declares that "the Japanese model was a fantastic catch-up model, but it was not a model for all seasons.'' He has taken to denouncing crony capitalism and sternly lecturing Japan on the need for fundamental reform. [NYT, "Predicting doom in Asia's 'miracle' economies", by Paul Krugman, 5/5/1998]

In fact, after a modest amount of searching, I haven't come across a single published example where a competent writer of English follows Kilpatrick's theory of semantic interpretation. There must be some out there -- if you can find one, please let me know.

What led Kilpatrick to open his column so confidently with such a spectacularly wrong assertion about how the English language works? I won't speculate about his psychology, and I don't know about possible precursors in the prescriptivist literature for this particular piece of weird semantics. But my impression is that artificial rules about usage often start when a half-educated commentator with more self-confidence than insight, and with no respect for either demotic or elite traditions, decides that some common practice is inefficient or illogical. Why such pronouncements occasionally gain widespread acceptance is a question that could be the subject of several dissertations in intellectual history or social psychology. My own guess, FWIW, is that more insight will come from the natural history of religion than from rational choice theory.

This puzzle goes beyond the distinction that Eugene Volokh pointed out several years ago ("The Language Police", 1/26/2003):

Language defined by changing usage is what some call a "grown order" -- a judgment formed by millions of people, based on their senses of what is convenient and comfortable for them. (Free market economic decisions are another classic example of something that's mostly a grown order.) Linguistic prescriptivism (dictionarymakers recording what they think should be the usage, not what is the usage), is a "made order" -- a judgment of a small group of people selected for the purpose of rendering their judgment. Made orders are sometimes useful, for instance in the setting of technical standards. But as to language, I think the grown order approach is far more likely to yield a language that is genuinely responsive to users' needs than the made order approach.

In the first place, we're not talking about changing patterns of usage here -- as far as I can tell, Kilpatrick is not only wrong about contemporary English, he's wrong about Shakespeare and everyone in between. Nor does he claim that he's talking about a change that should be resisted. And in the second place, this is nothing like an ISO committee setting up a technical standard, it's one isolated individual, with no particular standing, laying down the law about how things ought to be, while pretending that his irrational prejudice is a foundational principle of grammar. In commentary on Eugene Volokh's post Neal Whitman's brother Glen observed that

The linguistic prescriptivists are analogous to the managers of a firm who, upon observing a new competitor that claims to make a better mousetrap, stubbornly insist that the old-fashioned mousetrap is superior. And maybe they’re right; the real test is in the mousetrap-buying choices of consumers. Likewise, in language, the test of the prescriptivists’ prescriptions is their staying power.

But prescriptivists -- like Kilpatrick in this case -- often claim the support of logic rather than tradition. In this context, most linguists are not really either "prescriptive" or "descriptive" -- we try to evaluate claims about tradition, contemporary usage and logic in an honest and realistic way. We often wind up debunking the false claims of those peddling dubious linguistic prescriptions, but we're just as happy to debunk false descriptive claims. And of course we're happiest to join in advancing the understanding of how (and why) speech and language work, rather than the essentially negative enterprise of debunking nonsense of any sort.

I can't resist ending with a small personal note. Neal observes that the only way he can get Kilpatrick's favored reading for "Mass transit [is] not an option for all drivers" is to imagine saying it "with a seriously high pitch on all drivers". I believe that this is a reference to a phenomenon that Ivan Sag and I discussed, under the name of the "contradiction contour", in one of my first published linguistics papers: Mark Liberman and Ivan Sag, "Prosodic form and discourse function", CLS 10, 416-27 (1974). The version of this work that we presented at the annual meeting of the Chicago Linguistic Society may well have been the first scholarly paper ever performed with kazoo accompaniment -- in order to show how effective pitch contours can sometimes be in conveying meaning in English, we acted out some little skits in which Ivan's side of the conversation was performed on the kazoo. And we didn't debunk anything, though we did respectfully disagree with Ray Jackendoff about how to explain the effects of intonation on semantic scope. But all that belongs in another post.

[I should also mention that Eugene Volokh was mistaken about how most dictionary-makers see their role, when he wrote that they "[record] what they think should be the usage, not what is the usage". While lexicographers try to distinguish older usage from recent usage, and standard usage from dialect usage, and formal usage from informal usage, they're definitely in the business of describing rather than legislating. As a result, Robert Hartwell Fiske, in his screed The Dictionary of Disagreeable English, calls them "laxicographers". ]

Posted by Mark Liberman at 10:37 AM

July 28, 2006

X as the Y of Z

Recently, a reader alerted us to an incipient snowclone in the speeches and writings of Rees Lloyd:

(link) "The ACLU has lost all moorings and common sense and rationality and proportionality," he said. "It's become the Taliban of American liberal secularism."
(link) "Today, I am literally ashamed, ashamed that [the ACLU] has become the Taliban of American liberal secularism, wiping our history clean."
(link) "... the ACLU, which I believe has become, by its fanaticism, the Taliban of American secular totalitarianism."
(link) ... the ACLU is now so fanatic and loosed from common sense that it has become the Taliban of liberal secularism ...

The idea of X as the Taliban of Y has been more widely used:

(link) "The Taliban of Modern Market Capitalism: Fear the accountants as much as the terrorists"
(link) I heard my denomination, the Lutheran Church-Missouri Synod, described as "The Taliban of American Christianity" the other day.
(link) "If allowed to, Hezbollah could easily become the Taliban of Lebanon."

And sometimes the group compared to the Taliban is put in the of-phrase:

(link) Michael Moore and Al Franken are very proud that you and the taliban of liberal loonies are out making fools of themselves.
(link) In jumping into the Schiavo case, the Republicans are simply once again engaging in crass pandering to the Taliban of the religious right...

But the Taliban metaphor is not yet all that frequent, and in any case, it's a particular instantiation of a more general and much commoner meta-snowclone, which starts from a correspondence of the form A:B::C:D, where A and B are famous and evocative of some desired properties and relationships, and then transfers those properties and relationships to C in the context D, with a phrase like "C is the A of D". One of the commonest of these correspondences sets up George Washington and the establishment of the United States of America as the A:B pattern --

(link) In meetings, Deputy Undersecretary of Defense William Luti described him [Ahmed Chalabi] as the "George Washington of Iraq."
(link) Ali Al-Sistani: The George Washington of Iraq?
(link) He is called El Liberator (The Liberator) and the "George Washington of South America."
(link) ...it is probably only Nelson Mandela whose charisma, role, and accomplishments give him some claim to be the George Washington of his country
(link) The files contain an array of his edgy political positions, including his statement in Philadelphia that "Ho Chi Minh is the George Washington of Vietnam."
(link) Well, it happened, though, to the George Washington of Chile, Augusto Pinochet, the man who kicked the Reds out of Chile.

Similar phrases involving other founding (or at least epitomizing) fathers and mothers are common:

Price is truly "the Charles Darwin of nutrition".
Andrew Leonard is the Charles Darwin of bots...
Loch Eggers is the Charles Darwin of surfing.
...[Stilgoe] is the Charles Darwin of real estate, utility and transportation development...
Geoffrey Moore is both the Carl Linnaeus and the Charles Darwin of business and markets.
One writer referred to Breathnach as the "Isaac Newton of the Simplicity Movement."
A professor of philosophy from the University of Texas says, "William Dembski is the Isaac Newton of information theory."
Griffith, for good reason, is considered the Isaac Newton of filmmaking...
Arguably the most brilliant thinker of ancient China, and certainly the most systematic, he [Xunzi] has been called "The Aristotle of the East."
Scott McCloud, known as the "Aristotle of comics", writes that ...
Could Kerry be the 'Hitler of the Unborn'?
Saddam Hussein is the Adolf Hitler of the 1990's.
Elijah Muhammad is the Adolf Hitler of the black man.
Tila Tequila is the Adolf Hitler of culture.
Nancy G. Brinker calls herself ''the Carrie Nation of breast cancer.''
And the revolutionary was a woman oft hailed as a pioneer for women’s rights, the Carrie Nation of contraception, Margaret Sanger.
Ishimoto Shizue: the Margaret Sanger of Japan.
Marie Stopes, by the way, was the Margaret Sanger of England...

And there can be correspondences that don't involve people at all:

(link) Patents have become the nuclear stockpiling of the software industry.

[Update -- James Callan wrote in with links to his posts on "X is the Saudi Arabia of Y" ("I've discovered a snowclone", 7/13/2006), and "X is the Seattle of Y" ("The Things We Learn from Google", 7/20/2006).

And Benjamin Zimmer wrote:

This reminds me of an essay by Douglas Hofstadter, "Analogies and Roles in Human and Machine Thinking" (Sci. Am., Sep. 1981, reprinted in Metamagical Themas, 1985), where he considers what it means to call Denis Thatcher "the First Lady of Britain" or "the Nancy Reagan of Britain".
Searchable Metamagical Themas links:
http://www.amazon.com/gp/reader/0465045669/
http://books.google.com/books?id=o8jzWF7rD6oC

]

Posted by Mark Liberman at 07:23 AM

July 27, 2006

Stupid self-defeating warning label nonsense

I have been, for many years, a student of the language found in the stupid warning labels that grace increasing numbers of products in this increasingly litigious society. I have written before about the ultimate arch-warning that said "Do not misuse." But that one is rather like the sign Mark was once asked to make, saying that those who are not authorized are not authorized, or the "Do not use in the shower" label on Susanne Goldmann's hair dryer, or the warning that Peter-Arno Coppen saw quoted from a bike manual (in the excellent Dutch language magazine Onze Taal) to the effect that "Removing the wheel can influence the performance of the bicycle", or the astonishing sign that Barbara Phillips Long saw in an elementary school in Ithaca, N.Y., that said "Do not use elevator when no one is in building": cases like this may seem intuitively unnecessary, but they certainly imply directives that absolutely everyone is well advised, even obliged, to agree with and obey. However, I have recently been noticing warning labels that are impossible to obey without ruining the usefulness either of the label or of what it is attached to.

Just yesterday I received from some music club a sheet of stick-on security labels saying "This CD is the property of Geoffrey K. Pullum", and the sheet also carried a warning: "Do not affix directly to CD." Think about that. They have made me some free labels to mark my CDs as my personal property and warned me not to put them on them. (Yes, I know I could put them on the boxes. It would be great to be confident that those empty boxes would always be returned to me.)

And today, as I approached an automatic sliding door at an Office Max store that opened in response to a motion detector set to activate fairly close in, I noticed a bold sign on it saying, "Automatic door — Keep clear." Are people actually thinking about the sentences they put on such things? Or do they just (unlike Mark) make whatever signs they are told to make, no matter how ridiculous the assignment?

Just two more things about warning labels and then I promise I'll shut up. I know you want me to, but just two things. They aren't anything to do with the theme of this post, about language that clearly and necessarily defeats its own purpose (like "I am not moving my lips"); I just want to say these things and then that'll be that, OK?

1. Just once, I would love to use a stepladder that did not bear a label warning me not to treat its top step as a step. Just make the thing robust and leave it to me how high I want to go.

2. I still think the all-time most insane warning message I ever saw on anything anywhere was the message on a windshield-size folding cardboard sunscreen that I bought. (Let me just explain to residents of the Falkland Islands that the idea is to block out the rays of the sun from the front of your car while it is parked, so you don't burn your fingers on the steering wheel when you come back after a few hours in the hot California sunshine and try to drive off. You must understand that — while the weather is gorgeous in Santa Cruz — earlier this week in the town of Bradley, an hour or two to the south of here, the temperature hit 120°F in the shade.) On the rear (inside) surface of the opaque cardboard it said: Do not drive with shield in place.

Posted by Geoffrey K. Pullum at 06:23 PM

Unlike no other

The Marketplace radio show yesterday ("Faith Nights get the call", 7/26/2006) interviewed Brent High, CEO of Third Coast Sports, a company that produces "Faith Nights" at baseball games and other sporting events, and recorded him saying

"It is an opportunity -- unlike no other -- to introduce people to the church in an environment that is not churchy."

This seems to be a clear example of overnegation, like "No head injury is too trivial to ignore", or "This is sure to be a killer tournament, don't fail to miss it!" It's obvious what Mr. High meant, but what he said seems literally to mean the opposite. If this opportunity were "like no other (opportunity)", or "unlike any other (opportunity)", it would be a uniquely good (or perhaps uniquely bad) opportunity. But if this opportunity is "unlike no other (opportunity)", then all opportunities are the same, and you might as well pass out tracts on a random streetcorner as set up a Faith Night at Turner Field. At least, that's how it works if two negatives make a positive.

Barbara Wallraff considered this expression in her Word Court feature back in 2003. Her judgment, rendered in response to a reader who found the expression confusing, was:

I looked for examples of “unlike no other” in print and, to my surprise, found them. “Unlike no other” is a double negative. If that’s what people are saying, you’re not confused—they are.

Barbara may be right that some of these are examples of negative concord, the "It ain't no cat can't get in no coop" construction persisting in the vernacular from the old grammar of negation in English, before our linguistic ancestors got mixed up by those French invaders. The Wikipedia entry on double negatives quotes a song lyric

Well, I ain't never done nothing to nobody.

I ain't never got nothing from nobody, no time.

And, until I get something from somebody sometime,

I don't intend to do nothing for nobody, no time.

and the dialect expression "I am not never going to do nowt no more for thee." Linguists generally treat these multiple negations as a form of agreement or feature spreading, which is obligatory in the standard versions of many languages. Examples from the Wikipedia article include Serbian:

Niko nikada nigde ništa nije uradio, literally Nobody never nowhere nothing did not do, meaning "Nobody ever did anything anywhere."

Czech:

Nikdo nic nevyhrál, literally Nobody didn't win nothing, meaning "Nobody won anything."

Hungarian:

Soha sehol ne mondj el semmit senkinek, literally Never nowhere don't tell no one about nothing, meaning "Don't ever, anywhere tell anyone about anything."

Classical Greek:

μὴ θορυβήσῃ μηδείς, literally Do not let no one raise an uproar, meaning "Let no one raise an uproar."

Afrikaans:

Dis (=Dit is) nie so moeilik om Afrikaans te leer nie, literally It's not so difficult to not learn Afrikaans, meaning "It's not so difficult to learn Afrikaans."

(There are a lot of interesting issues about how far various sorts of negation spread or don't spread in different languages, but that's a matter for another post.)

Some of the "unlike no other" examples in English might be related to the vernacular pattern of negative concord. Here's a possible case from an AP story about NASCAR ("Busch Turns the Corner with a Surge of Success", 7/16/2006):

Roush said: “Well, it’s been a great ride with Mark Martin for 600 starts now. He’s brought intensity, enthusiasm, great driving ability and integrity to the driver’s seat, unlike no other driver that I can recall.”

I've got no idea how Jack Roush talks when he gets comfortable, but it's possible that he's fine with saying things like "he ain't like no other driver", and in that case, the extra "no" in his quote might just be negative concord. But there are other examples, in more formal contexts, that I'm pretty sure are just ordinary overnegation, where people have just gotten confused about how many negatives are really needed to make their point. Here's an example from the presumably well-edited O'Reilly Safari site for David A. Karp's "eBay Hacks, 2nd Edition":

Unlike no other book, eBay Hacks, 2nd Edition also provides insight into the social aspects of the eBay community, with diplomatic tools to help to get what you want with the least hassle and risk of negative feedback.

That's not a dialect form or an idiom, it's just a mistake. Or is it? Could (some) overnegations in English be a formal residue of a stubborn hankering for negative concord? On this view, confusion about the semantic complexities of multiple negation plays the role of a sleepy gatekeeper, allowing vernacular impulses to sneak into the standard language.

Posted by Mark Liberman at 07:04 AM

July 26, 2006

The Perils of Transliteration

Just now on Jeopardy the question was "Who is Park Chung-Hee?", the President of South Korea from 1961 until his assassination in 1979. Alex Trebek pronounced it incorrectly, with an r, like the English word "park". Actually, the family name of 朴正熙 (박정희) is pronounced [pak], roughly like English "bock". In the current official South Korean romanization it is rendered Bak. Trebek's error is as much Park's fault for choosing a non-standard romanization as Trebek's. I suspect that the romanization with an r, which is pretty common, was based on an r-less dialect of English, probably British, and was meant to prevent it from being interpreted as [æ], as in "back" and "Mac".

Posted by Bill Poser at 10:55 PM

The moan of the dunes

Who you gonna believe, the NYT science section or your own lyin' ears? Reporting on some very interesting work by Stéphane Douady and others on the physics of "singing sands", Kenneth Chang ("Secrets of the Singing Sand Dunes", July 25, 2006) can't resist a musical lede:

The dunes at Sand Mountain in Nevada sing a note of low C, two octaves below middle C. In the desert of Mar de Dunas in Chile, the dunes sing slightly higher, an F, while the sands of Ghord Lahmar in Morocco are higher yet, a G sharp.

Unfortunately for Chang's credibility, he (or more likely one of his editors) chose to accompany the article by a lovely video clip (identified as being "sounds .. 'played' on a singing dune located in the Atacama Desert in Chile") providing singing-sand sounds that are more like moans than sung notes, each sound clearly spanning a fairly wide range of pitches.

Here's a graphical representation of one example [audio clip], with an automatically-generated pitch track on which I've manually sketched some trend lines for clarity:

This particular sound sequence, produced by one sweep of the dune-player's hand, has an initial segment that falls from about 308 Hz (a bit below the D# just above middle C) to about 226 Hz (half-way between A and A# below middle C), followed by a period-doubling (i.e. pitch halving) and a somewhat more level-pitched segment ending around 103 Hz (just below G#, in terms of musical pitch-classes relative to A=440).

In other words, this moan of the dunes starts just above middle C, falls by a tritone (six semitones), and then drops by an octave to end about an octave and a fifth lower.
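The translation from Hz to pitch-classes used above can be checked with a few lines of Python. This is only a rough sketch, assuming twelve-tone equal temperament with A=440 and the common convention that middle C is "C4"; the function names are made up for this illustration and have nothing to do with how the pitch track itself was generated.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name with octave, offset in cents) for a frequency.

    Assumes equal temperament; octave numbers follow the convention
    in which middle C is C4 and the A above it is A4.
    """
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    cents = 100 * (semitones_from_a4 - nearest)
    midi = 69 + nearest  # MIDI numbering: A4 = 69, middle C = 60
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents

def interval_semitones(f_from, f_to):
    """Signed size of the interval from f_from to f_to, in semitones."""
    return 12 * math.log2(f_to / f_from)

for hz in (308, 226, 103):
    name, cents = nearest_note(hz)
    print(f"{hz} Hz is nearest to {name} ({cents:+.0f} cents)")

print(f"308 -> 103 Hz: {interval_semitones(308, 103):.1f} semitones")
```

Under these assumptions, 308 Hz comes out slightly flat of D#4, 226 Hz sits close to the midpoint between A3 and A#3, and 103 Hz is slightly flat of G#2, which matches the description above; the overall fall from 308 Hz to 103 Hz is about nineteen semitones, i.e. an octave and a fifth.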

Even if you have perfect pitch (and I certainly don't), you probably can't hear the intervals involved accurately, since people with perfect pitch don't perceive the pitch of such glissandi very clearly -- or at least that's what a couple of them have told me. However, anybody with normal hearing and pitch perception can tell by listening to this clip that the dune's sound, in this case, is not well characterized as a "note", but rather is a falling glissando spanning a considerable range.

This is a small nit to pick, in an interesting and well-written article about a nice piece of research. I've posted about it not because I like to play "gotcha" with journalists -- in fact I don't enjoy it at all -- but because this is such a clear example of such a common problem. More often than not, the popular presentations of scientific or technological results are strikingly at variance with features of the results themselves, features that are obvious to anyone who knows anything about the subject or who looks at the primary sources with a bit of common sense. Sometimes this is entirely the fault of the scientists and engineers, and their PR representatives often contribute as well, but in the end, the largest share of guilt belongs to the reporter and editor, whether they juice up the story themselves or credulously accept the juice from another source. And in this internet age, it's increasingly easy for readers to check things out, and to blog what they find.

There's not a lot of extra "juice" in this case -- just Chang's choice to lead with a comparison of deserts as if they were organ pipes. We're not talking about a flat-out fabrication, like the stuff about how email lowers IQ more than pot and men are emotional children, or a preposterous exaggeration, like the idea that Germans are grumpy because of umlauts. In this case, it's just an attempt to grab the reader's interest with the cute idea that deserts have characteristic pitches, before getting to the physics part. And Chang's lede is not totally invented, anyhow, since Douady's article suggests that the pitch of the dune sounds depends on the statistics of grain sizes (which may be different for different deserts), and offers some lab results citing different pitches for sand samples from different locations. However, the video clip accompanying the article makes it clear that at least some of the dune sounds, in their natural state, are not stable pitches at all. So Chang's lede, though attractive, is directly contradicted by the evidence in his sidebar video.

Well, actually, Chang compounds the problem when he tries to explain, later in the opening paragraph, that

While the songs are steady in frequency, the dunes do not have perfect pitch. At Sand Mountain, for example, dunes can sing slightly different notes at different times, from B to C sharp. [emphasis added]

Um, Kenneth, your own video sidebar makes it clear that the "songs" are NOT necessarily "steady in frequency". Isn't that a little embarrassing?

[The scientific paper is S. Douady, A. Manning, P. Hersen, H. Elbelrhiti, S. Protiere, A. Daerr, B. Kabbachi, "The song of the dunes as a self-synchronized instrument", unpublished ms. 12/2004, revised 1/2006, to appear in Physical Review Letters (?). More info about moaning dunes is available on Douady's web site.]

[There is no truth to the rumor that this story is the oneiric source of the question famously asked by Dan Rather's assailant in 1986. It's for that reason that I resisted the temptation to title this post "What's the frequency, Kenneth?"]

[Update -- Kenneth Chang emailed:

Geez, if you're going to pose questions/insults, you really should provide some straightfoward way for someone to reply.
From the abstract of the paper: "Since Marco Polo (1) it has been known that some sand dunes have the peculiar ability of emitting a loud sound with a well defined frequency, sometimes for several minutes."
The notes are taken from Table 1, which I hope I didn't mess up.
The discrepancy arises, I believe (from reading the paper), because you can generate different notes by changing the speed you push the sand, like in the video (and Table 2). There are also overtones. In naturally occurring avalanches, there is one characteristic speed and hence one characteristic note.
So... we should have put a better explanation accompanying the Web video, It's unfortunately impossible to include all these nuances in a 300-word article.

It's absolutely true that the abstract of the Douady et al. paper suggests that the pitches are typically level, and that the information about the particular association between deserts and pitches comes from Table 1 in the Douady et al. paper:

This is what I meant by alluding in a general way to the source of the idea in the cited paper, but I should have been more precise. I could quibble a tiny bit with Chang's translation from Hz to musical pitch-classes, but what's a semi-tone or so among friends? And the plain fact is that the implication of a fixed connection between deserts and pitches via characteristic sand-grain sizes is suggested by this part of the paper that Chang is reporting on. It's also very plausible that naturally-occurring avalanches usually involve stable velocities and therefore stable pitches.

On this basis, I owe Chang an apology for any implication that he was responsible for the idea of deserts being musically tuned: it comes pretty directly from the Douady paper.

The figure that helps explain why the sounds in the video clip are not steady pitches is reproduced below:

This (their Figure 2) shows "Frequency emitted by pushed (sheared) sand, measured in laboratory experiment, as a function of two laboratory control parameters, height of mass of sand, H, and velocity of pushing blade, V." See the paper for further details -- but the figure clearly shows the same sand, from the same desert, emitting a range of frequencies spanning more than three octaves. Presumably, as Chang suggests in his note, the velocity of the dune-player's hand in the video is standing in for the velocity of the blade, and the velocity contour of his gestures produces a corresponding pitch contour.

So Douady and his co-authors are arguably to blame for publishing a misleading Table 1, without a clear explanation of the fact that the implied close connection among deserts, grain sizes and (level) pitches is not generally true in the laboratory, and may be true in the field only under certain circumstances, and in particular isn't true in some of their own field examples. We can't blame Kenneth Chang (who is clearly an excellent science writer in general, and to be commended for bringing an interesting piece of work to general attention) for anything worse than a minor failure to read (and listen) critically -- and it's plausible in fact that he understood the whole situation from the beginning, but simply couldn't fit the whole discussion within the rigid article-length limit that he had to work with.

I'm glad I'm a blogger! It's a lot easier to write more than to write less. ]

Posted by Mark Liberman at 08:43 AM

July 25, 2006

English under siege in Pennsylvania

More than a third of all Pennsylvanians are native speakers of a language other than English -- and many of them have not even tried to learn English since immigrating, or at least prefer to carry out their daily lives in another language, living together in neighborhoods where their native language dominates. Some people worry that the majority status of English is critically endangered. 25 years ago, a major political figure warned that these "aliens ... will never adopt our language or customs, any more than they can acquire our complexion", and so far, his prediction seems to be right on the money. But wait -- the date is 1776, not 2006, and the language contending with English is not Spanish, it's German, and the major political figure who warned about the "aliens" who "swarm into our Settlements, and by herding together establish their Language and Manners to the Exclusion of ours" is Benjamin Franklin.

The first newspaper announcement of the adoption of the Declaration of Independence was published in German, on Friday, July 5, 1776, in the Pennsylvanischer Staatsbote. On Tuesday, July 9, the same paper devoted its front page to a German translation of the declaration. The picture below is from a web exhibit by the Deutsches Historisches Museum in Berlin:

Linguistic sweetness and light didn't always prevail among the founding fathers. As I mentioned, in 1751 Benjamin Franklin said some things about German immigrants that would put him on the ethnocentric fringe of today's debate. Curiously, in 1732 Franklin had published a German-language newspaper -- though it only lasted for two issues, perhaps because he didn't have a Fraktur typeface, and so had to print his German material in his usual roman-letter Caslon Antiqua. In 1735, Christopher Sauer began publishing in German with the familiar Fraktur letters, and had considerable success with a German-language almanac that had a circulation as high as 10,000, and thus was in implicit linguistic competition with Franklin's English one. Today's readers may be interested to know that Sauer and his newspaper, published since 1739 in Germantown, were socially and religiously conservative supporters of the Penn family, and thus "a thorn in the flesh of any progressive politician in the colonies", especially Franklin.

In 1762 Heinrich Miller (who had earlier worked for Franklin) began publishing the Wöchentliche Philadelphische Staatsbote. Miller's politics were much more to Franklin's taste:

It is Henrich Miller, who gave the German population of the Middle Colonies the opportunity to learn about and to participate in the various political controversies that would gradually lead to independence. He printed in German Jonathan Dickinson's and Joseph Galloway's speeches in Assembly on the change of government in Pennsylvania in 1764, he printed a German version of Benjamin Franklin's interview before the House of Commons concerning the Repeal of the Stamp Act in 1766, and from 1774 on he practically served as official German printer for the First Continental Congress by repeatedly publishing its minutes and votes in German. ... Christopher Sauer in his "Germantowner Zeitung" had taken side against Britain during the Stamp-Act Crisis, but when the course of events led more and more towards armed revolution he had to make amends to the pacifistic convictions of the Brethren and to stay free from radical positions in his publications. This led later to Sauer's being accused as a loyalist and Congress put him under orders to abstain from printing and his whole property was confiscated and put up for auction for the public benefit. His successor, Christopher Sauer III, then actually joined the British side and served as printer to the occupation army in Philadelphia and New York.

In 1776, the difference in political allegation between the two established German presses became evident: while Sauer printed a decisive call for peace and to abstain from armed resistance issued by the Quaker community, Miller printed a pro-congressional pamphlet directed at the German inhabitants under the title "Der Alarm", the Minutes of the Constitutional Convention of Pennsylvania and the Regulations for the Pennsylvania Militia in German.

A number of earlier Language Log posts have focused on America's varied linguistic landscape during its first couple of hundred years. This was never a linguistic garden of Eden -- whatever you think prelapsarian social norms should be like -- but viewed in the light of history, the current anxiety over linguistic identity seems exaggerated. By most measures, English in America seems to be stronger than it's ever been.

"The world is upside down" (2/24/2004)
"When smart people get really stupid ideas" (5/2/2004)
"Palatine Boors and their Maryland descendents" (5/14/2004)
"Mere knowledge of the German language cannot reasonably be considered harmful" (5/21/2004)
"Nativism clings to life at 100 or 101" (6/24/2004)
"The secret Netherlanders among us" (7/2/2004)
"The multilingual anthem" (4/29/2006)

And you may be interested in this essay by Dennis Baron on "The legendary English-only vote of 1795". More on the same topic is found in a chapter "German or English?", from "The German Americans: An Ethnic Experience", by Willi Paul Adams (originally Die Deutschen im Schmelztiegel der USA: Erfahrungen im grössten Einwanderungsland der Europäer).

Posted by Mark Liberman at 08:08 AM

PowerGenitalia and PenisLand

Concerning those ambiguously analyzable web site names mentioned this weekend on NPR's "Wait Wait Don't Tell Me" (and I know you were waiting for Language Log to comment on this clearly linguistic story), it does appear to be the case that there is an Italian battery company called Powergen Italia (no relation to the British company called Powergen), and it seems they really did once set up a website with the URL www.powergenitalia.com.

It was taken down some years ago (the company now uses www.batterychargerpowergen.it), but the WayBack Machine keeps an archived copy in this location (just click Cancel if it asks for a password). It's not just a hoax site, as claimed at this page (the commenter below seems to be correct). And they actually spell their name as "Powergenitalia s.r.l."

The story seems to be a thoroughly old one; the above snapshot of the days of PowerGenItalia.com is from 2001. No word yet on why it suddenly resurfaced this week on Wait Wait.

And the other case, Pen Island, really is a company selling customized pens, and really does have a current web site called www.penisland.com. They claim to have had at least a certain amount of trouble with rude spam sent in their name, and with people setting up sites with similar names for rather more penis-related purposes. See this page for some further discussion.

Don't forget, concatenation of letter strings can get you into things you didn't want to get into: sometimes x(yz) and (xy)z are non-equivalent but both yield xyz when the spaces or brackets are erased. Always get your proposed URL analyzed for double entendres by fully qualified linguists before setting up your site. Just call the main switchboard at Language Log Plaza and ask for the Uniform Resource Locator Morphological Analysis Division.
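
The bracketing problem can be made concrete with a short Python sketch. It assumes a tiny hand-picked word list (WORDS is a hypothetical stand-in for a real dictionary, and segmentations is a name invented here) and lists every way a letter string can be split into words from that list. Any string with more than one split is a candidate for a double entendre.

```python
from functools import lru_cache

# Hypothetical mini-lexicon for illustration only; a real check would need
# a full dictionary.
WORDS = {"pen", "penis", "island", "land", "is", "power", "powergen",
         "gen", "italia", "genitalia"}


def segmentations(s, words=frozenset(WORDS)):
    """Return every way to split s into a sequence of words from `words`."""

    @lru_cache(maxsize=None)
    def split_from(i):
        # All segmentations of the suffix s[i:], each as a tuple of words.
        if i == len(s):
            return [()]
        results = []
        for j in range(i + 1, len(s) + 1):
            piece = s[i:j]
            if piece in words:
                results.extend((piece,) + rest for rest in split_from(j))
        return results

    return [" ".join(parts) for parts in split_from(0)]


for name in ("penisland", "powergenitalia"):
    print(name, "->", segmentations(name))
```

With this word list each of the two names has three readings. The point of the sketch is only that the number of readings can be counted mechanically; deciding which of them is embarrassing still takes a human.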

Thanks to Brendan McGuigan and Stephen K. Benjamin for research assistance.

Posted by Geoffrey K. Pullum at 01:25 AM

July 24, 2006

Two languages short of bilingual

OK, I'm not quite so speechless anymore after reading Bill Poser and Mark Liberman on the Bogota iced coffee ad shock horror scandal probe. Apostrophe use in English is tough, I admit that; learning it takes a bit of work. But public servants speaking out on language issues — especially those who think that Spanish-speaking immigrants have a duty to learn English before turning up for work at the factory or the farm — should be the first to tackle that work. How about a campaign to ensure that politicians who try and soak up right-wing and anti-immigrant votes by participating in language bigotry should at least get their apostrophe placements right? ‘E-Mail me,’ says Mayor Steve Lonegan, the man organizing a McDonald's boycott (or "McDonalds boycott" as he would say) in his "Mayors Message": ‘Email me at mayor@bogotaonline.org’. So let's do that (or "lets do that", as the Mayor would say). Let's all email him with a brief summary of the rules for use of the apostrophe in genitive singulars and genitive plurals. And perhaps subject-verb agreement as well; Fernando Pereira (a Portuguese-speaking immigrant, but he works hard and learned English before turning up to work at the University of Pennsylvania) has pointed out to me that the Star-Ledger story quotes Lonegan as saying: ‘The true things that bind us together as neighbors and community is our belief in the American flag and our common language.’ This man needs language assistance. Especially given his job. Being monolingually literate in Standard English is the normal baseline for politicians. Lonegan falls below this; he's two languages short of being bilingual.

As Mark Liberman reminds me, our president's stirring words should be our guide here:

We need to challenge the soft bigotry of low expectations. If you have low expectations, you're going to get lousy results. [Applause.] We must not tolerate a system that gives up on people.

Language Log does not give up on people. Not even politicians. Our credo is no politician left behind.

Posted by Geoffrey K. Pullum at 10:08 AM

Mensaje de los alcaldes [sic]

Because Geoff Pullum and Bill Poser were preoccupied with the problem of characterizing Steve Lonegan, the mayor of Bogota, NJ, who has led the recent objections to Spanish-language McDonald's ads, they failed to tell you something important about the Borough of Bogota's web site. I don't mean the weekend truck rental service, though that is way cool. I'm referring to the helpful little BabelFish panel in the lower left corner of every borough web page. This includes the "Mayors Message" [sic], which is therefore available in (a reasonable approximation to) Chinese, German, Japanese, Korean, French, Italian, Portuguese, and (last but not least) Spanish. (The order is a bit odd -- at first I thought it was alphabetic, but French and Italian are out of sequence.)

As a result, the Borough's own web site is far more complicit than the contested McDonald's billboard in helping furriners to get by without learning English. The McDonald's billboard only informs linguistic miscreants that "Un frente helado se aproxima. Nuevo café helado." When they actually get in line for that café helado, they're going to have to figure out that it's on the menu as "iced coffee". But with one click on the eBogota web site, we learn that "¡Si usted vive en Bogotá, Nuevo-Jersey usted es vivo en una ciudad que sepa que una parte grande del futuro está haciendo vida conveniente y fácil! Somos re-engineering la manera que nuestra ciudad hace negocio, estamos haciendo un esfuerzo de tener Bogotá en línea y abierto para el negocio 24 x 7." You can already "pre-register" your gato or perro -- and apparently soon you'll be able to transact all your business with the borough on line, in convenient Spanish translation!

Anyhow, there's another small linguistic point here. Because Mayor Lonegan left the apostrophe out of his title, the Fish dutifully renders it as "Mensaje de los alcaldes" -- "Message of the mayors". I could care less about apostrophes, myself, but given the importance of setting a good example for immigrants, I'm tempted to turn this one over to Lynne Truss.

[Update -- Steve from Language Hat writes:

Did you notice that the Spanish version of the Bogota website calls it "Bogotá"? That's pretty funny, because the pronunciation of the NJ town is stressed on the penultimate syllable (as a MetaFilter commenter put it, it rhymes with Abe Vigoda). Furthermore, according to Kelsie Harder's Illustrated Dictionary of Place Names, the town name has nothing to do with the Colombian city but is "from the name of a Dutch family of early settlers, Bogert." (Oddly, the extremely detailed historical section of the Bogota website -- http://www.bogota.nj.us/history/default.asp -- doesn't explain this, saying only "It was at this time that 'Bogota' was beginning to be used as the name of our area of Ridgefield rather than 'Winckelman,'" but the Bogert family does feature prominently in the history.)

Curiouser and curiouser. It's typical of those sly and stubborn Dutch immigrants to try to disguise their linguistic atavisms as Spanish! ]

Posted by Mark Liberman at 08:04 AM

Another Jackass of the Week

Since Geoff is tongue-tied, I'll help him out. Mayor Steve Lonegan of Bogota, New Jersey is our new Jackass of the Week. He thinks that an ad for iced coffee in Spanish sends the message that Spanish-speaking immigrants don't need to learn English. Hunh? People surrounded by English-language media, who for the most part need to use English at work, at school, and in business, including when they work or eat at McDonalds, are going to conclude that they needn't bother learning English because the occasional ad is in Spanish? And how is advertising in Spanish "divisive"? Surely whining about other people's languages is what is divisive. Warren Meyer has aptly named the phenomenon whereby some people turn into blithering idiots on hearing Spanish "Spanish Derangement Syndrome".

I think I'll stop by my local McDonalds tomorrow. Incidentally, according to this article, Bogota doesn't even have a McDonalds.

Posted by Bill Poser at 02:26 AM

July 23, 2006

Iced coffee ad for Hispanics outrage

The mayor of Bogota, New Jersey, authorized by a 4 to 2 vote of the council, wrote to McDonald's to protest a billboard on which iced coffee was advertised in Spanish. He urges a boycott if McDonald's won't take the sign down (see story here). I ought to have an informed comment on this, but words fail me. I simply do not believe the excesses of the language bigots in this country sometimes.

Posted by Geoffrey K. Pullum at 01:53 PM

Will Shortz sets impossible puzzle on NPR

Will Shortz's word puzzle for last week (on NPR's Weekend Edition Sunday) was to find a name from classical mythology that was, in spelling, a concatenation of English pronouns. And the problem was probably impossible. What's for sure is that the answer he gave was not correct.

The answer was supposed to be Theseus, the name being a concatenation of these and us. The latter form is indeed the accusative form of the pronoun lexeme we. But the former, these, is not a pronoun.

I'll be using the terminology of The Cambridge Grammar in explaining why, but this isn't some sort of terminological quibble: there really are pronouns in English, and they share certain very clear, sharp properties. And the word these is not one of them, no matter which system of terminology you prefer. The thing is that there are syntactic differences in behavior between the two classes of words.

These is the plural inflected form of the lexeme this, which is a determinative, specifically a member of the demonstrative subclass of determinatives (another subclass is the articles, the definite article the and the indefinite article in its two shapes an and a).

The lexeme this is one of the special determinatives (like some and most, but not the articles) that are permitted to function as simultaneously determiner and head of a noun phrase. (There are some subtleties to the argument; but see page 422 of The Cambridge Grammar for some crucial discussion.) So that's why a form of this can occur on its own where a noun phrase can occur (which is perhaps the root of the confusion): This is typical has a subject noun phrase consisting of only one word, which is both determiner and head of the subject noun phrase.

Here, very briefly, are four lines of evidence for arguing that these is a determinative, not a pronoun. The last one is particularly telling, I think.

1. Third-person pronouns do not co-occur in a noun phrase with a common noun the way determinatives do: the book is a (singular) noun phrase (and the is a determinative); *it book is not a noun phrase (and it is not a determinative). [It must be noted here that the pronouns we and you are special in that they have additional uses as determinatives, in phrases like we linguists or you boys; see page 374 of The Cambridge Grammar for discussion. But this is special to just those two lexemes. None of the 3rd person pronouns have second lives as determinatives in Standard English (notice, these is 3rd person); and none of the singular ones do; and none of the reflexive ones like yourself do; and so on. Don't be misled by How 'bout them apples? or them bones, them bones gonna walk around; those are from non-standard dialects where the shape of some items is different, and those dialects have a determinative with the form of them and the meaning of those. Tricky, isn't it?]

2. Pronouns do not in general allow modification by preceding quantifiers: *every it, *all they, etc., are not grammatical noun phrases.

3. Pronouns occur before the particle in verb-particle idioms, not after it: Don't even bring it up in conversation (with the pronoun before the particle up) is grammatical but *Don't even bring up it in conversation (with the pronoun following) is not.

4. A particularly salient point is that pronouns occur in what are known as confirmatory tags: compare Susan is clever, isn't she? (grammatical) with *Susan is clever, isn't Susan? (not grammatical).

Now, by all four of the tests these facts make available, these is a determinative, not a pronoun!

First, these books is a grammatical noun phrase (confirmation of determinative status).

Second, all this and all these are grammatical (disconfirmation of pronoun status).

Third, Don't even bring up these in conversation is grammatical (disconfirmation of pronoun status).

Fourth, *These pastries are delicious, aren't these? is not grammatical (disconfirmation of pronoun status).

There is lots of other evidence that could be brought to bear, but this will do for a start. Sadly, Will Shortz did not consult Language Log's capable staff before setting his puzzle, even though our rates for non-profit organizations like National Public Radio are so reasonable.

This is not the first time, my friend Aaron Kaplan points out to me. The New York Times crossword puzzle for last December 30 had the clue "Lord's Prayer adjective," and the answer is 3 letters long. The answer is supposed to be thy, of course. But thy is not an adjective. It is the genitive form of a (now archaic) pronoun. It can be used as a determiner, just as any other genitive noun phrase like the children's can. Old-fashioned traditional grammars insist that anything that can occur before the noun in a noun phrase (or anything that can modify a noun) is an "adjective", but that policy gives you a class of "adjectives" infected with a diverse array of members that have almost nothing syntactic or semantic in common (London has to be an adjective because of the phrase London fog, and so on). As Aaron notes, the error here may be the fault of the original puzzle author. But Shortz is ultimately in charge, and is paid to be. This man needs a linguist on call.

Why do people neglect the informational resources that are available to them? I do not know. Shortz could have just called the main switchboard at Language Log Plaza and asked to be put through to the Lexical Categorization Department in the Grammar Division. (That's who Jon Stewart should have called before telling the College of William and Mary that terror is not a noun. And who Microsoft should have called to learn whether it was even remotely plausible to try and stipulate that trademarks should never be used in the possessive or the plural.)

I know, some of you will say that this earlier post, written when I was young, perhaps suggests a certain lack of sympathy with the whole puzzle genre, a mild prejudice against the very idea of puzzles that inclines me to be mean to Will Shortz. But no, I am perfectly capable of maintaining a level head on this issue. Not everything that Will Shortz bases his sometimes ingenious puzzles on is mistaken. But he is clearly drawing his grammar information — just as nearly everyone else does — from a superficial grasp of what was printed in school grammars in the 19th century. The subject has moved on. There is stuff you need to consider if you are going to talk about grammar or invent puzzle clues that make reference to grammar.

Others among you will say (as some have already said) that unless Will Shortz had got an accessible copy of The Cambridge Grammar, he had only published dictionaries to go on, and they all say that these can be a pronoun. True, they all do. But the point of view that I take does not make grammar depend on authority. It depends on evidence. Part of the tragedy of the present state of English grammatical studies is that the published resources aren't in line with the known evidence. In particular, all published dictionaries are simply wrong about the categories to which they assign quite a few alleged pronouns, prepositions, adverbs, adjectives, and conjunctions. All of them. It's a crisis.

And it isn't Will Shortz's fault, of course. I don't expect him or NPR to take the slightest bit of notice of what I've said here. But this is Language Log. You get the actual truth, and a glimpse of the evidence that backs it up. Plus a glimpse of the extent to which what is commonly believed about English grammar is at many points demonstrably incorrect.

Posted by Geoffrey K. Pullum at 01:35 PM

The ancient roots of passive avoidance

In connection with our discussion on the history of passive-avoidance, I'd like to suggest a connection to a much longer history -- two and a half millennia of variation between "loose" and "periodic" styles.

Here's an explanation of the distinction, quoted from Hardy Hansen's web site for a course on "Greek Prose Style" at CUNY:

[L]et us consider the two main types of sentence structure which ancient critics recognized: the loose style (lexis eiromene) and the periodic style (lexis katestrammene). The first phrase means literally "speech strung together" (from eiro, "to string or thread together", like beads in a necklace); the second, "speech turned or guided toward an end"; the word "period" (periodos, "way around") refers metaphorically to a racecourse, where the starting and finish lines were the same: contestants went out and around the turning-post, then retraced their path. ...

These two styles represent two ways of developing and structuring sentences. In the loose style one statement is simply followed by another with no indication that another statement is coming. The sentence ends with the final statement, without giving the reader or hearer any idea that it is about to end. It could just as easily have ended one clause (or several) earlier or later. ...

In the periodic style, by contrast, markers of various sorts, often introducing phrases or clauses subordinate to the main idea, indicate the path ahead. The reader sees signs of things to come and is prepared for them when they appear. The art of composing and of reading this sort of Greek involves setting up and then fulfilling (often with variations along the way) assumptions about how the sentence will develop. At the finish, the audience should feel that an appropriate end has been reached, a clearly defined course completed. All periodic composition involves, in one way or another, suspension of sense: the arrangement of one or more elements of the sentence so that the thought is not felt to be complete until something else has been added.

Hansen offers an English example of each sort of writing. Hemingway demonstrates the loose or strung-together style, in a passage from The Sun Also Rises:

The bus climbed steadily up the road. The country was barren and rocks stuck up through the clay. There was no grass beside the road. Looking back we could see the country spread out below. Far back the fields were squares of green and brown on the hillsides. Making the horizon were the brown mountains. They were strangely shaped. As we climbed higher the horizon kept changing. As the bus ground slowly up the road we could see other mountains coming up in the south. Then the road came over the crest, flattened out, and went into a forest. It was a forest of cork oaks, and the sun came through the trees in patches, and there were cattle grazing back in the trees. We went through the forest and the road came out and turned along a rise of land, and out ahead of us was a rolling green plain, with dark mountains beyond it. These were not like the brown, heat-baked mountains we had left behind. These were wooded and there were clouds coming down from them. The green plain stretched off. It was cut by fences and the white of the road showed through the trunks of a double line of trees that crossed the plain toward the north. As we came to the edge of the rise we saw the red roofs and white houses of Burguete ahead strung out on the plain, and away off on the shoulder of the first dark mountain was the gray metal-sheathed roof of the monastery of Roncesvalles.

Samuel Johnson demonstrates the periodic or explicitly-structured style, in a passage from the preface to an edition of Shakespeare:

That praises are without reason lavished on the dead, and that the honours due only to excellence are paid to antiquity, is a complaint likely to be always continued by those who, being able to add nothing to truth, hope for eminence from the heresies of paradox, or those who, being forced by disappointment upon consolatory expedients, are willing to hope from posterity what the present age refuses, and flatter themselves that the regard which is yet denied by envy will be at last bestowed by time. . . .

To works, however, of which the excellence is not absolute and definite, but gradual and comparative; to works not raised upon principles demonstrative and scientific, but appealing wholly to observation and experience, no other test can be applied than length of duration and continuance of esteem. What mankind have long possessed they have often examined and compared, and if they persist to value the possession it is because frequent comparisons have confirmed opinion in its favour. As among the works of nature no man can properly call a river deep or a mountain high without the knowledge of many mountains and many rivers, so, in the productions of genius, nothing can be styled excellent till it has been compared with other works of the same kind. . . .

The reverence due to writings that have long subsisted arises, therefore, not from any credulous confidence in the superior wisdom of past ages, or gloomy persuasion of the degeneracy of mankind, but is the consequence of acknowledged and indubitable positions, that what has been longest known has been most considered, and what is most considered is best understood.

In addition to differing in clausal complexity and the degree to which the reader is explicitly guided along the rhetorical path, Hansen's examples also differ in their use of verbal mood (in a broad sense). I've divided the two passages up into verb-headed phrases, and put each such unit into one of the four categories active intransitive ("the bus climbed steadily up the road"), active transitive ("we saw the red roofs"), passive ("it was cut by fences") and copula ("the country was barren"). A tabular view of the results:

            Active intransitive   Active transitive   Passive   Copula
Hemingway   16                    5                   1         11
Johnson     5                     9                   14        5

And a graphical view:

These differences in the distribution of different sorts of verb forms no doubt have other reasons besides the difference between lexis eiromene and lexis katestrammene. For example, Hemingway gives a physical description of a journey through a specific landscape, while Johnson offers an abstract discussion of the reasons for valuing older works over newer ones. In passages with other functions, the same authors would no doubt modulate their uses of different verb forms while maintaining their individual styles. Whatever the influence of content, however, I think that the goals of the periodic style will tend to promote heavier usage of passive forms, as one of several methods for keeping the structurally complex sentences "turned and guided towards an end".
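
Since the picture is easier to compare in proportions than in raw counts, here is a minimal Python sketch that converts the table above into percentages. The counts are copied from the table; the names counts and shares are invented for this illustration.

```python
# Counts copied from the table above (verb-headed phrases per category).
counts = {
    "Hemingway": {"active intransitive": 16, "active transitive": 5,
                  "passive": 1, "copula": 11},
    "Johnson": {"active intransitive": 5, "active transitive": 9,
                "passive": 14, "copula": 5},
}


def shares(row):
    """Convert raw counts to percentages of the author's total."""
    total = sum(row.values())
    return {category: round(100 * n / total, 1) for category, n in row.items()}


for author, row in counts.items():
    print(author, sum(row.values()), shares(row))
```

Both passages happen to come to 33 units, so the passive share works out to about 3% for Hemingway (1 of 33) and about 42% for Johnson (14 of 33).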

As to why the 20th century turned (with the expected hypocrisy and backsliding) towards lexis eiromene, you can find some thoughts in a couple of earlier Language Log posts:

"Modification as social anxiety" (5/16/2004)
"The evolution of disornamentation" (2/21/2005)
I can also recommend a blog post by Francis Morrone, "The Word (and World) made Flesch" (10/1/2004).

Posted by Mark Liberman at 11:51 AM

Safire's batting average against left-handed grammar questions

There was a nice post a week ago at headsup: the blog about a vacation substitute for William Safire ("Humm batter batter batter" 7/16/2006). My favorite bit:

Jack Rosenthal is president of The New York Times Company Foundation. This is his 25th year as a pinch-hitter for William Safire, who is on vacation.

This may be his 25th year of being called up for a week in the majors to fill in for Safire, but he is not a pinch-hitter. Pinch-hitting for Safire would be, oh, bringing Perlman in to face a left-handed grammar question, since Safire is about .130 lifetime on grammar questions and Perlman hits them all over the lot.

Our guest, on the other hand, looks ready to swing at most of the same stuff Bill does ...

And check out the part about "staying away in droves", hypercorrections, and Yogi Berra.

Posted by Mark Liberman at 11:17 AM

July 22, 2006

How long have we been avoiding the passive, and why?

A few days back, the Senior Writers' Lounge at Language Log Plaza was enlivened by an exchange about the passive voice in English. Poser relayed a query about where the injunction against the passive originated. Nunberg fixed on George Orwell's 1946 article "Politics and the English Language", where Orwell firmly instructs us: "Never use the passive where you can use the active." I demurred, noting that the injunction was a commonplace in college writing handbooks in the 30s and 40s (in the U.S., anyway); and now I'm ready to show some of the evidence.

So Orwell isn't the originator. But it's likely that his very influential essay brought Avoid Passive to a much larger audience than it had before; no doubt Strunk and White's equally influential Elements of Style (1st ed. 1959) helped spread the word in the U.S. Eventually, Avoid Passive becomes a central element in the ideology of English writing style.

But where DID it originate?

Fowler (1926) shows no animus against the passive, nor do the great American grammar ranters of the late 19th century, Richard Grant White and Alfred Ayres. Hall's (1917) survey of disputed usages doesn't mention Avoid Passive or anything like it. Something seems to have happened (possibly only in the U.S.) in the first two decades of the 20th century -- or maybe I'm just looking in the wrong places. Stay tuned for further developments.

In any case, in the 30s we see handbooks characterizing the passive voice as a weakness to be avoided. From Foerster & Steadman's Writing and Thinking: A Handbook of Composition and Revision (1931:380):

Weak Passive voice

84a. As a rule, avoid the passive voice.

The use of the passive voice detracts from the smoothness, interest, and emphasis of the sentence.

... The passive voice is properly used only when the agent is unknown or unimportant, or when the receiver of the action is more important than the agent.

And from Kierzek's Macmillan Handbook of English (1939:65):

Weak Passive Voice

65. Avoid the use of the passive voice whenever the active voice is more natural and direct.

The passive voice is properly used when the receiver of the action is more important than the doer of the action or the action itself.

Ah, you will have been struck by how similar these passages are, even though they come from different authors and different publishers (Houghton Mifflin and Macmillan, respectively). Such similarities are all over the place in the world of English handbooks. Writers base their texts on the ones they learned from themselves, a fact that results in some continuity over generations. Publishers want their texts to be authoritative and standard (we are, after all, talking about the "facts" about English, right?), and of course everybody reads the competition, so editors work to align their books with others, adding or revising material on their own, or farming the writing out to others (who may work for different publishers on different occasions). All of this ensures a remarkable sameness to the texts.

(It's not just handbooks of English, of course. All kinds of textbooks are shaped by the same forces. Some of you will have noticed the recent flap about American history texts, as reported, for example, in Diana Jean Schemo's "Schoolbooks Are Given F's In Originality", in the New York Times of 7/13/06, p. A1.)

Every so often, an author will perceive some infelicity in student writing which hasn't (so far as the author knows) yet been catalogued, and a new injunction will find its way into a handbook. And probably then propagate to other handbooks. This is quite likely what happened with Avoid Passive. There are writers who are fond, even overfond, of the passive; they could use a warning. A teacher or editor confronted with such a student naturally wants this warning to be couched not just as a critique of a particular sentence, but as general advice that can be applied to future writing; as a result, the advice tends to get rigidified into a blanket injunction. Which can then spread.

There are fashions in these proscriptions. Some of them, like the passionate contempt for speaker-oriented hopefully voiced in some circles, seem inexplicable to me. Others gain some power from associations with other bits of ideology, both linguistic and non-linguistic. Look at Avoid Passive in this light.

The formulations above object to the passive on two grounds. There's an esthetic objection: passives are felt to be less smooth, interesting, and natural than actives. These are judgments of linguistic taste, much like the musical judgment that Beethoven's symphonies are more interesting and moving than Mozart's (I am not espousing that judgment myself, by the way). But tastes differ, notoriously, and there are contexts in which a reasonable person might find a passive to be smoother, more interesting, and more natural than an active. Advocates of Avoid Passive are then telling writers, "We have better linguistic taste than you do; learn to think the way we do." (In fact, "Do as I say, not as I do", since advocates of Avoid Passive use passives themselves with some frequency.)

The second objection to the passive is via an appeal to metaphorical values attached to the two voices: active as energetic, strong, and emphatic, passive as inert, weak, and reserved. Now, it might well be that these metaphorical connections spring in part from taking the grammatical terminology "passive" and "active" rather too literally. But they certainly arise from the (significantly mistaken) beliefs that sentences describe actions, that the subjects of active clauses denote the agents or doers of these actions, and that the subjects of passive clauses denote the receivers or recipients of these actions. (These beliefs are part of a traditional, but seriously inadequate, conceptual framework for grammar in which syntactic concepts are, for the most part, collapsed with semantic ones.)

Next, note the unspoken (and questionable) assumption here that energetic action, strength, and emphasis (rather than reserve) are unalloyed goods. These are conventionally taken to be MASCULINE qualities, so that a bit of linguistic ideology (about the values of the active vs. the passive) plugs into a bit of non-linguistic ideology (about the values of the masculine vs. the feminine). It's possible that the valuing of the active voice over the passive, and the more general favoring of a spare and forceful writing style in the early 20th century, is connected to wider social values of the time; but I'm not enough of a cultural historian to speak to the point.

Orwell's main objection to the passive seems to have been that it is wordy, a criticism we'll soon see from another source. I've always found this objection baffling, because in real life most passive clauses are agentless (the first sentence of this posting is of a relatively rare type), so that they're generally SHORTER than their active counterparts. Consider an active clause with subject X and direct object Y: X V Y. Its agentless passive counterpart is of the form: Y be Ved. If X is n words long, the passive is shorter than the active by n-1 words (or n words in some contexts). When I wrote, above, "in the 30s we see handbooks characterizing the passive voice as a weakness" I could have saved a word by using the passive: "in the 30s we see the passive voice characterized as a weakness".
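The word-count arithmetic here is easy to check mechanically. What follows is a minimal sketch in Python, assuming that a "word" is just a whitespace-separated token; the names `word_count` and `passive_saving` are invented for illustration and come from no handbook or library.

```python
def word_count(clause: str) -> int:
    """Count whitespace-separated words in a clause."""
    return len(clause.split())


def passive_saving(n_subject_words: int, needs_be: bool = True) -> int:
    """Words saved by turning 'X V Y' into the agentless 'Y be Ved'.

    Dropping X saves n words; adding a form of 'be' costs one word,
    unless the context supplies none (as after 'we see ...').
    """
    return n_subject_words - (1 if needs_be else 0)


active = "in the 30s we see handbooks characterizing the passive voice as a weakness"
passive = "in the 30s we see the passive voice characterized as a weakness"

saved = word_count(active) - word_count(passive)
print(word_count(active), word_count(passive), saved)  # prints: 13 12 1
```

The example pair is one of the "n words in some contexts" cases: X ("handbooks") is one word long and no form of "be" is needed after "we see", so `passive_saving(1, needs_be=False)` gives the same saving of one word.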

Yes, I know, Orwell and the other wordiness critics are thinking about agentive passive clauses (with "by X" in them). But most passive clauses that might be problematic are agentless.

Something that both Foerster & Steadman and Kierzek get more or less right, but that tends to be downplayed in later advice about the passive, is that how "important" the referent of Y is plays a role in choosing the voice for a clause. Unfortunately, they don't really have the vocabulary to talk about what's going on here, so I doubt that their talk about importance is at all comprehensible to their readers.

One piece of it pretty much everybody is clear on: use the (agentless) passive when the referent of X is unknown or irrelevant, or (at your peril) when you want to conceal that information. But passives, including agentive passives, are also good when the referent of Y is topical in the discourse, old information, foregrounded, etc. Passives (in which Y appears as subject) are good for this because subjects tend to denote things that are topical in the discourse, old information, etc. There's now quite a considerable literature on the discourse functions of subjects and on related matters. What's important here is that, in the face of what's known about discourse organization, a blanket prohibition against passives whenever an active version is available is just bad advice.

But now for the really good stuff, from Jensen, Schmitz & Thoma's Modern Composition and Rhetoric (2nd ed., 1941; the 1st ed. was in 1935), which devotes two pages to the "passive style", starting on p. 437 (I've bold-faced my favorite parts, and flagged a bunch of specific points). JST start by deploring wordiness but then shift into a hymn to action:

Another kind of wordiness, the most pernicious kind of all, comes partly from laziness, partly from fear. [flags: gratuitous attributions of motives; moralizing] This we may call the "passive style" as distinguished from the "active." It is full of cumbrous [flag: a stretch for fancy vocabulary rather than plain prose] qualifications: "in general it may be said that," "under ordinary circumstances it will be found," "it is probably safe to say that." It has long and unnecessary transitions [flag: "unnecessary"; elsewhere, this handbook and others exhort writers to supply transitions]: "Now that we have seen how the machine functions, let us take a view of its advantages to social progress." Worst of all, the writer of the passive style converts his [I'm letting the masculine generics pass] verbs into abstract nouns and uses passive verbs and verbs of being. He thus robs his writing of its greatest strength: action. He takes good honest verbs like separate, develop, bewilder [flag: three not especially action-packed verbs, at least in many of their uses], make, and steals their life away by turning them into the abstract nouns separation, development, bewilderment, manufacture, or, much worse, the making of. With his verbal ideas thus abstracted, the writer of the passive style must cast about for other verbs to fill his sentences. First he looks for verbs of being. To say that a thing is or seems or becomes is almost never as good as to say it does something. He who robs his thoughts of action robs them of half their life, for life is action and readers like to think in terms of action. [flag: what, not even one small voice for reflection? or depiction?] Especially is this evident [flag: why this odd inversion?] in another characteristic of the passive style, the use of verbs in the passive voice. A passive verb shows action in reverse. 
It represents a subject [flag: well, actually, the referent of a subject; a subject is a linguistic expression] not as doing something but as being done to. Hence it too makes meaning static. That is the great defect of the passive style. It pictures for a reader life in the abstract, life without action: still-life.

He who would express his meanings vigorously and directly must choose words so that each one of them carries as much meaning as a word can bear. One critic [Marie Gilchrist in Writing Poetry (1932)] has pointed out that some words are active and expressive and that others merely mark time. She says:

Verbs and words derived from verbs are of great importance. When you say that your object does something, rather than that it is something or is like something else, you give it life and movement... [I omit Gilchrist's dithyramb to the verb, in the tradition of the 18th and 19th centuries (Aarsleff 1974)]

The worst of the passive style is that it is fatally easy to write. [flags: extravagant hyperbole; apparent implication that if it comes easy to you it's probably bad]

There's a lot more, in which passive-style variants are judged "less direct and less simple", while active-style variants are labeled "more forceful and more vigorous". JST conclude (p. 439) with a stern rebuke to those inclined to the passive style:

Some minds find it comforting to write this passive style. [flag: more attribution of motives] Some people feel that it "sounds better"; that to say a thing straight out is blunt and crude [flag: will no one speak out for subtlety?]; or that more words express more meaning [flag: very often they do; the trick is to tell when]. They are wrong. To communicate an idea clearly and forcefully is difficult at best. If one always pulls his punches [flags: gratuitous, and absurd, claim that people who use "passive-style" features do so all the time; baseless claim that writers who use these features deliberately "pull punches", avoid the more pugilistically satisfying alternatives] he weakens the force of his statements by passive constructions and needless words; he makes the job of communication more difficult than ever.

If the index is to be believed, this is ALL that JST have to say about passive clauses in the 654 pages of their book: Just Say No. Papa H. would be proud.

Actually, it turns out that Hemingway only occasionally led with a punch. Of the chapters in my 1938 Scribner's edition of the short stories, roughly half begin with stative sentences, half with what could (very) generously be labeled as sentences denoting one or more activities. One of the hard-to-classify first sentences (in "A Way You'll Never Be") even has a passive in it:

The attack had gone across the field, been held up by machine-gun fire from the sunken road and from the group of farm houses, encountered no resistance in the town, and reached the bank of the river.

[Note: Several people have suggested to me that overt opposition to passives is much less strong in the U.K. than in the U.S., and that in fact passives are more frequent in formal speaking and writing in the U.K. than in the U.S. I have no evidence about either claim (though I do have a renewed appreciation of just how hard it is to count passives).]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:11 PM

War is profane

This article in today's NYT discusses a particular issue concerning the future broadcast (on PBS, in Sept. 2007) of Ken Burns's documentary "The War" ("a soldier's-eye view of World War II") -- they may need to censor some of what the soldiers say in the documentary. PBS apparently has a new policy for content aired between 6am and 10pm, "to comply with tightened rulings on broadcast indecency by the [FCC]":

[I]t is no longer enough simply to bleep out offensive words audibly when the camera shows a full view of the speaker's mouth. From now on, the on-camera speaker's mouth must also be obscured by a digital masking process, a solution that PBS producers have called cartoonish and clumsy.

What really caught my attention, though, was this:

In addition, profanities expressed in compound words must be audibly bleeped in their entirety so that viewers cannot decipher the words. In the past, PBS required producers to bleep only the offensive part of the compound word.

I'm trying to figure out what is meant by "compound words" here. I'm sure that really common compounds like asshole and bullshit are what the policy writers have in mind, where only the ass or shit part might have been bleeped in the past ([bleep]hole, bull[bleep]), and which must now be "audibly bleeped [as opposed to inaudibly bleeped?] in their entirety".
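The difference between the old and new bleeping rules, as I understand them from the article, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny lexicon `COMPOUNDS`, the set `TABOO_PARTS`, and the function names are all hypothetical, and nothing here reflects how PBS actually implements its policy.

```python
# Hypothetical mini-lexicon: each compound is listed with its parts,
# so no guessing about where the offensive part begins or ends.
TABOO_PARTS = {"ass", "shit"}
COMPOUNDS = {
    "asshole": ("ass", "hole"),
    "bullshit": ("bull", "shit"),
}


def bleep_old(word: str) -> str:
    """Old rule: bleep only the offensive part of a compound."""
    key = word.lower()
    parts = COMPOUNDS.get(key)
    if parts is None:
        # Not a listed compound: bleep it only if it is itself taboo.
        return "[bleep]" if key in TABOO_PARTS else word
    return "".join("[bleep]" if p in TABOO_PARTS else p for p in parts)


def bleep_new(word: str) -> str:
    """New rule: bleep a profanity-containing compound in its entirety."""
    key = word.lower()
    return "[bleep]" if key in TABOO_PARTS or key in COMPOUNDS else word


for w in ("asshole", "bullshit", "clown"):
    print(w, "->", bleep_old(w), "|", bleep_new(w))
```

Looking compounds up in an explicit list, rather than searching for taboo substrings, is a deliberate choice in this sketch: a substring search would also hit innocent words that merely happen to contain the letters.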

I wonder about the intent here. The idea is supposed to be that innocent children may be watching PBS between 6am and 10pm (not if they have any control over the clicker, I would say, but that's a whole nother can of worms), and we don't want to expose those kids to profanity. However, the new rule about compounds presupposes that kids will be able to figure out asshole and bullshit from [bleep]hole and bull[bleep]. So we're assuming that they've already been exposed to the relevant language -- how else would they be able to figure it out?

(This reflects a deeper problem I have with bleep-censorship in the first place. In my view, bleeps just highlight profanities and make them mysteriously interesting to kids; furthermore, I don't think they have any really positive effect on how much kids will themselves curse or think about cursing, because every bleep they hear just adds to the tally of instances of cursing by adults. But I digress.)

I'd like to see what the PBS policy actually says about compounds and other "containers" of profanities. I assume that the policy is intended to make it harder to identify particular profanities, as noted above, but is this intention really achieved by (a) extending the policy to all compounds and (b) limiting the policy to only compounds? Let me explain what I mean.

(a) First, the "all compounds" issue. Compounding is about as productive as anything else in English, the result being that there exist many profanity-containing compounds in which the profanity won't be easily identifiable if you bleeped it on its own, simply because the compound is not as commonly known as asshole or bullshit. Some of my favorite examples of these kinds of compounds come from the character named Michael Bolton (played by David Herman) in the movie Office Space. Here's a passage including just one example, in which Michael explains why he hates his name; the compound is highlighted in boldface.

Samir: No one in this country can ever pronounce my name right. It's not that hard: Samir Na-gheen-an-a-jar. Nagheenanajar.
Michael: Yeah, well at least your name isn't Michael Bolton.
Samir: You know there's nothing wrong with that name.
Michael: There was nothing wrong with it... until I was about 12 years old and that no-talent ass clown became famous and started winning Grammys.
Samir: Hmm... well why don't you just go by Mike instead of Michael?
Michael: No way. Why should I change? He's the one who sucks.

(The no-talent part may be a separate modifier, but ass clown is definitely a compound just like asshole is -- don't be misled by the typographical space between the two parts of the compound. I would have written them together, but I just copied this from imdb.com.)

Unless you've seen Office Space like a million times (or know someone who has), I think you're highly unlikely to figure out ass clown from [bleep] clown. But should the whole compound be bleeped just in case?

(b) Now the "only compounds" issue. There are quite a number of (largely idiomatic) phrases containing profanities in English that should probably be bleeped in their entirety, if the intent of the policy is really what I think it is. Consider the following small handful of such phrases. I've bleeped out the profanity, and I'll bet most any English speaker, kids included (or maybe in particular), will be able to fill in those bleeps without a problem. (See the key below to check your answers.)

the [bleep] is gonna hit the fan
I don't give a [bleep]
[bleep] it
[bleep] you
[bleep] off
what the [bleep]?

Would these phrases need to be bleeped in their entirety? In thinking of your answer to that question, note that any exceptions granted to phrases would have to take into account phrases that are profanities but whose individual parts are not (e.g., blow me).

In thinking of all of the above, I've found it a little hard to put aside the fact that the following clause from the article is worded a little funny:

profanities expressed in compound words must be audibly bleeped in their entirety

The noun phrase subject here is "profanities expressed in compound words", the head of which is "profanities". So, the objects that "must be audibly bleeped in their entirety" are the "profanities" which are "expressed in compound words" -- not the intended reading, which is that the relevant objects are the "compound words" that contain the profanities. So, the noun phrase subject should be rephrased such that "compound words" is the head ("compound words in which profanities are expressed", or better "compound words containing profanities"). Maybe it was that way originally, but an editor got to it?

A couple more interesting quotes from the article:

Margaret Drain, the vice president for national programs at WGBH in Boston, said her station was already examining how it would probably have to edit references to sexual activities in a coming "Masterpiece Theater" production, "Casanova."

I imagine that could take a while ...

As for "The War," Ms. Drain called it "the perfect test case for the F.C.C., because who's going to take on veterans of this country who put their lives at risk for an honest, just cause?"

I hope that the "honest, just cause" bit is just the icing, not the whole cake ... I don't think veterans of wars more questionable than WWII (such as the current one, IMHO) should be any more or less vulnerable to censorship.

Key:

shit
shit, fuck, or rat's ass
fuck
fuck
fuck (also piss or sod for some speakers)
fuck

Posted by Eric Bakovic at 01:44 PM

Blogging from the seat of power

In a recent debate with other New York Times columnists (Times Talks, U.S. Politics: What's Next?, July 17, 2006), Maureen Dowd got a big laugh when she said

I don't think Bush is stupid either, but I think that they are willfully kind of blind about different cultures, and how to deal with different countries ... I mean, Paul Wolfowitz said in an interview that we shouldn't "mirror", which is ascribing our behavior and motivations to other countries, and yet they do this again and again ... You know, what just surprises me is that they don't use basic kind of common sense, I mean you could look at a movie like Mean Girls, and figure out the way these North Koreans are reacting, you know it's like high school girls with nuclear weapons, they just want some attention from us, you know? [unedited passage and audio link are below]

This is vintage MoDo, a witticism that deflates the powerful by framing them as clueless high school girls vying for status. It's too bad that Dowd isn't on radio and television more, since her languid, nasal whine is the perfect vehicle for this sort of humor.

But imagine if she were president, or even a senator. The same people who laughed and applauded would be shocked and horrified. This person is in charge of our nation's destiny, and she's still acting out her high school dramas? In one breath she tells us that we need to understand other cultures on their own terms, and in the next breath she's telling us that Kim Jong Il is just like Lindsay Lohan?

My point here is not that MoDo is self-refuting, but that we evaluate people and opinions in context. What's amusingly edgy coming from a columnist would be appalling coming from the president.

The first time that I really understood this effect, I think, was when Arno Penzias became vice president for research at Bell Labs. When I started working there, the VP for research was Bill Baker, who had spent years as science advisor to presidents from Eisenhower to Ford. At Bell Labs, Baker had helped to create a managerial culture of benign and aristocratic serenity. He greeted everyone in the halls by name -- including me, in my first week of work -- and when he came around for a lab visit, he asked intelligent and supportive questions, and complimented everyone for his or her specific contributions. (Though one of his aides later told me that in leaving one of these sessions, he might well ask under his breath "can we find a way to fire that idiot?")

Arno's style, in contrast, was typical of a regular member of technical staff, which is what he was before the work that won the Nobel Prize in 1978. He was smart and skeptical and combative. He asked for evidence, he questioned logical connections, he looked for alternative explanations. If he didn't believe you, he said so, and he said why. That's how engineers and scientists generally act, in America anyhow, and it's a good thing, because that intellectual rough-and-tumble is an important part of the process that makes ideas and inventions better.

After Arno won the Nobel Prize, he was promoted rapidly, and when he became vice president for research, he was still the same guy, under the expensive suits. But suddenly, without changing at all, Arno the smart, intellectually engaged scientist became Arno the Hun. At least, that's how many of my colleagues saw him. He would come around for a lab visit, or invite some staff members to breakfast, and interact with them pretty much the way they interacted with each other. But after the meeting, the halls would buzz for days: "Do you know what he said to X?" "No, but I hear he's going to reorganize the whole division from top to bottom." "Well, Y says he'll slice and dice Area Z first." We were used to years of amplifying the subtle signals that people like Bill Baker emitted, and Arno overloaded our receivers with painful and distorted noise until we adjusted our settings. (I get the impression that Larry Summers had some similar problems with signal-level calibration recently at Harvard.)

As weblogs and other new media become more popular, similar things are happening again and again in slightly different ways. In one class of cases, bloggers become powerful enough to be judged by the same standards that they use in judging others. A recent example was Maryscott O'Connor's complaints about goings-on in the Kosworld:

This is what happens when you crash the gates. All of a sudden, you're not just a pajama-clad kid in his parents' basement; once you've demonstrated your power and influence, people start demanding accountability and transparency.

Another class of cases arises because traditional media have started looking for what NPR ombudsman Jeffrey Dvorkin called "Truth with edge", the informal, intelligent, opinionated perspective that they see as characteristic of blogs and other new media. Or maybe it's just a matter of fashion, as Warren Kelly put it:

Once upon a time, people mocked bloggers. Yes, I know it's hard to believe, after we toppled the Rather regime and all, but it's true. Bloggers were wanna be journalists, hacks, or worse. Now, of course, many journalists are wanna be bloggers.
Then there was podcasting. People mocked podcasters, calling us wanna be DJs. They said the music we played was substandard. OR they said we were violating copyright. RIAA hates us. And now, of course, podcasting is mainstream -- just ask NPR.

In terms of my Bell Labs example, Bill Baker has decided that he needs to be more like Arno Penzias. But again, transplantation from one context to another can change everything. When you project the blogger's perspective into the seat of power, it can turn into a kind of tyranny. Suddenly the amusingly eccentric autodidact becomes Kim Jong Il, or the housepainter with a grievance becomes ... well, you know what I mean.

There was an extraordinary example of this effect earlier this year, at the leading scientific journal Nature. Or rather, at the journalistic organization that shelters under Nature's skirts. As part of a broader campaign to achieve an edgier brand of truth, news@nature.com started a regular column "To be blunt: looking for the point of seemingly pointless research", written by Helen Pearson under the pseudonym "Sybil".

The very first instance of Sybil's column, published on January 9, 2006, critiqued some research on social networks published a mere three days earlier by Duncan Watts and one of his students (Gueorgi Kossinets and Duncan J. Watts, "Empirical Analysis of an Evolving Social Network", Science, 311 88:90, 6 January 2006). The archive of Sybil's columns is only available to Nature "Premium Plus" subscribers, at a cost of $15.99 per month. If you happen to be a Premium Plus subscriber, here is the link. If not, here's a poor person's version. Even more interesting is this version, apparently an earlier draft that went out by mistake on news@nature.com's RSS feed, and was duly posted at BioEd Online.

If Sybil were an ordinary common-or-garden-variety blogger, this would have been a normal, amusing, snarky critique. Here's the start of the accidentally-released draft (the final version is not much different):

2006, I have decided, is the year that I'll make it big. I'll get a promotion. I'll be wildly popular at parties. And in order to do this I'll meet lots of people - very important people - and make them my best friends.

I was keen to get started right away. So imagine my delight when I found a study that would help me make my new contacts sitting in my e-mail inbox at the start of the week. It sounded like a large and impressive investigation; it was, after all, published in the journal Science, which is one of my very favourite reads.

The two guys behind the research, at Columbia University in New York City, decided to analyse how people make friends and interact with each other. To do this, they sifted through some 14 million e-mail messages sent by over 43,000 students and staff in a large university (an institution that they decline to name, although I have my suspicions).

The pair spent three years or so building and running fiendish computer algorithms that could analyse who had e-mailed whom and how often. They assumed that two people who exchange e-mails have some kind of relationship, be they friends or acquaintances.

I know you're holding your breath, so here's what they found: two people are more likely to strike up a relationship if they go to the same college class or have a friend in common.

Duh...

Brilliant. Genious. [sic] Three years sifting though millions of messages and that's the result? My excitement made a nosedive towards depression as I thought of the poor people who set out with such a great project only to find... the obvious.

This is pretty typical of a certain style of blogging, adapted from the traditions of British intellectual invective. Our own David Beaver is a master of the form, which he applied (for example) in an exuberantly snarky post about some work on animal communication ("And people say we monkey around", 5/18/2006). In that particular case, I thought that the research itself was pretty good (though the media coverage David satirized was mostly not), and I also thought that the larger research program was even better, as I explained in a later post ("Monkey words", 5/28/2006). I also happen to think that "Sybil" entirely missed the point of social network research, by pretending that it ought to provide a recipe for social climbing rather than an empirically-testable mathematical model of relationship formation and transmission of influence. But all's fair in the darwinian struggle of ideas, right?

Well, sometimes. The thing is, "Sybil" isn't a common-or-garden-variety blogger: she writes under the aegis of Nature, one of the world's top two scientific publications. That's blogging from the seat of power. So her contemptuous dismissal of a graduate student's first first-author paper as obvious and useless wasn't just an amusing piece of give-and-take in the intellectual agora. In context, it was the casual whim of a tyrant.

Here's the complete Maureen Dowd passage, with an audio link:

But I- I agree with you that- I don't think Bush is stupid either, but I think that they are willfully kind of um
blind about different cultures, and how to deal with different countries, which is really strange, 'cause the one lesson that was supposed to come out of Vietnam was that we would never again go into
a situation where we didn't un- try and understand the other culture, where we stumbled in so blindly, and now that seems to be what the-
what the original Foreign ((Service)) dream team is doing all over the world.
I mean, Paul Wolfowitz said in an interview that we shouldn't "mirror",
which is ascribing our behavior and motifica- mo- motivations to other countries, and yet they- they do this again and again, it never seemed to- I mean you have these
you know, thirty billion dollar agencies that should tell us what these other places are thinking culturally,
but it never seemed to occur to any of them that Saddam might just be bluffing about his weapons because it was an Arab macho thing, he had to
keep up this front, for the neighbors. I mean, they never seem to try and get into other people's heads, that the North Koreans might just want a little respect, by getting our attention,
and that if we took out Saddam, that would uh
signal to the Iranians and the North Koreans that the way to fend us off, and get our
you know, attention and respect was to have nuclear weapons, not *not* to have nuclear weapons, so
you know, what just surprises me is uh that they don't use basic kind of
common sense, I mean you could look at uh a movie like Mean Girls, and figure out the way
these uh North Koreans are reacting, you know it's like high school girls with nuclear weapons, they just want some attention from us, you know?

Posted by Mark Liberman at 12:47 PM

Rice to Middle on Sunday, U2 to Far in December?

The headline on a Reuters wire story dated Fri Jul 21, 2006 12:23pm ET: "Rice to Middle on Sunday". That's short for "Middle East", as the story makes clear. Is this an editor's slip, or a standard way of talking in some circles, or a new trend? And if some people use this shortened form outside of headlines, do they add the definite article ("Well, I'm off to (the) Middle again tomorrow")? I don't know; if you do, please tell me.

[Update -- Ben Zimmer writes:

I've never heard or seen "Middle" as elliptical for "Middle East". I've tried all sorts of clever Web searches to find another example and have come up empty. So I think the headline writer simply omitted "East" by accident.

An anonymous source observed that newswires have traditionally been rather sloppy about headlines, in the interests of quick turn-around of copy, since in the old (pre-web) days they could trust editors to fix them up before publication.

And Andreas Amman writes:

I have never seen Middle used for Middle East, not even in a headline. But I can tell you that the German equivalent "Naher Osten" (lit. 'near east') got contracted to "Nahost". I suspect it happened in compounds first, e.g. "unser Nahost-Korrespondent". I was surprised to see that "nach Nahost" gets 81,700 Ghits - not a lot less than "in den Nahen Osten" with 193,000. "Nahost" behaves like a toponym, hence no definite article in German.
As far as I can tell, "Nahost"/"Naher Osten" denotes the same as Middle East does - whenever I hear "Mittlerer Osten", I get the feeling that it was translated 1:1 from an English-speaking source.
Many thanks to you and the other residents of Language Log Plaza for keeping such a brilliant blog! We need things we can smile about to remedy the ghastly effects of all the Ü sounds we have to produce every day...

]

Posted by Mark Liberman at 08:52 AM

July 21, 2006

No Oil for a Chicken in Every Pot!

In a spirit of national reconciliation, H. Saucy at Print Culture asks readers to come up with bumper stickers that express two opposite political sentiments at once, possibly via a pun or irony. Among the entrants:

BLACK HELICOPTERS? BRING 'EM ON!
MY OTHER HUMMER IS A PRIUS
CHOOSE LIFE -- IN PRISON
IT WILL BE A GREAT DAY WHEN OUR SCHOOLS GET ALL THE MONEY THEY NEED AND THE AIR FORCE DOESN'T HAVE TO HAVE A BAKE SALE TO BUY A BOMBER
YOU CAN HAVE MY CONSTITUTIONAL RIGHT TO FREE SPEECH WHEN YOU PRY IT FROM MY COLD, DEAD HANDS.

CAPITAL PUNISHMENT STOPS A BEATING HEART

Posted by Geoff Nunberg at 10:45 PM

Sad knights who say nü

Prof. David Myers was kind enough to write back quickly, and as I suspected, the story about umlauts making Germans unhappy is another Glenn Wilson media moment. That is, a fabrication by a reporter looking to sculpt a good story out of some scraps of science and a whole pile of journalistic manure.

His note:

Hoo boy, Mark.

This story, which I see is still recycling through the media, is perhaps the most bizarre incident of my career. The journalist who originally wrote it did interview me. But I never reported doing any such research (and haven't). (The finding attributed to me was then republished by various people and media worldwide, leading to dozens of letters. . . . some from angry Germans!)

Here, from my Psychology 8th edition, is the closest thing to what's reported . . . the second bulleted point regarding the Zajonc study (but even that study doesn't contend that Germans are unhappy people because of their language).

Not for the first time, the BBC should be ashamed of itself. Not for the first time, there's no evidence of any such reaction. Perhaps if the expressions appropriate for shame were electrically imposed on their faces, a bit of the corresponding emotion would leak up into their limbic systems?

I believe that Prof. Myers' reference to "Zajonc & others" is Zajonc, Murphy and Inglehart, "Feeling and Facial Efference: Implications of the Vascular Theory of Emotion", Psychological Review, 96(3) 394:416, 1989. This turns out to be a pretty interesting paper. It seems that umlauts make you hot-headed -- literally. From the abstract:

The vascular theory of emotional efference (VTEE) holds that facial muscular movement, by its action on the cavernous sinus, may restrict venous flow and thereby influence cooling of the arterial blood supply to the brain. Subjective reactions resulting from facial action (phonetic utterance), resembling but unrelated to emotional efference, were found to differ in hedonic quality and to produce correlated changes in forehead temperature. Direct tests that introduced air into the nasal cavity revealed that cooled air was pleasurable, whereas warm air was aversive. It is conjectured that variations in cerebral temperature might influence the release and blocking of emotion-linked neurotransmitters—a consequence that would explain, in part, why some experiences are felt subjectively as pleasant and others as unpleasant.

The first of the Zajonc et al. experiments went like this:

Subjects. A total of 26 native German speakers, who were either exchange students, visiting scholars, or University of Michigan professors, and who spoke German daily, were solicited to participate in the following study. They ranged in age from 20 to 65 years. The ostensive purpose of this and of the following four studies was that they dealt with language acquisition.
Materials. Four short (approximately 200 words) stories were written in German for the purposes of Study 1. Of the four stories, two contained a high frequency of the vowel ü, whereas the remaining two contained no words at all with the vowel ü. Two sets of stories were compiled, each consisting of one ü story and one no-ü story. One set of stories involved young boys, Peter and Jürgen (in the no-ü story and in the ü story, respectively), who wished for their birthdays either dogs and cats (Hunde und Katzen) or foxes and hens (Füchse und Hühner). The second set of stories were written in the style of newspaper articles, depicting Peter Meier, who excelled in shot put (Kugelstosser), and Günter Müller, a promising young hurdler (Hürdenläufer). Within each set, every attempt was made to match the ü and no-ü stories for emotional tone and semantic content.
Procedure. Each subject was asked to read aloud one set of two stories; one half of the subjects read a ü story first followed by a no-ü story, and one half read the stories in the opposite order. To obtain an indication of cerebral temperature changes, thermographic images of subjects' faces were collected using an AGA infrared Thermographic 782M system. The argon-cooled thermographic camera takes an infrared television image that generates isotherms of surface temperature variations. Depending on the range of temperature variations investigated, different absolute degrees of resolution can be obtained. For our purposes, a range of variation was selected that generated a resolution of 0.5 °C, which was found adequate at the aggregate level. After a 10-min habituation period that stabilized facial temperature, baseline images were collected from each subject prior to reading each story. Subsequent images were taken immediately after subjects read each of the four paragraphs of a story.
After reading both stories aloud, subjects completed a questionnaire probing both their affective reactions to the stories and their recall of the information conveyed. The affect questions asked the subject to make pairwise comparisons of the two stories regarding the suitability of each story for children, their relative resemblance to a fable, the quality of the German prose, the formality of the language, which story was more interesting, and most important, which of the two was more pleasant and which the subject liked better. Free recall questions for the animal stories asked for the names of the protagonists, the color of a truck that brought the animals, which birthday was being celebrated, and what animals the children wished for and received. Similar items were asked in the sports stories.

The results? The ü-ful stories caused readers' facial temperatures to rise, while the ü-less stories didn't:

Pronouncing ü vs. o in isolation had similar results, in their Experiment 3:

Subjects. A total of 20 native German speakers and 20 Americans served as subjects. ...
Procedure. Both German and American subjects repeated aloud after a tape-recorded voice the vowel sounds ü and o 20 times each, at 3-s intervals. In the course of uttering the vowel phonemes, thermographic readings were taken before the vowel session (baseline) and after the 5th, 10th, 15th and 20th repetition of each vowel sound. Finally, subjects rated the two sounds on 7-point scales according to how pleasant, familiar, and difficult they were to produce, and how much they liked each sound.

Results? "The phoneme o had no apparent effect on forehead temperature, producing a change of only +0.02 °C. In response to the phoneme ü, however, there was a rise of +0.14 °C, F(1, 35) = 4.40, p < .04." And both Germans and Americans reported liking o somewhat better than ü:

Now we get to the good stuff, Experiment 4:

Subjects. A total of 26 male introductory psychology students at the University of Michigan participated in this study in partial fulfillment of a course requirement.
Procedure. Subjects arrived separately and were informed that the experimenter would be with them momentarily. A 10-min habituation period that stabilized facial temperature was necessary to ensure that subjects, who had been exposed to varying weather conditions, returned to a normal range of temperature. After this time elapsed, subjects were told that they would be involved in a study of language acquisition and were given a copy of the following instructions:

The first segment of this experiment requires that you repeat seven vowel sounds aloud twenty times each following a tape-recorded voice. Please try to repeat each sound as accurately as possible even though several vowels may sound unfamiliar. Because you will be photographed periodically throughout these repetitions, it is extremely important that you do not move your head. Following each series of twenty repetitions you will be asked several questions about that particular vowel sound. Do you have any questions?

Subjects then repeated aloud after a tape-recorded voice the first vowel sound 20 times, at 3-s intervals. Thermographic readings were collected, as previously, prior to the vowel session (baseline) and following the 5th, 10th, 15th and 20th repetitions. After the 20 repetitions, subjects were asked to rate the sounds on 7-point scales according to how pleasant, familiar, and difficult it was to produce, as well as how much they liked the sound and whether it put them in a good or bad mood. This procedure was repeated for the remaining six vowel sounds. To control for possible order effects, one half of the subjects were presented with the seven vowel sounds in one order (i, e, o, a, ü, ah, u) and one half in another order (i, ü, o, a, e, ah, u). Note that only the phonemes that are of focal interest, namely e and ü, change places to control for position effects. The positions of the other phonemes remained constant.

It's not made clear in the paper what these "vowel sounds" really were. The fact that "a" and "ah" are treated as separate "vowel sounds" suggests that maybe "a" means the pronunciation of the letter A, rather than IPA [a]. This would imply that the "vowel sound" written as "e" is IPA [i]. Anyhow, here's the effect of different vowels on facial temperature:

And here's a graph showing how much the speakers liked the different vowels:

And finally, the pay-off -- how the speakers said the vowels affected their mood:

This is pretty persuasive. At least, it persuades me that if you ask undergraduates how good repeating [i] 20 times makes them feel, they respond on average with a higher mood rating than if you ask them how good repeating [y] 20 times makes them feel. It worries me a bit that the evaluation is so meta -- the subjects are asked explicitly "how much they liked the sound and whether it put them in a good or bad mood", which seems quite likely to produce different results from simply checking their mood. The explicit causal tie to the (changing) vowel sounds will surely amplify the span of self-reported moods -- I tend to doubt that as a subject in this experiment, I would really become especially mournful or especially ecstatic. Rather, I imagine that I would feel mildly bored throughout, but would attempt to cooperate with the experimenter (in pursuit of my Psych 101 grade) by guessing how the various "vowel sounds" ought to be associated with moods.

Or then again, maybe I'd just be reading my mood off my facial temperature. If so, you might think that a splash of cool water might be more effective than saying [i] twenty times. And indeed, though Zajonc et al. didn't try cold water, they did test the results of warm and cold air, in an experiment that pretended to be about smells. The results:

So when the BBC posed the question "Ex-Chancellor Kohl: Glum over political woes or vowel sounds?", maybe they should have added "or inadequate air conditioning?" We've had rolling chilled-water cut-offs at Penn recently, to save energy, and no one has thought to suggest to overheated office workers that they just say "ee". (And maybe when the heat fails in the winter, as it sometimes does, we should try the umlaut option?) Somehow I doubt that either vowel will help much, though I should be careful about expressing an opinion without a scientific test.

[This post's title courtesy of Jonathan Lundell, who wrote (with respect to the quote from Izard in my earlier post):

I don't think this is as far from James's idea as Izard makes it out to be ("Although this hypothesis has been attributed to Darwin, James, Tomkins, Gellhorn, and Izard, it is clearly distinct from their ideas..."). James in PP concocts characteristically cartoonish examples, but I rather think he'd have agreed that the physical mediation of emotion needn't be as overt as striking, running or crying. (In any event, let's please have more posts that find a reason to quote both James and Wittgenstein. Post-PP and post-TLP preferred, though.)

]

Posted by Mark Liberman at 08:10 PM

The book for your linguist lover

I have come upon a book that would be the ideal birthday present for the linguist in your life who you feel already has everything, even a copy of Far From the Madding Gerund. (By the way, if you don't have a linguist in your life, you should definitely consider it. When a linguist kisses you, you stay kissed.) The book in question is quite obscure at the moment. The publisher is Battlebridge, located in London and Ahungalla. (Really, Ahungalla. It's in Sri Lanka.) Your linguist lover will not know about it yet. It is called Limits of Language: Almost Everything You Didn't Know You Didn't Know About Language and Languages, and it's by Mikael Parkvall. The ISBN is 9781903292044 in the new 13-digit format prefixed by 978 (Amazon.co.uk uses the older 10-digit form in its database, 1903292042; see this page for information about the change in ISBNs). The publisher sells it at £17.95 at the moment. It seems to be available via Amazon only in the UK and Japan, so have some pounds sterling or yen ready. The book is basically the realization of a fantasy idea I once had for a Linguist's Book of Lists (see chapter 22 of my book The Great Eskimo Vocabulary Hoax). It also has a touch of Guinness Book of World Linguistic Records about it. It is really cute, and absolutely stuffed with linguistic trivia and facts and dates and lists and ephemera and exotica (and a linguist joke or two among the fake endorsement quotes on the back). It's often funny, but also quite serious and useful in many ways. It will delight any member of our fun-loving profession. Buy it, and check it out for yourself before you gift-wrap it for your linguist lover.
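For anyone hunting the book down by number, the relationship between the two ISBN forms quoted above can be sketched in a few lines of Python. This is a minimal sketch under the standard check-digit rules (ISBN-10 uses weights 10 down to 1 modulo 11; ISBN-13 uses alternating weights 1 and 3 modulo 10); the function names are hypothetical and chosen only for illustration.

```python
def isbn10_is_valid(isbn10: str) -> bool:
    """Check an ISBN-10: weighted sum (weights 10..1) must be divisible by 11."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        return False
    # 'X' stands for 10; it is only legitimate as the final check character,
    # a restriction this simple sketch does not enforce.
    values = [10 if c in "Xx" else int(c) for c in chars]
    total = sum((10 - i) * v for i, v in enumerate(values))
    return total % 11 == 0


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13: prefix 978, then recompute the check digit."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + chars[:9]  # the old check digit is dropped
    # ISBN-13 weights alternate 1, 3, 1, 3, ... over the first 12 digits
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


print(isbn10_to_isbn13("1903292042"))
```

Run on the 10-digit form given above, the conversion reproduces the 13-digit form, which is why the last digit changes from 2 to 4 rather than simply gaining a prefix.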

Posted by Geoffrey K. Pullum at 07:26 PM

We feel sad because we say ü

This is old, but too good to pass up. Seed Magazine put it like this:

Germans can be grumpy, unpleasant people—and it's not because of post-Nazi guilt or a diet filled with bratwurst, says one American researcher. It's because of their vowels. Hope College psychology professor David Myers says saying a vowel with an umlaut forces a speaker to turn down his mouth in a frown, and may induce the sadness associated with the facial expression. Myers added that the English sounds of "e" and "ah" naturally create smile-like expressions and may induce happiness. Clearly the solution for the Germans, much like the solution for every other people in the world, is to become more like Americans. The German Embassy would not comment on the findings, saying they were "too scientific."
(source: BBC)

Ah, BBC science reporting!

Here's the original BBC story, from way back in 2000: "'Vowels to blame' for German grumpiness". I missed it at the time, due to the fact that the blogosphere (including Language Log) didn't exist. As usual, the article is an extraordinary display of faith, hope and charity, with a considerable admixture of creativity. We can all be grateful to Seed Magazine for reviving this extraordinary story from 2000 as "New and Notable" in January of 2006. (You'll note that I'm working hard on my New Year's resolution to take a positive attitude towards science reporting.)

The author of the reported study is David Myers, a social psychologist whose web site identifies him as "a communicator of psychological science to college students and the general public". According to the BBC story, Myers

... just finished a sabbatical at St Andrews University which involved using electrodes to manipulate the muscles of the face - research which, he said, bore out his theory.

So apparently the idea that umlauts make Germans grumpy was supported by the results of electrically stimulating the faces of Scottish students. (As far as I know, there is no substance to the rumors that parrot telepathy was also involved.) It seems that this research was never published by Myers himself, neither in a scientific journal nor in the popular press. At least, nothing relevant is listed in his publications list, nor could I find anything on Google Scholar. And Andrew Hammel wrote to Myers back in February to ask for a pointer to a publication, and apparently got no answer. (If anyone knows a citation, please send it to me. I'll also write to Prof. Myers myself, as this may well be another Glenn Wilson case. [Indeed it was: see Prof. Myers' response.])

The idea that the expression of emotion might actually cause the associated emotions to be felt is not an unreasonable one. In fact, William James famously argued that emotion itself is simply our perception of its bodily expression: "we feel sorry because we cry, angry because we strike, afraid because we tremble". Or at greater length:

Our natural way of thinking about these coarser emotions is that the mental perception of some fact excites the mental affection called the emotion, and that this latter state of mind gives rise to the bodily expression. My theory, on the contrary, is that the bodily changes follow directly the perception of the exciting fact, and that our feeling of the same changes as they occur IS the emotion. Common-sense says, we lose our fortune, are sorry and weep; we meet a bear, are frightened and run; we are insulted by a rival, are angry and strike. The hypothesis here to be defended says that this order of sequence is incorrect, that the one mental state is not immediately induced by the other, that the bodily manifestations must first be interposed between, and that the more rational statement is that we feel sorry because we cry, angry because we strike, afraid because we tremble, and not that we cry, strike, or tremble, because we are sorry, angry, or fearful, as the case may be. Without the bodily states following on the perception, the latter would be purely cognitive in form, pale, colorless, destitute of emotional warmth. We might then see the bear, and judge it best to run, receive the insult and deem it right to strike, but we should not actually feel afraid or angry.

A good example, in my opinion, of how it takes a really smart person to have a really stupid idea.

I love the way the BBC story ends:

A spokesperson for the German Embassy said: "We can give no comment on this as it is too scientific."

"Wovon man nicht sprechen kann, darüber muß man schweigen."

[Hat tip to Margaret Marks.]

[There's been lots of email on this topic. Mark Seidenberg wrote that Myers has written very successful psychology textbooks, and has also made a substantial donation to the Association for Psychological Science. I wrote back that I'm withholding judgment about Myers' role in the BBC story -- it seems quite likely that the original research was entirely credible, and he contributed only an off-hand remark or joke about umlauts, which the BBC's reporter spun up into an elaborate confection of pseudo-scientific bullshit (I use that word in its technical sense here).

Fernando Pereira wrote

Is that idea so stupid? Antonio Damásio claims quite a bit of experimental evidence for it in his books. It makes sense in a model in which feelings are a higher level, conscious report of states with positive or negative associations. The states, which have bodily aspects, are in the primary, reactive control loop, while the feelings engage higher level, longer term memory and executive functions.

I observed in response that the theory (as William James expresses it) seems to predict the impossibility of unexpressed emotions, which contradicts the experience of everyday life. You could tell a story about covert expression at the level of hormone levels, or synaptic chemistry, or brain rhythms, or something, but at that point the idea has lost its popular/Jamesian form completely. That's not to deny the plausible idea that there is feedback from external, macroscopic emotional expression to internal states.

Caitlin Light wrote:

That has got to be a joke. I don't see how Germans can have "grumpy vowels" when a common Berliner's goodbye is "Tschüssiii!" Besides which, Germans aren't especially grumpy, from my experience. I would bet I can out-grump any of my German roommates.

Well, I suppose that the claim would be about statistical tendencies, not individual roommates. But the BBC article certainly did not cite any evidence that German-speakers are any grumpier, on average, than speakers of other languages.

Robert Hellig wrote:

What makes me even more suspicious about this Umlaut story is the fact that English has similar sounding vowels as the German Umlaut, even if they are not written with two dots. To my amateur ears ä is the same sound as in BigMac, ö as in turn or turbulence and ü as in bureaucratic. Thus my theory would rather be that the process of writing lots of dots makes you grumpy, as the facial expressions should be the same for speakers of the two languages.

Certainly for some pairs of English and German dialects, such pairings would be pretty close. But again, there was no evidence in the BBC report that anyone involved had actually considered the relative inventory of sounds and articulations in a serious way.

And Lane Greene wrote:

I am very certain, because I remember it so clearly, that in my college introductory psychology 101 textbook was mention of an experiment, presumably a serious one, in which subjects made various vowel sounds for a length of time. They were told they were doing a phonetics experiment. But a post-vowel battery of mood tests showed that the ones that had made "eeeeee" were in the best mood. I wish I could cite, but I know even my fevered imagination didn't make this one up. I've told people the story more than once over the years.

That's certainly a good story, and it might be true. (I'm sure that Lane's memory of Psych 101 is accurate -- I mean that the textbook might have given an accurate account of an experiment, though incremental overinterpretation in such cases is not unheard of. I'll look around for the reference. [Update: more on this here.]) ]

[Eva Ciabattoni writes:

I found it interesting that the umlaut study was done by an American. I'm a native speaker of both German and English and have noticed that Americans have as much difficulty pronouncing umlauts as German speakers have pronouncing American vowels. It makes me wonder how the study was structured. The foundation of such research would necessarily be accurate pronunciation.

Eva, I'm afraid that you're being far too rational here. As far as it's possible to tell from the BBC report, there were neither any Germans nor any vowels involved in this research. Instead, subjects at St. Andrews (presumably Scottish university students) were made to contract various facial muscles by electrical stimulation, and their mood was subsequently tested. Either Prof. Myers or (much more likely) the BBC reporter added all the stuff about Germans and umlauts, as a sort of quasi-scientific free association ("so if frowning makes you somewhat more likely to feel sad even if you're forced into it by electrical stimulation of your facial muscles, maybe the same thing happens if the same muscles happen to be used in producing certain sounds, like, oh, say, maybe German umlauts?").

Eva continues

In fact, I'm sitting here right now trying out my umlauts. Umlaut-a puts a smile on my face. Umlaut-o makes me purse my lips as though I'm blowing smoke rings, but I'm not frowning. Umlaut-u makes me jut out my lower lip like an emoting chimp, but I'm still not frowning. Of course, the true test is how I feel after the exercise.
Let's see - Vaguely anxious with a growing sense of foreboding, but that began earlier when I read the latest news from the Middle East. I doubt whether it's due to the umlaut exercise I just did.

Well, Eva, I think you've probably just done more umlaut-related research than was done at St. Andrews, and I'm certain that you've looked into the matter more thoroughly than the BBC reporter did. ]

[OK, here's some more on the "say ee to be happy" idea. I checked Henry Gleitman, Alan Fridlund and Daniel Reisberg, "Psychology" (Fifth edition). They devote a paragraph to the James-Lange theory and the "facial feedback hypothesis" (p. 479), but don't mention any studies of the effects of pronouncing different vowel sounds. The most recent review I could find (Carroll E. Izard, "Facial Expressions and the Regulation of Emotions", Journal of Personality and Social Psychology, 58(3) 487:498. 1990) concludes that "external manipulation of facial expression produce[s] only a weak effect". In more detail:

Laird (1974) proposed the hypothesis that a subject-blind, experimenter-manipulated facial expression would elicit a corresponding emotion experience. Although this hypothesis has been attributed to Darwin, James, Tomkins, Gellhorn, and Izard, it is clearly distinct from their ideas of regulating naturally occurring feelings by self-management of expressive behavior or activating feelings through goal-directed, self-initiated expressions. Explanation of the effect of such self-initiated actions and those of experimenter-manipulated facial movements may require different concepts and the assumption of different mechanisms. That investigators have confused Laird's hypothesis relating to subject-blind, experimenter-manipulated expressions and self-managed or self-initiated expressions has contributed to the FFH [Facial Feedback Hypothesis -- myl] controversy. Because the research relevant to this controversy has been reviewed several times, I will present only a brief synopsis to set the stage for a reconceptualization of this research.

Three recent reviews of the approximately 20 studies relating to FFH reached divergent conclusions. Laird (1984) concluded that the evidence overwhelmingly favors FFH, supporting the notion that experimenter-manipulated facial expressions affect emotion experience. Winton (1986) reviewed the same studies and demonstrated that Laird's conclusion applies only to a weak or dimensional form of FFH. Most of the favorable evidence comes from studies that manipulated one positive and one negative facial expression, showing only that the facial-expression manipulation changes emotion experience on a positive–negative or hedonic dimension. According to Winton, Laird's conclusion that manipulated expressions of joy and anger lead to the experience of joy and anger, respectively, is unwarranted.
Winton argued that the strong or categorical form of FFH (e.g., manipulated joy expression leads to joy experience and anger expression leads to anger experience) requires that the experimenter contrast the effects of at least two positive or two negative emotions and use a dependent measure that can adequately distinguish between the experiences of these emotions. The only published study that meets these criteria (Tourangeau & Ellsworth, 1979) failed to support the categorical version of FFH. Contrary to most of the evidence, it also failed to support the dimensional version of FFH. A series of unpublished studies (described in Izard, 1977, Chaps. 3 & 4) that met Winton's criteria also failed to support the categorical version.

By a "weak or dimensional form of FFH", Izard means that experimenter-manipulated facial expressions might affect subjects' emotional states, not because of a normal association between the expressions and the emotions, but for random other reasons (e.g. because "facial movements that increased cerebral temperature through changes in vascular blood flow triggered unpleasant feelings, whereas those that decreased brain temperature activated pleasant feelings", or "because the procedure impresses subjects as an irrational and intrusive demand, and they respond emotionally ... [which] may help account for the lack of an emotion-specific response to the facial movements"). Much more on the relevant psychology is here, including description of an experiment on the emotional impact of producing different vowel sounds.]

Posted by Mark Liberman at 08:18 AM

July 20, 2006

Getting on the map

You can tell that a group has made the big time when people start complaining about them. With that in mind, I'm happy to see this cartoon from xkcd:

I hope that readers won't think I'm being defensive, but I'd call the proliferation of models and methods in computational linguistics a sign of flexibility rather than lack of definition. If everyone agrees that there's only one way to do things, you're more likely to be looking at a cult than at a viable scientific or engineering discipline. (And then the approach that everyone agrees on probably doesn't work, anyhow...)

Well, maybe I am being a little defensive. So, exemplifying the diversity of methods in computational linguistics, here are two more cartoons. The first references one of the methods of speech analysis (applied in this case to animal communication):

while the second one illustrates transformational grammar:

[Hat tip to Matthew Hutson.]

Posted by Mark Liberman at 12:40 PM

July 19, 2006

Compound Interest

I thought of my post on the political significance of object+pres. participle (O-PP) compound adjectives as a kind of throwaway -- just noting what seemed to me the obvious point that the right has more-or-less owned the trope of using these compounds to brand liberalism as a lifestyle choice. As exemplified by, oh, I don't know -- say the title of my new book Talking Right: How Conservatives Turned Liberalism into a Tax-Raising, Latte-Drinking, Sushi-Eating, Volvo-Driving, New York Times-reading, Body-Piercing, Hollywood-Loving, Left-Wing Freak Show.

As it happened, the observation set a lot of bloggers to keyboarding. Kevin Drum asked his readers if they couldn't come up with a set of (clean) O-PP compounds to do the work that these were doing for the other side, and engendered a surprising spate of responses. And several other bloggers offered evidence to suggest that the generalization was bogus, in tones of triumph that swelled in direct proportion to the absurdity of the claims they were ascribing to me. At the risk of belaboring the point, then, allow me to clarify.

For one thing, I was hardly claiming, as Trevor seems to assume, that the contemporary right had invented the O-PP compound, when it actually goes back to Shakespeare!!!! Well, yeah, I did sort of know that. Nor was I suggesting, what's only slightly less absurd, that conservatives had invented the practice of stringing compounds like these together as insults. That's what Zachary Roth seems to assume, in the course of wondering whether I have ever actually talked to a black person, while pointing out that this pattern of insults is a longstanding feature of black oral culture. (Roth cites a scene in Spike Lee's "Do the Right Thing" in which characters exchange ethnic insults like "pizza-slinging," "gold-chain wearing," "Goya bean eating," and the like.) And Jack Fenner wrote to ask if there was a connection between the right's pattern and extended adjectival phrases like the one Wesley Snipes produces in "White Men Can't Jump":

"Oh man shut your anorexic malnutrition tapeworm-having overdose on Dick Gregory Bohemian diet-drinking ass up. Leave me alone!"

These are all familiar patterns, but there's no real connection, nor need there be. The fact is that the use of these compounds as insults is neither a conservative invention nor a modern black one. Really it's so obvious that it didn't have to be invented at all. The syntactic pattern was probably all of 20 minutes old when somebody took it into his head to call somebody else a goat-fucking churl.

What I had in mind involved a specific trope rather than a whole construction. In modern political discourse, the right has made a point of redrawing the liberal-conservative distinction in terms of "bogus differences in consumer culture," as I put it. So it makes sense, I suggested, that the right should "seize on the object+participle construction, whose function is to turn activities into attributes -- politically speaking, that is, you are what you do (or more accurately, what you drive, drink, or otherwise consume)." Which is to say that compounds of the general form "[product]+[consuming]" or that name other kinds of socially charged activities (like body piercing) are going to come far more commonly from the right than the left.

In fact when you look at the compounds of this form that Drum's readers offered as insults for the right, the vast majority of them had to do either with actions by Republican politicians (lies-leaking, bribe-taking, law-breaking, health-care denying, rich-serving, Constitution-erasing, privacy-invading and the like), or with particular incidents (face-shooting, oxy-raving, Schiavo-diagnosing). Only a handful, like Hummer-driving, involved characterizing Republicans or conservative voters in social or broadly "cultural" terms. And by-and-large, the latter are far less frequent than the equivalent descriptions of liberals.

For example, Google turns up just 19 hits for "Hummer-driving" before conservative(s) or Republican(s) against 270 for "Volvo-driving" with liberal(s) or Democrat(s), and even when you throw pickups and trucks into the conservative mix, the disproportion is still overwhelming. More generally, Google estimates 54,800 hits for eating and drinking before liberal(s), the vast majority preceded by words like cheese, Granola, quiche, white wine and so forth, against only 1260 for the same words before conservative(s), as in meat-eating, and beer-drinking. (Syntactic ambiguities preclude doing the same counts with driving, watching, loving, and other verbs that can select animate objects -- "driving liberals" could be part of the larger phrases "Volvo-driving liberals" or "driving liberals insane.") And after hand-screening false hits (as when a sentence-boundary intrudes between eating and liberal, say) the comparable figures for Nexis U.S. papers are 38 to 2.
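To put those counts on a common scale, here is a minimal sketch that turns each pair into a liberal-to-conservative ratio. The figures are the ones quoted above; the dictionary labels and the function name are hypothetical, and the Google numbers are search-engine estimates, so the ratios are only as good as those estimates.

```python
# Hit counts quoted in the paragraph above, as (liberal, conservative) pairs.
# The labels are illustrative shorthand, not terms used by any search service.
hit_counts = {
    "Volvo-driving vs. Hummer-driving (Google)": (270, 19),
    "eating/drinking compounds (Google estimate)": (54800, 1260),
    "eating/drinking compounds (Nexis, hand-screened)": (38, 2),
}


def liberal_to_conservative_ratio(liberal_hits: int, conservative_hits: int) -> float:
    """Return how many liberal-targeted hits there are per conservative-targeted hit."""
    return liberal_hits / conservative_hits


for label, (lib, con) in hit_counts.items():
    ratio = liberal_to_conservative_ratio(lib, con)
    print(f"{label}: about {ratio:.1f} to 1")
```

On these figures the disproportion runs from roughly 14 to 1 for the car compounds up to more than 40 to 1 for the Google eating-and-drinking estimate, with the hand-screened Nexis counts in between at 19 to 1.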

Does that mean liberals never use these compounds to insult conservatives? Of course not. As I noted in the original post, liberals do use phrases like Bible-thumping and warmongering of conservatives -- the last maybe not the best example to have chosen, both because it doesn't involve a lifestyle characterization and because it's no longer semantically compositional ("Mongered any good wars lately?").

In that connection, Julian Sanchez notes that those compounds, like gay-bashing, crop up a number of times before conservative on the Web. (Sanchez also mentions self-serving but most linguists would consider that a different type of compound on both syntactic and semantic grounds.) Fair enough, but it isn't clear what the baseline should be here, in the absence of a specific O-PP compound on the other side. "Bible-thumping conservatives" is more common than "religion-hating liberals," for example, but conservatives usually express that idea with the phrase "godless liberals," which wins out over "Bible-thumping conservatives" by about 12-to-1 in Google Groups and 4-to-1 in Nexis. Where conservative and liberal versions of O-PP compounds are in direct competition, the one that characterizes liberals seems to be always more frequent -- on the Web, "America-hating liberals" is about 5 times as frequent as "flag-waving conservatives," for example. Draw the generalization appropriately, in short, and it holds up pretty good.

Posted by Geoff Nunberg at 10:30 PM

Cracking down on the Hezbollians

When President Bush was overheard telling Tony Blair, "What they need to do is get Syria to get Hezbollah to stop doing this shit, and it's over," everyone latched on to Bush's use of a naughty, naughty word. But a broader foreign-policy implication of Bush's comment is that the administration continues to have trouble knowing how to deal with — or even conceptualize — non-state actors like Hezbollah. The solution offered by Bush is simply to have the United Nations (presumably the antecedent for "they") exert pressure on a recognizable nation-state, Syria, which is imagined as some sort of puppeteer pulling Hezbollah's strings.

But in remarks made yesterday Bush provided some linguistic evidence that he is prepared to treat Hezbollah not just as an entity controlled by a nation-state, but as the equivalent of a nation-state — or at least a major ethnonational group worthy of a toponymic suffix. From the man who brought us Grecians, Kosovians, and East Timorians... meet the Hezbollians. Here's the official White House transcript:

Listen, Syria is trying to get back into Lebanon, it looks like to me. We passed United Nations Resolution 1559, and finally this young democracy, or this democracy became whole, by getting Syria out. And there's suspicions that the instability created by the Hezbollian attacks will cause some in Lebanon to invite Syria back in, and it's against the United Nations policy and it's against U.S. policy.

(This is a sure bet to make Jacob Weisberg's "Bushisms," since even the Wall Street Journal's Washington Wire made light of the President's neologism, under the heading "Vocabulary Lessen.")

Mark Liberman has observed that Bush's penchant for forming toponyms (or demonyms) with the -ian suffix would, in fact, be one way of regularizing a particularly confusing aspect of English morphology. But his use of Hezbollian suggests that Bush would take this linguistic reform beyond the usual suspects, to groups classified by the U.S. as terrorist organizations. So what Bushian quasi-toponym can we expect next? Hamasian? Talibanian? How about Al-Qaedian? And let's not forget Colombia's FARCians, Peru's Shining Pathians or Spain's ETAnians. Suddenly these groups don't seem so shadowy after all, made concrete and legible through the wonders of toponymic suffixation.

Posted by Benjamin Zimmer at 11:27 AM

Words of curse

Greetings from the youth and popular culture desk at Language Log Plaza. Ben Zimmer's two posts about shit brought to my fragile little mind the fifth season premiere of South Park, an episode called "It Hits the Fan". If you haven't seen it, you can get the gist of the episode here and here -- and here's the script. It kicks off with Cartman telling Kyle, Stan, and Kenny that someone will utter the word shit for the first time ever on network television, on the TV show Cop Drama. Everyone in South Park gathers around their televisions to catch the momentous event:

Mitchell: Just understand that it's my job. I still think you're a good cop.
Frank: Well, Mitchell. I guess you're gonna do what you're gonna do. Let's just try and stay friends no matter what.
Mitchell: You're right. Maybe I'll ss-see you around.
Frank: Goodbye. Oh, and Mitchell? [voice lowers to a whisper] You... got some shit on the side of your mouth right there.
Mitchell: Oh, yeah, that ol' thing, yeah.
Viewers: ... Wwooww!!!

Now that it's OK to say shit, the teachers at the school have to teach the children how to use the word appropriately: namely, that it can only be used if it doesn't refer to actual shit or shitting. As Mr. Garrison clarifies to his kindergarten class:

You can say "I have to poop and shit," or "Oh, shit, I have to poop," but not "I have to shit." Are we all clear?

And Ms. Choksondik gets a little more technical with her fourth graders:

The adjective form is now also acceptable. For example, "The weather outside is shitty." However, the literal adjective is not appropriate. For example, "My bad diarrhea made the inside of the toilet bowl shitty, and I had to clean it with a rag, which then also became shitty." That's right out!

The episode later takes a bizarre turn: people saying shit start to die by coughing their insides out. It turns out that shit is one of several "words of curse" that were originally responsible for the Black Plague, which has now revisited the world thanks to the freedom given to the word by Cop Drama. Such is the power of language, as the kids conclude towards the end of the show in their usual moral-of-the-show-clarification:

Kyle: "Curse words" -- they're called that because they are a curse. We have to go back to only using curse words in rare, extreme circumstances.
Stan: And besides, too much use of a dirty word takes away from its... impact. We believe in free speech and all that, but... keeping a few words taboo just adds to the fun of English.
Cartman: So please, everyone, from now on you've got to try and watch your language.

[ Update --

This E! Online article on the episode refers to shit as the S-Bomb, except when directly quoting a Comedy Central executive:

"What the show is saying is that 'shit' is a common word," Comedy Central executive vice president Bill Hilary tells the New York Post. "What does it mean? It means poo."

I suppose that, just as the NYT will only take shit from the president, E! Online will only take shit from entertainment execs.

-- end update ]

Posted by Eric Bakovic at 11:13 AM

Taking shit from the President

Repercussions from the sh-t heard round the world continue to be felt. Unlike most other media sources, the New York Times and the Washington Post decided not to censor President Bush's pithy solution for peace in the Middle East: "What they need to do is get Syria to get Hezbollah to stop doing this shit, and it's over." The bloggers at Gawker treated the printing of shit in the Times as a momentous event, wondering, "does this indeed mark the debut appearance of a barnyard epithet for manure in The Gray Lady?" This was enough for Slate's "Today's Blogs" feature to claim, inaccurately, "Gawker heralds the four-letter word's first ever appearance in the New York Times."

No, it's not the first time. As I mentioned in an update to my original post, the late New York Times editor Abe Rosenthal exempted presidential swearing from the newspaper's ban on shit during the Watergate era. Rosenthal's obituary in the New York Observer (quoted by a Gawker commenter) tells the story:

When a Watergate tape revealed that Richard Nixon had said, "I don't give a shit what happens, I want you all to stonewall it," The Times printed shit for the first time, though only in the text of the tape, and not in the accompanying news story.
When a Newsweek reporter called Rosenthal to ask if this was a seismic change in the paper's standards, he replied, "No. We'll only take shit from the President."

For the record, I've reproduced the fateful first shit in the New York Times, as it appears on page 20 of the July 10, 1974 paper. The context is the House Judiciary Committee's transcripts of Nixon's White House tapes, which differed in some crucial ways from the sanitized transcripts released by the White House itself. The conversation Nixon had with his aides on March 22, 1973 (full transcript here), in which Nixon asserted his "stonewalling" strategy, had been conveniently omitted until the White House was forced to give up the tapes. Nixon's "not giving a shit" was therefore damning evidence of a cover-up, and the revelation helped seal the President's fate as the Judiciary Committee drew up impeachment charges.

Indeed, when the Times carried a piece on "The Evidence for Impeachment" in the July 14 "Week in Review" section, this passage of the transcript was featured prominently. However, shit had disappeared, replaced by four dashes:

The four-dash treatment also appeared the following day in an opinion column by Anthony Lewis. Elsewhere in its Watergate coverage, the Times dropped "I don't give a shit..." entirely, instead beginning Nixon's quote with "I want you all to stonewall it." But on July 23, the Times spelled out shit one more time, in the summary of the impeachment inquiry from Judiciary Committee special counsel John Doar. So in both cases where shit appeared unexpurgated, the Times was reprinting official Congressional documents rather than featuring the word in writing by the paper's own reporters.

That wasn't the last time that shit appeared in the Times in a Nixonian context. On June 6, 1976, the Book Review dropped an S-bomb from a rather unusual source: William F. Buckley, Jr. It appears in Buckley's review of The Company, John Ehrlichman's lightly fictionalized account of the Watergate affair. In one scene with characters standing in for Lyndon B. Johnson and CIA Director Richard Helms, the LBJ character says of his successor (the Nixon stand-in), "If you think this is politics, just wait until that son of a bitch gets in here. You'll be eating his political shit for breakfast, lunch, and dinner." After quoting the passage, Buckley writes, "That has the advantage of sounding like LBJ, but is highly distracting to the ethical crochet." So Rosenthal's rule about "taking shit from the President" apparently extended to fictional Commanders in Chief too.

Posted by Benjamin Zimmer at 01:01 AM

July 18, 2006

Passive aggression

We had a great after-lunch discussion about the passive the other day in the Senior Writers' Lounge at Language Log Plaza. It started with Poser mentioning that a reader had written to him about his having mentioned the injunction against the use of the passive, and how the Declaration of Independence violates that injunction. The reader asked where the injunction might have originated.

Nunberg immediately expressed the opinion that it is a thoroughly modern fixation. More than modern, in fact: post-WWII. He credited Orwell, citing "Politics and the English language" (1946). In the course of a tirade about the alleged evil and dishonesty of political writing, Orwell wrote (apparently without irony, Nunberg noted) that in the evasive kind of writing he disapproves of, "the passive voice is wherever possible used in preference to the active". Orwell formulated an edict (which he had just violated): "Never use the passive where you can use the active."

(By the way, I should tell you that I despise that essay of Orwell's; so I was delighted to see Stanley Fish describing it in the New York Times Book Review last Sunday as "what is surely the most overrated essay in the modern canon, George Orwell's turgid, self-righteous and philosophically hopeless ‘Politics and the English language’." Fish was a bit unpleasant about the Nunberg book he is reviewing, but as far as Orwell is concerned, I say, from Fish's mouth to God's ear.)

Nunberg also made this remark about Orwell's attitude to the passive:

Orwell didn't directly connect this to the idea that the passive was "weak" or "evasive" (he seemed to object to it chiefly because it was wordy), but later usage writers depicted the passive as a way of avoiding responsibility for saying who was responsible for the action, which seemed to them typical of government or bureaucratic prose. It's also connected to the idea that good prose should be "muscular," which somebody should track down some time. Actually a lot of this rests on a kind of pun on the word 'passive', doesn't it?

Zwicky demurred. "We can't hang this one on Orwell," he said:

It's a commonplace in college handbooks in the 30s and 40s (in the U.S., anyway), and probably goes back before that. The topic comes with images of strength, muscularity, and action (that is, symbolic masculinity), and sometimes the passive voice is seen as part of a larger "passive style" (copular constructions, abstract nouns, etc.).

Zwicky also reminded us of a delicious passage in Merriam-Webster's Dictionary of English Usage about Orwell. They report the bias against the passive as a long-standing prejudice, and take Orwell as the jumping-off place for the discussion; but they note (p. 720):

Bryant 1962 reports three statistical studies of passive versus active sentences in various periodicals; the highest incidence of passive constructions was 13 percent. Orwell runs to a little over 20 percent in "Politics and the English Language."

Isn't that gorgeous? More passives in Orwell's pompous essay with the warning about how you mustn't use them than in any periodical you can lay your hands on! The man was either utterly without shame, or blithely unaware of the characteristics of his own usage, or cynically certain that we wouldn't check.

Liberman chipped in at this point with the observation that the instructions for the preparation of abstracts for the Acoustical Society of America (see them at http://asa.aip.org/honolulu/honolulu.html#34) are explicit about requiring the passive voice, at least in certain cases:

7. Use passives instead of pronouns "I" and "we," e.g., "It was noted" instead of "We noted."

He thought it might also be a requirement of the American Institute of Physics (nobody has checked that yet). Poor general scientific public, though: some sources insisting they mustn't use the passive and others insisting that they must!

This all reminded Poser of something in his (extremely nerdy) past. He said:

This seems to be an instance of over-extension of a prescription that in limited circumstances, e.g. dramatic writing, might make sense. In other circumstances, it really doesn't. Years ago I translated into English the manual describing the implementation of the mathematics library of a Hitachi computer. It said things like: "The square root function is computed using the Newton-Raphson algorithm. Three extra bits of precision are used for intermediate calculations." In Japanese these sentences were generally active, but Japanese is a language that does not require an overt subject. The natural translation into English uses the passive. Hitachi was at the time trying to improve the quality of its English publications and had created translation guidelines that included the admonition not to use the passive. Figuring out how to translate this kind of material without using the passive was quite difficult since it is hard to decide what subject to use. Is it the computer? The mathematics library? The particular function? Omitting the subject has the great virtue of avoiding this rather sticky question.

A good point, I thought. The passive construction certainly has its uses in cases of that sort.

I then reminded everyone that Strunk and White's vile little compendium of tripe about style (4th edition, 2000, p. 18) says "Use the active voice", and adds some editorializing about how the passive is "less bold, and less concise", and if you leave out the agent it becomes "indefinite". They go on with some mealy-mouthed stuff admitting that they cannot say one must never use it; but their firm prejudice against it is clear.

Now, those who know me will be able to predict that I couldn't resist grabbing a copy of the just-mentioned pathetic booklet (it was hard to find one; Poser says he threw his away) and checking on whether Strunk and White managed to get to the end of the page without accidentally using a passive themselves. And of course they didn't, the bald-faced hypocritical morons. Within just a few lines, still talking about how bad the passive is, they write:

Many a tame sentence of description or exposition can be made lively and emphatic by substituting a transitive in the active voice for some such perfunctory expression as there is or could be heard.

This, in addition to containing a passive clause (the one with be made as verb), reveals the interesting fact that they seem to think existential clauses like "There is a spider in the bathtub" are in the passive voice.

You know, I do try to stress the ignorance and inadequacy of Strunk and White as strongly as I can here on Language Log; but it never seems strong enough.

At this point Liberman came up with an idea for a further investigation. He grabbed a laptop (we keep stacks of them lying around in the Senior Writers' Lounge, like paper napkins) and did a quick count of the first 100 tensed verbs in E.B. White's introduction to Letters of E.B. White (1976). He found that 28 of them were copulas associated with adjectives or predicate nominals; 51 of them were active verbs (including quite a few not especially muscular specimens such as "felt lonely" and "came of landed gentry", where no passive counterpart exists); and 21 were passive verbs (in fairness, it should be noted that he counted "was born in Brooklyn" as a passive, which could perhaps be argued against).

This is either 21% passives (21/100) or 29% passives (21/72), depending on what you want to do about the actives that don't have a passive counterpart and the "be born" case.
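The two percentages come from choosing different denominators. A minimal sketch in Python, using only Liberman's counts as reported above (the variable and function names are mine), makes the arithmetic explicit:

```python
# Liberman's counts for the first 100 tensed verbs, as reported above.
copulas = 28   # copulas with adjectives or predicate nominals
actives = 51   # active verbs
passives = 21  # passive verbs (including "was born in Brooklyn")

total = copulas + actives + passives


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Denominator 1: every tensed verb in the sample.
share_of_all = percent(passives, total)

# Denominator 2: only the active and passive verbs, setting the copulas aside.
share_of_non_copular = percent(passives, actives + passives)

print(f"{passives}/{total} = {share_of_all}%")
print(f"{passives}/{actives + passives} = {share_of_non_copular}%")
```

Either way, the figure sits at or above the 20 percent reported for Orwell's essay.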

In most cases, Liberman observed, the passive clauses could easily have been re-phrased to make the passives into actives, but White had chosen not to do it. For example:

"This company had a factory in Harlem, where the cases for uprights, squares, and baby grands were manufactured by a crew of beer-drinking Germans, skilled artisans. The actions (keyboard, hammers, dampers, etc.) were bought from a company that specialized in that and were installed at the Horace Waters factory."

Why not say, where a crew manufactured the pianos? Why not say, the Horace Waters company bought the actions from a company that specialized in that, and installed them at the Harlem factory? I'll tell you why. Because Strunk and White aim to tell you that you mustn't use passives; it doesn't apply to them. What a shameless, pontificating, ignorant, hypocritical, incompetent, authoritarian pair of old weasels they were.

Posted by Geoffrey K. Pullum at 08:55 PM

Another Sign of the Apocalypse?

For some time I've been collecting examples of interesting pronominal anaphora in writing. Some of the examples are inept; the reader is initially led to entertain an unlikely referent for the pronoun, and sometimes it is almost impossible to shake the wrong reading; these examples are akin to the truly inept dangling modifiers that the Fellowship of the Predicative Adjunct has/have [choose according to your nationality] been collecting for some time. But other times there is no problem, given the context and real-world knowledge.

I posted some recent examples to the American Dialect Society mailing list, concluding the first of these with the observation that

The fact is that huge numbers of personal pronouns are potentially ambiguous in their reference, but this is rarely a problem. Which means that handbook advice to avoid ambiguity of reference for pronouns is remarkably unhelpful; this is tantamount to telling people to avoid pronouns, period.

Beverly Flanigan now reports that she has students at Ohio University who were taught just this (in high school, I assume). Surely the Apocalypse is upon us.

According to Flanigan, on ADS-L 7/18/06:

For the past several years I've had students who in fact tell me they were taught not to use pronouns in writing. The result is a constant repetition of nouns where pronominal substitutions would have been perfectly comprehensible. Most annoying.

Annoying? It is to weep and gnash one's teeth.

For the record, here's the case I wrote about, from The New Yorker of 7/10&17/06, p. 90, in David Denby's review of "The Devil Wears Prada" (pronouns bold-faced):

A high-minded college journalist who wants to do serious work, Andy hangs up Miranda's coat and bag every morning after she flings them down on Andy's desk; she runs and fetches, criss-crossing the city, tending to Miranda's dog, her twin daughters, her dry cleaning.

Finding a referent for a pronoun can, in principle, involve checking out (at least) the following factors: (a) the properties of the pronoun; (b) how recently mentioned possible referents were; (c) whether a possible referent was mentioned in a position structurally parallel to the one the pronoun is in; (d) whether a possible referent was mentioned via a NP in a prominent position in the sentence, especially the subject NP; (e) the salience/foregrounding/topicality of the referent in the discourse context; (f) the real-world plausibility of the referent. This is amazingly complicated stuff, and even small changes in wording can shift the likelihood of one referent over others, as in the following set:

Bush invited Putin to his ranch. [very likely: Bush's ranch]
Bush followed Putin to his office. [Bush's or Putin's office, depending on the context]
Bush followed Putin to his dacha. [very likely: Putin's dacha]

(Note that sometimes following the Avoid Pronouns "rule" produces truly bizarre results, like "Bush invited Putin to Bush's ranch.")

Most of the time we sort our way through referent finding without difficulty, relying heavily on real-world plausibility. I noticed the Andy/Miranda example only because I'm hypersensitive to pronouns (on reflection, Geoff Pullum thinks it's not very good writing; but he didn't notice anything odd the first time through).

Then one that definitely gave me pause, the very beginning of Caroline Leavitt's "Learning Mother Love" in Psychology Today, July/August 2006, p. 44:

It's a shiny bright apple of a day in San Francisco and the three of us--me, my husband, Jeff, and our one-year-old son, Max--are at a concert. He's in red corduroy overalls and a striped shirt, his hair long and golden as the day ahead of us. The concert's been going on for an hour already, and the whole time Max has been content to sit on his father's lap, enthralled by the music.

There are several ways to fix this one, but the obvious one is to junk the pronoun and repeat "Max". Occasionally that's the way to be clear. Just not ALWAYS.

Finally two that work just fine, I think, but would go differently if the surrounding wording were changed:

Soon after his 60th birthday, Beecher became a celebrity of a far less exalted kind. Theodore Tilton, his longtime friend and sometime journalistic collaborator, accused the preacher of committing adultery with his wife...
(NYT Book Review, 7/16/06, p. 10, Michael Kazin review of a biography of Henry Ward Beecher)

To see the issue, suppose that instead of "committing adultery..." the sentence went "committing sodomy...".

Relatives awaiting immigrants from Eastern Europe in 1893 discover they died at sea.
(NYT Book Review, 7/16/06, p. 28, iUniverse ad for The Golden Door by Charles B. Nam)

To see the issue, try following "discover they" with "don't know which port the ship is headed for" rather than "died at sea".

Trying to teach people how to use pronouns skillfully in writing is a very hard task, as you can see from looking at some examples like these. But telling them to Avoid Pronouns is certainly not the way to go.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:33 PM

Phony Oriental wisdom in the 12th century

[Guest post by Victor Mair]

About thirty-five years ago, I encountered for the first time the following saying (translated from Persian): “Seek knowledge even as far as China.” When I first heard this maxim, it immediately struck me as being counterfeit. Its empty sententiousness, plus the fact that I was unable to make sense of it historically, made me feel that it could not be genuine.

Since I have been working on forged “Chinese” proverbs lately, I now begin to feel that “Seek knowledge even as far as China” is comparable to such pseudo-profundities as the infamous “May you live in interesting times.” Although we are here dealing with an ersatz Persian aphorism, still it attributes inscrutable Oriental wisdom to China. As a Sinologist, I am compelled to respond whenever a smokescreen is put up around the object of my researches. Consequently, I decided to do a little investigation about the origins of this suspicious saying: “Seek knowledge even as far as China.”

It just so happens that I’m currently writing a paper on Eurasian avian colloquies, and one of the most celebrated of these is Farid ud-Din Attar’s “The Conference of the Birds.” Lo and behold, “Seek knowledge even as far as China” not only occurs in this famous Sufi poem, but it is provided with a most dramatic context.

The beginning of the affair of the Símurgh, ah the wonder!
In all His glory He flew over China one midnight;

Into the middle of China from Him a feather fell;
As a result every province was filled with tumult.

Everyone limned a tracing of this feather.
All who saw that drawing were much affected.

The feather is now in China’s Art Gallery;
“Seek knowledge even as far as China” is because of this.

[Farid ud-Din Attar, “The Conference of the Birds,” ll. 735-738 (Avery, tr. 1998: 69)]

This widely quoted, but poorly attested, maxim belongs to the class of Islamic wisdom called hadith, that is, a report of the sayings or actions of Mohammed or his companions, together with the tradition of its chain of transmission. Much serious scholarship has been expended on authenticating and interpreting the large body of extant hadith, the bulk of which have been gathered in ten or so major collections. Six of these, all compiled in the third century of the Islamic era, are considered to be the most authentic, and constitute an important source of legal injunctions for orthodox Muslims. Unfortunately, despite its notoriety, the hadith about the sensational midnight overflight of China by the Simurgh is not among the early, well-documented collections. In fact, several authorities consider it forged, while even those who support it consider it to be of only “fair” or “weak” authenticity. See <http://www.sunnah.org/sources/hadith_utlub_ilm.htm>. My own conclusion is that the fraudulent Orientalist mischief-makers were hard at work already in 12th-century Persia, when Attar wrote “The Conference of the Birds.” Whenever “Chinese wisdom” is involved, caveat emptor!

Incidentally, for those who are not familiar with the Simurgh, it is a mythological bird somewhat similar to the Indian garuda or the Arabic rukh (“roc”). There is a (false) folk etymology: si (“thirty”) + murgh (“bird”). The true etymology of the word reveals, however, that it has deep roots in Iranian (Persian simurgh < Middle Persian senmurv, akin to Avestan meregho saeno < meregha- [“bird”] saena- [“eagle”], the latter element being a close cognate of Sanskrit syenah [“a hawk, falcon, eagle; bird of prey”]). (Forgive me for simplifying the phonological representations in this e-mail message.) “Simurgh” is often translated as “phoenix,” but the mythological creature it signifies is sufficiently unique and important in world culture to warrant retention of the original name in Persian.

Posted by Mark Liberman at 07:33 PM

Phony Orientalism in the 20th Century

[Guest post by Victor Mair]

In the latest issue of the National Geographic (August 2006: 136-149), there is an article entitled “Ants: The Civilized Insect,” by the eminent Harvard myrmecologist, Edward O. Wilson, with spectacular photographs by Mark W. Moffett, who earned a Ph.D. studying army ants under Professor Wilson. The article is prefaced by the following paragraph:

“In Japanese the word ‘ant’ is intricately written by linking two characters: one meaning ‘insect,’ the other meaning ‘loyalty.’ Altruistic and cooperative toward one another, nestmates readily go to war to preserve their colony. Renowned biologist and lifelong ant observer Edward O. Wilson introduces our new occasional series on these highly social creatures.”

Unfortunately, this paragraph is so fraught with errors as to be completely useless and potentially very damaging. However, since it will probably be seen by at least 25,000,000 readers and perpetuates serious linguistic misconceptions, I consider it my duty to point them out for all who are willing to listen.

In the first place, the Japanese *word* for ant is ARI, and it has absolutely nothing to do with “loyalty.”

Second, the author of this sorry paragraph has confused writing with language, since he/she is talking about the Chinese **character** used to write the Japanese **word**, not the word itself.

Third, he/she mentions “two characters,” but the Japanese word for ant is written with a single character [蟻].

Fourth, the KANJI used for writing ARI does have two main components, namely, the 6-stroke radical (or semantic indicator) on the left and the 13-stroke phonophore on the right. The phonophore can be further broken down into components meaning “goat / sheep” on the top and “I / me / we” on the bottom.

Fifth, the radical on the left does convey the idea of “bug” or “insect,” but the phonophore on the right – while it does mean “righteous[ness]” or “just[ice]” when it stands alone; pronounced YI4 nowadays – only serves to indicate the sound of the character. How do we know that this is so? Well, although the early Taoist thinker Zhuang Zi did write the Chinese word for ant pronounced YI3 (that’s the Modern Standard Mandarin [MSM] pronunciation; in his day it would have been pronounced something like *ngia or *sngje) with the character now used to write the Japanese word ARI, at around the same time (late 4th-early 3rd c. BC), the poet Qu Yuan wrote the same word for ant with another graph that is also pronounced YI3 (in MSM; in his day it would have been pronounced something like *ngier [e = schwa] or *sngjei), but whose phonophore means something else entirely (it is the particle for asking rhetorical questions that is now pronounced QI3 in MSM). From the fact that both of these graphic variants for the same morpheme still survive to this day, it is clear that their phonophores are meant merely to convey the sound of the morpheme in question.

Sixth, to assert that this character used to write MSM YI3 and Japanese ARI means “loyal bug” is to succumb to the seduction of an erroneous folk etymology. Unfortunately, all cultures that have used or still use the Chinese characters to write their languages are plagued by countless instances of such pseudo-etymology, but one would hope that the National Geographic fact checkers would have asked a linguist about this obviously overly romantic explanation before they ran with it to the world.

Seventh, by the way, the MSM word for ant, like the Japanese word ARI, is also bisyllabic. Around the Tang Dynasty (more than a millennium after the time of Zhuang Zi and Qu Yuan), there was an expression MA3YI3 meaning “big ant,” where the MA3 literally meant “horse” but was used as a prefix for various members of the animal kingdom to signify that they were bigger than average specimens or species. Gradually, in line with the general polysyllabicization of vernacular (especially northern) Sinitic, MA3YI3 displaced YI3 as the word for ants in general, not just big ones. Half a millennium or so later, during the Ming period, people got around to adding a bug radical to the horse, so that both of the characters used to write the bisyllabic word for ant (MA3YI3) now possessed the appropriate radical. (The same sort of story could be told of thousands of items in the modern vocabulary of Sinitic languages.) I should note, however, that many southern Sinitic languages still have a monosyllabic word for ant. For example, in Cantonese, it is NGAI, displaying not only its morphological conservatism, but also its phonological conservatism.

In short, I have gone to this length to explain why the Japanese word for ant is **not** “intricately written by linking two characters” because it is my ardent wish that people who do not understand how the HANZI / KANJI / HANJA work will stop telling fairy tales about them.

[Guest post by Victor Mair]

Posted by Mark Liberman at 07:15 PM

Drawing the line

Consider porn magazines, a species of utilitarian literature whose purpose is to provide descriptions and images that will bring its (mostly male) readers to climax. These days, pretty much anything goes -- linguistically and visually -- inside the covers of these publications. But the covers themselves present a challenge: they should be as enticing as possible (so that people will buy the magazines), but they also have to steer clear of illegality as to what words and images can be publicly visible. So you get avoidance, in this very unlikely place. In combination with pushing as close to the line as possible.

Taboo and taboo avoidance are always hot topics here at Language Log Plaza, and then on Saturday John Baker posted to the American Dialect Society list with a link to a paper (in .pdf format) by Christopher Fairman, a law professor at the Ohio State University, about the legal status of an English taboo word, and I was moved to report here on my observations about one type of umliterature, porn magazines meant for gay men. ("Umliterature" is a term suggested by Larry Horn on ADS-L in June 2005, as a reanalyzed version of "um, literature", as in "I was paging avidly through some, um, literature in bed last night." The model is "umfriend", which you can check out on Google; both um- words convey some dubiousness about the appropriateness of the base word, having to do in some way with sexual activity. So nice to see a new prefix being born. I've decided my XXX-rated comic homoerotic collages should be referred to as "umart" for short. Ok, "um-art", to avoid the parsing "u-mart".)

Oh, yes, Fairman's article (74 pages, 409 footnotes) is titled, with stunning directness and simplicity, "Fuck". Fairman believes that the word should be freely used. The legal history in the U.S. is remarkably convoluted, however.

On to the gay porn mags, mostly Torso and Honcho (which I happen to have, um, on hand). The no-no words for the covers are fuck, of course, plus cocksucker and cocksucking; the covers seem also to avoid cock 'penis' and asshole. The no-no images are of penises, testicles, and anuses. Otherwise, you can get right up to the line.

Visually, buttocks are fine, as is a certain amount of pubic hair, plus erections visible through clothing. Full frontal nudity can appear on a cover, so long as the model's equipment is concealed behind a teaser for one of the stories or photo spreads inside. Couples can even be pictured in positions that are unmistakably part of one of the sex acts that can't be directly named, so long as their naughty bits aren't shown.

Linguistically, fuck and cocksucker are often fixed with asterisks, just the minimum one each:

CARLO MASI Talks About Working Out, Sex With Blonds, And On-Camera F*cking (Torso 8/06)
Antonio & Martin: Sex Junkie F*cked Raw (Honcho 9/06)

C*cksucker's Double Dose (Honcho 5/06)
C*cksucker Gang-Banged (Honcho 4/06)
Gas Jockey Initiates C*cksucker (Honcho 1/06)
C*cksucker Services College Jocks (Honcho 2/06)

Though occasionally the editing goes awry:

"My Straight Roommate Watched Me Get F*ucked!" (Torso 4/06)

For fuck, there are many tamer alternatives, often metaphorical and often combined with rhyme or alliteration:

Sex Bully Pumps Rump (Honcho 5/06)
College Jock Gets Pumped (Torso 3/06)
Horn-Dog Twinks in Rump-Roasting Romp ([2] 3-4/06)
Nailing the Perfect Buns (Honcho 4/06)
Straight Jock Bones Gay Bud (Torso 8/06)
Coach Ganged By 3 (Honcho 7/06)
The Gang-Bang: Ploughed By 5 And Loves It! (Honcho 9/06)
Marines Share Spunky A-Hole (Honcho 9/06)

(Note punning in spunky.) Similarly for the description of cocksucking (note that suck seems to be ok on its own):

Sucking Straight Guys (Honcho 6/06)
Goin' Down On Hitchers (Honcho 7/06)
Blowjobs Anonymous (Honcho 5/06)
Jarhead Blow Buddies (Torso 4/06)
Baseball Jock Scores BJ (2/06)
A Nose for Hose (Honcho 5/06)
Sports Trainer Licks Sticks (Torso 4/06)
Servicing Straight Meat (Honcho 2/06)
Eat My Wad! (Honcho 9/06)

As for cock 'penis', I haven't seen it. It can be punned on:

Cocksure Stud's Throat-Stretching Exercises (Torso 4/06)

And dick seems to be ok:

Swallowing Surfer Dick (Honcho 5/06)

But otherwise we get stick, meat, hose, etc. (as above) or references to erect penises via boner or hard-on:

Straight Bait's Butt-Hungry Boner (Honcho 7/06)
Straight Guy's Bi Boner (Torso 3/06)
Tuff Punk's Smart-Ass Boner (Honcho 1/06)
Hard-on vs. Hard-on (Torso 3/06)

I also haven't seen asshole. A-hole works for asteriskless avoidance:

Marines Share Spunky A-Hole (Honcho 9/06)

Ass gets by, not only in puns like smart-ass (above), but also in contexts where the writer might try to claim that the reference is to the buttocks as a whole, rather than specifically to the anus (what we don't seem to get is anything like "Frat Boy Takes It Up the Ass"):

Gay Soldier's Sore Ass (Honcho 2/06)
Het Hunk's Gay Ass (Honcho 11/05)

Finally, there are rump, buns, butt, etc. (already illustrated), where in these sexual contexts a word for the buttocks is often taken to refer to its central feature.

That's a sampling from about a year's worth of these porn mags. Inside, they show and say pretty much anything. On the covers, they show as much of the merchandise as they can get away with and talk as dirty as they can get away with, saying, in effect: open me up, and we can go all the way together.

[Two notes, one on form and one on content.

On form. You'll notice how much language play there is in these teasers: puns, metaphors, rhyme, alliteration, even assonance in "Tuff Punk". I can't imagine that the readership of Torso and Honcho and their brother jerk-off mags includes an unusually large number of men appreciative of language play. Instead, I'm guessing that the writers of the teasers adopt a style also found in headlines for feature stories (we've several times had occasion to remark here on these practices in science writing), and for a similar reason: to catch the readers' attention and hold their interest, when a plain description might be bland and uncompelling. The content of most of the pieces in these magazines pretty much reduces to Guy Fucks Guy or Guy Sucks Guy, so we need details and eye-catching language.

On content. It's hard not to notice how much gay-guy-servicing-straight-guy there is in these magazines. This is a fact mostly about Torso and Honcho, which have something of a thing for straight guys, as does some portion of their readership. But I have friends who wouldn't touch these particular magazines, because they find the passion for straight guys so icky. To each his own. Fortunately, I don't have to analyze these matters any further here, since this is Language Log, not Psychology of Sex Log.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:32 PM

Pixie dust for digital divination: an open letter to Gary Flake

Yesterday I heard an interesting talk by Gary Flake here at the "Microsoft Research Faculty Summit". His title was "How I Learned to Stop Worrying and Love the Imminent Internet Singularity", and the subtitle was "Why right now is the best time in the history of the universe to be a computer scientist". You can read his slides by clicking on the link behind his title. I enjoyed the talk, including what he said at the end about his plans for Microsoft Live Labs, and I especially liked the idea that he described (on slide #34) as "Use extra resources as pixie dust". He was talking about trying to do difficult Microsoft-internal magic, like connecting research with products, but it made me think about a different hard problem, also much discussed yesterday: the "dramatic decline ... in the number of CS and IT graduates", and the "impending loss of national competitiveness". This inspired me to express, in the form of an open letter to Gary Flake, some ideas that have been circulating for a few years among computational linguists.

The world is writing itself down on the web, and those who can log and index and search and count can read it. At least, they can read it if they're literate in the mathematical languages of digital divination, and able to conjure up the right algorithmic spirits.

Search engines offer everyone a peek at the possibilities, although most people don't have the right data, or the right tools, or the right skills, to take more than a few stumbling steps down this road. Nobody has traveled more than the first few miles. But every bright kid sees where the road is going, and wants to get there. This is a fantastic research opportunity, and an even more spectacular educational opportunity.

Three things are missing: data, tools and knowledge. Grabbing an interesting chunk of the web is a significant chore, and data like query logs are only available to those with popular search engines. It's a bigger chore to create an environment where skilled divinators can easily ask -- and quickly answer -- interesting questions about the mathematical entrails of a web snapshot. And the biggest barrier for would-be researchers is acquiring a practical grasp of the divinatory arts: not just programming, but statistics, information theory, formal language theory, graph theory, complexity theory and more.

But suppose you collected some big chunks of net-stuff and connected them to the right kind of search environment -- clever indices for some things, general map-reduce analysis for others. Add a collection of interesting, accessible howtos and sample scripts for exciting problems, and throw open the gates. Academic researchers in fields like computational linguistics would rush in and start playing. Within a couple of years, this stuff would be featured in courses from high school to graduate school, in areas from linguistics and psychology to statistics and computer science. Individual students would start winning science-fair competitions with projects from this world.

The key that opens the gate is human language technology, because most of the web's meaning is in its words. You'd have to build a lot of HLT tools into the environment, and you'd have to teach people how to use them. But the ultimate object of study is not language -- though you'd certainly learn a lot about the world's languages along the way -- it's the world that language explores, describes and creates.

You could have a portable mini-version that anyone with a spare terabyte could run; a bigger system requiring a modest cluster; a huge system, hosted on somebody's servers, offering limited capabilities to the general public; a huge system, offering spectacular resources of space and time, but limited to those (students and faculty) who are invited in. There could be specialized versions: the USPTO archives; the biomedical literature; the Enron files; the Congressional Record; blogs and web forums; the complete archives of the U.S. Supreme Court.

You'd attract hundreds of thousands of students into computational science and engineering; you'd help advance research in several fields; and you'd win a lot of friends for your company.

Posted by Mark Liberman at 10:25 AM

July 17, 2006

Presidential expletive watch

You'd think President Bush might have learned his lesson back in 2000, when a live microphone picked up his rude comment to Dick Cheney, calling New York Times reporter Adam Clymer a "major-league asshole." Now once again he's been caught swearing, unaware of a live mike in front of him. Bush was speaking privately with Tony Blair during a lunch at the Group of Eight summit in St. Petersburg, and the topic of discussion was the current violence in the Middle East. Here's how the conversation with Blair was transcribed by Reuters:

Bush: I think Condi (Secretary of State Condoleezza Rice) is going to go (to the Middle East) pretty soon.
Blair: Right, that's all that matters, it will take some time to get that together. ... See, if she (Rice) goes out she's got to succeed as it were, where as I can just go out and talk.
Bush: See, the irony is what they need to do is get Syria to get Hizbollah to stop doing this shit and it's over.

You can watch the video of the exchange as it appeared on BBC News here, or in another BBC report (with commentary from Baria Alamuddin, Foreign Editor of the Arabic newspaper Al Hayat) here.

Even though Reuters and the BBC have no qualms about reproducing the word shit, American media outlets are once again trying to figure out how to report on an expletive without actually saying or writing it. (Even the foreign news organizations are being careful: the Reuters report informs readers right away that there is "strong language in paragraphs one and eight," and the online BBC news report similarly warns, "This clip contains strong language.") So far, the Associated Press (which once managed to report on a poll it conducted on obscenities without actually mentioning any) has expurgated Bush's comments in at least two different ways:

Version #1 (via CNN, ABC News, etc.):
"See, the irony is what they really need to do is to get Syria to get Hezbollah to stop doing this (expletive)."

Version #2 (via CBS News, Houston Chronicle, etc.):
"See the irony is that what they need to do is get Syria to get Hezbollah to stop doing this s--- and it's over."

To be fair, some international news outlets have also bleeped Bush: the Australian Associated Press uses "s**t," while the Guardian follows the AP and uses "s---." But the most creative expurgation I've seen has been on the CNN website. If you click on the link in the AP article to "watch raw footage of the Bush, Blair exchange," you get a pop-up window with the clever headline, "The sh_t heard round the world." Give that copywriter a raise!

I expected to hear a bleep concealing the word shit in the accompanying CNN video, but it turns out that they chose not to censor Bush (and even provide helpful captioning for the exchange). I'm sure CNN wouldn't dare run that video unbleeped over the air [*] — even though they're a cable station and not (yet!) subject to the draconian fines that the FCC now imposes on broadcast television for even accidentally aired obscenities. For instance, affiliates of FOX may face fines of up to $325,000 because the network's microphones picked up a conversation during a NASCAR race between crew chief Kevin Manion and his driver Martin Truex, in which Manion called the team's car "a piece of shit." The Manion incident led the American Family Association to issue an action alert calling for the FCC to punish local FOX affiliates for airing the obscenity. I wonder if the AFA would make the same call to arms if it was our president uttering the S-word?

(See links here for related posts, including one on Nixon's deployment of the S-bomb.)

[* Update #1: I was wrong — CNN aired the video unbleeped, complete with captioning. TVNewser has the clip.]

[Update #2: USA Today reports that CNN stood alone in airing the video unexpurgated:

CNN broadcast and posted unedited video. The New York Times and The Washington Post reported the word in Web stories. On CBS, NBC, ABC, Fox News, MSNBC and USA TODAY, the word was excised in videos and Web stories (though an audio clip with a warning at USA TODAY included it). The Times and the Post said they'd publish the word today; USA TODAY will not.
CNN spokeswoman Christa Robinson says: "The word is not one we would normally air on CNN, but when said by the president in this context, we thought it was appropriate.
"The expletive ... was reflective of how these two world leaders talk with each other."
But David McCormick, NBC News standards chief, says NBC was able to "communicate the spirit of the conversation without actually broadcasting the word."
But Bob Steele, who teaches journalism ethics at the Poynter Institute, says news organizations were right to use the word.
"You don't have to spend very long in a barber or beauty shop or on the golf course or in the locker room to hear this word," he says. "It may be tough on some people's ears, but this word is not one that is high on the scale of offensiveness."

I also received an email from the foremost chronicler of obscene language (and the prudery surrounding it): Reinhold "Rey" Aman, editor of Maledicta: The International Journal of Verbal Aggression. A constant source of amusement for Maledicta is the squeamishness of news organizations in printing "bad words" — see, for instance, the item here about how newspapers bowdlerized Tom DeLay's use of the word chickenshit in 1997. Aman's exposure of news editors' weaseling ways was the subject of a 1992 article in the American Journalism Review. The late New York Times editor Abe Rosenthal is quoted in the AJR article as saying "We'll take 'shit' from the president, but nobody else." At least his paper has stuck by this dictum by printing Bush's comments uncensored.]

[Update #3: Wonkette notes that the Washington Post went ahead with uncensored shit, yet for some reason pulled its punches recalling the Adam Clymer asshole incident, saying only that Bush "called the journalist a 'major-league ...' well, jerk."]

[Final update: More on the New York Times' use of shit here.]

Posted by Benjamin Zimmer at 03:53 PM

Clunky or subtle?

In the New York Times Book Review yesterday, Brad Leithauser reviews Seamus Heaney's latest book of poetry, District and Circle, using Heaney's "rough-hewn, hand-honed" rhyming practices as a starting-off point. Leithauser characterizes these rhymes as "dissonances", "jagged, irregular pairings", harmonies that chime "clunkily" rather than "cleanly" and "have grown harsher over time". But he's not slamming Heaney's poetry, only noting that by such touches "a poet fabricates an individual, distinguishing music"; Heaney's half rhymes produce, for Leithauser, a tough, even raw, music. My own view of off rhymes (first articulated in 1976) is that they can show great artistry and subtlety. I don't think that on their own they strike listeners as "dissonant" -- often, quite the opposite. The effect they convey depends on what other poetic devices they're combined with.

Leithauser's commentary:

I sometimes think there's no more reliable way of initially entering a poet's private domain than by examining what he or she rhymes with what. Certainly, the abbreviated signature of a good many poets could be read by assembling a sample list of the end-words of their lines. George Herbert, Lord Byron, Emily Dickinson, Marianne Moore, James Merrill -- in many cases a savvy reader could, with all the quiet exultation of a code-breaking cryptographer, identify the author purely through paired rhyme-words, independent of what the poem was actually about.

Add to that company the Irish poet Seamus Heaney, Nobel laureate of 1995, whose rhymes are rough-hewn, hand-honed. Dungarees and rosaries? Whops and footsteps? Joys and tallboy? We're in Heaney country. His dissonances aren't for every poet; you might even say they're not for the younger Heaney, whose harmonies have grown harsher over time. W. H. Auden once promised his readers he'd never again rhyme an "s" sound and a "z" sound, however concordant they might look on the page (dose, rose). Similarly, late in life, Elizabeth Bishop explained to her students that although she'd once rhymed plural and singular (chests, rest), she planned never to do so again. You'll find both sorts of rhymes, as well as various jagged, irregular pairings less easy to characterize, in Heaney's new collection, "District and Circle."

What does it matter? Why should we care whether two words chime cleanly or clunkily? The issue can seem picayune -- until you recognize that it's through just such tiny touches, such minimal modifications of sound, that a poet fabricates an individual, distinguishing music.

As it happens, those two sorts of rhymes are the ones I focus on in my 1976 paper on half rhymes in rock music ("Well this rock and roll has got to stop. Junior's head is hard as a rock." in Chicago Linguistic Society 12.676-97). In the first, "feature rhyme", segments that are not identical are treated as matching for purposes of rhyme; the segments in question can be vowels (hell ~ will) or consonants (dose ~ rose, stop ~ rock). In the second, "subsequence rhyme", a truncated consonant cluster counts as matching the full cluster (pen ~ mend, rest ~ chests).

A crucial fact is that almost all the half rhymes in my data were simple instances of one or the other of these types, and that certain matches were hugely more frequent than others, to the extent that half-rhyme matching can be taken as indicating SIMILARITY IN SOUND, an idea that has been pursued by later researchers (Donca Steriade in particular), examining a wide variety of data, from many languages, and from both "art" poetry and "popular" poetry (Japanese rap lyrics, for example). There are other indirect reflections of similarity in sound -- the sounds involved in phonologically based slips of the tongue, those that are confused in mishearings, and those that match in "imperfect puns" -- but it now seems very clear that the practice of poets who use half rhymes is not a matter of failing to reach some target of "perfect rhyme" (as would be suggested by this terminology) but rather of aiming for a different target, that of word parts that "sound alike", without necessarily being identical.

It looks like Leithauser is judging Heaney's rhymes as crude because he thinks Heaney is choosing not to pick full rhymes and is settling (almost surely deliberately) for second-best. I'm sure that's not what's going on in the great lyrics of John Lennon and Bob Dylan, which are heavy with half rhymes, and I suspect that's not what's going on with Heaney's poems either (though I haven't read the newest ones yet). In any case, if Leithauser thinks half rhymes intrinsically convey some kind of artless rawness, he's just wrong. Maybe he should brush up on his Hopkins and his Yeats.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:44 PM

Not a Slip of the Tongue

Today's New York Times contains an article about the widespread criticism of remarks that Alaska Senator Ted Stevens made in opposition to a bill that would enforce net neutrality. The article is entitled Senator's Slip of the Tongue Keeps on Truckin' Over the Web. That makes it sound like he made a speech production error and chose the wrong word or mangled his syntax. That isn't what happened.

Here is a somewhat compressed version of what Stevens said that has got so many people riled up. You can download an audio recording of the whole thing here.

I just the other day got, an internet was sent by my staff at 10 o'clock in the morning on Friday and I just got it yesterday. Why? Because it got tangled up with all these things going on the internet commercially... They want to deliver vast amounts of information over the internet. And again, the internet is not something you just dump something on. It's not a truck. It's a series of tubes. And if you don't understand those tubes can be filled and if they are filled, when you put your message in, it gets in line and its going to be delayed by anyone that puts into that tube enormous amounts of material, enormous amounts of material.

It's true that he puts things in a way that many people find funny, and saying "an internet" rather than "an email" might be a slip of the tongue, but that isn't the focus of the criticism. The criticism is about his lack of understanding of how computer networks work and in particular his almost certainly false belief that the reason it took five days for an email from his staff to reach him was that it was caught up in a traffic jam caused by heavy commercial traffic. (There's a hilarious mock forensic analysis of the email here.) Just search for "Ted Stevens" and terms like "internet", "truck", and "tube" and you'll get oodles of examples. Among the ones I've read are: this, this, this, and this. The very clear theme is that Stevens' speech shows that he doesn't understand what he is talking about, which is unfortunate since he is the Chair of the Senate Committee on Commerce, Science, and Transportation and as such has considerable influence on legislation affecting net neutrality and other telecommunication issues.

This might be another instance of linguification, but in this case I have to give the headline author the benefit of the doubt. The article does not give a clear explanation of what the controversy is about, so it may well be the fault of the reporter or the article editor that the author of the headline didn't have enough to go on to write an accurate headline.

Posted by Bill Poser at 03:15 PM

Is "language modeling" linguistics?

In response to my post about the relative public awareness of linguistics and other disciplines ("Psychology ≅ 10-100 x Linguistics?"), Vili Maunula of Lingformant wrote to point out another way to get data on this: Google Trends. You can get a comparative display -- what it shows is qualitatively consistent with the quantitative estimates that I developed earlier ("psychology" as a search term in red, "linguistics" in blue):

Meanwhile, here at the "Microsoft Research Faculty Summit 2006", I'm looking forward to Ken Church's update to his 2005 ACL presentation (Kenneth Church and Bo Thiesson, "The Wild Thing" ACL 2005). Ken, who has always had a flair for public relations, is using the catchy title "Wild Thing Goes Mobile and Local" (well, the competition is titles like "Recent Progress on Sensor Networks and Embedded Computing", or "Bioinformatics on the Windows Platform"):

Typing is a pain, especially on a phone. Suppose you want to search for “Condoleezza Rice,” but you don’t know how to spell her name. And even if you did, you wouldn’t want to type that much, especially on a phone. With the Wild Thing, the user types “c rice” or “2#7423” on the phone. This pattern is short hand for the regular expression: /c.* rice.*/ or /[abc].* [pqrs][ghi][abc][def].*/ on the phone. The system uses a language model, based on MSN query logs, to find the k-best (that is, most popular) expansions of the regular expression. To find hot stuff, or your favorites, you shouldn’t have to type a lot, especially if we know your location. The Wild Thing raises some interesting — and fun — technical challenges for language modeling research. What is the probability of all queries in all locations? MSN has lots of data, but they haven’t seen every query in every location. How do we smooth language models over geography?
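To make the abstract's idea concrete, here is a minimal Python sketch of the two steps it describes: expanding a short pattern into a regular expression, and ranking the matching queries by popularity. Everything named here is an assumption for illustration: the helper names `to_regex` and `k_best` are hypothetical, the counts in `TOY_COUNTS` are invented, and raw counts stand in for the language model that the real system builds from MSN query logs. It is not Church and Thiesson's implementation.

```python
import re

# Standard phone keypad letters; '#' stands for a space, as in "2#7423".
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}

def to_regex(pattern):
    """Expand 'c rice' or '2#7423' into a regular expression string."""
    words = pattern.replace("#", " ").split()
    parts = []
    for word in words:
        # A keypad digit becomes a character class; anything else is literal.
        chars = "".join(
            "[" + KEYPAD[ch] + "]" if ch in KEYPAD else re.escape(ch)
            for ch in word
        )
        parts.append(chars + ".*")  # each typed word is only a prefix
    return " ".join(parts)

def k_best(pattern, query_counts, k=3):
    """Return the k most frequent known queries matching the pattern."""
    rx = re.compile(to_regex(pattern))
    matches = [q for q in query_counts if rx.fullmatch(q)]
    return sorted(matches, key=lambda q: query_counts[q], reverse=True)[:k]

# Invented toy counts, standing in for a real query log.
TOY_COUNTS = {
    "condoleezza rice": 900,
    "car rental": 800,
    "rice cooker": 500,
    "chicken rice": 400,
    "brown rice": 300,
}

print(to_regex("c rice"))
print(to_regex("2#7423"))
print(k_best("2#7423", TOY_COUNTS))
```

With these toy counts, the keypad pattern admits "brown rice" as well as the two queries beginning with "c", since the key 2 covers a, b and c. The harder questions raised in the abstract, such as smoothing over queries and locations that have never been seen, are exactly what this count-based stand-in leaves out.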

Ken is using the term "language modeling" in the standard technical sense: "estimating the a priori probability of different character sequences". Is "language modeling", in this sense, part of the discipline of linguistics?

The logical answer, in my opinion, is "yes". We're talking about empirical estimates of the parameters of a formal model of the relative probability of different linguistic sequences, in geographical and social context, used to guess someone's communicative intent: surely this is part of the science and technology of language. More precisely, I guess, we can classify it as "applied psycholinguistics".

But the practical answer, alas, is "no". Most people who work on "language modeling" consider themselves to be computer scientists (or sometimes electrical engineers), and their credentials and their organization names generally reflect that. In 1983, I hired Ken into something called the "Linguistics Research Department" at Bell Labs -- but he'd just gotten his PhD in "computer science", not "linguistics". And "Linguistics Research Department" is not an entry in very many corporate directories.

I'm not arguing for disciplinary exclusion -- in a logical world, there'd still be plenty of room for mathematicians and computer scientists and psychologists and sociologists to study language, and such interest should continue to be welcomed. Instead, I'm trying to make a point about disciplinary inclusion. "Language modeling" is now mostly outside of "linguistics" because of a series of historical accidents, not because this way of drawing disciplinary boundaries makes sense. The world would be better if things were different.

Posted by Mark Liberman at 02:39 PM

Trunca in Monta Sa

About 150 years ago Montana Salish began to undergo a sound change that went roughly like this: Delete everything after the stressed vowel if you want to, but you probably won't want to if there is crucial, otherwise unrecoverable grammatical information after the stressed vowel. This truncation process is variable -- both truncated and untruncated forms of many words exist in the modern language -- and it has produced variable results: although all types of words can be truncated, the change has tended to result in lexicalized truncated nouns but not verbs (assuming that Montana Salish and other Salishan languages actually have a lexical category "noun", an issue that has been hotly debated). The reason is that verbs are much more likely than nouns to have crucial grammatical information in suffixes. Speakers can and do indulge in word play that exploits ambiguities produced by the truncation process, specifically ambiguities in lexical suffixes.

Like other Salishan languages, Montana Salish has a large set of over a hundred lexical suffixes -- that is, derivational suffixes that carry lexical rather than grammatical meaning. Many or most of these suffixes have quite concrete meanings, such as `face/fire', `small round object', `ear', `mouth', and the like; many of them also have abstract meanings. Most of the lexical suffixes begin with a vowel, and most have underlying stress on the suffix-initial vowel. Because the language has only five vowel phonemes, /i e a o u/, this means that deleting everything after the stressed vowel results in extensive ambiguity: there is no formal way to tell, for instance, whether a truncated word ending in the stressed initial =́a of a lexical suffix is from =́alqs `clothes', or =́aqs `nose/road/cost', or =́asq't `sky/day', or =́aXn `arm', or any of a sizable number of other lexical suffixes.
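For readers who like to see such things spelled out, the ambiguity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four suffixes are the ones cited above, respelled in a makeshift ASCII convention of my own in which the stressed vowel is written as an uppercase letter, and the function names are hypothetical.

```python
# Lexical suffixes cited in the post, in a makeshift ASCII spelling
# (an assumption of this sketch): the stressed vowel is uppercase.
SUFFIXES = {
    "Alqs": "clothes",
    "Aqs": "nose/road/cost",
    "Asq't": "sky/day",
    "AXn": "arm",
}

STRESSED_VOWELS = set("AEIOU")

def truncate(form):
    """Delete everything after the first stressed (uppercase) vowel."""
    for i, ch in enumerate(form):
        if ch in STRESSED_VOWELS:
            return form[: i + 1]
    return form  # no stressed vowel marked: leave the form unchanged

def candidates(truncated):
    """List every full suffix that shortens to the given truncated suffix."""
    return sorted(s for s in SUFFIXES if truncate(s) == truncated)

for suffix in candidates("A"):
    print(suffix, "=", SUFFIXES[suffix])
```

All four suffixes shorten to the same stressed vowel, so nothing in the truncated form itself says which one was meant; only context does.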

Last week, working with Bitterroot Salish and Pend d'Oreille elders on the Flathead Reservation in northwestern Montana, I heard an elegant example of joking manipulation of truncation-induced ambiguity. We were discussing planting things in gardens, and this reminded one elder of a favorite joke told by a now-deceased Pend d'Oreille elder. She remembered that John used to say

či qsk'ʷisnq'eʔ́u t pat́aq

This sentence begins with či `I' and ends with pat́aq `potato'; the particle t marks `potato' as the object of a formally intransitive verb. The long word qsk'ʷisnq'eʔ́u is the verb, and it literally means `going to go put in' (technically, slightly simplified, qs-k'ʷis-n-q'eʔ=́u `irrealis.future-go-in-put=lexical.suffix').

Now, the lexical suffix =́u is truncated, and there are quite a few possibilities for the untruncated form. The most likely by far is =́ulexʷ `earth, ground', so that the whole sentence would mean `I'm gonna go plant potatoes'. John's joke built on one of the other possible long forms of the suffix, =́ups `tail, bottom' -- which would make the sentence mean (and here I quote John) `I'm gonna go stick a potato up my butt'.

The elders assure me that the old Indians used to have a lot of fun with these shortened forms. They also tell me (as I'd suspected) that John and his late wife were the last married couple to speak Montana Salish together at home, and that the few remaining speakers no longer use the language except when they happen to get together, notably at regular elders' meetings at the Culture Center and at our weekly summer language sessions. They don't have to tell me that (with one exception, a man of about 50) the youngest fluent speakers are now in their 60s, or that there are now only about forty fluent speakers left. So playing with the humorous possibilities of truncated lexical suffixes is just one of the community's linguistic talents that will soon be gone forever.

Posted by Sally Thomason at 01:05 PM

Complex Rendering in Linux

Mark is regrettably right about the rendering problems in Firefox, but it is worth noting that other Linux browsers get it right. Konqueror handles Mark's Bengali example just fine, as well as the Tamil and Urdu text I've been working on recently. Tamil presents the same kinds of differences between phonological and graphical order that Bengali does, and Urdu requires right-to-left rendering as well as positional variants and ligatures. Galeon, a browser I like a lot, renders Tamil and Urdu correctly but not Bengali. I don't know why that is.

Posted by Bill Poser at 02:57 AM

July 16, 2006

Compounding the insults

With one or two exceptions, I've been pretty pleased with the attention people have given to my new book Talking Right: How Conservatives Turned Liberalism into a Tax-Raising, Latte-Drinking, Sushi-Eating, Volvo-Driving, New York Times-reading, Body-Piercing, Hollywood-Loving, Left-Wing Freak Show. The title in particular has been very good to me -- whatever else the reviewers seize on, almost all of them find a word of praise for the subtitle. In fact I suggested to my publisher that they should run an ad consisting of nothing but blurbs like:

"The most descriptive subtitle of the year" -- William Safire, New York Times.

"Its title makes its point before you read the first page." -- Kevin Drum, Mother Jones
"The title alone should get everyone thinking." Dr. Forbush on Daily Kos

But while I've certainly never been one to look a gift hosanna in the mouth, honesty compels me to say that the real kudos for the title is due to an unknown copywriter for the Club for Growth, and more generally to the rhetorical stroke that has enabled the right to capture, not just the political vocabulary of the English language, but one of its major word-formation processes in the bargain.

The subtitle was adapted from the ad that the Club for Growth ran during the run-up to the 2004 Iowa caucuses, when Howard Dean was still the front-runner for the Democratic presidential nomination. An announcer asks a middle-aged couple leaving a barbershop what they think of "Howard Dean's plans to raise taxes on families by $1,900 a year." The man responds, "I think Howard Dean should take his tax-hiking, government-expanding, latte-drinking, sushi-eating, Volvo-driving, New York Times-reading ..." -- and then his wife picks up the litany -- "... body-piercing, Hollywood-loving, left-wing freak show back to Vermont, where it belongs."

It was a clever ad -- on the progressive site Alternet, in fact, Daniel Kurtzman called it the best conservative ad of the campaign. What made it cute was precisely the demographic mishmash it brought to mind. I had this image of Marilyn Manson sitting in a rocker on the porch of his home in Brattleboro with the New York Times, and laughing so hard at Maureen Dowd's column that he chokes on his unagi cone. But absurd or no, it neatly exemplified the pot-pourri of traits that conservatives have used to brand liberals as out-of-touch and pretentious weirdos -- and by-the-by, the syntax that made the branding possible.

The fact is that the right owns those object+present participle compounds, as surely as it owns values, media bias, the lapel-pin flag, and sentences that begin with "See...." In fact you could trace the whole history of the right's campaigns against liberals via those compounds -- from tree-hugging and NPR-listening back through the Nixon era's pot-smoking, bra-burning, draft-dodging, and America-hating, until you finally excavate the crude origins of the trope in nigger-loving, the ur-denunciation of white liberal sentimentality.

Of course there's no intrinsic reason why the right should have a monopoly on those compounds. Back in the day, people played just as fast and loose with stereotypes in depicting poor white Southerners as cross-burning, Bible-thumping, sibling-shtupping primitives -- not just Northern liberals, but white-shoes Republicans and "genteel" Southerners, too. You still see this sort of thing coming from liberals from time to time -- writing in the Chicago Sun-Times just after the 2000 election, William O'Rourke described Bush's America as "Yahoo Nation":

It is a large, lopsided horseshoe, a twisted W, made up of primarily the Deep South and the vast, lowly populated upper-far-west states that are filled with vestiges of gun-loving, Ku-Klux-Klan sponsoring, formerly lynching-happy, survivalist-minded, hate-crime perpetrating, non-blue-blooded, rugged individualists... which contains not one primary center of intellectual or creative density.

But nowadays that sort of talk is kept alive chiefly by conservatives who never tire of reminding the good people of the heartland how much contempt liberals have for them. In her book Shut up and Sing, for example, Laura Ingraham writes that "mocking the pickup-driving, tobacco-chewing, shotgun-owning South is one of the elite rites of passage." And the Washington Times' Greg Pierce, while avoiding the construction itself, writes that "To [Howard Dean's supporters], America's red states are populated by ignorant cowboys, unwashed swampies, hellfire preachers, beauty parlor bimbos, redneck sheriffs, Confederate flag wavers and retarded hillbilly kids sitting in trees playing the banjo."

But actually liberals rarely talk this way. On the Web, Volvo-driving liberal outnumbers pickup- or truck-driving conservative by around 50 to 1, and when you do encounter a phrase like beer-guzzling redneck it's almost always offered either as a conservative caricature of liberal speech or in the spirit of a reclaimed epithet (as in, "...and proud of it, son!"). In fact the word redneck turns out to be about 20 times more likely to appear in the pages of National Review or The American Spectator than in The American Prospect or The Nation, almost always set in the mouth of some imaginary liberal.

Whatever they privately believe, most liberals know that this sort of culture-stereotyping is counter-productive for the left, not just because it puts them on the wrong side of the faux-populist divide, but because it excludes from consideration the bowtie-wearing, port-sipping Yalies who are sitting around the National Review office cooking this stuff up in the first place. And even when they restrict themselves to purely political attributes, liberals can't really use those cadences nowadays without implicitly acknowledging the right's ownership of them. In the course of praising the cleverness of the Club for Growth ad, for example, Kurtzman suggests that liberals might think of responding with an ad "telling Bush to take his deficit-creating, war-mongering, gas-guzzling, corporate criminal-coddling, election-stealing, Rush Limbaugh-listening, civil liberty-seizing, Bible-thumping, right-wing dictatorship back to Texas, where it belongs." But that comes off as nothing more than a strained tribute to the right's mastery of this syntax, in something like the way anti-war Democrats' "lie and die" seems to validate the right's "cut and run" as the basic pattern for Iraq War sloganeering.

The great rhetorical achievement of the right, as I argue in the book, is to have reformulated distinctions of class as bogus differences in consumer culture. So it makes sense that conservatives should seize on the object+participle construction, whose function is to turn activities into attributes -- politically speaking, that is, you are what you do (or more accurately, what you drive, drink, or otherwise consume). Whereas when people on the left are of a mind to make sweeping generalizations, they tend to draw the distinction characterologically rather than culturally, which is why they favor extended bahuvrihi compounds like narrow-minded, hard-hearted, and mean-spirited.

I'll grant you that the partition of the morphology will never be absolute -- the left will always own war-mongering and it's unlikely the right will ever let go of limp-wristed, for example. But it's a sign of how polarization sets people at cross-purposes: their "they" is not our "we," and vice-versa.

Added 7/17: Ben Zimmer drew my attention to a weird eructation of obj-participle compounds by Rep. Louie Gohmert (R-TX) during last month's House debate on the Iraq War resolution. After suggesting that Jack Murtha would have wanted to withdraw troops from WWII if he had been around, Gohmert said:

Who would really be helped would be ruthless, heartless, finger-detaching, hand-removing, throat-slashing, decapitating, women-raping and -abusing, child-misusing, corpse-abusing, merciless, calloused, deranged, religious zealot, murderers who think they are going to get virgins in the next life, but may find they are the virgins with what happens to them.

What makes this so bizarre, I think, is that stringing these compounds together is usually a comic trope: it's as if Gohmert has got his modes confused, describing the Iraqi insurgents in cadences that most of his ideological soulmates would find more appropriate to describing Cape Cod liberals.

Added 7/17:In a posting called "Fact-dodging Geoff Nunberg" on Kalebeul, Trevor has me claiming that the object-present participle form of compounding is a recent invention of the political right and originates with nigger-loving. Another liberal intellectual canard, says he: in fact these compounds actually go back to Shakespeare!!!!! If Trevor owned a copy of Marchand he'd know that the pattern actually goes back to Middle English -- there are cites for seafaring from 1200, for example. If he owned a dictionary he'd know the difference between a construction and a trope. (PS: Trevor has responded, but apparently still hasn't looked up either word. RTFM.)

Posted by Geoff Nunberg at 04:42 PM

Linguist thought able to read isn't

A grim story, but what a great garden path headline!

Doctor Suspected in Town House Collapse Dies

I was led down multiple garden paths with this one. First, I thought that the doctor suspected something ...

Then I came across the "in", and tried again, this time thinking that it's the beginning of a relative clause, and the doctor is suspected (by someone) to be in the town house ...

Then came "collapse", and it took me the longest time to figure out that it wasn't a typo ...

But I finally figured out that it was the town house that had collapsed, that the doctor had been a suspect in the collapse, and that the doctor had died (as it turns out, as a result of the collapse) ...

Phew! Who'd have thought it'd be so hard to parse a simple headline?

Other mentions of garden path phenomena on Language Log:

Garden paths at the Guardian (9/21/2004)
Blinded by content (6/4/2005)
Burnt offerings (9/30/2005)
Surprising crocodile kin (1/26/2006)

[ Comments? ]

[ Update, 7/17 --

Maryellen MacDonald, a renowned authority on garden path sentences (and psycholinguistics more generally), offers some useful clarification to my post:

The headline Eric cites contains two syntactic ambiguities that I and many other psycholinguists have studied. The first one is the most widely-investigated in sentence processing research, typically called the main verb/reduced relative ambiguity, in this case whether "suspected" is the main verb of the current clause or is the start of a reduced relative clause, where "reduced" here refers to the omission of the optional complementizer (who, that, which) and auxiliary verb, as in "Doctor who is suspected..." The second ambiguity concerns the structure of NPs, specifically whether a particular N is the head of the NP or a pre-nominal modifier of some upcoming head N. In the headline, it turns out that the third N (collapse) was the head, but Eric initially thought that "house" was the head and "collapse" was a verb. This ambiguity is less extensively studied, but it's very useful for illustrating how pervasive the task of syntactic ambiguity resolution is--every noun a comprehender encounters admits this head/modifier ambiguity, and unless they're hermits, people encounter thousands of nouns a day.

Psycholinguists who study syntactic ambiguity resolution have made a lot of progress in the last few decades understanding why people, constantly bombarded with ambiguity, typically get to the intended interpretation of an utterance with no conscious awareness that they had any ambiguity to resolve, and they only rarely encounter a garden path. One view most closely identified with UMass linguist Lyn Frazier (who gave the field the term "garden path" as well as many other insights) is that people initially adopt the syntactically simplest interpretation (e.g. as in Eric's first tree) and are garden pathed (yes, we make the term into a verb) if a more complex structure is needed. An alternative view often called "constraint-based ambiguity resolution" holds that people rapidly and unconsciously converge on the correct interpretation by weighing lots of information, in part concerning how the words and word combinations have been used in past experience. A garden path often occurs when the usage in the ambiguous sentence is unexpected given past usages of words and phrases.

For the "doctor suspected" ambiguity, some relevant information includes how often "suspected" is used in the active vs. passive voice and how plausible it is for a doctor to suspect vs. be suspected (and many other constraints). One source of the garden pathing here is that doctors are people who do a lot of suspecting (e.g., about the cause of someone's illness), and unlike, say, mobsters, they are not highly frequent suspects themselves. For the head/modifier ambiguity, some of the relevant information includes the frequency with which a noun has previously served as a head vs. a modifier, the frequency of a collocation (e.g. town house), and the frequency with which a word is used as a noun vs. a verb. Part of the difficulty in the headline comes from the fact that "collapse" is highly frequent as a verb, which favors the interpretation that "house" is the head of the NP.

Psycholinguists have observed before that the truncated style of headlines makes them a good source of garden paths. Perfetti and colleagues, in a 1987 Journal of Memory and Language article [see citation below--EB], have a trove of amusing ones, including PENTAGON PLANS SWELL DEFICIT; TEACHER STRIKES IDLE KIDS, and TORONTO LAW TO PROTECT SQUIRRELS HIT BY MAYOR.

Perfetti, C.A., S. Beverly, L. Bell, K. Rodgers, and R. Faux. 1987. "Comprehending newspaper headlines." Journal of Memory and Language 26, pp. 692-713.

-- end update. ]

Posted by Eric Bakovic at 03:15 PM

Kudos to Microsoft

This is embarrassing. Among other things, I've recently been working on a Bengali morphological analyzer, and so I've been doing a lot of looking at Bengali text on line, like here, and a little bit of creating Bengali html text for display. Bengali has some complex rendering issues typical of South Asian scripts. Here's a sample, from the Bengali Wikipedia's "Bangla script display help" page:

The following image shows you how a correctly enabled computer will render the Bangla script:

The following line of text shows how your computer renders the above line:
ক + ি → কি

The Unicode Bengali code chart (or here, in html form) will tell you that this involves 0995 "Bengali letter KA" plus 09BF "Bengali vowel sign I (stands to the left of the consonant)". Put them together, in the logical order "KA I", and they spell the syllable /ki/ -- but the "vowel sign I" is supposed to be rendered first, even though it's second in the character sequence. When I look at the results in Internet Explorer 6, the sequence is rendered as it should be, but after a modest amount of trying, I can't get it to work in Firefox. (As complex rendering issues go, this one is pretty simple -- but that doesn't mean that you can count on it to be handled correctly.)
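To check the stored (logical) order for yourself, a minimal Python sketch along these lines should do; it uses only the standard `unicodedata` module, and the variable names are just for illustration. It inspects the character sequence and says nothing about how any given browser will draw it.

```python
import unicodedata

# The syllable /ki/ in logical order: the consonant first, then the vowel sign.
syllable = "\u0995\u09bf"

# Collect the official Unicode character names in stored order.
names = [unicodedata.name(ch) for ch in syllable]

for ch, name in zip(syllable, names):
    print(f"U+{ord(ch):04X} {name}")
```

The sequence in memory is always KA followed by VOWEL SIGN I; putting the vowel sign to the left of the consonant is purely the renderer's job, which is why a wrong display points to the rendering layer and not to the text itself.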

In current Microsoft software, most of this stuff mostly works, thanks to people like Michael Kaplan, who has an always-interesting blog called "Sorting It All Out", about internationalization issues construed broadly. My experience has been that Microsoft, though slower than I would like, has been generally better about dealing with such things than any of the other players whose software I've had occasion to use. Since this blog has sometimes complained about MS software for one reason or another, it's only fair to offer kudos where it is deserved, and I hereby do so, as I've meant to do for several months. Praise is due not only to the individuals who have made the software work, but also to the corporate commitment that gives them the mandate to do it.

So why is this embarrassing? Because I've finally gotten around to posting this praise for Microsoft just as I'm about to head to Seattle for the "Microsoft Research Faculty Summit 2006". I hope you believe that it takes more to buy my good opinion than two days of listening to inspirational talks and eating hotel food. Well, there's also the "dinner cruise from Lake Washington to Puget Sound"; maybe that's what tipped the scales :-).

[Update -- this morning Patrick Hall wrote:

I was interested to see your post on wandering Bengali vowel signs in Firefox ("matras," I've learned that the vowels are called "matras," or at least they are in Hindi). I've wondered about that same problem. I've got a rather amorphous, ranty blog post on the topic here:

http://ruphus.com/blog/2005/08/01/font-problems-with-hindi-in-firefox/

But hey, it has illustrations. And it wouldn't be a blog post if it weren't amorphous & ranty. :-)

It seems that this particular bug is a long-standing problem in Firefox (there's a link to a Bugzilla thread in my post). A guy named Simos, who I believe is involved in Gnome internationalization, left a comment on the post as well, explaining that the Pango font rendering system still isn't fully integrated into Firefox. It's worth noting that on Linux at least, Pango-based applications like the gedit text editor handle the matras correctly.
Definitely complex stuff, and I couldn't agree more that the Microsoft i18n folks are to be congratulated.

For those readers who may not be clued in, "i18n" stands for "internationalization", the "18" representing the 18 letters left out between "i" and "n". This afternoon, Patrick added:

I ran across a couple more links that may be relevant to your Bangla woes:

Bengali character picker at w3c.org
http://people.w3.org/rishida/scripts/pickers/bengali/

I was playing around with this and came to the rather disturbing realization that if you enter the characters in the incorrect order ( "i + ka" instead of "ka + i" in your example), they render "correctly." This strikes me as all kinds of bad.
Known Problems of FireFox in Bangla
http://www.ekushey.org/projects/mozilla/firefox/known_issues.htm

This one explains how (on Linux at least), the Pango font rendering can be compiled into Firefox (but isn't). There are screenshots that seem to show that this resolves the matra problems you describe.

Yes, in a post more than two years ago ("Them old diacritical blues again", 3/21/2004) I expressed frustration that in order to get Unicode combining underdots to work right in Mozilla, I had to put them in the string in the wrong order. IE got it right then (at least with a suitable font), Mozilla got it wrong, and nothing has changed since. Hey guys, we're supposed to be accelerating towards a cultural singularity, not sitting in an i18n fixed point...

And Kerim Friedman writes:

It is worth noting that on a Mac, Safari handles devanagari scripts just fine, even though (alas) Firefox has yet to solve these problems. Safari is based on KHTML (used in Konqueror on Linux), unlike Firefox which is based on the Gecko engine. Devanagari scripts are also handled fine in all built-in OS X applications.

I filed a bug with Firefox about this over a year ago, and it is unfortunate that they have yet to fix the problem.

Interestingly, the new online "Writely" word processor (recently bought by Google) seems to be able to work with these scripts in Firefox, even though Firefox has the problems you mention.

While I've been impressed by Writely in general, this particular feature doesn't work for me in Writely in Firefox on Windows XP (although it works in Writely in Firefox on a Mac with OS X), even if I use a font that renders Devanagari-derived scripts correctly in MS Word or in the excellent (and free software) Abiword word processor. This kind of inconsistency across operating systems and platforms is puzzling and annoying. Abiword, by the way, does the right thing in all the cases that I've checked. Since it's happy to read and write UTF-8 text, I've been using it for general multilingual editing. Why Firefox still can't consistently handle the display side correctly is not clear to me. ]

[Several people have written to me with messages like this one from Chung-chieh Shan:

Perhaps because my Debian "unstable" system has Pango 1.12.3 installed, and Debian enables Pango support in Firefox, the Bengali text here does work for me. (Screenshot attached.)

I gather from this and other emails that part of the problem here is due to ambiguity about which parts of the rendering problem should be solved in an individual application, versus in a (shared library associated with the) windowing system, versus in the underlying operating system itself. Quite a few people wrote to me to complain that I shouldn't blame this rendering problem on Firefox, since the problem is really that Windows and Mac OS X are not behaving as Firefox expects them to (except when Firefox is compiled in a certain way, maybe). If I believed this, though, I'd have to believe Microsoft's argument that integrating applications such as browsers into the OS is a good thing for users. Firefox is my standard browser (except when I'm looking at pages in scripts with non-trivial rendering), but the LONG time it's taken the folks at mozilla.org to get their act together on this point does seem to ratify the arguments Microsoft used against Netscape back in the day.]

Posted by Mark Liberman at 05:02 AM

July 15, 2006

More "self" talk, from country crooners to city slickers

Back in November I wondered in two posts about the origins of the jocular expression of self-address, "So I said to myself, 'Self...'" It turned out to be a surprisingly difficult task to trace the history of this turn of phrase. Despite many people's vague recollections of hearing it used by various comedians of the '50s and '60s, I was originally only able to find examples in print from the '80s on. The first breakthrough was made by Lance Nathan, who found an example of self-talk in the song "Play Something Sweet (Brickyard Blues)," composed by Allen Toussaint in 1973 and recorded by Three Dog Night the following year. Now two country songs have emerged that take the usage firmly back to the 1960s, and another lead suggests that it might go back to the 1940s, in the comic performances of Spike Jones and the City Slickers.

The earliest example of self-talk I've found so far is from the 1965 song "I Had One Too Many," composed by Lee McAlpin and performed by The Wilburn Brothers on their booze-soaked album I'm Gonna Tie One On Tonight. Here are the relevant lyrics (which can be heard in this audio clip):

Drove all around downtown, I'll admit it.
And I said to myself, "Self, take a left, that'll git it."
Then a big Cadillac cruised down the street, and I hit it.
And that really did it.
I had one too many when one was plenty for me.

Don Blaheta directed my attention to another country song using the expression a few years later, Dolly Parton's "I'll Oil Wells Love You," on her 1968 album Just Because I'm A Woman. Dolly sings (in words co-composed with her uncle Bill Owens):

I met a man in Texas,
And oh, he was so fine,
And I said to myself,
"Self, I'm gonna make him mine!"

So now we know that the use of vocative self in a jokey interior dialogue was established in country-and-western songwriting circles by the mid-'60s. This suggests that it stems from traditions of comic storytelling, perhaps originating in the rural South. This sort of oral usage is difficult to document, unless it happens to have made its way onto recordings of comedians who adopted the expression. Correspondents have recalled hearing self-talk from standup comics as varied as Brother Dave Gardner, Flip Wilson, Tommy Smothers, and Bill Cosby, but so far I haven't had any luck finding any recorded examples of the expression from them.

I did get a good tip from Tony Delgado, who pointed me to "Moose Turd Pie," a tall tale recorded by Bruce "Utah" Phillips on his 1973 album Good Though! It's worth hearing in its entirety (audio available here, full text here), but the relevant bit goes as follows:

I looked down at that meadow wafer,
And I said to myself,
"Self, I'm gonna bake up a big moose turd pie."

The story is said to have folkloristic roots predating the Utah Phillips recording, so this got me to thinking that perhaps the "Self..." expression was disseminated along with the tall tale in its various retellings. As it happens, there's a "Moose Turd Pie" expert out there: Saul Broudy, whose 1982 dissertation at the University of Pennsylvania was entitled, "The Effect of Performer-Audience Interaction on Performance Strategies: 'Moose-Turd Pie' in Context." I contacted Broudy, who in turn got in touch with Utah Phillips himself. As Broudy reports:

He told me he first heard the "self" bit done by Spike Jones & the City Slickers at the Hippodrome Theater (a vaudeville house) as a child in his home town of Salt Lake City around 1946 (Phillips' stepfather distributed films etc to theaters in the Intermountain area, so it was natural he would attend such venues).

A vaudeville origin was actually one that I had originally considered, and Spike Jones and his crew seem as likely a propagator of the expression as any. As Thomas Pynchon wrote in the liner notes to Spiked! The Music of Spike Jones, the City Slickers satirized hillbilly music and dialect humor but were themselves mainly from rural backgrounds:

Early on, in '43, in a Radio Mirror interview, Spike described his band as "a subtle burlesque of all corny, hill-billy bands." A great many of these City Slickers who were so hep to the jive had in fact themselves come originally from out in the middle of America, places like Thief River Falls and Oilton and Muncie -- the Nilsson Twins hailed from Wichita, Sir Frederick Gas from Kansas City, George Rock from Farmer City, Illinois, and Spike himself from the farm and railroad environment of California's Imperial Valley.

So perhaps one of the City Slickers carried self-talk from somewhere in the heartland, then brought it to a popular audience via Spike Jones' radio appearances and stage performances. We may never know the origin of the "Self..." expression, but one thing is clear: it's as American as mom and apple pie.

[Why stop at the 1940s when we can take things all the way back to the first century? Jeff Russell writes in with a Biblical forerunner:

"And he spake a parable unto them, saying, The ground of a certain rich man brought forth plentifully: And he thought within himself, saying, What shall I do, because I have no room where to bestow my fruits? And he said, This will I do: I will pull down my barns, and build greater; and there will I bestow all my fruits and my goods. And I will say to my soul, Soul, thou hast much goods laid up for many years; take thine ease, eat, drink, and be merry. But God said unto him, Thou fool, this night thy soul shall be required of thee: then whose shall those things be, which thou hast provided? So is he that layeth up treasure for himself, and is not rich toward God."
(Luke 12:16-21, King James Version)

In the original Greek, that would be: "και ερω τη ψυχη μου ψυχη εχεις πολλα αγαθα...", or in transliteration: "kai ero te psuche mou psuche echeis polla agatha..."]

Posted by Benjamin Zimmer at 04:17 PM

I was (probably) wrong

I may very well have been wrong when I suggested that "drop ceiling" is derived from "dropped ceiling", in the way that "ice cream" derived from "iced cream". Michael Robinson wrote:

I never thought that "drop ceiling" had any relationship at all to "dropped ceiling". I've always assumed that "drop" was used in the same sense as it is used in theatrical scenery, where a "drop" is something that is lowered down from the space above the stage. You talk about a "drop curtain", never a "dropped curtain". A "drop curtain" is a curtain that's a drop.

"Drop ceilings" are not very common on stage, but not unknown, as when representing a very confined space such as a ship's cabin.

At any rate, it seems to me that the usage of "drop ceiling" is influenced by "drop curtain".

In contrast, I've always assumed -- without thinking about it -- that a "dropped ceiling" is a false ceiling that has been "dropped" or "lowered" relative to the true ceiling, and a possible connection to theatrical drops never occurred to me. Whatever the derivation, it's certainly the recency illusion that made me think that "drop ceiling" was an innovation. People have been using both forms for at least a century, and the earliest citation I could find is to "drop", not "dropped".

In the December 1901 issue of Harper's Bazaar, the author of the "Household Decoration" column advises "An Admirer":

Use a pale but decided cream shade in window draperies, with white shades between the sash and long curtains. Pure white would be harsh with the mellow tones already planned. Arrange the writing-table between the bay-window and the bookshelves: the big Davenport in the northeast corner, the mahogany sofa in the corner southeast, and that superb table in the centre of the room. Choose cream Brussels net curtains for the reception-room, edged either with pointe Arabe or Renaissance lace. Why not vary the walls here by using an Empire paper in green tones with rococo or pure Empire festoons, in which the least bit of rose color occurs? Have a drop ceiling of deep cream, divided from the paper itself by a narrow gilt moulding. Your Empire furniture would show to great advantage in such a setting, and the half-defined rose shade suggested would answer some tone in the mahogany as well as in the damask covering, which is usually an accompaniment of Empire furniture.

On the other hand, the same column in the September 1909 issue tells "M.T.":

It is very gratifying to hear that the suggestions I gave you last year have been carried out and have pleased you so much. I am sending a sample of tan paper for your living-room, and with it would use dull bronze green and brown furnishings. Cover your couch with either a bronze green or brown -- not with bottle green. It will be necessary to be very careful in your selection of just the right tone of green. [...]

It would be very attractive to use the same paper in both of your rooms, with bronze green for a contrasting color in one room, and a greenish brown for a contrasting color in the other room, but if you wish to have them different, the green paper of which I sent you a sample before will be beautiful in the sitting-room. Here, too, I would confine the colors to brown and green so that the two rooms will combine harmoniously. Inner curtains of bronze-green silk would be lovely in the living-room, and of light brown in the sitting-room. I would use dropped ceilings again if you think they make the rooms look in better proportion.

The earliest use I could find in the NYT (March 2, 1913, "Spring Styles in Wall Papers") has "drop":

In some rooms drop ceilings are still used, for it gives the sense of largeness. As the decorators say, "It spreads the room."

But Diana Rice, 10/16/1921 (p. 83) "Here is Finest Rural School", has "dropped":

The modified Spanish style of architecture, the many windows, the broad, low entrance steps and wide doors present a pleasant and encouraging friendliness to the chance visitor. Pushing into the rough-plastered vestibule and thence into the roomy halls, with their deep cream dropped ceilings and soft green walls, inviting vistas through low, graceful Moorish arches of quiet classrooms, study halls and library, lure the visitor on.

By the 1930s, real-estate ads are using "dropped ceiling" and "drop ceiling" promiscuously. The OED, alas, offers no information.

Of course, the pronunciation of -ed in this context is just [t], and the phonetic difference between [pts] and [ps] is not a very salient one, especially in fast speech. Partly for this reason, syllable-final /t/ and /d/ are often deleted in English. Much of the long history of research on this topic is summarized in Chapter 5 of Andries Coetzee's 2004 dissertation:

In English, a coronal stop that appears as last member of a word-final consonant cluster is subject to variable deletion – i.e. a word such as west can be pronounced as either [wɛst] or [wɛs]. Over the past thirty five years, this phenomenon has been studied in more detail than probably any other variable phonological phenomenon. [...]

The factors that influence the likelihood of application of [t, d]-deletion can be classified into three broad categories: the following context (is the [t, d] followed by a consonant, vowel or pause), the preceding context (the phonological features of the consonant preceding the [t, d]), the grammatical status of the [t, d] (is it part of the root or is it a suffix). The contribution of each of these three factors can be summarized as follows: (i) The following context. [t, d] that is followed by a consonant is more likely to delete than [t, d] that is followed by either a vowel or a pause. Dialects differ from each other with regard to the influence of following vowels and pauses. In some dialects, a following vowel is associated with higher deletion rates than a following pause. In other dialects this situation is reversed – i.e. more deletion before a pause than a vowel. (ii) Preceding context. In general, the more similar the preceding segment is to [t, d], the more likely [t, d] is to delete. Similarity has been measured in terms of sonority (higher deletion rates after obstruents than sonorants), but also in terms of counting the number of features shared between [t, d] and the preceding consonant. (iii) Grammatical category. Generally speaking, [t, d] that is part of the root (in a monomorpheme like west) is subject to higher deletion rates than [t, d] that functions as a suffix (the past tense suffix in locked).

In the case of the many paired two-word phrases with and without the -ed suffix on the first element, this phonological variation intersects with morphological and semantic variation. In my earlier post, I gave the list ice(d) cream, skim(med) milk, ice(d) tea, wax(ed) paper, roast(ed) beef, shave(d) ice, cream(ed) corn, whip(ped) cream, and Barbara Zimmer wrote to suggest adding screen(ed) porch and steam(ed) crabs. Is it a porch that has been screened in -- a screened porch, like a covered wagon -- or a porch whose walls are made of screens -- a screen porch, like a stone house (or a screen door)? Either formation is consistent with the broader syntactic and semantic patterns of English, and with the forms and meanings of the particular words involved. And either phrase could easily be misperceived as the other one, given both the low phonetic salience of the pronunciation difference at best, and the general tendency for that difference to be omitted in speaking. So it would make sense to find word histories in which X Y turned into X-ed Y, as well as histories that go the other way.

[By the way, if you have access to the Proquest American Periodicals Series Online, or to a library with old issues of Harper's Bazaar, I can recommend the experience of reading a half dozen or so of the Household Decoration columns from the first decade of the 20th century. As you assimilate those idealized descriptions of upper-middle-class living spaces in the time of Theodore Roosevelt and Edward VII, I predict that you'll soon feel an intense need for free verse, cubism, and Freudian analysis, even if you're normally as immune as I am to the attractions of Modernism.]

[Update 7/16/2006: Dan Bruno points out that "old fashion(ed)" should be added to the list of common pre-nominals that alternate between V+ed and N forms, as in these examples.]

Posted by Mark Liberman at 08:08 AM

July 14, 2006

"Mind like parachute - only function when open!"

Responding to our recent series of posts on the fashion for invented "Chinese proverbs", Stefano Taschini has written to remind us of another historical source of "oriental wisdom", namely the sayings of Charlie Chan. As evidence of on-going influence, Stefano points us to the collection of "Chan Bytes" (in .wav format) at charliechan.net, and to an online compendium of "The Complete Sayings of Charlie Chan", which offers "in alphabetical order, ... every aphorism, or saying, totaling nearly five hundred, as stated by Charlie Chan in all of his forty-four movies in the film series proper. Also included are two maxims uttered during Mr. Chan's speech to movie audiences in favor of the passage of a 1935 Pennsylvania referendum measure." And finally, Stefano cites Howard Berlin's book, "Charlie Chan's Words of Wisdom", which bills itself as "A collection of 600 proverbs spoken by the cinema's inscrutable Oriental detective".

In the case of Charlie Chan, much of the fun arises from combining intrinsically contemporary references into the form of traditional sayings: "Mind like parachute - only function when open!" But Stefano is right to suggest that the current fashion for appeal to the wisdom of the east is nothing new. In fact, references to the authority of (often fictional) exotic ancients have been a theme of western culture since Plato.

In other news, I've selected my own pearl of ancient Chinese administrative wisdom (though mine of course is genuine, issues of translation and interpretation aside). To the extent that life presents me with executive challenges, I aspire to the model of Dao De Jing 37: 道常無為，而無不為。 (dào cháng wú wéi, ér wú bù wéi.) -- "The Way takes no action, but leaves nothing undone."

[David Eddyshaw wrote in to register a vote in favor of Confucius, Analects 2.12, 子曰、君子不器。, "The accomplished scholar is not a utensil" (though David prefers the punchier but sexist translation "A gentleman is not a pot"). This is certainly an excellent pearl of wisdom for teachers, but I doubt it will have much appeal among the executive classes. ]

Posted by Mark Liberman at 08:51 AM

Ceiling tiles dropped, also morpheme

As a small linguistic footnote to the recent tragedy in Boston, I've learned that we can add drop(ped) ceiling to the list of words like ice(d) cream, skim(med) milk, ice(d) tea, wax(ed) paper, roast(ed) beef, shave(d) ice, cream(ed) corn, whip(ped) cream, where a phrase of the form [V+ed N] becomes lexicalized without the -ed. In yesterday's NYT, Pam Belluck wrote ("Wide Flaws Found in Boston Tunnel After Death", July 13, 2006):

Mr. Amorello said the concrete tiles formed a drop ceiling and were attached to the roof of the tunnel with metal tiebacks. The tiebacks were affixed to the roof with bolts and epoxy glue. One of the tiebacks detached, causing a section of five three-ton tiles to crash down on the Del Valles’s car. [emphasis added]

In fact, I'm way behind the curve on this one, since a quick web search shows that "drop ceiling" is already about four times commoner than "dropped ceiling":

	Google	Yahoo	MSN
"drop ceiling"	249,000	203,000	56,476
"dropped ceiling"	57,000	48,300	15,710
drop/dropped ratio	4.4/1	4.2/1	3.6/1
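
The ratio row can be re-derived from the raw hit counts. Here is a minimal Python sketch that does so; the dictionary layout and the function name are just illustrative choices, and the counts are the July 2006 figures from the table, which will of course not match what the search engines report today.

```python
# Hit counts copied from the table above (July 2006 snapshot).
hits = {
    "Google": {"drop ceiling": 249_000, "dropped ceiling": 57_000},
    "Yahoo": {"drop ceiling": 203_000, "dropped ceiling": 48_300},
    "MSN": {"drop ceiling": 56_476, "dropped ceiling": 15_710},
}


def drop_to_dropped_ratio(counts):
    """How many "drop ceiling" hits there are per "dropped ceiling" hit."""
    return counts["drop ceiling"] / counts["dropped ceiling"]


# One ratio per search engine, rounded to one decimal as in the table.
ratios = {engine: round(drop_to_dropped_ratio(c), 1) for engine, c in hits.items()}

for engine, ratio in ratios.items():
    print(f"{engine}: {ratio}/1")
```

All three engines land between roughly 3.6 and 4.4, which is where "about four times commoner" comes from.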

Posted by Mark Liberman at 06:16 AM

July 13, 2006

Life would be more livable if there were any chains left to bust

The wires are burning up, here at Language Log Plaza, with information about fake Chinese sayings. Email from Goh Eng Cher provided a link to research by Stephen E. DeLong and Keith Henson, indicating that the "ancient Chinese curse" usually rendered in English as "may you live in interesting times" was introduced into Western discourse in a science fiction story published in 1950. Specifically: Eric Frank Russell, writing as Duncan H. Munro, "U-Turn", Astounding Science Fiction, April 1950, p. 137. DeLong also quotes an email from Mauricio Diaz, who asserts that Carl Jung discusses the same phrase in his 1931 introduction to Richard Wilhelm's German translation of The Secret of the Golden Flower: A Chinese Book of Life. However, DeLong searched the English translation of the relevant book without finding any support for this attribution. And DeLong was also unable to find any evidence that this "ancient Chinese curse" was ever actually used by the ancient Chinese.

With respect to the 1950 Russell story, DeLong writes:

The main character of "U-Turn," Mason, has opted for assisted suicide to escape a regimented life in which Venus and Mars are civilized, life on the Moon is spent safely underground, and wild animals in Earth's jungles are as harmless as if they were artificial. We learn at the end of the story that Mason has correctly surmised that the death chamber to which he voluntarily goes is actually a Star Trek-like transporter which will irreversibly send him where he really wants to go -- to the current human frontier, Callisto, one of the moons of Jupiter -- assuming he is among the small fraction of those who survive the dissociation and reassociation process of the device. But before that, while one of the bureaucrats processes his "death wish," Mason complains about the order, regulation, and control under which everyone is forced to live:

For centuries the Chinese used an ancient curse: "May you live in interesting times!" It isn't a curse any more. It's a blessing. We're scientific and civilized. We've got so many rights and liberties and freedoms that one can yearn for chains for the sheer pleasure of busting them and shaking them off. Reckon life would be more livable if there were any chains left to bust.

Well, as the Aspen quotations remind us, there are always the chains of truth.

Eng Cher, who is a Chinese Singaporean, writes:

It is definitely not something that I heard my family members using as a curse (it won't be a blessing!) while growing up. Even when the younger generation use this phrase nowadays, it is seen as a Western phrase.

In my experience, the usual modern American reason for talking about the "ancient Chinese curse" is to suggest that unpleasant periods of uncertainty and change are the price we pay for moving forward. In post-war America, where this "traditional Chinese" saying seems to have germinated, we've always been ambivalent about novelty. Is it a symptom of progress, bringing us longer and better lives with more and better things, both tangible and intangible? Or have we paved paradise and put up a parking lot?

The view of novelty as disaster rather than progress, which Russell attributed to the Chinese and their allegedly hydraulic empire, also lies behind the traditional (?) Spanish valediction often quoted and used in Patrick O'Brian's sea stories: "Que no hayan novedades". O'Brian translates this as "May no new thing arise", as in the following passage from The Wine-Dark Sea, p. 130. The Surprise is in the port of Callao, and Stephen Maturin is looking for a colleague.

It was high tide on the dusty strand, and as Stephen walked up it towards an archway in the way a gritty cloud swept across from the earthquake-shattered ruins of Old Callao. When it had cleared he saw a group of ill-looking men of all colours from black to dirty yellow standing under the shelter. 'Gentlemen,' he said in Spanish, 'pray do me the kindness of pointing out the hospital.'

One of the men offers to guide him to it. Stephen, who is an eminent amateur ornithologist, tries to learn something about the local species:

In the middle of the square three black and white vulturine scavengers with a wingspan of about six feet were disputing the dried remains of a cat. 'What do you call those?' asked Stephen.

'Those?' replied his guide, looking at them with narrowed eye. 'Those are what we call birds, your worship. And there, before Joselito's warehouse, is the hospital itself. [...] And there I see a vile heretic coming out of it, with his countryman.'

'Which? The little small fat yellow-haired gentleman, who staggers so?'

'No, no, no. He is an old and mellow Christian -- your honour too is an old and mellow Christian, no doubt?'

'None older; few more mellow.'

'A Christian though English. He is the great lawyer come to lecture the university of Lima on the British Constitution. His name is Raleigh, don Curtius Raleigh: you have heard of him. He is drunk. I must run and fetch his coach.'

'He has fallen.'

'Clearly. It is the tall black-haired villain who is picking him up, the surgeon of the Liverpool ship, that is the heretic. I must run.'

'Do not let me detain you, sir. Pray accept this trifle.'

'God will repay your worship. Farewell, sir. May no new thing arise.'

'May no new thing arise,' replied Stephen.

A more condensed discussion of the various versions and attributions for the "interesting times" phrase can be found in the Wikipedia here.

Posted by Mark Liberman at 07:47 AM

Proverb Attribution

As far as I know none of the putative Chinese proverbs recited at the Aspen Conference is authentic, but one of the proverbs quoted by Daniel Gross has a real Japanese analogue. The intended meaning of "tall flowers are cut down" is presumably the same as that of the authentic Japanese proverb 出る釘は打たれる, "the nail that sticks out gets hit".

I suspect that there is a tendency to attribute sayings that one likes to whatever source seems likely to carry authority or authenticity. On several occasions I've seen "women hold up half the sky" 女人撐著半邊天 described as an "old Chinese saying". It's a good slogan, and it is Chinese, but I don't think that it is traditional. As far as I can tell, it was introduced by Mao Tse-Tung.

Posted by Bill Poser at 12:22 AM

July 12, 2006

Fake proverbs at the Aspen Institute?

Victor Mair sent in a link to Daniel Gross' story in Slate about how "CEOs, venture capitalists, policy wonks, elected officials, movie stars ..., and freeloading journalists" vied with one another in quoting "Chinese proverbs" at a recent Aspen Institute shindig ("The Fortune Cookie 500", July 5, 2006). Victor's comment: "I didn't read them all carefully, but I think that every single one of the 'Chinese proverbs' these business execs quote is phony". Disappointing, if true -- you'd think that our executive class could at least be fashionable in an accurate way. First the CEO of Raytheon plagiarized the pithy sayings that he passed off as his own, and now we learn that his brethren and sistren may be inventing the sayings that they claim to be traditional. Say it ain't so! Can anyone confirm that any of the Aspen oriental apothegms is non-fake? Write me and I'll tell the world.

Posted by Mark Liberman at 09:27 PM

Born on the 11th of July

Napfisk at No Dependencies wishes us a "Happy 11th of July", observing that

Today is our Flemish Community’s ‘national’ holiday, commemorating the Battle of the Golden Spurs. On 11 July 1302, a Flemish ad hoc militia of local nobles, townsmen and mercenaries beat the French army in a field outside of the town of Kortrijk. [...] The Flemish militia was largely made up of mobile foot soldiers that had the advantage of higher ground, soggy marshland and ditches. [...] The French cavalry nearly drowned, couldn’t retreat because of its own advancing infantry and finally fled in panic. [...]. After the event, the battle field is said to have been littered with the abandoned golden spurs of the French knights.

Despite this famous victory, "Two years later, the County of Flanders was soundly reintegrated in the French kingdom until it was passed on to Burgundy in the 15th century".

There's a linguistic hook in the beginning of this story. According to the Wikipedia article on De Guldensporenslag, the French were upset because

After being exiled from their homes by French troops, the citizens of Bruges went back to their own city and murdered every Frenchman they could find there on May 18 1302. They identified the French by asking them to pronounce a Dutch phrase, schild ende vriend (shield and friend). Everyone who had a problem pronouncing this shibboleth was killed.

However, as Alex Baumans explained to Bill Poser a couple of years ago ("Schild en vriend", 2/5/2004):

Alex Baumans writes from Flanders that this story, which I learned from my Belgian father, is a myth. There probably was some sort of password like this, but it couldn't have been schild en vriend. One reason is that schild didn't acquire the fricative [x] that makes it difficult for non-natives to say until much later and indeed still hasn't in some Flemish dialects. Another is that the enemy consisted not only of the French but of native speakers of Flemish who supported the French crown.

It seems that such shibboleth stories are often elaborated, if not completely made up. But there's a better-supported linguistic connection at the modern end of this story. I'm not sure how the Flemish refer to this newly-codified "national holiday" -- is it (the Dutch equivalent of) "the 11th of July", or is it "Golden Spurs Battle Day"? Either way, the choice is relevant to the recently current debate here in the U.S. about whether it's politically correct to refer to "Independence Day" as "the Fourth of July", and what using one term or the other might mean about your political principles.

Last week, Geoff Nunberg argued that using "Fourth of July" or "July Fourth" in place of "Independence Day" is not "just another assault on the spirit of American patriotism" ("George M. Cohan, call your office", July 5, 2006). Geoff gave three arguments: first, "Independence Day" is getting more common rather than less common, at least in the New York Times; second, it's "goofy" to think that "'real' holidays don't go simply by dates"; and third, polls show that "around 60 percent of Americans describe themselves as more patriotic than the average American, while fewer than 10 percent consider themselves to be less patriotic".

I have another piece of evidence to offer, though I'm not sure what it means about the trajectory of American patriotism. It's obvious that references to the date of the holiday -- whether expressed as "fourth of July", "4th of July", "July fourth", or "July 4th" -- will be much commoner than references to random days of the year, and indeed 7/4 is referenced, one way or another, on about 30 times as many pages in Google's index as 6/4 is. What's more interesting to me is that in the name of the holiday, the "Xth of Month" form is so much enriched relative to the "Month Xth" form. In the case of the fourth days of June and August, the "Xth of Month" form provides only about four or five percent of the current web references, whereas for the fourth day of July, the "Xth of Month" form accounts for more than two thirds of the hits:

	Google	Yahoo	MSN	(harmonic mean)
fourth|4th of June	208K	118K	39,586
June fourth|4th	6.58M	2.08M	757,807
DofM/(DofM+MD)%	3.1%	5.4%	5.0%	4.2%
fourth|4th of July	142M	55.3M	6,022,139
July fourth|4th	55.8M	23.2M	3,314,226
DofM/(DofM+MD)%	71.8%	70.4%	64.5%	68.8%
fourth|4th of August	137K	87.1K	22,224
August fourth|4th	2.7M	1.46M	426,434
DofM/(DofM+MD)%	4.8%	5.6%	5.0%	5.1%
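
For anyone who wants to check the percentage rows, here is a rough Python sketch. It assumes that the last column is the harmonic mean of the three per-engine percentages; the table doesn't say so explicitly, but that reading reproduces the figures shown. The variable and function names are my own, and the K/M counts are entered at the precision given in the table.

```python
from statistics import harmonic_mean

# (day-of-month hits, month-day hits) per engine, copied from the table above.
counts = {
    "June": {"Google": (208e3, 6.58e6), "Yahoo": (118e3, 2.08e6), "MSN": (39_586, 757_807)},
    "July": {"Google": (142e6, 55.8e6), "Yahoo": (55.3e6, 23.2e6), "MSN": (6_022_139, 3_314_226)},
    "August": {"Google": (137e3, 2.7e6), "Yahoo": (87.1e3, 1.46e6), "MSN": (22_224, 426_434)},
}


def dofm_share(dofm, md):
    """Percentage of hits that use the "Xth of Month" form: DofM/(DofM+MD)."""
    return 100 * dofm / (dofm + md)


summary = {}
for month, engines in counts.items():
    shares = [dofm_share(dofm, md) for dofm, md in engines.values()]
    # Assumption: the table's final column is the harmonic mean of the three shares.
    summary[month] = round(harmonic_mean(shares), 1)
    per_engine = ", ".join(f"{s:.1f}%" for s in shares)
    print(f"{month}: {per_engine} -> harmonic mean {summary[month]}%")
```

The same two functions can be pointed at the counts for the eleventh of August, September and October in the next table.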

Is this because of the influence of the popular song by George M. Cohan? It's not a general fact about commemorated dates -- the eleventh day of September is moving in the opposite direction, towards "September 11th" instead of "the 11th of September":

	Google	Yahoo	MSN	(harmonic mean)
eleventh|11th of August	91.2K	73K	15,075
August eleventh|11th	2.08M	1.17M	323,069
DofM/(DofM+MD)%	4.2%	5.9%	4.5%	4.7%
eleventh|11th of September	236K	174K	41,465
September eleventh|11th	18.6M	5.99M	1,434,502
DofM/(DofM+MD)%	1.3%	2.8%	2.8%	2.0%
eleventh|11th of October	83.5K	74.7K	15,038
October eleventh|11th	1.77M	1.15M	278,468
DofM/(DofM+MD)%	4.5%	6.1%	5.1%	5.2%

Presumably this is because of the connection to the emergency number 911.

In any case, I take this as evidence that "Fourth of July" has become well established as an idiom in American English today, and therefore has taken on whatever idiomatic meanings its users associate with it. I have no doubt that these associations are mainly patriotic ones. In this context, to complain that the phrase "Fourth of July" fails to specify its patriotic occasion is like complaining that a red herring is not really a fish.

Posted by Mark Liberman at 07:05 AM

July 11, 2006

The inscrutable Chinese language

In the middle of an article on architecture in China ("The China Syndrome", New York Times Magazine, May 21, 2006), in the context of describing a complex bureaucracy that gives a lot of interpretive power to individual officials, Arthur Lubow drops in a single sentence of linguistic essentialism:

The very idea of doing something architecturally new in China is itself so new that ambitious architects must surmount novel challenges. The popular mentality, however open-minded, is enmeshed by a web of shifting and inconsistent rules. ''It's not that we don't have systems,'' says Yung Ho Chang of M.I.T. ''We have incomplete systems. We have this superprogressive energy code, but a decades-old structure code. It is pretty easy for the bureaucrats to make exceptions, which they love to do. They think every case is unique, so they will break the code. Not you. It's this kind of incomplete changeable system.'' The Chinese language is itself poetically vague compared with English and more open to interpretation. Winning approval of a design often involves finding a receptive official. ''You go to one person who says yes and then another person says no,'' complains Li Hu, who, with Steven Holl, is building a mixed-use complex in Beijing. ''We were almost there, and the person died of a heart attack, and we had to start all over with a new person. No one wants to be responsible.'' [emphasis added]

I've heard similar comments before about Mandarin. It might even be true, in some sense, given features like no obligatory marking of noun number and definiteness, lack of pronoun gender, and so on. On the other hand, I can supply endless examples where the English language did not in any way impede bureaucratic blur; and it's surely possible to write precise and specific regulations in Mandarin. The relevant question, I guess, is whether the fact that Chinese construction-industry regulations and building-permit applications are written in Mandarin is relevant in any material way to the complex, erratic (and presumably corrupt) way in which the Chinese regulatory bureaucracy deals with such things. I'll bet that the answer is "no".

After all, we commented earlier on an article in which the Guardian told us that "The German language provides fully functional clarity. English humour thrives on confusion" ("Thriving on confusion in the Guardian", 5/24/2006). On the other hand, we've also cited a piece in the Courrier International asserting that "current English is characterized first by an extreme concern for coherence and for explicitness approaching redundancy" ("Paradoxes of the imagination", 9/29/2005). That's the nice thing about linguistic (and social and sexual) stereotypes; they're like astrologers, so that if you don't like what one of them says, you can turn to another one, and sooner or later you get the answer you want.

I'm still working on my New Year's Resolution to take a positive attitude towards treatments of linguistic matters in the popular press. So I'll suggest that what we need here is a sort of Encyclopedia of Linguistic Stereotypes, providing a long list of authoritative, general statements like

The Chinese language is poetically vague compared with English and more open to interpretation.
The German language provides fully functional clarity. English humour thrives on confusion.
English is characterized by an extreme concern for coherence and for explicitness approaching redundancy.

Each statement should of course be accompanied by a link or at least a citation. Then intellectually needy writers could pick the stereotype that fits their needs.

The closest thing I know of is John Cowan's Essentialist explanations, which provides 794 phrases like these:

English is essentially Norse as spoken by a gang of French thugs.
Swedish is essentially Norwegian spoken by Finns.
Spanish is essentially Italian spoken by Arabs.
Francophones are essentially Germans speaking the bad Latin they were taught by Gauls.
Modern Greek is essentially Classical Greek as spoken by Venetians.
Mandarin is essentially Chinese as spoken by Mongols.

But these are the wrong sorts of generalizations.

[Update -- Margaret Marks, who "did an A Level in Chinese and later some classical poetry and a lot of stuff about lackeys and running dogs", writes:

What rubbish about vagueness! I think it is true that Chinese classical poetry (not the language, the form) is vague, since it's pared down, a bit like the frog jumping into the pond in a haiku.
But everything else is pinned down by context.

Yes, or specified to the necessary level of explicitness by the writer. But please, Margaret, you have to help me stick to my New Year's resolution. Maintaining a positive attitude about this stuff can be hard, but that's what New Year's resolutions are for, after all.]

Posted by Mark Liberman at 10:37 PM

Throughout the ranks of left-wing bloggers

A nice example of a plain vanilla linguification of the true-to-false type that I first noted here, here and here occurs in Anthony Dick's review of a new George Lakoff book in National Review Online:

On any day of the week, you can read throughout the ranks of left-wing bloggers the following fervent incantation: "We need to reframe the debate."

His underlying claim: all over the left-wing blogs, on every day of the week, there is a lot of stuff about Lakoff's ideas concerning the importance of framing the political debate. The linguified claim: the actual string of words "We need to reframe the debate" appears just about every day on large numbers of left-wing blogs. Let's check that the latter is true, shall we?

The total number of hits for "We need to reframe the debate" on the entire web is tiny: 126. And of course by no means all of those are on left-wing blogs (one of the top Google hits for the phrase is from an electronic media show web site). All sorts of six-word clauses will get this many hits; "We need to defend our country", for example, gets 129. "You need to wash your face" gets significantly more (nearly 200).

When we limit things to find pages that have both "We need to reframe the debate" and the name "Lakoff", we find we are down to a mere dozen pages. This is so small as to be roughly equivalent to not occurring at all. The number of pages containing the randomly chosen clause "the eagle has landed" and the randomly chosen word "peripatetic" is about three times as big (32 hits). If we pick "why did the chicken cross the road" and the randomly chosen word "archbishop" we get 646 hits — 54 times as many as "We need to reframe the debate" and "Lakoff".
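
The arithmetic behind "about three times" and "54 times" is easy to reproduce. In this small Python sketch "a mere dozen" is read as exactly 12 pages, which is an assumption about the underlying count, and the names are purely illustrative.

```python
# Page counts quoted in the paragraph above.
# Assumption: "a mere dozen" is taken to mean exactly 12 pages.
reframe_and_lakoff = 12
eagle_and_peripatetic = 32
chicken_and_archbishop = 646


def times_as_many(other, baseline):
    """How many times larger one hit count is than the baseline count."""
    return other / baseline


eagle_ratio = times_as_many(eagle_and_peripatetic, reframe_and_lakoff)
chicken_ratio = times_as_many(chicken_and_archbishop, reframe_and_lakoff)

print(f"eagle/peripatetic: {eagle_ratio:.1f} times as many")
print(f"chicken/archbishop: {chicken_ratio:.0f} times as many")
```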

So Anthony Dick's linguified statement is wildly, absurdly false. Yet I have no doubt that the claim he started with — that left-wing blogs talk a lot about Lakoff's debate-reframing ideas — is broadly true. Why did he switch? Why did he linguify?

One can only speculate. But in this case, humor is not a candidate explanation (even the people who pointed out to me that the Daniel Gilbert case was striving for humorous effect cannot invoke that here). This guy is trying to convey a serious claim about activity on the left in politics. So why doesn't he just state it?

I think my best guess would be that in cases of this sort the author is a bit sheepish about the subjective character of the underlying claim (it has that unscientific "I hear people talking about it everywhere" character to it), and they decide that a linguified alternative will look more objective, like something they might have actually checked out on Google or Lexis-Nexis. But then they don't bother to actually check it out.

So I do have an objection to what I see going on in a lot of cases of linguifying in current journalistic prose. It's not about linguistic error, and it's not really about bad writing style. It's about intellectual laziness (that's the connection to snowclones), and about the cynical assumption that fact-checking is unnecessary for a public as stupid and gullible as us.

It's about lazy writers deploying pre-assembled clichés that come in kit form (just a small amount of assembly required): too lazy to check the facts, quite happy to follow the trend of linguified claim-making, just glopping the sentences down like a painter who doesn't give a rat's ass how good the final result will look. It's about columnists who hand us invented blather concerning the frequency of words and phrases, secure in their confidence that their friends will all nod approvingly and even their opponents probably won't lift a finger to check anything.

People write to tell me they think I should lighten up a bit on linguifying — that I'm being over-serious and too harsh on a bunch of harmless metaphors and hyperboles and snowclones. But I don't think I want to lighten up on sloppy columnists casually assuming that unverified frequency claims about linguistic material can be tossed about at will. And I don't think I should.

Posted by Geoffrey K. Pullum at 09:37 PM

July 10, 2006

Don't dare return in under an hour

More evidence, as if we really needed it, that linguists are needed out there. Specifically, in trades such as writing software user interfaces and designing road signs it would be a good thing to have a few more employees with an analytical understanding of language and its interpretation. My friend Avi went to park a rental car in an unfamiliar city (it was Cambridge, England) and found himself looking at the sign that you see on the right. He was a bit intimidated. It looked very much to him as if he was being told first that he could only park there for a maximum of one hour, and second that this was also the minimum — that he must stay away for a full hour, and if he were to return to his car within one hour and try to get into it and drive away, he could be ticketed for that. (Avi is from Israel, and resides in Los Angeles. How is he supposed to know what insane parking regulations might be in force in an arbitrary English city?) I assume that the intended meaning was that you can drive away any time you like, but the car must not return and park in the same place for an hour afterward. Would it really be so hard to devise a sign that expressed that meaning clearly, simply, and unambiguously?

Posted by Geoffrey K. Pullum at 06:21 PM

Dialog Box Button Labels

The problem of difficult-to-interpret dialog box buttons that Geoff refers to is, ironically, a consequence of user interface guidelines. The design that he prefers, with buttons labelled "Yes" and "No", is one that most UI guidelines say to avoid, on the grounds that such labels require the user to interpret other information in the dialog correctly. The Apple Human Interface Guidelines say:

Button names should be verbs that describe the action performed.

The KDE User Interface Guidelines (KDE is one of the two major desktop environments for Linux) are more explicit:

Dialogues that ask questions should not use Yes/No; this forces the user to take an extra mental step such as "Am I saying Yes to deleting this file, or am I saying yes to keeping this file?"

The GNOME Human Interface Guidelines (GNOME is the other major desktop environment for Linux) are the same:

All button labels should use an imperative verb describing what it will do when clicked.

To old Unix hands such as myself with our Shiva-like power to wipe out entire filesystems in eight keystrokes, UI guidelines seem all too often to assume that users are illiterate morons, but they do have a point: there are cases in which the user may find it difficult to work out what action "yes" corresponds to and what "no". The problem with Geoff's example is that avoiding "yes" and "no" isn't enough: you have to choose the right phrases for your button labels. Indeed, I suspect that using "yes" and "no" is perfectly fine if you phrase the question clearly. UI guidelines have something in common with Strunk and White and other such usage guides: they are often motivated by real problems but go astray by oversimplifying the solution.

Posted by Bill Poser at 01:12 PM

Underlying claim false, linguified claim true

Here's a case where we have an underlying claim that is wildly false but the linguified claim might be broadly true:

‘High tech’ and ‘in a museum’ aren't usually found in the same sentence.
[In the "From the Secretary" feature, Smithsonian Magazine, April 2006; thanks to Christina Marie Rahaim]

The underlying claim is that nobody usually discusses high-tech in connection with museum displays, and that seems ridiculously false: there appear to be tens of thousands of web pages discussing exactly that topic, in connection with computer art in art museums, displays in science museums, educational aids involving the Internet in general museums, and so on and so on. The subject is hotter than a two-dollar pistol. But I used Google to find pages that had both the string ‘high tech’ and the string ‘in a museum’ (more than 37,000 of them), and in the first hundred I read, the two strings, though often connecting the two topics, always happened to be in separate sentences syntactically. So the linguified claim (which had nothing to do with what the Secretary actually wanted to talk about) seems to be largely true.

That does not, of course, change anything about whether the rhetorical device here is a linguification, or about whether it is a tired old snowclone of linguification, or whether the Secretary published a shabby piece of dull hack writing back in April.
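
For anyone who would rather not read a hundred pages by hand, the same-sentence check could be roughed out as below. This is only a sketch under stated assumptions: the sentence splitter is deliberately naive (it breaks after ".", "!" or "?" followed by whitespace), the function names are my own, and the `page` text is an invented example, not one of the pages Google returned.

```python
import re


def sentences(text):
    """Naive splitter: break after sentence-final punctuation plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def same_sentence(text, a, b):
    """True if strings a and b both occur inside at least one sentence of text."""
    a, b = a.lower(), b.lower()
    return any(a in s.lower() and b in s.lower() for s in sentences(text))


# Invented page text: both strings occur, but in separate sentences.
page = ("The new wing is surprisingly high tech. "
        "You rarely expect that in a museum.")

# The sentence quoted from Smithsonian Magazine, with straight quotes.
secretary = "'High tech' and 'in a museum' aren't usually found in the same sentence."

print(same_sentence(page, "high tech", "in a museum"))
print(same_sentence(secretary, "high tech", "in a museum"))
```

A real test would need a proper sentence tokenizer, since abbreviations and quotations will fool a splitter this crude.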

Posted by Geoffrey K. Pullum at 12:06 PM

If you can answer this, you are not paying attention

Here is a particularly appalling piece of linguistic blindness on the part of the software authors at a computer company. My instantiation of a mailer that I use has been automatically updated, and when I try to start it up I see a dialog box with this message in it:

Mail has been updated. Do you want to allow the new version to access the same keychain items (such as passwords) as the previous version?

This change is permanent and affects all keychain items used by Mail.

Below are two buttons. One says Don't Change. The other says Change All.

But wait a minute: which of those is supposed to mean "Yes, allow the new version access to the old passwords", and which is "No, I have changed my mind about this, don't allow access"?

To repeat, the question is "Do you want to allow the new version to access the same keychain items (such as passwords) as the previous version?" Nowhere in this interrogative clause does the lexeme change, or any synonym of it, occur. I am asked a yes/no question about allowing access. Do I want to continue allowing Mail to have access to the necessary passwords to do its job? Yes, I do. But neither button is labeled Yes. (Update: By the way, I'm not suggesting one of them should be labeled yes; there is a matter of policy here, which Bill Poser explains in this very perceptive post. I'm just noting that neither button is labeled with an imperative that means "Yes, allow access".)

Sure, there is a Help panel that can be accessed from this dialog box, but it's completely useless. It tells me what I know, and does not mention the key thing that I don't know and need to know. It just provides another instance of the same problem. It does use the lexeme change, but not in the right context. It says that if an application changes (i.e., is replaced by a new version) it cannot be fully trusted, so it is not automatically permitted access to the same passwords and security codes as the previous trusted version. And it explains that when this happens "a dialog appears asking you to grant permission." But nowhere does it explain how either "Don't Change" or "Change All" could conceivably be interpreted as granting permission.

There is a sense in which I don't want any change: I want all my passwords to stay the same, and I want no change in the usual routine where the Mail application is allowed access to the relevant servers so it can download my mail. "Don't Change" seems a perfectly reasonable command to cover this.

But in another sense, I do want a change: since I do want the new version of Mail to operate, I want the actions of the system to change so that the new version of the Mail program will be granted access to the same servers as the old one. I want all servers and security daemons to change their behavior and respond to the new version. "Change All" seems a perfectly reasonable command to cover this.

It all boils down to this question: Change what? And in the case of "Change All", all what?

How the hell does the writer of this dialog box expect me to know which sense of "changing" things is supposed to be the relevant one? Am I expected to read the mind of the developer? Is it a change when you allow a new program access to old passwords? Yes. Is it a change when you stop a program from getting its familiar access to passwords on grounds that its object code has been tampered with? Yes.

Let me stress at this point that my point here is not about what I should do. I don't want a thousand people to mail me instructions concerning which of the two buttons to click to keep Mail working. (Update: People are doing so already, of course. But then nobody ever listens to what I tell them. Just don't join them, OK?) This is Language Log, not a software users group mailing list. My point is linguistic.

Producing language that other people will be able to understand involves not just having a picture in your mind of the scenario and designing a nice-looking (and policy-compliant) dialog box that you feel represents your view of it. You have to deploy a shared linguistic system, according to established rules, using lexemes of known meaning, to present that picture to others in a way that will work for them. You have to consider whether there are other ways of viewing the situation at hand. You have to examine the wording you have chosen to see if it has ambiguities or unclarities. You have to put yourself in the place of a person who did not work with the developers of the operating system, someone who sees your dialog box without the benefit of any prior experience with the way you conceptualize things, and you have to ask yourself whether they would understand what to do.

Don't worry, I'll find out what to click. I can use trial and error, or call a friend, or consult the excellent gurus at UCSC's Humanities Division, or phone a tech support line... Well, maybe not the latter; I don't have the time to spend three quarters of an hour listening to "The Girl from Ipanema" playing over a phone connection to India. But I'll figure something out. Don't write me about it. Just take note of what the actual point is here, because it is much more general than whether I am just about to enable my Mail program to continue functioning or to disable it by the button choice I make. Sure, there will be some answer to which of the two baffling buttons I should click on, and there is some reason that once seemed sensible to someone concerning why the wording is the way it is. But that is not of interest. It has nothing to do with what I am talking about.

It is a simple matter of the semantics of English that "Do you want to allow the new version access?", having the form of a closed interrogative clause, has exactly two appropriate answers, and neither of them requires or reasonably includes the lexeme change for its expression. The two answers are (1) "Yes (I do want to allow the new version access)", and (2) "No (I don't want to allow the new version access)". Neither of those meanings is expressed either by "Don't Change" or by "Change All". The dialog box is incoherent. That's a purely linguistic observation, and will not be altered by my coming to know which button to click. The message in that box will baffle every user — or at least, those users who are not baffled have not been paying attention. It is a disgracefully and needlessly unfriendly misfeature.

Let me add that the operating system that has done this to me is OS X, running on my Apple Macintosh G4 notebook computer — superb software running on a marvellous machine. It is just astounding that anyone uses Windows machines at all anymore, given that both OS X and Linux are of such overwhelmingly superior quality. This post is not about criticizing Apple's software, which is better than almost anything that can be found in the industry. It is about the inadequacy of the attention to linguistic detail found throughout the industry. It appears that in software development divisions there is typically no one technically trained in syntax, semantics, and pragmatics overseeing the choice of language in the dialog boxes and help panels that are written. The problem is that in most software divisions, the team has no linguist (or linguistically sophisticated technical writer — the degrees held matter less for this purpose than the linguistic acumen).

Posted by Geoffrey K. Pullum at 11:02 AM

Morphology in action

When I search for "unsportsmanship", Google News still shows the original:

But on the NYT site, an alert copy editor on the morning shift has modified the story ("Italy Defeats France in Penalty-Kick Shootout") to read: "Midfielder Zinédine Zidane had been ejected after he committed an astonishing act of impudence and unsportsmanlike conduct in the 109th minute." [Hat tip: Ron Hogan.]

Posted by Mark Liberman at 07:02 AM

July 09, 2006

Snowclones of linguification

It is rapidly becoming clear that there are numerous English snowclones devoted almost entirely to the purpose of linguifying claims. In this post I gather together just a few of them, with some illustrative examples people have sent in. The snowclones I illustrate — and I'm sure the list is not exhaustive — are these (some assembly required: substitute a term denoting a language for L; a noun for N; a word for W or V; the name of a linguistic unit (such as "sentence") for U; and names or definite descriptions for X):

Can't even spell/pronounce W
Not know the meaning of W
W isn't in X's dictionary/vocabulary
W is not in L
W is X's middle name
W and V are (not) found in the same U
Look up W in the dictionary and you'll find a picture of X
Hate the word W
Not know the name of X
Hear the word W and reach for one's N
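
The slot-and-filler nature of these templates can be sketched with a regular expression. The pattern below is a rough approximation of just one template ("W isn't in X's dictionary/vocabulary"), and it rests on simplifying assumptions of my own: W is a single word, X is a possessive such as "my" or "Napoleon's", and apostrophes are straight. The names `SNOWCLONE` and `fillers` are invented for the illustration.

```python
import re

# One template from the list above, with named groups for the W and X slots.
SNOWCLONE = re.compile(
    r"(?P<W>\w+)['\"]?\s+(?:is not|isn't)\s+in\s+"
    r"(?P<X>\w+(?:'s)?)\s+(?:dictionary|vocabulary)",
    re.IGNORECASE,
)


def fillers(sentence):
    """Return the (W, X) fillers if the sentence instantiates the template, else None."""
    m = SNOWCLONE.search(sentence)
    return (m.group("W"), m.group("X")) if m else None


print(fillers("The word 'impossible' is not in my dictionary."))
print(fillers("Failure isn't in Napoleon's vocabulary."))
print(fillers("I looked it up in my dictionary."))
```

A pattern like this finds candidates only. It cannot tell a linguification from a literal remark about what a dictionary contains, so each hit still has to be read.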

Can't even spell/pronounce W

Etiquette: I Can't Even Spell It! Teen Dining & Social Etiquette (title of a videotape released in 2000).

See also this post for an example (highly controversial: millions of people wrote to me saying they didn't understand what I was talking about).

Not know the meaning of W
1895, New York Times:

He is a modern Claude Duval. He is the nerviest man that ever stood in two boots, and doesn't know the meaning of the word fear.

1862, John Ruskin, Unto This Last, ii.40:

Primarily, which is very notable and curious, I observe that men of business rarely know the meaning of the word 'rich'.

[Thanks to Jan Freeman for these examples.]

W isn't in X's dictionary/vocabulary

See Mark Liberman's post, "The dictionary of fools", for many examples of this snowclone. And notice that Douglas Adams was mocking it in 1987 in his novel Dirk Gently's Holistic Detective Agency:

"The word 'impossible' is not in my dictionary. In fact, everything between 'herring' and 'marmalade' appears to be missing."

W is not in L
1813: Napoleon:

"Ce n'est pas possible, m'écrivez-vous: cela n'est pas français."
[It is not possible, you write to me: that is not French.]

W is X's middle name
See the quotations gathered on this Wikipedia page.

Not found/occur in the same U

"when people are talking about me, the words 'computer' and 'savvy' are never in the same sentence" [from http://www11.brinkster.com/asij1993/News/UpdateSprFall00.asp]

Douglas Adams plays with this one in his 1987 novel Dirk Gently's Holistic Detective Agency:

WFT-II was the only British software company that could be mentioned in the same sentence as such major U.S. companies as Microsoft or Lotus. The sentence would probably run along the lines of "WFT-II, unlike such major U.S. companies as Microsoft or Lotus ..." but it was a start.

Look up W in the dictionary and you'll find a picture of X
2000, Dan Savage in the Village Voice:

Forget about keeping it a secret; when you look up fishbowl in the dictionary there's a picture of a graduate student lounge.

. . . when you look up deadline in the dictionary you won't find a picture of my brother (http://www.villagevoice.com/people/0023,savage,15435,24.html)

Hate the word W

1623, William Shakespeare, "Romeo and Juliet":

What, drawn, and talk of peace! I hate the word,
As I hate hell, all Montagues, and thee:
Have at thee, coward!

Not know the name of X
1629, Léonard de Marandé, The iudgment of humane actions a most learned, & excellent treatise of morrall philosophie, which fights agaynst vanytie, & conduceth to the fyndinge out of true and perfect felicytie, written in French by Monsieur Leonard Marrande and Englished by Iohn Reynolds London: Imprinted by A. Mathewes for Nicholas Bourne, at ye Royall Exchange (http://gateway.proquest.com/openurl?ctx_ver=Z39.88-2003)

It is true, Choler hath power and predominancy ouer all men; that there are many people who haue not yet approoued the stings of ambition, who know not the name of Couetousnesse, and yet there are none who haue not felt the effect of Choler.

[Thanks to Bruce Rusk for this. There was another example posted here from Dryden's translation of Chaucer, but I have been persuaded that it did not illustrate the point.]

Hear the word W and reach for one's N
Originates with the famous line from the play Schlageter by Nazi playwright Hanns Johst: "Wenn ich Kultur höre ... entsichere ich meinen Browning" ("If I hear [the word] ‘culture’ ... I release the safety-catch of my Browning"), often misquoted as "When I hear the word culture I reach for my revolver", and misattributed to Hermann Goering.

Again, notice Douglas Adams deftly mocking this snowclone in chapter 2 of The Restaurant at the End of the Universe (1980):

"When he heard the words integrity or moral rectitude he reached for his dictionary, and when he heard the chink of ready money in large quantities he reached for the rule book and threw it away."

Posted by Geoffrey K. Pullum at 05:32 PM

Plagiarism Watch: Ann Coulter

The subject of the latest plagiarism scandal is right-wing bitch-goddess Ann Coulter. The New York Post has reported that John Barrie, who operates the ithenticate plagiarism-detection service, claims to have found at least three instances of "textbook plagiarism" in Coulter's book Godless: the Church of Liberalism and additional instances in Coulter's syndicated weekly column. For example, he says that Coulter's Aug. 3, 2005, column, "Read My Lips: No New Liberals," about Supreme Court Justice David Souter, includes six passages, ranging from 10 to 48 words each, that appeared 15 years earlier in the same order in a Los Angeles Times article entitled "Liberals Leery as New Clues Surface on Souter's Views." If true, that can hardly be attributed to chance.

According to Editor and Publisher, Universal Press, Coulter's publisher, has said that they will look into the allegations. Neither they nor we currently have access to Barrie's report as it is available only to those who subscribe to his service.

Coulter has issued an un-rebuttal, in which she whines about what a bad newspaper the New York Post is but does not respond to the charges. That does not look very good for her. On the other hand, there is some question as to whether Barrie has identified real instances of plagiarism. Blogger Thorley Winston compared the L.A. Times article about Justice Souter (to which I do not have access) with Coulter's column and concludes, in my opinion correctly, that Coulter is not, in this instance, guilty of plagiarism. Coulter may well have drawn from the article, but much of the similarity consists of properly attributed quotations from Souter, and the rest is at the level one might expect in a factual account of the same material. There are, after all, only so many ways to say "Souter said this about that". According to Winston, pace Barrie, the passages are not in the same order in Coulter's column as in the L.A. Times article.

Posted by Bill Poser at 05:20 PM

Voluble apes? I don't think so

The story about language and bonobos by Jon Hamilton that aired on NPR's Weekend Edition yesterday is headlined "A Voluble Visit with Two Talking Apes". This is an astonishing piece of mendacity even for a headline writer, since the apes in question (the bonobos Kanzi and Panbanisha) never utter a single sentence, word, syllable, or anything other than shrieks (their productive capacities are limited to touching symbols on a board or a computer screen, usually to indicate that they want some immediate desire to be fulfilled). And not one example was offered of the animals putting even two symbols together according to some linguistic rule. The piece was not without interest — if you did not know that Sue Savage-Rumbaugh has had some success with getting tame bonobos to pick up on the probable meaning of vocal utterances in English by humans, you should take a look at this; but it did nothing to rid me of my strong suspicion that the study of the possibilities of human/ape linguistic interactions is much colored with love, enthusiasm, empathy, creative interpretation, and wishful thinking on the part of the humans, and the whole project is scientifically suspect. Read the story and judge for yourself whether my basic claim about animal communication — that no non-human animal has ever so much as expressed a single opinion about anything in the history of the study of animals — is falsified by anything in this story. I would love to know even a tiny bit about what a bonobo thinks of us (as opposed to what it thinks about the idea of being given a banana). But I don't believe I'm ever going to have the chance.

Posted by Geoffrey K. Pullum at 03:28 PM

Wisdom of blogs: Italy to win, Language Log to place?

Matthew Hurst at Data Mining reads the blogospheric tea leaves: "Here's how I read the following graph: the blogosphere was more surprised to see France getting through to the final than Italy. Therefore, there is a higher expectation for Italy to win."

There's a lot of interesting stuff on Matthew's blog: take a look at his most recent post, "Mapping categories of influence in the blogosphere", for example. And he wins the Language Log seal of approval by citing Nature's recent list of 50 top science blogs, ordered by Technorati rank, and noting that they left out Language Log, which should have been in 2nd place, right after Pharyngula.

Posted by Mark Liberman at 09:05 AM

The dictionary of fools

According to email from Jonathan Ferro, of the top twenty Google hits for "not in my vocabulary", only one (in blue below) is not a linguification snowclone:

1. dieting 2. failure 3. lasting relationships 4. worry 5. universal design 6. abort 7. boredom 8. compromise 9. enough 10. lame duck 11. fear 12. exploit 13. sour mix 14. guilt 15. early retirement, mortgage paid in full, travel around the world 16. drop pearl earrings 17. cicadas 18. nice 19. fail 20. inactivity

Jonathan adds that "I also looked into 'not in my dictionary', but only two of the top ten are genuine linguifications, so I didn't dig further".

Looking in LION, I found Frederick Thomas, Clinton Bradshaw; or, The Adventures of a Lawyer, Volume 2 (1835):

If Talbot were poor, he might do something; but now, bah! he will be spurred into an occasional feeble effort, and fail. And his wealth will give him all the leisure to canker and fester over it. But I — the stern necessity is on me to labour — to do head work — and if the sweat of the brain is like other sweat, a plebeian offering to the goddess industry, may be I may pluck, in my rough road, a certain leaf or two, and hide the sweltering stain upon my brow, as Cæsar hid his baldness. `Impossible,' said Mirabeau — `that word is not in my vocabulary,' nor shall it be in mine."

And this from Charles Smith, The Wild Youth: a Comedy for Digestion (1800) [translation of a play by August von Kotzebue]:

Lisette: Patience! patience!

Frederick: This word is not in my dictionary.

Lisette: Then write it in it. Keep your tender letter. I shall tell her, that a handsome young gentleman, with a pair of large wild eyes, has resolved to love her eternally. Not so?

Alejandro Satz wrote to remind us of the famous Napoleon quote (which may have been inspired by Mirabeau or by an older usage inspiring both):

'Impossible' n'est pas français. [Letter to General Lemarois (9 July 1813)]

Alejandro also cited a widely-used variant English version of a similar sentiment, also attributed to Napoleon, whose source is less clear:

Impossible is a word found only in the dictionary of fools.

Posted by Mark Liberman at 08:31 AM

July 08, 2006

What Champollion will decipher this hieroglyphic for us?

Yesterday, Lydia Joyce wrote to call my attention to Henry David Thoreau's transcendentalist phonology. The context is Thoreau's description of something that he calls "sand foliage" (Walden, Chapter XVII, "Spring"):

Few phenomena gave me more delight than to observe the forms which thawing sand and clay assume in flowing down the sides of a deep cut on the railroad through which I passed on my way to the village, a phenomenon not very common on so large a scale, though the number of freshly exposed banks of the right material must have been greatly multiplied since railroads were invented. The material was sand of every degree of fineness and of various rich colors, commonly mixed with a little clay. When the frost comes out in the spring, and even in a thawing day in the winter the sand begins to flow down the slopes like lava, sometimes bursting out through the snow and overflowing it where no sand was to be seen before. Innumerable little streams overlap and interlace one with another, exhibiting a sort of hybrid product, which obeys half way the law of currents, and half way that of vegetation. As it flows it takes the forms of sappy leaves or vines, making heaps of pulpy sprays a foot or more in depth, and resembling, as you look down on them, the laciniated lobed and imbricated thalluses of some lichens; or you are reminded of coral, of leopards' paws or birds' feet, of brains or lungs or bowels, and excrements of all kinds.

I can't remember the last time I walked through an unstabilized rail cut or road cut in the right sort of ground. But most of us have seen similar patterns in flows of sand and clay, if only in the smaller and more temporary forms that children and waves create together at the beach. Thoreau is impressed by how quickly such patterns form:

The whole bank, which is from twenty to forty feet high, is sometimes overlaid with a mass of this kind of foliage, or sandy rupture, for a quarter of a mile on one or both sides, the produce of one spring day. What makes this sand foliage remarkable is its springing into existence thus suddenly. When I see on the one side the inert bank,—for the sun acts on one side first,—and on the other this luxuriant foliage, the creation of an hour, I am affected as if in a peculiar sense I stood in the laboratory of the Artist who made the world and me,—had come to where he was still at work, sporting on this bank, and with excess of energy strewing his fresh designs about. I feel as if I were nearer to the vitals of the globe, for this sandy overflow is something such a foliaceous mass as the vitals of the animal body. You find thus in the very sands an anticipation of the vegetable leaf. No wonder that the earth expresses itself outwardly in leaves, it so labors with the idea inwardly. The atoms have already learned this law, and are pregnant by it.

It's not obvious why quick emergence should be more evocative of transcendentalist musings than slow emergence is, but my own reactions are similar to Thoreau's. I suppose it's because sudden structures emphasize the process as much as the result.

Now we get to the phonology. The law that the atoms have learned, it seems, is a sort of transcendentalist sound law:

Internally whether in the globe or animal body, it is a moist thick lobe, a word especially applicable to the liver and lungs and the leaves of fat, (λείβω , labor, lapsus, to flow or slip downward, a lapsing; λοβός , globus, lobe, globe, also lap, flap, and many other words,) externally a dry thin leaf, even as the f and v are a pressed and dried b. The radicals of lobe are lb, the soft mass of the b (single lobed, or B, double lobed,) with a liquid l behind it pressing it forward. In globe, glb, the guttural g adds to the meaning the capacity of the throat. The feathers and wings of birds are still drier and thinner leaves. Thus, also, you pass from the lumpish grub in the earth to the airy and fluttering butterfly. The very globe continually transcends and translates itself, and becomes winged in its orbit. Even ice begins with delicate crystal leaves, as if it had flowed into moulds which the fronds of water plants have impressed on the watery mirror. The whole tree itself is but one leaf and rivers are still vaster leaves whose pulp is intervening earth, and towns and cities are the ova of insects in their axils.

It tells us something about Thoreau and his time that he starts his word lists with Greek and Latin, before going on to English, and that he analyzes the lists in terms of pseudo-Semitic "radicals", though he gives no actual Hebrew or Arabic roots. (The various on-line versions of Thoreau's text variously transliterate the Greek, or omit it in favor of a series of underscores, or render it as "[letters of the Greek alphabet]". Hey, folks, this is the Age of Unicode -- you can put the Greek back in.)

One of the most interesting things about this associative rhapsody of sound and sense is what Thoreau leaves out of it. He takes off from the Greek words λείβω and λοβός (roughly "flow" and "lobe"), and other words sharing the "radicals" l and b. (In more detail, Liddell & Scott give λείβω the glosses pour, pour forth; make a libation of wine; let flow, shed; melt and liquefy one's spirit; Pass., of the tears, to be shed, pour forth; in Pass., also, melt or pine away. And for λοβός we find lobe of the ear; lobe of the liver, lobe of the lung, capsule or pod of leguminous plants; in rose leaves, the white part.) Thoreau develops the /b/ into other labials /p/, /f/ and /v/, to get lapse, lap, leaf and leaves. Prefixing the guttural /g/ gives him "the capacity of the throat", the globe and the grub, whence he can rise from the flowing lobes of earth to the "thinner and drier leaves" of the "airy and fluttering butterfly". But given these phonemes and philosophies, there's a lexical dog that conspicuously doesn't bark in this passage: the l, the o, the g, logos, the neoplatonist origin of the universe, ἐν ἀρχῇ ἦν ὁ λόγος, Henry, hello?

Of course the omission is no accident, given Thoreau's beliefs -- but I wonder how consciously he was substituting λείβω and λοβός for λόγος here?

(And along his way to the winged globe, the delicate crystal leaves of ice, and the cities as insect eggs in the axils of rivers, it's a small thing that Thoreau conflates speech sounds and letters in comparing the "lobes" of the letters b and B with the lobes of the liver and lungs...)

Lydia comments that "this isn't quite onomatopoeia, but I don't know what to call it". I'm not sure that it has a name, but "transcendentalist phonology" will do for a start. (Or maybe "linguistic theology"? Walt Whitman continues the discussion here.)

Thoreau has more to say about his railway cut, in a similarly hallucinatory manner:

What is man but a mass of thawing clay? The ball of the human finger is but a drop congealed. The fingers and toes flow to their extent from the thawing mass of the body. Who knows what the human body would expand and flow out to under a more genial heaven? Is not the hand a spreading palm leaf with its lobes and veins? The ear may be regarded, fancifully, as a lichen, umbilicaria, on the side of the head, with its lobe or drop. The lip (labium from labor (?)) laps or lapses from the sides of the cavernous mouth. The nose is a manifest congealed drop or stalactite. The chin is a still larger drop, the confluent drippings of the face. The cheeks are a slide from the brows into the valley of the face, opposed and diffused by the cheek bones. Each rounded lobe of the vegetable leaf, too, is a thick and now loitering drop, larger or smaller; the lobes are the fingers of the leaf; and as many lobes as it has, in so many directions it tends to flow, and more heat or other genial influences would have caused it to flow yet farther.

Thus it seemed that this one hillside illustrated the principle of all the operations of Nature. The Maker of this earth but patented a leaf. What Champollion will decipher this hieroglyphic for us, that we may turn over a new leaf at last?

Walden was published in 1854. The Origin of Species was published in 1859. There are more than a few additional steps in the deciphering, of course -- D'Arcy Thompson's On Growth and Form for a start, and complex systems theory, and homeobox genes, and a lot more, much of which believers in this theological tradition still need to take on faith.

This phenomenon is more exhilarating to me than the luxuriance and fertility of vineyards. True, it is somewhat excrementitious in its character, and there is no end to the heaps of liver lights and bowels, as if the globe were turned wrong side outward; but this suggests at least that Nature has some bowels, and there again is mother of humanity. This is the frost coming out of the ground; this is Spring. It precedes the green and flowery spring, as mythology precedes regular poetry.

Posted by Mark Liberman at 09:35 AM

July 07, 2006

A linguification from an unusual source

Here's a clear example of linguifying discovered by Glenn Branch, who is the deputy director of the National Center for Science Education in Oakland, California. It is the source in which it was found that will surprise you. First, here's the example:

The angriest [response] I know about was from Bruce Derwing of the University of Alberta, who in 1979 published an article in which my name appeared alarmingly close to a rash of phrases like ‘failure to recognize the nature of the problem’, ‘pure sloth and accompanying ignorance’, ‘arrogance’, ‘narrowness and inflexible mind’, ‘thoroughly anti-scientific’, and ‘disreputable and isolated’.

Notice that the underlying claim is that Bruce Derwing criticized the writer for various failures to be scientific, flexible, etc. But the linguified claim talks about where a certain name was in running text, specifically, that it was close to certain phrases. Even if it is taken to be true, it does not entail the underlying claim. Clearly, the name could be close to phrases of a deprecative sort without those phrases being attributed to the person named. Proximity does not necessarily have anything to do with the matter at hand. So this is an absolutely classic linguification — of the sort where the linguified claim is true, rather than the kind where it is false, but a classic plain vanilla piece of linguifying nonetheless.

Now to the source. The above sentence appeared in the linguistics journal Natural Language and Linguistic Theory in 1983. It was in an opinion column entitled "The revenge of the methodological moaners", in the "Topic...Comment" series (see volume 1, no. 4, pp. 583-588). It is reprinted in a 1991 collection of essays by the same author called The Great Eskimo Vocabulary Hoax and Other Irreverent Essays on the Study of Language (University of Chicago Press, 1991). The above quote appears on p. 124.

The author of the original essay and of the book: Geoffrey K. Pullum.

Yes, that would be me. I used the device of linguification in something I wrote in 1983 (I didn't realize that until Glenn pointed it out). And now I'm raising the question of why anyone uses this trope, and by implication casting aspersions on the writing skills of people who do. What can I tell you? Nothing, except what grownups say to kids when caught smoking, swearing, drinking, fornicating, or putting their elbows on the table: do as I say, not as I do. Style advice-givers never follow their own advice. Didn't you know that?

Posted by Geoffrey K. Pullum at 06:42 PM

So ignorant, as that they know not the name of a rope

[Or was that "the name of a trope"? We still have no examples of linguifying from Cicero, or the psalms, or Gilgamesh, but a note from Bruce Rusk, reproduced below, takes the practice back to the 17th century, and lends it the authority of John Dryden, Sir Walter Raleigh, and William Shakespeare. Further ~~poking around~~ research in a similar vein takes us back to 1581. Here's Bruce's note:]

This is Dryden’s translation of a passage from Chaucer’s Knight’s Tale, published in 1700. Note line 1500:

1496    'So keep me from the Vengance of thy Darts,
1497    'Which Niobe's devoted Issue felt,
1498    'When hissing thro' the Skies the feather'd Deaths were dealt:
1499    'As I desire to live a Virgin-Life,
1500    'Nor know the Name of Mother, or of Wife.

Compare Chaucer’s original:

            Ful many a yeer, and woost what I desire,
            As keep me fro thy vengeaunce and thyn ire,
1445    That Attheon aboughte cruelly.
            Chaste goddesse, wel wostow that I
            Desire to ben a mayden al my lyf,
            Ne nevere wol I be no love ne wyf.
            I am, thow woost, yet of thy compaignye,
1450    A mayde, and love huntynge and venerye,
            And for to walken in the wodes wilde,
            And noght to ben a wyf, and be with childe.

A previously unlinguified claim has been linguified (becoming plainly false, just like the supposed “forgetting” of word pronunciations).

There are many earlier examples of linguification:

Léonard de Marandé, The iudgment of humane actions a most learned, & excellent treatise of morrall philosophie, which fights agaynst vanytie, & conduceth to the fyndinge out of true and perfect felicytie. Written in French by Monsieur Leonard Marrande and Englished by Iohn Reynolds London : Imprinted by A. Mathewes for Nicholas Bourne, at ye Royall Exchange, (1629).

It is true, Choler hath power and predominancy ouer all men; that there are many people who haue not yet approoued the stings of ambitio, who know not the name of Couetousnesse, and yet there are none who haue not felt the effect of Choler.

And a fine one from Sir Walter Raleigh:

Sir Walter Rawleighs judicious and select essayes and observations upon the first invention of shipping, invasive war, the Navy Royal and sea-service : with his apologie for his voyage to Guiana (1667).

For many of those poore Fishermen and Idlers, that are co~monly presented to his Majesties Ships,are so ignorant in Sea-service, as that they know not the name of a Rope, and therefore insufficient for such labour.

Looking at similar phrases on the LION and EEBO databases, there seems to be a burst around 1700, but this could just be the bias of the sources.

E.g., Mr. (Thomas) Dilke, d. ca. 1698, The Lover's Luck (1696):

Thus free from all Cares of Taxes and Wars,
We know not the Name of Dull Sorrow;
Ev'ry Purse is our Prey, which we spend in a Day,
And the Devil take Care for to morrow.

And finally Shakespeare himself -- Coriolanus, Act III, Scene I (1608):

His nature is too noble for the World:
He would not flatter Neptune for his Trident,
Or Ioue, for's power to Thunder: his Heart's his Mouth:
What his Brest forges, that his Tongue must vent,
And being angry, does forget that euer
He heard the Name of Death.

I’d bet classicists will find much earlier examples in other languages.

[Above is a guest post by Bruce Rusk.]

I should add, ahead of our many alert readers, that Sir Walter may have been complaining about literal and consequential ignorance of the many specific and detailed rope-names required to follow orders on a sailing ship. And perhaps Dryden's speaker intends "know the name of __" in the sense of "know what it's like to have someone call me by the name of __". In any case, here's a clearly non-literal example of name-forgetting that takes us back to 1581:

A caueat for Parsons Hovvlet concerning his vntimely flighte, and seriching in the cleare day lighte of the Gospell, necessarie for him and all the rest of that darke broode, and vncleane cage of papistes, vvho vvith their vntimely bookes, seeke the discredite of the trueth, and the disquiet of this Church of England. VVritten by Iohn Fielde, student in Diuinitie. (1581)

Such enemies to God are these papists, that they subuert al religion, teaching for doctrin the vnsauory precepts & traditions of men, they mingle their lead vvith the Lords gold, and fill his haruest full of darnel They breake, as you haue heard, al the commandements of God, to maintein their own waies, and stop from vs the springes of the vvater of life, that vve might drink of their puddles. For their own dreams they make vs forget the name of our God, and leade vs from that simplicitye that is in Christ Iesus, They are vnthankfull vvretches for al Gods benefits, and to say grace vvith them, vnlesse it bee after some mumbling sorte in an vnknovven tong, eyther before meate or after, is a note of a ranke Heretique.

Posted by Mark Liberman at 08:16 AM

July 06, 2006

A new type of dong

Am I the only person in the world who thinks it is hilarious that the name of the mighty missile launched on Independence Day by the diminutive leader of the world's crappiest country is the Taepodong? And that it prematurely ejaculated 40 seconds after erection? C'mon; lighten up, America. Even if you are able to repress your amusement at the strange tendency of Korean names to sound obscene in English (and we linguists do try not to laugh at foreign languages; we try to respect even the funny-sounding ones), you gotta laugh at the phallic imagery here. A 5-foot-2-inch brutal tyrant with lifts in his shoes and a bouffant hairdo and 13 illegitimate children who's afraid to fly and spends $700,000 a year on brandy when his people are starving and he can't get his Taepodong up? And an earlier attempt at missile building led to one called the Nodong? Go on, laugh at the murderous little shit. It won't change anything, but it'll do you good.

Ah, no, it seems I am not at all the only person to think this an occasion for jest. Jon Stewart and Rob Corddry on The Daily Show made merry with the rich linguistic possibilities here, as can be seen from the video here, and they had visited the topic earlier as you can see here. If only I had cable, I might have known all this; thanks to Monica Lacerda for pointing it out to me, and to Sylvia Drake for pointing out the earlier piece, and to Lane Greene for reminding me that the North Korean naming policies also gave us the Nodong.

Posted by Geoffrey K. Pullum at 06:56 PM

The bunkum of "The Bunkum of Bunkum"?

Recently, David Donnell sent me a link to Daniel Cassidy's exercise in creative etymology, "How the Irish invented Slang: The Bunkum of Bunkum (for Dizzy Gillespie)" counterpunch, July 1-2, 2006. Cassidy hopes "we can put to rest the bunk about bunkum", by which he means to debunk the origin given in the OED and elsewhere for the word bunkum:

[f. Buncombe, name of a county in N. Carolina, U.S. The use of the word originated near the close of the debate on the ‘Missouri Question’ in the 16th congress, when the member from this district rose to speak, while the house was impatiently calling for the ‘Question’. Several members gathered round him, begging him to desist; he persevered, however, for a while, declaring that the people of his district expected it, and that he was bound to make a speech for Buncombe. (See Bartlett, Amer. Dict.)]

According to Cassidy's alternative history, bunkum is an "Irish and Scots-Gaelic word" derived as follows:

Buanchumadh, (pron. buan'cumah), perpetual invention, endless composition (of a story, poem, or song), a long made-up story, fig. a shaggy dog tale.

Buan-, prefix, long-lasting, enduring, perpetual, endless.
Cumadh (pron. cumah), Vn., (act of) contriving, composing, inventing, making-up; a made-up story.

Níl ann ach cumadh, it is just a made-up story. (Ó Dónaill, Foclóir Gaeilge-Béarla, Irish-English Dictionary, 353)

If it were a very long made-up story, one would say in Irish: níl ann ach buanchumadh, it is just a "long, endless tale." A similar Irish compound, buanchuimhneach, means "(someone) having a long memory."

Cassidy suggests that this is only one of many arguments for a systematic campaign of etymological revisionism in favor of borrowings from Gaelic, such as the derivation of swank from somhaoineach, and he connects this project to the influence of African-American Gaelic speakers. I asked Jim McCloskey whether Cassidy's riff is linguistically plausible, and he responded:

No, this is very fanciful, I think. `Buanchumadh' is, I suppose, a morphologically possible word, but it's not a word I've ever come across and it's not in any of the dictionaries that I have to hand (three). I'm reasonably sure that the apparent dictionary entry at the start of the piece:

Buanchumadh, (pron. buan'cumah), perpetual invention, endless composition (of a story, poem, or song), a long made-up story, fig. a shaggy dog tale.

is fictional; it's not, at any rate, in any of the standard dictionaries, and he gives no other reference to check.

Further: if this were an actual word of the language, it would mean something like `perpetual composing'. It's a long way from that to a sense close to that of English `story'.

This is too bad, since I think that there is lots of interesting territory to be explored (linguistic and other) having to do with connections between people of African descent and people of Irish descent in the New World---especially in the Caribbean, where lots of Irish people were sold into slavery after the clearances of the mid 17th century. Stuart Davis has some interesting work in this area, though that has to do with influences on southern Black English from Irish forms of English (not from Irish).

And I would bet you almost anything that that is where you'll find linguistic influences---from Irish English rather than from Irish.

For the other suggested borrowing: the idea that `swank' comes from `somhaoineach' seems very, very dubious. This is a real word (meaning `valuable' or `profitable') but it's also very, very obscure, and it's hard for me to imagine that it would have been one of the words brought to this country by Irish-speaking migrants or that it would have survived in the linguistic melting pot (I've never heard it used in speech, as far as I can recall; I know it only from old law texts). In any case, isn't `swank' well established in English English?

The OED says that it's "a midl. and s.w. dial. word" that is "ultimately related to OHG. MHG. swanc swinging motion", with citations like these:

1809 BATCHELOR Anal. Eng. Lang. 144 (Bedfordshire dialect) Swangk, to strut.
1848 EVANS Leic. Words & Phrases s.v., I met him swanking along the road, ever so genteel.

I suspect that if Cassidy had evidence that somhaoineach was used by Irish or Scots Gaelic speakers in 18th- or 19th-century America, he would have cited it. Though in fairness, there may not be a lot of documentary evidence available. The problem is that any random pair of languages have many random examples of similarly-sounding words and phrases with vaguely connected meanings, especially if you allow obscure words and unexpected phrases, and so it's all too easy to make a case on such grounds for an unexpectedly large role for Gaelic, or Russian, or Bambara, or Greek, in the history of English.

I can remember, when I was a child, listening to an aged relative's fanciful stories about how this and that aspect of American culture or vocabulary was originally Russian. I guess that this is a common expression of ethnic pride, simultaneously promoting and subverting assimilation. It's famously lampooned in the character of Gus Portokalos in the movie My Big Fat Greek Wedding:

"Give me a word, any word, and I show you that the root of that word is Greek."
"Kimono, kimono, kimono. Ha! Of course! Kimono is come from the Greek word himona, is mean winter. So, what do you wear in the wintertime to stay warm? A robe. You see: robe, kimono. There you go!"
"The root of the word Miller come from a Greek word, millah, meaning apple, so there you go. And our name, Portokalos, is come from the word meaning orange. So today here, we have, apples and oranges. We all different now, but in the end, we're all fruit."

[Update: Grant Barrett is less kind but also more experienced.]

Posted by Mark Liberman at 08:12 AM

July 05, 2006

Classical linguifying

Geoff Pullum asks "What's the earliest linguification anyone can find? What can we find that is dated before 1987?"

Some common figures of speech that may count in this search are quite old and still in common usage. One is the notion that reference to two things within some span -- a breath, a sentence, a letter, a day -- implies an equivalence of value, or a connection beyond what is actually asserted. This involves an equivocation between purely linguistic contiguity on one hand, and conceptual equivalence, similarity or connection on the other, so I think it should count as an example of what Geoff is looking for. It's easy to find these by searching for strings of the form "in the same X", for plausible values of X, and it happens that I collected some examples of this last year, for a post that never quite converged. If these examples are accepted, then we can easily take linguification back to the middle of the 18th century, while awaiting specimens from Cicero, the psalms, or Gilgamesh.

Samuel Richardson, Pamela (1741-1742) Vol. 4, letter XLIV:

And so much, my dear Miss Darnford, for your humble Servant; and for Mr. Williams's and Mr. Adams's matrimonial Prospects---And don't think me disrespectful, that I have mention'd my Polly's Affair in the same Letter with yours. For in High and Low, (I forget the Latin Phrase---I have not had a Lesson a long, long while, from my dear Tutor) Love is in all the same!

Frances Chamberlaine Sheridan, Memoirs of Miss Sidney Bidulph (1761)

Would not this be a pretty conclusion of my adventures? No, no, Sir George, expect better things from thy friend. I hope my knight-errantry will not end so tragically. But hasten to make my peace with that gracious creature your sister: yet why do I name her and myself in the same sentence? She cares not for me, thinks not of me, or, if she does, it is with contempt. I said this before, and I must repeat it again; but tell her, what I have done was with a view to promote her happiness. Oh! may she be happy, whatever becomes of me.

Fanny Burney, Cecilia, or Memoirs of an Heiress (1782) vol. V, book IX, chap. III. "A Confabulation"

"A strange slighty character!" cried Mr. Monckton, "yet of uncommon capacity, and full of genius. Were he less imaginative, wild and eccentric, he has abilities for any station, and might fix and distinguish himself almost where-ever he pleased."

"I knew not," said Cecilia, "the full worth of steadiness and prudence till I knew this young man; for he has every thing else; talents the most striking, a love of virtue the most elevated, and manners the most pleasing; yet wanting steadiness and prudence, he can neither act with consistency nor prosper with continuance."

"He is well enough," said Lady Margaret, who had heard the whole argument in sullen taciturnity, "he is well enough, I say; and there comes no good from young women's being so difficult."

Cecilia, offended by a speech which implied a rude desire to dispose of her, went up stairs to her own room; and Mr. Monckton, always enraged when young men and Cecilia were alluded to in the same sentence, retired to his library.

Mrs. Rowson, The Inquisitor; or, Invisible Rambler, Volume 1 (1793)

The world talks much about honesty, but I cannot comprehend where it is to be found.---The trader will stand behind his counter, and ask you three shillings per yard for cloth more than it is worth, and if you are inexperienced, as it frequently happens in such cases, you pay him without hesitation---he knows he has imposed upon you, yet he will lay his hand upon his heart, and declare he is an honest man.--- The Courtier---Oh! quoth reflection, pray don't mention a courtier and honesty in the same breath.--- The women---how can you talk of their honesty, when you have so flagrant a proof to the contrary before you.-

Susan Ferrier, Marriage (1818), VOL. III., CHAPTER XII.

[Lady Emily is reading from Shenstone's Pastoral]

"I have found out a gift for my fair,
I have found where the wood-pigeons breed."

"There's some sense in that," cried the Doctor, who had been listening with great weariness. "You may have a good pigeon pye, or un sauté de pigeons au sang, which is still better when well dressed."

"Shocking!" exclaimed Lady Emily; "to mention pigeon-pies in the same breath with nightingales and roses!"

Washington Irving, Bracebridge Hall; or, The Humorists. A Medley, by Geoffrey Crayon, Gent. [pseud], Volume 1 (1822)

He began to throw out hints about the importance of a man's settling himself in life before he grew old; he would look grave whenever the widow and matrimony were mentioned in the same sentence; and privately asked the opinion of the Squire and parson, about the prudence of marrying a widow with a rich jointure, but who had several children.

John Hamilton Reynolds, The Press. A Satire. (1822) PART I.

The first time that I read Barry Cornwall's Dramatic Scenes, appears like a delicious day-dream; one of those rosy moments which we occasionally enjoy amidst the thorny paths of life. Their author has certainly deteriorated since their publication. His Poems do not deserve to be mentioned in the same breath, nor is it easy for me to conceive them the offspring of the same mind.

Edward Bulwer Lytton, Pelham (1828) vol. II. chapter XXVI:

[Clutterbuck is speaking]

"There is one thing, my Pelham, which has grieved me bitterly of late, and that is, that in the earnest attention which it is the---perhaps fastidious---custom of our University, to pay to the minutiæ of classic lore, I do now oftentimes lose the spirit and beauty of the general bearing; nay, I derive a far greater pleasure from the ingenious amendment of a perverted text, than from all the turn and thought of the sense itself: while I am straightening a crooked nail in the wine-cask, I suffer the wine to evaporate; but to this I am somewhat reconciled, when I reflect that it was also the misfortune of the great Porson, and the elaborate Parr, men with whom I blush to find myself included in the same sentence."

Charles Dickens, Martin Chuzzlewit

"Of all the ridiculous young fellows that ever I had to deal with," resumed Mrs. Todgers, "that is the most ridiculous and unreasonable. Mr. Jinkins is hard upon him sometimes, but not half as hard as he deserves. To mention such a gentleman as Mr. Jinkins, in the same breath with him---you know it's too much! and yet he's as jealous of him, bless you, as if he was his equal."

George Henry Borrow, Lavengro (1851) vol. I. chapter VI:

If I am here asked whether I understood anything of what I had got by heart, I reply---"Never mind, I understand it all now, and believe that no one ever yet got Lilly's Latin grammar by heart when young, who repented of the feat at a mature age."

And, when my father saw that I had accomplished my task, he opened his mouth, and said, "Truly, this is more than I expected. I did not think that there had been so much in you, either of application or capacity; you have now learnt all that is necessary, if my friend Dr. B---'s opinion was sterling, as I have no doubt it was. You are still a child, however, and must yet go to school, in order that you may be kept out of evil company. Perhaps you may still contrive, now you have exhausted the barn, to pick up a grain or two in the barn-yard. You are still ignorant of figures, I believe, not that I would mention figures in the same day with Lilly's grammar."

William Makepeace Thackeray, The History of Pendennis (1849) vol. II. chapter III. "Contains a Novel Incident"

"Not a bad speech, young one," Warrington said, "but that does not prevent all poets from being humbugs."

"What---Homer, Æschylus, Shakspeare and all?"

"Their names are not to be breathed in the same sentence with you pigmies," Mr. Warrington said; "there are men and men, sir."

"Well, Shakspeare was a man who wrote for money, just as you and I do," Pen answered, at which Warrington confounded his impudence, and resumed his pipe and his manuscript.

Charlotte Brontë, The Professor (1857) vol. II. chapter XXIV.

[Hunsden says]

"If Tell was like Wellington, he was an ass."

[Frances responds]

"Well, whenever you marry don't take a wife out of Switzerland; for if you begin blaspheming Helvetia, and cursing the cantons---above all, if you mention the word ass in the same breath with the name Tell (for ass is baudet, I know; though Monsieur is pleased to translate it esprit-fort) your mountain maid will some night smother her Breton-bretonnant, even as your own Shakspeare's Othello smothered Desdemona."

Elizabeth Cleghorn Gaskell, Wives and Daughters (1866) vol. II. chapter XXVII. "Off with the Old Love, and On with the New."

[Cynthia and Molly are discussing the departure of Cynthia's most recent beau.]

"I don't like people of deep feelings," said Cynthia, pouting. "They don't suit me. Why couldn't he let me go without this fuss? I'm not worth his caring for!"

"You've the happy gift of making people love you. Remember Mr. Preston,---he too wouldn't give up hope."

"Now I won't have you classing Roger Hamley and Mr. Preston together in the same sentence. One was as much too bad for me as the other is too good. Now I hope that man in the garden is the juste milieu,---I'm that myself, for I don't think I'm vicious, and I know I'm not virtuous."

Oscar Wilde, Lady Windermere's fan (1893) FIRST ACT:

LORD WINDERMERE
I am not going to give you any details about her life. I tell you simply this---Mrs. Erlynne was once honoured, loved, respected. She was well born, she had position---she lost everything---threw it away, if you like. That makes it all the more bitter. Misfortunes one can endure--- they come from outside, they are accidents. But to suffer for one's own faults---ah!---there is the sting of life. It was twenty years ago, too. She was little more than a girl then. She had been a wife for even less time than you have.
LADY WINDERMERE
I am not interested in her---and---you should not mention this woman and me in the same breath. It is an error of taste.

You can find your own modern examples from the web easily enough.

Posted by Mark Liberman at 10:39 PM

Four more examples of linguifying

I still get people writing to me to say they don't understand about the phenomenon I mused on here and other places, and later decided to dub linguification. Why not (they ask me) classify Daniel Gilbert's turn of phrase as simply an exaggeration? Well, I answer that here. But encouragingly, a number of people have cottoned on to what I'm talking about, and have sent me new cases of linguifying. Here is one from Jonathan Lundell:

The bus continued its extended left turn toward Santo Fico's newest building--the Palazzo Urbano. Built before the turn of the century to house all the government offices, the faded two-story palazzo now stood empty and in disrepair. Most of the windows were locked and shuttered and apparently nobody had even mentioned the word paint in its presence for many years.
[from The Miracles of Santo Fico, by Dennis L. Smith]

The underlying claim: that the palazzo had not been repainted in many years, and was in sufficiently bad repair that one could imagine that no one had even so much as planned or considered any repair or painting work on it. But of course, for anyone interested in carrying out some refurbishment on the palazzo, mentioning the word paint on site is neither necessary (you could do your planning elsewhere and then send painters round) nor sufficient (mentioning the word paint over and over again while standing in the shadow of the palazzo would accomplish nothing). The underlying claim has been linguified. (Notice, this is a case where the linguified claim might well be true: perhaps no one had done any painting and no one had done any talking either. Perhaps nobody had been anywhere near the empty Palazzo Urbano for years. As I defined it (here), linguification often involves shifting from a possibly true claim to a definitely false one, but not always.)

Ran Ari-Gur sent me this very clear case:

[...] when people are talking about me, the words 'computer' and 'savvy' are never in the same sentence... [from http://www11.brinkster.com/asij1993/News/UpdateSprFall00.asp]

The underlying claim: people don't refer to me as being computer savvy. But the linguification is again neither necessary nor sufficient. People could refer to someone as computer savvy without using either of those words at all ("He knows his way round the network, both hardware and software, whether it's Wintel or Mac or Linux, and he can fix things in the dark with one hand tied to his swivel chair"), and they can use the two words in the same sentence without attributing the property ("Jason may be savvy about a lot of things, like maybe masturbation and surfing, but computer savvy he ain't; he's a computational nincompoop").

Lesley Graham sent me this one:

Millar is no angel - an arrogant, out-spoken man guilty of injecting EPO...but he's served a 2 year ban and while he may not have found humility he's certainly looked it up in the dictionary.
[from http://www.caledoniacalling.com/]

The underlying claim is that Millar may have modified his behavior very slightly in the direction running from extreme arrogance toward total humility. But the claim that he has looked humility up in a dictionary is a linguification (and doubtless false: even the most arrogant people there are — i.e., mathematical economists — know the word).

And Tom Phillips makes the very interesting point that the wonderful and much missed Douglas Adams (he of The Hitchhiker's Guide to the Galaxy) had not only spotted that there was something a little odd about linguified claims before the end of the Reagan administration, he had included a humorous undercutting of a linguification in his less well-known 1987 novel Dirk Gently's Holistic Detective Agency:

WFT-II was the only British software company that could be mentioned in the same sentence as such major U.S. companies as Microsoft or Lotus. The sentence would probably run along the lines of "WFT-II, unlike such major U.S. companies as Microsoft or Lotus ..." but it was a start.

Spot on, Douglas! And thanks, Jon and Ran and Lesley and Tom. You, like Rob Chametzky, get the point. This is not some already known figure of speech like hyperbole or metaphor or catachresis or synecdoche. It's a peculiar, and I suspect fairly modern, literary conceit. And to repeat myself, I simply don't see why people think it is a good idea. Are the above examples funny? Thought-provoking? Witty? Revealing? Poetic? Do they convey a point in some especially sharp, hit-the-nail-on-the-head way? Not as far as I can see. It is clear that many people think linguification is a great idea. I just don't see why.

By the way, I hesitate to put out another call for contributions here, but... What's the earliest linguification anyone can find? What can we find that is dated before 1987?

Posted by Geoffrey K. Pullum at 09:32 PM

Whistling in the Saddle

Via Kieran Healy at Crooked Timber: in the Guardian's World Cup coverage, a covey of metaphors jostle cheek-to-jowl in a custard of figuration:

France began this tournament saddled with worries about the ageing legs at the heart of their team, but they have changed their tune.

Posted by Geoff Nunberg at 05:14 PM

Unconstitutional bad punning

Called to answer for his Heimlich eggcorn of June 25th, Opus on July 2 compounds the offense:

The earliest citation that I can find online for the "cashtrated" pun is from 1997, but I'm sure it's much older.

Posted by Mark Liberman at 08:02 AM

George M. Cohan, call your office

In today's San Jose Mercury News, James Hohmann reports that some people are disturbed by what they see as an increasing tendency to use "Fourth of July" in place of "Independence Day," a shift they consider just another assault on the spirit of American patriotism. "We lose a certain acknowledgment as to what the meaning of the Declaration of Independence is and why our nation was founded when we just say 'Happy Fourth of July,'" says one indignant patriot, who also sees the shift in naming as the sign of an increasingly commercial culture. And in the blogosphere, the gun rights enthusiast Publicola reprints an annual post reminding readers that "the holiday is called Independence Day. It occurs on the 4th of July. Unless you typically wish people a happy 25th of December or ask if they have plans for the 31rst of December then please try to refer to the holiday by its rightful name."

That's going pretty far to seek disquietude, particularly when it turns out that people haven't actually been bailing out on Independence Day in the first place. On the contrary: in the New York Times over the decade ending in 2002, Independence Day was about 85% as frequent as Fourth of July and July Fourth taken together, against only 60% in the 1970's -- a figure that remained more-or-less at that level going back to the 1920's. And even if people actually were using Independence Day less, there's something goofy in the notion that "real" holidays don't go simply by dates.

For starters, it isn't as if using descriptive names for holidays preserves their significance -- how many people could tell you what Armistice Day or Labor Day specifically are about? Nor would you want to argue that names like Labor Day and Christmas have impeded the commercialization of those holidays.

But what's more curious about the whole business is that Americans seem to have an aversion to referring to any historical event by the date it occurred, the way people in other nations routinely do. The Italians have their December 12, September 20, February 8, October 25, and November 4 (to name just a few of the dates that have streets in Rome named after them). The French have their 9 Thermidor and 18 Brumaire, not to mention later dates like April 21. The Germans have June 17, the Portuguese have April 24. As I wrote once about these dates, "You have the image of a people turning the pages of the calendar in unison and marking the important dates in red letters. It's the same sense of history that allows families to refer to the dates of birthdays and anniversaries without having to remind each other what their significance is."

Whereas we have... well, damn near nothing anymore. Pull people under 40 or so off a bus at random and offer them $100 if they can identify the historical significance of any three of February 12, February 22, November 11, November 22, and December 7 -- your money will be safe. And hardly a man is still on the scene who can tell you what happened on April 18, either the old one (apart from Massachusetts residents old enough to remember the time before Patriots' Day was moved to the third Monday in April) or the recent one (apart from some guys in Idaho wearing camouflage and scanning the skies for black helicopters). Even September 11 is rapidly fading from the collective memory, as witness not just the dominance of the format "9/11" but the fact that a lot of people pronounce it as if it were the same as the police emergency number.

Which leaves us exactly one date that virtually all Americans can invest with historical significance: July 4. True, that's merely the date on which the Continental Congress approved the Declaration of Independence, two days after the delegates had actually voted in favor of declaring independence. But that's a mere quibble -- the fact is that there's no other date so suffused with historical meaning. "Born on the Fourth of July" is quite as clear and a deal more stirring than "Born on Independence Day." And if Independence Day were ever really allowed to become the only name for the holiday (which could then be conveniently scheduled on the first Monday of July, I suppose), it would be reasonable to ask whether a people whose collective consciousness was completely devoid of dates could be considered a people with a history -- I mean, as opposed to merely a people with a handsome set of hand-colored dinner plates.

In the end, it's true, the complaints about calling the holiday the Fourth of July are so bizarrely off-the-wall that even the commander of the Santa Clara American Legion post was puzzled by them: "When we say Fourth of July, we're talking about Independence Day." The only mystery is why this issue should come up now, after a century or two when nobody saw anything problematic about "the Fourth." But as trivial as the issue is -- for now, anyway -- it reflects a new, strategically divisive sense of the significance of patriotic symbols. The point of symbols nowadays is not simply to declare one's devotion to one's country, but to insist that one loves it more than others did -- and hence to turn once universally sanctioned symbols into contested ones. Since the Vietnam era, Peggy Noonan wrote approvingly a few years ago, wearing a flag in one's lapel has been "a sign that said 'I support my country, and if you don't like it, that's too bad.'"

Hence the curious Lake Wobegon effect in American patriotism: in polls, around 60 percent of Americans describe themselves as more patriotic than the average American, while fewer than 10 percent consider themselves to be less patriotic. Or in other words, most people think that other Americans are less patriotic than they. And what better way to signal one's own superior patriotism and disparage the patriotism of others than to contest the way we refer to the Fourth of July -- in modern times, the one patriotic symbol that has never been controversial? As one Doc Farmer puts it on the right-wing ChronWatch site:

The date of America's anniversary may occur on July 4th, but to me, it's never the Fourth of July. It's Independence Day... Independence Day is more important to Americans--REAL Americans--than any other "National" holidays…. Independence Day probably means more to me than most "average" Americans.

Noted, Doc. Can you reach me another beer from there?

Posted by Geoff Nunberg at 12:35 AM

July 04, 2006

Air quotes and non-apologies

In his discussion of Chicago White Sox manager Ozzie Guillen's use of the word fag to describe a despised reporter, Arnold Zwicky missed an interesting aspect of Guillen's subsequent defense. One soundbite from Guillen's remarks to reporters on June 21, the day after his F-bomb, was frequently replayed on ESPN:

I should have used another word. They can do whatever they want, but I'm not going to back up. I will apologize to the people I offended because I should have used another word.

When those words appeared in print, some papers made small revisions to Guillen's comments. For instance, the Chicago Tribune replaced the word "they" with "[MLB]" (i.e., Major League Baseball), since the context was a reporter's question about Guillen's possible suspension. The Sun-Times, meanwhile, chose to render "back up" as "back [down]," editing Guillen's slightly unidiomatic usage to a more expected phrasal verb to fit his defiant refusal to apologize directly to the reporter in question, Jay Mariotti. But one paralinguistic feature that the newspapers chose not to transcribe was Guillen's prominent use of air quotes as he said, "I will apologize to the people I offended."

Greg Couch, the Sun-Times columnist who originally reported on Guillen's use of the word fag, took note of the air quotes in an interview on the blog After Elton:

AE: Do you think Guillen understands what he did wrong now?
GC: No. His apology was so weak: "I didn't mean to hurt anyone. I'm sorry if anyone took offense." I'm sure he didn't, but that's not the issue.

AE: And it's basically a non-apology apology. "I'm sorry if you're so sensitive that you're offended by this, but I didn't really do anything wrong."
GC: Yeah. I told him, "Ozzie, you've been in this country for 25 years. You know you can't use that word." He even used air quotes when saying those I "offended". Then he laughed when asked about sensitivity training. He doesn't take it seriously.

This is a twist on the typical sports non-apology exemplified by Pete Rose's statement, "I'm sorry it happened and I'm sorry for all the people, fans and family it hurt." As Geoff Pullum observed about the Rose case, expressing regret that an incident occurred and that it had an adverse effect on people does not constitute a fully formed apology. Guillen's use of air quotes visually bracketing the word "offended" distances himself even further from a true apology, casting doubt on the idea that he has anything to apologize for.

Guillen has already used his lack of proficiency in English as an excuse for being misunderstood (and also as an excuse for not bothering with the sensitivity training that MLB Commissioner Bud Selig has ordered him to attend). Perhaps if questioned about his use of air quotes he would say that the gesture has different connotations in his native Venezuela — as he claims is the case with the word fag. Guillen does seem to enjoy using air quotes and might have an idiosyncratic view of their appropriateness. (The photo above is not actually from his June 21 comments but from a pre-game interview before Game 4 of last year's World Series. It's unclear what exactly he's air-quoting in that instance.)

If Guillen does eventually attend sensitivity training, maybe they could devote a moment or two to air quotes. If, for instance, Stephen Colbert uses air quotes around "doctor" and "professor" when referring to Michael Adams as he did in the truthiness wars, his viewers are supposed to infer that Colbert (or rather his on-air blowhard persona) doesn't think that Adams really deserves those titles. And whoever sees Guillen air-quote the word "offended" would similarly infer that he doesn't really think anyone was offended. Perhaps he thinks that the whole thing is a media-constructed non-story, or that gay rights advocates were not serious in their outrage at his remarks.

In any case, Guillen is hardly alone in making light of the whole idea of apologizing for homophobic slurs. The website Outsports has detailed how such non-apology apologies are all too common when sports figures are called to task for making anti-gay comments. These ambivalent statements of regret seem to serve two purposes: (a) inoculating the speaker from further criticism because a perfunctory "apology" has been made, and (b) serving as a wink/nudge implying that the speaker hasn't really capitulated to those calling for an apology. I'll leave the last word to Jim Buzinski of Outsports:

Apologies are ultimately about learning. About why our words hurt, about putting ourselves in someone else's shoes, about why something uttered in one setting is wrong in another. They are also about healing, about having people with different backgrounds, upbringings or points of views understand each other a little better. The "non-apology apology" accomplishes none of this. I am unapologetic when I say it's time to get rid of it.

[Update: Turns out Guillen did actually start his sensitivity training on Monday. The Chicago Tribune quoted Guillen as describing the training session with another F-word: "fun." He further elaborated:

"I told the guy I don't need to be polite, I need to speak better English. I understand the system better. A lot of people thought I was making an excuse of not being from this country. Because I was here 26 years, I know what every little word means to everybody. That's not an excuse. I think the guy said, `If you don't have anything nice to say, don't say anything.'
"I said, `If you have to say that to somebody, don't tell me what then.' I'm not going to say that. I will be the same guy, use a different word."

No word on whether air quotes were covered.]

Posted by Benjamin Zimmer at 07:10 PM

Avoiding the other F-word

The editorial in the 6/29/06 Bay Area Reporter (a San Francisco weekly "serving the gay, lesbian, bisexual, and transgender communities since 1971"), "Ozzie and the 'fags'" begins:

If we see the phrases "a derogatory term used to describe someone's sexual orientation" or "a slur associated with homosexuality" in place of the direct quote "fag" one more time, we're going to scream. Those are the phrases most mainstream media outlets have used repeatedly for the last several days when reporting on the latest verbal tirade of Chicago White Sox manager Ozzie Guillen [on 6/20/06]. Guillen disagreed with comments made in a column by Chicago Sun-Times writer Jay Mariotti and called him a "fag." Plain and simple.

Actually, Guillen managed a double-F, or fortissimo, performance -- he called Mariotti a "fucking fag" -- with an S intro, "What a piece of shit he is, fucking fag".

(Well, that's how the F part of his performance was reported by Sal Marinello on the Blogcritics site, one of the very few to venture to print any version of fuck at all in the coverage of the Guillen affair. I'd guess that Guillen said "fuckin' fag", but that could be hard to check.)

It's a sign of my profound lack of interest in the sports world and my keen attention to the gay world that this news came to me more than a week after the event, in the BAR. By then, googling on <"Ozzie Guillen" "Jay Mariotti"> was pulling up over 73,000 webhits, most of them about this incident. And by then, Guillen had "apologized", after a fashion -- according to an Associated Press story,

"I shouldn't have mentioned the name that was mentioned, but I'm not going to back off of Jay," Guillen said, using another profanity to describe Mariotti.

and had been disciplined by baseball commissioner Bud Selig, with a fine and an order to attend sensitivity training. For several days, Guillen continued to explain that he had nothing against homosexuals, and he even plans to go to the Gay Games in Chicago, and he doesn't speak English well and in his native Venezuela that word isn't a slur (fag is a word of Venezuelan Spanish? who knew? or is he saying that maricón isn't a slur in Venezuela? that, too, would be news to me), and anyway "I wasn't calling people that. I was calling him that." The BAR's editorial cartoon, by Paul Berge, nicely skewers Guillen's self-defense:

I've been asked to apologize for calling a newspaper columnist a blankety-blank three-letter F-word.

If I hurt anybody with what I called him, I apologize. But I wasn't talking about those people. I was talking strictly about that columnist.

I have nothing against those people! I was just saying that I don't like the guy, so he must be one of them!

[in thought balloon] ...And the league thinks I need sensitivity training! Hah!

Various sports columnists -- Gene Wojciechowski and Mark Kreidler on ESPN.com, for instance -- deplored Guillen's language, but (unlike the major news bureaus) used the word fag, as did Mariotti in his own Chicago Sun-Times columns, though Mariotti kept his distance from fuck; from his 6/25 column:

I keep wondering how many other managers and coaches would have been fired for describing someone as "a [bleeping] fag."

The BAR editorial continues:

So we have two issues: Guillen's homophobic comment, and the decision by most newspapers to clean up the quote. The problem with the former likely won't be solved by sensitivity training -- Guillen remains a hothead, in our opinion, and seems resistant to change, according to comments made by family members in news accounts about his latest tirade. That's too bad, because as our readers know, homophobia in professional sports remains a major league problem, and rather than fight learning about new things, Guillen should embrace change.

But the other issue also is important. Media outlets should print "fag" if that's what someone says. For many years, the word has been viewed widely as an antigay slur, and people should be held accountable for their comments. If they want to continue to be homophobic, that's their right, but let us see it in print and hear it on the air. Letter-writer John Sulikowski called Editor and Publisher, a trade publication, to task for cleaning up Guillen's comments: "Typical gutless journalism," he wrote.

The paper is (covertly) making a distinction here between two classes of "bad words": taboo words, like fuck, which are offensive in polite society regardless of the intentions of people who use them (the offense comes with the word); and slurs, like fag, which are offensive because they can be used as insults (the offense comes with the way the word is used). Slurs can have non-offensive uses, in in-group talk, by reclamation, as signs of trust and intimacy, and so on. I myself am on record (in the 6/03 issue of Out) as having no problem being called or calling myself a fag or faggot, in certain contexts. And non-slurs can be used as insults; if Guillen had called Mariotti a gay or a homosexual, rather than a fag, he wouldn't have won any politeness prizes, since in this context the attribution of homosexuality, however neutrally expressed, expresses contempt, and so counts as an insult. Guillen, being a foul-mouthed asshole ("asshole" is Marinello's characterization of him), just ratcheted things up one notch.

But the BAR takes things one step further: people who use slurs as insults, it maintains (and I'm inclined to agree), should have the ugliness of their attitudes exposed, not politely and protectively covered up. I am not less offended when the AP reports that Guillen uttered "a derogatory term that is often used to describe someone's sexual orientation" (that's 12 words, 25 syllables, folks) than I am by Mariotti's report that Guillen said "fag". (In fact, the direct version is much more informative than the fag-avoiding version. After all, Guillen had so many other choices of derogatory terms to use: faggot, cocksucker, fairy, queer, homo, pansy, and fruit, at least.) The BAR's position here is a lot like the position taken by the Guardian, the Economist, and the New Yorker (among other publications) on the serious taboo words, that they should be used only in quotations, and then only for good reason, and in those circumstances should be printed as-is and not avoided. As for Guillen, we should let him condemn himself out of his own mouth.

[Note: following almost all of my sources, including the White Sox site, I give the man's name as Guillen rather than Guillén.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:26 PM

Strunk and White vs. the Declaration of Independence

Today is Independence Day in the United States, the anniversary of the Declaration of Independence. It's a document well worth reading, or reading again, so I've reproduced it here for your convenience.

IN CONGRESS, July 4, 1776.

The unanimous Declaration of the thirteen united States of America,

When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.--That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, --That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness. Prudence, indeed, will dictate that Governments long established should not be changed for light and transient causes; and accordingly all experience hath shewn, that mankind are more disposed to suffer, while evils are sufferable, than to right themselves by abolishing the forms to which they are accustomed. But when a long train of abuses and usurpations, pursuing invariably the same Object evinces a design to reduce them under absolute Despotism, it is their right, it is their duty, to throw off such Government, and to provide new Guards for their future security.--Such has been the patient sufferance of these Colonies; and such is now the necessity which constrains them to alter their former Systems of Government. The history of the present King of Great Britain is a history of repeated injuries and usurpations, all having in direct object the establishment of an absolute Tyranny over these States. To prove this, let Facts be submitted to a candid world.

He has refused his Assent to Laws, the most wholesome and necessary for the public good.

He has forbidden his Governors to pass Laws of immediate and pressing importance, unless suspended in their operation till his Assent should be obtained; and when so suspended, he has utterly neglected to attend to them.

He has refused to pass other Laws for the accommodation of large districts of people, unless those people would relinquish the right of Representation in the Legislature, a right inestimable to them and formidable to tyrants only.

He has called together legislative bodies at places unusual, uncomfortable, and distant from the depository of their public Records, for the sole purpose of fatiguing them into compliance with his measures.

He has dissolved Representative Houses repeatedly, for opposing with manly firmness his invasions on the rights of the people.

He has refused for a long time, after such dissolutions, to cause others to be elected; whereby the Legislative powers, incapable of Annihilation, have returned to the People at large for their exercise; the State remaining in the mean time exposed to all the dangers of invasion from without, and convulsions within.

He has endeavoured to prevent the population of these States; for that purpose obstructing the Laws for Naturalization of Foreigners; refusing to pass others to encourage their migrations hither, and raising the conditions of new Appropriations of Lands.

He has obstructed the Administration of Justice, by refusing his Assent to Laws for establishing Judiciary powers.

He has made Judges dependent on his Will alone, for the tenure of their offices, and the amount and payment of their salaries.

He has erected a multitude of New Offices, and sent hither swarms of Officers to harrass our people, and eat out their substance.

He has kept among us, in times of peace, Standing Armies without the Consent of our legislatures.

He has affected to render the Military independent of and superior to the Civil power.

He has combined with others to subject us to a jurisdiction foreign to our constitution, and unacknowledged by our laws; giving his Assent to their Acts of pretended Legislation:

For Quartering large bodies of armed troops among us:

For protecting them, by a mock Trial, from punishment for any Murders which they should commit on the Inhabitants of these States:

For cutting off our Trade with all parts of the world:

For imposing Taxes on us without our Consent:

For depriving us in many cases, of the benefits of Trial by Jury:

For transporting us beyond Seas to be tried for pretended offences

For abolishing the free System of English Laws in a neighbouring Province, establishing therein an Arbitrary government, and enlarging its Boundaries so as to render it at once an example and fit instrument for introducing the same absolute rule into these Colonies:

For taking away our Charters, abolishing our most valuable Laws, and altering fundamentally the Forms of our Governments:

For suspending our own Legislatures, and declaring themselves invested with power to legislate for us in all cases whatsoever.

He has abdicated Government here, by declaring us out of his Protection and waging War against us.

He has plundered our seas, ravaged our Coasts, burnt our towns, and destroyed the lives of our people.

He is at this time transporting large Armies of foreign Mercenaries to compleat the works of death, desolation and tyranny, already begun with circumstances of Cruelty & perfidy scarcely paralleled in the most barbarous ages, and totally unworthy the Head of a civilized nation.

He has constrained our fellow Citizens taken Captive on the high Seas to bear Arms against their Country, to become the executioners of their friends and Brethren, or to fall themselves by their Hands.

He has excited domestic insurrections amongst us, and has endeavoured to bring on the inhabitants of our frontiers, the merciless Indian Savages, whose known rule of warfare, is an undistinguished destruction of all ages, sexes and conditions.

In every stage of these Oppressions We have Petitioned for Redress in the most humble terms: Our repeated Petitions have been answered only by repeated injury. A Prince whose character is thus marked by every act which may define a Tyrant, is unfit to be the ruler of a free people.

Nor have We been wanting in attentions to our Brittish brethren. We have warned them from time to time of attempts by their legislature to extend an unwarrantable jurisdiction over us. We have reminded them of the circumstances of our emigration and settlement here. We have appealed to their native justice and magnanimity, and we have conjured them by the ties of our common kindred to disavow these usurpations, which, would inevitably interrupt our connections and correspondence. They too have been deaf to the voice of justice and of consanguinity. We must, therefore, acquiesce in the necessity, which denounces our Separation, and hold them, as we hold the rest of mankind, Enemies in War, in Peace Friends.

We, therefore, the Representatives of the united States of America, in General Congress, Assembled, appealing to the Supreme Judge of the world for the rectitude of our intentions, do, in the Name, and by Authority of the good People of these Colonies, solemnly publish and declare, That these United Colonies are, and of Right ought to be Free and Independent States; that they are Absolved from all Allegiance to the British Crown, and that all political connection between them and the State of Great Britain, is and ought to be totally dissolved; and that as Free and Independent States, they have full Power to levy War, conclude Peace, contract Alliances, establish Commerce, and to do all other Acts and Things which Independent States may of right do. And for the support of this Declaration, with a firm reliance on the protection of divine Providence, we mutually pledge to each other our Lives, our Fortunes and our sacred Honor.

According to the principles enunciated by Strunk and White, the Declaration of Independence is an awful piece of writing. It is riddled with adjectives and adverbs, which are, according to Strunk and White and other purveyors of stupid advice, the nemesis of good writing. Here is the first paragraph with the adjectives and adverbs marked. Is it better without them?

When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

It frequently uses the passive voice: "all men are created equal...", "they are endowed by their Creator...", "Governments are instituted among Men...". Many sentences are very long: the first main sentence, the one beginning "When in the course of human events...", contains 71 words. Some sentences begin with conjunctions ("Nor have we been...", "But when a long train..."). There are redundancies ("We mutually pledge to each other..."). All in all, according to Strunk and White, a shoddy piece of work. Curious that they never commented on it.

Posted by Bill Poser at 02:50 PM

More animals and listening

In the wake of my posting Judith Barrington's poem "Crows", I've been pointed to more things about listening to animals and animals listening to us -- crows, lions, chimpanzees, and of course dogs.

Jeremy Hawker, crow-surrounded in Norway, reminds me of Ted Hughes's fierce book Crow ("From the Life and Songs of the Crow"), where we find

Crow Goes Hunting

Crow
Decided to try words.

He imagined some words for the job, a lovely pack--
Clear-eyed, resounding, well-trained,
With strong teeth.
You could not find a better bred lot.

He pointed out the hare and away went the words
Resounding.
Crow was Crow without fail, but what is a hare?

It converted itself to a concrete bunker.
The words circled protesting, resounding.

Crow turned the words into bombs--they blasted the bunker.
The bits of bunker flew up--a flock of starlings.

Crow turned the words into shotguns, they shot down the
starlings.
The falling starlings turned to a cloudburst.

Crow turned the words into a reservoir, collecting the water.
The water turned into an earthquake, swallowing the
reservoir.

The earthquake turned into a hare and leaped for the hill
Having eaten Crow's words.

Crow gazed after the bounding hare
Speechless with admiration.

Turning from crows to lions (and Wittgenstein), Hawker quotes from John Gray's Straw Dogs, which attacks the belief that humans are different from and superior to animals:

'If a lion could talk, we could not understand him,' the philosopher Ludwig Wittgenstein once said. 'It's clear that Wittgenstein hadn't spent much time with lions,' commented the gambler and conservationist John Aspinall.

So, mixed opinions on lions. What of chimpanzees, gorillas, and their kin, whose linguistic abilities have been scrutinized for forty years or so now? In 1978 (at the 14th regional meeting of the Chicago Linguistic Society) Mark Seidenberg and Laura Petitto posed the provocative question, "What Do Signing Chimpanzees Have to Say to Linguists?" (a longer article appeared in Cognition the following year, under the title "Signing behavior in apes: A critical review"). Actually, this is two questions, one about what linguists can learn from observing signing chimpanzees (which is what the article mostly concerns itself with), and one about what signing chimpanzees "say" -- that is, sign -- to linguists and other observers. Seidenberg and Petitto answer that question on p. 432 of the published paper:

me banana you banana you me me banana

and similar "long, repetitive, continuous sequences" about matters of intense interest to the chimpanzees.

Ok, it's been fun chatting with the chimpanzees, but let's spend some time with those loving, loyal dogs. (Though dogs get a lot more press than cats in the communication-with-humans department, googling on "talking cat" will net you a lot of entertaining stuff; still, "talking dog" gets more than ten times the hits.) Surely, the last word on dogs' UNDERSTANDING of human language comes (as Ray Girvan reminds me) from Gary Larson, in his Far Side cartoon on the subject:

What we say to dogs: Okay, Ginger! I've had it! You stay out of the garbage! Understand, Ginger? Stay out of the garbage, or else!

What they hear: blah blah GINGER blah blah blah blah blah blah blah blah GINGER blah blah blah blah blah...

[So it turns out Girvan had a different Larson cartoon in mind -- one in which Professor Milton invents a device that translates from Dog to English, and goes down a street full of barking dogs, only to discover that they are all saying "Hey hey hey hey hey hey..."]

Larson is dubious, but Dan Piraro, in a recent Bizarro cartoon (5/30/06), thinks some dogs can do a lot better, and can tell us about it. In this cartoon, a sizable dog confronts a young man sitting on a couch. The dog has his front paws on the couch, and the man is shrinking back in some alarm. The dog complains:

Wanna watch some TV? Do ya, boy? Do ya? Wanna watch some TV? Huh, boy?

Do you see how PATRONIZING that is?!

A nice counterpart to the chimps' long repetitive sequences of signs.

And now I think I'll go off the Language Log Plaza Talking Animal Watch for a while. I've seen too many cats saying "Mama!" And Bill Poser has reported to us on the potentially dire consequences of listening to cats.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:10 PM

On Cultures "Without a Language"

Bill Poser is rightly horrified, in his post "Cherokees Without a Language?", about the Asian Pacific Post article that claims that the Cherokees "were so backward that they did not even have a language of their own". But I can guess how the article's author made this silly and offensive mistake: I bet it arose from the linguistic innocent's all-too-common assumption that no written language = no language at all. People who distinguish between "the English language" and "a mere dialect" (i.e. a nonstandard dialect) are often making this same mistake; and it also underlies the belief (held by most non-linguists) that only Standard English -- namely the English that kids, at least American kids, learn to read and write in school -- has grammatical rules. Ditto for the belief that Greek, which has an unbroken written tradition stretching back well over 2000 years and a broken written tradition of over 3000 years, is an older language than, say, Basque, which (except for some names in inscriptions from the first two centuries CE) was not written until the 11th century CE and had no standard orthography until 1964. Beliefs like these may be especially offensive when applied to Cherokee, given the illustrious nature and history of its writing system, a syllabary that is believed to have been invented by Chief Sequoyah in 1819.

Posted by Sally Thomason at 10:48 AM

When you freelance, nobody knows you're a dog

I hope my colleagues in the Fellowship of the Predicative Adjunct didn't miss the amusing quotation, taken from an article about coping with death, at the bottom of page 87 in the June 26 issue of The New Yorker:

I co-parented a beloved boxer dog who, within a week of being asked to write this piece (as if on some cosmic cue), died of a massive heart attack in his apparent prime.

The non-finite clause beginning with being asked is embedded inside an adjunct (it's the complement of the preposition of inside an adjunct preposition phrase headed by within). It needs a subject if it is to be understood. The adjunct is preposed within a supplementary relative clause of which the subject is the relative pronoun who. That relative pronoun is the closest and most obvious candidate to provide the understood subject we need. (The originally intended one is I, but that's too far away to be the first candidate noticed.) But the relative clause is modifying the nominal beloved boxer dog. It's a doozy for misunderstandability. Yet it was published in a magazine (Utne), the editors having apparently failed to spot it. Quite surprising.

We continue to see examples of this sort, often several per week, even in printed sources. Their misunderstandability seems plangent. And yet the writers presumably don't notice, and we suspect most readers don't either.

By the way, for NPR listeners this Fourth of July: the Morning Edition gang did their annual reading of the Declaration of Independence today. How many noticed the dangling adjunct? It begins with "when so suspended". But don't let this grammatical curiosity spoil your whole day. Just keep in mind that dangling adjuncts are not a modern perversion, or the result of a post-1960s moral slackness. Happy Independence Day to all our readers.

Posted by Geoffrey K. Pullum at 09:41 AM

More on the "one-time rings"

I've gotten dozens of responses to the post on "Matrimonial cryptography" (6/28/2006), and I'll quote or summarize them all later. But so far, none of them really solve the problem that Dan and Sarah set, and some of them don't really address it at all. So I thought I'd say a little more about the problem, and explain in greater detail what a solution might look like.

Dan and Sarah are planning to get married, and they hatched the charming plan to find three messages written in the ordinary alphabet of letters from A to Z -- call the messages D, S, and M -- such that (for some simple, general and metaphorically satisfying function f) it will be true that f(D,S) = M. They'll have D engraved on Dan's wedding band, and S engraved on Sarah's, and then their individual messages will combine to create something new, implicit in the two of them but present in neither one alone, just as their marriage does.

They had a specific idea for the function f: use the simple trick of mod-26 addition found in some versions of "one-time pad" cryptosystems. (There could be other choices for f, and Dan and Sarah wouldn't turn down a good one, but choices that involve clever variants of letter-shapes, or remembering secret keys, aren't really in the spirit of the puzzle.) For those of you who aren't familiar with the mod-26 version of the one-time pad idea, here's an explanation. Suppose we align the letters from A to Z (ignoring case) with the numbers from 0 to 25, and with all the other integers modularly related to them:

...	x	y	z	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o	p	q	r	s	t	u	v	w	x	y	z	a	b	c	...
...	-3	-2	-1	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	...
		...	25	26	27	28	29	30	31	32	33	34	35	36	37	38	39	40	41	42	43	44	45	46	47	48	49	50	51	...
			...	-26	-25	-24	-23	-22	-21	-20	-19	-18	-17	-16	-15	-14	-13	-12	-11	-10	-9	-8	-7	-6	-5	-4	-3	-2	-1	0	...

Then we can combine strings of letters by adding them. We find that ||aced + scam = seep||, because

a+s=s (0+18=18)
c+c=e (2+2=4)
e+a=e (4+0=4)
d+m=p (3+12=15)

The combination inverts by simple subtraction: ||seep - scam = aced|| (because 18-18=0, 4-2=2, etc.) and likewise ||seep-aced = scam||.

Similarly, ||zulu + soul = riff|| because

z+s is 25+18=43, and 43 mod 26 (i.e. 43 - 26) is 17 → r
u+o is 20+14=34, and 34 mod 26 is 8 → i
etc.
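For readers who would rather check the arithmetic by machine, here is a minimal Python sketch of the letter addition and subtraction described above. The names add_words and sub_words are illustrative labels only, and the sketch assumes that both strings have the same length and contain nothing but the letters a to z.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def add_words(x, y):
    """Add two equal-length letter strings position by position, mod 26."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return "".join(
        ALPHABET[(ALPHABET.index(a) + ALPHABET.index(b)) % 26]
        for a, b in zip(x.lower(), y.lower())
    )


def sub_words(x, y):
    """Subtract y from x position by position, mod 26 (the inverse of add_words)."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return "".join(
        ALPHABET[(ALPHABET.index(a) - ALPHABET.index(b)) % 26]
        for a, b in zip(x.lower(), y.lower())
    )


print(add_words("aced", "scam"))  # seep
print(add_words("zulu", "soul"))  # riff
print(sub_words("seep", "aced"))  # scam
```

Python's % operator always returns a non-negative result for a positive modulus, so the subtraction needs no special handling for differences below zero.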

In a net-accessible list of 109,582 English wordforms, there are 853 three-letter words, and 25,446 triples in which two three-letter words from the list add up by these rules to form a third word that is also on the list. This list of triples starts with ||aah+ado=adv||, ||aah+ads=adz||, ||aah+ale=all||, and runs through ||zoo+wax=vol||, ||zoo+zag=you||, and ||zoo+zap=yod||. There are 3,130 four-letter words, and 51,296 four-letter triples, from ||aahs+alfa=alms|| to ||zuni+tuck=sops||. There are 6,919 five-letter words, and 15,370 five-letter triples, from ||aahed+lance=laugh|| to ||zunis+tulsa=soyas||. There are 11,492 six-letter words, and 2,665 six-letter triples, from ||aahing+setons=seaway|| to ||zooids+casaba=bogies||.
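A count like the ones above could be produced along the following lines. This is only a sketch under stated assumptions: the seven-word list stands in for the real 109,582-wordform list, find_triples is a hypothetical name, and the sketch counts each unordered pair of addends once, which may or may not be the counting convention behind the figures quoted above.

```python
import string
from itertools import combinations_with_replacement

ALPHABET = string.ascii_lowercase


def add_words(x, y):
    """Position-by-position mod-26 sum of two equal-length letter strings."""
    return "".join(
        ALPHABET[(ALPHABET.index(a) + ALPHABET.index(b)) % 26]
        for a, b in zip(x, y)
    )


def find_triples(words):
    """Return (x, y, x+y) for every unordered pair of same-length words
    whose sum is also in the word list."""
    by_length = {}
    for word in words:
        by_length.setdefault(len(word), set()).add(word)
    triples = []
    for same_length in by_length.values():
        for x, y in combinations_with_replacement(sorted(same_length), 2):
            total = add_words(x, y)
            if total in same_length:
                triples.append((x, y, total))
    return triples


# Toy stand-in for the real word list (an assumption for illustration).
toy_words = ["aah", "ado", "adv", "cat", "you", "zag", "zoo"]
for x, y, total in find_triples(toy_words):
    print(f"{x}+{y}={total}")
```

The work grows with the square of the number of words of each length, which is tolerable for a few thousand words per length but is one reason the longer-phrase version of the puzzle gets expensive quickly.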

And of course we don't need to limit ourselves to message mappings where the word edges all line up. We have ||an_ear+troth=testy||, ||salute+in_case=annuli||, and so on (where we'll elide the spaces in the old-fashioned cryptographical style...).

To use this scheme as a cryptosystem, the sender and receiver of the message need a shared keystream of sufficiently random letters. The sender adds the secret keystream to the plaintext to create the cyphertext, and the receiver then subtracts the keystream from the cyphertext to re-create the plaintext.
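The sender and receiver roles can be sketched in a few lines of Python. The function names are hypothetical, the message is assumed to be lower-case letters with the spaces already elided, and the keystream is drawn from the standard library's secrets module as one plausible source of "sufficiently random" letters.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase


def random_keystream(length):
    """Draw `length` letters uniformly at random for use as a one-time key."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def _shift(text, key, sign):
    # Add (sign=+1) or subtract (sign=-1) the key letters, mod 26.
    if len(key) < len(text):
        raise ValueError("keystream must be at least as long as the text")
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % 26]
        for t, k in zip(text, key)
    )


def encrypt(plaintext, keystream):
    """Sender: add the keystream to the plaintext."""
    return _shift(plaintext, keystream, +1)


def decrypt(ciphertext, keystream):
    """Receiver: subtract the same keystream to recover the plaintext."""
    return _shift(ciphertext, keystream, -1)


plaintext = "meetmeatthealtar"
keystream = random_keystream(len(plaintext))
ciphertext = encrypt(plaintext, keystream)
assert decrypt(ciphertext, keystream) == plaintext
```

The security of a real one-time pad depends on the keystream being used only once and kept secret, which is exactly the part that the ring puzzle drops.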

In Dan and Sarah's plan, there is no secret random keystream. Instead, each ring's message forms the keystream for the other ring's plaintext, and the two rings together -- taken in either order, because addition is commutative -- create a new message, which is implicit in the two of them, but present in neither one alone. The combination-function is simple and hallowed by tradition. The puzzle is to find three messages that are about the right length to engrave on the rings -- at most 20 characters or so -- and also are appropriate to the occasion.

I'm sure that Dan and Sarah would accept lines from Petrarch or Goethe, but let's limit things to English messages for the moment. The number of up-to-20-letter-long English-word-sequence triples will be astronomical, and nearly all of them will be incoherent and matrimonially inappropriate -- and crashingly boring to boot. Given a word list, it's easy to write a program to enumerate the possibilities (though it might take a long time to run to completion), but searching that program's output for attractive phrases is not a plausible plan.

We can set one of the three strings to something appropriate, and look for ways to fill the other two strings with genuine word sequences. It's easy to write a program to do this, but again, nearly all of the results will be incoherent and silly at best, even if we scan pages of output for something reasonable:

...

...
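One way such a program might work is sketched below, under simplifying assumptions: the target message is fixed, one ring carries a single word of the same length, and the other ring's string is whatever is left after subtraction, accepted only if it can be cut into words from the list. The names segment and pairs_summing_to and the five-word list are invented for illustration.

```python
import string

ALPHABET = string.ascii_lowercase


def sub_words(x, y):
    """Position-by-position mod-26 difference x - y."""
    return "".join(
        ALPHABET[(ALPHABET.index(a) - ALPHABET.index(b)) % 26]
        for a, b in zip(x, y)
    )


def segment(s, words):
    """Return one way to split s into words from the set, or None."""
    best = [None] * (len(s) + 1)
    best[0] = []
    for end in range(1, len(s) + 1):
        for start in range(end):
            if best[start] is not None and s[start:end] in words:
                best[end] = best[start] + [s[start:end]]
                break
    return best[len(s)]


def pairs_summing_to(target, words):
    """Find (word, word sequence) pairs whose mod-26 sum is the target."""
    results = []
    for candidate in sorted(words):
        if len(candidate) != len(target):
            continue
        remainder = segment(sub_words(target, candidate), words)
        if remainder is not None:
            results.append((candidate, " ".join(remainder)))
    return results


# Toy word list (an assumption); a real run would load a full dictionary.
toy_words = {"an", "ear", "troth", "test", "zebra"}
print(pairs_summing_to("testy", toy_words))  # [('troth', 'an ear')]
```

A fuller version would let both rings carry multi-word sequences, at which point the number of candidates grows very quickly, as the next paragraph suggests.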

An approach that might be more promising would start with an appropriate output message, automatically derive a large number of word-sequence-pairs that sum to it, and then sort the results by estimated sequence probability based on some simple but reasonable model (like a trigram model or an aggregate bigram model). You could train or adapt the model on a collection of poetry. Then you might hope that scanning a few dozen screenfuls of output would turn up a plausible candidate or two. That's what I was hoping some kind and clever reader would take the time to program up for Dan and Sarah -- if I had a spare couple of hours, I'd do it myself.
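The ranking step might look something like the sketch below. It swaps in a plain word-bigram model with add-one smoothing as the simplest stand-in for the trigram or aggregate bigram models mentioned above, and it is trained on a two-line toy corpus rather than a real collection of poetry. All names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams, with <s> marking each sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(words, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a word sequence."""
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    tokens = ["<s>"] + list(words)
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )


def rank_pairs(pairs, unigrams, bigrams):
    """Sort candidate (ring 1, ring 2) word sequences, most probable first."""
    def score(pair):
        return sum(log_prob(seq, unigrams, bigrams) for seq in pair)
    return sorted(pairs, key=score, reverse=True)


# Toy training text (an assumption standing in for a poetry collection).
unigrams, bigrams = train_bigram(["my true love", "love is true"])
candidates = [
    (["zq", "xv"], ["kk", "jj"]),
    (["my", "love"], ["is", "true"]),
]
for ring_one, ring_two in rank_pairs(candidates, unigrams, bigrams):
    print(" ".join(ring_one), "/", " ".join(ring_two))
```

With a model this small the scores mean very little; the point is only that candidates built from familiar word sequences float to the top of the list, which is what would make a few dozen screenfuls of output worth scanning.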

But today's the Fourth of July, and those of us in the U.S. of A. will be spending today celebrating with family and friends. Still, if you're from a fourthless culture, or otherwise find yourself at loose ends at some point in the near future, you couldn't find a more sentimentally satisfactory puzzle than this one.

Posted by Mark Liberman at 07:03 AM

Cherokees Without a Language?

An attentive reader pointed me at this article in the Asian Pacific Post, reporting some putative evidence for the claims of the book 1421: The Year China Discovered America, which I have discussed here and here, namely a Ming dynasty medallion said to have been dug up in North Carolina. The medallion could, of course, have been brought to North Carolina relatively recently, and there seems to be no real evidence as to where it was found and in what context, but that isn't the worst thing about the article. The REALLY stupid thing about the article is its blithe assertion that the Cherokee "were so backward that they did not even have a language of their own".

Where do people get this stuff? Of course the Cherokee had a language of their own. It's called Cherokee. It's listed in the Ethnologue. It is related to the other ten Iroquoian languages, so it surely isn't something they acquired from Chinese or Europeans. Are there reports of the Cherokee as not having a language? Certainly not.

Even if you don't know anything about the Cherokee or their language and are too lazy to look it up, it is hard to imagine modern human beings with no language. There is no record of any such society. You'd think that such a claim would raise an editor's eyebrows.

Posted by Bill Poser at 03:39 AM

July 03, 2006

Linguifying

I'm back in Santa Cruz after an exhausting two weeks of packing and moving from Cambridge (with no time to write, and barely enough just to read Language Log once a day as everybody should). I'm pleased to see that the flood of tedious and patronizing messages about my post on Daniel Gilbert has died down a bit. So let me try again to explain what struck me about Gilbert's claim ("Movies, theater, parties, travel — those are just a few of the English nouns that parents of young children quickly forget how to pronounce"). We need a new term for what is going on; although I don't in general think you can only grasp concepts that you have words for, I have learned to my cost that at least some people find it hard to get the hang of a new concept if they have no word for it, and Mark agrees that there is no term already in use. I therefore take the step of coining a new lexeme: linguify. It is a term relating to the writer's art, and in particular to journalism. Definition: To linguify a claim about things in the world is to take that claim and construct from it an entirely different claim that makes reference to the words or other linguistic items used to talk about those things, and then use the latter claim in a context where the former would be appropriate.

I note in passing that linguifying a claim is usually (but not always) done in such a way that the new claim is false instead of true, and it is often (but not necessarily) done with the intention of achieving a humorous effect.

You're going to need an example, of course. I have already given several on Language Log, but those times most of you weren't paying attention. So shut up, switch off your cell phones, put your comic books away, and listen. I'm giving you an example. It's from my earlier post "Bisexual chic". A writer named Alexis Long apparently wanted to say that bisexuality was increasingly being seen by mainstream news media as fashionable. But what he actually wrote (in an Australian newsletter for bisexuals) was:

It's difficult to find a piece of writing in the mainstream press which mentions the word 'bisexual' without finding that it is immediately followed by the word 'chic'.

Instead of talking about mainstream media attitudes, he linguified the claim, constructing a new statement about obligatory word adjacency in running text.

How do I know he didn't mean exactly what he said? Because he couldn't possibly have thought it was true. I searched on Google straight away to check on the linguified claim. And the number of occurrences of the word "bisexual" that Google found in mainstream news sources followed by the word "chic" was zero. And extending the search to all news recorded by Google News at the time, I also got zero, out of 984 news pages containing the word "bisexual".

The original claim might well be true; but the linguified claim was staggeringly, outrageously false. In fact it just couldn't be any more false: what Alexis said was difficult to find in an occurrence of "bisexual" (that it was not immediately followed by "chic") is actually found in 100% of the occurrences. And what Alexis says occurs almost all the time is instead never found at all.

But I am not particularly interested in the literal falsity of the linguified claim. What I'm interested in is why anyone ever linguifies a claim at all.

In the case of Daniel Gilbert's linguification, the linguified claim was obviously intended as humorous. He didn't (surely) think it was true. (Nor did I ever think he did, though apparently many of my readers imagined that I did.) The linguified claim is certainly false; and it is certainly not an exaggeration of the underlying real claim in the sense that if it were true the underlying claim would be all the more true. But that's secondary. My interest is in why Gilbert would think linguifying was a good idea — why he thought it funnier, more interesting, or whatever.

Let me give another example. This one was sent by Rob Chametzky, apparently the only Language Log reader who saw clearly what I meant and sent me a new example (thank you, Rob; you truly hear the music; you have restored my faith in humanity). In The New York Times Magazine on Sunday, June 4, 2006, in an article called "Mass Natural", Michael Pollan wrote:

We have already seen what happens when the logic of the factory is applied to organic food production. The industrialization of organic agriculture, which Wal-Mart's involvement will only deepen, has already given us "organic feedlots" -- two words that I never thought would find their way into the same clause.

But he doesn't really mean he thought the adjective "organic" and the noun "feedlots" would never occur in the same clause. He couldn't possibly have thought that there would never be any attested sentences like this one:

Similarly, meat from organic grain-fed beef has the same nutritional profile as meat from the largest Kansas feedlot.
[http://www.eatwild.com/articles/whygrassfed.html]

The two underlined words are in the same clause. But Pollan never meant anything about clauses at all. He clearly meant that he never thought people would start regarding feedlots as being among the practices that characterize organic food production. His thought was about things, specifically farming practices; but he linguified it.

Here is one more. It's from Cynthia Gorney's article "Reversing Roe" in The New Yorker, June 26, 2006, page 50:

He seemed to have come to terms with the fact that a lot of literate people in his state now use his name and "sodomized virgin" in the same sentence.

Gorney is not really intending to tell us that South Dakota state senator Bill Napoli has come to terms with the co-occurrence of certain lexemes in certain uttered sentences. Her underlying claim is the following. A lot of people in South Dakota, when talking about Bill Napoli, make reference to his infamous appearance on PBS, where he was asked for an example of a situation where he would allow that an abortion should be legal even though giving birth would not kill the mother. What he said was that it would have to be a case of a virgin who had been saving her virginity for marriage but got brutalized, raped, and sodomized.

It is clear that Gorney does not think that people only mention these facts in terms that include "Napoli" and "sodomized virgin" within the boundaries of a single sentence. She has linguified her claim. (And by the way, the form of words "use X and Y in the same sentence" is fast becoming a snowclone of linguification.)

So, all you readers who went on and on at me about the Daniel Gilbert case — all you correspondents who told me finger-waggingly that nouns that parents of young children quickly forget how to pronounce "was obviously meant as a mildly humorous trope" (thank you M.S. of Madison, Wisc.), or that it "is just a figure of speech to indicate how rarely they would have cause to use those words" (thank you, Wangden), or that it is used "to indicate — jokingly — discomfort with or lack of knowledge of a concept" (thank you, Michele), or that "the underlying assumption may be that familiarity with a subject leads to (relative) expertise, which in turn implies that one is able to communicate effectively about the subject" (thank you, Matthew), or that it means parents "are not capable of enjoying such things as movies, theater, parties, travel- BECAUSE THEY HAVE NO WORDS FOR THEM" (thank you, Dr Pepper), or that perhaps "it isn't a rhetorical device, but a humor device: silliness" (thank you, Chele), or that "it's metaphor. Not an isolated metaphorical assertion ... but a whole metaphorical paradigm in the Lakoff & Johnson sense" (thank you, Rachael), or any of you others who wrote in: none of you have seen the point. I'm not failing to see the attempted humor, I have not become dyspeptic and humorless, I'm not taking it literally, I'm not incapable of understanding it, I am not unacquainted with the ideas of Benjamin Lee Whorf, I'm not ignorant of the (irrelevant) existence of hyperbole or metaphor or synecdoche.

What I wanted to draw attention to was simply the strange practice of publishing linguified claims: for example, saying the name X is invariably followed by the phrase Y when it isn't, or saying X is always accompanied by the qualifier Y when it isn't, and so on and so on. Why linguify? I have no idea. It just doesn't look like a good writing idea to me.

Posted by Geoffrey K. Pullum at 06:49 PM

Listening to the animals

Wittgenstein famously ventured that if a lion could talk, we could not understand him. But suppose we could. Poet Judith Barrington suggests that we might not want to listen. From Barrington's Horses and the Human Soul (Ashland OR: Story Line Press, 2004), the poem "Crows" (offered on today's Writer's Almanac with Garrison Keillor):

Crows

Crows startle the clouds
with grievances never resolved
and warnings blurted into thin air.

Once in a while, the cries of all those who tried to survive
pour from the funnels of their throats.
No wonder we never really listen.

Like most animals, crows tell the truth:
working hard to penetrate our tiny tubular ears,
they cackle on telephone lines while we watch TV.

Once I did listen to a crow, but even when I had heard
his whole story, there was nothing I could do.
Next, I thought, I'd have to listen to squirrels and coyotes.

I like to think I deal with my share of rotten truths
but I couldn't bear to kneel down in damp grass
and listen to the hedgehog or the mole.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 09:53 AM

We are the Inge Borg. You will be Inge borrowing.

I hope that Google's excellent machine-translation researchers will soon apply to German-English MT the talents that have produced such rapid progress in Arabic-English translation. In composing a post on the award of the Ingeborg Bachmann Prize to Kathrin Passig, I considered linking to Google Language Tools translation of the German-language Wikipedia entry "Ingeborg-Bachmann-Preis". Unfortunately, the results are pretty confusing, starting with the title, "Ingeborg-Bachmann-Preis", which should be "Ingeborg Bachmann Prize", but comes out as "Inge borrowing brook man price".

The translation system has decided that Ingeborg, rather than being just a plain old name, is actually a compound of Inge and some (I think morphologically unlikely) form of borgen "to borrow". Likewise the name Bachmann is broken into Bach+mann and translated as "brook man". (This is yet another piece of evidence that MT systems would do well to make use of what computational linguists call "named entity tagging"). The decision to translate preis as "price" rather than "prize" can be seen as a failure in another traditional dimension of computational linguistics, namely "sense disambiguation" -- though for modern statistical MT systems, all dimensions of language are sometimes viewed through the same set of algorithmic spectacles.

The name-translation problems continue in the first sentence of the article, where the phrase "jährlich in Klagenfurt (Kärnten) in einer mehrtägigen Live-Veranstaltung ermittelt" (which should be something like "awarded yearly in a multiple-day live presentation held in Klagenfurt (Kärnten)"), comes out as "annually in complaint ford (Kärnten) in a live-meeting of several days determined".

And things don't get better in the rest of the translated article, so I wound up linking to the corresponding English-language Wikipedia entry instead. "Complaint ford" is an appropriate venue for discussing current MT technology, alas.

I think (I hope!) this means that the German-English translation engine is still whatever commercial off-the-shelf ("COTS") system Google has licensed for this purpose. Come on, Franz, you folks can do better!

[Stefano Taschini writes:

Leaving aside the political and sociological issues of toponym translation, I think that Kärnten is usually referred to as Carinthia in English.

Indeed.]

Posted by Mark Liberman at 08:11 AM

Operation Enduring Freelance: the CIA gets it right

No, not that CIA, this one. This year's winner of the Ingeborg Bachmann Prize, announced on June 25, is Kathrin Passig, specialist in "Taktik, Technik und Theorie" for the "Zentrale Intelligenz Agentur", which is a Berlin-based "Freiberuflernetzwerk" ("freelance network") "an der Schnittstelle von Journalismus, Wirtschaft, Wissenschaft und Kunst" ("at the interface of journalism, commerce, science and art").

Wolfgang Behr has emailed to point out that Passig's prize-winning story, Sie befinden sich hier ("You are located here") includes this passage:

Eskimos haben, wie einfallslose Mitmenschen an dieser Stelle gern in die Konversation einwerfen, unzählige Wörter für Schnee. Vermutlich soll damit auf die abgestumpfte Naturwahrnehmung des Stadtbewohners hingewiesen werden. Ich habe keine Geduld mit den Nachbetern dieser banalen Behauptung. Die Eskimosprachen sind polysynthetisch, was bedeutet, dass selbst selten gebrauchte Wendungen wie ”Schnee, der auf ein rotes T-Shirt fällt“ in einem einzigen Wort zusammengefasst werden. Es ist so ermüdend, das immer wieder erklären zu müssen.

Vor meinen Augen entsteht gerade eine neue Art Schnee, nämlich Schneedurch-den-sich-ein-magerer-Hase-arbeitet. (...)

Eskimos have -- as unimaginative fellow humans are usually fond of interjecting into conversation at this point -- innumerable words for snow. This is probably to allude to the blunted perception of nature among city dwellers. I have no patience with the mindless repeaters of this hackneyed assertion. The Eskimo languages are polysynthetic, meaning that even rare phrases like "Snow falling on a red T-shirt" are combined into a single word. It is so tedious to have to explain this over and over again.

A new kind of snow emerges in front of my eyes, namely snow-through-which-a-skinny-hare-wades. (...)

Wolfgang observes that "this certainly confirms Passig's rank as the central agent of the Zentrale Intelligenz Agentur. And it is precisely the sort of thing Aikhenvald and Pullum would have written, if they hadn't been professional linguists (cf. 'Sasha Aikhenvald on Inuit snow words: a clarification', 1/30/2004)".

I'm not sure why their profession ought to get in the way here, but certainly this is a new chapter in what Laura Martin called "the genesis and decay of an anthropological example". Kathrin Passig's own professional life has been quite diverse. Her ZIA page lists her special skills as "Internet, Web-Entwicklung, Perl, PHP, Filmuntertitelung, technisches und literarisches Übersetzen E-D, technisches Übersetzen NL-D" ("Internet, web design, Perl, PHP, film subtitling, technical and literary translation from English to German, technical translation from Dutch to German").

I'd like to point out that the English-language media are falling down on the job here. Never mind Kathrin Passig's admirably well informed views on the Eskimo lexicon, she's just won what is perhaps the most important literary award in the German-speaking world. On June 25, more than one week ago. And I can't find her name indexed at the New York Times, the Washington Post, or indeed on any of the English-language media outlets indexed by Google News. Even Slashdot is silent (when was the last time that a Perl and PHP hacker won a major literary prize?), and Metafilter is down, so I can't check there.

Perhaps English-language media don't generally cover literary awards in other languages, I don't know. But Passig's story has all kinds of news hooks -- I mean, it's the ZIA! PHP and Perl! PowerPoint Karaoke! A prize winner accepting her award in a t-shirt with a picture of a retro computer terminal on it! And what with the World Cup and all, there must be plenty of American and British reporters in Germany, where Passig is from, and not all that far from Austria, where the prize was awarded. Some of them probably know German, and the rest have some recent experience puzzling out bits of it.

Anyhow, unless you read the German-language press, you (probably) read it here first.

[I've located just one English-language blog post: Nomadics, "Ingeborg Bachmann Prize awarded to Kathrin Passig", 6/27/2006. The signandsight review is here.]

Posted by Mark Liberman at 06:38 AM

July 02, 2006

NABI OK?

A chain of indirect reports recently led Roger Shuy to suggest that the National Association of Bunco Investigators promotes racist stereotypes of the Roma people. I yield to no one in my dislike of social stereotypes, but I think that this may well be unfair. I've never met any NABI members, or attended any of their meetings; but the cons and scams that they list on their web site are equal opportunity evil, as far as I can tell.

The only use of the word "gypsy" or "romani" on the (extensive) NABI website is in a reprinted LA Times story ("Gypsies: the Usual Suspects", by Hector Becerra, 1/30/2006). The story discusses a meeting in Valley Forge PA at which it seems that at least some sessions (perhaps just one session?) dealt with criminal gangs with Romani associations. The general tenor seems to be comparable to what you might hear in discussions of gangs with roots in Sicily or Central America or the Crimea. Since such gangs exist, it would be foolish to demand that no one ever discuss how to deal with them. In addition to the headline, there are some quotes that clearly make inappropriate ethnic generalizations ("Gypsies like high-end luxury cars, mostly Beemers [BMWs], Mercedes and Caddies these days," [retired New York Police Det. Edward] Berrigan told the conference attendees), but there is also evidence in the story that some of the presenters were making a sincere effort to talk about ethnically-associated gangs while clearly distinguishing the criminal subculture from the majority of the ethnic group.

I've had neighbors and acquaintances who were victimized by transient home-contracting fraudsters, and I'm happy that local police departments are sharing information about how to deal with them, whatever ethnic group the criminals come from. It's a good thing to watch out for ethnic stereotyping, and to object to it when it happens; but discussing how to deal with ethnically-associated criminal gangs is not ipso facto ethnic stereotyping.

[Update: a bit more poking around finds another NABI page (also a news story) that mentions "gypsies" -- the words "gypsy" and "Roman(i/y)" don't occur in it, so I missed it in an earlier search: "Self-Proclaimed 'Gypsies' Bilking Tucsonans". While alerting readers to a danger, this article fails to avoid encouraging anti-Roma stereotypes, in my opinion.]

Posted by Mark Liberman at 04:14 PM

Why the Attention to Bush's Language?

The thing that I don't get about the not very accurate criticism of George Bush's use of language is, why do people bother? I guess I could see it if he were a great President and there was nothing substantive to criticize. It might be silly, but it would give journalists and bloggers something to do and let them feel that they were keeping up the side and making the Executive branch toe the line, but the people picking on Bush's language aren't die-hard neocons who can find nothing to criticize in his politics.

Indeed, some people seem to have the impression that the Language Loggers who have critiqued the criticism of Bush's speech are defending Bush and his politics. I can't speak for them, but this is not a valid inference. I know this because the criticism of Bush's speech seems to me to be generally off the mark, yet I find the man so nauseating I change the channel if his picture comes up on television. For the record, in my opinion he's the worst President in the history of the United States. He's dishonest, ignorant, religious in the worst ways and none of the good ones, and a spendthrift. He favors the rich and the privileged, values loyalty to himself and his party over competence, has worked to break down the wall between church and state, and cannot tolerate dissent. He has been slow to address real crises, ranging from Darfur to Hurricane Katrina, forcefully and effectively. His use of false justifications for the invasion of Iraq is in my view treasonous. His diversion of resources from Afghanistan to Iraq has allowed the Taliban to reassert themselves in Afghanistan and made Iraq into a hotbed for terrorism, which it was not until he invaded. He has actively and persistently undermined the rule of law, both domestically and internationally, and has condoned torture. He should be impeached and tried for war crimes. This is just the beginning of my contempt for the man and what he represents.

With all of this to criticize, or to defend if one is so inclined, why on earth are people wasting their time and ours with this tripe about his use of language?! I'm a linguist. I like language. But for the life of me I don't understand why people think that the quality of George Bush's English is more important than what he does with the enormous power that he holds.

Posted by Bill Poser at 03:29 PM

Romani stereotypes

For several decades now Ian Hancock, head of the Romani Archives and Documentation Center at the University of Texas, has been trying to make people aware of the anti-Romani attitudes, stereotypes and brutal treatment all over the world. Hancock estimates that a half-million Gypsies were killed during the Holocaust alone. Himself a Romani academic, Hancock fights a lonely battle and can use all the help he can get. To whet my appetite, he recently sent me a January 30, 2006 article in the Los Angeles Times (sorry, I can't get a link), which describes the annual meeting of the National Association of Bunco Investigators, whose target population is, you guessed it, Gypsies -- although they use some sort of politically correct expressions these days, such as "professional transient burglars" and "transient offenders." These expressions appear to broaden the target category but the main focus of that meeting was the Romani, who, even though there is no demographic evidence to support this claim, are alleged to be the major perpetrators of the roofing and driveway repair scams in this country. Other prominent stereotypes are that the Romani are fortune tellers, thieves, liars, and filthy dirty people. It's scary that such stereotypes still exist.

In the past, linguists have made some progress in first exposing, then helping eradicate other ethnic and cultural stereotypes. For example, at the 1971 Linguistic Society of America annual meeting in St. Louis, one session dealt with the then new research on Vernacular Black English, showing how it related to the teaching of English in the schools. At that meeting, some linguists became aware for the first time that minority kids were widely considered to have "cognitive deficits," based only on their use of allegedly non-standard English. Despite the efforts of educational researchers like Siegfried Engelmann and Carl Bereiter, among others, to perpetuate such notions, the cognitive deficit theory couldn't withstand the onslaught of counter-evidence brought by linguists like Bill Labov, the sociolinguists at the Center for Applied Linguistics, and many others. Today this stereotype exists in the minds of only the most backward of educators. But as many Language Log posts have pointed out, the misinformed public always seems to find new ways to express ignorance about language. Minorities are still targets of police profiling around the country. They get pulled over more often, are searched more frequently, and they dominate the death rows of American prisons. Minorities are sometimes still victims of racial steering by some realtors, based only on the sound of their voices when they make telephone inquiries about available apartments and houses. A few racist realtors still refer them to properties and rentals that are only in the minority sections of town. All based on linguistic and ethnic stereotyping. We still have work to do.

Malcolm Gladwell's February 6, 2006 article in The New Yorker (see here) seems relevant to the stereotypes and overgeneralizations that sociolinguists face in their work. He compared current false stereotypes about pit bulls to the racial profiling stereotypes held and practiced by many law enforcement agencies -- as well as by Homeland Security. It took years for linguists to counter the ignorance of many educators about the alleged cognitive deficits evidenced by the culture and language of minority kids. Some other progress also is being made but the Romani seem to be way at the back of the line.

Posted by Roger Shuy at 01:01 PM

Productivity at the Bushism mine

On June 20, Jacob Weisberg posted this item in his "Bushism of the Day" department at Slate:

"I tell people, let's don't fear the future, let's shape it."—Omaha, Neb., June 7, 2006

This choice was criticized by Eugene Volokh, Ann Althouse and me, as a specific example of the general feebleness of the Bushisms feature.

Since Weisberg is a smart and insightful person, all of us felt that this feature's frequent stupidity and obtuseness require some explanation. Prof. Althouse suggested that "maybe [it's] just to keep Slate critics from noticing other problems". Prof. Volokh implied that it's elitism, since the cited usage "[is] a flub only in the sense that departure from the standard Northeastern/West Coast elite spoken English is a flub". I speculated that the motivation is mainly money, pointing to the many Bushisms products, which include not only 12 books (including e-books and so on) but also calendars, wall posters, refrigerator magnets and a DVD.

While it's perfectly proper for supply to meet demand in the marketplace for political ridicule, I wondered whether there's an ethical problem

when a magazine editor, whose job is making judgments about what is and is not worthy of publication, makes much of his income from re-publication of collections of a feature whose instances are so often so spectacularly superfluous. Does anyone think that Jacob Weisberg would consider very many of these "Bushisms" worth the space in his (excellent) magazine and the attention of his readers (which include me) ... if he didn't have a personal financial motivation for keeping the Bushisms brand and the Bushisms product line in the public eye?

As journalistic conflicts of interest go, I guess this is a venial one. It's not like the DNC is slipping envelopes of cash to Weisberg to reward him for making fun of the president. (Instead, Simon & Schuster is sending him quarterly royalty checks to reward him for making fun of the president.)

Several other bloggers picked up on these questions. Ron Hogan at Galleycat posted the question "Should Slate Lay Off Bush Already?" (June 20, 2006), and

tried emailing Weisberg to solicit a defense of the column and clarify some of the financial issues (like Liberman's guess that the royalties from all that Bush-mocking are "in the same range as what he makes at his day job"), but a week's gone by with no answer.

Not surprisingly, prominent journalists are just as reluctant to answer such questions as politicians and other public figures are. But after reading Hogan's post, I wondered whether there might be a sort of indirect response, in terms of a change in editorial behavior. So I took a look on Slate's site, and found that 12 days after the June 20 item, there has not been another "Bushism of the Week" posted.

However, a bit more investigation suggests that this gap probably doesn't mean anything. Here's a calendar showing the publication dates of the "Bushism of the Day" items so far in 2006 (according to the search function at Slate's site):

There were 26 weeks (less one day) in the first six months of 2006, and 25 "Bushism of the Day" items published at Slate during that same period, according to Slate's index (though the items for 3/24/2006 and 3/26/2006 are duplicates, reducing the count to 24). Though this is roughly one per week, the publication dates (which of course are not the same as the dates of the quotes) are far from being exactly weekly. There was a 21-day gap at the beginning of March, and a 20-day gap at the beginning of May. So my guess is that we'll see another flurry of "Bushism of the Day" items in a week or so, as Weisberg (or some intern assigned to the task) keeps the machinery grinding away down at the Bushism mine.
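For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes that "the first six months" means January 1 through June 30, 2006, inclusive, and it takes the item counts (25 listed, one duplicate) from the paragraph above. The variable names are just for illustration.

```python
from datetime import date

# Assumed period: January 1 through June 30, 2006, counted inclusively.
first_day = date(2006, 1, 1)
last_day = date(2006, 6, 30)
days = (last_day - first_day).days + 1

# Split the period into whole weeks plus leftover days.
weeks, leftover_days = divmod(days, 7)

# Counts taken from the text: 25 items in the index, one of them a duplicate.
listed_items = 25
duplicates = 1
unique_items = listed_items - duplicates

# Average spacing between items, in days.
days_per_item = days / unique_items

print(f"{days} days = {weeks} weeks and {leftover_days} days")
print(f"{unique_items} unique items, about {days_per_item:.1f} days apart on average")
```

Under those assumptions the period comes to 25 weeks and 6 days, which is the "26 weeks (less one day)" figure, and the average spacing is a bit over a week per item.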

Posted by Mark Liberman at 09:36 AM

July 01, 2006

PrairieDogSpeak

I just finished reading Temple Grandin's recent book Animals in Translation, one of the two best books I've read this year (the other was Jared Diamond's Collapse). She includes some speculations among her observations about how animals and autistic people perceive and interact with the world, and most of these speculations are stimulating, intriguing, and even plausible -- including some of her comments on animals and language, for instance about Alex the grey parrot (I confess, at the risk of being drummed out of Language Log Plaza, that I am both partial to parrots and inclined to be gullible, and also that I would be absolutely thrilled if I could be convinced that some non-human species has the equivalent of human language; still, even allowing for my biases, Alex is one spectacular bird). But she lost me when she talked about Con [sic] Slobodchikoff's claims about prairie dogs' language.

I'm not the first Language Logger to wonder about Slobodchikoff's claims: see Mark Liberman on the subject back in 2004. As Mark observed then, "it looks very much like the pattern familiar from Seyfarth and Cheney's classic work on vervet alarm calls, with additional results on the encoding of more abstract size and shape information in call variation, and especially a focus on the use of `variation in the internal structure of a vocalization to define possible information structures', as Placer & Slobodchikoff put it in their 2004 paper." Like Mark, I haven't seen the Slobodchikoff paper that Grandin cites -- only page 1, which is available on a link from Slobodchikoff's website. Page 1 consists mainly of background information on prairie dogs and their social organization. The whole paper, C.N. Slobodchikoff's "Cognition and Communication in Prairie Dogs" (2002) is just eight pages long, so the amount of argumentation for properties equivalent to those of human language has to be very limited. Here's part of Grandin's report on it:

Using sonograms to analyze the distress calls of Gunnison's prairie dog, he [Slobodchikoff] has found that prairie dog colonies have a communication system that includes nouns, verbs, and adjectives. They can tell one another what kind of predator is approaching -- man, hawk, coyote, dog (noun) -- and they can tell each other how fast it's moving (verb). They can also say whether a human is carrying a gun or not.

Something's wrong here, and it's not just the fact that rapidity of movement is a lot more likely to be expressed by an adverb (at least from an English speaker's perspective) than a verb. The main problem is that concepts like "noun" and "verb" have to be defined syntactically, according to their functions in sentences and discourse: the old "name of a person, place, or thing" definition of "noun", for instance, just doesn't work well in real life. A phrase like "destroy Carthage" is semantically very similar to "the destruction of Carthage", but the former is a verb phrase and the latter is a noun phrase. The on-line Oxford English Dictionary defines "noun" as a word "capable of functioning as the subject and direct object in a sentence, and as the object of a preposition", and a verb as "that part of speech by which an assertion is made, or which serves to connect a subject with a predicate"; both of these are syntactically-based definitions.

The point is that unless the prairie dogs have syntactic structure in their communication system, no linguist is going to accept a claim that they have nouns, verbs, or any other part of speech, or that their system approaches the level of human language. And nobody is likely to be able to prove, in an eight-page article, that the prairie dogs have syntactic structure. According to Grandin, Slobodchikoff does in fact claim that the prairie dogs have syntactic structure -- namely, that they use "transformational rules to create their calls". She explains transformational rules as follows:

In human language, a transformational rule allows you to turn words into sentences that make sense....The prairie dogs seem to have a transformational rule based on speed. Depending on how fast a predator is moving, they speed up their calls or slow them down.

But this interpretation has nothing at all to do with the transformational rules linguists talk about, or with syntactic structure. So if this is what Slobodchikoff has in mind, his understanding of what human language is like is shaky. None of this means, of course, that prairie dogs don't have an impressively elaborate system of alarm calls. But it is far, far from human language in its expressive and structural properties, again assuming that Grandin is reporting Slobodchikoff's brief article accurately.

Posted by Sally Thomason at 11:05 PM

The DeLorean saga

After my last post (here) in which I mentioned the case of US v. John Z. DeLorean, I got a slew of messages (well, three) asking me to say more about how linguistic analysis helped the car manufacturer get his acquittal at trial. So here it is, in abbreviated form.

It's a lot of work analyzing 64 audio- and videotaped conversations. After I corrected all the transcripts (always the first step), getting them in jury ready condition, I started with a topic analysis to find out who brought up which topics throughout. Then I clustered the topics of each speaker to get a picture of what was most on their minds, their agendas in other words. The first 30 or so conversations were between DeLorean and an undercover FBI agent who posed as a banker. At first the banker said that he thought he could get DeLorean's company either a loan or that he could help find some investors in the company. After several months of no progress along these lines the banker told DeLorean that he couldn't get him a loan but he'd keep on trying to find investors. Then he added, out of the blue, that he was also involved in a drug importation business. If DeLorean would care to invest 5 million dollars in it, this might solve his financial problems.

DeLorean's responses were non-committal because, as he testified, he wanted to keep open the possibility that the banker might still be able to find some investors. Several more conversations followed and DeLorean still didn't bite on the banker's drug scheme. His substantive topics continued to be about his problems getting the motor company up and running and his need for investors to keep it afloat, accompanied by a lot of bragging about how successful his new car would be. At one point DeLorean told the agent an outright lie. Evading the banker's persuasive efforts, DeLorean said that his last 2 million dollars had already been taken by the bankruptcy court. Undaunted, the agent then urged DeLorean to turn over either the titles of a few cars just off the assembly line or some stock in his ski equipment manufacturing plant. Getting nowhere with this, the government then switched tactics.

At the very time that these conversations had yielded nothing on which to base an indictment, other agents had just caught a drug smuggler flying drugs into the country. When they questioned him, he told them that a few years ago he had lived next door to the DeLorean family in San Diego. Their sons, in fact, had kept in touch with each other over the years. The name, DeLorean, leaped out at the agents and they got this pilot to become a cooperating witness and visit with DeLorean to try to convince him to buy into their drug scheme.

This secretly videotaped, 40 minute meeting took place at the L'Enfant Plaza Hotel in Washington DC on September 4, 1982. It became the major evidence used in the case. The pilot's task was to convince DeLorean that if he would just invest something, anything, in their scheme, this would be taken, as he put it, as "an act of good faith." It wasn't made clear exactly what he meant by this. DeLorean assumed that they meant it as an act of good faith to encourage them to find investors. The pilot then put some charts on the coffee table and showed DeLorean how his investment in their operation would make him enough money to escape bankruptcy. When he explained that these were "Colombian folks running a dope program," DeLorean's response was, "It'll be dangerous." Taking this as a positive sign, the pilot went on to explain how an $800,000 investment could return 40 million. He pointed out that there were two ways to go -- either interim financing or buy 100 kilos, a $300,000 investment, and get 14 million in return within ten days, time enough to forestall bankruptcy.

To this proposal, DeLorean told another lie: "I'm getting money through an Irish group. It's gotta be legitimate." He went on to explain how "tough these guys" are. The plant was in Ireland, so this reference, though not explicit, referred to the IRA. It was DeLorean's way of saying thanks but no thanks, along with a hint of threat in case he was pushed too far. Note that DeLorean did not explicitly say "no," but certainly this could be inferred. It also still left the door open for the agents to find investors, now DeLorean's only hope for saving his company. Changing the subject, DeLorean then asked, "Is their investment as a loan or as an equity investment?" The pilot replied, "Their interest is in megamillion dollar coke sales...so their interest is in stock." The government now believed that it had all it needed and they quickly indicted DeLorean.

A cumulative analysis of all 64 conversations showed that DeLorean's major substantive topic was to get either a loan or investors. The agent went along with this at first, saying he'd try to help him, then introduced the new topic, the drug scheme, over and over again without dropping the topic of finding investors. DeLorean didn't bite. At the final September 4 meeting we hear the two men talking about "investment," with neither of them explicit about who was investing in what. Both use the noun subject with no direct object, leaving this ambiguous. DeLorean meant, "you invest in my company," while the agent meant, "you invest in our drug operation." How to unravel this ambiguity? Through the meaning that DeLorean conveyed throughout the 64 conversations, coupled with the fact that DeLorean never said "yes" to any of the agents' proposals.

The prosecution in this case was based on several illusions that it probably hoped would convince a jury to convict.

1. The piling up of 64 recordings gives the illusion that there must be a huge pile of incriminatory evidence here. There wasn't. One clue to the government's failure to get the evidence it wanted is that the investigation went on for about a year before the indictment was made. This indicates that the earlier tapes did NOT give them what they needed.

2. The prominence of an indicted millionaire manufacturer is often thought to be fair game for jury conviction. Obviously, not every millionaire is a crook but there is often a negative illusion or predisposition that this is true.

3. The contamination principle was at work in this case. The very mention of illegal stuff like drugs leads to the illusion that the target is involved up to his ears, whether or not he really is.

4. The illusion that the agents' topics and agenda were about illegality tends to override DeLorean's topics and agenda that he wanted only a loan or investors.

5. The illusion created by ambiguity, noted above, can lead to the interpretation of guilt unless words like "investment" are set in their proper context.

Despite their lack of linguistic analysis, the government plowed right on, believing only one hypothesis, that of DeLorean's guilt. Good intelligence analysis investigates multiple hypotheses in the effort to reach conclusions. Not having done this, the government wasted heaps of taxpayer money on a lost cause and a failure to convict. The sad thing is that DeLorean suffered even more.

Posted by Roger Shuy at 05:47 PM

Another bite at "eats like a meal"

Well, our staff syntacticians are still off at the beach, but in response to my post on the English (pseudo-) middle voice ("Diagnosing soup label syntax", 6/29/2006), John Lawler sent in some additional perspective. In a postscript to his note, John warned me against trusting the Wikipedia:

I notice that you quote from Wikipedia approvingly pretty often. While there's a lot on Wikipedia that's useful, and group editing can often clarify texts, I think it might be wise to warn people occasionally that there's a lot of nonsense about English grammar there, too, presented with the same authoritative tone as the true facts. I always warn my students that it's not to be trusted as a source for facts about the English language.

I'm afraid that you can say the same thing about many other sources, including more than a few refereed journal articles and scholarly books from respected publishers. Even web logs are sometimes wrong! But when we get something wrong or leave something out, people like John are quick to remind us.

In this case, John reminds us that "the 'Middle Alternation' is the first one mentioned in ... Beth Levin's indispensable "English Verb Classes and Alternations" (U. of Chicago Press 1993)", and quotes her examples:

(1)a  The butcher cuts the meat.
   b  The meat cuts easily.

(2)a  The janitor broke the crystal.
   b  Crystal breaks at the slightest touch.

(3)a  Kelly adores French fabrics.
   b *French fabrics adore easily.

(4)a  Joan knows the answer.
   b *The answer knows easily.

(5)a  Bill pounded the metal.
   b *This metal won't pound.

(6)a  Bill pounded the metal flat.
   b  This metal won't pound flat.

John observes that Beth considers the possible relationship between this "middle construction" and the "causative/inchoative" alternation involved in examples like "The chemist melted the sample" vs. "The sample melted", which John observes "is much larger and more productive and more variable". (I remember learning about the causative/inchoative patterns from George Lakoff in an undergraduate syntax course, back in paleolithic times when we inscribed syntactic rules on mastodon shoulder bones...).

Quoting from "English Verb Classes":

The intransitive variant of this alternation, the middle construction, is characterized by a lack of specific time reference and by an understood but unexpressed agent. More often than not, the middle construction includes an adverbial or modal element. These properties distinguish the middle alternation from the causative/inchoative alternation. In particular, the intransitive variant of the causative/inchoative alternation, the inchoative construction, need not have an understood agent, may have specific time reference, and does not have to include adverbial or modal elements.

However, there has been some debate in the literature about whether there really is a middle alternation that is distinct from the causative/inchoative alternation or whether there is only a single alternation. Verbs that display the causative/inchoative alternation are found in the middle construction, but there are a number of verbs found in the middle construction that do not display the causative/inchoative alternation. The middle alternation is described as being restricted to verbs with affected objects. This constraint is used to explain the data above involving 'pound': the object of this verb is not affected by the action of the verb, so that the verb is found in the middle construction only in the presence of a resultative phrase, which contributes a state that results from the action of pounding.

John adds:

Middle sentences are more likely than not to be generic, which fits in with the adverbial/modal element Beth mentions.

The presenting slogan ('The soup that eats like a meal') is weird because it violates the restriction to affected objects: eating something destroys it, which is not technically 'affecting' it within the meaning of the act. That's why it sounds strange, I think.

And, as for 'She takes a good picture', that's long been one of my best examples of exotic middle constructions, but the effect is simple enough if you consider 'take a picture (of)' as a compound transitive verb equivalent to 'photograph', with the 'of' appearing only on overt objects, like 'at' or 'to' with 'look at' or 'listen to'.

The Campbell's Soup slogan is more "striking" than "weird", it seems to me. To paraphrase Geoff Pullum's remark on Talk of the Nation the other day, I've seen weird, and this isn't it. But it's true, at least, that many verbs involved in such valency alternations (like melt or cut) are likely to have both patterns listed in their dictionary entries, whereas I haven't found a dictionary that bothers to give the "eats like a meal" pattern in its entry for eat.

Posted by Mark Liberman at 04:46 PM

Mitchell:	Just understand that it's my job. I still think you're a good cop.
Frank:	Well, Mitchell. I guess you're gonna do what you're gonna do. Let's just try and stay friends no matter what.
Mitchell:	You're right. Maybe I'll ss-see you around.
Frank:	Goodbye. Oh, and Mitchell? [voice lowers to a whisper] You... got some shit on the side of your mouth right there.
Mitchell:	Oh, yeah, that ol' thing, yeah.
Viewers:	... Wwooww!!!

Kyle:	"Curse words" -- they're called that because they are a curse. We have to go back to only using curse words in rare, extreme circumstances.
Stan:	And besides, too much use of a dirty word takes away from its... impact. We believe in free speech and all that, but... keeping a few words taboo just adds to the fun of English.
Cartman:	So please, everyone, from now on you've got to try and watch your language.

Lisette:	Patience! patience!
Frederick:	This word is not in my dictionary.
Lisette:	Then write it in it. Keep your tender letter. I shall tell her, that a handsome young gentleman, with a pair of large wild eyes, has resolved to love her eternally. Not so?

...	x	y	z	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o	p	q	r	s	t	u	v	w	x	y	z	a	b	c	...
...	-3	-2	-1	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	...
		...	25	26	27	28	29	30	31	32	33	34	35	36	37	38	39	40	41	42	43	44	45	46	47	48	49	50	51	...
			...	-26	-25	-24	-23	-22	-21	-20	-19	-18	-17	-16	-15	-14	-13	-12	-11	-10	-9	-8	-7	-6	-5	-4	-3	-2	-1	0	...
