September 30, 2005

The Truth About French

You can tell that Geoff Pullum is a syntactician. His remarks on French focus on syntax and semantics, all but omitting phonology, phonetics and orthography. I therefore pass on the following observations of unknown authorship, which I received many years ago on a sheet of paper headed The Truth About French. I have not been able to turn them up via Google. The only clue that I have is that the use of the word stoat suggests that the author is British. (Stoat is the British word for weasel (Mustela erminea), especially in its brown phase.)

[Addendum 2005-10-01: Reader Caity Taylor has pointed out that in Britain there are both stoats and weasels. What in North America is called a weasel, Mustela erminea, is called a stoat in Britain. The animal called a weasel in Britain is Mustela nivalis, which is not found in North America.]

One of the many engaging peculiarities of the French is their conviction that their language - if they could only keep it pure of Anglicisms - is one of singular beauty and nobility. Nothing could be further from the truth. French is nothing but Latin (a gawky language to start with) in an advanced stage of putrescence. The words, at any rate, nearly all derive from Latin, though their sense has sometimes been so perverted that, for example, the mangled husks of the Latin persona person and rem thing now signify no one and nothing.
If Caesar could arise from his tomb and revisit that land of three parts upon which his conquests imposed his language, he would have sore difficulty in recognizing a word of it. Many of the imperial consonants have fallen silent; some of the vowels have done likewise, while others have reduced themselves to strangulated peeps, or sought concealment in the nasal passages. The syllables scurry past with their heads down, except that every now and then one of them will pop up on its hind legs like a stoat, making them all pause for a moment before they scuttle on.
Many a proud vocable has been filleted and shrunk almost to nothing; for instance, the summer month that once bore the majestic name of Augustus has in the mouth of the French been reduced to the sound Oo. In an effort to counter this vanishing effect, and to prevent their sentences from becoming too short to be noticed, they throw in all the extra words they can find, to serve as ballast, which results in the creation of such convolvular periphrases as "qu'est-ce que c'est que ça?" A logical tongue, they would have us believe; and indeed we find no want of rationality in the arrangement by which 100 being "cent", 200 is "deux cents". Except that 300 is not "trois cents" but "trois cent". A singular plural, in very truth; fruit of a language wondrous indeed in its notions of orderliness.
English may fairly be criticized for the vagaries of its orthography, only the criticism comes ill from the speakers of a dialect in which, where eaux is written, no e is sounded, no a, no u, and least of all an x, but only o. It might as well be spelled aquas, which is what it comes from. There are two sorts of h: one of them is silent, and might as well not exist at all; the other, dignified by the name h aspiré, is not (as the level-headed student might suppose) aspirated, but is as silent as the other, only people refrain from eliding vowels before it in case it should be offended - strange homage to a puff of breath long extinct.
Posted by Bill Poser at 09:30 PM

Rebuttle

I don't usually get into the whole eggcorn collection thing, but this one was too good to pass up: rebuttle (54.5K ghits) for rebuttal (4,580K ghits). Even better, the example I found was rebuttle for rebut:

If you are a Firefox/IE user and think Opera is dumb, please feel free to comment why, and I'll glady rebuttle it with a logical and very reasonable statement.

Posted by Eric Bakovic at 05:50 PM

The miserable French language and its inadequacies

I am really more than a bit disgusted that a speaker of French — of all languages — should have the nerve to criticize the English language (if the woolly verbiage of Professor Sergeant can really be called criticism). Let's be clear (since so many people seem to think the French always have a word for everything): this is a language used by people who are supposed to be the big experts in love and kissing and sexy weekends of ooh-la-la, and they don't have words for "boy", "girl", "warm", "love", "kiss", or "weekend".

No they don't! Don't contradict me. I'm a Senior Researcher and Vice President for Diplomacy at Language Log.

Boy-meets-girl stories cannot really be told in French, because there is no word for "boy" — garçon means "waiter", as everybody who has ever seen a movie with a scene in a French restaurant knows — and fille means either "daughter" or "whore" depending on whether you sneer in a certain way when you use it. (French speakers struggle by with the phrase jeune fille as a work-around to refer to a girl.)

Boy may long for girl to hold him in her warm embrace, but he won't be able to tell her that in French, because they don't have a word for "warm". They have tiède, which means "tepid", but boy doesn't long for girl to hold him in her tepid embrace. So what they use is chaud, which is the word on the hot water tap, the one that isn't froid. A language of love that was minimally functional would be able to distinguish between a warm friendship (enthusiastic discussion of topics of common interest; amicable farewell handshakes with promises to do lunch real soon) and a hot friendship (passion, heavy breathing, sudden uncontrolled couplings in shadowy doorways and on moving trains, returning home having lost underwear, midnight calls to say I have to have you right now). If boy cannot distinguish lexically between these, boy is going to be in real trouble with his relationship with girl.

Now consider love. Aimer is not a word for "love"; it is completely vague between loving and liking; you use it both for the way you are devoted to your spouse and the way you prefer to have your coffee. How do you really feel about me? Je t'aime. How's your fish? Je l'aime. Lover, haddock, whatever; it's all the same. These people do not have a word for love.

Baiser does not mean "kiss". It apparently did once, but today it is not a word you should try to use for a peck on auntie's cheek — it now means "fuck". And embrasser does not mean "kiss" either; people use it for that, but it clearly means "embrace" — bras means "arm". Although the French are widely thought to have invented at least one variety of kissing, they have no word that specifically denotes the activity.

And finally, if, despite all the above lexical difficulties, boy ever gets along with girl well enough to invite her away from Paris for a weekend of ooh-la-la in Dieppe, he will once again find himself completely stuck to express the notion of this crucial time period. What speakers do (to the disgust of the French Academy, which is charged with trying to prevent the miserable French tongue from completely falling apart) is to talk about le weekend. A borrowing from the very English that these linguistic cripples have the temerity to condemn.

So where do they get off, criticizing the language in which fine writers like William Shakespeare and Dan Brown created their literary masterpieces, huh? It makes me so mad.

I know, I'm going to get a whole flood of stupid email defending the beautiful French language and its expressivité: "La langue française, elle est si belle", they'll say, referring to their language as if it were a girl (not that they can say "girl"); Le français, they will say (inexplicably switching their gender decision from feminine to masculine), "est une langue" (O.K., so we're back to feminine again) magnifique, la langue de Racine et de Molière et de Balzac et de Rimbaud... All this from people who think a uvular scraping sound like a cat bringing up a hairball is a perfectly reasonable noise to use instead of an honest "r". From people who simply cannot make their minds up about whether an attributive adjective should precede the modified noun (sensible!) or follow it (silly!): the ever-indecisive French say un bon vin blanc ("a good wine white"), with one before the noun and one after. Get a grip! Pick one or the other!

Anyway, I don't care if the francophones bombard me with hate mail. Let them sue me for 500,000 yen for defaming their linguistic patrimony. I'm not buying the idea that this is a language fit to hold its head high and participate in world diplomacy and lovemaking. This is a language to be tossed on the scrap-heap of human communicative failures.

And if it seems to you that I'm being a bit tough on the French, let me just point out that they started it.

Posted by Geoffrey K. Pullum at 05:03 PM

Burnt offerings

My garden-path sentence of the week was in a NYT article by Michael Slackman: "Ever Since A.D. 270, the Need to Get Away From It All". It's about how some 4th-century monks' cells were discovered in Zafarana, Egypt, beneath an 8th-century church beneath a 15th-century church.

"This was a complete surprise," Father Maximous said pointing at the monastic cells.

In the corner of one is a brick stove that was used for cooking. Another was used for prayer. The cells told a story of monks who lived together, with several people in one cell. There was also a basin that was used to soak palm fronds, which they used for weaving things like mats and baskets.

How, I wondered briefly, did those monks use a stove for prayer?

Posted by Mark Liberman at 09:09 AM

September 29, 2005

Paradoxes of the imagination

Jean-Claude Sergeant, a professor of britannic civilization ("professeur de civilisation britannique") at the University of Paris III, recently presented in the pages of the Courrier International his theory that English is a paradoxical language ("L'anglais, langue paradoxale", Sept. 15, 2005).

Prof. Sergeant sums up his idea at the end of the essay:

L'anglais est une langue paradoxale. Assez rigidement structurée sur le plan syntaxique, on pourrait la qualifier de "no nonsense language", c'est-à-dire de langue où l'à-peu-près n'a pas sa place, au risque de voir cette appréciation immédiatement contredite par l'emploi surabondant de reprises anaphoriques – this, that et leurs formes plurielles – et par les multiples possibilités de raffinement lexical que lui permet le jeu des postpositions adverbiales. On approchera un peu mieux encore la vérité de la langue en rappelant que le terme "reasonable" renvoie davantage à ce qui relève du bon sens que de la raison.

English is a paradoxical language. Quite rigidly structured on the syntactic level, we could call it a "no nonsense language", that is to say a language where there is no place for more-or-less, at the risk of seeing this insight immediately contradicted by the overabundant use of anaphoric references -- this, that and their plural forms -- and by the multiple possibilities of lexical refinement permitted by the play of adverbial postpositions. We will come a bit closer to the truth of the language by recalling that the term "reasonable" refers more to common sense than to rationality.

Those Britannics, with their rigid syntax and their excessive anaphora. Truly a nation of linguistic shopkeepers.

Jokes aside, the Courrier's theory checkers seem to have been asleep at their posts as this essay was edited. At least, I can't make sense of the implied logical connections among syntactic rigidity, use of demonstratives, lack of nuance, and rationality. Being an American, however, I'm willing to ignore the theory and focus on checking the facts. Earlier in the article, Sergeant explains in more detail his idea about the inflexibility of English syntax:

Dans sa configuration actuelle, l'anglais courant se caractérise d'abord par un extrême souci de cohérence et d'explicitation proche de la redondance. Les noyaux constitutifs de la phrase – sujet, verbe, complément – ne se laissent pas aussi facilement scinder qu'en français, et l'ordre dans lequel ils interviennent dans la phrase est moins susceptible d'être modifié. On ne peut guère, en anglais, accumuler en début de phrase des éléments de complémentation comme cela se pratique dans la presse française. Ainsi, la traduction de la phrase : "Enarque de 42 ans parvenu au terme d'une brillante carrière de conseiller financier, Jean Dupont songeait à se lancer dans la politique […]" ne saurait différer trop longtemps l'identification du sujet. Une traduction possible serait : "After a successful career as a financial consultant, Jean Dupont, a 42 years old ‘énarque', was considering getting into politics […]"

In its present configuration, current English is characterized first by an extreme concern for coherence and for explicitness approaching redundancy. The core constituents of the phrase -- subject, verb, complement -- cannot be as easily separated as in French, and the order in which they occur in the phrase is less susceptible of modification. We can hardly, in English, accumulate at the start of the phrase the elements of complementation (?) as this is practiced in the French press. Thus the translation of the phrase "Enarque de 42 ans parvenu au terme d'une brillante carrière de conseiller financier, Jean Dupont songeait à se lancer dans la politique […]" could not postpone too long the identification of the subject. A possible translation would be "After a successful career as a financial consultant, Jean Dupont, a 42 years old ‘énarque', was considering getting into politics […]"

(An ‘énarque', by the way, is a graduate of the Ecole nationale d'administration.)

The empirical claim here seems to be that we no-nonsense English speakers, in our impatience to get to the meat of the matter, can't match the typically Gallic accumulation of pre-subject "éléments de complémentation". I'm not sure what Sergeant means by "éléments de complémentation", but in the example cited, the pre-subject material is just a floating adjunct, hardly unknown in English. Mapping Enarque onto the rough cultural equivalent Harvard MBA, we could translate his example as the perfectly grammatical English

A 42-year-old Harvard MBA with a successful career as a financial consultant, Jean Dupont was considering getting into politics.

So if Prof. Sergeant is claiming that there's a hard-and-fast grammatical distinction on this point between English and French, he's certainly mistaken. But perhaps he has in mind a statistical difference -- maybe journalists writing in French tend to use this sort of initial adjunct more often than those writing in English do. Since I have a half a cup of coffee left, and a half an hour before I need to go to the gate to catch my flight, let's take a quick look, courtesy of Google News.

One recent event reported world-wide was the conviction of Lynndie England. The ledes of the first eight French-language articles I found on the subject (limited to those where her name was the subject of the first sentence):

La soldate américaine Lynndie England a été condamnée à trois ans de prison et radiée de l'armée pour mauvais traitements sur des détenus irakiens.
Lynndie England, la soldate américaine la plus connue pour son implication dans les sévices sur des prisonniers irakiens à la prison d'Abou Ghraïb en dehors de Bagdad, a été condamnée mardi par un jury militaire à trois ans de prison ferme.
La réserviste américaine Lynndie England, symbole du scandale de la prison irakienne d'Abou Ghraib, en 2003, a été condamnée à trois ans de prison, mardi, par la cour martiale de Fort Hood, au Texas, pour mauvais traitements sur des détenus.
La femme soldat américaine Lynndie England, symbole des sévices sur les détenus à la prison irakienne d'Abou Ghraïb, a été condamnée mardi à trois ans de prison.
Lynndie England, 22 ans, était au coeur du scandale de la prison irakienne d'Abou Ghraib en 2003.
Lynndie England a été reconnue coupable de six des sept chefs d'accusation retenus contre elle.
Lynndie England, la soldate américaine au centre du scandale de la prison irakienne d'Abou Ghraib a été reconnue coupable.
La soldate américaine Lynndie England a été hier reconnue coupable par une cour martiale de six chefs d'accusations sur sept pour mauvais traitements sur des Irakiens, à la prison irakienne d'Abou Ghraib en 2003.

Of these 8 examples, 4 have material of some sort before her name. None of these, however, really seems to be the sort of adjunct that Sergeant wrote about -- they seem to me simply to be pre-nominal modifiers of the kind common to English-language journalism as well, and also favored by Dan Brown.

In this respect, the English-language press is not very different. Of the first 8 suitable examples that I found, 6 had some modifier before her name. Two of these were just the title Pfc., so that leaves 4 with modifiers like "army private" and "US soldier", just as in the French press:

Army Private Lynndie England, whose smiling poses in photos of detainee abuse at Baghdad's Abu Ghraib prison made her the face of the scandal, was convicted yesterday by a military jury on six of seven counts.
US Pte Lynndie England has been found guilty of abusing prisoners at Iraq's Abu Ghraib jail.
Private Lynndie England, the US soldier who became the icon of the Abu Ghraib prisoner abuse scandal is facing up to 10 years in prison after a military court found her guilty of mistreatment.
Pfc. Lynndie R. England, a 22-year-old Army file clerk whose smirking photographs came to personify the Abu Ghraib prison scandal, was convicted Monday of joining in the abuse when she posed next to detainees who had been stripped and put into humiliating poses.
Lynndie England, the US Army reservist photographed grinning as she humiliated Iraqis at Abu Ghraib prison, was convicted yesterday on six charges of detainee abuse.
Lynndie R. England, the Army Reserve private who became a symbol of the Abu Ghraib prisoner abuse scandal after she was photographed holding a dog leash attached to a naked Iraqi detainee, was convicted Monday on six of seven charges at a military court-martial.
Pfc. Lynndie R. England, the Army reservist who appeared in infamous photographs humiliating detainees at the Abu Ghraib prison in Iraq, was found guilty of six counts of abuse and indecent acts yesterday in the final court-martial for the original group of soldiers who touched off an international furor over U.S. treatment of prisoners.
US soldier Lynndie England was found guilty of abusing prisoners at Iraq's Abu Ghraib by a military jury.
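The tallies for both sets of ledes can be reproduced mechanically. What follows is only a minimal sketch in Python: the opening words are copied from the sixteen ledes quoted above (truncated just after the name or its first appositive), while the helper names `pre_name_material` and `tally` and the set `BARE_TITLES` are hypothetical, introduced purely for illustration. It treats anything before "Lynndie" as pre-name material and counts "Pfc." alone as a bare title.

```python
# Opening words of the ledes quoted above, truncated for brevity.
NAME = "Lynndie"
BARE_TITLES = {"Pfc."}  # assumption: only "Pfc." counts as a mere title

french_openings = [
    "La soldate américaine Lynndie England",
    "Lynndie England, la soldate américaine",
    "La réserviste américaine Lynndie England",
    "La femme soldat américaine Lynndie England",
    "Lynndie England, 22 ans",
    "Lynndie England a été reconnue coupable",
    "Lynndie England, la soldate américaine",
    "La soldate américaine Lynndie England",
]

english_openings = [
    "Army Private Lynndie England",
    "US Pte Lynndie England",
    "Private Lynndie England",
    "Pfc. Lynndie R. England",
    "Lynndie England, the US Army reservist",
    "Lynndie R. England, the Army Reserve private",
    "Pfc. Lynndie R. England",
    "US soldier Lynndie England",
]


def pre_name_material(lede):
    """Return the text before the name ('' when the name comes first)."""
    return lede.split(NAME, 1)[0].strip()


def tally(ledes):
    """Return (ledes with pre-name material, how many of those are bare titles)."""
    before = [pre_name_material(lede) for lede in ledes]
    with_material = [text for text in before if text]
    bare_titles = [text for text in with_material if text in BARE_TITLES]
    return len(with_material), len(bare_titles)


print("French: ", tally(french_openings))   # (4, 0)
print("English:", tally(english_openings))  # (6, 2)
```

A string split on the name is a crude stand-in for real parsing. It cannot tell a pre-nominal modifier from a floating adjunct, which is exactly the distinction that matters for Sergeant's claim.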

Looking around a bit further, I seem to find things like this in English

In a surprise move, the conservative winner of last Sunday’s Polish elections, Jaroslaw Kaczynski, has proposed economics expert Kazimierz Marcinkiewicz as the new Polish prime minister - refraining from taking the top job himself.

about as often as I find things like this in French:

Dans un communiqué, le contrôleur européen de la protection des données (CEPD), Peter Hustinx, se dit peu convaincu par la nécessité de mettre en place une directive européenne concernant la conservation des données électroniques et téléphoniques ...

The adjuncts and modifiers of names, in general, seem to be deployed fore and aft of subjects at about the same rates in both languages (these are not translations, just strings found in the ledes of different stories):

Le ministre de l'Intérieur Nicolas Sarkozy
Nicolas Sarkozy, ministre d'Etat à l'Intérieur et président de l'UMP (droite majoritaire),

French Interior Minister and number two in the government, Nicolas Sarkozy,
Nicolas Sarkozy, France's ambitious interior minister and head of the ruling UMP party,

L'ancien patron de l'Agence fédérale de gestion des situations d'urgence (FEMA), Michael Brown,
Michael Brown, le directeur de la FEMA, l'agence fédérale chargée de la gestion des secours d'urgence,

The former director of Fema (The Federal Emergency Management Authority), Michael Brown,
Michael Brown, head of the Federal Emergency Management Agency in charge of the US response to natural disasters,

So, I'm not claiming any statistical significance here, but I've seen enough to make me willing to offer Prof. Sergeant a small wager: in a well-controlled study of French-language and English-language journalism, we would not find a significantly greater tendency for the French stories to have pre-subject adjuncts or modifiers (and I'll throw in adverbials, if he wants).
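To see how little sixteen ledes can establish, the Lynndie England counts reported earlier (4 of 8 French ledes and 6 of 8 English ledes with material before the name) can be put through Fisher's exact test. This is only an illustrative sketch: the choice of test and the function name `fisher_exact_two_sided` are assumptions made here, not anything the post itself ran, and the code uses nothing beyond Python's standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one. Integer weights
    are compared so that ties are not lost to floating-point rounding.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(k):
        # Unnormalised probability of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / comb(total, col1)


# Rows: French, English. Columns: material before the name, name first.
p_value = fisher_exact_two_sided(4, 4, 6, 2)
print(round(p_value, 3))  # 0.608
```

A p-value of about 0.61 means that a 4-versus-6 split is entirely unremarkable for samples of eight, which fits the decision to offer a wager rather than a conclusion.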

Posted by Mark Liberman at 12:45 PM

September 28, 2005

Philologists say ...

I've been reading, more or less for fun, Stuart Shieber's recent edited book The Turing Test. I was struck by two language-related pseudo-arguments, both made in chapter 7, "Can automatic calculating machines be said to think?", pp. 117-132. This chapter is a transcript of a 1952 radio interview among M. H. A. Newman, Alan M. Turing, Sir Geoffrey Jefferson, and R. B. Braithwaite.

The first is a weak etymological argument made more or less in passing by Sir Geoffrey, on the first page of the transcript (p. 117 of the book, emphasis added):

I don't think that we need waste too much time on a definition of thinking since it will be hard to get beyond phrases in common usage, such as having ideas in the mind, cogitating, meditating, deliberating, solving problems or imagining. Philologists say that the word "Man" is derived from a Sanskrit word that means "to think", probably in the sense of judging between one idea and another. I agree that we could no longer use the word "thinking" in a sense that restricted it to man. No one would deny that many animals think, though in a very limited way. They lack insight. [...]

The OED sort of agrees with the spirit of this, but not the letter:

This word and Sanskrit manu have been together referred by some to the Indo-European base of MIND n.1, on the basis that thought is a distinctive characteristic of human beings.

And furthermore:

A more recent theory suggests a derivation (with loss of an initial obstruent) from the Indo-European base of Lithuanian žmonės people and Old Prussian smunents man, which is a variant (with a different ablaut grade) of the Indo-European base of classical Latin homō man, Old English guma and its Germanic cognates (see GOME n.1), and Old Lithuanian žmuo; but these Indo-European words are usually referred to the Indo-European base of classical Latin humus (see HUMUS n.) and ancient Greek χθών (see CHTHONIC n.) meaning 'earth'.

The second is a potentially stronger claim about the connection between words and ideas, made by Newman (pp. 122-123, emphasis again added):

[Consider the] different kinds of number. There are the integers, 0, 1, -2, and so on; there are the real numbers used in comparing lengths, for example the circumference of a circle and its diameter; and the complex numbers involving [the square root of -1]; and so on. It is not at all obvious that these are instances of one thing, "number". The Greek mathematicians used entirely different words for the integers and the real numbers, and had no single idea to cover both. It is really only recently that the general notion of kinds of number has been abstracted from these instances and accurately defined. [...]

What I'm unclear about in this case is the link between the conjoined verb phrases in the bolded part of the quotation ("used entirely different words for the integers and the real numbers, and had no single idea to cover both"): is it and furthermore, or and therefore? If the latter, I'm skeptical about the conclusion that the Greeks lacked the general idea simply because they lacked the general word -- though either way, I'm probably just exposing my ignorance of Ancient Greek mathematics and mathematicians.

Posted by Eric Bakovic at 05:13 PM

Tingo and other lingo

[Guest post by Benjamin Zimmer]

A burgeoning new field in pop linguistics consists of gathering together words and phrases in the world's languages that are deemed "untranslatable" into English (or at least lack a tidy lexical translation-equivalent). Howard Rheingold (of Virtual Community and Smart Mobs fame) led the way with his 1988 book, They Have a Word for It: A Lighthearted Lexicon of Untranslatable Words and Phrases (republished by Sarabande Books in 2000). Last year saw the publication of Christopher J. Moore's In Other Words: A Language Lover's Guide to the Most Intriguing Words Around the World. Now comes the latest entry, The Meaning of Tingo: And Other Extraordinary Words from Around the World by Adam Jacot De Boinod.

I haven't seen de Boinod's book yet, but a BBC article gives some examples that are typical of the genre:

While English speakers have to describe the action of laughing so much that one side of your abdomen hurts (hardly an economical phrase), the Japanese have the much more efficient expression: katahara itai.

Of course, the English language has borrowed words for centuries. Khaki and croissant are cases in point.

So perhaps it's time to be thinking about adding others to the lexicon. Malay, for instance, has gigi rongak - the space between the teeth. The Japanese have bakku-shan - a girl who appears pretty from behind but not from the front. Then there's a nakkele - a man who licks whatever the food has been served on (from Tulu, India).

Tingo, we are told, is a word from the Pascuense language of Easter Island meaning "to borrow objects from a friend's house, one by one, until there's nothing left." That definition appears to be borrowed wholesale from de Boinod's predecessor, Howard Rheingold.

De Boinod is no linguist (he's a researcher for the BBC comedy quiz show QI), but he claims to have read "over 280 dictionaries" and "140 websites" (or, according to his publisher's site, "approximately 220 dictionaries" and "150 websites" — take your pick). It's safe to assume that the fact-checking for such books is rather minimal — if a website says it, it must be true, right? One of the examples provided in the BBC article is in a language I'm familiar with: Malay/Indonesian. Does gigi rongak mean "the space between the teeth" in Malay? Well, not exactly. Gigi means "tooth" or "teeth," and rongak means "discontinuous, gapped." So gigi rongak would simply translate as "gapped teeth," or, if used attributively, "gap-toothed." Looks like this particular compound phrase isn't so "extraordinary" after all, since no doubt countless languages have rather unremarkable equivalents for gap-toothed. My guess is that de Boinod relied on an online Malay-English Dictionary that inaccurately translates gigi rongak as "gap between teeth."

As an aside, the reliance on sketchy online dictionaries and wordlists can yield unintentionally humorous results. Take, for instance, the Maserati Kubang. Unveiled in 2003, this "concept car" is supposedly named after "a wind over Java." (Maserati has a tradition of naming cars after exotic-sounding winds.) Close, but no cigar — the actual word is kumbang, not kubang. Angin kumbang literally means "bumblebee wind" in Javanese and Indonesian, and it refers to a very dry south to southwesterly wind that blows into the port of Cirebon on the north coast of Java. But this got mangled on various websites listing winds of the world (e.g., here, here, and here), and kumbang was changed to kubang. What does kubang mean in Indonesian? "Mudhole, mud puddle, quagmire." Probably not the image Maserati was going for!

Books like de Boinod's that gather linguistic tidbits from all over should obviously be taken with a truckload of salt. Language Hat has already noted an utterly spurious item circulated by Christopher J. Moore in his book In Other Words. William Safire, in an On Language column praising Moore's book, cites the example of razbliuto, supposedly meaning "a feeling a person has for someone he or she once loved but no longer feels the same way about." But there's no such word! Turns out Moore got this from Rheingold's They Have a Word for It, who in turn got it from J. Bryan III's 1986 book Hodgepodge. Language Hat eventually tracked down the source of the razbliuto myth: an episode of the 1960s TV show The Man from U.N.C.L.E., incredibly enough. (To his credit, Safire later corrected himself about razbliuto after receiving "a dozen letters from others who insist that the word does not exist.")

The multitudinous errors in such books should not be surprising; as Mark Liberman has reminded us, when a factoid about language is attractive enough, "the linguistic truth of the matter is beside the point." And these books clearly rely on the sort of naive Whorfianism that informs many of the language factoids catalogued here in the past (Inuit languages have many words for snow, Moken has no words for the passage of time, etc.). De Boinod proffers the well-worn "Whorf Lite" argument in the BBC article:

He is also convinced that a country's dictionary says more about a culture than a guide book. Hawaiians, for instance, have 108 words for sweet potato, 65 for fishing nets - and 47 for banana.

I don't know much about the Hawaiian ethnoclassification of sweet potatoes, fishing nets, or bananas, but my guess is that these "words" are (like Inuit "words" for snow) mostly hyponymic compounds, where a general term is modified by one or more additional morphemes to denote something more specific (e.g., "maple tree," "tricolored cat," etc.). What does it tell us about Hawaiian culture that the language has many such terms for bananas? Not a whole lot, other than that Hawaiians (like other Pacific islanders) are familiar with far more banana varieties than the boring Cavendish that most mainlanders put on their cereal.

It is of course unfair to prejudge de Boinod's book based solely on early press accounts. (One positive sign is that the book apparently includes "a frank discussion of exactly how many 'Eskimo' terms there are for snow.") I'll leave it to experts of other languages to pick apart de Boinod's research, and will just close with another rather inaccurate example that appears in articles about the book in The Telegraph, The Scotsman, and The Daily Record. The articles claim that neko-neko is an Indonesian term for "a person who has a creative idea that only makes things worse." Néko-néko, a Javanese colloquialism borrowed into Indonesian, is not a noun referring to a person but rather a predicative verb or adjective best translated as "doing all sorts of things unnecessarily" or perhaps "sticking one's nose where it doesn't belong." Some linguists might find that an apt description for popular misrepresentations of the world's languages and their lexicons.

[Update: Karen Davis points out that English arguably does have an efficient expression for "the action of laughing so much that one side of your abdomen hurts", namely side-splitting, which weighs in at three syllables and ten phonemes, compared to six syllables and twelve phonemes for Japanese katahara itai (which appears to mean, compositionally, "one side painful"). ]

[Update #2: Bruno van Wayenburg writes:

I'm a regular Lang Log reader from the Netherlands, who read Benjamin Zimmer's log and some reviews about De Boinod's book. I was sort of amused to read about the supposed Dutch word for stone skimming, 'plimpplampplettere', in De Boinod's book.

As a native speaker I have never encountered the word, and neither have some language-keen friends, either for skimming stones or anything else (we say, rather prosaically, 'keilen' or 'stenen keilen'). Neither is it in the authoritative Van Dale dictionary, and a Google search for it turns up 12 hits, all in English and about De Boinod's book (even when you search for Dutch-language pages only). It doesn't even have the verb ending -en.

To me, it sounds like a jocular in-crowd word invented at a session of stone skimming. 'Kletteren', pronounced 'klettere', means 'falling violently or noisily' (this one should be in the book, especially as it means 'to climb' in German). 'Pim pam pet' is a popular children's game, and the term may convey a sense of repetition, while 'plons' is the sound of something falling into water. So there you have some associations and onomatopoeia which could explain this concoction. But surely very few Dutch would know what you are talking about if you were to use plimpplampplettere out of the blue.

After having turned in this false entry, I can however report that 'uitwaaien' is a genuine Dutch word, regularly used as well as executed in the meaning that De Boinod gives, and it's quite wonderful that it has the power to amaze foreigners.

It would be interesting to keep score, and see what proportion of De Boinod's entries are bogus. About half, judging from the returns so far.]

[Update #3: Matt T. at No-sword has blogged about the Japanese examples:

Another day, another article about those wacky words that other languages have!
While English speakers have to describe the action of laughing so much that one side of your abdomen hurts (hardly an economical phrase), the Japanese have the much more efficient expression: katahara itai.
The vast majority of the "efficiency" there is packed into the word katahara (片腹), meaning "one side of your abdomen", although really "belly" would be more natural than abdomen, but in any case, is this really more efficient than "side-splitting"? I mean, the phrases are directly comparable in terms of both literal meaning and subsequent hyperbolic devaluation, and I count three syllables in "side-splitting", and at least twice as many in katahara itai.
(Incidentally, this phrase is probably a corruption of katawara itai (傍ら痛い), "beside-pain", which is applied to a person or circumstance so shameful or pitiful that it hurts to be near him, her or it. Which isn't really relevant to how it's used today, but is kind of interesting.)
Moving on...
The Japanese have bakku-shan - a girl who appears pretty from behind but not from the front.
True, but when you consider that this word (arguably pair of words) is simply a combination of English back and German schoen, "beautiful", it's not very good evidence for the idea that English isn't kooky enough.
I suppose you could argue that English speakers lack the creativity to put their words together in kooky ways like that, but come on -- even my relatively sheltered life has allowed me to hear several remarkably creative, although often quite unkind, 100% English expressions for people who are attractive from behind but not before. If there's one thing English doesn't lack, it's insults.

Hmm. Maybe Tingo's bogosity factor is higher than 50%.]

[Update #4: Tom Rossen observes that English actually has a one-syllable word for something that's so funny that laughing at it produces a one-sided pain in the abdomen, as in this quote:

(link) As for Tom Hewitt, his Tristan Tzara exuberance should be bottled for commercial consumption. (Hewitt speaks what's supposed to be French with a Romanian accent, and he's a stitch....)

where a stitch is an allusion to the expression "I laughed so hard, I got a stitch in my side." Indeed. ]

[Lots of comments at metafilter, including several more debunkings. More yet on this message board, including the observation that Russian koshatnik doesn't really mean "seller of dead cats": its most straightforward translation would be "cat person" or "cat fancier", as in this dictionary entry. ]

Posted by Mark Liberman at 12:10 AM

September 27, 2005

Multilingual Google

I've mostly been using the Catalan version of Firefox as my browser until recently, but when I decided to install a new version a few days ago I decided to try another language, so I'm now using the Japanese version. Which version you use is independent of the language preferences you set, though, so I'm experiencing a bit of linguistic dissonance with the menu labels and status bar messages in Japanese but most other things, including web pages if the choice is available, still in Catalan.

One site that is available in many languages is Google. With my preferences set the way they are, my Google interface is in Catalan. Here's what it looks like:

The Catalan Google interface in the Japanese localization of Firefox

If you click on Eines d' idioma ("language tools"), you get to a page that lets you choose what languages you want to search pages in and what language you want for the interface.

The list of Google interface languages in Catalan

The list is pretty impressive - there are 116 entries. There are a few that aren't real natural languages: Elmer Fudd, Hacker, and Pig Latin, and one that isn't exactly a human language: Klingon, as well as two artificial languages, Esperanto and Interlingua, but that still leaves over 100 natural human languages, some of them not so well known, such as Kazakh and Tongan. There is even one Native American language, Guarani, the language spoken by most Paraguayans. On the Catalan page for some reason Guarani is called Tupi-Guarani, which is the name of the language family to which it belongs. I don't think I've ever read anything in Catalan about Native American languages so I can't say for sure that Guarani isn't called Tupi-Guarani in Catalan, but I doubt it. The English, Spanish, and Kazakh pages just call it Guarani. This looks like a mistake to me.

Using Google in another language is a fun way to try out a language you don't know real well. It isn't all that complicated, and if you get stuck it's easy to switch to a language you do know well.

I do have one small complaint (beyond the fact that they don't yet have all of my favorite languages), which is that they are evidently sorting the list of languages the same way no matter what language they are in, in the order of the Unicode codepoints. This yields unexpected results.

For example, on the Catalan list Arabic comes last, after Zulu, because the Catalan word for Arabic is Àrab and the À, whose Unicode codepoint is 0x00C0, follows all of the ASCII letters. Z is 0x005A. If Google really wanted to do things right, they would sort the names using the appropriate collating rules for each language.
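The effect is easy to reproduce. Here is a minimal sketch in Python, assuming a small made-up sample of Catalan language names (only Àrab and Zulu come from the list discussed above; Anglès and Basc are added for illustration). The second sort is not real Catalan collation: it merely strips accents before comparing, as a crude stand-in for what a proper collation library would do.

```python
import unicodedata

# Hypothetical sample of names; not the actual Google list.
names = ["Zulu", "Àrab", "Anglès", "Basc"]

# Plain sorted() compares strings code point by code point,
# so "À" (U+00C0) lands after "Z" (U+005A).
by_codepoint = sorted(names)


def accent_insensitive_key(name):
    """Crude stand-in for a collation key: decompose, drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original name is kept as a tie-breaker for otherwise equal keys.
    return (base.casefold(), name)


by_base_letter = sorted(names, key=accent_insensitive_key)

print(by_codepoint)
print(by_base_letter)
```

With the codepoint sort, Àrab falls to the end of the list; with the accent-stripping key it sorts among the A's. A real implementation would presumably use language-specific collation rules (for instance via the standard library's locale.strxfrm or an ICU binding) rather than this approximation.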

Posted by Bill Poser at 11:20 PM

Today's wretched hyphenation


Today's really bad automated hyphenation (in a respectable source) appears in John Tierney's op-ed piece "Human Beings 2.0", in the New York Times, p. A27:

    This is part of what Joel Garreau
calls the Hell scenario in "Radical
Evolution," his book analyzing the
new forms of life -- including "tran-
shumans" and "posthumans" -- com-

My picture of shumans is that they're kind of shmoo-like, but not as cute.

[Update, 9/28/05: language hat has written to say that his image of "shuman" "was defined for all time by R. Crumb", in his character Shuman the Human. There are, of course, many people actually named Shuman. Then there's the tran-shuman, which I'm starting to think of as a shuman in mid-gender.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:24 PM

Submissions to AAAI weblog symposium due in 10 days

The American Association for Artificial Intelligence (AAAI) is holding its 2006 Spring Symposium Series at Stanford, March 27-29. One of the eight scheduled symposia is Computational Approaches to Analyzing Weblogs, and submissions are due in ten days, by Friday, October 7.

Here's the start of the call for papers:

Weblogs are web pages which provide unedited, highly opinionated personal commentary. Often a weblog (also referred to as blog) is a chronological sequence of entries which include hyperlinks to other resources. A blog is conveniently maintained and published with authoring tools.

The blogosphere as a whole can be exploited for outreach opinion formation, maintaining online communities, supporting knowledge management within large global collaborative environments, monitoring reactions to public events and is seen as the upcoming alternative to the mass media.

Semantic analysis of blogs represents the next challenge in the quest for understanding natural language. Their light content, fragmented topic structure, inconsistent grammar, and vulnerability to spam makes blog analysis extremely challenging when faced with questions like: can the implicit and explicit communities implied by content and link structure be used to determine the relevance and influence of bloggers? Can a blog segment be identified as a summary of a linked story in order to use both as training data for summarization research? Can we determine how information percolates through mass media outlets and blogs? Can blogs with multimedia content be stored in a way that allows us to search across different modalities? Can we find consumer complaints, discover vulnerabilities of products, and predict trends?

If you're interested, read the whole thing. Full disclosure: I'm a member of the organizing committee.

Posted by Mark Liberman at 12:44 PM

Unicode Maps

Our colleague Steve at Language Hat has a post recommending BabelMap, a tool for finding and inserting Unicode characters. It looks like a good tool, but it is for Microsoft Windows, which I don't use. For those of us in the Free (Software) World, a comparable tool is gucharmap.

Here is what it looks like:

The gucharmap program

If this sort of thing is of interest to you, I maintain a page with information about computational resources for linguistics, with emphasis on free software that runs on Unix systems, here.

Times Have Changed

In French there are two singular forms of "you", tu and vous, and two associated verb forms, one used with intimates, the other with people more distant. It used to be that the rules for using these were reasonably clear. You used tu toward people whom you knew well and toward whom it was not necessary to show respect or deference, vous toward everyone else. There were a few special cases: anyone could use tu toward children or animals, and common soldiers addressed each other as tu, as did members of the Communist party. If two adults hit it off, they might decide to switch to tu after a fairly short time, but they would always start out with vous.

I've noticed that even in my lifetime the use of tu has become much freer. Even though I am not a formal person, it still feels odd to me to be addressed as tu by someone I have just met. But things have changed even more than I had realized. My mother recently gave me her copy of Savoir écrire toutes les lettres, a guide to letter-writing published in 1955. Among other things, it contains the appropriate salutations and closings for dignitaries of various sorts, some of which are amusing in their pretentiousness. Here, for example, is how you are supposed to close a letter to the Pope:

Prosterné aux pieds de votre Sainteté
et implorant la faveur
de Sa bénédiction apostolique,
J'ai l'honneur d'être,
Très Saint Père,
Avec la plus profonde vénération,
De Votre Sainteté,
Le très humble et très obéissant serviteur et fils


Prostrate at the feet of Your Holiness
and begging the favor
of His apostolic blessing,
I have the honor to be,
Very Holy Father,
With the most profound veneration,
The very humble and very obedient servant and son
Of Your Holiness.

Whew!

[To clarify the meaning, I've swapped the last and penultimate lines in the translation.]

Much of the book is taken up with sample letters. Some of them deal with such odd situations that one is tempted to suspect a bit of tongue-in-cheek on the part of the author, for example the letter of condolence "To a woman who has killed her husband in an automobile accident", but I fear that the author was in fact entirely serious.

One section is devoted to proposals of marriage. I have to admit that I rather like their extravagant language, but what I find truly stunning is the fact that every single one addresses its recipient as vous! I find it difficult to imagine proposing marriage to someone with whom one is not on sufficiently intimate terms to use tu.

[Addendum 2005/09/28: Several readers have commented that even for 1955 this is a rather archaic usage, one that reflects a conservative and aristocratic orientation, which is of course what you tend to find in etiquette guides. Indeed, proposing marriage by letter was in and of itself rather old-fashioned.]

[On the other hand, there has been a real change over not too long a period. I have frequently had the experience of being addressed as tu by people to whom I have just been introduced, in circumstances where this could not be explained by solidarity. (For example, when I was an undergraduate, I remember being addressed as tu by others of the same age whom I did not know when browsing in bookstores and so forth. That did not strike me as odd.) People of my parents' generation with whom I have discussed this say that that was not normal for them and that they would find it uncomfortable.]

Posted by Bill Poser at 01:34 AM

September 26, 2005

Wikipedia on Simpsons words

The Wikipedia has a listing of 63 "memes, words and phrases" derived from the Simpsons. Some specific words also have their own pages -- at least D'oh does. Inexplicably, several Simpsons snowclones lack entries, including "I, for one, welcome our new * overlords" and "mmm, *". [via HeiDeas]

Posted by Mark Liberman at 04:32 AM

September 25, 2005

An apology

I guess I owe Glenn Wilson an apology.

Last April and May, there was a media avalanche about how email, texting and other communications technologies "are a greater threat to IQ and concentration than taking cannabis" (according to the Guardian), "lower the IQ more than twice as much as smoking marijuana" (according to the London Times), "reduce productivity and leave people feeling tired and lethargic" (according to CNN), have effects "similar to the impact of missing an entire night’s sleep" (Red Herring), "temporarily knocks 10 points off a users' intelligence, compared to four for a joint" (the Mirror). The associated headlines were things like Why texting harms your IQ; E-mails 'hurt IQ more than pot'; Distractions at work 'lower the IQ of staff'; and TXTING MKS U STPID: It lowers your IQ more than smoking cannabis. This research has become part of the public's conventional wisdom about the deleterious effects of modern life, as suggested by browsing the 615,000 Google hits for {"email IQ pot"} and similar probes.

The scientific authority most often quoted was Dr. Glenn Wilson, a psychologist at King's College London. I was critical of the media coverage, which was long on hype and short on detail, and my attempts to understand and evaluate the research itself were frustrated by the fact that no description of the experiments' design and results had been published, and no publication was planned, because the work was privately commissioned by HP. So I wrote a series of increasingly skeptical and increasingly jocular posts: Quit email, get smarter? (4/23/2005), A tale of two media (4/30/2005), Never mind (5/03/2005), News flash: the effect of politics, athletics and sex on IQ (5/03/2005). In doing so, I violated my own principle: when reasoning about some piece of reportage that doesn't make sense, it's a good rule of thumb to blame the journalist -- or the journalistic process, including the editor(s) and the headline writer -- before blaming the scientist. The fact that Wilson was misidentified as a psychiatrist in many of the early stories (presumably because he works at an institute of psychiatry) should have been a clue, if one were needed, that most of the rest of the stories' content was bogus as well.

A few days ago, a friend of Wilson's stumbled on these posts of mine, and sent a note in defense of his friend. Since his defense didn't address my questions about the research -- basically "what was the experimental design?" -- I responded with a brief account of what those questions were, and why I thought they were relevant. He responded with a copy to Wilson, who in turn wrote:

This "infomania study" has been the bane of my life. I was hired by H-P for one day to advise on a PR project and had no anticipation of the extent to which it (and my responsibility for it) would get over-hyped in the media.

There were two parts to their "research" (1) a Gallup-type survey of around 1000 people who admitted mis-using their technology in various ways (e.g. answering e-mails and phone calls while in meetings with other people), and (2) a small in-house experiment with 8 subjects (within-S design) showing that their problem solving ability (on matrices type problems) was seriously impaired by incoming e-mails (flashing on their computer screen) and their own mobile phone ringing intermittently (both of which they were instructed to ignore) by comparison with a quiet control condition. This, as you say, is a temporary distraction effect - not a permanent loss of IQ. The equivalences with smoking pot and losing sleep were made by others, against my counsel, and 8 Ss somehow became "80 clinical trials".

Since then, I've been asked these same questions about 20 times per day and it is driving me bonkers.

Since I sent a couple of emails to Dr. Wilson back in April, which I'm sure were lost in the media feeding frenzy, I hereby apologize for my small contribution to the barrage of questions. I also apologize for tarring Wilson to some extent with responsibility for the excesses of the media. I'll also link to this from the earlier posts, including the one that shows up on the first page of Google returns for queries like {email IQ pot}.

There's a bigger problem here: rotten science journalism. For a rational catalog of horrible examples, read Ben Goldacre's recent article "Don't dumb me down" from his Bad Science column in the Guardian ("Ben Goldacre on why writing Bad Science has increased his suspicion of the media by, ooh, a lot of per cents"). I don't think that we can do much about media sensationalism or the scientific ignorance of many journalists. On the other hand, there's no reason why better information about science and technology should not also be available to the public.

Several of the themes that we bark incessantly about here on Language Log come together here, especially Open Access scientific publishing, which allows the public to read the original work, and encouraging more science writing in informal public forums, including weblogs, because "if there were an order of magnitude more science writing in blogs, there'd be less than an order of magnitude more crap, and more than an order of magnitude more good stuff".

But there's one thing that we don't bark about enough. When a piece of scientific research comes to the attention of the media, those who know it best should make available a simple account of what the research is and what it means (or doesn't mean). If misinterpretations become rampant -- which is just another way of saying, if there's widespread media interest -- then it's in everyone's interest for the authors to address the misrepresentations directly. This clarifies things for the more sensible fractions of the public and the media. And it should also help reduce the "bonkers" factor, since even reporters often use web search before they start making phone calls and sending emails, and if they don't, you can still send them off to read your "what my research is and is not" page instead of repeating the same explanations again and again.

In this case, Dr. Wilson's page at King's College London Institute of Psychiatry ("Last updated by Dr Glenn D Wilson on Monday, October 04, 2004 2:00:58 PM") could have a link to a paragraph (or a longer essay) setting the record straight, as he's done in the email quoted above. If that solution is viewed as inappropriate for some reason, there are many other low-impact ways to make such an explanation available to the public.

This page is one of them. So please accept my apologies, Glenn -- and here's hoping that the Guardian, The Times, CNN, the newswires, and so on will chime in too.

Posted by Mark Liberman at 12:46 PM

September 24, 2005

What, me violate copyright?

Not so fast, those who think I would violate anyone's copyright, let alone that of the excellent magazine The New Yorker. Look again at my post on incessant pointless barking. What did I do that could possibly violate copyright, as Heather Green seems carelessly to assume I and other bloggers are doing? (She doesn't name me or Language Log, but she comments about "the copyright violation that's going on" as bloggers comment on the cartoon.) I put in a link to the cartoon in its original location among the assets of The New Yorker's cartoon library cartoonbank.com, where I found it by a simple use of their search engine. I made no copy. And I made no commercial use of the image (that is their business: you can get the cartoon sent to you on a T-shirt if you want). When you look at the relevant part of my post you're looking at the cartoon itself in its publicly accessible location, as the copyright stamp indicates.

All I did was to put <IMG src=...> in the HTML code that I wrote, followed by the URL at which you, like anyone else in the world, can look at the cartoon whenever you wish. If the owners of www.cartoonbank.com want to shut off access to the cartoon in question, which is up to them, the image will just disappear from my post the moment they change the location or alter the permissions on the file, and you'll see the broken-picture icon instead. I think people like Heather Green are a little bit confused about copyright law. It's no violation of copyright for me to point your browser to their freely available cartoon.

Now, I'm told that it's considered a violation of netiquette to link to someone's image without telling them: first, they don't know that their image is an integral part of the appearance of your page and they might not want it to be, and second, each time someone loads your page they snatch a tiny little bit of bandwidth from the server where the image is located. But this is an entirely different issue, one of small courtesies. And while it might be relevant for a file of elaborate graphics owned by a private individual paying for a home server, it's surely not relevant for images owned by corporations and held on huge servers specifically designed to deal with millions of hits day and night, hits that they want because that's how they get their business.

Now, if I put links to the whole free content of Business Week Online in a frame saying "Geoff Pullum Business News, where you can read Business Week for free!", and included a whole lot of advertising paid for by rival magazines, I can well imagine that I would get a letter from their lawyers asking me to cease doing this because it was unfair for me to in effect take credit for their content. But it wouldn't be a copyright case, I suspect. And certainly I wouldn't expect a suit if I just said you can find Heather Green's damn fine blog at Business Week Online and here is their logo (another image which is, like billions of others, open and accessible to anyone in the universe who has a web browser).

Posted by Geoffrey K. Pullum at 03:29 PM

The New Yorker is unfazed (though ungrammatical)

In reference to Geoff Pullum's posting of Alex Gregory's blogs-as-pointless-incessant-barking cartoon, Heather Green at BusinessWeek's Blogspotting site has posted some correspondence from Bob Mankoff, identified as "cartoon editor of The New Yorker and president of the Cartoon Bank service, which is owned by the New Yorker". Ms. Green tells us that she "was intrigued that though the New Yorker and their pr seem aware of the copyright violation that's going on with this, they appear unfazed". For my part, I was intrigued to find Business Week apparently in pursuit of possible small-time infringements of other people's intellectual property rights.

I imagine that Mr. Mankoff sees cartoon-blogging as an extension of the traditional practice of posting cartoons on refrigerators, bulletin boards and walls, which has always provided valuable free advertising for the cartoonists and their publishers, while enriching the culture at large. If so, this is a mature and sensible response, which I applaud. Another view of his reaction, however, is that blogs and other online media are simply beneath one's notice.

Mr. Mankoff's response does emit a certain odor of condescension, along with some linguistic oddities:

"While many cartoons on blogs have been submitted and rejected by The New Yorker, this one, seemed to us, to perfectly capture the irony inherent in a communications phenomenon that permits so many to say so little about so much. I think it will become a classic, in every way as emblematic of its time, as this other cartoon, also involving dogs, published in 1993 by Peter Steiner was of its."

This being Language Log, rather than Social Psychology Log, I'll focus on the linguistic oddities, starting with the apparent example of anti-FLoP coordination in the first clause:

... many cartoons ... have been submitted and rejected by The New Yorker ...

Incompletely parallel coordinations of this type are usually unambiguous but also ungrammatical. A New Yorker example that I cited earlier also involved leaving out to in the first element of a verbal coordination, resulting in a sentence that apparently can't be parsed at all in its intended reading:

... for that to happen a person would have to have been exposed at great length, or have eaten raw, infected poultry.

You could construe this sentence as conjoining [have been exposed at great length] or [have eaten raw, infected poultry], but I think it's clear from the context that the author meant to convey that a person would have had to have been exposed at great length to infected poultry, or to have eaten infected poultry raw. You can't parse the sentence that way without adding a missing to.

In contrast, Mr. Mankoff's subordinate clause has two valid parses, one that evokes the image of The New Yorker repeatedly rejecting its own submissions -- [[submitted and rejected] by The New Yorker] -- and another one that means what he meant to say, whose structure is [[submitted] and [rejected by The New Yorker]]. However, this second parse strikes me as so awkwardly asymmetrical as to be unlikely, and I wonder whether instead he meant the first structure with the second meaning, a missing to having been lost in the heat of composition.

There's some other evidence of compositional carelessness in Mankoff's note, such as the puzzling commas around "seemed to us":

While many cartoons on blogs have been submitted and rejected by The New Yorker, this one, seemed to us, to perfectly capture the irony inherent in a communications phenomenon that permits so many to say so little about so much.

Presumably this is a blend of "this one, it seemed to us, perfectly captured..." and "this one seemed to us to perfectly capture..."

Mr. Mankoff is not the first person to (mis-)use {"submitted and rejected by"}, for which Google finds 150 hits, e.g.

Subsequent designs were submitted and rejected by the Commission in November 1929.
Once a petition for certification has been submitted and rejected by the Commission, the same signatures may not be submitted in any subsequent petition to certify a new political party.
IF AN APPLICATION IS SUBMITTED AND REJECTED BY COUNCIL ON THREE (3) OCCASIONS ... IT WILL BE REMOVED FROM ANY FURTHER CONSIDERATION ...
Five separate plans were submitted and rejected by New York City agencies over a twenty- year period.

I suspect that in many of these examples, the author intended to conjoin [submitted and rejected], and just lost track of the to. More than twice as many authors (379) used the pattern {"submitted to and rejected by"} to express similar ideas:

Authors must inform the editor if their manuscript has been submitted to, and rejected by, another journal.
... and also regarding the recent manuscript which was submitted to and rejected by the Journal of Clinical Oncology ...
... the charter amendments which were submitted to and rejected by the electors in 1938 ...
... generally such complaints must provide proof that the complaint has first been submitted to and rejected by the relevant local law enforcement agency ...

And I'll bet that a much larger number wisely decided to frame their thoughts in some other way altogether. As evidence, observe that comparable but fully parallel sequences are much commoner: "considered and rejected by" gets 28,300 Google hits, "addressed and delivered to" gets 10,700, "written and submitted by" gets 7,140,000, and so on.

Mankoff's second sentence also has some curious comma-placement choices, an oddly placed by-phrase, and a strange ending:

I think it will become a classic, in every way as emblematic of its time, as this other cartoon, also involving dogs, published in 1993 by Peter Steiner was of its.

The noun phrase "this other cartoon" has three separate and parallel post-modifiers:

  • also involving dogs
  • published in 1993
  • by Peter Steiner

It's hard not to read this sentence as suggesting that the cartoon was published in 1993 by Peter Steiner, although this is obviously not what Mr. Mankoff meant: the cartoon was created by Mr. Steiner, but published by The New Yorker. Stacking up postmodifiers like this is usually a bad idea, especially when adjacent pairs have as strong an affinity for one another as "published in 1993 by Peter Steiner".

And last but by no means least, we come to the ending "was of its". After clearing out the textual undergrowth, we have:

It [is] ... as emblematic of its time as this other cartoon ... was of its.

My attempts to find other phrases of this type on the web turned up almost exclusively a set of SEO blackhat pseudo-text sites. (Compare the results of this search, which substitutes "of his" for "of its".)

The reason, I think, is that constructions with its as the object of a preposition, by analogy to phrases with mine, hers, his, and theirs, are simply ungrammatical:

Her legs are longer than mine.
*Looking at the lion, I concluded that my legs are longer than its.

She returned with some relatives of hers.
*The bird returned with some relatives of its.

Bill Gates is as important to his company as Steve Jobs is to his.
*The hyena is as important to its ecosystem as the polar bear is to its.

[As Arnold Zwicky has pointed out to me, the problem seems to arise in other cases where its winds up stressed, so that these are also problematic:

*I wiggled my ears, and then the dog wiggled its.
?My ears are bigger than my cat's, but its are nicer to stroke.

though the second one doesn't seem so bad to me.

Arnold observes that Jorge Hankamer wrote about this in a 1973 paper, "Why there are two than's in English", CLS 9.179-91, and adds

Jorge has the constraint in an especially robust form; he really hates all occurrences of accented "it". For a fair number of other speakers -- I am one -- there's a dispreference rather than a constraint. I find Mankoff's sentence clunky, but not unacceptable.

]

Mankoff's feelings about this are probably like Arnold's, though it's also possible that he expressed a complicated thought a bit carelessly, so that the ungrammaticality of the result was obscured by its complexity.

I guess I need to admit at this point that I've aligned myself -- and not for the first time -- with the cause of rationalist prescriptivism, according to which certain ways of talking or writing are judged to be ill-advised on logical or analytical grounds. But unlike the plural people peeve and many other examples where the appeal to alleged linguistic logic is mistaken and even silly, the questions that I've raised about Mankoff's note seem to me to be justified ones.

For a lucid defense of this point of view in general, see Geoff Pullum's post "'Everything is correct' vs. 'Nothing is relevant'".

[Update: Chris Waigl found more evidence that other writers on the web share Mankoff's willingness to use its as the object of a preposition at the end of a clause:

(link) We don't, for example, inevitably compare every three-panel gag-a-day strip to Garfield, even though it is arguably as influential to its form as Far Side was to its.
(link) It bears the same relationship to its genre as Hamlet does to its, and while it doesn't have the same level of language to sustain it, that wouldn't be appropriate in the medium.

Chris also found several historical examples:

(link) This is in accordance with our policy of not intervening unless the European powers are unable to agree and make request for our assistance. Whenever they are able to agree of their own accord it is especially gratifying to its, and such agreements may be sure of our sympathetic support. (Calvin Coolidge, State of the Union Address, December 8, 1925)
(link) In short, it was he who turned Austria on its axis, and France on its, and brought them to the kissing pitch. (Thomas Carlyle's "History of Friedrich II of Prussia V")
(link) Meanwhile he is the great artisan and laborer by whose aid men are enabled to build a world within a world, or, at least, to smooth down the rough creation which Nature flung to its. (Fire Worship, by Nathaniel Hawthorne)

Chris observes that

Admittedly, the Coolidge cite rather makes your point: I'm quite unsure what the antecedent of "its" is supposed to be.

Yes, I wonder whether there might have been a scribal (or OCR) error in that case: the sentence as presented seems not so much ungrammatical as simply incoherent. I suspect that the "its" should actually have been "us".

Finally, Chris finds some historical examples of stressed its as the object of a verb rather than a preposition.

(link) The water consumed in the three crank engine is 12.93 lb., against 13.0 in the two crank, but the former drives its ship nearly ½ knot per hour faster than the latter does its, and when both ships are driven at the same speed the consumption of coal in the three crank ship is considerably less than in the other. (TRIPLE COMPOUND ENGINES. By Mr. A.E. SEATON. In: Scientific American Supplement, No. 492, June 6, 1885)
(link) It was at this point that her keen attention became fixed on him and never afterwards wavered. If everything had its story, the mistletoe would have its; he must interpret that: and thus he himself unexpectedly had brought about the situation she wished. (Bride of the Mistletoe, by James Lane Allen)

These seem just as problematic to me, but I'm becoming persuaded that this is one of those cases like "such the surprise", where the English language has some corners that not everyone is able to visit.

Apparently reactions to these "accented it" sentences range from mine (they're just as bad as "I gave it to he") through Arnold Zwicky's (they're awkward but basically OK) to some other readers' (they're perfectly fine, there's no problem with them at all). I wonder what the distribution of these judgments is, overall and by age, region and so on. It's amazing how many aspects of the English language remain essentially undocumented.]

Posted by Mark Liberman at 12:06 PM

Shorter yet

In response to Geoff Pullum's post on the Shortest published sentence of the year ("Z."), Michael Kaplan observes that there is a counting number smaller than 1, and attempts to exhibit a sentence containing that number of characters.

[Actually, Jonathan Lundell has also suggested a sentence one letter shorter than my shortest, and it's more convincing. See the modified tail end of my original post. --GKP]
Posted by Mark Liberman at 07:55 AM

September 23, 2005

Football in Navajo, anyone?

A friend of mine forwarded this ESPN.com article about how a college football game (Cal @ NMSU) will be broadcast over the web in Navajo. The broadcast starts tonight at 8pm New Mexico time; check the NMSU website for a prominent link.

I think this is a fantastic idea, and I hope it catches on -- both in Navajo and otherwise. But something about the ESPN.com article still bugged me. It's another one for the "they don't have a word for x" file.

The article ends like this (emphasis added):

The challenge for Frank and Pahe [the two Navajo speakers who will do the broadcast] is that Navajo is very different from English. For example, there's no word for first down.

"It takes nearly twice as long to say something in Navajo as it does English," Frank said. "I've just got to concentrate on the basics."

My guess is that the bolded bit just means that there isn't (yet!) a commonly-established set of football terms in Navajo, not that it's somehow impossible to express "first down" with a single word in Navajo. (Note also the strangeness of this example; "first down" isn't one word in English, either.) The "nearly twice as long" comment by Frank probably more specifically refers to the likely fact that, rather than borrowing terms directly from English, these broadcasters have chosen to translate the terms as longer descriptive phrases (for example, "first down" might be translated into something more like "the team retains possession of the ball for another four attempts").

In any case, I'm sure that if broadcasts like this continue and enough Navajo-speaking sports fans begin to talk in an organized way about football in Navajo, shorter terms will be born: they'll either be borrowed, co-opted from existing words (like "down" was in English), or outright invented. All it will take is for some people to agree on some terms. And this may already be happening, for all we know: according to the article, "[s]everal stations already broadcast high school games in Navajo, but Friday will mark the first time an NMSU football game has been broadcast in the language."

Posted by Eric Bakovic at 12:27 PM

Who is this exalted parrot?

[Guest post by Benjamin Zimmer]

Geoff Pullum and Mark Liberman have bemoaned the pernicious Strunkism averring that the plural of person should only be persons and never people. Though Kenneth Wilson says that "there is a fair-sized history of complaint about the use of people as a plural with specific numbers," I do not believe that this history extends more than a few decades previous to the 1918 publication of The Elements of Style, at least based on currently digitized historical materials. Checking the American Periodicals Series database on ProQuest, I see no mention of this usage distinction before the 1890s.

The earliest prescriptivist note I've found so far comes from The Chautauquan, a Methodist magazine based in Meadville, Pa.:

It is better to say many persons think so than to say many people. [The Chautauquan, Apr. 1891, p. 109]

This piece of advice appears in a section on "The Queen's English," alongside similarly prim exhortations:

If I am not mistaken, you gave me the wrong change; say If I mistake not.

I hate such weather. Never use such an intense word as hate to express dislike.

Six years later, the people vs. persons usage issue became a lively point of contention in the pages of the New York-based literary journal The Critic, beginning with the gripe of an unnamed correspondent:

I am reminded ... that there is one word which is misused by every journalist and every author wherever the English language is written — the word 'people.' Mr. Howells, for instance, in one of his delightful novels speaks of 'three people' sitting in a room. Now, if two of these 'people' were to withdraw, one 'people' would be left — and very much left! It seems unnecessary to state — and yet it is necessary to state it — that 'people' is a collective noun, and can properly be applied only to a nation, a tribe, a class, a community. It is quite admissible to say, 'How are your people?' — meaning your family, your clan; but such a phrase as 'Fifty people were injured,' or 'A hundred people were present,' is sloppy English. 'Persons' and 'people' are not convertible terms. For twenty-five years or more, I have kept my eye on this little word 'people,' and I have yet to find a single American or English author who does not misuse it. [The Critic, Jan. 16, 1897, p. 43]

It's striking that the correspondent used precisely the same "subtractive" argument that Strunk would make two decades later in The Elements of Style. (Strunk wrote: "If of 'six people' five went away, how many 'people' would be left?") In fact, when I first read this passage, I thought it might have been penned by the young Strunk himself warming up his prescriptivist act, until I realized that he would have been 28 years old at the time — a bit too young to have observed "twenty-five years or more" of literary usage!

How might one answer this mind-bogglingly expansive claim that every single American and British author is guilty of "sloppy usage"? In a later issue of The Critic, William Henry Bishop of Yale takes up the call, albeit wearily:

Prof. W. H. Bishop of Yale writes to me: — "I must say that the remarks your correspondent writes you about 'people' are the kind of thing that make me very tired. Since when has English become so logical that you must refrain from saying 'three people' because, then, you might have to say 'one people.' You are not obliged to do anything of the kind, and never will be, unless all good writers agree upon it, and then — for that is the way language is made — it will be proper to do so. Who is this exalted parrot, who has not yet discovered that English is a mass of illogicalities, accepted by convention? And so is every other language as well. The different idioms are not obliged to square among themselves; they are so because they have been adopted, because they are so." [The Critic, Feb. 6, 1897, p. 98]

Bishop, an author of some renown, swiftly dispatches the "exalted parrot" who would seek to deny the conventionality and relative arbitrariness of any linguistic system (to invoke Saussurean semiotics). But arguments both pro and con continued in The Critic for some months afterwards:

The fact remains that if it is correct to say 'two people' it is correct to say 'one people,' meaning one person, or one individual. [The Critic, Feb. 13, 1897, p. 115]

It is usage, and only usage, that makes these things right or wrong; and, as I have said, usage has for centuries justified the use of 'people' as a virtual plural, with no singular, in this sense of 'persons.' [The Critic, Mar. 6, 1897, p. 166]

All attempts to regulate the growth of language on the basis of logical consistency are of course futile. When we are told that if we say "two people" we must also say "one people," we simply smile at the "must" and pass on. There is no necessary offense to the ear in the use of "people" as a synonym for "persons" when the word is preceded by any numeral higher than one; and so usage has conquered. [The Critic, Jan. 29, 1898, p. 69]

Was William Strunk, Jr., then a young English instructor at Cornell, paying attention to this debate in The Critic? Perhaps he was, but his imperious tone in The Elements of Style suggests that he would not have been too concerned with being dismissed as an "exalted parrot" by the William Henry Bishops of the world.

[Note added by Mark Liberman: Ben has collected a spectacular specimen of the rationalist strain of prescriptivism, according to which it's sensible to assert that "every journalist and every author wherever the English language is written" has been using an incorrect plural form. The concept of linguistic original sin is discussed in an earlier Language Log post, "The Theology of Phonology", and placed in a larger taxonomy of prescriptivist arguments in another post, "A Field Guide to Prescriptivists". ]

[Update: Richard Hershberger writes

I always knew my secret (formerly) vice of collecting old usage manuals would pay off!

In Words; Their Use and Abuse by William Mathews, published 1877, p. 361, in the chapter "Common Improprieties of Speech", is:

"People for persons. 'Many people think so.' Better, persons; people means a body of persons regarded collectively, a nation."

The complaint does seem to have started slowly. The usually reliable Richard Grant White didn't mention it. Alfred Ayres had a different, and even more bizarre, complaint about the word:

"People. This word is much used when some one of the words community, commonwealth, nation, public, or country would seem better to express the thought intended. People, as the word is often used, not infrequently conveys the impression that a class is meant--a class that includes all, perhaps, but the very rich and the higher officials. Now as there are, strictly, no classes in the United States, as all are equal in the eyes of our institutions, as every citizen is the peer of every other citizen, save in eligibility to the presidency, the impression conveyed by the word people is often erroneous. For example, instead of, 'The Senate must take action and obey the will of the people,' would it not better express what is intended were we to say, 'the will of the nation, or of the country'?"

(Alfred Ayres, The Verbalist 1881, 1896, 1909, 1919, p. 204.)

At least Mathews doesn't suggest that there is something particularly pernicious in the use of people with a specific "word of number" such as six, rather than a plural quantifier like many or several. I (this is Mark Liberman now) wonder: could this bizarre condition have originated in a misunderstanding of Strunk's little regulation, in which he might have meant "words of number" simply to mean "quantifiers"?]

[Update 9/24/2005: Arnold Zwicky wrote in yesterday to point out that the (excellent) Merriam-Webster Dictionary of English Usage has an enlightening discussion of the history of the people peeve:

The MWDEU entry cites Alford 1866 (A Plea for the Queen's English) mentioning "a correspondent who wrote in to object to the expression several people; he said it ought always to be several persons. Alford was lukewarm to the proposal." MWDEU then cites William Mathews (listed as being published in 1876), which Richard Hershberger supplied [above].

People then "went on the 'Don't List' of the New York Herald and thereafter became a staple of journalistic usage writers" (Bierce 1909, Hyde 1926, Bernstein... and up to Safire and Kilpatrick). MWDEU mentions "what must have been a raging dispute on the subject in the pages of the Washington (D.C.) Times in 1915 and 1916." [Strunk is likely to have read these exchanges.]

The rest of the article is equally entertaining. The objections apparently began with several and many, then turned to numbers; some people allowed people with round numbers but disallowed it with specific numbers.

Yesterday I sent Mark some correspondence from earlier in the year in which I reported having suffered under the AP's proscription of "people" with quantifiers -- a proscription not revised in the AP style manual until "around 1980" (MWDEU).

MWDEU cites discussions by Poutsma (1904-06), citing Dickens and Punch, and by Jespersen (1909-49), going back to Chaucer and taking in Defoe, Dickens, Disraeli, and others.

]

Posted by Mark Liberman at 12:08 AM

September 22, 2005

Counting people

Geoff Pullum's exegesis of yesterday's cartoon referred to one of William Strunk Jr.'s irrational little prejudices:

The word people is not to be used with words of number, in place of persons. If of "six people" five went away, how many "people" would be left? [The Elements of Style, 1918]

That might be the most illogical argument I've ever encountered. Since Prof. Strunk died before I was born, it's only in imagination that I can answer: "One, you pompous old crank. Another glass of sherry?"

E.B. White's fever-dream description of William Strunk Jr., from the introduction to the 1979 edition of their parvum opus, suggests that some form of sedation was in order:

From every line there peers out at me the puckish face of my professor, his short hair parted neatly in the middle and combed down over his forehead, his eyes blinking incessantly behind steel-rimmed spectacles as though he had just emerged into strong light, his lips nibbling each other like nervous horses, his smile shuttling to and fro under a carefully edged mustache.

"Omit needless words!" cries the author on page 23, and into that imperative Will Strunk really put his heart and soul. In the days when I was sitting in his class, he omitted so many needless words, and omitted them so forcibly and with such eagerness and obvious relish, that he often seemed in the position of having shortchanged himself — a man left with nothing more to say yet with time to fill, a radio prophet who had out-distanced the clock. Will Strunk got out of this predicament by a simple trick: he uttered every sentence three times. When he delivered his oration on brevity to the class, he leaned forward over his desk, grasped his coat lapels in his hands, and, in a husky, conspiratorial voice, said, "Rule Seventeen. Omit needless words! Omit needless words! Omit needless words!"

This is a scene scripted by Lewis Carroll in one of his darker moods. I've given some other samples of Elwyn Brooks White's psychedelic punditry in a post last February on The Wild Flag, a 1946 collection of his editorials on "Federal World Government".

Anyhow, I don't know whether Strunk invented the idea that people can't be a plural count noun, or inherited it from some earlier usage crank. Ken Wilson in the 1993 Columbia Guide to Standard American English (link) talks about "a fair-sized history":

There is a fair-sized history of complaint about the use of people as a plural with specific numbers, and some older conservatives still don’t like the practice. But both seven people and seven persons are Standard, with people getting a good deal more use than persons. Any difference is stylistic; to some people, persons may seem a bit more formal.

This way of putting it suggests that persons is the old-fashioned choice, with people (as a plural count noun) a modern innovation. But The American Heritage Guide to English Usage (link) observes that

Some grammarians have insisted that people is a collective noun that should not be used as a substitute for persons when referring to a specific number of individuals. By this thinking you should say Six persons (not people) were arrested during the protest.

But people has always been used in such contexts, and almost no one bothers with the distinction any more. Persons is still preferred in legal contexts, however, as in Vehicles containing fewer than three persons may not use the left lane during rush hours.

And indeed people was a plural count noun in English for centuries before Will Strunk's birth. I couldn't find any examples where Shakespeare happens to use people with a specific number, but he often uses it as a plural count noun, for example:

King Lear, Act II, Scene 2:

I dare auouch it Sir, what fifty Followers?
Is it not well? What should you need of more?
Yea, or so many? Sith that both charge and danger,
Speake 'gainst so great a number? How in one house
Should many people, vnder two commands
Hold amity? 'Tis hard, almost impossible.

For people with a specific number, we can turn to John Taylor's pre-1630 poem Taylors Water-worke (Or, the scullers travels, from Tyber to Thames):

111      In Henries Raigne and Maries (cruell Queene)
112      Eight thousand people there hath slaughtered beene,
113      Some by the Sword, some Hang'd, some burnt in fire,
114      Some staru'd to death in Prison, all expire.
115      Twelue thousand and seuen hundred more beside,
116      Much persecuting trouble did abide.

Nothing seems to have changed people-wise in the century and a half leading to Richard Brinsley Sheridan's The Critic (1779), Act III:

BEEFEATER. "Perdition catch my soul, but I do love thee."
SNEER. Haven't I heard that line before?
PUFF. No, I fancy not---Where pray?
DANGLE. Yes, I think there is something like it in Othello.
PUFF. Gad! now you put me in mind on't, I believe there is---but that's of no consequence---all that can be said is, that two people happened to hit on the same thought---And Shakespeare made use of it first, that's all.

And of course Jane Austen can be counted on to hit Strunk square in the nose. Here's Emma, chapter 8:

Waiving that point, however, and supposing her to be, as you describe her, only pretty and good-natured, let me tell you, that in the degree she possesses them, they are not trivial recommendations to the world in general, for she is, in fact, a beautiful girl, and must be thought so by ninety-nine people out of an hundred; and till it appears that men are much more philosophic on the subject of beauty than they are generally supposed; till they do fall in love with well-informed minds instead of handsome faces, a girl, with such loveliness as Harriet, has a certainty of being admired and sought after, of having the power of chusing from among many, consequently a claim to be nice.

So, Professor Strunk, if of "100 people" 99 went away, how many "people" would be left?

For an example from a different sort of adventure story, here's Sir Walter Scott's Redgauntlet (1824), chapter 4:

There seemed to be at least five or six people about the cart, some on foot, others on horseback; the former lent assistance whenever it was in danger of upsetting, or sticking fast in the quicksand; the others rode before and acted as guides, often changing the direction of the vehicle as the precarious state of the passage required.

And here's Charles Darwin, Journal of Researches into the Natural History and Geology of the countries visited during the voyage round the world of H.M.S. Beagle, chapter 16:

At Ica forty-two people thus miserably perished.

Mark Twain, Life on the Mississippi, chapter 45:

If you add six ladies to the company, you have added six people who saw so little of the dread realities of the war that they ran out of talk concerning them years ago, and now would soon weary of the war topic if you brought it up.

Agatha Christie, The Mysterious Affair at Styles, chapter 8:

From your account, there are only two people whom we can positively say did not go near the coffee—Mrs. Cavendish, and Mademoiselle Cynthia.

And Emily Post, Etiquette (1922):

There are certain words which have been singled out and misused by the undiscriminating until their value is destroyed. Long ago “elegant” was turned from a word denoting the essence of refinement and beauty, into gaudy trumpery. “Refined” is on the verge. But the pariah of the language is culture! A word rarely used by those who truly possess it, but so constantly misused by those who understand nothing of its meaning, that it is becoming a synonym for vulgarity and imitation. To speak of the proper use of a finger bowl or the ability to introduce two people without a blunder as being “evidence of culture of the highest degree” is precisely as though evidence of highest education were claimed for who ever can do sums in addition, and read words of one syllable.

While we're speaking with Ms. Post, I can't resist pasting in the start of the passage quoted above:

It is difficult to explain why well-bred people avoid certain words and expressions that are admitted by etymology and grammar. So it must be merely stated that they have and undoubtedly always will avoid them. Moreover, this choice of expression is not set forth in any printed guide or book on English, though it is followed in all literature.

To liken Best Society to a fraternity, with the avoidance of certain seemingly unimportant words as the sign of recognition, is not a fantastic simile. People of the fashionable world invariably use certain expressions and instinctively avoid others; therefore when a stranger uses an “avoided” one he proclaims that he “does not belong,” exactly as a pretended Freemason proclaims himself an “outsider” by giving the wrong “grip”—or whatever it is by which Brother Masons recognize one another.

People of position are people of position the world over—and by their speech are most readily known. Appearance on the other hand often passes muster. A “show-girl” may be lovely to look at as she stands in a seemingly unstudied position and in perfect clothes. But let her say “My Gawd!” or “Wouldn’t that jar you!” and where is her loveliness then?

Ms. Post, meet Jerry Jeff Walker. You two have a lot to talk about. And remind me later, there's a certain linguist who is eager to ask you about negative evidence...

Finally, the largest count of "people" that I've found in the classics, from the discussion of George Whitefield's public speaking in Benjamin Franklin's Autobiography:

He had a loud and clear voice, and articulated his words and sentences so perfectly, that he might be heard and understood at a great distance, especially as his auditories, however numerous, observ'd the most exact silence. He preach'd one evening from the top of the Court-house steps, which are in the middle of Market-street, and on the west side of Second-street, which crosses it at right angles. Both streets were fill'd with his hearers to a considerable distance. Being among the hindmost in Market-street, I had the curiosity to learn how far he could be heard, by retiring backwards down the street towards the river; and I found his voice distinct till I came near Front-street, when some noise in that street obscur'd it. Imagining then a semi-circle, of which my distance should be the radius, and that it were fill'd with auditors, to each of whom I allow'd two square feet, I computed that he might well be heard by more than thirty thousand. This reconcil'd me to the newspaper accounts of his having preach'd to twenty-five thousand people in the fields, and to the antient histories of generals haranguing whole armies, of which I had sometimes doubted.

Although Franklin was a scientist with a well-deserved international reputation, this is the only piece of quantitative reasoning that I've read in his works. Finding historical counter-examples to Strunk's people peeve has become boring, and I've still got half a cup of coffee left, so what do you say we walk through this crowd estimate in Ben's virtual footsteps?

OK, let's do it. Courtesy of Google Maps, Sue & Paul Drouin-Degnan's Gmap pedometer, and TinyURL, here's a map of the radius that Ben is talking about.

The distance from the west side of Second St. to the west side of Front St. along Market St. is about 452 feet, which taken as a radius would yield a semicircle of area (pi*452^2)/2 = 320,920 square feet, and allowing 2 square feet each, 160,460 people. This is more than five times greater than Ben's calculation.

To yield by his method an audience of around 30,000 people, you'd need a radius r such that (pi*r^2)/4 = 30,000, yielding r = sqrt(120,000/pi) = 195. Call it 200 ft. But this would only take Ben to Letitia St. or so, as shown on this map, less than half way to Front St. There's no reason to call Ben's geography into question -- I'm sure he knew very well where the courthouse steps were, and where Front St. was. I can see two possible sources for his mistake. One possibility is that he screwed up the arithmetic. An idea I like better is that he measured the distance in terms of paces, and later misremembered the number as counting feet. A stride length of 27 inches -- about what mine is -- would turn 450 feet into 450/(27/12) = 200 paces. QED.
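For anyone who wants to check the arithmetic in the last two paragraphs, here is a minimal sketch in Python. The function names are invented for illustration; the 452-foot radius, the 2 square feet per listener, and the 27-inch stride are simply the figures used above, not independent measurements.

```python
import math

SQ_FT_PER_PERSON = 2.0  # Franklin's allowance per listener


def semicircle_crowd(radius_ft, sq_ft_per_person=SQ_FT_PER_PERSON):
    """Number of people that fit in a semicircle of the given radius."""
    area = math.pi * radius_ft ** 2 / 2  # area of a half disc, in square feet
    return area / sq_ft_per_person


def radius_for_crowd(people, sq_ft_per_person=SQ_FT_PER_PERSON):
    """Radius in feet of the semicircle needed to hold that many people."""
    # Invert (pi * r^2 / 2) / sq_ft_per_person = people for r.
    return math.sqrt(2 * sq_ft_per_person * people / math.pi)


def feet_to_paces(feet, stride_inches=27):
    """Convert a distance in feet to paces of the given stride length."""
    return feet / (stride_inches / 12)


if __name__ == "__main__":
    print(round(semicircle_crowd(452)))     # crowd for the measured radius
    print(round(radius_for_crowd(30_000)))  # radius implied by 30,000 hearers
    print(feet_to_paces(450))               # 450 feet expressed in paces
```

Run as written, this should reproduce the numbers in the text: about 160,460 people for a 452-foot radius, a radius of about 195 feet for 30,000 people, and 200 paces for 450 feet.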

So Ben Franklin made an honest mistake. Will Strunk, on the other hand, invented (or adopted) an illogical justification for a hallucinated principle of usage, and used his position of authority and his charismatic bluster to impose it on generations of impressionable students.

[Update: Jesse Sheidlower pointed out that I should have appealed to the authority of the OED on this point -- see below for details. Normally I would have done this, but this morning some internet gremlins seem to have interfered with my ability to get coherent access to the online edition. Communication has been restored, and I can quote sense 2 of the people entry:

2. In sing. With pl. concord. a. Men or women; men, women, and children; folk.
Freq. with singular modifiers in Middle English.
In ordinary usage, this is treated as the unmarked plural of person, whereas persons emphasizes the plurality and individuality of the referent.

The OED gives citations for people as a plural count noun back to Bevis of Hampton in 1330. I'm not sure what Strunk really meant by "words of number" -- do "most" and "many" count as "words of number"? If we (for no good linguistic reason) limit ourselves to the names of integers, then OED 2's earliest counterexample to Strunk's people peeve seems to be

1989 Which? Jan. 5/2 Four out of five people thought that fresh fruit and vegetables should be labelled.

though a quote from Chaucer offers an elliptical example:

c1385 CHAUCER Knight's Tale 2513 The paleys ful of peple up and doun, Heere thre, ther ten.

Jesse also offers the following sample, from Oxford's massive archives, of bigger classic pre-people integers than Ben Franklin's:

1722 D. DeFoe Jrnl. Plague Year: The town was computed to have in it above a hundred thousand people more than ever it held before.
Ibid.: Suppose them to be a fifth part, and that two hundred and fifty thousand people were left.
1776 A. Smith Wealth of Nations: In a fertile country which had before been much depopulated, where subsistence, consequently, should not be very difficult, and where, notwithstanding, three or four hundred thousand people die of hunger in one year, we may be assured that the funds destined for the maintenance of the labouring poor are fast decaying.
1846 C. Dickens Pictures from Italy: One hundred and fifty thousand people were there at least!
c1861 W. Whitman "Mannahatta" in Leaves of Grass: A million people--manners free and superb--open voices--hospitality--the most courageous and friendly young men.
1898 H. G. Wells War of the Worlds: It was a stampede--a stampede gigantic and terrible--without order and without a goal, six million people unarmed and unprovisioned, driving headlong.
1925 F. S. Fitzgerald Great Gatsby: It never occurred to me that one man could start to play with the faith of fifty million people--with the single-mindedness of a burglar blowing a safe.

You win, Jesse -- but you have to admit that the Gmap stuff was cool! ]

[For more on the history of the people peeve, see here.]

Posted by Mark Liberman at 07:04 AM

September 21, 2005

You da man

And we the people. It's copula deletion (not). Apparently that's the idea behind this BC strip:

I totally didn't get it, so I asked some of my colleagues for help. Arnold Zwicky, who confesses that his college nickname was "Zot" (the sound of the BC anteater eating an ant), suggested the copula-deletion theory.

Geoff Pullum offered another hypothesis... [-- Let me interrupt. Geoff Pullum here. What I said was actually completely nuts and shouldn't have even been mentioned; I should never send emails or operate heavy machinery while under the influence of whatever I was under the influence of. But I am much recovered now, and I do actually have a new hypothesis. If you look in Strunk and White's vile little assemblage of stupid advice about usage in The Elements of Style, you will find (this is how stupid they are) that the word people is actually banned. They say the plural of person should always be persons. So that could be it. The grammar police are going to insist it should have been "We the persons." Could that be it?

Anyway, returning now to what Mark has posted, what I actually said to Mark in the email was meant to be something tongue in cheek about how perhaps the people in the cartoon thought the original had been us the people and it had been changed to we the people by the grammar police. And I pretended to be shocked that Mark did not know about "the well known rule that we must ALWAYS replace us in all contexts." Then I mockingly and possibly drunkenly added the following. —GKP]

What's the matter, don't you know how to use our great and beautiful language with full correctness? Are you one of the vulgar persons who say "Where's the bus station at?" (correct version: "At where is the bus station?")?

I guess that's the correct version in Geoff's native England, and also over at the New Yorker's IT department. Elsewhere here in his adopted U.S. of A., the correct version would be "where's the bus station at, epithet?" One version of this joke can be found here, [and another one was just posted here, as Matthew Hutson pointed out by email], but there are many other epithets available for use in this context.

Seriously, it might well be true that Johnny Hart called out the grammar police for "we the people" because of we/us anxiety. The other day, I asked a class for (grammatical) reactions to the much-discussed sentence

Toni Morrison's genius enables her to create novels that arise from and express the injustices African Americans have endured.

and one of the suggested emendations was to replace that with which. When people learn that some strange and unnatural principle is supposed to govern the choice between two forms -- that/which, me/I, we/us, whatever -- they sometimes conclude that whichever of these word pairs they're inclined to use in a given case, the "correct" choice is probably the other one. (More on which v. that here and here.)

[Another interpolated note, this time by Mark Liberman. On reflection, I'm convinced that Geoff is right: the alleged infraction is "people". This concocted rule is so far out of line with the norms of contemporary usage that its relevance to this case never even occurred to me. And not even Strunk would prefer "We the persons" in this case, where the joke is a hypercorrect application of one of Strunk's little peeves.

In fact, two different Strunkish peeves involve the word people, though neither one applies to this case. His first complaint was that "The people is a political term, not to be confused with the public. From the people comes political support or opposition; from the public comes artistic appreciation or commercial patronage". His other, more unreasonable notion was that "The word people is not to be used with words of number, in place of persons. If of "six people" five went away, how many "people" would be left?" A perfectly well-formed answer to this question is "One". Or perhaps better, "One, epithet." To quote another post by Geoff Pullum, "Don't put up with usage abuse." ]

Meanwhile, I searched the web for a good illustration of Zot in action, and in the process of failing to find any, I learned that a substantial percentage of what the web knows about anteaters appears to consist of bad puns. From the Online Anteater:

An anteater walks into a bar and says that he'd like a drink. "Okay," says the bartender. "How about a beer?" "Noooooooooo," replies the anteater. "Then how about a gin and tonic?" "Noooooooooo." "A martini?" "Noooooooooo." Finally, the bartender gets fed up and says, "Hey, listen buddy, if you don't mind me asking - why the long no's?"

And apparently the Hilo zoo's anteater, Spike, is betrothed to a young lady named Penny Ant-E.

I'll spare you the one whose punchline is "an armadildo".

[Update: Joe Salmons wrote

I was so baffled by the 'we the people' thing that I took an overhead of the strip to class this morning and invited solutions before the hour started. One student quickly suggested that it might be about commas -- 'we, the people of the United States, ...'. In light of all the recent Eats Shoots and Leaves stuff, that strikes me as really plausible -- although it would be pretty obscure for this kind of cartoon. (As another student quipped, 'I think that guy's running out of ideas.')

Yes, I guess it could be commas as well. My money is still on people, but in a way this whole thing is a kind of parable of the unnatural and thus widely misunderstood nature of the "rules" under discussion. ]

[Update #2: David Kidd writes:

I can't find an electronic copy, but there's an old Far Side cartoon that has founding-fathers-types gathered around one of their number, who sits at a desk with an empty piece of paper and a quill pen held to his lips in consideration, and the caption reads: "So then: should that be 'we the people' or 'us the people'?"

]

[Update #3: Lane Greene writes:

My $.02 -- I can only imagine that a cartoonist who's not a voracious consumer of style books was making fun of copula deletion, "we [are] the people", as your first instinct suggested. The alternatives are all implausible in the extreme to the lay person like me and the average "BC" reader.

"Us the people" sounds bizarre to everyone, especially given that most every American knows the fixed phrase from the Constitution "we the people" and would assume Madison got it right, even if unsure about their own judgment.

"We the public" is a Strunk-only style-not-grammar rule that Johnny Hart seems highly unlikely to know, much less to follow slavishly or expect every reader to know.

"We the persons" doesn't even conform to Strunk's second rule, since it doesn't involve a number ("six persons").

"We, the people" is at least remotely plausible. I doubt Johnny Hart has read "Eats, Shoots and Pontificates Annoyingly", but some English teacher might have drilled some rule along these lines into his head. We once had an internal Economist tiff between two editors, one of whom insisted that you must say "Joe Bloggs, of Oxford University, says" and one of whom (the boss, who won by pay grade) said it's OK to say "Joe Bloggs of Oxford University says".

But this last one still seems unlikely since the entire utterance on which the speakers are judged is "We the people". It seems like we are to judge this as a malformed sentence, not a badly punctuated noun phrase.

Far more likely, in a strip that appears in every mainstream broadsheet in the country, is that Hart is having a 10-years-too-late jab at copula-deletion in "Ebonics". Nothing else would make sense for Hart the cartoonist, much less his hugely mainstream audience.

OK, now I'm just as confused as I was when I started, except that instead of having no idea at all what Hart was getting at, I have four candidates: copula deletion, we/us, punctuation, and people/persons. Technically, I guess the hypothesis space is the power set of these alternatives, since the Grammar Police could get on your case for more than one reason. I'd write to Hart and ask him -- but maybe this is one of those things that it's better not to know.]
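
For anyone who wants to see that hypothesis space spelled out, here is a minimal Python sketch. The names `CANDIDATES` and `power_set` are illustrative assumptions, not anything taken from the strip or the correspondence; the only thing borrowed from the post is the list of four candidates.

```python
from itertools import chain, combinations

# The four candidate explanations named in the post.
CANDIDATES = ["copula deletion", "we/us", "punctuation", "people/persons"]

def power_set(items):
    """Return every subset of items as a tuple, from the empty one to the full set."""
    return list(
        chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    )

hypotheses = power_set(CANDIDATES)
print(len(hypotheses))  # 2**4 = 16 subsets, counting the empty one
```

Dropping the empty subset (the case where the Grammar Police have no complaint at all) leaves 15 ways for them to get on your case.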

Posted by Mark Liberman at 06:10 AM

September 20, 2005

New Yorker search engine stark staring mad

Search The New Yorker for some word that doesn't appear in any recent on-line article, I found recently, and you will get the following staggeringly unidiomatic message:

I'm sorry I couldn't find that for which you were looking.

The sad truth is that this probably is not intended as a joke. It would be a rather feeble joke, but at least as a joke it would be less pathetic than it is as an attempt to write ordinary English. I think the programmer who wrote this message was being serious. He or she may even have been instructed (God help us) to write it that way.

Here are the plain linguistic facts: ??I couldn't find that for which you were looking is at the very best highly questionable (I'm almost inclined to call it syntactically ill-formed), a grotesquely clumsy substitute for the perfectly normal I couldn't find what you were looking for. The latter is normal Standard English, acceptable either spoken or written, in either informal or formal style. (Notice that both I'm and couldn't mark the search engine message as being in a relatively informal style, making it even more insane to do the that for which you were looking thing. It really is certifiably stark staring mad.)

It is important that verb-preposition idioms like look for meaning "seek" are not normally broken up, and that which is extremely rare compared to the much more normal what. It is simply astounding that a native speaker could believe otherwise. What on earth could have led the author of the text messages issued by The New Yorker's search engine to put in such a ridiculously tortured sentence?

The backstory has been told in different ways several times before on Language Log. In 1672 an influential essayist called John Dryden published a critical piece called "Defence of the epilogue" which included a catalog of alleged faults in the writing of important recent authors. In that essay he called it "a common fault" to have a "Preposition in the end of the sentence". Notice, uncontroversially, the usage was common in the 17th century. That is because it was fully grammatical then, as it is now, and it already had been for centuries. Dryden even noted that it occurred in his own writings. He had no basis whatever for his objection to it. (Unless you count "Latin doesn't permit this" as an objection. Some people seemed to feel back then that English needed a sort of makeover to turn it into a suitable rival to Latin for serious writing — a whole language community with an inferiority complex. That surely isn't relevant today.)

About a hundred years after Dryden expressed his opinion, Bishop Robert Lowth, in a grammar that became quite important, described the construction with the preposition at the front of the clause as "more graceful as well as more perspicuous", adding that it "agrees much better with the solemn and elevated style" --- though Lowth still made it extremely clear that it is normal in speech and "the familiar style in writing" (the style in which one writes I'm and couldn't rather than I am and could not). Slowly Lowth's view ossified in the writings of other grammarians. By 1800 several famous school textbooks expressed straightforward disapproval of the stranded preposition, and teachers began to teach generations of schoolchildren that it was wrong. In America (though much less in Great Britain) this belief survived from the 19th century into the 20th.

And since nothing progressive really happened in the teaching of English grammar during the 20th century, we now find ourselves, in the 21st century, confronted with educated Americans who seriously think that a search engine should say it is sorry it could not find that for which you were looking. It's staggering. It really is. And The New Yorker apparently encourages this absurd misconception about grammaticality in English. I simply can't imagine what they're thinking of. Or as they would put it, *I simply cannot imagine of what they are thinking.

Posted by Geoffrey K. Pullum at 07:03 PM

Acrostic fame

Here at Language Log Plaza, we celebrate whenever One of Our Own is honored. So we were pleased to hear from an old friend of mine -- my daughter's godmother -- that yesterday's New York Times Magazine double-crostic puzzle was a quotation from Geoff Nunberg's Going Nucular, on retronyms. She added, "I have often mused that having one's work rendered in a NYTimes double-crostic would be the ultimate literary accolade." To Geoff: a specially minted Acrostic Cross (Nucular Grade), with the image of an acoustic guitar on it.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 04:45 PM

P-P-P-Pick up an allusion

In yesterday's (London) Times, Andrew Sullivan has an eloquent essay on "The Politics of Penguins", which features a tantalizing little cultural reference:

Love, it turns out, has very little to do with the mating habits of the Emperor Penguin. According to "The Auk," the scholarly journal of the American Ornithologists' Union, emperor penguins make Liz Taylor look like a lifetime monogamist. Their mate fidelity year to year is 15 percent. Each year, in other words, 85 percent of Emperor penguins get a divorce and p-p-p-pick up a new spouse. Not only that, they're not particularly p-p-p-picky. (Apologies here to American readers. This p-p-p thing is a reference to an old commercial. It's a British thing. Too complicated to explain here).

A Google search for {p-p-p-pick} returns 14,300 results, most equally puzzling to Americans.

Quite a few are news headlines related to penguins, all apparently transparent to British readers:

(link) Females flown in to p-p-p-pick up 'gay' penguins
(link) P-p-p-pick out a penguin

There are lots of other P-P-P-words used without explanation: "PPP Pick up a Pension Credit", a Paintbrush, a picot, a pizza, and so on. There's a movie called The P-P-P-Pick-Up, which according to the Austin gay and lesbian international film festival is "a hilarious short from Germany involving love between a woman and a penguin at a swimming pool".

There are some ads that seem unlikely to have adequate cultural resonance, like an exhortation to "P P P Pick up a [Fiat] Punto", or a recent anti-litter campaign "P P P Pick up a Penguin and Keep Britain Tidy!"

Finally, there's an article at bakeryandsnacks.com ("Breaking News on Industrial Baking and Snacks") entitled "Pushing Penguins", which explains:

05/06/2003 - McVities, the UK biscuit maker owned by United Biscuits, yesterday launched a major new advertising campaign for its popular Penguin brand of chocolate biscuit, and in particular for the new Penguin Chukka sub-brand. [...]

Penguin Chukka features pieces of chocolate, biscuit and caramel in a flip-top pot, and its launch will be supported by a new TV advert continuing the long-running ‘p.p.p.pick up a Penguin’ campaign.

The ad sees the animated Penguin character getting into trouble with some hairy-faced Scotsmen at a caber-tossing contest in the Highlands. The tagline of the ad is ‘Don't just p.p.p.pick up a Penguin, Chukk one’, and UB claims that it emphasises the way the product can be eaten anytime and anywhere.

And there's a post by Scott Taylor at www.zug.com titled "P-P-P-Please Send Penguins!", which offers a p-p-p-primer from an American point of view:

I recently returned from a trip to England. It’s a wonderful place. But it’s not the castles, the questionable dental work, or the constant drizzle that caught my attention: It was the Penguins.

What’s a Penguin? It is a small candy bar (Or, as those who drive on the wrong side of the road say, a “biscuit”). And it redefines the word scrumptious.

Described on the package as “Milk Chocolate covered Biscuit Bars filled with Chocolate Cream,” the Penguin’s compact size, affordability, and sheer deliciousness make them a revelation. At a store I saw an 18 pack for 99 pence. That’s a bit under two dollars, folks, for 18 lovely Penguins. We have no such deals in our nation.

Scott is especially taken with the fun facts on the wrappers ("a cockroach can live without its head for nine days before it starves to death"; "a rat can last longer without water than a camel"), provides pictures, and thinks that Penguins would conquer America like the Beatles or Harry Potter:

Better than black pudding, a monarchy, or soccer, finally we have a non-musical British import that America will embrace readily. Why is it not here yet? I cannot say. But I do know that Penguins would sweep the nation, loved by all, snuck into movie houses and bought up by the busload.

In full rhetorical flight, he produces a beautiful example of "as such" = "therefore":

It’s almost too good to be true. Jokes, an arctic animal with a speech impediment, and candy, together at last. Perhaps the good folk at McVities don’t know what they have on their hands. As such, I decided to write them a letter.

So a few minutes of Google scholarship over breakfast -- the coffee just finished brewing -- has solved the trans-Atlantic p-p-p-puzzle. It's that "long-running 'p.p.p.pick up a Penguin' campaign". But not every Yank reads Language Log yet, and I'm not as optimistic as Scott about the openness of the American market to new brands of cookies. We're likely to remain two nations separated by a common commercial culture, unless perhaps The Simpsons feature the p-p-p-pick up a penguin slogan in some future episode.

Posted by Mark Liberman at 06:25 AM

September 19, 2005

R!?

Well, it's Type Talk Like a Pirate Day again, time to re-post the Corsair Ergonomic Keyboard, so useful for piratical bloggers:

I mentioned last year that I don't know who deserves the credit for the invention of this keyboard. No one has claimed it yet, so if you know anything about its history, drop me a line.

I'd also like to ask a more linguistic question: where did all the "pirates say arrr" stuff come from, anyhow? Is it some sort of folk-stereotype about pirates coming from r-ful parts of the British Isles? Was it launched by some influential book or movie that featured an especially rhotic buccaneer?

It's not from (the book) Treasure Island, as far as I can tell. The pirate who shows up in the first paragraph is stereotyped enough:

I remember him as if it were yesterday, as he came plodding to the inn door, his sea-chest following behind him in a hand-barrow -- a tall, strong, heavy, nut-brown man, his tarry pigtail falling over the shoulder of his soiled blue coat, his hands ragged and scarred, with black, broken nails, and the sabre cut across one cheek, a dirty, livid white. I remember him looking round the cove and whistling to himself as he did so, and then breaking out in that old sea-song that he sang so often afterwards:

"Fifteen men on the dead man's chest --
Yo-ho-ho, and a bottle of rum!"

in the high, old tottering voice that seemed to have been tuned and broken at the capstan bars. Then he rapped on the door with a bit of stick like a handspike that he carried, and when my father appeared, called roughly for a glass of rum. This, when it was brought to him, he drank slowly, like a connoisseur, lingering on the taste and still looking about him at the cliffs and up at our signboard.

"This is a handy cove," says he at length; "and a pleasant sittyated grog-shop. Much company, mate?"

but to go with the tarry pigtail and the old sea-song, there's nary an extra [r] even hinted at. Nor, on a quick skim, are the rest of the book's pirates notably rhotic. If you know the answer, tell me about it and I'll post it. The best solution to this philological puzzle will win a lifetime subscription to Language Log.

[Not an explanation, but a connection -- the 6th of 30 stories in 30 days (about working in a bookstore) by Andrea Siegel, posted by Andrew Gelman at Statistical Modeling, Causal Inference, and Social Science:

6. A woman comes up to me. "R?" she asks.
I type it in: "R". "R?" I ask helpfully, inviting the next letter. She looks at the screen. "No no no no no. Rrrrrr," she says.
"Rrrrr," I type in.
"No no no no."
I give up. I hand her a pen and piece of paper.
She writes, "Art." She's French.
I point to the Art section.

Now if pirates were really French aesthetes...
Naw. ]

[Here's the real answer! (Or at least the first step.) I thought it had something to do with Treasure Island, and it did. But it was Robert Newton in the 1950 movie version. A short bio of Newton is here. Roger Depledge, the first to send in this information, wins a lifetime subscription to Language Log.

But wait, there are more prizes to be won! The next question: what was Newton's model for the rrrr business? A regional dialect? A music hall stereotype? Or was it sheer alcohol-fueled phonetic invention? ]

[Joe Gordon suggests that perhaps Newton's R was an early entrant in Geoff Pullum's Shortest Sentence of the Year contest. ]

[Geoffrey Nathan emailed:

I've always assumed (and I recall reading it somewhere, when I was a graduate student) that many pirates (such as those of Penzance, for example--those who plough the sea), originated in the Southwest of England, which is, of course, r-ful, and in fact has always seemed to me to be somewhat more hyperarticulatedly r-ry than American r-fulness.

The theory I originally heard was that Maritime Pidgin English (early nineteenth century version) was based on that area. Stereotypical pirate talk (including invariant 'be' for all copulas, for example) was a fossilized survivor of the Pidgin ('That be the white whale, me hearty'). I remember reading (or perhaps hearing) about this when I was exposed to some version of the monogenesis theory of the origins of creoles.

Seems at least plausible, and I like the idea of archeological vestiges of Maritime Pidgin English still around in folk culture.

I think Geoffrey also wins a lifetime subscription to Language Log. And for what it's worth, Robert Newton apparently hailed from Shaftesbury in Dorset. I think that "maritime pidgin English" was more of a 17th and 18th century development, but maybe its r-fulness was well established among buccaneers by the time that Stevenson wrote about. ]

[And here's more on "Maritime Pidgin English" (a term apparently coined by J.L. Dillard) and its role in the history of pirates and of all the rest of us, from an article by Erin Mackie, "Welcome the Outlaw: Pirates, Maroons and Caribbean Countercultures", Cultural Critique 59—Spring 2005.

... as the linguist J. L. Dillard (1975) has shown, the British ship had its own language, one it shared with the coastal regions of England, Africa, the Caribbean, and North America.

The extensive association among sailors and the African peoples they transported, worked with, and were in casual contact with, brought with it specific “exotic” or “savage” forms of transatlantic language. Cultural and linguistic contact took place in and between such cosmopolitan polyglot harbors as London, Bridgeton, Kingston, Charleston, New Orleans, Boston, Philadelphia, and coastal African factory settlements such as Bonny, Whydah, Cabinda, and Goree. A present-day cultural historian describes the linguistic situation in early eighteenth-century Jamaica as one “not of diglossia, triglossia, or even heteroglossia but of panglossia, a state of ‘generalized multilingualism,’” from which Creole emerges as the primary, if despised, language of the island. Communication in the pan-Atlantic world did not take place in standard English (or any European language), but rather in pidgins and Creoles concocted out of European and African languages. J. L. Dillard has documented how “the Maritime Pidgin English, transmitted to West Africans in the slave trade and heavily influenced by West African languages, became the English Creole of the plantations from Nova Scotia to Surinam” (All-American English, 3–76; Black English 73–185). Sailors, slaves, and those populations of whites who owned and were raised by slaves (from whom they learned this language that set them off, to no advantage, from the English) all spoke this maritime English. Sailors, naturally, were among the largest groups of speakers.

Pirates’ language is distinguishable from that of the generality of sailors mostly by its blasphemy and its self-naming practices that, like the skeletons and hourglasses on the Jolly Rogers, stressed the irreverent, oppositional, radically autonomous, do-or-die ethic that controlled pirate identity. Such an ethic is apparent in the names pirates gave their ships: “Batchelor’s Delight, Liberty, Night Rambler, Queen Anne’s Revenge, Cour Valant, Scowerer, Flying Dragon, Most Holy Trinity, Happy Delivery, Bravo, Black Joke, and Blessing.”

The most interesting thing, if you ask me, is how "the irreverent, oppositional, radically autonomous, do-or-die ethic that controlled pirate identity" winds up as the theme of family rides at Disneyland parks. Who's subverting who? ]

[Update 9/23/2005: Ann Parker writes:

I've just been reading "Big Chief Elizabeth" by Giles Milton, which reminded me that Sir Walter Raleigh was from Devon (and had a pronounced Devonshire accent) and most of the sailors recruited for early voyages to America were from the West Country (SW England). Most pirates of British origin would thus have had this distinctive accent.

As for 'arr' itself, any Brit can tell you it is West Country dialect for 'yes' :)

Ah. But what's the difference then between arr and yarr?]

Posted by Mark Liberman at 08:58 AM

September 18, 2005

Out of the y'all zone

Back in 1998, when Michael Lewis covered the Microsoft anti-trust trial for Slate, he opined that Microsoft's lawyer

didn't take ... long to prove that technology doesn't sound nearly as impressive when it is discussed in a booming hick drawl. As he boomed on about "Web sahts" and "Netscayup" and "the Innernet" and "mode ums" he made the whole of the modern world sound a little bit ridiculous.

In a "true-life tale" in today's NYT Magazine titled "Yoga, Y'all", Elizabeth Gilbert suggests that eastern spirituality also sounds faintly ridiculous, to people like her, when translated into a southern-states idiom:

At last it was over, and Julie led us into a period of quiet meditation, where we were to lie on our backs, letting our bodies absorb the benefits of our practice. She changed the music over to a porno soundtrack and turned up the volume. "Shut yer eyes," she said. "Look for yer CHAK-ras! That'll be them bright-colored LIGHTS in yer soul! You gotta sur-REN-der to the MO-ment!"

"Porno soundtrack?" I'm assuming that Ms. Gilbert doesn't literally mean a recording of salacious exhortations and heavy breathing, but just a style of music that she found spiritually inappropriate. Anyhow, most of the rest of the article is much more explicitly critical, including direct assertions like

Over the next hour, Julie proceeded to do everything - I'm not sure how to say this politely - dead wrong.

and some things that are genuinely funny, in the way that ethnic humor often is:

Then, the oddest command of the day: "Work them BOOBS, y'all!"

So many, many yoga classes I have attended in my life, and never once had I heard "Work them BOOBS, y'all!" All I could imagine was that this was a local translation of "Open your heart-center toward the universe."

In the end, Ms. Gilbert lets Julie's class take her to the "y'all zone":

"O.K., then!" Julie concluded. "Everybody go to y'all's own QUIET place now!"

Her last instruction echoed in my head.

Go to y'all's own quiet place. . . . Y'all's own. . . . Y'all Zone. . . .

Indeed, it seemed I had entered the Y'all Zone. Estranged and disjointed, I thought back to those extraordinary months I'd spent studying with the great masters in India, where I'd experienced in my very bones the translation of the word yoga - union.

Though estranged and disjointed, she eventually decides to accept "Transcendence, Tennessee style":

I could resist and remain a critical outsider forever. Or I could do what Julie's students do - search like heck for the bright lights in my soul, surrender to the cacophonous moment and even try to absorb a little benefit from the stretching and straining. I'm still not sure if I can achieve all that, but I'll tell you what - I'm workin' my boobs off trying.

It's a good story, even if it demonstrates yet again that outlets like Slate and the New York Times will revel in jokes at the expense of the southern U.S. that they'd never print if the target of ridicule were almost any other culture.

However, I suspect that Ms. Gilbert invented many of the details. I suspect this partly on general principles: the piece has the flavor of a story improved over many retellings. But there's also some more specific evidence: the first of her southern-fried quotations is, my language consultants tell me, regionally ungrammatical:

My new yoga teacher reminded me profoundly of Julie McCoy from "The Love Boat." She wore a pink leotard and I'm pretty certain was also the grand mistress of Cardio-Burn Stepping. She bounded up to me, placed her nose an inch from mine and demanded, "What's y'all's name?" with such friendly enthusiasm it made me wish I had more names.

Cute. But all the American southerners I've ever asked about this tell me that they would never use y'all in reference to a single individual, no matter how many names he or she might have. Ron Butters ("Data Concerning Putative Singular y'all", American Speech - Volume 76, Number 3, Fall 2001, pp. 335-336) agrees:

There has been considerable discussion in the past in American Speech about whether the pronoun y'all is coming to be used in the singular in the American South (e.g., Richardson 1984, who says it is not, and Tillery and Bailey 1998 and Tillery, Wikle, and Bailey 2000, who say it is; cf. Montgomery 1996). It has always seemed to me that arguments in support of putative singular y'all depend either on (1) data that is an artifact of the research situation or (2) a mistaken understanding of the pragmatics of the reported utterance--as, for example, when a salesperson bids goodbye to a solitary customer by saying Y'all come back, hear? (an idiom meaning 'you and your friends and family come back, please!'). Note that salespersons are not reported as greeting their solitary customers with *Can I help y'all?

There is certainly some disagreement about this -- Kinky Friedman is quoted as saying

Remember: Y'all is singular. All y'all is plural. All y'all's is plural possessive.

but I'm reluctant to trust the intuitions of linguists on things like this, much less a joke from a singer/songwriter/novelist/politician.

In a book review quoted on Language Hat's site, Roy Blount Jr. tore into this view, as follows:

Recently I became aware of an airy new Southern lifestyle publication -- Y'all: The Magazine of Southern People -- out of Oxford, Miss., that might better be entitled Y'all: The Magazine That Doesn't Know What Its Own Name Means. In its premiere issue, Y'all declared that: '' 'Y'all' is singular. 'All y'all' is plural.'' That bit of blatant misinformation also appears in the ''Dixie Dictionary'' portion of ''Suddenly Southern.''

I don't know whether Y'all picked this up from Duffin-Ward or vice versa. She is not the first non-Southerner to insist that Southerners may call a single person ''y'all,'' but to my knowledge she is the first to declare categorically, in the face of everyday evidence and all philological authority, that it is always a single person we so address. But she isn't one to brook elucidation. With regard to the singularity of ''y'all,'' she writes: ''Southerners will beg to differ here. They insist that even though they use it to address one person, it implies plurality.''

Something -- either second-person-plural envy or hyperjocularity -- has affected Duffin-Ward's ear. People in the South do indeed sometimes seem to be addressing a single person as ''y'all.'' For instance, a restaurant patron might ask a waiter, ''What y'all got for dessert tonight?'' In that case ''y'all'' refers collectively to the folks who run the restaurant. No doubt the implication of plurality is hard for someone who didn't grow up with it to discern. It may even be that Duffin-Ward has heard a native speaker, in real life, violate deep-structure idiom by calling a single person ''y'all.'' That would be arguable grounds for saying that ''y'all'' is singular on occasion. But how can she have missed daily instances of people unmistakably addressing two or more people as ''y'all''? When a parent calls out to three kids, ''Y'all get in here out of the rain,'' does she think only one child is being summoned? (''All y'all'' is of course an extended plural: ''Y'all listen up! I mean all y'all.'' Often it is pronounced ''Aw yaw.'')

So Blount agrees with Butters, and both agree with what my southern friends and relations say, and with my own observations of usage. Furthermore, Elizabeth Gilbert's own evidence is that her East Tennessee yoga instructor used y'all (not all y'all) repeatedly as a plain and clear plural -- "work them boobs, y'all"; "go to y'all's own quiet place".

Therefore, I'm betting that "Julie" didn't really say -- at least not to Gilbert individually, nose an inch from nose -- "What's y'all's name?"

Maybe the kindest verdict on her "true-life tale" would be the one I once heard from a wolf scientist when I asked him about the ethological plausibility of Farley Mowat's memoir Never Cry Wolf -- "mm, let's say that much of it is poetically true."

Posted by Mark Liberman at 01:05 PM

Shortest published sentence of the year

The hunt for the shortest sentence published in print this year is over. Though it is only September, I can say that with confidence. I found the sentence in last week's New Yorker, at the end of an excellent article on the material culture of the Indians of the Brazilian Amazon rainforests — Indians whose languages I had long been interested in. Let me explain.

The two sentence types that can be as short as one word (and here I use ‘sentence’ for a written string properly punctuated as a sentence, not in the traditional sense that requires a sentence to be or contain a clause and hence a verb) are (i) imperative clauses with unmodified intransitive verbs (Stop!), and (ii) verbless units with a single constituent of a category like question-answering particle (No.), noun phrase (Clinton?), adverb (Often.), etc.

Most categories contain no words shorter than two letters, and as it happens there are no verbs in the dictionary that are spelled with one letter. The shortest sentence would therefore be two characters long: it would be a sentence of type (ii) in which the one word was spelled with a single letter, plus the obligatory final punctuation mark (‘.’ or ‘?’ or ‘!’).

Among the words that are only one letter long are the pronoun I, which is rarely used on its own (it would sound very pompous indeed to answer "Who did this?" by intoning "I"), and the indefinite article a, which is not allowed to occur alone. There really aren't any common nouns spelled with one letter (for the purpose of keeping this non-trivial, let's agree to ignore the fact that a letter can be used as a common noun denoting tokens of its type, as in Mind your p's and q's, or The word contains an f). So it is among proper nouns that we find the words of one letter that are most likely to occur on their own as properly punctuated separate sentences.

David Grann's excellent article ‘The Lost City of Z’ (The New Yorker, September 19, 2005, 56-81) describes a trip to Amazonia in search of clues to the fate of the explorer Colonel P. H. Fawcett, who disappeared in the Amazon forests in 1925 along with his son and another man. Fawcett was searching — quixotically, most have assumed — for a legendary lost city that he believed lay hidden in the jungle. He called this city Z (there is our one-letter proper name). He was a daring and resilient explorer, firm in his belief and intent on completing his mission. His disappearance prompted many to go and look for them. Quite a few never returned. Some of the Indian tribes in the Brazilian interior were well aware that massive die-offs from disease often followed contact with Europeans, and some of them would club explorers to death rather than introduce them to their villages. Other tribes took captives. Either of these fates might have been the one that overtook Fawcett. We shall probably never know. Not a verified bone has ever been found.

Grann's article describes a retracing of Fawcett's route undertaken earlier this year, with the aid of modern motor vehicles and aluminum boats. The journey goes through territory of a number of Indian peoples whose languages fall in Amazonia-located language families such as Carib; the Bakairí, Kalapalo, and Kuikuro are mentioned a number of times.

I've had a professional interest in such groups for years, ever since Desmond Derbyshire convinced me of the astonishing fact that quite a few Amazonian languages, particularly in the Carib family, have Object-Verb-Subject as their standard, basic constituent order in the clause — not just as a permitted stylistic variant, but as the normal everyday order. This doesn't occur anywhere else in the entire world as far as we know. In the Carib language Derbyshire had studied for years, Hixkaryana, Toto yonoye kamara can be glossed with the English words "people", "used to eat", "jaguar". But it doesn't mean that people used to eat jaguars. It is the ordinary way of saying that jaguars used to eat people. In our four-volume series Handbook of Amazonian Languages, published by Mouton de Gruyter between 1986 and 1998, Derbyshire and I have made available to linguists a number of lengthy and detailed descriptions of Amazonian languages by expert linguists, some of them Object-Verb-Subject languages and others strange and surprising in all sorts of other ways. (Dan Everett's first description of the astonishing Pirahã language was published in Volume 1.)

At his final destination in the area of the Xingu National Park, a gigantic area of Brazil set aside to be under Indian land management, Grann meets up with a Florida anthropologist, Michael Heckenberger, who has done ten years of research in the Xingu. What Heckenberger has found is staggering. Archaeologists had almost uniformly concluded that there were going to be no architectural features to be discovered in central Amazonia. However many people had lived there, they had done no building in stone. Today they mostly live in huts of simple and highly biodegradable construction. Everything older than a few decades will have long rotted away and become indistinguishable from the jungle itself. Heckenberger found otherwise.

What he has found, in brief, is that in Kuikuro territory there are remains of huge circular moats and palisades surrounding ancient settlements the size of small cities, and long causeways and bridges connecting them. A thousand years ago there had been urban areas surrounded by ditches up to 16 feet deep, 50 feet wide, a mile in diameter, and with a palisade wall of tree trunks at the bottom. There were streets at right angles to each other (oriented north-south and east-west) and wooden causeways providing routes of travel to other nearby friendly settlements. And the Indians left other artifacts, notably millions of shards of high-quality pottery. Some specialists are now saying that the population of the Amazon rainforests — once thought to have been empty of all but tiny bands of hunter-gatherers, a hundred or so at a time, always travelling and never building anything — may once have been in the millions, and the people were by no means at a primitive level of civilization compared to what was happening in other continents around the same time.

It is a revolution in our conception of Amazonia before 1500. The Kuikuro Indians today live a fairly simple life, but their ancestors appear to have built great fortified cities — according to Heckenberger, "with a complicated plan, with a sense of engineering and mathematics that rivalled anything that was happening in much of Europe at the time." The modern Kuikuro know about the archeological remains, but don't seem to be aware that they are the descendants of the architects and builders.

Grann describes sitting down with Heckenberger in the Kuikuro village as Indian dancers and musicians playing long flutes begin circling in the main plaza and Heckenberger talks of how much you can see the past in the present in such scenes. Grann writes:

The musicians were coming closer to us, and Heckenberger said something about the flutes, but I could no longer hear his voice over the sounds. For a moment, I could see this vanished world as if it were right in front of me. Z.

And there's our shortest sentence, right at the end. Others equally short may be published, but there won't be any shorter, because we've hit the lower bound.
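The counting behind that lower bound can be sketched in a few lines of Python. This is only an illustration: the function name and the set of terminal marks are assumptions made for the example, and it counts characters exactly as the argument above does (the letters of the word or words plus one final punctuation mark).

```python
# Assumed set of sentence-final punctuation marks.
TERMINALS = {".", "?", "!"}

def written_length(sentence: str) -> int:
    """Return the character count of a string punctuated as a sentence.

    Requires at least one character before the final mark, and raises
    ValueError if the string does not end in one of the terminal marks.
    """
    s = sentence.strip()
    if len(s) < 2 or s[-1] not in TERMINALS:
        raise ValueError("not punctuated as a sentence")
    return len(s)

# One-word sentences of the two types discussed above, plus Grann's.
examples = ["Stop!", "No.", "Clinton?", "Often.", "Z."]
lengths = {s: written_length(s) for s in examples}
shortest = min(lengths, key=lengths.get)
```

With a one-letter word the count comes to two, which is the floor under these assumptions.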

[Well, maybe. Jonathan Lundell writes to suggest that when you see a cartoon character thinking "?", that's a sentence containing zero letters plus one sentence-final punctuation mark. I guess the meaning is the illocutionary force of interrogativeness with no propositional content. So that would be even shorter, and I would be wrong. All I can say is: "!".]

Posted by Geoffrey K. Pullum at 12:10 PM

Ritual interviews

Joel Martinsen at Danwei has a translation of a post (in Chinese) from Linghu Lei's blog Life is Extra, entitled "So this is how the New York Times conducts an interview". A sample:

During the interview process today, the reporter from the New York Times first spoke a good deal about why he wanted to do the interview, and then after I had answered the first question, he said that he "identified" with my viewpoint and felt that it was "special." Then he said, "Your viewpoint is extremely close to what I want to communicate with in article," and wanted me to discuss it further. To express this identification that he felt toward my viewpoint, I elaborated on the point I had just mentioned in passing, and then expanded a bit on his viewpoint. Then he echoed my viewpoint. During this process of communication, the New York Times reporter always led me to understand that what I said was precisely what he needed for his article. My viewpoint supported his argument exactly. So unwittingly I became his news spokesperson. At the end, for the final question, the reporter summarized our entire "discussion" and asked me if I agreed - at this point, there was no way I could deny what I had just said, so - right, "that's correct."

This is the friendly version. The hostile version was caricatured by Rasheed Wallace as "Is it true that you're the team's asshole?", as discussed in a Language Log post "Ritual Questions, Ritual Answers", which Martinsen references as providing "a further analysis of the phenomenon".

Public figures need to learn how to enter into these rituals safely. Some people learn by trial and error, and I suppose that media consultants offer training -- are there any good books about it?

[Update 9/27/2005: Karl Weber sent in this anecdote, taken from a 1991 column by Michael Kinsley entitled "Don't Quote Me", and reprinted in his book Big Babies:

"During the last election a television journalist called up to say he wanted to interview me. Puzzled--this man knows far more than I do about politics--but flattered, I said sure. He showed up at my office, set up his lights and camera, and asked, "Mike would you say that . . . " Then he proceeded to enunciate some theory about the course of the campaign.

"Me (eager to please): Good point. You're absolutely right about that. I never thought of it before.

Him (testy): No. Would you say it.

Ah. He didn't want my wisdom. He wanted a sound bite. . . ."

]

Posted by Mark Liberman at 11:00 AM

September 17, 2005

The evolving etiquette of "as such"

In this morning's NYT, in an article by Jennifer Steinhauer and Eric Lipton entitled "FEMA, Slow to the Rescue, Now Stumbles in Aid Effort", there's

Evacuees and local officials also complain that FEMA's request for them to register on line or via phone is unrealistic, given that as of Wednesday 310,000 households in Louisiana were still without telephone service and 283,231 were still awaiting power, or nearly 30 percent of the state's households. And the phone lines are almost always jammed anyway. As such, those with cars drive miles to operating help centers in other counties, where the lines are sprawling. Confusion is rampant. [emphasis added]

The CJR's Language Corner complained about this transitional "as such" back in 1998. But here at Language Log, we don't complain, we explain.

It all starts with phrases of the form "As <descriptive noun phrase>, <modified noun phrase> <has some relevant property>":

As a parent, I found this book highly informative.
As a policeman, he's expected to inform the FBI, but instead he becomes a bounty hunter.
As students, you should expect your supervisors to model ethical, professional behavior.
As a partisan of paganism he was forced to leave Athens, but he returned at the end of a year.
As the owners of the airwaves, we should allow them to be used only for public purposes.
As conservatives, these authors might have been expected to cut through much of the muddle-headed leftist dogma that permeates so many discussions of gender.
As Progressives, the reformers were also willing to accept an enlarged government regulatory role.

Sometimes the descriptive noun phrase has already been used in a previous clause, and to avoid repetition, the anaphor such is substituted. Here's the earliest example I found in the OED (though this is an accidental example, representing the item air-washer rather than the use of as such):

1905 Lancet 25 Feb. 507/2 The Stellite Air Deodoriser..is an effectual air-washer and as such it may obviously have numerous hygienic applications.

This is, so to speak, short for "The Stellite Air Deodoriser is an effectual air-washer and as an effectual air-washer it may obviously have numerous hygienic applications."

Some other recent examples from the web:

Books are a cultural product and as such they deserve every protection we can give them.
Rabbits are rodents and as such they must chew to wear down their ever growing teeth.
... these latter two organisms are facultative anaerobes and as such they can be problematic for monitoring purposes since it has been shown that they are able to proliferate in soil, sand and sediments.
Bridges and switches are data communications devices that operate principally at Layer 2 of the OSI reference model. As such, they are widely referred to as data link layer devices.

For conservative users like Evan Jenkins at the CJR, this is where things should stop. If you use as such in a passage that can't be analyzed this way, with a backwards connection to an anaphoric noun phrase, and a forward connection to a modified noun phrase in subject position, they'll be on your case. However, it's clear that Norma Loquendi is fighting her way clear of this box. There seem to be two changes: a loosening of the link backward to an antecedent noun phrase, and a loosening of the link forward to a modified noun phrase. When both changes are complete, as such is left as a pure connective adjunct like moreover or therefore.

In the first form of loosening, the antecedent noun phrase for such is only implicit:

Attorneys are often in deposition or court and as such they may not call back for two or three days.

In this case, it's clear that we're meant to infer something like "as people who are often in deposition or court" or maybe just "as busy people".

Directive Leaders are characterised by having firm views about how and when things should be done. As such they leave little leeway for subordinates to display independence, ...

Here we're meant to infer something like "as leaders with firm views about how and when things should be done, they ...".

Sometimes the inferred antecedent for such seems to be almost completely decoupled from the details of the linguistic context. The NYT passage that started us off is an example of this type. In a 1977 paper, Ivan Sag and Jorge Hankamer made a distinction that's relevant here. They pointed out that some anaphors can be used to refer to things in the context that have never been mentioned at all, while others can't. For example, if I happen along while you're trying to tear a telephone book in half, I can start a conversation by saying

I don't think you can do it.

but it's weird at best to say (with no prior context)

??I don't think you can do so.

Sag and Hankamer called the first kind of anaphors "pragmatically controlled" (because the antecedent can be merely implicit in the situation), while the second kind is "syntactically controlled" (because we expect an explicit antecedent in the words recently spoken or written).

There are some uses of such where this distinction is relevant. An expert cook, seeing me trying to chop onions with a carving knife, might suggest (with no prior linguistic context)

With a knife like that, you can't get the force and the motion you need to chop efficiently.

or maybe

With such a knife, you can't get the force and the motion you need to chop efficiently.

For some people, I think that the such in "as such" is moving towards this sort of pragmatic control, if it hasn't gotten there already. Here's another case where the antecedent of such is evoked by the previous text without ever being presented in a specific noun phrase, while the forward link (to they) is just as the standard pattern expects:

The ‘Blues’ do not last. Most women experience them for just 1-2 days. Even when more severe, the ‘Blues’ resolve within 10 days of delivery. As such they can be regarded as a normal reaction and do not require any treatment.

There's a second form of loosening, where such has a perfectly good antecedent, but the noun phrase modified by as such isn't adjacent to it:

Gila monsters are one of only two venomous lizards in the world, the other being the closely related beaded lizards (Heloderma horridum). As such, many states limit the keeping of this species, so check with local laws before purchasing captive bred gila monsters.

Well, maybe as such ought to mean "as a venomous lizard" here, but in any case, it's not the states that are venomous, but the lizards. Cases like this fall (I think) under the heading of dangling etiquette, discussed last year by Geoff Pullum in connection with the example

Rich and creamy, your guests will never guess that this pie is light.

Here's a more subtle example with an as NP adjunct:

As progressives, our focus should be on appointees to the Supreme Court.

The writer wants "as progressives" to modify the group (s)he belongs to, and everything would have worked out according to the standard prescription if only the sentence had been "As progressives, we should focus on appointees to the Supreme Court".

As Geoff explained, "participles, adjectives and also some idiomatic preposition phrases, when used as adjuncts, need an understood subject", and "[t]he prescriptive tradition says that the subject filled in must be the one obtained from the subject of the matrix clause". Geoff pointed out that it's hard to maintain that this is a fact of English grammar, since offending examples are extremely common in edited sources, and concluded that "it's manners, not grammar" to furnish your adjuncts with an easily-accessible target of predication.

Sometimes the forward link seems to be completely gone -- no noun phrase plausibly modified by as such can be found at all. This suggests that for many people, as such has become a sort of free-floating discourse-related adverbial, meaning something like "given that this is the case", or "in the state of affairs just described", or simply "therefore".

I don't see any other plausible way to interpret the following quote from a Canadian Council of Forest Ministers 1999 press release:

"Recognition of the need for an international forest convention has continued to gain ground steadily since Rio, with numerous countries now supporting the drive for a convention," stated Natural Resources Canada Minister Ralph Goodale. "All agree that, in today's world, there is a need to work in a focussed and dedicated manner to address the many problems and opportunities facing the global forest community. As such, I will continue to work with my provincial and territorial colleagues toward the establishment of a convention."

In this case, none of the expected nominal components of the original version of the as such construction can be found -- there's neither a coherent antecedent nor a plausible target of predication -- but substitution of a discourse-structuring connective like therefore works perfectly. [Well, maybe Mr. Goodale meant that "as someone who agrees that blah, I will blah"; but I think that's a stretch.]

Here's a case where it's even harder to analyze as such as anything other than a pure connective:

This does not mean the product has no value, but rather that based on clinical trials in humans, there's little hard evidence about the effectiveness of shark liver oil for any condition. As such, there are no formally established dosages for shark liver oil, and the WholeHealth MD recommendations below follow guidelines set forth in these books.

There are 850,000 other Google hits for {"as such there"}, and a large fraction of them are similarly purified of any grammatical role other than as a discourse connective:

Good and bad phases of life are both in the worldly existence and as such there is no difference as regards results.
Wikipedia is not a medium for the presentation of writers' personal preferences. As such, there should not be any reference to environment, date, working directory, user name, or host name unless absolutely necessary.
Medicare payment for physical therapist services is made at 100 percent of the Medicare physician fee schedule. As such, there is no difference in the amount of Medicare payment where physical therapy services are billed as "incident to" services ...
Therefore, in our example, although Australia only has one capital, and as such there is only going to be one correct answer to the question, 'What is the capital of Australia?', if different answers to the question are given it might be that one person genuinely thought their answer was right, even though it turned out to be wrong ...
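The search string works as a filter because expletive there can't plausibly be what as such is predicated of, so a hit is at least a candidate for the pure-connective use. A minimal sketch of the same filter in Python follows; the pattern and the function name are assumptions for illustration, and this is a crude string match, not a parser, so it will miss connective uses that aren't followed by there.

```python
import re

# Assumed pattern: "as such", an optional comma, then the word "there".
# The trailing \b keeps "therefore" from matching.
AS_SUCH_THERE = re.compile(r"\bas such,?\s+there\b", re.IGNORECASE)

def is_connective_candidate(text: str) -> bool:
    """True if the text contains 'as such' directly followed by 'there'."""
    return AS_SUCH_THERE.search(text) is not None

samples = [
    # Pure-connective use quoted above.
    "As such, there are no formally established dosages for shark liver oil.",
    # Standard pattern: 'such' has an antecedent and 'they' is its target.
    "Rabbits are rodents and as such they must chew to wear down their ever growing teeth.",
]
flags = [is_connective_candidate(s) for s in samples]
```

Here the first sample is flagged and the second is not.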

The analysis of as such as a pure connective, liberated both from its antecedent and from its target of predication, may also apply to many of the examples where the antecedent and the modified noun phrase are unexpectedly inaccessible. If so, then these are not failures of rhetorical etiquette, but simply examples of cultural change. It's not an insult to come to dinner without a tie if you're not aware that one is expected.

Posted by Mark Liberman at 11:29 AM

September 16, 2005

The noses of wrath

Several correspondents have pointed out that there is no reason to be puzzled by the bible passages where different English translations variously refer to noses, eyes, faces and anger (for example Exodus 5:21 -- "stink in the nostrils of Pharaoh", "a stench to Pharaoh", "odious in Pharaoh's sight", "made us stink before Pharaoh", "making the king ... hate us", "made our odour to stink in the eyes of Pharaoh", "caused us to be hated by Pharaoh", "made us reek in front of Pharaoh", etc.), because the Hebrew word אף can mean any of these things. Strong's Hebrew Bible Dictionary calls this word number 639:

properly, the nose or nostril; hence, the face, and occasionally a person; also (from the rapid breathing in passion) ire:--anger(-gry), + before, countenance, face, + forebearing, forehead, + (long-)suffering, nose, nostril, snout, X worthy, wrath.

The morfix site assigns the spelling אף (alef pe) to three different lemmas, glossed as "nose", as "(flowery) anger, wrath" and as "also, even, not even". My understanding (subject to correction -- this is not my field) is that the last one is independent, while the first two are related, with the "anger" meaning derived from the "nose" meaning along the lines suggested by Strong, and instantiated in Psalms 18:7-8 (reference supplied by Trevor at Kalebeul):

Then the earth shook and trembled; the foundations also of the hills moved and were shaken, because he was wroth.
There went up a smoke out of his nostrils, and fire out of his mouth devoured: coals were kindled by it. [KJV]

The "anger is heavy breathing" metaphor isn't the only way to connect the nose and the face to negative emotions. Thus the English expression to "(really) get in <someone's> face" often means to confront or provoke them.

I had to really get in his face in this movie I have to be adversarial to him; I have to be at odds with him.
She called me a racist, yelling and screaming as she really got in my face.
You first told him politely that you didn't appreciate this, then one day you really got in his face about it.
When she doesn't like something, she is one to really get in your face.
They talk about what a tough disciplinarian he was, how he could really get in your face.
So they go outside and the guy steps out and asks ‘em to take their best shot; really gets in their face about it!
So I, like, totally got in his face and everything and made him take one item back to the shelf.
...when this Eurotrashy couple from probably Murray Hill tried to take our bench, Jake handed me his camera and totally got in their face and scared them away.
My mom blows up and totally gets in my face for it.
Afterwards, I stand to the side in a daze for what seemed like forever; until the prop master totally got in my face about not returning the prop to him.
I like to have fun, and I DO NOT put up with girls like that...and I will definitely get in their Face if they act that way towards me.
i completely got in his face an blew up at him
I lost it. I asked him what the fuck was the hold up and pretty much got in his face.
Needless to say, there are MANY topics which follow that pattern, where I try to keep it on-topic, or at least polite, but others repeatedly get in my face with personal attacks and compel me to respond at some point.

And to "(really) get up <someone's> nose" often means to annoy them, especially in the UK and Australia:

I'll admit, the ID people get up my nose. I didn't mind the old William Jennings Bryan Creationism half as much ...
What gets up your nose the most? Is it constant customer complaints?
If it's filthy, that's obviously going to get up your nose. But it's the design that really pisses me off.
What especially gets up my nose is the camera work, it seems to be some new style of movie making, it sucks.
It's the mundane, ho-hum everyday stuff that really gets up his nose.
somthing what really gets up my nose is when a shop advertises its specials for the week and when u go for them they dont have any left
It used to really get up my nose when legendary liberal campaigners Mike and Fran Oborski used to put the slogan "Fran again" on their leaflets.
Anyway, what seriously got up my nose about Theophanous was his holier-than-thou attitude when the allegations against him first came to light.
"I Have never had any problems with eBay In the past, but This seriously got up my nose," he said.
Even though he was supposed to be possessed, this still seriously got up my nose and kinda marred the ending a bit for me.
Jon can be so god damned stubborn sometimes that it totally gets up my nose!
The level of general ineptitude displayed in their madcap capers utterly gets up my nose.
... he completely got up my nose last night by giving us the wrong address for the pizza place he wanted to go to because he's too lazy to walk like FIVE BLOCKS to North Beach, and we ended up wandering all over the Union Square neighborhood looking for the damn restaurant, which he couldn't remember the name of.

You can find both expressions at once:

After all, it took me more than 3 years to become an activist, and that was only due to repeated persecution by the "Church" that definitely got in my face and up my nose.
Bush’s decision to resubmit nominees he couldn’t win approval for in his first term smacks of in-your-face and-up-your nose politics.
The link between rank and file trade union activists and community activists is now well established with the presentation of some of the most irreverent, in your face and up your nose presentation of trade union programs that reflect the aspirations of all workers.

There's also the use of "front" as a back-end abbreviation for confront, which some people might interpret as having something to do with getting in front of someone.

Finally, Geoff Nunberg wrote

I think of a double dactyl by Anthony Hecht that I always liked:

Higgledy-piggledy
Juliet Capulet
Cherished the tenderest
Thoughts of a rose.

"What's in a name?" said she,
Etymologically,
"Save that all Montagues
Stink in God's nose."

That leads to the question of whether "stink in X's nose" is another, perhaps earlier variant. Or for that matter, "stink in the porches of X's nose."

I'll leave that one for my next free coffee break.

[Update: Gabriel Nivasch wrote

Regarding the verse Exodus 5:21 and the Hebrew word "af" (aleph pei):
There's no word "af" in the verse, so the readers' comments on the word "af" are not quite relevant.

Here's a word-by-word translation:

asher        - that
hiv-ashtem   - you (pl.) have made repulsive
et           -
reicheinu    - our smell
be-einei     - in the eyes of
phar-o       - Pharaoh
uve-einei    - and in the eyes of
avadav       - his servants

So it actually says "smell in the eyes" here! Clearly a mixed metaphor. This didn't escape the classic commentators. The Ibn Ezra says that the substitution of one sense for the other can be justified by the fact that the five human senses join together at a spot above the forehead. He brings another example of a sense substitution: 'Light is sweet...' (Ecclesiastes 11:7).

Alternatively, "in the eyes of" doesn't have to be literal. It can mean "before", as in some of the translations you brought.

A better example would have been Isaiah 65:5, which definitely involves "af", and is variously translated

whiche seien to an hethene man, Go thou awei fro me, neiy thou not to me, for thou art vncleene; these schulen be smoke in my stronge veniaunce, fier brennynge al dai. [Wycliffe 1395]
Which say, Stand apart, come not nere to me: for I am holier then thou: these are a smoke in my wrath & a fire that burneth all the day. [Geneva 1587]
Which say; Stand by thy selfe, come not neere to me; for I am holier then thou: these are a smoke in my nose, a fire that burneth all the day. [KJV 1611]
who say, 'Keep away; don't come near me, for I am too sacred for you!' Such people are smoke in my nostrils, a fire that keeps burning all day. [NIV 1973]

]

Posted by Mark Liberman at 04:42 PM

More on Murray Emeneau

I don't have Arnold Zwicky's illustrious history -- he has been President of the LSA, for starters -- but we are in the same generation, so I too have met most or all of the LSA's former presidents starting with Emeneau, and one before Emeneau, Adelaide Hahn. Some of the earliest presidents I met because Bernard Bloch -- LSA president in 1953, long-time editor of Language, and teacher extraordinaire -- always got invited to all the best parties at LSA meetings, and he would take us callow students along. (Bloch would say to the host, as we trailed in behind him, that he was sure the host wouldn't mind having a few of his students join the group. The hosts probably did mind, of course, but they didn't say so.)

I also share Arnold's great respect for Murray Emeneau. When I first got interested in language contact thirty years ago, and started reading around in the literature, Emeneau was the person who seemed to have the best account of the dynamics of contact -- even more than Uriel Weinreich, my other intellectual hero in this domain, whose classic book Languages in Contact (1953/1968) is surely cited more than any other work on language contact. Emeneau wrote extensively on this topic, but his most famous language-contact publication is his 1956 article `India as a Linguistic Area' (reprinted in 1964): it's a ground-breaking article, and it is not outdated even now, so it's still well worth reading. It's also right, in my opinion, but issues of language contact in India enjoy...or suffer from, depending on one's viewpoint...various political and intellectual controversies, so not everyone will agree with me on that point.

I met Emeneau just once, several years ago when he was in his late 90s: I was visiting Berkeley, and he came to campus to meet me. I was recovering from the flu at the time and feeling ill and groggy, but when he came in the door I leapt to my feet and barely refrained from bowing. He was a giant in our field, and if I had time to google the quotation so I could get it right, I could add the bit about his like not coming again.

Posted by Sally Thomason at 07:33 AM

September 15, 2005

Reminiscences on a theme by Emeneau


With the recent death of Murray B. Emeneau, at age 101, an era in the history of the Linguistic Society of America has come to an end, and there probably won't be another time like it.  The thing is, Emeneau was president of the LSA in 1949, and if you look at the roster of presidents, you have to go almost twenty years ahead to find a president who is still alive: Gene Nida, president in 1968.

As it happens, Emeneau's presidency represents a personal LSA watershed for me: from Emeneau on, I have met every president of the society.  Linguistics is a small field.

You'll already have noticed that this posting is really about me me me, not Murray Emeneau, who merely presented me with an occasion for reminiscing.  You get a Medicare card, you start thinking about the past.


Emeneau wasn't the first person on the presidential roster that I've met.  That would be Hans Kurath (1942, an even fifty years before I myself was president; I was still in diapers in 1942).  Then Y. R. Chao (1945), Adelaide Hahn (1946), and the unbroken string that begins with Emeneau, who was president when I was in the fourth grade.  As the author of the elegant booklet Sanskrit Sandhi and Exercises (1st ed. 1952, rev. ed. 1958, the year I graduated from high school), Emeneau even played a significant role in my academic life, showing me how central organizing principles could be extracted from a mass of very complex data (which I then bashed at in my Ph.D. dissertation).

But why do I say that we're unlikely to see another almost-twenty-year run like Emeneau's?  First, there's the extraordinary longevity of people like Kurath, Chao, and Emeneau.  More important, though, is the fact that many of the early presidents of the LSA (Emeneau among them) were in their 40s when they took office -- young adulthood in today's academic world.  These days, presidents are in their 50s and 60s.  (Henry Kahane (1984), who had surely one of the most active academic retirements in history, was in his 80s.)  In any case, there are a lot more linguists now than there were then, and many more excellent candidates for the office, so their age at taking office is going to creep up.

In any case, after Emeneau in 1949 we soon come to linguists who were my teachers: Roman Jakobson (1956), Henry Hoenigswald (1958, the year before he taught me my second course in linguistics), and my dissertation adviser, Morris Halle (1974).

And to people who eventually were my colleagues at one or another of my three institutions: Charles Ferguson (1970), Joe Greenberg (1977), Ilse Lehiste (1980), Henry Kahane (1984), Elizabeth Traugott (1987), Charles Fillmore (1991), and Joan Bresnan (1999).  The list begins to include personal friends like Dwight Bolinger (1972), Lehiste, Kahane, Vicki Fromkin (1985), Traugott, Fillmore, and many more.

With Barbara Partee (1986) the list begins to include people I went to graduate school with: Jim McCawley (1996), Terry Langendoen (1998), David Perlmutter (2000).  And then people who were graduate students at MIT after I left: Janet Fodor (1997), Bresnan, Ray Jackendoff (2003), and Mark Aronoff (2005).

Finally, inevitably, we come to an LSA president who took courses from me: Fritz Newmeyer (2002).  Now, all I have to do is live long enough for a "grand-student" (someone who took courses from someone who took courses from me) to assume the presidency.  That would be very pleasing.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:43 PM

Ethical nostrils

A few days ago, I quoted a passage from Robert Penn Warren's All the King's Men:

"And I wouldn't know the truth this minute if that woman right there --" and he pointed down at Sadie --- "if that woman right there --"
I nudged Sadie and said, "Sister, you are out of a job."
"-- if that fine woman right there hadn't been honest enough and decent enough to tell the foul truth which stinks in the nostrils of the Most High!"

Readers will recognize the last part as another one of those pieces of partly prefabricated rhetoric, something like "<bad smell> in the nostrils of <moral authority>". (How do we know this? A good question, to be answered another time...) So I spent a few minutes trying to find the source -- web search makes it easy to indulge in a little one-hour scholarship over morning coffee -- but so far, all I've learned is that this phrase has been around since the early 17th century, and seems always to have been metaphorical rather than literal.

The history first. The earliest citation in the OED is

1771 E. BURKE Speech Middlesex Election in Wks. X. 65 Our judgments stink in the nostrils of the people.

[Update: but as Jesse Sheidlower has pointed out to me, the OED has other examples of the figurative use of nostrils "with connotation of moral sensitivity", though involving slightly different phrasal patterns, going back to 1596:

1596 F. SABIE Adams Complaint 144 Realmes full of errors, mountaines huge of shrewdnes. The height whereof vnto his throne ascended, And with their stench his nostrils sore offended.

]

The earliest example I could find quickly of the specific pattern <bad smell> in the nostrils of <moral authority> was from Lancelot Andrewes, Scala Coeli. Nineteene Sermons Concerning Prayer, which was "Printed by N. O. for Francis Burton, dwelling in Pauls Church-yard, at the signe of the Greene Dragon, 1611":

Euen so the wicked imaginations, and vnchast thoughts of our heartes, which yeeld a stinking smell in the nostrils of God, are sweetned by no other meanes then by prayer...

There are many examples through the rest of the 17th century, for instance from Benjamin Keach's 1689 description of ethnic cleansing in Distressed Sion Relieved, The Tryal and Condemnation of Mystery Babylon, the Great Whore:

4005 Those horrid tortures, which my Brethren say
4006 She exercis'd on them, the same I may
4007 Affirm t'have suffer'd, by the instigation
4008 Of this vile Strumpet, whose abomination
4009 Stinks in the Nostrils of each civil Nation,
4010 Her cursed Priests, when first they did begin
4011 Our Massacre, proclaim'd, it was a sin
4012 Unpardonable, if they durst to give
4013 Quarter, or our necessities relieve;
4014 Some they stript naked, and then made them go
4015 Through Bogs and Mountains, in the Frost and Snow,
4016 Men, Women, Children, then were butchered,
4017 And all that spake our Language punished;
4018 The very Cattel, if of English breed,
4019 They slasht and mangled that they could not feed.

Other 17th-century examples include John Bunyan in 1685:

Their rising, is called the resurrection of the unjust, and so they at that day will appear, and will more stink in the nostrils of God, and all the heavenly hosts, than if they had the most irksome plague-sores in the world running on them.

and Cosmo Manuche in 1652:

Rigg. How rank a Traytor smells.
Albin. Very true; especially, in the nostrils of the righteous.

The phrase has a biblical sort of ring, but there is no source for it in the King James version of the bible, nor could I find it in earlier translations like Wycliffe and Coverdale (and yes, I knew to search for "nostrels"). There are a number of examples in more recent translations such as the 1973 New International Version for Isaiah 65:5

Yet they say to each other, `Don't come too close or you will defile me! I am holier than you!' They are a stench in my nostrils, an acrid smell that never goes away.

which the 1611 KJV renders as

which say, Stand by thyself, come not near to me; for I am holier than thou. These are a smoke in my nose, a fire that burneth all the day.

and the 1587 Geneva bible has as

Which say, Stand apart, come not nere to me: for I am holier then thou: these are a smoke in my wrath & a fire that burneth all the day.

Whatever the source, there are plenty of examples on the web of this pattern "<bad smell> in the nostrils of <moral authority>". The unpleasant odor is always a metaphor for ethical disgust:

According to this biblical school of thought, Hurricane Katrina was sent to wipe away the stench in the nostrils of God.
Sanctimony stinks in the nostrils of the Lord.
And, no matter which it is, the result is inevitably death in our life (death in this connotation means barrenness, fruitlessness, spiritual poverty) which is a stench in the nostrils of God; it is hostile to God.
This bestiality, worse than anything in recent history, stinks in the nostrils of Heaven.
At times, I must be a stench in the nostrils of my heavenly Father.
He demanded complete separation of church and state as he declared that "forced worship stinks in the nostrils of God."
...a city whose putrefying iniquity became such a stench in the nostrils of the righteous God that love, made hopeless by their plight, gave place to wrath...
Gay marriage stinks in the nostrils of God.
I believe the enslavement of any human beings, as a race, is a stench in the nostrils of the Great Creator of man.
I submit to you that this is obscenity and it stinks in the nostrils of a holy and compassionate Cod [sic].

The "sin smells bad" metaphor is expected when the nostrils are divine -- people don't need to invoke heavenly olfaction to persuade others that it's unpleasant to be downwind of a pig farm on a warm day. However, the nostrils are not always divine -- as in the Keach and Burke quotes, they often belong to a group of humans to whom the writer attributes collective moral authority:

They stink in the nostrils of every honest man and woman.
Gentlemen, it is time that this stench in the nostrils of all decent persons in the West is buried.
The hypocrisy of the U.S. government, which proclaims that it is leading a fight against world terrorism, became a stench in the nostrils of Latin American public opinion.
Franklin [...] spoke of a war in which the colony's proprietors including Penn would be "gibbeted up to rot and stink in the Nostrils of Posterity."
Kameny called the government’s anti-homosexual policies "a stench in the nostrils of decent people."
...led even an English Conservative newspaper, the London Times, to declare that “the name of an Irish landlord stinks in the nostrils of Christendom”.
Their régime was a stench in the nostrils of every respectable man, North or South.
He had been the terror of the lawless element in Arizona, and with the Earps was the only man brave enough to face the bloodthirsty crowd which has made the name of Arizona a stench in the nostrils of decent men.
Whiggism was putrescent in the nostrils of the nation ...

Other secular substitutions in the frame "<bad smell> in the nostrils of ___" include Civilization, the community of Musselburgh, honest men, the common people to whose elevation they had dedicated their lives, the political class, the whole world professing any sort of pretensions to Christianity, the court, thinking men of all the civilized nations of the earth, swing voters, civilized governance, the American people, and so forth.

I don't find examples where the smell is a literal odor, or where the nostrils belong to a specific individual. These would be things like

? The decaying corpses stink in the nostrils of the city's inhabitants.
? His behavior was a stench in the nostrils of his aunt Mildred.

There are a couple of examples like the second one in some recent bible translations:

2 Samuel 10:6 in the 1973 NIV

When the Ammonites realized that they had become a stench in David's nostrils, they hired twenty thousand Aramean foot soldiers from Beth Rehob and Zobah, as well as the king of Maacah with a thousand men, and also twelve thousand men from Tob.

But the phrasing "in David's nostrils" is a recent introduction:

Wycliffe 1395: Sotheli the sones of Amon sien, that thei hadden do wrong to Dauid, and thei senten, and hiriden bi meede Roob of Sirye, and Soba of Sirie, twenti thousynde of foot men, and of kyng Maacha, a thousynde men, and of Istob twelue thousynde of men.
Coverdale 1535: Whan the childre of Ammon sawe that they stynked in the sighte of Dauid, they sent and hyred the Sirians of the house of Rehob, and the Sirians at Zoba euen twentye thousande fote men, and from the kynge of Maecha a thousande men, and from Istob twolue thousande men.
Geneva 1587: And when the children of Ammon sawe that they stanke in the sight of Dauid, the children of Ammon sent and hired the Aramites of the house of Rehob, and the Aramites of Zoba, twentie thousande footemen, and of King Maacah a thousand men, and of Ish-tob twelue thousande men.
KJV 1611: And when the children of Ammon saw that they stanke before Dauid, the children of Ammon sent, and hired the Syrians of Beth-Rehob, and the Syrians of Zoba, twentie thousand footmen, and of king Maacah, a thousand men, and of Ishtob twelue thousand men.

The bad-smell metaphor is there from 1535 onwards, but the phrasing is not: the Ammonites "stynked in the sighte of Dauid" or "stanke before Dauid", not in his nostrils.

Another passage involves Pharaoh's sense of smell, though it is located in his eyes by King James' translators:

Exodus 5:18-21, King James version:

18 Go therefore now, and work; for there shall no straw be given you, yet shall ye deliver the tale of bricks.
19 And the officers of the children of Israel did see that they were in evil case, after it was said, Ye shall not minish aught from your bricks of your daily task.
20 And they met Moses and Aaron, who stood in the way, as they came forth from Pharaoh:
21 and they said unto them, The LORD look upon you, and judge; because ye have made our savor to be abhorred in the eyes of Pharaoh, and in the eyes of his servants, to put a sword in their hand to slay us.

Other English versions have "made us a stench to Pharaoh", "made us odious in Pharaoh's sight", "made us stink before Pharaoh", "made us a rotten stench to be detested by Pharaoh", "getting us into this terrible situation with Pharaoh", "made us stink in the sight of Pharaoh", "making the king and his officials hate us", "made us abhorrent in the sight of Pharaoh", "caused our fragrance to stink in the eyes of Pharaoh", "made our odour to stink in the eyes of Pharaoh", "caused us to be hated by Pharaoh", "made us reek in front of Pharaoh", "We are like a very bad smell to Pharaoh"...

So I've finished my second cup of coffee without finding the answer. Did this expression emerge into common use during the 17th century, and stay in the phrasal vocabulary of the English language to this day, without any authoritative model at all?

[Update 9/18/2005: Edward Cook at Ralph the Sacred River finds some relevant (though not exact) quotations in the late-16th-century Marprelate tracts; he agrees in being surprised by the lack of a clear biblical source for the "<bad smell> in the nostrils of <moral authority>" phrasal pattern:

I could have sworn that this expression was Biblical, but as Liberman shows, there is no exact model in the early English translations (although Hebrew be'ash, hib'ish "to stink, cause to stink" = to incur dislike, is well known).

]

Posted by Mark Liberman at 07:22 AM

September 14, 2005

Pointless incessant barking

This was my favorite cartoon of the week (actually it was in last week's New Yorker).

Believe me, I am well aware that attempting to warn the public against eternally popular hardy perennials like Dan Brown or Strunk and White through Language Log posts is perilously close to pointless incessant barking. But one does what one can. Arf! Arf!

Posted by Geoffrey K. Pullum at 08:07 AM

September 13, 2005

Syntactic Incompetence?

The lead Canadian news item at the Canadian Broadcasting Corporation website right now is the revelation that two prominent pathologists have concluded that a little girl, for whose rape and murder a man has already served twelve years in prison, died of natural causes. There having been no crime, the man is of course innocent. Here is the first paragraph of the story:

Two experts have concluded that a man convicted of raping and killing a four-year-old Ontario girl in June 1993 actually died of natural causes.

That's all wrong, of course. The man is still alive, in prison. It's the little girl whom he allegedly killed who is dead, and the story is that she died of natural causes.

Linguistic errors in the news are hardly a rarity, but this one is unusual both for its severity and its nature. It isn't a typographical error, and indeed it doesn't result from any sort of small-scale mistake. A correct version would have been something like this:

Two experts have concluded that a four-year-old Ontario girl whom a man was convicted of raping and killing in June 1993 actually died of natural causes.

The correct and botched versions are grossly different - to transform one into the other you've got to move large chunks from one clause to another.

I suspect that the error is due to the differing difficulty of the syntax of the two sentences. In both cases the first-order subordinate clause, which expresses the conclusion reached by the two experts, consists of a subject noun phrase containing a relative clause followed by the predicate actually died of natural causes. In the actual, erroneous, sentence, the subject of the main clause (of the conclusion), a man, is also the subject of the relative clause - a stand-alone equivalent of the relative clause would be A man was convicted of raping and killing a four-year-old Ontario girl in June 1993. In the correct version, the subject of the main clause, the little girl, is the object of the relative clause. A stand-alone equivalent of the relative clause would be: A man was convicted of raping and killing her in June 1993. Sentences in which the grammatical role of a noun phrase is the same in the main clause and the relative clause seem to be easier to process. Some languages reportedly have only this type of relative clause.

What seems to have happened here is that the author had difficulty formulating the correct sentence, came up with what appeared to be a less awkward version, and went with the latter without realizing that it said something quite different. Or it may be that the author of the piece got it right and that it was an editor whose "correction" turned it into an error.

Posted by Bill Poser at 05:42 PM

Effing avoidance (cont.)


The mail is in on pronouncing "unpronounceable" characters, and it appears that in addition to a growing conventional use of the verbs heart (for the <heart> symbol, once pronounced like "love") and bleep (for strings of punctuation marks that stand for some nonspecific profanity), a convention is spreading for the use of  eff in reading things like "f*ck" and "f**k" and "f***", and in fact, for replacing the word fuck in "polite" language, both in speech and in writing.

Eff has, of course, been available for quite some time, along with other avoidance words (frig, fug, freak, etc.) catalogued by Jesse Sheidlower in The F Word.  What's new is that many people seem to be unwilling to go all the way to fuck itself (as John McWhorter proposes in Doing Our Own Thing, a suggestion that I'd endorse) and are instead settling on eff as the substitute of choice.


In my e-mail, David Landfair (9/6/05) noted that he reads the second word of the title "Totally F***ed Up" as "effed", but wouldn't read, say, "s***y" as "essy".  I certainly agree with him that ess (or for that matter esh) just won't do for "s***" or "s**t" or "sh*t".  Jesse Sheidlower himself (9/7/05) agreed on "effed" in the Araki title, but introduced a subtlety: the title of Araki's film is, in print, "Totally F***ed Up", with three asterisks in it, and the titles of Johnson's books have "F*cking", with an asterisk, in them; what about titles that have full frontal orthographic obscenity, like the Arthur Nersesian book titled "The Fuck-Up"?  Here, Sheidlower goes for the reading "fuck", honoring the author's evident intentions.

Yet another subtlety from Jesse, who noted that when the New York Times referred to Mark Ravenhill's play "Shopping and Fucking" as "Shopping And..." (others used "Shopping and F*cking" or printed Ravenhill's title unamended), he read it, at least to himself, as "shopping and", reproducing the Times's ellipsis dots as, well, phonetic ellipsis.  And he suggested that he'd probably realize "#$*!" not as "bleep", but as a pause, as in "what the [pause] do we know" for "What the #$*! Do We Know?!"  I don't know how popular the pausing strategy is for rendering these conventions of print, but someone could study it.

The "eff" strategy, which also deserves study, was taken by the Palo Alto Daily News on 2/7/04, when it reprinted Jan Freeman's Boston Globe "The Word" column of 1/25/04 about taboo-avoidance strategies (reported on by Mark Liberman here on Language Log that very day) under the headline "Ban cuss words? Effing unlikely".  The Globe itself, no eff-up, chose the head "The eff factor".

Meanwhile, Lauren Squires (9/6/05) wrote to say that in her blog she has several times questioned the contention that the asterisks, <heart>, etc. are unpronounceable, and noted that these entries contain some interesting comments by readers.  Check the entries for 11/5/04, 1/26/05, 5/30/05, and 8/24/05.

And Alison Blank (9/6/05) reported on other names that present pronunciation difficulties:

there is a band that goes simply by the name "!!!".  As explained by Jesse Ashlock, "the name is subject to myriad pronunciations, as the utterance of any repetitive sound in triplicate (for example, "Chik Chik Chik" or "Pow Pow Pow" -- or "Prince Prince Prince," if you like) is considered acceptable."  In practice, the pronunciation of the name is not as unrestricted as this implies; "Chik Chik Chik" is definitely the accepted title here, although, for all I know, at Stanford "Pow Pow Pow" may very well be more common.  As with phrasal names, this seems to be another instance of band names helping to introduce new conventions into English.

Finally, Lissa Krawczyk (9/7/05) noted that the avoidance characters almost always replace a vowel letter (or, I amend, a string of letters including a vowel letter) and went out on an interpretive limb to speculate that "immorality or general uncouth behaviour" is "associated with the open mouth".  Ingenious, but a sufficient explanation for this fact comes from schemes for telegraphic writing in English -- "f u cn rd ths u cn gt a gd jb" -- which eliminate most vowel letters, on the grounds that enough information remains in the consonant letters to allow readers to reconstruct the original.  Same with taboo avoidance: enough needs to be left to allow readers to home in on the word the writer intended, but vowel letters are generally dispensable.
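The vowel-dropping idea is simple enough to sketch in a few lines of Python. This is only a rough illustration under my own assumptions: the function name telegraph is made up, only a, e, i, o, u count as vowel letters, and one-letter words are left alone. It does not reproduce the advertisement exactly, since "u" for "you" is a phonetic respelling rather than a deletion.

```python
VOWELS = set("aeiouAEIOU")  # assumption: y is treated as a consonant letter


def telegraph(text):
    """Drop vowel letters from each word, keeping words that would vanish."""
    out = []
    for word in text.split():
        stripped = "".join(ch for ch in word if ch not in VOWELS)
        # Keep one-letter words (like "a") and any word made only of vowels.
        out.append(stripped if len(word) > 1 and stripped else word)
    return " ".join(out)


print(telegraph("if you can read this you can get a good job"))
```

The result keeps every consonant letter in order, which is usually enough for a reader to recover the original words.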

Well, not exactly finally.  I just wanted to point out that unpronounceable is a hard word to spell; not all of my correspondents managed it, and I fairly often fall into error myself.  Some raw figures from a Google web search:

unpronouncable: 34,300
unpronoucable: 1,160
unpronouceable: 576
unprouncable: 509
unprounceable: 488 [I incline to this one]
unpronunceable: 214
unpronuncable: 179
unpronucable: 11
unpronuceable: 6

And I can't tell you how hard it was to type that table.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:46 PM

Two, three... many prefabricated phrases

From Paul Krugman's 9/12/2005 NYT column:

The point is that Katrina should serve as a wakeup call, not just about FEMA, but about the executive branch as a whole. Everything I know suggests that it's in a sorry state - that an administration which doesn't treat governing seriously has created two, three, many FEMA's.

Anyone with a nose for prose will recognize "(create) two, three, many ___" as one of those fragments of prefabricated rhetoric that we've taken to calling snowclones. A search for {"two three many"} will turn up 17,600 hits, most of which are genuine instances of the type, with substitutions for the head noun including "Fallujas", "Arafats", "North Koreas", "Bake Sales", "Kosovos", "Marine Corps", and so on.
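For anyone who wants to look for the frame in a pile of text rather than through a search engine, a regular expression gives a rough approximation. The sketch below is illustrative only: the pattern and the name find_heads are my own assumptions about what counts as an instance, not the query that produced the hit count above. It picks up either a run of capitalized words or a single lowercase word after "many", so it will miss or truncate some real heads.

```python
import re

# Hypothetical pattern for "two, three, (even) many <head>".
# The number words are matched case-insensitively; the head is either a run
# of capitalized words ("Marine Corps") or one lowercase word ("quagmires").
SNOWCLONE = re.compile(
    r"\b(?i:two)[\s,.]+(?i:three)[\s,.]+(?:(?i:even)\s+)?(?i:many)\s+"
    r"(?P<head>[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*|[a-z][\w'-]*)"
)


def find_heads(text):
    """Return the word or words filling the open slot in each match."""
    return [m.group("head") for m in SNOWCLONE.finditer(text)]


print(find_heads("We could end up with two, three, many Marine Corps: white Marine Corps"))
print(find_heads("Create two, three ... many Vietnams"))
```

Allowing periods between the number words lets the pattern cope with the ellipsis dots in Guevara's title as well as with plain commas.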

Krugman himself has used this rhetorical pattern in at least three previous columns:

Alas, it's all too likely. I can't tell you which corporate icons will turn out to be made of papier-mâché, but I'd be very surprised if we don't have two, three, even many Enrons in our future. {2/1/2002 NYT column}
And looking forward, I don't believe that even the pro-war candidates would pursue the neocon vision of two, three, many Iraq-style wars. {1/2/2004 NYT column}
Rather than concede that he made mistakes, he's sticking with people who will, if they get the chance, lead us into two, three, many quagmires. {8/31/2004 NYT column}

In case you don't know where this phrase comes from, here's the story. In 1967, Che Guevara, trying to start an anti-U.S. guerrilla movement in Bolivia, published in the Cuban magazine Tricontinental an essay whose English version was entitled "Create two, three ... many Vietnams". The Spanish original seems to have been titled "Crear dos, tres... muchos Viet-nam". The spread of the phrase in the U.S. seems to have been aided by a 1968 Ramparts magazine article by Tom Hayden, "Two, Three, Many Columbias".

Guevara failed miserably -- he was killed and his movement obliterated within months of his essay's publication -- but like many spectacular failures, he lives on in unexpected ways. Alvaro Vargas Llosa, writing in the New Republic a couple of months ago, observes that

Che Guevara, who did so much (or was it so little?) to destroy capitalism, is now a quintessential capitalist brand. His likeness adorns mugs, hoodies, lighters, key chains, wallets, baseball caps, toques, bandannas, tank tops, club shirts, couture bags, denim jeans, herbal tea, and of course those omnipresent T-shirts with the photograph, taken by Alberto Korda, of the socialist heartthrob in his beret during the early years of the revolution, as Che happened to walk into the photographer’s viewfinder—and into the image that, thirty-eight years after his death, is still the logo of revolutionary (or is it capitalist?) chic. Sean O’Hagan claimed in The Observer that there is even a soap powder with the slogan “Che washes whiter.”

Something similar has happened with his major linguistic contribution, the "create two, three, many ___" snowclone. For him, it was a call to multiply the resistance movements that he admired, and you can find some similar uses these days, but all of the examples that I cited above are from right-of-center sources. Some of them warn about the proliferation of things viewed as bad ("Arafats", "North Koreas", "Kosovos") while others call for the proliferation of things viewed as good ("Fallujas", "Bake Sales"). In other cases, the authors' emotional stance is complex, though still hardly in tune with Che's point of view:

Where does this fragmentation leave the national military, including the United States Marine Corps? As we have seen in Lebanon, Yugoslavia, and elsewhere, when the nation fragments so do its military forces. We could end up with two, three, many Marine Corps: white Marine Corps, black Marine Corps, Christian Marine Corps, possibly even a gay Marine Corps. These fragments would compete with other organizations to provide the security that counts: security for the individual person, family, home, and neighborhood. In effect, the future Marine could be a rent-a-cop. [William S. Lind, Maj John F. Schmitt, and Col Gary I. Wilson, "Fourth Generation Warfare: Another Look", Marine Corps Gazette, Dec. 1994]

Obviously this phrase has not become the exclusive property of the political right -- Krugman is left of center. But his four uses in three years of NYT columns are all to warn against the proliferation of things he regards as bad: ill-advised wars, corporate accounting frauds, mismanaged federal agencies.

I'm far from the first to connect Che Guevara with another snowclone: "<something significant happened> and all I got was this lousy t-shirt". But I suspect that as he reminisces with Capaneus in the afterlife, Che is unhappier about what has happened to his catchphrase.

Posted by Mark Liberman at 08:24 AM

September 12, 2005

Leading with the chin

Because of recent events in Louisiana, I just re-read Robert Penn Warren's All the King's Men, which I first encountered in high school. I didn't remember much about it, beyond the standard plot summary, and the fact that at fifteen I found it wordy and boring. One of the problems with reading things when you're young is that you make mistakes like that.

This morning over coffee I caught up with Rob Balder's PartiallyClips, and was tickled by the 8/11/2005 strip:

One of the best things about Rob's work is the density of ideas in it. This strip evokes the sunk-cost fallacy, the dilemma sometimes posed by the social and personal costs of truth-telling, and the curious puzzle of the word asshat, which is not in the Oxford English Dictionary despite having 516,000 Google hits. You might expect that I'm about to lead you on a scholarly tour of that word's social and linguistic history, but I won't, except to observe that this version of the history is not true, though it's funny. Instead, I'm going to quote from All the King's Men, where the appeal and the peril of truth-telling are evoked in a wordier but more serious way.

The novel is about Willie Stark, said to be a fictionalized version of Huey Long. At the point where we pick up the narrative, Stark is an idealistic rural lawyer who's been manipulated into running for governor by some politicians who want to split their opponent's rural vote. The story is told by Jack Burden, who has walked away from his PhD dissertation in history to become a journalist. Stark's campaign is not going well, because he bores his audiences with laundry lists of policy-wonk prescriptions. He's telling Burden about his ideas in a hotel-room interview:

He must have noticed that I wasn't giving a damn. He shut up all of a sudden. He got up and walked across the floor and back, his head thrust forward and the forelock falling over his brow. He stopped in front of me. "Those things need doing, don't they?" he demanded.

"Sure," I said, and it was no lie.

"But they won't listen to it," he said. "God damn those bastards," he said, "they come out to hear a speaking then they won't listen to you. Not a word. They don't care. God damn 'em. They deserve to grabble in the dirt and get nothing for it but a dry gut-rumble. They won't listen."

"No," I agreed, "they won't."

"And I won't be Governor," he said, shortly. "And they'll deserve what they get." And added, "The bastards."

After some more back-and-forth:

"What are you going to do?" I asked.

"I got to think," he replied. "I don't know and I got to think. The bastards," he said, "if I could just make 'em listen."

Sadie Burke is a political operative assigned to look after Stark:

It was just at that time Sadie came in. Or rather, she knocked at the door, and I yelled, and she came in.

"Hello," she said, gave a quick look at the scene, and started toward us. Her eye was on the bottle of red-eye on my table. "How about some refreshment?" she said.

"All right," I replied, but apparently I didn't get the right amount of joviality into my tone. Or maybe she could tell something had been going on from the way the air smelled, and if anybody could do that it would be Sadie.

Anyway, she stopped in the middle of the floor and said, "What's up?"

I didn't answer right away, and she came across to the writing table, moving quick and nervous, the way she always did, inside of a shapeless shoddy-blue summer suit that she must have got by walking into a secondhand store and shutting her eyes and pointing and saying "I'll take that."

She reached down and took a cigarette out of my pack lying there and tapped it on the back of her knuckles and turned her hot lamps on me.

"Nothing," I said, "except that Willie here is saying how he's not going to be Governor."

She had the match lighted by the time I got the words out, but it never got to the cigarette. It stopped in mid-air.

"So you told him," she said, looking at me.

"The hell I did," I said. "I never tell anybody anything. I just listen."

She snapped the match out with a nasty snatch of her wrist and turned on Willie. "Who told you?" she demanded.

"Told me what?" Willie asked, looking up at her steady.

And Sadie realizes that she faces a dangerous choice, which the narrator explains in terms of communicative interaction viewed metaphorically as a brawl:

She saw that she had made her mistake. And it was not the kind of mistake for Sadie Burke to make. She made her way in the world up from the shack in the mud flat by always finding out what you knew and never letting you know what she knew. Her style was not to lead with the chin but with a neat length of lead pipe after you had stepped off balance. But she had led with the chin this time.

As usual, maybe the mistake is not entirely a mistake.

... floating around in the deep dark the idea of Willie and the idea of the thing Willie didn't know, like two bits of drift sucked down in an eddy to the bottom of the river to revolve slowly and blindly there in the dark. But there, all the time.

So, out of an assumption she had made, without knowing it, or a wish or a fear she didn't know she had, she led with her chin. And standing there, rolling that unlighted cigarette in her strong fingers, she knew it. The nickel was in the slot, and looking at Willie you could see the wheels and the cogs and the cherries and the lemons begin to spin inside the machine.

Now Sadie pushes the interaction along for a while by doing nothing:

"Told me what?" Willie said. Again.

"That you're not going to be Governor," she said, with a dash of easy levity, but she flashed me a look, the only S O S, I suppose, Sadie Burke ever sent out to anybody.

But it was her fudge and I let her cook it.

Willie kept on looking at her, waiting while she turned to the one side and uncorked my bottle and poured herself out a steady-er. She took it, and without any ladylike cough.

"Told me what?" Willie said.

She didn't answer. She just looked at him.

And looking right back at her, he said, in a voice like death and taxes, "Told me what?"

And then she gets to choose which letter to send, so to speak. Like Rob's middle manager, she decides for the one with the insults and the vulgarities and the literary allusions and the burning of bridges:

"God damn you!" she blazed at him then, and the glass rattled on the tray as she set it down without looking. "You God-damned sap!"

"All right," Willie said in the same voice, boring in like a boxer when the other fellow begins to swing wild. "What was it?"

"All right," she said, "all right, you sap, you've been framed!"

He looked at her steady for thirty seconds, and there wasn't a sound but the sound of his breathing. I was listening to it.

Then he said, "Framed?"

"And how!" Sadie said, and leaned toward him with what seemed to be a vindictive and triumphant intensity glittering in her eyes and ringing in her voice. "Oh, you decoy, you wooden-headed decoy, you let 'em! Oh, yeah, you let 'em, because you thought you were the little white lamb of God --" and she paused to give him a couple of pitiful derisive baa's, twisting her mouth -- "yeah, you thought you were the lamb of God, all right, but you know what you are?"

She waited as though for an answer, but he kept staring at her without a word.

"Well, you're the goat," she said. "You are the sacrificial goat. You are the ram in the bushes. You are a sap. For you let 'em. You didn't even get anything out of it. They'd have paid you to take the rap, but they didn't have to pay a sap like you. Oh, no, you were so full of yourself and hot air and how you are Jesus Christ, that all you wanted was a chance to stand on your hind legs and make a speech. My friends --" she twisted her mouth in a nasty, simpering mimicry -- "my friends, what this state needs is a good five-cent cigar. Oh, my God!" And she laughed with a kind of wild, artificial laugh, suddenly cut short.

Well, she goes on in that vein for some time.

Willie, previously a teetotaler, then drinks himself into a coma, and the next day, still drunk, gives a speech at a political BBQ that does indeed make 'em listen. He withdraws from the race, and begins the personal and political transformation that leads him later to the governorship as an unscrupulous populist demagogue.

So Sadie's speech was a sort of resignation letter -- it betrayed the political machine she worked for, and she pays for it when Willie makes her revelation public in his drunken oration the next day:

"Those fellows in the striped pants saw the hick and they took him in. They said how MacMurfee was a limber-back and a dead-head and how Joe Harrison was the tool of the city machine, and how they wanted that hick to step in and try to give some honest government. They told him that. But -- " Willie stopped, and lifted his right hand clutching the manuscript to high heaven -- "do you know who they were? They were Joe Harrison's hired hands and lickspittles and they wanted to get a hick to run to split MacMurfee's hick vote. Did I guess this? I did not. No, for I heard their sweet talk. And I wouldn't know the truth this minute if that woman right there --" and he pointed down at Sadie -- "if that woman right there --"

I nudged Sadie and said, "Sister, you are out of a job."

"-- if that fine woman right there hadn't been honest enough and decent enough to tell the foul truth which stinks in the nostrils of the Most High!"

But Sadie's speech was also a sort of letter of application, since it signed her up as the political architect of Willie's career.

It doesn't turn out well, for her or for anyone else. But like they say, read the whole thing.

Posted by Mark Liberman at 09:24 AM

September 10, 2005

Inveigling against the bed-wetters

Tony Snow, a Fox News Radio host, is joining Mark Helprin in unloading on Americans of all political persuasions, whom he calls "a large, vocal and potentially pestilential cadre of First Over-Responders":

Let's face it, the political left -- aided and abetted by Pat Buchanan and members of the bed-wetting right -- made utter fools of themselves.

Helprin was concerned that

the country has lost, as exemplified by the Left now out of power, a great deal of the will to self-preservation, and, as exemplified by the Right now in charge, not a little of its capacity for self-defense.

while Snow feels that things are OK, except that everyone is complaining too much. It pegs the irony meter for an American talk-show host to complain about people complaining about politicians, but this being Language Log, my interest in Snow's screed is purely linguistic.

Two things caught my eye. To start with, who are the "bed-wetting right", and in what sense are they incontinent? It's starting to worry me that I'm puzzled by so many of the metaphors thrown around in American politics these days.

Apparently Snow means the conservatives who have been complaining about incompetent cronies at FEMA, calling for Michael Brown to be fired, and so on. That would be pundits like Bill Kristol, Andrew Sullivan, Newt Gingrich, Michelle Malkin, Bob Novak; outlets like the Weekly Standard, and so on.

And I guess that being a bed-wetter means "delayed in developing the basic characteristics for moving from infancy to childhood", or "inappropriately infantile", or something like that. Thus the idea must be that a grown-up conservative would defend a Republican administration no matter what. Expecting executive competence is something to outgrow, like diapers. It's hard to believe Snow meant that -- but what else could it be?

Snow's second contribution to the linguistic highlights of the week was a classic malapropism, apparently brought on by an acute attack of bigwordism:

In time, virtually every Democratic panjandrum found some novel way to politicize the Atlantic typhoon. Sen. Hillary Rodham Clinton inveigled against the evils of Big Oil. Sen. Edward Kennedy ...

Having used panjandrum instead of leader, novel instead of new, typhoon instead of hurricane, Snow went that lexical bridge too far: dimly remembering the verb inveigh against meaning "attack", he slipped up and wrote inveigled against instead.

Perhaps there's some sort of preservation law operating here, since California recently changed its criminal law jury instructions so that kidnappers are no longer required to have "inveigled" their victims. Tony Snow's gaffe is evidence that this was the right thing to do.

[via Professor Bainbridge]

[Update: I once again demonstrated the validity of the Hartman/McKean Law of Prescriptivist Retaliation by mistyping "inveigling" as "iveigling" in the title of this post.

And John Cowan wrote to suggest that bedwetting incontinence might be a metaphor for senility rather than infancy. True enough, but why would senility make people inappropriately expect official competence? No, I think it's got to be an invitation to Andrew Sullivan et al. to grow up and get over it. ]

[Update #2: Arnold Zwicky points out that "inveigle against" has 110 Google hits (against 56,000 for "inveigh against", but still...) and that

possibly the best is from http://blog.outer-court.com/encyclopedia/word-5970/ (the Google Blogoscoped Encyclopedia entry for "damn"), which gives us:

Synonyms for Damn

Abuse, anathematize, attack, ban, banish, blaspheme, blast, cast out, castigate, censure, complain of, confound, convict, criticize, cry down, curse, cuss, darn, denounce, denunciate, doom, drat, excommunicate, excoriate, execrate, expel, flame, fulminate against, imprecate, inveigle against, jinx, object to, objurgate, pan, penalize, proscribe, punish, revile, sentence, slam, swear, thunder against, voodoo.

(notice that "inveigh against" is not even listed.)

According to the website, "This English encyclopedia and dictionary was built automatically using Google definitions & translations, Dictionary.com's Thesaurus, and Technorati."

Arnold's comment: "Oops".]

Posted by Mark Liberman at 02:21 PM

A five-letter password for a man obsessed with Susan

I owe my loyal readers some comments about the novel I dutifully read on the plane flight from San Francisco to Boston on September 1. I will provide the promised commentary in what follows. It isn't pretty.

You may remember that in my earlier post Don't look at their eyes! I mentioned that in preparing to read Dan Brown's Digital Fortress I was expecting a novel about cryptanalysis, probably one in which on the first page a renowned male expert at something dies a hideous death and straight away a renowned expert at something quite different gets a surprise call and has to take an unexpected plane flight and then face some 36 hours of astoundingly dangerous and exhausting adventures involving a good-looking (and of course expert) member of the opposite sex and when the two of them finally get access to a double bed she disrobes and tells him mischievously (almost minatorily) to prepare himself for strenuous sex. Well, a renowned male expert at something (cryptography) does indeed die a hideous death on the first page, and a renowned expert in something else (foreign languages) does indeed then get a surprise call to face an unexpected plane flight and some 36 hours of astoundingly dangerous and exhausting adventures (though he has to do it without the company of the good-looking (and of course expert) member of the opposite sex; there is one, but she stays three thousand miles away across the Atlantic).

There is of course a ghastly sadistic foreign hit man (an obligatory ingredient for a Dan Brown; this one is Portuguese, but otherwise just another mysterious cold-blooded death machine like the Arab in Angels and Demons). The happy couple do get access to a double bed at the end and do have strenuous sex, and there are some pretend threats from the female. So no real surprises at all.

In short, to call this novel formulaic is an insult to the beauty and diversity of formulae. The relation to the later Angels and Demons and The Da Vinci Code is one of what algebraists call homomorphism: the names have been changed, but the two books have the same structure, from the opening assassination down to the final pyrotechnic special effects.

The most striking thing about the book for any linguist will be that the male lead is a linguist (Robert Langdon, the corresponding charismatic academic in A&D and DVC, is an expert in religious symbolism). But Bill Poser has already commented about that right here on Language Log, quoting the most astonishing statement about the hero's academic life: "His university lectures on etymology and linguistics were standing room only, and he invariably stayed late to answer a barrage of questions. He spoke with authority and enthusiasm, apparently oblivious to the adoring gazes of his star-struck coeds." But there is no follow-up to this wishful thinking for academics.

If you are star-struck, and do want to attract the attention of some male linguistics professor, nothing would work better than a challenging intellectual puzzle involving linguistic material. And the truly depressing thing about Digital Fortress is that its research is so feeble and its puzzles are so stupid. Dan Brown literally does not know bits from bytes (he thinks an encoded message presented in groups of four letters separated by spaces can be called a "four-bit code"). He doesn't understand the difference between source code and compiled programs. He thinks there are 256 ASCII characters. His figures for time taken to break encryption keys on a parallel machine make no sense (the problem is exponential increase in difficulty, and you don't fix that by setting up some fixed number of processors to run in parallel). He thinks once a "virus" has been disabled in a "data bank" that it has crawled into, a chief technician has to shout "Upload the firewalls!" (he doesn't know the difference between loading a program into core and uploading a file from one computer to another). Just about everything he says about computers, processors ("titanium-strontium"??), data banks, viruses, algorithms, codes, ciphers, decryption, and everything else technical is nonsense (see the remarks by Robert M. Slade archived here).
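To make the point about parallelism concrete, here is a minimal sketch of the arithmetic behind brute-force key search. The processor count and the per-processor key rate are invented for illustration; they are not figures from the novel or from any real machine. The only thing the sketch relies on is that a key of n bits has 2^n possible values, so each extra bit doubles the work, while a fixed pool of processors only divides the work by a constant.

```python
def worst_case_seconds(key_bits: int, processors: int, keys_per_second: float) -> float:
    """Seconds needed to try every key of the given length,
    assuming the key space is split evenly across all processors."""
    return 2 ** key_bits / (processors * keys_per_second)


SECONDS_PER_YEAR = 365.25 * 24 * 3600
PROCESSORS = 3_000_000   # hypothetical size of the parallel machine
RATE = 1e9               # hypothetical keys tested per second, per processor

for bits in (40, 56, 64, 80, 128):
    years = worst_case_seconds(bits, PROCESSORS, RATE) / SECONDS_PER_YEAR
    print(f"{bits:3d}-bit key: {years:.3g} years to exhaust the key space")
```

Under these assumptions, multiplying the hardware by 1024 buys back exactly ten bits of key length and no more, which is why a fixed number of processors cannot keep pace with longer keys.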

The stupidity of the NSA staff in the novel is unbelievable. On page 408 a clue is unveiled to a vital "pass-key": PRIME DIFFERENCE BETWEEN ELEMENTS RESPONSIBLE FOR HIROSHIMA AND NAGASAKI. The answer is supposed to be the prime number 3, the arithmetical difference between 235 and 238. Hiroshima was destroyed with a U-235 bomb. There is another isotope of uranium, U-238. Dan Brown is very careful to build into the exposition a claim that "the Nagasaki bomb did not use plutonium but rather an artificially manufactured, neutron-saturated isotope of uranium 238." This is utter nonsense: as all sources confirm, Fat Man was a plutonium bomb with plutonium 239 as the crucial fissionable material. (Plutonium is made in a breeder reactor by enriching the stable and non-fissionable U-238 isotope of uranium with extra neutrons; that must be the source of the nonsense Dan supplies.) But just assume the false claim about U-238 for purposes of reasoning within the confines of the book's imaginary world. The fact is that the assembled eggheads spend half a dozen chapters debating what the sentence could possibly mean, looking at each other in bafflement and running through encyclopedias and following false leads as if none of them had even a high school knowledge of science.

At one point late in the book, Commander Strathmore, the immediate superior of female lead cryptanalyst Susan Fletcher, confesses that he is madly in love with her and has been so, madly and obsessively, for years, ever since he met her. To the chagrin of his wife, he is so in love with Susan Fletcher that he cries out for her in his sleep. And right after Susan learns about his love, in order to escape from (a) his groping her and (b) the imminent explosion of the NSA's giant computer in a ball of flame (its "titanium-strontium processors" having ignited), Susan has to guess the five-letter password that activates his secret personal elevator. Think about it. A five-letter password that might have been chosen by a man obsessed with love for a woman called S U S A N. And we are supposed to be in suspense about whether Susan will guess it in time.

I had hoped to have some more amusing stuff for you on inept use of language (other than the aforeblogged strange eyebrow stuff). But the problem with this novel is not that it is written badly. Its use of language seems to me slightly better than the more famous novels that followed it. The problem is that it is so utterly through-and-through top-to-bottom brainless. The fact that reviewers have praised this drivel for "realness" and being "masterful" and very close to "the truth" is really quite scary.

Posted by Geoffrey K. Pullum at 12:36 PM

September 09, 2005

Leave the Bushmen out of it

In today's WSJ Opinion Journal, Mark Helprin's lament "They Are All So Wrong" trashes the American political spectrum from one end to the other:

Our politics and policies have somehow been parceled out to opportunists like Michael Moore--purveyor of conspiracy theories and hatreds, whose presentation, unclean in every respect, is honored nonetheless by the controlling rump of Democrats--and to Bushmen like "Kip" Hawley of Homeland Security, father of the proposal to allow carry-on ice-picks, bows and arrows, and knives with blades up to five-inches long.

That's this Kip Hawley, now Assistant Secretary of Homeland Security for the Transportation Security Administration, and this proposal. So why is he a "Bushman", with a capital B? He certainly doesn't look like an aboriginal inhabitant of southwestern Africa, nor does he seem to be a dweller or traveler in the Australian bush.

When I read Helprin's essay, I thought it was obvious: Bushmen is metonymic for "savages". This usage would be a bit insensitive, to say the least. The AHD calls Bushman itself "offensive" even when used literally to refer to the ethnic group in question; though a few days ago, Language Hat linked to a piece by Steve Sailer on "The Name Game", which offers a different point of view:

The fashion of renaming the Bushmen of Southwestern Africa as the "San" exemplifies many of the problems with the name game. University of Utah anthropologist Henry Harpending, who has lived with the famous tongue-clicking hunter-gatherers said, "In the 1970s the name 'San' spread in Europe and America because it seemed to be politically correct, while 'Bushmen' sounded derogatory and sexist."

Unfortunately, the hunter-gatherers never actually had a collective name for themselves in any of their own languages. "San" was actually the insulting word that the herding Khoi people called the Bushmen. ("Khoi" is the term used by those who were labeled "Hottentots" by the Dutch. As you can probably guess by now, "Khoi" means "the real people.")

Harpending noted, "The problem was that in the Kalahari, 'San' has all the baggage that the 'N-word' has in America. Bushmen kids are graduating from school, reading the academic literature, and are outraged that we call them 'San.'"

"I knew very well," he said, "That one did not call someone a San to his face. I continued to use Bushman, and I was publicly corrected several times by the righteous. It quickly became a badge among Western academics: If you say 'San' and I say 'San,' then we signal each other that we are on the fashionable side, politically. It had nothing to do with respect. I think most politically correct talk follows these dynamics."

Some more discussion of the same sort is here, also suggesting that for many members of the ethnic group in question, "Bushman" is indeed the favored term.

But Helprin isn't suggesting that Hawley is literally one of the Bushmen of southwestern Africa. Language Hat suggested to me by email that Helprin meant this as a sort of punning reference to the followers of President Bush: the "Bushmen", get it? If that's what Helprin meant, it went right over my head -- instead, I took him to mean something like "uncivilized people who think that the way to solve problems is to use bows and knives, typified by hunter-gatherer bands like the Bushmen". This would be ungenerous, not to say offensive, given that the culture of the Bushmen/San/Khoe/Basarwa seems to be rather on the gentle side. Even if it's a joke -- "the Bush men are Bushmen, nudge nudge wink wink" -- it's an unkind one. Incompetent hunter-gatherers don't get promotions and presidential medals, they starve to death.

A traditional solution to the problem of finding a metonymic substitute for savage is to use the word Neandert(h)al. This has the advantage of referring to no living people (unless you're one of those who believes that modern Europeans are part Neandertal), but it has the disadvantage of being founded on an unjustified prejudice against people with brow ridges and weak chins.

A more rational solution would be to use the name of the American political group now most associated with interest in weapons, namely Republicans. To provide anatomical balance with his prior use of the phrase "rump of Democrats", perhaps Helprin should have referred to "Republicans like 'Kip' Hawley". Indeed, for enhanced anatomical and neurological parallelism, he might have contrasted the "numb rump of Democrats" who honor Michael Moore with the "brain-dead Republicans" who want to see hijackers and airline passengers fighting it out with bayonets and crossbows. Of course, I suggest this purely as a matter of abstract rhetorical balance, not to express any political opinion or any derogation of Mr. Hawley.

Posted by Mark Liberman at 02:15 PM

Inaugural Embedding

A couple of days ago, I observed that a secular trend towards shorter sentences and paragraphs can be seen in the inaugural addresses of U.S. presidents from 1789 to 2005. When you read the speeches, you can see some other trends as well. For example, it seems that sentences are not only shorter, but also "flatter" -- fewer layers of subordinated clauses.

The fourth sentence of George Washington's first inaugural is

In this conflict of emotions all I dare aver is that it has been my faithful study to collect my duty from a just appreciation of every circumstance by which it might be affected.

At 34 words, this sentence is not monstrously long -- at least compared to the 69-word sentence that precedes it and the 88-worder that follows. However, it's 34 words of hard linguistic slogging.

For modern readers, this is partly due to some rare or archaic usages: aver instead of "assert", study meaning "inclination, pursuit", collect meaning "infer, form a conclusion about", appreciation meaning "perception". The audience in every century must also untangle the interactions of quantifiers ("all", "every") with aspect ("has been") and mood ("might be") across three tensed clauses, an infinitive phrase, and several nominalizations ("conflict of emotions", "just appreciation of every circumstance").

There is also a striking stacking of clauses: Washington gives us a relative clause ("by which it might be affected") inside an infinitive clause ("to collect my duty from...") inside a content clause ("that it has been my faithful study..."). If we also include the nominalization "appreciation" as expressing another layer of propositional structure, we have something like

In this conflict of emotions all I dare aver is 
   [that it has been my faithful study 
      [to collect my duty from 
          [a just appreciation of every circumstance 
             [by which it might be affected. ]
          ] 
      ]
   ]

Skimming the inaugural addresses reinforces my earlier impression that this kind of embedding has decreased over time.

Unfortunately, English orthography doesn't mark clause boundaries, so it's not as easy to quantify this impression as in the case of sentence and paragraph length. Some hand annotation (or accurate automatic parsing) is required. I decided to annotate the boundaries of certain types of clauses in three texts -- George Washington's first inaugural address, Abraham Lincoln's second inaugural, and George W. Bush's second inaugural -- and use that as the basis for a rough-and-ready quantification of the distribution of degrees of embedding.

To make my job easier, I marked only finite subordinate clauses, not infinitive clauses or nominalizations of various sorts, and not main clauses strung together by coordinators like "and" and "but". Thus Washington's fourth sentence comes out this way:

0 In this conflict of emotions all I dare aver is
1    [ that it has been my faithful study to collect my duty from a just appreciation of every circumstance 
2        [ by which it might be affected. ] ]

In the left margin, I've indicated the level of embedding of the words in the line that follows.

One of the sentences from Lincoln's second inaugural comes out like this:

0 It may seem strange
1    [ that any men should dare to ask a just God's assistance in wringing their bread from the sweat of other men's faces, ]
0 but let us judge not, 
1    [ that we be not judged. ]

And here's an example of a sentence from Bush's second inaugural:

0 We do not accept the existence of permanent tyranny 
1     [ because we do not accept the possibility of permanent slavery. ]

I started from the text of each speech, added left and right square brackets according to these principles, and then wrote a little perl script to count the number of words at each level of embedding. The results:

        0             1            2            3          4          Mean Sentence Length
GW1     629 (44%)     554 (39%)    206 (14%)    36 (3%)    5 (<1%)    60
AL2     440 (63%)     222 (32%)    38 (5%)      0          0          26
GWB2    1842 (88%)    244 (12%)    4 (<1%)      0          0          22

And in graphical form:

I did the annotation very quickly, no doubt with several errors, so no one should take these numbers as more than a rough indication of how such an analysis would come out if done properly and carefully. Still, even this quick-and-dirty swipe at the problem provides some empirical support for the impression that syntactic complexity in English public discourse has decreased over the past couple of centuries. Will we go all the way to Homo hemingwayensis, or will linguistic fashions reverse course first?
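The counting step is simple enough to sketch. What follows is not the original perl script, which isn't reproduced here, but a minimal Python version of the same idea. It assumes that square brackets appear only as clause annotations and that a "word" is any whitespace-separated token. The function name is made up for illustration.

```python
from collections import Counter


def words_per_depth(annotated: str) -> Counter:
    """Count whitespace-separated words at each level of bracket embedding."""
    counts = Counter()
    depth = 0
    # Pad the brackets with spaces so they always split off as separate tokens.
    for token in annotated.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            depth += 1
        elif token == "]":
            if depth == 0:
                raise ValueError("unbalanced ']' in annotation")
            depth -= 1
        else:
            counts[depth] += 1
    if depth != 0:
        raise ValueError("unbalanced '[' in annotation")
    return counts


# Washington's fourth sentence, annotated for finite subordinate clauses only.
washington = (
    "In this conflict of emotions all I dare aver is "
    "[ that it has been my faithful study to collect my duty from "
    "a just appreciation of every circumstance "
    "[ by which it might be affected. ] ]"
)

counts = words_per_depth(washington)
total = sum(counts.values())
for level in sorted(counts):
    print(level, counts[level], f"({counts[level] / total:.0%})")
```

Run on Washington's fourth sentence alone, this puts 10 of the 34 words at level 0, 18 at level 1, and 6 at level 2. Raising an error on unbalanced brackets is a deliberate choice, since a stray bracket in a hand annotation would otherwise shift every later count without any warning.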

It's important to note that discourse (whether spoken or written) is full of structural relationships that are not explicitly signaled, either in syntactic structure or in other ways. For examples and discussion, see here, here and here. I don't know any psycholinguistic results that indicate whether it is harder or easier to understand material in which such relationships are left implicit, rather than being marked explicitly by syntactic subordination or other methods. My intuitive impression (for what little it is worth) is that the explicitly marked material may be, paradoxically, somewhat harder to understand.

And now for something completely different: a few words about the substance of these speeches. A fascinating discussion of the last century and a half of political rhetoric is now available in open-access form: Michael Silverstein's Talking politics: the substance of style from Abe to "W" (March 1, 2003. Prickly Paradigm Press, Chicago). Michael starts his pamphlet with an eloquent appreciation of Lincoln's style:

No doubt about it. Abraham Lincoln gets the prize among United States presidents for the sheer concentrated political power of his rhetoric. When he set his -- actual, own -- mind to preparing his text, he could come up with gems such as his Second Inaugural and, of course, his 272-word "Dedicatory Remarks" at Gettysburg. Even his extemporaneous public and private talk, transcribed, shows great verbal ability. Now Mr. Lincoln had no Yale or Harvard degree as a credential of his education. But he understood the aesthetic -- the style, if you will -- for summoning to his talk the deeply Christian yet rationalist aspirations of America's then four-score-and-seven-year-old polity. Striving to realize this complex style, he polished it and elaborated its contours. He embodied the style. So much so, that Lincoln's great later texts, like the late, great man himself, now belong to the ages. They form part of the liturgy of what Robert Bellah has termed America's "civil religion".

His evaluation of GWB's style is much less positive. He admits that W, with the help of others, achieves "scripted eloquence on memorial and commemorative days", but argues that

Language used in the expository mode, used to create argument and therefore, at its most successful, to become the instrument of reason and rationality, is clearly not one of Mr. Bush’s attributes. This is not Lincoln. This is not Kennedy. Neither Roosevelt. Whatever else we think of him, not Mr. Clinton. These were Presidents for whom language was both a renvoi, a hearkening back, to the experiences of literary imagination made concrete in words, and to systematic use of language for critical thought such as we do in science, in religion for narrative and theological investigation, etc. Whatever the field. Mr. Bush’s is a phrasebook notion of political “message”-language, straight out of anxious corporate standard, in which saying the right terms, with luck in a poetically perfect arrangement, is all the message there is.

However, it seems to me that W's second inaugural, apparently written in collaboration with Michael Gerson, contradicts this evaluation. It doesn't have the transcendent power of Lincoln's 1865 speech, but -- viewed dispassionately and apart from current partisan emotions -- it is surely a fine addition to the liturgy of America's civil religion. It strikes an internationalist version of some of the same chords that Lincoln sounded in a nationalist form. And these phrases are not empty ritualistic repetitions of message-fragments, but expressions of ideas that each of the speakers came through experience to believe deeply.

Washington's first inaugural address is certainly an example of language as an instrument of reason and rationality, but it strikes me as rhetorically inferior not only to Abe's address but also to W's. And this brings things back to the beginning: there is a secular trend in public discourse towards shorter, flatter sentences, but this in itself is not evidence of any decrease in compositional care or rhetorical power.

I agree with most of what John McWhorter has to say in his recent book about the loss of the art of oration. But John isn't claiming that we'll find the evidence, or the cure, by measuring sentence length or degree of clausal embedding. Those stylistic dimensions are sometimes correlated with the issues he raises, because some people see a connection between plain speaking and careless speaking, but the relationship is accidental at best.

Posted by Mark Liberman at 10:11 AM

September 08, 2005

Don't have to live like a refugee

I did a Fresh Air piece that airs today about the language that people were using to describe the Katrina disaster, including the flap over whether it was appropriate to describe the displaced people as refugees, a word that black leaders have objected to, even if others argue that it's the mot juste in the circumstances.

After I taped the piece, I got to wondering whether the media were using the word in a selective way. Turns out that they are — it's disproportionately used to describe poor blacks.

In Nexis wire service articles mentioning Katrina over the past week, articles containing evacuee outnumber those containing refugee by 56% to 44% (n=1522). But in contexts in which the words appear within 10 words of poor or black, refugee is favored by 68% to 32% (n=85). And in contexts in which the words appear within ten words of Astrodome, refugee is favored by 63% to 37% (n=461).
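For readers who want to try a similar comparison on their own collection of text, here is a rough sketch of a "within N words" count. It is only an approximation of the idea: it counts individual tokens in a plain string rather than articles, its tokenizer is deliberately crude, and how Nexis itself implements proximity search is not something it tries to reproduce. The function name, the word lists, and the sample sentence are all made up for illustration.

```python
import re


def proximity_counts(text, targets, context, window=10):
    """Count target words that occur within `window` words of any context word.

    `targets` maps each surface form to the label it should be counted under;
    `context` is a set of lowercase context words.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    context_positions = [i for i, tok in enumerate(tokens) if tok in context]
    counts = {label: 0 for label in set(targets.values())}
    for i, tok in enumerate(tokens):
        label = targets.get(tok)
        if label and any(abs(i - j) <= window for j in context_positions):
            counts[label] += 1
    return counts


# Hypothetical word lists and sample text, for illustration only.
TARGETS = {
    "refugee": "refugee", "refugees": "refugee",
    "evacuee": "evacuee", "evacuees": "evacuee",
}
sample = (
    "Poor families arrived at the shelter as refugees. "
    "Officials said every evacuee would be housed by the weekend."
)

result = proximity_counts(sample, TARGETS, {"poor", "black"})
print(result)
```

In the sample, "refugees" falls seven words after "Poor" and is counted, while "evacuee" falls eleven words after it and is not. The share of each label among the counted tokens then gives percentages of the kind reported above.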

Those disparities likely reflect the image of refugees as poor, bedraggled, and abandoned, which would make the word seem apt to describe the people getting off the buses at the Astrodome. That stereotype may be unfair and invidious in its own right, as George Rupp, the CEO of the International Rescue Committee, was saying this morning on WNYC's Brian Lehrer Show, where I was also a guest. But the way the press is using the word refugee now hardly does much to dispel the stereotype. And while there may be polemical reasons for advocates of the displaced to use the term, the way Woody Guthrie did in his song "Dust Bowl Refugee," that's hardly what the media are getting at when they use it, or what President Bush was thinking of when he objected to the use of the term the other day.

Posted by Geoff Nunberg at 12:18 PM

September 07, 2005

Complexity

Andrew Gelman at SMC&SS has asked "Are public utterances getting more complex?" He cites Steven Johnson's observations about the increasing complexity of TV dramas, and mentions the connection to the Flynn effect of worldwide rising IQ trends:

Several years ago, Seth Roberts, who told me about all this, had the idea of measuring changes in intelligence over time by looking at the complexity of newspapers and magazines. From a casual reading of Time magazine, etc., from 1950 or so, as compared to today, Seth had the impression that the articles had become more sophisticated.

Gelman's recent post quotes an email from Roberts, who in turn cites a NYT article that quotes a passage in John McWhorter's book ("Doing Our Own Thing: The Degradation of Language and Music and Why We Should, Like, Care").

On p. 45, John contrasted some pro-war rhetoric from Representative Charles Eaton in 1941:

"Mr. Speaker, yesterday against the roar of Japanese cannon in Hawaii our American people heard a trumpet call; a call to unity; a call to courage; a call to determination once and for all to wipe off of the earth this accursed monster of tyranny and slavery which is casting its black shadow over the hearts and homes of every land."

with a functionally similar speech by Senator Sam Brownback in 2002:

"And if we don't go at Iraq, that our effort in the war on terrorism dwindles down into an intelligence operation. We go at Iraq and it says to countries that support terrorists, there remain six in the world that are as our definition state sponsors of terrorists, you say to those countries: we are serious about terrorism, we're serious about you not supporting terrorism on your own soil."

Gelman quotes Seth Roberts as writing

Notice that what Brownback said is considerably more conceptually complex than what Eaton said, even though the number of words is about the same.

Now, John's comment about this contrast had mainly to do with formality rather than with complexity:

On paper we see a mess of fragments and run-ons, and a colloquial phrase like go at that congressmen in 1941 wouldn't have dreamed of using in a public statement. Actually, to give all due credit to Brownback, inflection, gesture and context made a thoroughly comprehensible speech -- not polished, but hardly of the sort that would leave you shaking your head. After all, remember the transcriptions of casual speech in the previous chapter! We can be sure that congressmen Martin and Eaton sounded at least something like them when smoking cigars after sessions. But the point is that they did not talk this way when making speeches.

As I understand John's point, it's precisely that casual speech is generally "more conceptually complex" (in some sense) than formal rhetoric is, not because its content is more elaborate, but because its presentation is more chaotic and less carefully considered.

Andrew Gelman further quotes Seth Roberts quoting Dick Hamming's observations about how informality in dress and self-presentation often led people to underestimate John Tukey, and he wonders what John McWhorter would make of all this:

Interesting thoughts. I assume this is the same John McWhorter who contributes to the cool Language Log website. I wonder what McWhorter think of Seth's comments on the complexity of public statements.

I certainly agree that there are interesting issues here, but I've gotten myself into a situation of excessive bloggish complexity, and I'll therefore refrain from trying to represent John McWhorter's views of Andrew Gelman's citation of Seth Roberts's presentation of Dick Hamming's ideas about John Tukey's dress code violations.

However, I will say something about Gelman's original question "Are public utterances getting more complex?", namely that I don't know what it means. There are about a half a dozen different questions mixed up here, I think.

Are we talking about the complexity of content, or the complexity of presentation? Does the phrase "conceptual complexity" refer to a property of the concepts being communicated, or to a property of the process of communication? Are we talking about difficult ideas, or difficult language? And in the arena of presentation, are we talking about the complexity of words, of syntactic relations among words and clauses, or of semantic or rhetorical structure? Is the explicit articulation of structure more or less complex than implicit presentation of the same structure?

What do we mean by "complex", anyhow? Are we talking about some sort of information-theoretical density -- the informativeness or surprisingness of someone's choices of words, phrase structures and ideas? Or is a more elaborate structure ipso facto more complex, even if it is almost entirely predictable?

And what does all this have to do with the original focus of John's jeremiad?

I'll leave it to Seth, Andrew and John to sort these questions out. However, I do have one small contribution to make to the historical study of American political rhetoric.

The 55 presidential inaugural addresses are available in a convenient form from Bartleby. Let's compare the first two paragraphs of the first one, delivered by George Washington in 1789:

Among the vicissitudes incident to life no event could have filled me with greater anxieties than that of which the notification was transmitted by your order, and received on the 14th day of the present month. On the one hand, I was summoned by my country, whose voice I can never hear but with veneration and love, from a retreat which I had chosen with the fondest predilection, and, in my flattering hopes, with an immutable decision, as the asylum of my declining years—a retreat which was rendered every day more necessary as well as more dear to me by the addition of habit to inclination, and of frequent interruptions in my health to the gradual waste committed on it by time. On the other hand, the magnitude and difficulty of the trust to which the voice of my country called me, being sufficient to awaken in the wisest and most experienced of her citizens a distrustful scrutiny into his qualifications, could not but overwhelm with despondence one who (inheriting inferior endowments from nature and unpracticed in the duties of civil administration) ought to be peculiarly conscious of his own deficiencies. In this conflict of emotions all I dare aver is that it has been my faithful study to collect my duty from a just appreciation of every circumstance by which it might be affected. All I dare hope is that if, in executing this task, I have been too much swayed by a grateful remembrance of former instances, or by an affectionate sensibility to this transcendent proof of the confidence of my fellow-citizens, and have thence too little consulted my incapacity as well as disinclination for the weighty and untried cares before me, my error will be palliated by the motives which mislead me, and its consequences be judged by my country with some share of the partiality in which they originated.

Such being the impressions under which I have, in obedience to the public summons, repaired to the present station, it would be peculiarly improper to omit in this first official act my fervent supplications to that Almighty Being who rules over the universe, who presides in the councils of nations, and whose providential aids can supply every human defect, that His benediction may consecrate to the liberties and happiness of the people of the United States a Government instituted by themselves for these essential purposes, and may enable every instrument employed in its administration to execute with success the functions allotted to his charge. In tendering this homage to the Great Author of every public and private good, I assure myself that it expresses your sentiments not less than my own, nor those of my fellow-citizens at large less than either. No people can be bound to acknowledge and adore the Invisible Hand which conducts the affairs of men more than those of the United States. Every step by which they have advanced to the character of an independent nation seems to have been distinguished by some token of providential agency; and in the important revolution just accomplished in the system of their united government the tranquil deliberations and voluntary consent of so many distinct communities from which the event has resulted can not be compared with the means by which most governments have been established without some return of pious gratitude, along with an humble anticipation of the future blessings which the past seem to presage. These reflections, arising out of the present crisis, have forced themselves too strongly on my mind to be suppressed. You will join with me, I trust, in thinking that there are none under the influence of which the proceedings of a new and free government can more auspiciously commence.

with the first two paragraphs of the most recent one, delivered by George W. Bush in 2005:

On this day, prescribed by law and marked by ceremony, we celebrate the durable wisdom of our Constitution, and recall the deep commitments that unite our country. I am grateful for the honor of this hour, mindful of the consequential times in which we live, and determined to fulfill the oath that I have sworn and you have witnessed.

At this second gathering, our duties are defined not by the words I use, but by the history we have seen together. For a half a century, America defended our own freedom by standing watch on distant borders. After the shipwreck of communism came years of relative quiet, years of repose, years of sabbatical—and then there came a day of fire.

I don't know about complexity, but there is an obvious difference in the length of paragraphs and sentences.

Cursed as I am with the habit of quantification, I wrote a little shell script to harvest the 55 inaugural addresses, extract the texts, and measure the mean and median length of sentences and paragraphs. Here are the results in graphical form:

There's a fair amount of local variation, not to say noise, and perhaps a non-monotonic trend in rhetorical style around 1900 -- did Henry James temporarily take over from Mark Twain? Overall, though, I'll take these graphs as prima facie evidence that whatever its complexity, political rhetoric in recent decades has used shorter sentences and shorter paragraphs than in the republic's first century or so. However, I believe that whatever is going on here, it's orthogonal to John's point in Doing Our Own Thing -- recent inaugural addresses are surely at least as carefully crafted as earlier ones were. In fact, I'll bet that recent addresses would count as much more carefully structured, by measures such as composition time per sentence, number of passes of revision, amount of testing of potential audience reactions, and so on. Furthermore, I suspect that the downward trend in sentence and paragraph length over the years is partly due to this increased focus on audience uptake, along with the changes in the effective audience for such addresses caused by media changes.
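The measurement itself is easy to reproduce. What follows is not the shell script mentioned above but a minimal Python sketch of the same idea, under some loud assumptions: paragraphs are separated by blank lines, a sentence ends at ".", "!" or "?" followed by whitespace (so abbreviations like "Mr." will be miscounted), and length is measured in whitespace-separated words. The function names are invented for the example.

```python
import re
from statistics import mean, median

def split_paragraphs(text):
    """Split on blank lines; assumes the text uses them as paragraph breaks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]

def length_stats(text):
    """Mean and median sentence and paragraph lengths, in words."""
    paragraphs = split_paragraphs(text)
    sentences = [s for p in paragraphs for s in split_sentences(p)]
    if not sentences:
        raise ValueError("no text to measure")
    para_lens = [len(p.split()) for p in paragraphs]
    sent_lens = [len(s.split()) for s in sentences]
    return {
        "mean_sentence": mean(sent_lens),
        "median_sentence": median(sent_lens),
        "mean_paragraph": mean(para_lens),
        "median_paragraph": median(para_lens),
    }
```

Running length_stats over each address in turn and plotting the four numbers against the year of delivery would give a rough counterpart of the graphs described here, though a different sentence splitter will shift the absolute values.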

[Some other relevant Language Log posts:

A tale of two media (4/30/2005)
Quit email, get smarter? (4/23/2005)
Generational changes: decline or progress? (4/20/2004)
Balm in Gilead (4/16/2004)
]

Posted by Mark Liberman at 05:34 AM

September 06, 2005

Call me... unpronounceable


Having come across the books Watch Your F*cking Language and English as a Second F*cking Language, Geoff Pullum takes another look at expressions that are "unspeakable" because their orthographic representations have "no possible out-loud reading", "no phonetic counterpart".  In these cases, it's the asterisk that is the source of the problem.  In the cases that Geoff looked at in an earlier posting, the film titles I <heart> Huckabees (where "<heart>" stands for a heart symbol, one that doesn't appear properly on all browsers) and What the #$*! Do We Know!?, it's the "<heart>" and the "#$*!", respectively.

At the risk of telling Geoff things that he already knows, I'm going to argue that the asterisk is by no means unpronounceable, and that the <heart> and <expletive suppressed> probably aren't either.  The crucial observation is that it's not the pronounceability of particular individual symbols that is at issue, but the existence of conventions linking orthographic representations to pronunciations, conventions that can be quite complex and abstract.  A close analogue to the representational scheme using the asterisk for a suppressed letter (as in f*cking and in the title of a Gregg Araki film, Totally F***ed Up) is the representational scheme using the period as a sign of suppressed letters (as in Ariz., Nev., Mr. Majestyk, and Mrs. Doubtfire).  The motives for suppression are different in the two schemes -- modesty and brevity, respectively -- but in both some character, which has no pronunciation on its own, serves as an instruction to the reader to supply some letter or letters, after which the whole expression will be pronounceable.


I'll start with the <heart> and <expletive suppressed> cases.  The <heart> began, I believe, on bumperstickers like I <heart> Weimaraners, and similar (printed) public expressions of opinions and sentiments.  In this context, the symbol was intended to stand for the word love, so that I <heart> Weimaraners would be pronounced like "I love Weimaraners".  The <heart> would be much like & read as "and", in, say, Pullum & Zwicky.  It's a bit more abstract than &, though, since the <heart> is read as "loves" in the appropriate context, as on the license plate of a colleague of mine: JRR <heart> AER, which is to be read as "[someone whose initials are] JRR loves [someone whose initials are] AER".  (The <heart> is one of the non-alphanumeric characters you can get on vanity plates in the state of California.)

This isn't especially complex, but the <heart> soon took on a life of its own.  People began pronouncing it like "heart" (or "hearts", in the appropriate context), as Geoff noted in his first posting on unspeakability.  There now appears to be a new verb heart.  In fact, the IMDB (Internet Movie Data Base) lists the movie I <heart> Huckabees as I Heart Huckabees.  But the title is pronounceable, either as "I Love Huckabees" or as "I Heart Huckabees", depending on which convention for pronunciation you happen to use.

On to <expletive suppressed>.  Here the problem is not that #$*! (and its kin) can't be realized in a pronunciation, but that it can't be uniquely realized; there are so many different expletives you could supply.  So people came up with a small set of conventional euphemistic readings for <expletive suppressed>: "bleep", "bleeping", "bleepity-bleep", "blankety-blank", and so on.  Of these, "bleep" seems to have pretty much won out, as (again) Geoff noted in his first posting.  And, indeed, the IMDB lists the movie What the #$*! Do We Know!? as What the Bleep Do We Know!?  So there now is a conventional way for pronouncing the name of the movie.

Once you start looking at characters that are, on their own, unpronounceable, you begin to realize how many of them there are.  There's the apostrophe, for instance -- in the negative inflectional suffix n't (can't), in reduced auxiliaries (it's), in proper names (D'Angelo, M'shelle N'degeocello), and elsewhere.  The main convention for the apostrophe is that, whatever other work it might be doing, it is ignored in pronunciation.  Then there are quotation marks, as in Jake "The Bronx Bull" LaMotta and Kim cried out, "Help me!", which are either ignored in pronunciation or serve as signs of a special prosody.  And asterisks used for emphasis in writing on the net -- Well, that's *your* opinion -- where the convention is that the material surrounded by a pair of asterisks is to be pronounced with a special prosody.  And, of course, commas, question marks, exclamation marks, and sentence-final periods.  And numerals used in representations of number words, as in the title of the movie '10' (yes, the single quotation marks are part of the title) -- a movie that is alphabetized in data bases under the letter T, for "ten" (which is, after all, how the title is pronounced).  And the dollar sign, read as "dollar" ($1) or "dollars" ($10), following the number word with the numeral representation it precedes.

The period is especially versatile.  Sometimes it's read as "dot", as in amazon.com "Amazon dot com".  Sometimes it serves as a sign that material is to be treated as an initialism, as in N.F.L. "en eff ell", in which case it has no pronunciation on its own.  Then there's the abbreviatory use of the period, to indicate that some letters have been suppressed in the orthographic representation that precedes it.  [There are, alas, two different conventions here, a mostly American one that uses a period across the board in abbreviations (as in both Ariz. and Mr.), and a mostly European one that uses a final period only when letters have been suppressed at the end of the orthographic word (as in Ariz., versus Mr).  I'll be sticking to the first system here.]  The reader supplies the needed letters, and then pronounces the result.  But the period itself has no pronunciation; it's only an instruction to retrieve the rest of an orthographic word.

At its outer edges, the system of abbreviations becomes grotesquely complex, with things like Mrs. read as "Missus" (though this spelling is virtually never used), lb. read as "pound", No. read as "number", exx. read as "examples", and Ms. read as /mIz/ (or some close variant), a pronunciation for which there is no conventional spelling.  And much more.

Within limits, such abbreviations are treated as if they were spelled out.  So, in Halliwell's Film and Video Guide 2002 (I'm a few years behind on the editions), Mr. Majestyk is alphabetized like "Mister", Mrs. Doubtfire like "Missus", and Dr. Dolittle like "Doctor".  I haven't been able to find out what Halliwell's does with Abel Ferrara's 1980 film, Ms. 45.
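The alphabetizing convention can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how any film guide actually builds its index; the expansion table and function name are assumptions, and Ms. is deliberately left out of the table, since it has no conventional spelled-out form to sort by.

```python
# Hypothetical expansion table: abbreviation -> the word it is read as.
EXPANSIONS = {"Mr.": "Mister", "Mrs.": "Missus", "Dr.": "Doctor"}

def spoken_sort_key(title):
    """Sort key that treats each known abbreviation as if it were spelled out."""
    words = [EXPANSIONS.get(word, word) for word in title.split()]
    return " ".join(words).lower()

titles = ["Mrs. Doubtfire", "Moonstruck", "Mr. Majestyk", "Missing", "Dr. No"]
by_pronunciation = sorted(titles, key=spoken_sort_key)
```

Sorted this way, Mrs. Doubtfire falls after Missing and before Mr. Majestyk, because "missus" precedes "mister"; a plain sort on the printed characters would instead put the two Mr./Mrs. titles after Moonstruck.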

Finally, back to the asterisk of avoidance (f***ed), and its variants, like the elliptical period (f...ed), hyphen (f---ed), and underscore (f_ _ _ ed).  These are actually a good bit more straightforward than the abbreviatory period.  Each avoidance character stands for only one letter, in its normal place in the orthographic word.  All the reader has to do is fill in the blanks and pronounce the result.  That's the convention for relating orthographic form and pronunciation.  Like the abbreviatory period, the avoidance characters have no pronunciation on their own, but do have one in context.
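The one-character-per-letter convention is simple enough to state as a pattern match. Here is a minimal Python sketch, assuming that the asterisk, period, hyphen, and underscore are the only avoidance characters and that none of them appears in the word in its ordinary function; the function name and the tiny word list are invented for the example, and spaced-out underscores are not handled.

```python
import re

# Assumed set of avoidance characters; each one stands for exactly one letter.
AVOIDANCE_CHARS = "*.-_"

def fill_in_candidates(masked, lexicon):
    """Return the words in `lexicon` that fit the masked spelling,
    letter for letter, with each avoidance character matching one letter."""
    pattern = "".join(
        "[a-z]" if ch in AVOIDANCE_CHARS else re.escape(ch)
        for ch in masked.lower()
    )
    matcher = re.compile(pattern)
    return [word for word in lexicon if matcher.fullmatch(word.lower())]

words = ["folded", "fussed", "fucked", "fixed"]
fits = fill_in_candidates("f***ed", words)
```

Here the pattern admits folded, fussed, and fucked but not fixed, which has too few letters. The orthography alone narrows the field without settling it; what settles it is the reader's knowledge of which words anyone would bother to conceal.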

Because of their taboo status, the orthographic words that avoidance characters conceal are rarely spelled out.  Totally F***ed Up is the listing in the IMDB, but it presumably would be alphabetized in between films titled Totally Folded Up and Totally Fussed Up, were there any such -- just where Totally Fucked Up would be.  Which is how the title of the film is pronounced.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:05 PM

Stupid bank transfer scam email

Here's a representative passage from a recent scam email message inviting me to share in $11.5 million waiting in a Burkina Faso bank account. Keep in mind that the idea is I'm expected to believe that this is from a trustworthy banker with international experience.

As it may interest you to know, I got your impressive information through a boy I know very well whom is expact in searching in Burkina-Faso chamber of foreign relations here in Ouagadougou Burkina-Faso but with the information I gave him, he did it but left you to well clear me with a clear information to be sure of you I will love to hear from you whether you or not. Meanwhile all the whole arrangement to put claim over this fund as the confide next of kin to the deceased, to get the required approval and transfer this money to a foreign account has been put in place and directives and needed information will be relayed to you as soon as you indicate your interest and willingness to assist us and also benefit your self to this great business opportunity.

Yes. I think I see. However, just in case there had been any French influence on the writing that might lessen intelligibility, I used a free web translation service to translate it into French and then used the service again to translate the result back into English.

The result was the following:

As it can interest you to know, I obtained your information impressive by a boy that I know very well that expact in looked for in the room of Burkina FASO of foreign relations here in Ouagadougou Burkina FASO but with information I gave it, it did it but you leaves for me clarifyes well with a clear information to be sure of you I will like to hear. During this time all entire arrangement to put the complaint by over these funds as the confides after family to the deceased, obtain the demanded approval and transfers this money to a foreign account was put to his place of the and the directives and necessary information you will be relayed immediately that you indicate that your interest and kindness helps us and takes advantage also your oneself.

I didn't find this quite helpful enough, so I ran it through an English-to-Dutch translator, and then translated the Dutch automatically back into English. What I got was this:

If the you can interest, I obtained your information to know that impressive is through a boy that I know quite good that expact in for in the room of Burkina FASO of foreign relations of this in Ouagadougou Burkina FASO looked, but with information I gave him, it did him, but you leave to which me good with a clear information clear up certain of you I to be will Hold to hear. During this time will all complete agreements the complaint through over this fund set as it after family to the passed away entrusts, obtain it demanded approval and brings this count to a foreign bill over became gezet to its place of it and the guidelines and necessary information that you immediately passed become will that you that your interest indicated and kindness helps us and takes advantage also your us.

That's better. Still a few problems, but I'm sure that after I've given the guy in Africa my bank account details, account number, and password, we'll be able to sort it out the details of for convenience of the clear.

Posted by Geoffrey K. Pullum at 09:56 AM

There's studies

A few days ago, I offered some evidence from web-corpus sociolinguistics that "there's" + <plural noun phrase> has become part of standard English. A couple of years ago, Jen Hay and Dani Schreier studied this phenomenon in the speech of New Zealanders, and concluded (as Jen wrote in email) that

It's so frequent in the contemporary speakers in our spoken corpus, that you might almost argue that "there are" is "non-standard" in speech.   It's not just "there's", though ... "there was + plural" is slightly more frequent than "there were" as well.

Jen added that

... you're certainly right about the untrustworthiness of people's judgements.   Most New Zealanders vehemently deny that they would use these forms.

Their work was published as Jennifer Hay and Daniel Schreier, "Reversing the trajectory of language change: Subject-verb agreement with be in New Zealand English". Language Variation and Change, 16 (2004) 209-235. A link to the journal issue -- for those with subscriptions -- is here, and an open-access preprint is here.

Their abstract:

This article examines the historical evolution of subject–verb concord in New Zealand English. We investigate the usage of the singular form of be with plural NP subjects (existentials and nonexistentials) over the past 150 years. The results demonstrate that the New Zealand English subject–verb concord system has undergone considerable reorganization during this time. Singular concord in nonexistentials occurred in early New Zealand English, but is now largely absent. In existentials, it steadily declined during the late 19th century, and then reversed this trajectory to become a well established feature of modern New Zealand English. Singular concord in New Zealand English existentials is now conditioned by a range of social and linguistic factors, and largely resembles other varieties in this respect.

Ben Zimmer pointed out by email that George W. Bush is one of those who uses "there was" + <plural noun phrase> , as in these examples:

1st Debate <http://www.debates.org/pages/trans2004a.html>:

But there was fortunately others beside himself who believed that we ought to take action.

2nd Debate <http://www.debates.org/pages/trans2004c.html>:

We all thought there was weapons there, Robin. My opponent thought there was weapons there.

Listen, there is 30 nations involved in Iraq, some 40 nations involved in Afghanistan.

I wasn't happy when we found out there wasn't weapons, and we've got an intelligence group together to figure out why.

Ben observes that W sometimes uses the singular present-tense copula in non-existential cases:

http://www.whitehouse.gov/news/releases/2002/08/20020829-7.html
"What is your ambitions?"

http://www.whitehouse.gov/news/releases/2003/02/20030224-7.html
"What is life choices about?"

I haven't checked the audio to see whether "is" was contracted or not in those examples. It's possible that the transcripts got it wrong -- remember that this all started because CNN un-contracted "there's" to "there is" in a quote from New Orleans mayor Ray Nagin -- but in this case, contraction wouldn't reduce the vernacular character of the quotes, and seems inappropriate on other grounds.

Posted by Mark Liberman at 07:49 AM

September 05, 2005

Someone like me, someone such as myself


The reflexive pronoun myself used non-reflexively (without an antecedent in its clause) has been derogated for well over a hundred years -- see MWDEU, which summarizes the critical literature as follows:

Two general statements can be made about what these critics say concerning myself: first, they do not like it, and second, they do not know why.  An index to their uncertainty can be found in this list of descriptors that they have variously attached to the practice: snobbish, unstylish, self-indulgent, self-conscious, old-fashioned, timorous, colloquial, informal, formal, nonstandard, incorrect, mistaken, literary, and unacceptable in formal written English.

Yet it is widespread in literary sources, and is "particularly popular" (MWDEU) after as, than, and like, as in this letter to the magazine Instinct (from Joseph Amodeo of Marlboro NY), September 2005, p. 16:

... I am writing to tell you how inspirational and uplifting your publication is for someone such as myself.

This is just the sort of use of myself that keeps making people's lists of Worst Errors in English Grammar, but it occurred to me that the variant "someone such as me" was inferior to Amodeo's "someone such as myself" -- but that "someone like me" would be better than "someone such as me".  I considered the possibility that, despite proscriptions, many speakers view myself as the upscale (fancier, more elegant) alternative to plain ol' me, in the same way that, thanks to instruction to replace like as a conjunction by as, they view such as as the upscale alternative to plain ol' like.  If so, there should be a concordance effect, with such as preferring myself and like preferring me.  And so there is, pretty spectacularly.


First, there's a clear preference for myself over me with such as.  In raw Google web hits:


                      ...myself     ...me       myself/me ratio
such as...            515,000       69,300      7.43

There's no easy way to compare like to such as in general, since searches on like myself and like me produce so many spurious hits.  But somewhat more constrained searches avoid this problem:


                      ...myself     ...me       myself/me ratio
people such as...     28,100        4,720       5.95
people like...        171,000       860,000     0.19
someone such as...    6,940         1,230       5.64
someone like...       49,000        432,000     0.11

Myself continues to dominate me with such as, by a factor of between 5 and 6.  The figures for like are just the reverse, with me dominating myself  by a factor of between 5 and 10.

So there certainly is a concordance effect.  You might want to argue with my interpretation in terms of an upscale/plain distinction, but the effect looks very robust.
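For readers who want to check the arithmetic, here is a small Python sketch that recomputes the myself/me ratios from the raw hit counts quoted above. The counts are the ones reported in this post; the variable and function names are just for illustration.

```python
# Raw Google hit counts from the post: frame -> (hits with "myself", hits with "me").
HITS = {
    "such as": (515_000, 69_300),
    "people such as": (28_100, 4_720),
    "people like": (171_000, 860_000),
    "someone such as": (6_940, 1_230),
    "someone like": (49_000, 432_000),
}

def myself_me_ratio(myself_hits, me_hits):
    """Ratio of 'myself' hits to 'me' hits for one frame."""
    return myself_hits / me_hits

ratios = {frame: myself_me_ratio(*counts) for frame, counts in HITS.items()}

# The concordance effect: "such as" frames favor myself (ratio above 1),
# while "like" frames favor me (ratio below 1).
such_as_favors_myself = all(r > 1 for f, r in ratios.items() if "such as" in f)
like_favors_me = all(r < 1 for f, r in ratios.items() if "like" in f)
```

Raw web hit counts are notoriously soft numbers, so the ratios are best read as orders of magnitude rather than precise measurements; the direction of the effect is what matters.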

The careful reader will have noted that I'm suggesting that myself as the object of a preposition is perceived, by many speakers, as more upscale than me despite proscriptions, while such as is perceived, by many speakers, as more upscale than like in part because of proscriptions.  That is, my interpretations look inconsistent.  In fact, I believe that the inconsistency is in the speakers, not in my explanations: there's no reason to think that people respond in the same way to every explicit instruction in grammar.  Instead, they'll shift towards whichever variant they believe to be more appropriate in the context, and such shifts will sometimes run counter to proscriptions and sometimes conform to them.

In the world of pronoun forms, nominative conjoined objects -- the between you and I sort of thing so disdained by critics of usage -- are seen by many speakers to be upscale, and they'll shift towards them when they're taking care with their language, as when a British biologist being interviewed on the PBS program Origins (seen 8/30/05) explains, "For you and I, that's not a very exciting diet", or when a woman in a Prairie Home Companion comedy skit (heard 8/27/05, rebroadcast from 9/04) tells us, "These are the good days for Jim and me -- or Jim and I, as I used to say when I went to college." I'm suggesting that non-reflexive uses of myself are much like this.

On the other hand, like has no elegance points on its own, and it's been contaminated by instruction designed to steer speakers away from things like "Winston tastes good like a cigarette should."  So people shift away from the preposition like, towards such as, when they're taking care with their language.

Different effects in different contexts.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 10:26 PM

Cartoonists on the grammatical front line


The comics are on the front line of grammatical correctness, according to a letter by Byrna Weir (of Rochester NY) in the summer 2005 issue of The Key Reporter (the newsletter of the Phi Beta Kappa Society), p. 15.  Weir thanks the newsletter for a brief piece on Eleanor Gould Packard,

who worked as a grammarian at [The New Yorker].  I appreciated her "pet language peeve" asking that the writer "always change 'they only did five things' to 'they did only five.' "

    This mistake appears frequently in newspapers.  A favorite section is the comics, where language usage varies.  Creators could provide a service to readers of all ages if they would have characters speak correctly.  On a recent Sunday Hagar the Horrible said, "We only had time to save our most prized possessions!"  But "Jump Start" always has correct language--and a character who is a member of the Grammar Police.  In "Brenda Starr," not only is the main character, a journalist, careful, but everyone speaks well.  In a recent strip she [said], "But whom can I trust?"

    Some will say the funnies will not sound "real" if the speech is correct.  If not, let us have reality-plus.

[The boldfacing is mine.]  Correct is correct, no matter what the context.   Correctness trumps reality.  Novelists please copy: if they come for the cartoonists today, they may come for you tomorrow.

Many sighs.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:46 PM

The continuing misrepresentation of FOXP2 effects

It was very nice that the change of address notification for our subscription to the world's finest news magazine worked so well: the August 27th issue of The Economist arrived in Santa Cruz and was available for reading on the flight to Boston (Barbara and I have moved to Massachusetts for the academic year 2005-2006, hence the extensive clearing and packing chores that account for my near-total absence from Language Log over the past several weeks), and the September 3rd issue arrived at the new address in Cambridge to make us feel we were truly home. After some preliminary set-up chores in our new apartment, I settled down to read the usually excellent technology column — but I was horrified. The Economist reports on the recent sequencing of the Pan troglodytes (chimpanzee) genome, a topic I turned to with some interest, and suddenly (bottom of page 70 to top of page 71) I read this:

And then there is FOXP2. This gene is known to be involved in language (people who lack a functional version of it cannot learn to speak).

Gaaaahhh! Not true. An absolute howler. Affected people do learn to speak. They are just very bad at it, and for mostly articulatory reasons — problems with fine-scale movements of the tongue. The instinct for language is there to a large extent, but execution is very clumsy. There has been some sensational press about the FOXP2 gene and the discovery of its location and effects over the past few years, but it has been long on sensation and short on factual accuracy. FOXP2 does not constitute a special gene controlling just linguistic capabilities. It is not a "grammar gene", though some have irresponsibly called it that.

Here's a passage from Alec MacAndrew's authoritative survey of the issue on The Evolution Pages. He starts by referring to "the KE family", the family whose inherited tendency to certain speech and other defects led to the discovery of the gene:

The KE family were brought to the attention of the scientific community in about 1990. Over three generations of this family, about half the family members suffer from a number of problems, the most obvious of which is severe difficulty in speaking, to such an extent that the speech of the affected people is largely unintelligible, and they are taught signs as a supplement to speech as children. It is a complicated condition including elements of impairment in speech articulation and other linguistic skills, and broader intellectual and physical problems. From the outset it seemed quite likely, from the pattern of inheritance, that the disorder is associated with a mutation in a single autosomal-dominant gene. It is rather surprising that such a diffuse condition should be linked to a single genetic defect, but it turned out to be so for reasons that we shall see later.

From the beginning, there has been a range of views in the professional scientific community with regard to whether the gene in question is a `language' or a `grammar' specific gene. Those disagreements continue in a somewhat abated form today.

The Disorder is not grammar or speech-specific

In 1995, Vargha-Khadem et al. published a paper investigating the phenotype of the disorder and showing quite clearly that it is not grammar or speech specific (1).  They tested affected and unaffected family members and concluded that the disorder had the following characteristics: defects in processing words according to grammatical rules; understanding of more complex sentence structure such as sentences with embedded relative clauses; inability to form intelligible speech; defects in the ability to move the mouth and face not associated with speaking (relative immobility of the lower face and mouth, particularly the upper lip); and significantly reduced IQ in the affected compared with the unaffected in both the verbal and the non-verbal domain.

This last finding, about IQ, has been swept under the carpet by some commentators (including Nicholas Wade in the New York Times (2)), who claim that since the ranges of IQ of the affected and unaffected overlap and furthermore that since some of the affected achieve scores above the population mean for non-verbal IQ, then the disorder does not include a general intellectual challenge and is therefore language specific. But, note that the mean of the affected non-verbal IQ is 86 (range 71 - 111) versus a mean IQ for the unaffected of 104 (range 84 - 119). Not only is the mean significantly different between affected and unaffected family members but three of the affected had non-verbal IQ scores below 85, which is the normal lower limit for classifying speech defects as `specific language impairment' (that is, any disorder that affects speech only and is not caused by more general cognitive problems). Note, however, that the KE disorder cannot be explained solely by a general cognitive deficiency, because it is present in individuals whose non-verbal IQ is close to or a little above the population average and because it is accompanied by deficiencies in motor control of the face and mouth.

So the gene does have some effect on ability to articulate speech, but has other correlates as well, including significant intelligence deficit. What's really interesting, though, comes later in the piece. MacAndrew says:

The key point, that all the popular reports missed, is that FOXP2 is a transcription factor — in other words it has the potential to affect the expression of an unknown, but potentially large number of other genes. No wonder the syndrome presents in such a diffuse way. We know now that a FOXP2 homologue is strongly expressed in the development of the mouse brain. So not only does it potentially affect many other genes, but it is known to be important in the development of the brain (by being strongly expressed in the brain of the mouse embryo). I expect that breaking FOXP2 in mice would result in some compromises to brain structure and function - an experiment that someone is sure to do.

But there's more. That was dated 2003. This summer (while Barbara and I were engrossed in house-cleaning and book-packing) there was a new development. MacAndrew adds this note to his post:

27th June 2005:  That experiment has just been reported.  (Shu et al, Altered ultrasonic vocalization in mice with a disruption in the FOXP2 gene (7)).  They report that: 'Disruption of both copies of the Foxp2 gene caused severe motor impairment, premature death, and an absence of ultrasonic vocalizations that are elicited when pups are removed from their mothers. Disruption of a single copy of the gene led to modest developmental delay but a significant alteration in ultrasonic vocalization in response to such separation. Learning and memory appear normal in the heterozygous animals. Cerebellar abnormalities were observed in mice with disruptions in Foxp2, with Purkinje cells particularly affected. Our findings support a role for Foxp2 in cerebellar development and in a developmental process that subsumes social communication functions in diverse organisms.'  

This is exactly as I predicted above, two years ago.

FOXP2 is not the mythical grammar gene, no matter what you may have read about it in popular accounts, especially those favoring the view that language is a genetically inherited attribute of our species. FOXP2 relates to motor routines of various sorts including some of those that are involved in speaking. But its effects are wide-ranging. One of the things the Vargha-Khadem paper reveals is that affected members of the KE family cannot even wipe their upper and lower lips with the tongue tip and put their tongue back in their mouth, so profound is the disruption of their tongue-movement motor routines. Licking your lips isn't grammar. As MacAndrew goes on to say in conclusion:

We should beware of popular reports of scientific discoveries: almost all the popular reports of FOXP2 claimed that it was the gene for language or even more ludicrously the gene for grammar - the truth is more complicated and far more interesting than that. There are many popular reports of scientific discoveries which are equally sensationalised.

No-one should imagine that the development of language relied exclusively on a single mutation in FOXP2. There are many other changes that enable speech. Not least of these are profound anatomical changes that make the human supralaryngeal pathway entirely different from any other mammal. The larynx has descended so that it provides a resonant column for speech (but, as an unfortunate side-effect, predisposes humans to choking on food). Also, the nasal cavity can be closed thus preventing vowels from being nasalised and thus increasing their comprehensibility. These changes cannot have happened over such a short period as 100,000 years. Furthermore the genetic basis for language will be found to involve many more genes that influence both cognitive and motor skills.

Human mind needs human cognition and human cognition relies on human speech. We cannot envisage humanness without the ability to think abstractly, but abstract thought requires language. This finding confirms that the molecular basis for the origin of human speech and, indeed, the human mind, is critical. Ultimately, we will find great insight from further unravelling the evolutionary roots of human speech - in contrast to Noam Chomsky's lack of interest in this subject.

Steven Pinker's view about FOXP2 is that the fixed human-specific mutations in the gene might enable fine oro-facial movements and so trigger the development of language.

My personal view is that the breaking of FOXP2 in the KE family is more likely to have caused a cognitive deficiency during development in those affected rather than a purely physical deficiency in oro-facial motor skills, and that these motor deficiencies are a secondary phenomenon, perhaps caused by lack of use.

It will not be easy to unravel the pathways by which language evolved in humans. If we are to have any hope of doing so, we will need close collaboration between linguists and biologists, who have, until recently, been rather suspicious of one another.

Here's to such collaboration.

Posted by Geoffrey K. Pullum at 11:12 AM

Illuding memory

Well, it happened again. I found another unexpected word in a book I'd read before. Last time, I relearned the tules from Ross Macdonald's Black Money. This time, it happened on p. 5 of his 1946 novel Trouble Follows Me.

The point of view is that of Sam Drake, a Navy lieutenant on leave in Honolulu during WWII:

I got some ice at the bar and went back to the table where the bottles were. I made a little celibate ceremony out of mixing and drinking a double highball. I concentrated on the good sharp clean taste of the whiskey and soda, the feel of the ice against my teeth, the cold wet glass in the circle of my fingers. Then the small expanding glow in my stomach, spreading from there through my body like a blob of dye in a beaker of water, finally working into my brain, warming and coloring my perceptions.

The first stages of drunkenness are delicate, illusive and altruistic, like the first stages of love. I became very pleased with the bright disorderly room, the merry drunken laughter, the sweet chiding clink of ice in glasses, the confusion of shoptalk and woman-talk, war and love.

Illusive, I thought. Is that a word? Well, yes, according to the dictionaries: it means "illusory", just as the context suggests. How did I miss this one up to now? Worse, Google has 441,000 hits for illusive, compared to 777,000 for illusory, so it looks like illusive is alive and well, and I just haven't been paying attention.

However, in this case the counts are misleading. Many of the top-ranked hits for illusive are parts of trade names like Illusive Entertainment, Illusive Records, Illusive Rapsody Studios, Illusive Media, and so on. In the case of such names, it's hard to tell whether the namer intends the "illusory" meaning, or a non-standard spelling of elusive, or both at once. And most of the other hits seem to be misspellings or misunderstandings of elusive, such as these headlines:

China's Illusive Billion Customers
Metroid film remain illusive
What Makes a Problem Real: Stalking the Illusive Meaning of Qualitative Differences in Gifted Education.
Liberia's Illusive Dream of Democracy
DNA Search for Illusive Iberian Lynx
Camera Catches Illusive Thief In Act
Further Carnage On the Road to Illusive Peace
Why Peace Remains Illusive in Mindanao
The Illusive "Higgs"

In fact, it's hard to find any web examples where illusive is used as the dictionaries say it should be. I had to look through more than 200 examples to find the title "Fibromyalgia: Elusive But Not Illusive", and another 120 to find "Are Scale Economies in Banking Elusive or Illusive?" (The elusive/illusive contrast in these examples is not what I was looking for -- I just wanted any reasonably clear example of illusive meaning "illusory".)

Still, it's obvious that I must have come across illusive a number of times, in Trouble Follows Me if nowhere else. I guess that some sort of inhibitory effect from elusive must have suppressed it.

Posted by Mark Liberman at 09:59 AM

September 04, 2005

No pain, no gain?


Even distinguished academics suffer on occasion from the Etymological Fallacy, according to which they believe the current meanings and uses of words can be illuminated -- perhaps even explained -- by their etymological sources.  Here are William Damon, Anne Colby, Kendall Bronk, and Thomas Ehrlich on "Passion & mastery in balance: toward good work in the professions", from Daedalus, summer 2005, p. 28-9:

All professionals must learn a formidable array of skills, habits, and understandings to master their fields.  But beyond this, to accomplish good work consistently, they must acquire a special orientation, a commitment to use their mastery to fulfill a mission that goes beyond the self.  It is the pursuit of a mission that inspires passion.  This does not mean that pursuing a mission is always pleasurable: we do not agree with the pop psychology view that equates meaningful work with fun.  Indeed, the etymological root of 'passion' is passe - or 'to suffer.'  We are aware that pursuing a noble mission is often painful.  Yet it is satisfying in a way that routinized, fill-the-hours work is not.  Good work is always mindful of its mission; and passion, whether painful or pleasurable, both energizes the mission and provides an enduring emotional reward that goes beyond pleasure or pain.


It's been a while since we looked at appeals to etymology here at Language Log Plaza, and the most recent discussion (you can start with Mark Liberman on hallucinate and work back from there) was about appeals to false etymologies.  Damon et al. are at least correctly tracking passion back to a Latin element meaning 'suffer' (though they have balled up the details -- see below).  Still, this appeal to etymology is a very silly idea.

Etymology is fascinating, but the whole point with passion (and many other words) is that things have changed.  If I tell you that modern English head is directly descended from OE heafod, which meant, well, 'head', no flash of illumination will descend on you.  Not much has changed.  But people get all excited about metaphorical and metonymical changes, missing the crucial point, that in such non-head-like cases, things really aren't what they once were.  Modern English nice can be traced back to a word meaning 'ignorant', and silly to a word meaning 'blessed, holy', but knowing that provides me with no insight into modern English.  Why should the historical connection of passion with suffering be any different?

A digression on some of the etymological details...  The relevant Latin element is the verb root pat- (stem pati:-) 'suffer pain', whose present participle lies behind modern English patient, both adjective and noun.  Both adjective and noun patient have lost the component of literal suffering, in the case of the noun in favor of the sense 'one who undergoes, is affected by, an action' -- so that in semantics it now denotes a participant role in events (one prototypically conveyed by direct objects, as in I moved the box and I admire Kim, but also sometimes by subjects, as in Kim got the Nobel Prize and Kim was given 500 dollars).

The past participle stem for pat- was pass-, from which an adjective stem passiv- was derived.  This is the source of the modern English passive, adjective and noun, both of which maintain the 'undergo' sense of pat-, but not the earlier 'suffer pain' sense.  Note that, like patient, passive has developed a special grammatical sense -- for one type of construction, as in Kim was given 500 dollars, in which the subject denotes the, yes, patient participant in an event.

Now we're up to the Latin noun that is the ancestor of modern English passion: passio: (stem passio:n-), built on pass- with the abstract-noun-deriving suffix -io:n- (still to be seen all over the place in modern English).  (I don't know where Damon et al. got their verb stem passe, but they should have checked the OED.)

The noun stem passio:n- originally would have meant 'suffering', and indeed passion is still used in this sense in the very specialized context of the sufferings of Jesus (The Passion of the Christ, passion play, etc.).  But early on -- the OED Online draft revision of 2005 lays out these changes in some detail -- it developed not only an 'undergoing' sense ('fact or condition of being acted upon') parallel to that of patient and passive, a sense that seems to have gone out of fashion some 500 years ago, but also a separate extended sense, a generalization from experiencing pain to experiencing any sort of intense feeling or emotion, especially love or sexual desire (His voice was husky with passion), or, in another direction, enthusiasm or zeal (a passion for astrology), or, in still another direction, anger or rage (a fit of passion).

The result of all this semantic radiation, generalization, and specialization is that modern English passion has a variety of senses -- among them, love or desire, enthusiasm or zeal, and anger or rage (all attested from the 16th century on) -- that are not directly connected to one another and have nothing in particular to do with suffering.  It might be that love hurts, and that "pursuing a noble mission is often painful", but insofar as these claims are true, they're observations about the human condition, not about the meanings or histories of words.

The persistence of the Etymological Fallacy among intellectuals is in some ways deeply puzzling.  When considering aspects of culture other than language -- practices, customs, attitudes, beliefs, values, and so on -- intellectuals tend to fix on things that have held constant through history, things parallel to the English word for 'head'.  Cultural historians, for instance, will tend to see certain modern American attitudes as rooted in, and continuous with, aspects of the early American experience, like beginning a new life in a strange land and having an apparently limitless frontier to settle.  Isolated survivals are fascinating, but they are not appealed to as ways of illuminating or explaining the present by the past.

But when it comes to language, intellectuals incline towards a kind of essentialism: words have an essential core of meaning (discoverable by examining their histories), which persists through time. Possibly what's going on here is that a lot of people are taking the connection between words and their referents to be a natural, rather than a conventional, one; there's a lot of word magic around.  If so, linguists have work to do getting out the news about the arbitrariness of the sign.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:09 PM

Wordsmiths


Recently I've come across the odd word wordsmith in both of its current meanings, 'writer' and 'expert on words':

1.  "Paradoxical wit and wisdom from history's greatest wordsmiths", subtitle of Mardy Grothe's Oxymoronica (HarperCollins, 2004), a compendium of oxymoronic quotations by writers through the ages.

2.  "It could be said of modern wordsmiths, perhaps as much as for any other group of writers, that they 'stand on the shoulders of giants.' ", beginning of Jeffrey Kacirk's acknowledgments for Informal English (Simon and Schuster, 2005), a collection of "curious words and phrases of North America".

So I've been musing about why wordsmith strikes people as an appropriate (fancy) label for writers, and also about the variety of things that wordsmiths in the second sense do.


Wordsmiths in sense 1 are like coppersmiths and silversmiths: they craft things from some material -- copper, silver, words.  (Tunesmith and songsmith aren't entirely parallel compounds, since in them the first element denotes the thing crafted rather than the material from which it is crafted.)  The image here is that a language is just a "big bag of words", as Geoff Pullum put it in an early Language Log posting (with reference back to a 2001 note that he and Barbara Scholz published in Nature); the craft of writing is then a matter of having a very big bag and of picking well from it.  Writers are word-slingers.

The problem with this way of looking at things is that it puts all the emphasis on words as the raw material, and totally disregards syntactic constructions as another kind of raw material.  Syntax becomes a matter of technique, not material.  This distorted view is part of the folk conceptualization of language, however.

On to sense 2, which seems to be -- the OED is of no help here, so I'm speculating -- a semantic extension, from one specific kind of expertise with words, artful word-slinging, to a more general expertise, applying to all sorts of people who collect and display information about words.  Not to morphologists, it seems, but to lexicographers (who do this sort of thing professionally) and all kinds of word fans (from what we might think of as semi-professionals to enthusiastic amateurs).   The product of this wordsmithery is an enormous range of publications about words (and idioms): scholarly reference works, advice literature, entertainment.  Everything from the OED, through usage dictionaries and compendia of often-confused words, self-improvement literature (like the Reader's Digest feature "It Pays to Increase Your Word Power" -- note its very American reference to benefits configured in economic terms), the website wordsmith.org ("We are a community of more than 600,000 linguaphiles in at least 200 countries"), with its A Word A Day feature, and what I think of as the "ooh, shiny" literature, like Kacirk's book (from which you will carry away virtually nothing that you could actually use).  Now-obsolete words, dialect vocabulary, vogue words, taboo vocabulary, foreign expressions borrowed into English, jargon, word and idiom histories -- all these, and more, are catalogued for us.  And most of this material is aimed at a general, not specialist, audience.

Turn with me now to syntax.  Where are the comparable compendia of syntactic constructions?  Almost entirely in the specialist literature: in the big reference grammars of English, in college textbooks, and the like.  The usage dictionaries are organized, insofar as possible, by reference to specific words; to find out about English relative clauses, for instance, you'll probably have to look under the various relativizing words (that, which, who).  In material for a general audience, it's pretty much all about words.

I study syntactic variation, I think it's really cool stuff, and I'd like to communicate my enthusiasm about it to a more general audience, but I haven't yet figured out how to pull that off.  Everybody wants to hear about words, words, words.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:22 PM

Modes of emphasis

The attempts to explain the delayed response to Katrina are inspiring bloggers to generate lots of examples of how emphatic speech is rendered in English text: repetitions, italics, boldface, capital letters, multiple marks of punctuation, and taboo words.

I cited one case yesterday. After the AP quoted DHS Secretary Michael Chertoff

“We were prepared for one catastrophe,” Chertoff said. “The second catastrophe, frankly, added a level of challenge that no one has seen before.”

Josh Marshall responded with italics and repetitions:

Clearly, clearly, the hurricane and the flood were part of the same natural disaster. This isn't like a tornado being followed up by an earthquake. The flooding is part of the hurricane.

This morning, CNN quotes FEMA director Michael Brown

"Saturday and Sunday, we thought it was a typical hurricane situation -- not to say it wasn't going to be bad, but that the water would drain away fairly quickly," Federal Emergency Management Agency Director Mike Brown said today. "Then the levees broke and (we had) this lawlessness. That almost stopped our efforts." ...
"Katrina was much larger than we expected," he said.

and Brendan Loy responded (in a post entitled "You have got to be kidding me")

No one -- NO ONE -- who knows anything about New Orleans's geography and topography and levee system would ever have thought for a single moment on Saturday and Sunday that Katrina, if it followed the predicted path, was going to be a "typical hurricane situation." Jesus Christ!! For how many years now has this article been out there?!? And this one? And many more like them? Did Michael Brown never read them? Was he not familiar with the science? Was FEMA's director unaware of what has been acknowledged for many years as the #1 most serious natural disaster threat in all of America?!?

If the braintrust running this country really thought that "the water would drain away fairly quickly" after a direct hit on New Orleans from a major hurricane, then my God, our country is run by the most absolutely incompetent bunch of nitwits imaginable. The city is below sea level. Once it floods, there is nowhere for the water to "drain away" to! Everyone knows this!!!

Wasn't the government studying this stuff? Didn't they have all sorts of "war games" and disaster drills, because they had been told by the scientists that the exact scenario that was being predicted on Saturday and Sunday would produce a catastrophic, months-long flood in New Orleans? So what the FUCK is Michael Brown talking about? (I'm sorry, I don't swear on my blog very often, but this is just absolutely fucking ridiculous.)

Actually, the emphasis has been growing for a while. On Friday 9/2/2005, Ed Oswald at WeatherBlog complained about the people who say that local officials should have called for an evacuation earlier:

Others say lets call for an evacuation on Thursday or Friday. That's ridiculous. In this day and age, you just can't call for evacuations 48 hours out period. Because if you're wrong, YOU DO MORE DAMAGE TO THE WHOLE SYSTEM, and next time people won't take the evacuation orders seriously.

Also on Friday, Andrew Sullivan posted a letter from a Las Vegas policeman that included passages like these:

Ask yourself this: What if Al-Qaeda blew up the levees instead of the hurricane? Would the response have been any different?

No. It wouldn't. That city flooded in a day. And if it were Las Vegas, I would have been in some operations center watching people try to decide who gets to starve to death and who gets to get on a bus to Los Angeles or Phoenix. And there would be no certainty that I'd be on that bus in time to protect my wife and kids.

But one thing sure would have been different.

They wouldn't have had a whole week to sort it out and know what's coming. They were supposed to KNOW this already. It will have been FOUR YEARS next weekend since someone probably said, "Hey, what if..."

And for that, the whole stack of them should be fired.

There's one typographical mode of emphasis that I haven't yet seen in the Katrina blame-blogging, though I'm sure it's out there somewhere: individual words presented as typographically separate sentences, like "The. Worst. Ever."

The use of repetitions and taboo words in text is pretty much a direct reflection of their use in speech. However, the other typographical modes of emphasis are at best an indirect way of alluding to the many ways of highlighting things in speech: greater loudness, slower speech rate, increased pitch range, harsher voice quality, off-setting pauses, hyperarticulation and so on. The vocal expressions of emphasis are generally natural rather than arbitrary -- we don't expect to find a culture where you emphasize things by mumbling -- but some of the naturalness is audience-related (because of increasing the perceptual salience of the emphasized material) and some of it is speaker-related (because of expressing things like level of physiological arousal and strong positive or negative emotion).

At least with respect to the audience-related aspects of emphasis, the typographical interventions can have their own life, independent of any attempt to represent spoken modes of emphasis. Increased font size, type face, color, spacing and so on have their own effects on the visual salience of the text presented. When I was growing up, the stereotype was that typographical indications of emphasis -- especially italics (or underlining) and multiple marks of punctuation -- were a feminine thing. It seems pretty clear that this has changed.

Anyhow, the discussions of Katrina are a rich source of examples of emphasis. I've mentioned the textual ones because reading Brendan Loy's blog this morning brought them forcefully to my attention, but there are no doubt plenty of spoken examples these days as well, in media coverage and in podcasting.

Posted by Mark Liberman at 07:52 AM

September 03, 2005

A parse for the ages

According to an AP story on a news conference today by Michael Chertoff, Secretary of Homeland Security:

“We were prepared for one catastrophe,” Chertoff said. “The second catastrophe, frankly, added a level of challenge that no one has seen before.”

Josh Marshall at Talking Points Memo comments: "this is truly a parse for the ages", adding

Clearly, clearly, the hurricane and the flood were part of the same natural disaster. This isn't like a tornado being followed up by an earthquake. The flooding is part of the hurricane.

This is a slightly different spin on the modern notion of parsing as extremely careful examination. In the examples we looked at earlier, "parsing" sometimes involved overly precise or overly literal interpretations, calculated to mislead people ("We obfuscate when we parse the meaning of ‘is’ or ‘name’ to cover our actions rather than illuminate them"), and sometimes meant nothing more than careful analysis ("The blogs were quick out of the box this morning in parsing President Bush's choice of John Roberts to fill the Supreme Court seat left vacant by Sandra Day O'Connor's retirement").

Instead, Marshall is using parse to mean "divide a phenomenon up into parts". He quotes a post by Jon Cohn

"Chertoff says this was a unique, unpredictable one-two punch -- of a hurricane *and* a flood from a breached levee -- that nobody anticipated."

and comments

I actually thought I heard him parse it into three events. But I was writing as I listened; and press reports bear out Jon's recollection.

This is a pretty common usage in scientific and technical writing, for example:

Why do people parse action streams into discrete events?
I have developed some good SQL and PHP scripts to parse tags into reportable events.
The event server will parse users’ requests into basic events and store them into its event database.
The basic idea is to parse the MIDI file into a list of events which are displayed in a buffer, one line per MIDI event.
parse a base-10 numeric string into a 64-bit numeric value

Some of these may even be literal rather than metaphorical examples of parsing, though it's clear on balance that we linguists have lost the trademark on the word parse -- it's become a generic product, used by all and sundry to refer to any sort of analytic activity at all.

As for what Chertoff actually said, the press conference that the AP was quoting is available from CSPAN, but I haven't seen any transcripts. So as a service to the parsing of parsing, I present some bits of it below, focusing on the "two catastrophe" aspect. (The whole press conference is more than 39 minutes long, so this is just a small part of what was said.)

Let me start by pointing out that the AP's quote is relatively accurate, as journalistic quotations go:

AP's version: We were prepared for one catastrophe. The second catastrophe, frankly, added a level of challenge that no one has seen before.

What Chertoff said: Uh in this case I think we were well prepared for one catastrophe. I think the second catastrophe, frankly, was a- was a- uh it added a level of challenge that no one had seen before.

My own reaction to the press conference is that the key question is not so much how many catastrophes there were, but whether the rapid flooding of New Orleans was what Chertoff calls a "reasonably foreseeable" consequence of a hurricane like Katrina.

At the start of the question period, Chertoff is asked

Uh two quick ones. One is why were military assets not brought in earlier? ((And)) the second question ((to)) follow-on is, given that catastrophic event planning has been a central plank of the federal government since nine eleven, why is something as basic as evacuation plans seem so chaotic?

Chertoff answers

Uh in this case um I- I will tell you that uh the way these catastrophes unfolded is unprecedented any- in anybody's experience. We had *two* catastrophes. We had a- a uh category four hurricane, that was followed the next day by the- really the collapse of a levee. Not merely a breach, though we had a number of breaches, but really the demolition of three hundred uh feet of the levee, which essentially turned uh New Orleans into a lake the day after the hurricane.

Uh I can't think of another incident, even the tsunami, which presented this combination of events. Uh it's as if the tsunami, we had to do the rescue while the water was still there in the tsunami. So under those circumstances, I think we uh have discovered over the last few days that with all the tremendous effort using the existing resources and traditional frameworks of the National Guard, um this- the uh unusual set of challenges of conducting a massive evacuation in the context of- of a still dangerous flood requires us to basically break the traditional model and create a new model, uh one for what you might call uh kind of an ultra-catastrophe. And that's one in which we are using the military, still within the framework of the law, to come in and really uh handle the evacuation, handle all of the associated elements, and that of course frees the National Guard up to do the security mission. So uh this is really one which I think um uh was breathtaking in it- in its surprise.

That comes to your second question. Uh there has been a lot of planning for catastrophes. I will tell you that the um there w- there has been speci- over the last few years some specific planning for the possibility of a uh significant hurricane in New Orleans, with a lot of rainfall, with water rising in the levees, and water overflowing the levees. And that is a uh a very catastrophic scenario, it's probably uh in- in itself it's considered one of the fifteen kind of great um template catastrophes that you plan for. And although the planning was not complete, a lot of work had been done.

But there were two problems here. First of all, um it's as if someone took that plan and dropped an atomic bomb simply to make it more difficult. We didn't merely have the overflow, we actually had the break in the wall. And I will- I will tell you that really- that ca- that perfect storm of combination of catastrophes um exceeded the foresight of the planners, um and maybe anybody's foresight.

To make matters worse, the storm itself um was uh unusual in its course. It began as a comparatively low power storm, it crossed Florida, it wasn't until comparatively late uh shortly before- a day, maybe a day and a half before landfall, that it became clear that this was be- gonna be a category four or five hurricane headed for the New Orleans area. Um in advance of that, recognizing the danger, the president um moved forward and declared uh states of emergency, which is a very unusual thing, in those states in the gulf. We began to preposition and move assets um as early as possible when we realized that that hurricane was coming in.

But with all that, um the body- the kind of- the knock-out punch that Mother Nature gave us was that breakdown of the levee and the swamping of New Orleans. And I have to tell you, I've now spent a fair amount of time talking to people who do this and nobody can come up in living memory with a pair of disasters like this. ((And)) by the way, on top of which we have the obliteration of a- of significant parts of the Mississippi gulf coast and a whole lot of associated uh problems with respect to infrastructure, ((of)) course the surrounding parishes in- of- of New Orleans are also under water. Uh this um perhaps reminds us that with all of our planning and our modern technology and our- our confidence in our ability to master nature, when Mother Nature really wants to strike at us um she is a very very tough opponent.

So, a little long winded but I think that answers your questions.

A minute or so later, he is asked

Uh Secretary, stepping back, given that the system doesn't seem to have been able to respond adequately to um uh an event that has been hypothesized and planned for, what confidence can the administration give the public that um DHS or the government is ready for a terrorist strike or a WMD strike that it cannot uh predict?

and answers

Well I- first of all I- I don't know that I agree ((uh is)) that it can't- we haven't responded to something that was hypothesized and planned for, I think the problem is we had two events that have been hypothesized that occurred simultaneously. Um and I guess that does- uh you know indicate that at some level, with all the planning and all the resources if a truly catastrophic event, if an ultra-catastrophe occurs, there's going to be some harmful fallout.

Um what we wanted to do is do the best we can to respond um to mitigate and to recover. Um we'd ideally like to do it perfectly, we'd ideally like to be able to do it instantly, um but uh what we need to do is get closer and closer to that ideal.

Uh in this case I think we were well prepared for one catastrophe. I think the second catastrophe, frankly, was a- was a- uh it added a level of challenge that no one had seen before. But I have to say I think that with that, um everybody has been for- performed magnificently in stepping up to this increased challenge uh reaching out for more assets improvising additional measures that allows us to deal with what nature has dealt to us.

A couple of minutes later, he's asked

Why- why should you not have anticipated both events coming ((and seen it ??))?

and answers

You know, if we had an atomic bomb on top of this, and you know I mean we could pile on catastrophes, um whenever you do a planning process, you have to deal with what is reasonably foreseeable.

It is true that you can sometimes have a combination of things that- that are reasonably foreseeable, but that combination is unreasonably foreseeable. You know, that's why they wrote- someone wrote a book called "The Perfect Storm".

Um the answer is that uh the planning that was done I think was- was although not completed was a very good plan for what was reasonably foreseeable. I think that- that this major breach, not merely an- an- an overflow but this major breach of the levee, um while something itself that might have been anticipated, coming together, I think was outside of the scope of what people reasonably foresaw.

I'm not even close to being in the disaster management business, but I certainly knew that a rapid and complete flooding of New Orleans as a result of a hurricane strike was a central feature of many published scenarios over the years, including this 2001 Scientific American article, which gave me the willies when I read it four years ago. That article doesn't discuss the exact mechanisms whereby a storm surge would flood the city, but it certainly predicts that it will happen. And while Katrina was heading across the gulf, I was riveted by dire predictions about how bad it was going to be, including predictions that the city would be rapidly and completely flooded by the storm surge. I don't know for sure whether anyone specifically predicted that levees would fail rather than merely being overtopped, but I expect that any competent civil engineer, if asked about the situation, would suggest this as a strong possibility, given that we're talking about concrete floodwalls whose foundations rest on dewatered soil.

And once we start getting into discussions about whether anyone "anticipated" that the levees would be "breached" -- and whether this matters or not to the stages of the flooding seen so far -- we're talking about another kind of parsing, one that involves making fine distinctions about the exact meaning of words.

Posted by Mark Liberman at 07:55 PM

September 02, 2005

The Great Tradition

Mark may well be right to suggest that Dryden was "the first poet to use the excretion of bodily wastes as a metaphor for the deprecated expression of ideas," but the association is implicit almost a century earlier in Montaigne's essay "De la vanité."

The essay begins:

Je ne puis tenir registre de ma vie, par mes actions : fortune les met trop bas : je le tiens par mes fantasies. Si ay-je veu un gentil-homme, qui ne communiquoit sa vie, que par les operations de son ventre : Vous voyiez chez luy, en montre, un ordre de bassins de sept ou huict jours : C'estoit son estude, ses discours : Tout autre propos luy puoit. Ce sont icy, un peu plus civilement, des excremens d'un vieil esprit : dur tantost, tantost lasche : et tousjours indigeste. Et quand seray-je à bout de representer une continuelle agitation et mutation de mes pensees, en quelque matiere qu'elles tombent, puisque Diomedes remplit six mille livres, du seul subject de la grammaire? Que doit produire le babil, puisque le begaiement et desnouement de la langue, estouffa le monde d'une si horrible charge de volumes? Tant de paroles, pour les paroles seules.

Here's the passage in Charles Cotton's rather stiff translation of 1877:

I can give no account of my life by my actions; fortune has placed them too low: I must do it by my fancies. And yet I have seen a gentleman who only communicated his life by the workings of his belly: you might see on his premises a show of a row of basins of seven or eight days' standing; it was his study, his discourse; all other talk stank in his nostrils. Here, but not so nauseous, are the excrements of an old mind, sometimes thick, sometimes thin, and always indigested. And when shall I have done representing the continual agitation and mutation of my thoughts, as they come into my head, seeing that Diomedes wrote six thousand books upon the sole subject of grammar? What, then, ought prating to produce, since prattling and the first beginning to speak, stuffed the world with such a horrible load of volumes? So many words for words only.

In other words, Montaigne seems to be saying, bullshit was the invention of grammarians.

(Note: A note in Cotton's translation explains that "It was not Diomedes, but Didymus the grammarian, who, as Seneca (Ep., 88) tells us, wrote four not six thousand books on questions of vain literature, which was the principal study of the ancient grammarian... But the number is probably exaggerated, and for books we should doubtless read pamphlets or essays.")

Posted by Geoff Nunberg at 07:01 PM

After dinner with those constitutional ladies in Iraq

There's an odd phrase in today's NYT story by Dexter Filkins "Ex-Rebel Kurd Savoring Victory in Iraq's Politics":

The old Kurdish guerrilla leader is savoring his most recent victory, won not on the field of battle but in the arid drawing rooms of Baghdad's constitutional convention.

I think of "drawing room" as an old-fashioned or regional name for certain rooms in private houses otherwise known as "living rooms" or "parlors". Am I missing a meaning, or is Filkins using the term in an odd way?

The OED offers some additional meanings related to Victorian-era sexual segregation, railroad cars and the presentation of ladies at court:

1. a. orig. A room to withdraw to, a private chamber attached to a more public room (see WITHDRAWING-ROOM); now, a room reserved for the reception of company, and to which the ladies withdraw from the dining-room after dinner.
b. The company assembled in a drawing-room.
c. U.S. Formerly, a section or carriage of a railway-train more luxurious or more private than usual. Also attrib.

2. A levee held in a drawing-room; a formal reception by a king, queen, or person of rank; that at which ladies are ‘presented’ at court.

But none of these offer any direct help in interpreting Filkins' phrase.

The American Heritage Dictionary doesn't help either, defining drawing room as

1. A large room in which guests are entertained. 2. A ceremonial reception. 3. A large private room on a railroad sleeping car.

The literal meanings in Merriam-Webster's 3rd International are also no help:

1 a archaic : a room to which one may retire for privacy or rest : CLOSET; especially : one adjacent to public apartments b obsolete : a room or apartment forming a private part of the suite of a person (as a king) living in state and being often the setting of various informal activities or gatherings c : a more or less formal reception room (as in a home or hotel); especially : the room to which ladies withdraw from the dining room — compare PARLOR, SITTING ROOM d : a private room on a railroad passenger car with three berths and an enclosed toilet
2 : a formal or ceremonial reception; especially : one that is an official function of a royal court *made her curtsy at the queen's last drawing room*

Current uses on the web suggest that "drawing room" is the preferred term in South Asian English (and maybe in some other places) for a private home's big room for socializing:

Neha also warns that while there may be no major conflicts, one should be prepared for unusual stuff like someone wanting to play cricket in the drawing room at three a.m while someone else wants to sleep. [Deccan Herald (link)]
In a drawing room almost the first question is, "What’s happening?" [The Asian Age, Karachi (link)]

But as far as I can tell from the news stories, the official meetings to discuss the Iraqi draft constitution have not been held in private homes, nor in railroad cars, nor in royal salons for presenting ladies, but rather in the "convention center" in Baghdad's Green Zone. For instance, an 8/28/2005 LA Times article says that

Inside the Baghdad Convention Center, where negotiations took place Saturday morning and afternoon, Shiite and Sunni Arab leaders conferred amicably, despite the apparent deadlock.

By now I imagine that you're sputtering "but it's a metaphor, you dope!" And the OED even tells us that "drawing room" can be

[u]sed allusively to qualify a version of a story, etc., fitted by its observance of the proprieties for the society of the drawing-room.

Likewise, Merriam-Webster offers

3 b drawing rooms plural : people of substance and position accustomed to formal living : polite society *the report of her elopement shocked the drawing rooms*

Why were the drawing rooms in particular the symbol of polite society? According to 19th-century dining etiquette for the upper classes,

It was customary for the hostess and ladies to retire to the adjoining drawing room at the end of the meal leaving the men to their own discussions and to drink and smoke. Later in the evening the men would rejoin the ladies in the drawing room for conversation and card games and tea would be dispensed.

So Filkins means to tell us that battle is to constitutional convention as men drinking and smoking after dinner is to women waiting for them in the drawing room. OK, I'm glad that's cleared up.

Now an exercise for the reader: why are those drawing rooms arid?

Posted by Mark Liberman at 03:32 PM

September 01, 2005

When "there's" isn't "there is"

In response to yesterday's post on Mayor Nagin's remark about how "there's way too many frickin' -- excuse me -- cooks", Arnold Zwicky pointed out by (slightly edited) email that

"there is" + <plural noun phrase> is indeed nonstandard (and somewhat more common in the south and south midlands than elsewhere, I believe -- I'm away from my sources on this today), but "there's" + <plural noun phrase> should really be characterized, in current English, as merely informal/colloquial, rather than nonstandard. Millions of people (like me) who wouldn't use "there is two people at the door" are entirely happy with "there's two people at the door". So the two versions differ not only in emphasis and/or formality, but also (for many of us) in standardness.

If Arnold is entirely happy with it, it must be standard, right? Seriously, I think I agree with him, but it might not be clear what we agree about. What does it mean to say that something is "informal" but "standard"? And how can we tell whether it's true?

Well, informal here means something like "Not formal or ceremonious; casual; ... more appropriate for use in the spoken language than in the written language". And nonstandard means something like "Associated with a language variety used by uneducated speakers or socially disfavored groups". So Arnold is saying that phrases like "there's too many cooks" are now commonly found in the speech of educated Americans from all strata of society, and also in casual kinds of writing like email, blog entries and informal essays. In contrast, a phrase like "there is too many cooks" remains the sort of thing that educated Americans from higher socioeconomic classes tend not to use, except as a self-conscious gesture towards a kind of language not natural to them.

To investigate this, we could do a sociolinguistic survey, either by asking people for their reactions (though these are often untrustworthy), or by looking at their actual patterns of usage. It's possible that someone has done this -- if I find out about it, I'll add a reference here. But as Yogi Berra once said, "you can observe a lot just by watching." And the indexed web makes it possible to do quite a bit of linguistic watching from the comfort of your computer.

Google News this morning gave me 36 examples of the string "there's several". All but four of these were in informal contexts like quotes, transcripts or blogs, and many if not most of them seem to come from the kinds of speakers whose usage defines what is "standard" in American English. For example:

...there's several downtown high rises that are windowless... [Peter Whoriskey, Washington Post reporter, in a PBS Newshour transcript]
There's several things.
[Colorado Congressman Joel Hefley, in a Western Skies interview transcript]
I mean, I think there's several reasons the presidents say they look at it.
[Gene Bartow, basketball coach (formerly UCLA, now UAB), quoted in the Memphis Commercial Appeal]
There's several things we would not have to do and it still would be a very nice place to live.
[Blake Robbins, vice president for work-force development at the Decatur-Morgan County Chamber of Commerce, quoted in The Decatur Daily]
There's several individual properties that have been recommended as eligible and a couple of districts
[Joni Jordan, grants coordinator for Conway, SC, quoted in the Myrtle Beach Sun News]
When you go into middle schools, maybe they have a few vending machines, but in high schools there's several walls of those big Coke and Pepsi machines
[Margo Wootan, director of nutritional policy at the Center for Science in the Public Interest, quoted in the NYT]

By comparison, Google News gives us only 2 genuine examples of "there is several", perhaps created by misguided copy editors uncontracting "there's".

This contractional contrast is not because journalistic sources normally contract "there is" 95% of the time. If we ask Google News this morning about "there is a difference", we get 721 hits, compared to 652 for "there's a difference", for a contraction rate of 652/(721+652) = 47%.
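
The arithmetic here is simple enough to check in a few lines of Python. This is only a sketch: the function name `contraction_rate` is made up for illustration, and the counts are the Google News figures quoted above.

```python
def contraction_rate(contracted: int, uncontracted: int) -> float:
    """Share of all hits (contracted + uncontracted) that use the contracted form."""
    return contracted / (contracted + uncontracted)


# Google News counts quoted in the post; the variable names are illustrative.
rate_difference = contraction_rate(652, 721)  # "there's a difference" vs. "there is a difference"
rate_several = contraction_rate(36, 2)        # "there's several" vs. "there is several"

print(f"{rate_difference:.0%}")  # 47%
print(f"{rate_several:.0%}")     # 95%
```

On these counts, "there's several" is contracted about twice as often as the ordinary singular case "there's a difference".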

So far, so good -- "there's several" occurs fairly often in current journalism, but overwhelmingly in quotations, transcripts or casual writing, while "there is several" is a great deal rarer. This is just what we expect if Arnold is right.

We can come at this from another direction, based on counts from the web as a whole. Contraction rates in the pattern "there is a ___ difference" are quite different depending on what adjective is plugged in. The counts suggest (unsurprisingly) that contraction in general is commoner in informal contexts:

 
Engine  Adjective     there's a ___ difference  there is a ___ difference  is/'s ratio
Google  big           154,000                   298,000                    1.98
Google  large         3,700                     35,600                     9.62
Google  considerable  224                       22,200                     99.1
Yahoo   big           879,000                   1,270,000                  1.44
Yahoo   large         1,630                     73,800                     45.3
Yahoo   considerable  598                       53,900                     90.1
MSN     big           111,172                   189,633                    1.71
MSN     large         805                       18,461                     22.9
MSN     considerable  434                       12,011                     27.7

(A side note: The three search engines all show the same trend, but the ratios as well as the counts are quite different in some cases. It would be interesting to try to figure out which of the obvious factors are most responsible for this: differences in what parts of the web are indexed; differences in how duplicates and "black hat" sites are screened out; differences in how hits are counted/estimated; etc.)

When we look at the pattern "there ___ several", limiting the choice to (contracted or uncontracted) is and are, we see that the plural is much more likely to be chosen than the singular, even in Google Groups (which indexes a large archive of newsgroup posts). The singular seems at least as likely to be chosen in journalistic sources as in the web at large:

Source         there's several  there is several  is/'s ratio  there're several  there are several  are/'re ratio  plural/singular ratio
Google         62,600           26,700            0.43         7,260             21,000,000         2,893          235
Yahoo          193,000          94,800            0.49         71,100            43,500,000         612            151
MSN            39,101           13,574            0.35         4,584             10,687,863         2,333          203
Google News    36               2                 0.06         2                 4,210              2,105          111
Google Groups  26,500           6,360             0.24         1,570             952,000            606            29

However, when the singular is chosen, it's much more likely to be in the contracted form, even compared to relatively informal sequences where singular agreement would be expected, like "there's a big difference". This effect appears to be somewhat larger in newsgroups, and overwhelming in the sources indexed by Google News.
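
For anyone who wants to recheck the ratios in the "there ___ several" counts, here is a rough Python sketch. The names `SeveralCounts` and `ratios` are invented for illustration. The formula for the plural/singular ratio is an assumption: the post does not define it, but the tabulated values are consistent with dividing all plural-verb hits (contracted plus uncontracted) by all singular-verb hits.

```python
from typing import NamedTuple, Tuple


class SeveralCounts(NamedTuple):
    """Hit counts for one source (field names are illustrative)."""
    s_contracted: int   # "there's several"
    s_full: int         # "there is several"
    pl_contracted: int  # "there're several"
    pl_full: int        # "there are several"


def ratios(c: SeveralCounts) -> Tuple[float, float, float]:
    """Return (is/'s ratio, are/'re ratio, plural/singular ratio)."""
    is_over_s = c.s_full / c.s_contracted
    are_over_re = c.pl_full / c.pl_contracted
    # Assumed definition: all plural-verb hits over all singular-verb hits.
    plural_over_singular = (c.pl_contracted + c.pl_full) / (c.s_contracted + c.s_full)
    return is_over_s, are_over_re, plural_over_singular


# Google News counts from the post.
google_news = SeveralCounts(s_contracted=36, s_full=2, pl_contracted=2, pl_full=4210)
is_s, are_re, pl_sg = ratios(google_news)
print(round(is_s, 2), round(are_re), round(pl_sg))  # 0.06 2105 111
```

The same function reproduces the Google Groups row (0.24, 606, 29) from its counts.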

Again, this is all consistent with the view that "there's" + <plural noun phrase> is now informal/colloquial in current (American?) English, but not nonstandard, while "there is" + <plural noun phrase> remains nonstandard.

This leaves me with many questions. Some are specific to this case: What's the geographical story? What's the history? Is there a similar situation for any other examples of "is" with plural subjects? Other questions are more general: what do patterns like this tell us about what linguistic knowledge is, and how it's learned and used?

It's clear that there are somewhat similar patterns with here and where:

Here's a few thoughts for Stuart James to ponder, if he's waiting by the phone for the Mayne board to ring and ask him to run the demerged global pharmaceuticals business. [Sydney Morning Herald (link)]
Where's all the stories about breaking curfew and carousing until all hours of the night? [The Journal News (upstate NY -- link)]

The example from Australia suggests that this is not just an American pattern. Note by the way that with where, main verb is seems to work differently from the progressive auxiliary is: for me, "where's the kids going?" seems worse than "where's the kids?", and "*where's they going?" seems more non-standard than the form with to be deleted ("where they going?").

I don't know much about the history. I'll observe for now that it seems to be an old story to use "is" and "are" promiscuously in there-sentences with plural logical subjects. A few minutes poking around turned up Randle Holme (1627-1699) "An Accademie of Armory OR A Store House of Armory & Blazon Containeing all thinges Borne in Coates of Armes Both Forraign and Domestick. With the termes of Art used in each Science", which is chock full of examples of both kinds. Here is a small sample of the many instances in this work of the string "there is several":

THE next is the Cross Moline, a Cross both in nature and shape far different from any as yet presented to your view, from which form there is several others derived, yet of a contrary term in Blazon, as in the examples following.

Small Water Yarrow groweth much after the same manner, having five or six joints in the stem, at each of them there is several fine small green and winged leaves ...

Heads or husks of Flowers, are those things, out of which Flowers grow, of which there is several shapes, forms and fashions ...

THere is several things belonging to an Horse, and Horsemanship, ...

They have several names according to their office, and imployment: and there is several mixt kinds of them, being Mungrells.

This is called an Ape, a Iack-an-Apes, of which there is several sorts· all of them being of a sad brown, or Mouse-colour.

and a sample of the roughly equal number of hits for "there are several":

There are several other ways of bearing of Pales, both charged and otherwise, but in regard they are in every respect answerable to the Bend ...

THere are several Crosses alike in shew, yet are different in name; and others are very near in Name, yet far different in shew and form ...

THERE are several other partitions of Feilds, by which the Coat of Arms is Blazoned, but in a more obscure way ...

WE shall in the next place proceed to things produced in the Element of Air, in which there are several and various Products, which are born in Coats of Arms, and are such as follow, with their like.

There are several sorts of stones besides these; for in strickness stones are no more then earth hardned, and the softest is called Greet or Grit, which being ground small becomes Sand; being more grosser or courser we call gravel.

Some of the variation is in very similar pairs of examples, e.g.

There is two sorts of Lillyes, the one groweth with turned leaves half round much after the nature of the turn Cap: the other with leaves upright after the Tulipa, but sharper at the stalk ...

Baths, are Springs of Water in which Sick and Infirm people wash and bath themselves: of which there are two sorts the hot Bath, and the Cold bath.

As for the psycholinguistic questions, they'll have to wait for another time.

Posted by Mark Liberman at 06:58 PM

Park it if you can

The sign on the right was found on the blog of Dave Barry, who attributes it to "Laura" with a citation to Christoblog. A good reason to avoid some driving, and donate what you save to disaster relief.

At least gas signs are still just three digits long.

Posted by Mark Liberman at 06:04 PM

Finding vs. looting

I definitely don't think Bill's use of exotic makes him a racist, under any reasonable definition of either "exotic" or "racist". If you want to see a better (and very topical) example of racism probably governing word choice, check out this juxtaposition of photos and captions from New Orleans.

[

Update, 9/2/2005. Not too surprisingly, several people have written in response to this post. I address here what each of these folks had to say in the order they wrote to me.


I first heard from Melissa Fox:

Man, that's harsh. (I notice that one is AP and one is AFP, so I take a very very very small bit of solace in the idea that it's not likely the same writer responsible for both captions.)

Three of the four folks who have written so far make this point, as you'll see below. I did notice this before I posted the link, and (perhaps wrongly) figured that this was rather obvious. It's true that the juxtaposed web-clippings are identically formatted -- they are both clipped from Yahoo! News -- but I think it is much clearer what their respective sources are (big AP and AFP logos above the captions) than it is that they were republished on Yahoo! News (very small-type mention, practically cut off the clipping, at the bottom of each).

I agree with Melissa about taking only "a very very very small bit of solace" from this fact. My point was not to point the finger at Yahoo! News or at any organization in particular (or to accuse anyone of malicious intent, as someone else suggests below); my personal interest is in the evidence of systemic racism, which I think is very well demonstrated by the photos/captions.

Melissa continues:

Possibly-interesting side note: I've been seeing the AP photo of the black kid in the water in the Washington Post in photo arrays over the past couple days, where the caption says he's wading through chest-deep water after 'exiting' a grocery store.

(follow-up in a later message ...)

(Just double checked it) -- it's not possible to link directly to pictures at the Washington Post, but from the front page (http://www.washingtonpost.com - though this requires harmless spam-free registration), if you click on "The Latest on New Orleans Exodus" and then choose the "Aftermath" tab, it's #14 in that album: "A young man walks through chest deep flood water after exiting a grocery store in New Orleans on Tuesday." Photo by Dave Martin of AP, black kid in a yellow shirt with a case of soda under one arm and towing a trash bag -- same shot, right?

Yes, it is. When I found the picture today, the link from the main page was labelled "Despair Amid the Ruins", but the rest is as Melissa describes it.


Next, a correspondent who has not yet responded to give me permission to cite them explicitly wrote:

Yahoo News republishes wire reports verbatim, and the two photographs you show came from two different news agencies.

This is the same point made by Melissa, as noted above.

Racism is a plausible explanation for the difference in captions; on the other hand, so is only one of the photographers witnessing the act, so I'd like to know how you determined the former to be more probable.

It's not calling the black person a looter that bothered me; it's choosing not to call the white people looters, but to say that they "found" the stuff they had instead. I don't immediately see how the "witnessing the act" explanation would account for this.

Here are the AFP's waders and some more "Looters":

http://news.yahoo.com/photo/050830/photos_ts_afp/050830071810_shxwaoma_photo1
http://news.yahoo.com/news?tmpl=story&u=/050831/photos_wl_afp/050831100019_zhpvj5dd_photo3

Hey, those people are all... in front of a store!

Follow the first link, and you'll note that the "photo was removed from Yahoo! News at the request of AFP" and that there's a link to a "Yahoo! News statement on photo language controversy."

I don't know what to say about the correspondent's rhetorical comment on the second photo/caption. Again, my problem is not specifically with the choice of the word "looters", but with the distinction between "looting" and "finding", as the title of my post says explicitly, and with the fact that the former is attributed to blacks and the latter to whites.

Finally, this same correspondent provides a contrast between two AP photos/captions on Yahoo! News (ignoring the "[i]f it's racism you want" quip):

If it's racism you want, the contrast between these AP photos is more convincing:

http://news.yahoo.com/photo/050830/480/ladm10208301530
http://news.yahoo.com/news?tmpl=story&u=/050831/480/wxs11908310010

The first photo/caption is the original "looter" (again, with a link to the Yahoo! News statement on photo language controversy) and the second is a shot of a white person "look[ing] through their shopping bag" and a black person "jump[ing] through a broken window" outside a convenience store, and neither of them is called a "looter".

Is this a better contrast than the original, though? They both originate with AP, but it's entirely possible that they came from different photographers/caption writers. I can easily imagine having posted a link to a juxtaposition of these two photos and having someone write to me making this (to me, completely irrelevant) point.


Chris at Mixing Memory writes to point out something similar to what Melissa noted about the Washington Post's caption of the same "looter" picture, but this time about CNN:

Hi, I read your post on "finding" vs. "looting." I was amazed at the language, as well. Racism is rarely that overt, though it pervades the media and popular culture all the same. Anyway, I thought I'd note that CNN, which picked up the photo of the young black man carrying groceries from the AP, took the word "looting" out. Their caption simply reads, "A young man drags groceries through chest-deep water in New Orleans on Tuesday." I was happy that at least one major news organization avoided explicitly racist portrayals.

CNN's pictures are here: http://www.cnn.com/2005/WEATHER/09/01/katrina.glance.9.1.ap/index.html (scroll down to "Related," and click on the "After Katrina" gallery).


Finally, another correspondent who has yet to give me permission to cite them explicitly writes to point out the fact that the photos and captions are from different sources:

I saw your recent posting on the excellent Language Log in which you commented on the finding vs. looting graphic that's been making its rounds over the past couple of days.

It *is* an interesting observation, and one worth drawing attention to, but it should also be qualified that the two stories are from two different news agencies (one is AP, one is AFP), and thus it's possible that the language chosen has more to do with individual reporters' or editors' writing style than with any prejudicial agenda.

I do think it's important to be on the lookout for discriminatory bias in language (which often is subtle in its expression); I also think one should be wary of exaggerating cases where there's inconclusive evidence of malicious intent.

Again, I wasn't accusing anyone of malicious intent. Unfortunately, this is the real problem with discussions about racism: pointing out probable evidence of systemic racism in society is confused with an accusation of intentional racism by an individual (or well-defined group).

]

Posted by Eric Bakovic at 04:00 PM

Political Correctness Beyond Thunderdome

One of the most extreme examples of political correctness gone wild that I have encountered came a while back when I was accused of racism for using the word exotic. Here is the passage in which I used it:

With the exception of Cree and Saulteaux, which have rather simple sound systems, the First Nations languages of British Columbia contain quite a few sounds that are exotic, and without instruction and practice, unpronounceable, from the point of view of the English speaker.

This passage appears near the top of page 15 of my paper The Names of the First Nations Languages of British Columbia.

I submit that this is a perfectly correct, innocuous, and appropriate use of the term. The sounds in question (ejectives, lateral fricatives and affricates, uvular stops and fricatives, and pharyngeals, among others) may indeed be considered as:

being or from or characteristic of another place or part of the world
(from the point of view of English-speaking, and indeed Chinese- and Punjabi-speaking, Canadians) and they are also, from the same perspective:
strikingly strange or unusual
which are the two definitions provided by WordNet. The American Heritage Dictionary gives almost identical definitions, along with a third sense, "of or involving striptease", that is not relevant.

The paper in question has, moreover, been read by any number of First Nations people (that's the politically correct term for Indians in Canada), among them at least three lawyers, several prominent politicians, and several close friends, who would not hesitate to point out anything they found objectionable. There hasn't been a peep out of any of them.

So, what is behind this? Does the author of the complaint simply not know the meaning of the word? That's hard to believe. Much as I'd love to, I won't embarrass this twit by naming him, but he is a full professor (not of Linguistics, thank goodness) at a major Canadian university, an editor of a respectable if not particularly illustrious journal, and a native speaker of English. It is very difficult to believe that he managed to read the passage as meaning that I consider these sounds to be associated with striptease and am thereby casting aspersions on them and the speakers of the languages that contain them, or that in his variety of English the term has some radically different meaning and that he is unaware that this differs from that in general use.

It is also worth noting that I did not say that I myself consider these sounds to be exotic. I made a claim, one utterly uncontroversial among linguists and easily confirmed, that most British Columbians find these sounds exotic. To me, with considerable exposure to these sounds, and in fact able, albeit less fluently than I would like, to speak a language containing them, these sounds are not exotic (except perhaps for those damnably difficult pharyngeals, which we haven't got in Carrier). Since when is the one who reports them responsible for the attitudes of others?

I can't say for sure where such a silly idea comes from, but I strongly suspect that it is the result of a mind befuddled by postmodernism, particularly in its "postcolonialist" strain. There is a large literature, of which the most prominent work is probably the late Edward Said's book Orientalism, that condemns virtually all "Western" analysis of "Oriental" and other non-"Western" societies as hopelessly distorted by self-interest, hidden agendas, and fear and inability to understand "the other". In this milieu, the word exotic has a special, negative connotation: it is what the evil and misguided "Orientalists" think of "the other". For adherents of the postcolonialist cult it may be impossible to disassociate the normal use and meaning of exotic from the one it has in their bizarre world.

Posted by Bill Poser at 01:20 AM