August 31, 2006

Arabic T-Shirt Grammar

The grammatical controversy as to how to say "I am not a terrorist" in Arabic to which Ben Zimmer refers, is, I think, not so much about what the correct grammar is as what sort of Arabic to use. As I understand it (and Arabic is not exactly my best language), it is definitely the case that a noun such as "terrorist" should be in the accusative case when it is the predicate nominal in a negative predicate. This rule, however, is a rule of Classical Arabic: the modern colloquials do not have the three-way case distinction of Classical Arabic and therefore have no such rule. What the debate is really about, then, is whether T-shirt slogans should be in the classicizing "Modern Standard Arabic" or in a more colloquial variety.

Addendum: Readers with greater knowledge of Arabic than mine have explained that the issues of what is correct grammar and which variety of Arabic to use are intertwined, because the verb lastu "I am not" is used only in Modern Standard Arabic. It is not found in any of the colloquial varieties. Therefore, if you use lastu you also need to put "terrorist" into the accusative case. If you don't use the accusative case of "terrorist", you must be writing in a colloquial variety of Arabic and will therefore use a different negative construction, one that does not use the verb form lastu.

Posted by Bill Poser at 10:19 AM

Kahunas v. Cojones

The Editor at Blawg Review has noted an interesting usage on Kevin O'Keefe's blawg (or is it a meta-blawg?) Real Lawyers have Blogs. In a post under the title "New legal tabloid an idea long overdue", O'Keefe writes that:

It's going to take some big kahuna's to publish those cease and desist letters and demands for retractions. But they be some of the best posts.

O'Keefe probably meant to invoke the border-Spanish euphemism for manly assertiveness, "big cojones", rather than the surfing-derived term for boss or expert, "big kahuna". The Blawg Review suggests that this substitution is an eggcorn. It seems to me that the verdict is not clear -- perhaps O'Keefe is just confused about how to spell "cojones"; or maybe he's confused about what "kahuna" means; or maybe this was just a malapropism of the kind that afflicts all of us from time to time, when an unintended word comes out in place of one that sounds similar.

A brief Google search shows that O'Keefe is far from the first one to make the cojones → kahunas substitution:

You got some big kahunas taking on Pollux!
These guys must have some big kahunas to operate these trucks in these environments.
One of the areas is a very long uphill section where you have to have some big Kahunas to go all the way up over the hill (you can't see the other side) at full throttle and down into an off camber downhill left turn.
Took some big kahunas to actually go through with it, but it worked out well for them in the end.
The Roxio folks have some big kahunas calling it an "upgrade".
A big thankyou to WindWarrior cameraman Mark 'Willy' Williams who's got kahunas the size of coconuts to get out in the water with his camera and then get sailors to jump over him.

For some people, apparently, a Hawaiian word for "priest" has ended up as an English euphemism for "testicles".

Perhaps O'Keefe deserves to be memorialized in the eggcorn database. But what he really has to worry about is getting hacked up by Lynne Truss for that superfluous apostrophe.

[Update -- Ben Zimmer writes:

Note that since the 1980s "big kahuna" has meant, as HDAS defines it, "a large or important thing or person." I'd say there's at least some eggcornification going on here, since it's plausible to think that big kahunas are invested with big cojones.

kahuna n. [< Hawaiian 'priest or wise man']
a.  Surfing. an expert surfer.—often constr. with big. [...]
b.  Orig. Hawaii. an expert of any sort. [...]

2.  a large or important thing or person.—often constr. with big.
E. Spencer Macho Man 177: I am a witness for those big Kahunas, the B-52's.  1991 N.Y. Newsday (Feb. 7) ("City Living") 83: To this big kahuna, all things tiki really are quite chic-y.  1993 Frasier (NBC-TV): This is for television! The big kahuna!  1996 L.A. Times (Nov. 4) A12: To surrender their critical thinking and personal autonomy to the will of the big kahuna. ]


[Update #2 -- Jim Gordon points out the blend {"big cahones"}. And then there's {"big kahones"}, {"big cohunas"}, {"big cujones"}, {"big kohanes"}, and doubtless many others. ]


Posted by Mark Liberman at 08:41 AM

Cucumber cows

I've learned from Hugo Quené that late summer, which in English-language journalism is called the "silly season" and in German is called "Sommerloch" (= "summer hole"), is known as "komkommertijd" (= "cucumber time") in Dutch. That's the basis for the cucumber slices in this picture, which adorns an item on Noorderlog, the weblog of the Dutch science news site Noorderlicht, posted on August 29 under the title "Komkommerkoeien" (= "cucumber cows"). The cow part of the picture comes from Noorderlog's earlier post, "Koeiendialect" (= "Cow dialect"), which had credulously passed along the BBC's reproduction of a cheese company's press release.

It seems that Hugo sent Noorderlicht a link to my post "It's always silly season in the (BBC) science section" (8/26/2006), and they found it persuasive enough to call in the staff photographer, slice up a cucumber and look around for a cow. (I like the visual reference to cucumber facials.) So far, the BBC has not updated its coverage of the cow dialect story, except to offer a link to a BBC Radio Five piece that compares and contrasts the moos of cows from Somerset, Essex, Norfolk, and the Midlands.

Memo to Ashley Highfield: ... oh, never mind.

[Update -- Jarek Weckwerth writes:

Re your post about the Dutch term for "silly season": in Polish, it's known as "sezon ogórkowy" ("cucumber season"). Just the other day, I was in a bar where they had a party to celebrate the end of the cucumber season, with heaps of cucumbers all over the place. Proved a good way to generate some interaction among the punters. ]


Posted by Mark Liberman at 07:52 AM


An addendum to Bill Poser's post about the fellow who couldn't fly out of JFK because he was wearing a T-shirt with an Arabic slogan... The wearer of the T-shirt was Raed Jarrar, the Iraqi Project Director for the human rights group Global Exchange, and the slogan in question was "We will not be silent" (لن نصمت), popular among opponents of U.S. policy in Iraq and elsewhere in the Middle East. There's a picture of Jarrar wearing the shirt here, and an image of a similar T-shirt accompanies this BBC report. Note that the shirt has both Arabic and English versions of the slogan on it, so it's not like the airline officials had no hints as to what the mysterious squiggles meant.

In protest of the incident, BoingBoing reports, Tim Murtaugh is selling a T-shirt that reads "I am not a terrorist" in Arabic: انا لست إرهابي (ana lastu irhaabi). There's been some nitpicking over the grammar — some say it should properly be انا لست إرهابياً since irhaabi 'terrorist' ought to be in the accusative after the verb lastu 'am not', unless irhaabi is intended as an adjective ("I am not terroristic"?). A good effort, in any case — Murtaugh even offers a female variant of the shirt.

This all reminds me of another example of American anxiety in the presence of written Arabic, involving the Nobel Prize-winning Egyptian novelist Naguib Mahfouz, who died on Wednesday at the age of 94. As noted by Lameen Souag on his excellent blog Jabal al-Lughat, Edward Said tried to convince a New York publisher to put out English translations of Mahfouz's great Cairo Trilogy back in 1980 (before he won the Nobel). The publisher demurred, telling Said that "Arabic was a controversial language." Sadly, it remains controversial, at least at our nation's airports.

Posted by Benjamin Zimmer at 07:48 AM

Terrorism and the magical power of words

According to press reports, a man wearing a T-shirt with an Arabic slogan on it was denied boarding on a flight out of Kennedy airport recently. One of the officials reportedly told him: "Going to an airport with a T-shirt in Arabic script is like going to a bank and wearing a T-shirt that says, 'I'm a robber'." It isn't clear, apparently, whether the culprits were airline staff or TSA staff.

This raises concerns about freedom of speech, of course, and about the competence of the people in charge of airport security (hint: your better terrorists don't advertise their occupation), but there is also a linguistic curiosity here. What, exactly, did they think they were protecting against? The slogan was certainly not a weapon. If he were a terrorist, wearing the T-shirt would not have assisted him in his task. It's true that Arabs figure prominently in the terrorism game, so it may make sense to pay particular attention to Arabs, but if that were the point, they wouldn't have denied him boarding, they would just have selected him for extra scrutiny. It is remotely possible that they thought that he was so powerful and dangerous that even without any weapons he was a threat, but in that case they surely would not have allowed him to board once he covered up the T-shirt, which is what they did. Assuming that they weren't engaged in simple harassment, which is a possibility, the only sense that I can make of this is that the officials concerned attributed to the words some sort of magical power that could be contained by covering them up. There have been societies in which people held such beliefs, but I wasn't aware that the United States in the 21st century was among them.

Posted by Bill Poser at 03:02 AM

August 30, 2006

Free books

Danny Sullivan at Search Engine Watch has an informative post about the new option to download pre-1923 (public-domain) books from Google Book Search.

Posted by Mark Liberman at 11:19 AM

Oh sleepies!

What do you call the crusts of dried mucus that you sometimes have to rub or wash out of the corners of your eyes when you wake up? The dialect map that Bert Vaux collected for this item shows the variants sleep, sleepers, sleepies, sleepy bugs, sleepy dust, sleepy seed, eye crud, and several others, without suggesting much in the way of geographical regularity. It looks like Americans have a lot of idiosyncratic and mostly childish names for this substance. But none of us, as far as I know, pronounce any of these names in order to express frustration, annoyance, exasperation or pain.

In Finnish, I've recently been told, the word for sleepies is rähmä, and when something fails in a frustrating way, you can exclaim "voi rähmä!", where voi is an exclamation similar in force to English oh. I should add that, according to my Finnish correspondent, this expression "is pretty strongly associated with a local long-time celebrity who tended to use it in a TV show". And an online Finnish-English dictionary offers somewhat more generic glosses for rähmä, like "discharge" and "secretion". Not that muttering "oh secretion!" seems a whole lot more satisfying, as a way to discharge frustration, than "oh sleepies!" is.

This came up because I was quoted a couple of days ago in a Philadelphia Inquirer column by Faye Flam ("Why are sex words our worst swearwords?", 8/28/2006), repeating something that I'd been told a long time ago by another Finn, namely that Finnish cuss words have to do with religion rather than sex. Since my knowledge of such aspects of Finnish is limited to these rather casual memories, it's lucky that what Faye actually quoted me as saying is apparently not completely false:

You can't employ Finnish sexual words to swear, he says, since it would come out something like "Oh, intercourse!"

According to my anonymous Finnish correspondent:

This is pretty much true for words about sex or intercourse, but not about sexual organs. The hands down most common Finnish swear word vittu translates as "vagina", although the way it's used corresponds very well to English fuck. If something or someone is unpleasant, he or it is vittumainen, "vagina-like". If you run your mouth at someone, trying to provoke or embarass, the verb is vittuilla.

You can also call a person a mulkku, which translates as "penis". Kyrpä is also a rather uncivilized word for penis and it can be used to refer to an unpleasant person or when cursing out loud: "voi kyrpä!" (voi = oh). You can also blurt out "voi perse" (= "oh arse"). Someone who's an asshole in English would be vittupää (vagina-head) or kusipää (urine-head) in Finnish, etc. Then you have creative stuff like "voi vitun viikset" (= "oh vagina's moustache") and so on. If someone has "a penis on his forehead" ("kyrpä otsassa"), he's very disgruntled indeed.

The possible source for this misunderstanding is that apart from vittu, which many younger people use as a comma, most of the commonly used swear words in Finnish are indeed about devil, hell or similar religious affairs. So, there is a distinct register of strong swear words that are not sexual. Apparently the word vittu was originally about animistic magic and it was used to call up the magical power of women or the female genitalia. The idea of a male using it as a swear word was apparently rather ridiculous.

Not as ridiculous as cussing about sleepies, in my opinion. Of course, the whole cussing phenomenon is faintly ridiculous, when viewed in the light of reason. Anyhow, I feel that I got off easy in my role as a self-appointed expert on Finnish cussing, compared to Bill Bryson. My anonymous Finnish correspondent explained:

When talking about Finnish language with foreign people, you very often find out that for some weird reason someone in the group knows one or two Finnish swear words. This makes a certain legendary misunderstanding about Finnish swearing rather amusing. Some devious person fooled the author Bill Bryson to think that there's only one swear word in the Finnish language: ravintolassa, which means "in a restaurant".

Bill Bryson's gullibility and carelessness are on display in his book The Mother Tongue, about which one Amazon reviewer writes:

... as many others have pointed out, every page is just error after factual error. Bryson simply does not understand how languages work, and whatever his sources are are frequently wrong. My favorite mistake is when he claims that in Finnish, there is only one swear word, ravintolassa, meaning "in the restaurant" (page 214). Now, ravintolassa DOES mean "in the restaurant," but that's ALL it means. Finnish has plenty of native swear words (saatana, perkele, vittu, jumalauta, and more), and I still cannot imagine how Bryson came to the conclusion that, not only did it have only one, but that it was the word for "in the restaurant." It's truly mind-boggling.

[Of the four Finnish cuss words cited, three are religious: saatana = "satan", perkele = "traditional Finnish thunder god" (currently also a name for the devil), jumalauta = "God help". The wikipedia article on perkele asserts that

The term also has the role of realizing and strengthening the Finnish national identity. It is a typical Finnish masculine curse word, used to appeal to Finns as a rural attitude in which trouble is faced and conquered with determination and direct action. This has also inspired to the today quite commonly used (originally Swedish) expression "Management by perkele" to describe the often somewhat stern attitude among Finnish chief executives.

The following comment was removed from the same entry as being "stringly [strongly?] POV":

'Perkele Satan' is a common expression used to expres piss offedness. However, this is just a pure anger expression. I am finnish and knew nothing about thunder gods and swedish priests and crap adopting this and turning our wonderful gods into satanistic worshipping people. So, I don't quite see how you can put so much history and stuff into a simple word that is really only the finnish equal of 'God Damnit!!'. God damn the people who turned this fine finnish expression into the material of a dictionary.

This deserves to be preserved as an example of "lexicography by perkele". ]

I suspect that the spectacular "in a restaurant" blunder reveals something about Finnish deadpan humor as well as something about Bryson's scholarship, so perhaps we should reserve judgment about that whole "penis on his forehead" thing, pending further lexicographical research. But blindly trusting that my anonymous Finnish correspondent is not a "devious person", we can continue with the Finnish cussing lessons:

... there are no words about sexual intercourse that correspond functionally with the English word fuck. There are several widely used vulgar expressions for sexual intercourse which are at least somewhat demeaning and impolite. They correspond with such English expressions as "screw", "bone" etc. Those kinds of expressions are not something you'd use in polite situation or around older people you don't know, but depending on the relationship, you can use them playfully with your girl or boyfriend or spouse. You can't really swear or curse with them, though. If you tried, you'd pretty much end up saying something weird like "oh, screwing", unless you got creative. Well - you can call a person a "wanker" in the same way as in English, if that counts.

And "oh sleepies" makes more sense when we realize that sleepies are basically dried mucus, and another way that Finns voice frustration is with "voi räkä", meaning "oh snot".

There's a theory about how all this stuff works, not only in Finnish but around the world. In fact, as you'd expect, there are several theories. More on that another time -- for now, here are some other Language Log posts on related topics:

"The FCC and the S word" (1/25/2004)
"The S-word and the F-word" (6/12/2004)
"You taught me language, and my profit on't/ it, I know how to curse" (7/17/2005)
"Curses!" (7/20/2005)
"Goram motherfrakker!" (6/7/2006)
"The history of typographical bleeping" (6/10/2006)
"The earliest typographically-bleeped F-word" (6/15/2006)
"Avoiding the other F-word" (7/4/2006)
"C*m sancto spiritu" (8/7/2006)

[And, courtesy of, here's the passage on p. 214 of The Mother Tongue where Bill Bryson exhibits his gullibility and/or ignorance of Finnish:

Some cultures don't swear at all. The Japanese, Malayans, and most Polynesians and American Indians do not have native swear words. The Finns, lacking the sort of words you need to describe your feelings when you stub your toe getting up to answer a wrong number at 2:00 A.M., rather oddly adopted the word ravintolassa. It means "in the restaurant".

Given how badly Bryson got taken by the Finnish restaurant gag, it'd be smart not to trust his word on Japanese, Malay, or American Indian languages either. And indeed, a bit of web searching turns up plenty of information about cussing in all of these.]

Posted by Mark Liberman at 07:24 AM

August 29, 2006

Science is... a verb??

From an article in Salon last week about Michael Shermer of the Skeptics Society:

We've got to get past this idea that science is a thing. It isn't a thing like religion is a thing or a political party is a thing. It's true that scientists have clubs. They have banners and meetings and they drink beer together. But science is just a method, a way of answering questions. It's a verb not a noun.

And faith is a verb, and God is a verb, and fashion is a verb, and happiness is a verb... and so on and so on and so on.

It has become clear to me that there's no point in railing against this trope, or telling these people to get the dictionary out. They cannot conceivably think they are talking about the correct part-of-speech classification of words. They don't need or want a dictionary. When they say "is a verb" they clearly mean something like "is something that must be engaged in, or be engaged with, as an active practice".

And that would be fine, except that for grammarians such as me it is a sad reminder of how the unworkable old definitions of terms like "noun" and "verb" still hold sway, nothing having changed in a century, and not much in a millennium.

It absolutely is not the case that you can coherently define lexical categories this way: nouns as words that name things, verbs as words for actions, adjectives as words for qualities, prepositions as words for relations between things, and so on. It simply does not work. It is part of an ancient theory of grammar that is not just sick but dead on arrival, like the phlogiston theory of combustion. Only grammar never had its chemical revolution as far as the general public is concerned. Some time in the future the prevailing nonsense about grammar, on which the "is a verb" snowclone is based, has to be replaced by a theory that works, and the non-linguistic public has to be convinced (if there is ever to be sensible public discourse about linguistic matters) that the revised view provides a more sensible and coherent theory. This is not going to be easy.

[Many thanks to Jonathan Lundell and Tam K for the tipoff.]

Posted by Geoffrey K. Pullum at 04:42 PM

By any other name

Mark Liberman reports, once again, on misapprehensions of Gregory Pullman's, I mean Geoffrey Pullum's, name.  This after a week in which a blogger managed to get <Geoffrey K. Pullum> and <Roger Shuy> right (angle brackets enclose spellings) -- no small trick -- but stumbled some on <Mark Liberman> (just the usual <Lieberman>) and fell flat on his face with <Arnold Zwicky> (<Andrew Zwickey>).  Both Geoff and I have large collections of manglings of our names, painstakingly (or pain-stakingly) assembled over many years, but this is the first time I've been called Andrew.  <Zwickey>, thousands of times, but Andrew, no.

There's some linguistic interest, as well as entertainment, in where these misnamings come from.

First names first.  Almost all these errors come from replacing the relatively rare first name Arnold with a more common one of similar form: in rough order of decreasing frequency, Ronald, Donald, Harold, Albert, Leonard, Howard. 

What's similar here?  Well, Arnold and the other six are all two-syllable names with accent on the first syllable.  More impressively, they all fit the phonological template:

(C)  VLAX  S  (C)  ə  L  d/t

where C is any consonant, VLAX is a lax vowel (a, æ, or ɛ), S is a sonorant consonant (nasal n, liquid r or l, or glide w), and L is a liquid (r or l).  Arnold manages to have TWO sonorants (r and n) in the middle.  Also note that d and t are alveolar oral stops, differing only in voicing.  Ronald is really VERY close to Arnold, differing only in the ordering of a and r, so it's no surprise it's by far the most common error.
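For readers who like to check such claims mechanically, here is a minimal sketch of the template as a regular expression. The symbol classes follow the description above, but the consonant inventory and the broad transcriptions in TRANSCRIPTIONS are my own rough assumptions for illustration (Howard, for instance, is treated as a plus the glide w), not careful phonetics.

```python
import re

# Symbol classes from the template; the inventories are illustrative assumptions.
CONSONANTS = "bdfghjklmnprstvwz"  # C: any consonant (a convenient subset)
LAX_VOWELS = "aæɛ"                # VLAX: lax vowel
SONORANTS = "nrlw"                # S: nasal n, liquid r or l, glide w
LIQUIDS = "rl"                    # L: liquid

# (C) VLAX S (C) ə L d/t
TEMPLATE = re.compile(
    f"[{CONSONANTS}]?[{LAX_VOWELS}][{SONORANTS}][{CONSONANTS}]?ə[{LIQUIDS}][dt]"
)

# Hypothetical broad transcriptions, one symbol per segment.
TRANSCRIPTIONS = {
    "Arnold": "arnəld",
    "Ronald": "ranəld",
    "Donald": "danəld",
    "Harold": "hærəld",
    "Albert": "ælbərt",
    "Leonard": "lɛnərd",
    "Howard": "hawərd",
    "Andrew": "ændru",
}


def fits_template(transcription):
    """Return True if the whole transcription matches the template."""
    return TEMPLATE.fullmatch(transcription) is not None


if __name__ == "__main__":
    for name, segments in TRANSCRIPTIONS.items():
        verdict = "fits" if fits_template(segments) else "does not fit"
        print(f"{name} ({segments}) {verdict}")
```

Under these assumed transcriptions, Arnold and the six substitutes all match, and Andrew does not.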

In this context, Andrew is really pretty far off the mark, though (like Arnold) it starts with a lax vowel, and its consonants n, d, and r are all there in Arnold, but in the grossly different order r n d.  So I'm not surprised it's never come up before.

In any case, the other first-name errors above -- of perception and/or memory -- show just how much speakers are sensitive to phonological properties and relationships.

The remaining errors on my first name are spelling mistakes, and are not very common: <Arnald> and <Aronold>.  <Arnald> might involve a perseveration of the first vowel letter, but the effect is probably mostly the old problem of how to spell unaccented vowels in English: <a>, <e>, <i>, <o>, and <u> would all be possible, in principle, in the second syllable of my first name.  (<Arnuld> occurs fairly often, but almost always with reference to the current governor of California, and never, so far in my experience, with reference to me.)  In this case, <Ronald> and <Donald> probably tip things towards <a>.  As for <Aronold>, this is surely an orthographic anticipation of the <o> in the second syllable of <Arnold> -- evidence that the writer is already thinking ahead to the second syllable at the end of the first.

Most of the misnaming action is in my family name.  Some of it is phonological, a result of the fact that zw is a marginal initial cluster in English; it's "hard to pronounce" for English speakers, who try to improve on it in one of three ways:

(1) replacing the z by its voiceless counterpart s, to get a fine English initial cluster, sw: Swicky;  this is probably the most common fix in writing (you can google up a bibliography in which a reference to Zwicky & Sadock 1975 on ambiguity tests gives <Swicky> as the first author), and it's pretty common in speech.

(2) breaking up the difficult zw cluster by inserting a schwa, which will appear in writing as <a>, <o>, or <e> (these choices of vowel probably facilitated by the fact that <Zawicky>, <Zowicky>, and <Zewicky>, and versions of these with initial <S>, and versions of these with final <ey>, are attested Slavic family names); this is probably the most common fix in speech -- even I regularly insert the schwa when I want to make my name clear to people, though that's inclined to lead them to this type of misspelling -- and it's not very frequent in writing.

(3) omitting one or the other of the two consonants in zw, to get Wicky or (occasionally) Zicky.  More common in speech, where it probably results from mis-hearing, than in writing.
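The three fixes can be sketched as simple string operations on the spelling. This is only an illustration: the function name repair_zw and the strategy labels are hypothetical, and it works on letters, whereas the real repairs happen in pronunciation.

```python
def repair_zw(name):
    """Return respellings of a Zw- name under the three repair strategies.

    Assumes the name is spelled with an initial 'Zw'.
    """
    if not name.lower().startswith("zw"):
        raise ValueError("expected a name beginning with 'Zw'")
    rest = name[2:]
    return {
        # (1) devoice z to s, giving the ordinary English cluster sw
        "devoicing": ["Sw" + rest],
        # (2) break up the cluster with a schwa, written <a>, <o>, or <e>
        "schwa insertion": ["Z" + vowel + "w" + rest for vowel in "aoe"],
        # (3) drop one of the two consonants
        "deletion": ["W" + rest, "Z" + rest],
    }


if __name__ == "__main__":
    for strategy, variants in repair_zw("Zwicky").items():
        print(strategy + ":", ", ".join(variants))
```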

The most creative approach to my family name was taken by a data processing staffer at the Mitre Corporation, where I worked during the Cretaceous Era.  Faced with that zw cluster, she apparently decided not to abandon the w, but to hold it off until there was a place for it.  This would have produced something like Zickwi (yes, a fourth possible fix), but the staffer seemed to feel that this didn't do credit to what she saw as a Slavic name, so I became Mr. Zickwich.  (I was so charmed by this that I didn't correct her.  Anyway, I didn't want to get on her bad side, since she was the person who cared for all the punch cards for a gigantesque program I was working on.)

On to purely orthographic errors in my family name, most of which have to do with the ways of spelling final unaccented i.  It's <y> (as in <sticky>) in my actual family name, but the system linking sounds with spellings in English orthography offers several competitors: in descending order of frequency, <ey> (as in <Mickey Mouse>), <i> (as in <Micki and Maude>, a 1984 movie), <ie> (as in <Mickie James>, the woman wrestler).  There are also <ee>, as in <Mickee Faust> ("The Mickee Faust Club is Tallahassee Florida's tongue-in-cheek answer to a certain unctuous rodent living in Orlando"), and <ye>, as in <Mickye Adams> (an actress), but I have no attestations of <Zwickee> or <Zwickye>.  <Arnold Zwickey> is on the web, in a Lavender Languages conference announcement from Bill Leap.  So is <Arnold Zwicki>, in a comment on the eggcorn database.  (And to be fair, the family tree includes some who have re-spelled the name to a more Swiss-German-looking <Zwicki>.)

Every so often I get <Zwiky>, without the <c> that signals a preceding lax vowel, so that it looks like it ought to be pronounced like <Mikey> (most likely) or <tiki> (if it's a "foreign" word).  But we now have <wiki>, with a lax first vowel, so even this version makes some sense.

At least one error probably results from people having trouble deciphering my handwriting: <Arnold Zuckey>, with a <u> for my written <wi> (plus our old friend, final <ey>).  Maybe <Arnold Zwidry> belongs here too.  <Professor Zwinky> doesn't, because the editor who addressed me this way had just typed my name correctly in the line above; she was, unfortunately, unable to reconstruct what had happened -- but it was certainly an inadvertent performance error.

You can see what happened in another inadvertent performance error: <Arnold Zwizky>, with the <z> persevering from the first syllable.  And in the anticipatory example <Zrnold Zwicky>.  And in the modestly frequent, though at first glance very surprising, <Zqicky> (the NAACP is determined to address me this way, and they're not alone); look at your keyboard.  Even better, two in one blow: <Arjold M. Zqicky> (from Greenpeace, obviously having someone type address lists rather than using pre-printed address labels).
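If you don't have a keyboard handy, here is a rough sketch that flags which letter substitutions land on a neighboring key. It assumes a standard QWERTY layout with each row staggered to the right of the one above; the names ROWS, adjacent, and substitutions are made up for this illustration, and a neighboring key is only a hint about the cause of an error, not proof.

```python
# Letter rows of an assumed standard QWERTY layout.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
POSITION = {
    key: (row_index, column)
    for row_index, row in enumerate(ROWS)
    for column, key in enumerate(row)
}


def adjacent(a, b):
    """Return True if letters a and b are on touching keys."""
    (row_a, col_a), (row_b, col_b) = POSITION[a], POSITION[b]
    if row_a == row_b:
        return abs(col_a - col_b) == 1
    if abs(row_a - row_b) == 1:
        # A key in the lower row sits under columns c and c + 1 of the upper row.
        upper, lower = (col_a, col_b) if row_a < row_b else (col_b, col_a)
        return upper - lower in (0, 1)
    return False


def substitutions(intended, typed):
    """List (index, intended letter, typed letter, adjacent?) for each mismatch."""
    intended, typed = intended.lower(), typed.lower()
    if len(intended) != len(typed):
        raise ValueError("this sketch only handles same-length strings")
    return [
        (i, x, y, adjacent(x, y))
        for i, (x, y) in enumerate(zip(intended, typed))
        if x != y
    ]


if __name__ == "__main__":
    for typed in ["Zqicky", "Zwizky"]:
        print(typed, substitutions("Zwicky", typed))
    print("Arjold", substitutions("Arnold", "Arjold"))
```

On this layout the q in <Zqicky> and the j in <Arjold> are neighbors of the intended keys, while the second z in <Zwizky> is not a neighbor of c, which fits the perseveration account.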

The errors can be compounded.  I have, alas, no idea what led from <A. M. Zwicky> to the remarkable <A. H. Tricky>, but I can reconstruct the path from <Zwicky> to <Soicky>: <Zwicky> to <Swicky> (easy step), <Swicky> to <Sqicky> (the typing error just above), and then, wonderfully, <Sqicky> to <Soicky>, when some human being notices the impossible <Sqi> and assumes it's an error based on the visual similarity of letters, so fixes the <q> to the visually similar (and orthographically possible) <o>; <p>, giving <Spicky>, would also have been possible, and I'm hoping to live long enough to see a <Spicky>.

Next, global foul-ups in mailing lists, where pieces of entries get transposed.  This has produced mail (with the right street address) for Arnold Zweig (probably there was a Zweig just before me on this list) and for Arnold M. Osland (there is an Arnold Osland who's a Republican district chairman in North Dakota, but what mailing list would we have been on together? there are no Oslands in the local telephone directory, by the way).

Finally, my favorite category, the Vulcan Identity Meld, in which conjoined names are combined into a single name: Elizabeth Arnold (a delightful person in whom my daughter Elizabeth's best qualities are joined with mine), Jacques Trazwicky (making a true marital unit of my partner Jacques Transue and me), and, incredibly, Jacque Arnold Transuzwicky (an elaborate interleaved Vulcan Identity Meld, plus the annoyingly common misspelling of <Jacques> as <Jacque>, which has led many to assume Jacques was a woman whose name was pronounced like <Jackie>.)

soicky at-sing clsi peroid standford peroid edd

Posted by Arnold Zwicky at 03:14 PM

Language Log type size

Some readers tell us that they have difficulty reading Language Log because the type is too small and faint. For me, the type in Language Log is fine, but I find the type on some other sites difficult to read. Type weight is a bit harder to control (perhaps we should change the stylesheet to use a heavier typeface), but fortunately, it is easy to enlarge the type in your browser, or to make it smaller if you prefer. Most if not all browsers have a type size control either right on the front panel or on an easily accessible menu. I'm currently using Mozilla Firefox most of the time. In Firefox, the control is on the View menu (the third from the left). The sixth item down, the first in the third group, is the character size submenu.


Mark Liberman tells me that in Internet Explorer as well the View menu has a Text Size submenu.

All of the browsers that I have tried also respond to Control-+ to increase type size and Control-- to decrease it. Eric Bakovic reports from Macintosh-land that this is true of Safari as well except that one uses Command-+ and Command-- instead.

Posted by Bill Poser at 12:36 PM

Suck it up, buttercup

Headsup: The Blog takes me to task in fine style for blaming Prof. Alan Smithers' mis-statement about Spanish in the U.S. on editors at the Guardian. It's a great rant, and Prof. Smithers and I deserve every syllable. But I still think that the Grauniad should have caught the mistake, or at least posted a correction in the original article. Can you imagine any self-respecting blog that wouldn't?

[Update -- Nicholas Lawrence writes:

For the record, the on-line version of [Smithers'] original article now says "pockets of". ]


Posted by Mark Liberman at 09:37 AM

A Geoffrey by any other name

In the Chicago Tribune a few days ago, Julia Keller took a stylistic look at Vice President Cheney ("Cheney's usage of 'if you will' is 'like' hedging", 8/24/2006), and cited Gregory "Grisha" Pullum's classic Language Log post "It's like, so unfair" (11/22/2003).

For the rest of us here at Language Log Plaza -- Arnold Zwickley, Boris Zimmer, Sally Thompson, and all -- references in the popular press are rare enough that we need to echo George M. Cohan's plea “I don’t care what you say about me, as long as you say something about me, and as long as you spell my name right.” (I expect that "Cohen" was a special problem for him.) But Jeff Pullman, whose names seem to be everywhere these days, has transcended mere nomenclature.

Posted by Mark Liberman at 09:22 AM


In response to Arnold Zwicky's post on Snickers morphology, several readers have written to reference the much-loved Jargon File entry for bogosity:

1. [orig. CMU, now very common] The degree to which something is bogus. Bogosity is measured with a bogometer; in a seminar, when a speaker says something bogus, a listener might raise his hand and say “My bogometer just triggered”. More extremely, “You just pinned my bogometer” means you just said or did something so outrageously bogus that it is off the scale, pinning the bogometer needle at the highest possible reading (one might also say “You just redlined my bogometer”). The agreed-upon unit of bogosity is the microLenat.

2. The potential field generated by a bogon flux; see quantum bogodynamics. See also bogon flux, bogon filter, bogus.

You should also consult the entries for bogotify, where the coinage autobogotiphobia is described as "a self-conscious joke in jargon about jargon", a phrase with some current relevance; and coefficient of X, where the subtle but important difference between "foo index" and "coefficient of foo" is exemplified as follows:

Foo index and coefficient of foo both tend to imply that foo is, if not strictly measurable, at least something that can be larger or smaller. Thus, you might refer to a paper or person as having a high bogosity index, whereas you would be less likely to speak of a high bogosity factor. Foo index suggests that foo is a condensation of many quantities, as in the mundane cost-of-living index; coefficient of foo suggests that foo is a fundamental quantity, as in a coefficient of friction. The choice between these terms is often one of personal preference; e.g., some people might feel that bogosity is a fundamental attribute and thus say coefficient of bogosity, whereas others might feel it is a combination of factors and thus say bogosity index.

I was always especially fond of the term microLenat:

The unit of bogosity. Abbreviated µL or mL in ASCII. Consensus is that this is the largest unit practical for everyday use. The microLenat, originally invented by David Jefferson, was promulgated as an attack against noted computer scientist Doug Lenat by a tenured graduate student at CMU. Doug had failed the student on an important exam because the student gave only “AI is bogus” as his answer to the questions. The slur is generally considered unmerited, but it has become a running gag nevertheless. Some of Doug's friends argue that of course a microLenat is bogus, since it is only one millionth of a Lenat. Others have suggested that the unit should be redesignated after the grad student, as the microReid.

I recall a version of this entry that also mentioned the international standard unit of insincerity, but that's another story. I'm pretty sure that the term bogosity was already in use at MIT in the early 1970s, though my memory may be affected by the ubiquity of the term bogus in that culture. (If memory serves, occasions for use of the term were also plentiful, though I suspect that some future axiomatic social science will discover that conservation of bogosity is a universal law.) The 1981 CMU version of the Jargon File has

BOGOSITY n. The degree to which something is BOGUS (q.v.). At CMU, bogosity is measured with a bogometer; typical use: in a seminar, when a speaker says something bogus, a listener might raise his hand and say, "My bogometer just triggered." The agreed-upon unit of bogosity is the microLenat (uL).

I always thought that bogosity was formed by self-consciously false analogy to porous/porosity and other pairs of that type. Note that this element introduces the resonance of an adjectival form nougatous to the Snickers coinage. I also suspect that bogosity in turn influenced the coinage of travelocity, where the echo of velocity also comes into the picture, and this may also be one of the resonances of nougatocity.

[Update -- Ben Zimmer points out that:

Another form likely influenced by bogosity is bozosity/bozocity, as I noted in a comment on Double-Tongued Word Wrester.

By the way, Geoff Nunberg referred to "the administration's apparently bottomless bozosity" in a June 11 LA Times column.


[Update #2 -- Blake Stacey writes:

The closest I have seen to the "hackish" term "bogometer" entering a broader field of discourse is in the New York Times coverage of the Bogdanov Affair, in which two television personalities managed to earn Ph.D.s by publishing artful nonsense. Quoting reporter George Johnson,

This is where experts say that sincere or otherwise, the Bogdanovs' papers fall flat. Reading through an Internet debate between them and the physicist John Baez of the University of California at Riverside is like watching someone trying to nail Jell-O to a wall. It is as though the Bogdanovs, like twins one reads about in psychology experiments, have developed their own private language, one that impinges on the vocabulary of science only at the edges.

If so, then their argot was apparently good enough to get past the gatekeepers at the University of Bourgogne, where the brothers recently got their Ph.D.'s with dissertations their colleagues find as baffling as their papers. (Some scientists are amused that long before anyone outside France had heard of the Bogdanovs, the term "bogometer" had been used to describe an imaginary device that blinks frantically when confronted with a bogus claim.)

The Wikipedia article on this affair has experienced some serious slings and arrows, mostly because people involved came along in person to slant its coverage the way they want to be seen (quite a "the map is not the territory" moment). However, now that the heyday is past, it's much more informative.

Johnson's parenthetical connection between Bogdanov and bogometer seems dangerously close to making fun of someone's name, which is generally considered to be an inappropriate form of argument. He's half-rescued in this case by the anonymous attribution to "some scientists".]

[Update -- Topher Cooper writes:

Found the posting on “bogosity” and the “microLenat” from the Jargon File interesting, largely because it doesn’t agree with my memory in a number of particulars. I was a part-time undergraduate working full-time for the AI department at C-MU during the period when the term was commonly in use.

The definitions are fully in accord with how I remember it being used – it’s the origin that seems a bit flakey. Mind you, my memory could be at fault.

From the proposed alternative unit – the microReid – I’m guessing that the grad student in question was Brian Reid. The problem with the whole story is that Doug Lenat was at the time a grad student at Stanford while Brian was a grad student at C-MU. Brian had previously been at the University of Maryland and worked in industry (yes, I checked his Wikipedia entry, but it is basically what I remembered). It doesn’t seem likely that he was ever a student of Lenat’s.

There were a bunch of people in the AI group at C-MU who had previously been at Stanford. My impression was that they introduced the use of the term to C-MU.

The explanation that I received for naming the unit of bogosity after him was that he was someone who would generate more ideas in five minutes than most people do in a week – sort of a comp sci Robin Williams. Of course, nearly all of those ideas were completely bogus. Every once in a while, though, one of those ideas was a true gem. That still left him with more good ideas a week than most people. Someone once suggested that the microLenat was an unusual unit of measurement because there were only 999,999 microLenats per Lenat – the one remainder measured something quite distinct from bogosity. I had met Lenat at conferences and attended some of his chaotic presentations so this explanation made a lot of sense to me.

Although I don’t remember any particular connection between Brian Reid and use of the term microLenat I could easily see where Lenat’s style -- which contrasted rather sharply with Brian’s – may have been grating to Brian. It is possible that he used the term with a bit more bite to it to someone outside of C-MU, giving rise to the sour grapes story.


Posted by Mark Liberman at 07:32 AM

August 28, 2006

Playing with your morphology

Advertisers like to play with language.  People notice, and maybe, they'll then remember.

So we get the latest Snickers ad campaign, in which striking invented words appear (on billboards, sides of buses, etc.) in the characteristic Snickers font and colors, on a chocolate brown background that looks like a Snickers wrapper.  The words:

[see correction below]

We here at LLP are not the first to comment on the phenomenon; google on "Snickers" plus one of the words (especially the last), and you'll find lots of discussion, ranging from admiration to annoyance to mockery.  Most of the commentators see the words as combinations of two contributing existing words (combinations that Lewis Carroll called "portmanteau words"), but that's not quite right, and the Playful Morphology office at LLP (which produced "Plain morphology and expressive morphology" [Berkeley Linguistics Society 13.330-40] back in 1987) is here to tell you about it.

Start with hungerectomy.  This combines a base hunger with the medical suffix -ectomy, referring to a surgical removal, as in appendectomy and tonsillectomy.  The (vivid, perhaps over-vivid) imagery is of a candy bar that physically removes hunger from your body.  Not really a portmanteau, but instead an extension of the noun bases eligible for combining with -ectomy, from medical ones to anything goes.  There are plenty of other nonce formations on non-medical bases: truthectomy, funectomy, zitectomy, for instance.  [Ben Jackson notes that the -rectom- piece of this word is homophonous with rectum -- not a good thing.]

Peanutopolis is similar.  The base is peanut, the suffix -opolis, used for city names (Greek polis 'city' with a combining vowel -o- for bases ending in a consonant, as in English metropolis and necropolis).  So it conveys Peanut City, which is pretty good.  Once again, the base isn't of the Greco-Latin sort that you'd expect, so it's noticeable.  In this case, there's a fairly long tradition of such combinations in place names (based on nouns), often jocular: porkopolis, cornopolis, cottonopolis, steelopolis, and more.

Nougatocity is several steps more complicated.  My guess is that the ad agency's impulse was to combine nougat with the Latinate suffix -ity, which forms abstract nouns from adjectives, to get something conveying 'the state or essence of being nougat'.  This would be noticeable in two ways: the base is from the wrong stratum of the vocabulary, and it's of the wrong category (noun rather than adjective).  Ok, that's playful morphology for you.

But there's a phonological problem.  The suffix -ity is one of those that requires accent on the syllable immediately preceding it, so the accent on the base will shift to accommodate this requirement (compare ACtive with acTIVity).  But that obscures the identity of the base word, not a desirable outcome for the Snickers people, who would want nougat to stand out clearly: NOUgat, but nouGATity.  Ugh.  There's a fix for this: use another suffix in between the base and -ity, so that nougat can keep its accent, and the extra suffix will get the accent required by -ity.  I'd expect -ic to be the intervening affix, as in multiple - multiplicity.  That would give nougaticity.

The ad agency didn't go for nougaticity, maybe because the high front vowel in -icity sounds too small or precious (the symbolic values of vowels have long been known).  Instead they went for a nice big back vowel in -ocity, despite the fact that the reasonably common words in -ocity (precocity, velocity, ferocity, reciprocity, atrocity) aren't likely to be ones they wanted to evoke.

[Update, 8/29: Suzanne Kemmer writes to supply a much simpler account.  Since 1988 she's taught classes on English words, in which she collects student reports on neologisms.  She says, "-ocity is one of the current favorite suffixes deriving nouns from adjectives (and now, from nouns too).  It seems to have the flavor of a humorously faux-Latinate derivation. It is very widespread among college students."  She suggests a connection to bogosity, a word that Mark Liberman has now posted on, and concludes, "The coiner on the ad campaign was probably young and knew the popular -ocity suffix; or found a test group of youths who suggested the word."  Sounds good to me.]

Now, substantialicious.  This looks like a portmanteau of substantial and delicious, and maybe that was what the writers were after.  But there's also an evolving jocular suffix -alicious (also spelled -ilicious and occasionally -elicious), conveying a high degree of some desired property, as in the (fairly widely attested) crunchalicious, crispalicious, and yummalicious (and probably others; these are just the ones I've noticed), and that might be the analysis of substantialicious.  [Turns out that the ads actually have the spelling substantialiscious, maybe to evoke luscious.  Google sources have both spellings, and this is one I hadn't seen on the hoof, so I was misled.  The ad strategy seems to have been to throw in as much as possible and hope that some of it sticks.]

[Further addenda, 8/29: Jason Parker-Burlingham and Jim Wilson write, separately, to say that the -alicious words remind them of the coining sacrilicious (a portmanteau of sacrilegious and delicious) on "The Simpsons" in 1994 (for the story, see the Wikipedia site on Simpsons neologisms).  Wilson suggests, in fact, that the -alicious words are cloned from sacrilicious.  I'm a bit dubious about this, since my impression is that crunchalicious, at least, pre-dates this Simpsons episode.  But I don't have any actual evidence, so I'm keeping an open mind.  As for echoes and reminders, Jack Hamilton tells me the word reminds him of Mary Poppins: supercalifragilistic!]

(That 1987 BLS paper by Geoff Pullum and me looked at some evolving jocular suffixes that had already been discussed in the morphological literature: -orama/-rama/-ama and -eteria/-teria/-eria for shop names.  A variety of suffixes created from pieces of existing words are now inventoried in Michael Quinion's Ologies and isms: Word beginnings and endings (Oxford Univ. Press, 2002).)

Substantialicious is merely not very compelling.  But satisfectellent, apparently some sort of odd portmanteau of satisfaction (or satisfying) and excellent, is a real stinker.  For a lot of people, this one evokes, not these words, but fecal and repellent, not a good thing in a word that appears on a brown background.  [Addendum: you might hear an echo of infect in there as well, also an unfortunate effect.  Further addendum, 8/29: J.D. Stephens suggests that you might hear the unpleasant feculent in there too, if you know the word.]

Just a reminder: these words are not to be found in dictionaries.  That's the whole point; they should be ostentatiously novel, but still interpretable.  They should be crunchalicious inventions.

[Thanks to Doug Kenter for encouraging me to post on the Snickers ads.  They were, as he put it, carefully, driving him nuts.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:44 PM

A bad week for the lord of the underworld

You'd think that Pluto had just lost a contract with Viacom. Everyone has been worrying about whether Pluto is a planet or not, thereby proving that planet or no planet, Pluto is still a star:

The Washington Post turned it around the other way, with "5 Things that Need a Downgrade like Pluto": Godfather III, Gluttony, Tour de France, Segways, and Tom Cruise.

Simon Beck in the Globe and Mail took a more conventional line with "A star and a dwarf crash in war of worlds":

This week was marked by two of the most famous pink slips in recent history, as Pluto lost its job in the solar system and Tom Cruise lost his corner office in the star system.

Bert Caldwell in Forbes used the same equation in the lead for an article about magazine rankings of this and that:

Astronomers disowning planets. Hollywood casting out stars. Washington ranked 41st among the 50 states for quality of life. Just what in the name of this cosmos is going on here?

And Matt Shurrie in the Woodstock Sentinel-Review really let it all hang out:

Yes, the elite eight have thrown their one-time sibling to the curb - simply because it doesn’t measure up.
How typical. How rude.
Sure, the astronomical union defended the move by expressing its deepest affection for Pluto - Jocelyn Bell Burnell, a specialist in neutron stars from Northern Ireland, even joked that some sort of new umbrella called ‘planet’ had been created, drawing laughter by waving a stuffed Pluto of Walt Disney fame beneath a real umbrella.
However, anyone could see right through those hollow feelings.
If our ninth planet - sorry, former planet - can be removed without much of a second thought, it makes one wonder what’s next.
If size has become such an astronomical issue, what about those of us back here on planet Earth that somehow don’t measure up?
Now that Pluto has been classified as a “dwarf planet” could those among us characterized by their “dwarf” size be next on the chopping block?
A quick look at the entertainment, music and professional sports industries reveals a number of potential candidates facing the axe. Dolly Parton, the five-foot country singer; Danny Devito, the five-foot actor/director and Theo Fleury, five-foot-six hockey player are only the tip of the iceberg.
There are plenty more including Michael J. Fox, Gary Coleman, Tom Cruise, Verne Troyer and rapper Ja Rule. Not even former quarterback Doug Flutie and former Toronto Maple Leafs captain Doug Gilmour could consider themselves safe.
What happened to a time and place where those smaller in stature were embraced - even celebrated - for who they are and not what society expects them to be? Has the interplanetary society really turned its back on that once proud tradition?

No, Matt, not as long as the gods still rule from Mt. Olympus.

Here's a small selection from the rest of the "Pluto as star gossip" stuff floating around in the media:

Tom Cruise, Pluto and Hollywood’s entrenched system for getting TV comedies on the air all took a beating this past week.
Tomorrow astronomers will vote whether Pluto retains its planet status. If Pluto loses, it will run as an independent.
We lost a planet from our solar system this week and it couldn’t get Jon Benet Ramsey off the front page. Talk about lack of respect for a celestial body!
Maybe the new rush of Pluto research will reveal a shocking truth: Cruise and Suozzi are actually visiting from the ninth planet - or the first ex-planet or whatever those indecisive telescope jockeys are calling Pluto now.
And even in these dog days of August, when Frank Quattrone shares the front page with Tom Cruise, JonBenet and the planet Pluto, there are serious opportunities for investors not at the beach.
'I don't care if Pluto's not a planet anymore. Pluto never did anything for me.'
While CNN's Breaking News alerts occasionally drift into the mundane — apparently Mel Gibson’s no contest drunken driving plea was urgent enough to warrant one — they are always reserved for headlines that will get people talking, like the foiled terrorist plot in Britain or the demotion of Pluto.
Now that Pluto's back in official planetary orbit, does Goofy need to hire a PR rep?
I have to think Thursday was a pretty sad day for stargazers -- and no, I'm not talking about Paramount Pictures punting Tom Clueless, er, Cruise.
Pluto was too small to be in the solar system. It will now be mounted on a ring and given to Mrs. Kobe Bryant.

Even some of the quotes elicited or chosen from people in the science biz have got a show-biz flavor:

"Pluto is a chunk of ice which controls nothing," says Michael Shara, curator of astrophysics at the Rose Center for Earth and Space at the Museum of Natural History. "Its orbit is a slave to Neptune's orbit."

All this suggests that the planets are still effectively personified, in a fuzzy sort of way. Would there be so much fuss, even in the silly season, over a decision that (say) tyrosine isn't really an amino acid after all? This is one of the many things that are left out of the kinds of "meaning" represented in traditional ontological taxonomies.

Posted by Mark Liberman at 07:09 AM

August 27, 2006

Generational style as "language"

Since we haven't had a cartoon in a couple of days:

Posted by Mark Liberman at 01:52 PM

Another Plutonian Casualty?

It's an irresistible story, right down to the quaint names of the dramatis personae. On March 14, 1930, Falconer Madan, the librarian of the Bodleian Library in Oxford, reads his 11-year-old granddaughter Venetia Burney the press story about the discovery of a new planet by the Lowell Observatory in Flagstaff, Arizona. She has been studying Greek and Roman mythology and tells her grandfather that the planet should be named Pluto, after the Roman god of the underworld. He relays the suggestion to the Oxford astronomer Herbert H. Turner, who in turn cables it to the Lowell Observatory. When the name is announced on May 1 by Vesto Slipher, the Observatory's director, Venetia is given due credit for her suggestion. After the story is popularized in a 1964 article in Sky and Telescope, "the girl who named Pluto" becomes a favorite topic in popular books about astronomy, and even in her 80's, Venetia is still the subject of news features and interviews.

As long as people are raining on 75-year-old planetary parades, maybe this one is worth some cold-eyed reconsideration as well.

The story has developed some elaborations over the years -- some accounts, for example, have Venetia winning a contest to name the new planet. But there's no reason to doubt most of the details as they were originally given. There's no question that Turner forwarded Venetia's suggestion to Slipher, who acknowledged her as the first person to suggest the name in a May 1 observation circular (the formal announcement came later). (See William Graves Hoyt's article "W. H. Pickering's Planetary Predictions and the Discovery of Pluto," Isis 67,4, December, 1976.)

But it's hard to credit that Venetia was actually the first to bring up Pluto as a potential name. In a BBC interview published on January 13, 2006, Venetia Burney Phair herself reported that when her grandfather first went looking for Turner, he turned out to be at a meeting of the Royal Astronomical Society in London, where people were naturally speculating on the name of the new planet.

"None of them came up with Pluto. That was another stroke of luck," says Mrs Phair. When Mr Madan eventually caught up with Herbert Hall Turner, the astronomer agreed Pluto was an excellent choice.

No doubt that's what her grandfather or Turner told the 11-year-old -- anyway, it's what I would have told my daughter in similar circumstances. But it's virtually certain that "Pluto" was already being bruited about by the members of the RAS and by astronomers elsewhere, including those at the Lowell Observatory. Inasmuch as the planets were conventionally named after Roman gods, it's hard to think of a choice more obvious than the name of "the god of the regions of darkness where Planet X holds sway," as Roger Lowell Putnam, a trustee of the Lowell Observatory, said on May 25, 1930 in announcing the choice of the new name that would be submitted to the American Astronomical Society and the RAS ("Pluto Picked as the Name for New Planet X Because He Was God of Dark Distant Regions," as the New York Times zeugmatically titled its 5/26/1930 article on the announcement). And the name was all the more appropriate, as Putnam noted, because Pluto was the brother of Neptune and Jupiter (as well as being the son of Saturn, he might have added). But Putnam made no mention of Venetia in the public announcement. And given that astronomers were rifling through the lists of Roman gods from the moment of the new planet's discovery -- and indeed, from well before that date, in anticipation -- it's not credible that Slipher would have opened the telegram containing Venetia's suggestion and said, "Pluto! Now why didn't we think of that one?"

In fact the name Pluto had occurred to other people, and some were already using it. A story in the New York Times on March 25, 1930, two months before Putnam's announcement, reported that the Italian astronomers at the Breara [sic -- actually Brera] Observatory who had photographically corroborated the discovery of Planet X had provisionally given it the name Pluto "because that ancient divinity was related to others for whom planets are named," being "the son of Saturn and the brother of Jupiter and Neptune." (The same story was carried in a March 25 AP dispatch.) Slipher must have known about this well before the date of the stories, since the Lowell Observatory must have been in touch with Emilio Bianchi and the other astronomers who made the observations. And by March 28, the suggestion of Pluto as a name was already being criticized by the astronomer Hans Hoerbiger of Vienna, who argued that since the planet consisted chiefly of frozen water, a name relating to Neptune should have been used (NYT 3/29/1930).

But if the name Pluto was adopted from the Italians or was simply in the air, why would Slipher have credited Venetia for playing a decisive role in his May 1 circular? Perhaps he simply felt that the story added a charming note of human interest -- and after all, Venetia really had suggested the name, and crediting her with first discovery would have been seen as a gracious gesture to Turner (a former Astronomer Royal) and the English. Slipher may also have wanted to reserve credit for the name to the Anglo-American sphere, rather than acknowledging that the Italians had come up with it first.

Or perhaps there's an additional explanation. As Hoyt points out in his 1976 article, the name Pluto was also being used by the astronomer William Henry Pickering. A fecund predictor of hypothetical planets, Pickering had conjectured a trans-Neptunian "Planet O" between 1919 and 1928 on the basis of perturbations in the orbit of Neptune, a theoretical rival to Percival Lowell's Planet X, which the Lowell Observatory had set itself to find. Given the rivalry that had developed between Pickering and the Lowell Observatory staff, it's no wonder that Harlow Shapley, the director of the Harvard Observatory, would warn Putnam two days after the discovery was announced that "we shall soon be hearing from W. H. Pickering."

And indeed, in an article published in 1930 in Popular Astronomy shortly after the discovery, Pickering identified the body found by the Lowell Observatory as his Planet O, and claimed priority of publication for the name Pluto, a name he later claimed to have been privately using for some time. When the observatory staff insisted that the body was identical with Lowell's Planet X, a dust-up ensued; in January 1931, Pickering attacked Percival Lowell's work and the "surprising and reckless claims. . . put forth by his active adherents and administrators, backed by very extensive newspaper propaganda." And later, having decided that the object found by the observatory was not one of his hypothetical planets, he objected to the choice of "Pluto" as a usurpation: "Pluto should be named Loki, the god of thieves," he said. (See Hoyt, pp. 563-564.)

There is no way of knowing whether Slipher was aware of Pickering's use of "Pluto" prior to the appearance of Pickering's Popular Astronomy article. But he certainly knew of Pickering's claim before he gave pride of place to Venetia, and that may have given him another motive for crediting her with the suggestion, foreclosing speculation that the name was borrowed from Pickering. Whatever his motives, it isn't plausible that Venetia really was the first person to suggest the name of Pluto. Which is not to deny that she was a very clever young girl.

Posted by Geoff Nunberg at 01:06 PM

Spanish in the states

A few days ago, we looked into an article in the Guardian, which stated that "Spanish is fast rising in importance and there are now more Spanish speakers in the United States than English." I speculated that "the Guardian's entire editorial staff is on vacation, and has delegated its duties to the night office-cleaning crew, who are having a little competition among themselves to see who can slip the most extravagant falsehoods into print." But it turned out that it was simply a slip of the pen on the part of the article's author, Prof. Alan Smithers, who explained that "[t]he thought that was in my mind when I wrote that part of the sentence was `there are now more Spanish speakers in some of the United States than English'."

I was still skeptical about this modified statement. I wrote Prof. Smithers for clarification, and he was kind enough to reply.

Many thanks for keeping me posted.

The picture I had in mind was Figure 5 of the US Census 2000 Brief which shows large parts of the States bordering Mexico with 60 per cent or more, or 35.0 to 59.9 per cent, of people, five years and over, who spoke a language other than English at home. Table 4 of the same document lists the top ten areas for Spanish speakers all of which are above 50 per cent reaching as high as 91.9 per cent.

The figures are taken from the 2000 Census and are likely to have been under-estimates since they derive from a question about language spoken in the home rather than mother tongue, and the form could only have been sent to known households with reliance on an honest response. There has also been rapid growth in Hispanic migration since 2000, both legal and illegal.

I, therefore, felt justified in going for a dramatic statement. But since it has attracted attention out of all proportion to its importance in the article (which was about why we in England should not be too bothered by the decline in the learning of French and German in our schools given the increasing interest in Spanish and other languages), it should perhaps have been more qualified - though whether a more academic sentence would have survived the subbing is another matter.

Prof. Smithers is referring to census data like this:

But there's a danger in generalizing too quickly from such maps, as Ben Zimmer pointed out a few months ago. Here's the same data -- proportions of Spanish speakers by county -- graphed for the state of Texas by the excellent MLA language mapping web site:

This certainly makes it look as if at least as many Texans speak Spanish as English. However, many of the counties with the highest percentages of Spanish speakers are thinly populated. If you look instead at a map by number of speakers, a different sort of picture emerges:

This helps explain why the overall population of Texas was found by the 2000 census to be only 25.5% "Hispanic". And as I understand it, this is an ethnic rather than linguistic statistic, and so the proportion of Spanish native speakers would be somewhat lower. Texas has the third-highest state-level "Hispanic" proportion, essentially tied with California at 25.8%, and behind New Mexico at 38.2%. In fourth place is Arizona with 18.8%. (According to estimates of current proportions of Spanish speakers on the MLA site, New Mexico is at 28.76%, Texas is at 27.00% and California is at 25.8%. Note also that the majority of these "Spanish speakers" also speak English "very well" or "well" -- e.g. in California, 5,593,955 out of 8,105,445.)

Prof. Smithers suggests two reasons why today's real proportions might be higher than this: census undercounting and continued immigration. Both are valid points, but I think they're unlikely to rescue his statement.

As for census undercounting, this was investigated carefully by Eugene Ericksen in "An Evaluation of the 2000 Census". He estimates that Hispanics were undercounted by about 2.85%, non-Hispanic Blacks by 2.17%, and non-Hispanic Whites by 0.67%. Given these estimates and the state-level percentages above, it's clear that not even New Mexico is going to get close to 50% Hispanic.
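To make that arithmetic concrete, here is a minimal Python sketch of the undercount adjustment. It rests on two assumptions made purely for illustration: that an undercount rate u means the true count is the reported count divided by (1 - u), and that the entire non-Hispanic population is undercounted at the lowest quoted rate (0.67%), which is the case most favorable to raising the Hispanic share. The function name is hypothetical.

```python
def adjusted_share(hispanic_pct, hispanic_undercount=0.0285, other_undercount=0.0067):
    """Hispanic share (%) after correcting both groups for census undercount.

    Assumption: an undercount rate u means true = counted / (1 - u).
    Assumption: all non-Hispanics are undercounted at the 0.67% rate.
    """
    hispanic = hispanic_pct / (1 - hispanic_undercount)
    other = (100 - hispanic_pct) / (1 - other_undercount)
    return 100 * hispanic / (hispanic + other)


# State-level "Hispanic" percentages as quoted in the text above.
for state, pct in [("New Mexico", 38.2), ("Texas", 25.5), ("California", 25.8)]:
    print(f"{state}: {pct}% counted, about {adjusted_share(pct):.1f}% adjusted")
```

Under these assumptions the correction moves each state's share by less than one percentage point, nowhere near enough to approach 50%.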

And as for on-going changes, the census bureau estimates that between 1990 and 2000,

The Hispanic population increased by 57.9 percent, from 22.4 million in 1990 to 35.3 million in 2000, compared with an increase of 13.2 percent for the total U.S. population [which was 281.4 million in 2000].

If we project the same rates of growth forward for New Mexico and Texas, we'd predict that New Mexico would become 46.3% "Hispanic" by 2010, and Texas would become 32.3% "Hispanic". These proportions would be increased slightly by allowing for undercounting, but not as much as they would be decreased by removing the members of the "Hispanic" category whose native language is not Spanish -- according to the accounts that I've read, that would be many of the second generation and nearly all of the third. [See the 1987 movie Born in East L.A. for the (hilarious but fictional) story of Cheech Marin as a Chicano, born in the U.S. and speaking no Spanish, who is mistakenly deported to Mexico by the U.S. immigration authorities. And way back in 1969, one of my army buddies was a Hispanic guy from San Antonio whose Spanish was not very good, though his ethnic identity was strong.]
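For readers who want to check the projection, here is a short Python sketch of one reading of it. The assumption, which is not spelled out above, is that each state's Hispanic population grows by 57.9% over the decade while the rest of its population grows at the total-population rate of 13.2%. The constant and function names are hypothetical.

```python
HISPANIC_GROWTH = 0.579  # 1990-2000 growth of the Hispanic population
OTHER_GROWTH = 0.132     # assumption: everyone else grows at the total-population rate


def projected_share(hispanic_pct_2000):
    """Project the 2010 Hispanic share (%) from the 2000 share (%)."""
    hispanic = hispanic_pct_2000 * (1 + HISPANIC_GROWTH)
    other = (100 - hispanic_pct_2000) * (1 + OTHER_GROWTH)
    return 100 * hispanic / (hispanic + other)


for state, pct in [("New Mexico", 38.2), ("Texas", 25.5)]:
    print(f"{state}: {projected_share(pct):.1f}% projected for 2010")
```

With the 2000 shares of 38.2% and 25.5%, this reading gives 46.3% for New Mexico and 32.3% for Texas, matching the figures quoted above.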

So I think it would still have been factually incorrect if Prof. Smithers had written that "there are now more Spanish speakers in some of the United States than English". It would be closer to correct to say that "several of the United States are more than 25% Spanish speaking". Whatever the exact proportions, it's certainly true that many U.S. residents speak Spanish -- which I guess was Prof. Smithers' main point.

[Update (August 28, 2006): I guess some of the Guardian's editorial staff have returned from vacation early, and so there is now a correction here, three days later and (of course) not linked from the original article, which remains in place, unchanged, to enlighten future readers... ]

Posted by Mark Liberman at 12:07 PM

Scottish dialect genetics

For some reason, the worldwide excitement over English cow dialects hasn't connected with the more localized excitement over Scottish crossbill dialects, which was also recently featured on the BBC News web site ("'Accent' confirms unique species", 8/15/2006):

Debate has raged for years among experts about whether the Scottish crossbill was unique, or a sub-species of the common crossbill.
DNA tests had shown the Scottish crossbill, common crossbill and parrot crossbill - which visits from Europe - to be genetically similar.
The results of long-running research has now found, according to the RSPB, that the Scottish variety is a distinct species of its own.
The society said it had a "Scottish accent", or call, which it uses to attract a mate from among other Scottish crossbills.

The logic here is puzzling. Cows, the BBC told us, learn their regionally distinctive moos from the farmers that tend them. (Now in fact, there's no evidence for -- or against -- regional variation in cow vocalizations. The whole thing was an empirically vacuous PR stunt. But we're talking about logic here, not evidence.) Part of the argument for the plausibility of the cow story was the well-known fact (it really is a fact) that many species of birds learn aspects of their songs, and sometimes thereby develop local song "dialects". But in this other story, separated by only a few days, the BBC tells us that Scottish crossbills, though "genetically similar" to their European cousins, are now to be treated as a separate species, because they have a "dialect":

RSPB Scotland's senior researcher Dr Ron Summers, who led the study, said: "The question of whether the Scottish crossbill is a distinct species, and therefore endemic to the UK, has vexed the ornithological world for many years and split the bird watching community.
"This research proves that the UK is lucky enough to have a unique bird species that occurs here and nowhere else - and this is our only one."

Well, maybe the crossbills are not among the bird species that exhibit vocal learning, but instead develop songs that are genetically programmed in every detail. That would rescue the logic of the story, but its author shows no sign of having considered the question one way or another, which is a little odd, since the word dialect suggests social construction rather than genetic determination. (In fact, crossbills -- a kind of finch -- do learn their songs. See below for some details.)

So I spent a few minutes finding and reading the press release at the site of RSPB Scotland. (That's the Royal Society for the Protection of Birds.) This revealed where the BBC's take on the story came from -- as in the case of the regional cows, they were basically just repackaging the PR agent's press release (with a good deal of straight copying).

Quaintly, the RSPB has a picture caption that reads:

'Celtic' crossbills differ in bill size from other crossbill species found in Britain, and just like native Scots, they have also been found to have a distinct Scottish accent or call.

And it's well known that Scots (or Homo edinburgensis as scientists call them) are a distinct species. The RSPB press release has more about the three types of crossbills:

Scotland's conifer woods are home to three types of crossbill -
* the common crossbill (with a small bill best suited to extracting seeds from the cones of spruces)
* the parrot crossbill (with a large bill suited to extracting seeds from pine cones)
* and the Scottish crossbill (with an intermediate bill size used to extract seeds from several different conifers).
All three are similar in both size and plumage, and DNA tests have showed that the birds are genetically similar, casting some doubt on the Scottish crossbill's status as a distinct species.

In fact, you can't really tell the different kinds apart just by examining them:

Although the three species differ in average bill size, the actual differences are small and cannot be used reliably in the field by ornithologists to identify crossbills.

But wait:

The calls, though, can be distinguished by sonograms, or sound pictures, made up from recordings. Crucially, this provides the basis for a method to survey crossbills and, for the first time, gain a clear picture of their numbers and distribution in Scotland.

So the RSPB did a "long term field study" in which they captured 46 mated pairs of crossbills of various types, to learn "if the birds mate with those with a similar bill size and call, and whether young Scottish crossbills inherit their bill sizes from their parents".

Results showed that, of 46 pairs of different types of crossbills caught, almost all matched closely for bill size and calls. In other words, the different types of crossbills were behaving as distinct species.
The small number of 'mismatched' pairs was too few to suggest that the different types are not species, but enough to account for their genetic similarity. The fact too that young crossbills had bill sizes similar to their parents showed that they inherited their bill sizes, and also supports the species status of Scottish crossbills.

Gee, I bet you could use a similar technique to demonstrate that various ethnic groups in the U.S. are separate species.

The RSPB press release tells us that "Scottish crossbills (as identified by bill size) also have quite distinct flight and excitement calls from other crossbills", but unfortunately, neither the press release nor its replication at the BBC tells us what crossbills' calls are like in general, or how the Scottish-dialect version differs. The Scottish crossbill page on the RSPB website says that it has "[a] 'chup chup' call with a fluty quality", whereas the common crossbill has "[a] loud 'chip chip' call; a warbling, twittering song", and the parrot crossbill has "[v]ery similar calls to crossbills, but thought to give a distinctive deep ‘kop-kop’ and ‘choop choop’".

Since the RSPB has no equivalent of the International Phonetic Alphabet at its disposal, I'm puzzled about the difference between the Scottish crossbill's "chup chup" and the parrot crossbill's "choop choop". We want spectrograms and scatter plots! (There are audio samples on the RSPB site, but only one recording per "species". I presume that the RSPB's field study has been published somewhere, but I haven't tracked it down yet.)

The RSPB's "long term field study" now must be supplemented with a larger and longer field study:

The next steps in the Scottish crossbill study are to find out its population size and habitat requirements.

With the current estimate of 1,500 birds for its global population, being little better than a guess, a detailed survey is crucially important to put together the right conservation and management measures to protect and conserve it.

Dr Jeremy Wilson, head of research for RSPB Scotland said, 'Clarifying the status of the Scottish crossbill as a distinct species, and devising a survey method based on the bird's calls are exciting steps forward.

'We hope to carry out the first full survey of the numbers and distribution of Scottish crossbills in 2008, after which we will be better placed to understand how best to manage conifer woodlands in Scotland to secure the future of a bird found nowhere else in the world.'

Another, more cynical, argument for the RSPB's conclusion begins to suggest itself. Echoing Max Weinreich's observation that "a language is a dialect with an army and a navy", we might suggest that "a species is a phenotypic variant with a protected habitat". (Or take a look at the Wikipedia article on species for a sketch of the reasons why the concept "species", up close, is just about as contested as the concept "language".)

But if you ask me, the crossbills are getting their "accents" from those Scottish birders. At least, that's what the cow-dialect theory tells us.

[Hat tip: Edward Wilford.]

[Update: a few minutes with Google Scholar establishes that crossbills are indeed among the bird species that exhibit vocal learning. Thus P.C. Mundinger, "Call Learning in the Carduelinae", Systematic Zoology, 28:3 270-283 (1979). From the abstract:

Experiments and field observations document vocal imitation in six cardueline species representing four genera. Flight call learning was found in all birds studied; learning of many other call types was observed in two genera. Additional evidence extends call learning to other carduelines bringing to eight the total number of genera in which call learning has been observed. Call learning is perhaps characteristic of subfamily Carduelinae, and the taxonomist should consider the possibility that learning may affect the patterns of all adult cardueline vocalizations. The taxonomic value of cardueline calls in particular and passerine calls in general is re-examined in light of this extensive call learning.
[emphasis added]

From the body of the article, commenting on an experiment with call sharing in a pair of captured white-winged crossbills (Loxia leucoptera), who were mated by the experimenter's choice rather than their own:

I conclude that the call sharing exhibited by this pair of Loxia leucoptera is not due to chance but is the result of vocal imitation, and that vocal imitation affects most, or even all, of the call types in this species.

J.G. Groth ("Call matching and positive assortative mating in Red Crossbills", Auk, 1993) studied Appalachian crossbills (Loxia curvirostra), and offers a plot of calls from members of 24 mated pairs:

Fig. 1 Audiospectrograms for call notes of 24 pairs (labelled A-X) of crossbills in Virginia. For each pair, call note of male illustrated on left and female on right. Short horizontal marks along vertical axes are at 2, 4, and 6 kilohertz, and width of each box represents 140 msec.

The overall conclusion of this paper is that

These findings are consistent with the hypothesis that distinctive forms of crossbill represent reproductively isolated groups (i.e. species).

However, his main evidence is that there is a very wide range of body and bill sizes and shapes, and that the shapes and sizes of mated pairs are highly correlated. The call types also match, but as he observes:

The Appalachian crossbills also showed a pattern of assortative pairing based on acoustic characters, but this observation is trivial because call matching was a prerequisite for identification of birds as mates. No information is available on the structure of the calls of these birds before they became associated with their mates.

Indeed, the very wide range of exactly-shared call types makes it seem unlikely that the call sharing is entirely genetic, though Groth's experience with captive crossbills was different from Mundinger's:

In two captive pairs with mates having initially different call structures that produced nests and successfully fledged young, the mates never matched each other's flight calls.

(though this may have been because of the age at which the birds were captured). The role of genetics vs. vocal learning is left unclear:

The process by which crossbills choose their mates is not known. Bill size correlates with conifer preference in crossbills, and calls could function as signals giving information on morphology and, therefore, habitat preference, of individuals. A question that remains is whether vocalizations, visual assessment of morphology..., habitat preferences, or combinations of these and/or other cues provide the basis for mate choice in crossbills.

Some other background: Craig W Benkman, "Divergent selection drives adaptive radiation of crossbills", Evolution, 57:5 (2003):

[C]rossbills are a recent adaptive radiation where the processes involved in population divergence may still be active. Red crossbills in North America are categorized into nine call types that are recognized by distinct vocalizations. At least seven of these call types are specialized for foraging on different species of conifers that hold seeds in partially closed cones through winter, and reproductive isolation is evolving between populations that have diverged in the last 10,000 years.

Also see Craig W Benkman, "Reciprocal Selection Causes a Coevolutionary Arms Race between Crossbills and Lodgepole Pine", The American Naturalist, 162:182-194 (2003).

And Sophie Questiau et al., "Phylogeographical evidence of gene flow among Common Crossbill (Loxia curvirostra, Aves, Fringillidae) populations at the continental level", Heredity, 83:2 (1999):

Common Crossbill subspecies have been described according to morphological traits, vocalizations and geographical distribution. In this study, we have tried to determine whether the subspecies correspond to clear-cut mitochondrial DNA lineages ... We find a mixing of the mitochondrial haplotypes at the continental level among the different types or subspecies previously described. Morphological differentiation (in bill size and shape essentially) shows the possibility of rapid local adaptation to fluctuating resources (coniferous seeds), without necessarily promoting the development of reproductive barriers between morphs.

No doubt this was somehow taken into account in the RSPB study of Scottish crossbills, but it would be nice to know how. And the whole business seems like a good opportunity to raise issues of evolution, the meaning of the concept "species", and so forth. The position exemplified by the RSPB work seems to be that more-or-less reproductively isolated populations, with behavior that is somewhat differentiated from nearby groups, and average differences in some morphological characteristics (even if the distributions overlap a great deal), are separate species. By this definition, wouldn't Indian castes be different species?]

Posted by Mark Liberman at 08:15 AM

August 26, 2006

Silly-season linguifying?

Though the journalistic silly season may give rise to even-worse-than-average science reporting, at least there are some redeeming qualities. As recently noted by Matthew Yglesias guest-blogging on Talking Points Memo, the good news for reporters is that "editors tend to be on vacation, so you can get a little goofy." The example Yglesias gives is the sardonic conclusion to an otherwise unremarkable article in the Aug. 25 New York Times about the marketing campaign launched by "the CW," the network formed out of the merger between UPN and WB:

"We had a challenge," said Mr. Haskins, the CW marketing executive, "in that we had to put under one roof programming from UPN and WB and make it feel like one network."
The solution, Mr. Haskins said, was to focus on what the predecessor networks had in common, which was their younger viewers, "and create an environment that was relatable to their lives."
Someday, there will be an article about television in which no executive uses the word "relatable," industry jargon for something with which viewers are supposed to identify or connect. Alas, this is not that article.

On the surface, this little dig at TV executives seems to be a clear case of reportorial linguification. But it turns out that this claim about the prevalence of the word relatable has much more grounding in reality than other instances of linguifying we've examined here (see links at the end of this post).

Television executives really do love calling their shows "relatable," and reporters at the Times and elsewhere dutifully quote them using the word time and time again. In fact, just a day before the story about the CW, the Times carried an article about MTV's Video Music Awards, in which network president Christina Norman was quoted as saying:

"They love the onstage performance moment, but also that unguarded moment, that moment they run into somebody backstage who they haven't seen in a long time or that they don't want to see that night. That's the kind of stuff that makes them more relatable."

A Nexis search on the New York Times archive finds the R-word being used in network-speak way back on June 20, 1982, in a quote from a press release for the syndicated series "Couples" (an early entry in the genre of "reality programming"):

"The real difficulties, conflicts and problems of married, dating, living-together and divorced couples rival any type of fictional format for personal and relatable drama."

And that was just the beginning. Here is a selection of quotes from TV executives and producers in the Times archive, all ringing the chimes of relatability:

  • "We want this to be relatable, and we want the drama to be honest and hard hitting. It was very important to me to present this working-class drama with as much flair and style as possible." (NBC programming executive Perry Simon about "Dream Street," 4/9/89)
  • "You need to have something that's relatable. The best case involves everyday people, somebody like your next-door neighbor doing something unexpected." (Ruth Slawson, NBC senior vice president for movies, 6/15/92)
  • "We all decided the show would be more relatable if it took place in the suburbs." (Producer Rob Burnett about "Everybody Loves Raymond," 12/1/96)
  • "I hope a lot of teen-agers watch it because there are a lot of relatable things going on that could make an impact on them." (Lindy DeKoven, NBC senior vice president for movies, 2/3/97)
  • "It has the combination of being a relatable show -- we all experience these situations -- yet it's a bizarre, off-center point of view like 'Seinfeld.'" (CBS entertainment president Leslie Moonves about "Everybody Loves Raymond," 2/1/98)
  • "These characters are all completely relatable. The only difference between Tony Soprano and me is that he's a mob boss." (Chris Albrecht, HBO president of original programming, about "The Sopranos," 2/3/99)
  • "He's as hip and relatable today as he was years ago." (Julie Weitz, TNT executive vice president of original programming, about "James Dean: An Invented Life," 7/19/00)
  • "That's the big difference between those older shows and the ones you see today. Writers today make a huge effort to make their shows relatable to their audience." (Suzanne Daniels, WB entertainment president, 9/17/00)
  • "People are fascinated by relationships. I think these shows will be very relatable." (Fox executive Mike Darnell, about romantic reality programming, 10/16/00)
  • "The Osbournes to me are a hugely relatable family, and they're famous and a little crazy, but human and identifiable." (Warren Littlefield, former NBC chief programmer, now an independent producer, about "The Osbournes," 5/19/02)
  • "It's, you know, dealing with kids living in the suburbs. You know, I think this is an incredibly relatable area for many people." (Jeff Zucker, NBC entertainment president, about "Hidden Hills," 7/25/02)
  • "In my own head as I was writing the pilot, I wanted an average Joe, and there's something very relatable about John." (Producer Tracy Gamble, about John Ritter on "8 Simple Rules," 9/15/02)
  • "The show is about what everybody does to everybody constantly. You know if your gardener is hot. You know if your bank teller is hot. You know if the guy at the gas station is hot. You know if your kid's kindergarten teacher is hot. It's a relatable, universal concept." (Producer Mike Fleiss, about "Are You Hot?" 2/23/03)
  • "The character is so relatable." (Producer Gina Matthews, about "Jake 2.0," 9/14/03)
  • Their riches-to-rags story is a "relatable fear." (Producer Mitchell Hurwitz, about "Arrested Development," 11/9/03)
  • "We need to change the perception in the general public from the idea that this is a sprawling ensemble of eccentric characters to the truth, which is that it is actually a very relatable show about a family." (David Nevins, president of Imagine Television, about "Arrested Development," 8/1/04)
  • "There is absolutely nothing about this show relatable to my story, in fairness." (CBS chairman Les Moonves, about "How I Met Your Mother," 9/11/05)
  • "The actual surgery itself is remarkable. These are also highly relatable stories about people's lives." (Jay Renfroe, principal at Renegade 83 Entertainment, about "Miracle," 1/17/06)
  • "She's totally relatable, and you empathize with her, and you like her and want to be friends with her." (Producer Chris Alberghini, about Tori Spelling on "So Notorious," 3/30/06)

Whew! And that doesn't even cover the word's usage by TV writers or actors mimicking executive jargon, let alone its diffusion into related fields such as advertising, film production, or book publishing. Clearly "relatability" is some sort of litmus test for the execs: the success of a show like "Everybody Loves Raymond" is attributed to viewers' ability to "relate" to the protagonist. Conversely, if a show like "Arrested Development" is faring poorly in the ratings, then audiences aren't appreciating how relatable the characters really are. And the whole "reality TV" trend is predicated on the notion that real people are more relatable than actors playing roles and are therefore inherently more interesting to the average viewer.

("Relatability" seems to be a peculiarly American hangup. For instance, much of the best British television comedy has relied on fiercely unrelatable characters, a tradition running from "Fawlty Towers" to "Blackadder" to "I'm Alan Partridge" to the original version of "The Office.")

So to return to the linguifying claim made in the Times, it's a clear exaggeration that there isn't a single article about television "in which no executive uses the word 'relatable.'" Despite the fact that this is a patently false assertion, it at least has a grain of truth to it, unlike so many other linguifications. And it strikes me as a perfectly acceptable way to spice up a bland article about television network marketing — especially when vacationing editors aren't around to question its appropriateness.

Posted by Benjamin Zimmer at 03:23 PM

It's always silly season in the (BBC) science section

A few days ago, some brilliant British public-relations consultant decided to spread the fame of West Country Farmhouse Cheeses by floating the story that West Country cows not only give special local milk, they even moo in a special local way. (This concern for cheesy terroir is part of the spread, at least in Britain and the U.S., of wine-tasting culture into other agricultural product areas.) The PR firm asked a famous linguist, John Wells, whether cows could have regional accents. He gave a sober and sensible response, which they were able to spin into a form of "yes", even though what he actually said was "probably not". (Specifically, according to John's own account, "I told them I thought it highly unlikely; but that there was well established scientific evidence that several species of bird exhibit regional variability in their calls, so you could not entirely rule out the possibility.")

The PR firm's spin on the matter was eagerly accepted by journalists from media outlets around the globe, and within a matter of hours, increasingly preposterous versions of this story had been presented to the public via thousands of newspapers, radio stations and television channels. John had become "a group of British linguists"; his off-the-cuff answer had turned into an in-depth investigation in which "numerous of the country's herds" were "subjected to screening"; and the scientific validity of cow dialects is now an established fact for millions of the world's better-informed people.

It's a tradition in anglophone journalism that the late summer is treated as a sort of extended April Fool's Day, known as the "silly season". Because both newsmakers and subscribers are on vacation, the laws of journalistic supply and demand motivate attempts to stir up interest with extravagant nonsense. A similar phenomenon is called Sommerloch (= "summer hole") in German. But this silly-season cow-dialect case is not very different from the journalistic treatment of animal-communication stories throughout the year. Even though the cow-dialect story was created out of nothing as a PR stunt, it exemplifies a relationship between facts and their media presentation that is, alas, the normal one. In the world's science sections, it's always silly season.

I guess that some of this is just the normal "telephone game" of human communication, where each writer adds a bit of misunderstanding or embroidery to the interpretations or fabrications of the last one. And most journalists know nothing much about science, and most editors seem to believe that audience appeal is much more important than accuracy, as long as there are no powerful groups to complain about falsehoods. So we get the predictable result: wildly inaccurate stories.

In the cow dialect case, though, I detect an additional factor. At some point in the past decade or so, the British Broadcasting Corporation adopted standards for science reporting that are even lower than those of other serious publications. Nevertheless, readers around the world -- including other journalists -- still associate the BBC's past authority with its current output. [See the end of this post for links to some other BBC science-reporting howlers.]

When I scan the world's reporting on the cow-dialect story, which the internet and Google News now easily allow me to do, the BBC factor jumps out. The BBC was not the first place where this "news" appeared, and it certainly wasn't the only place, but it was an especially influential place. Here are a few examples of this influence:

Europa Press:

Un grupo de lingüistas británicos ha convenido que las vacas, al igual que los seres humanos, tienen distinto acento en función de la región donde viven, según informa la BBC.
A group of British linguists has concluded that cows, just like humans, have different accents depending on the region where they live, according to the BBC.

Los expertos decidieron ahondar en la cuestión después de escuchar que varios ganaderos británicos hablaban de la diferencia de acentos en los mugidos de las vacas, dependiendo de la zona de Inglaterra en que se encontraban.
The experts decided to go into the question in depth after hearing several British cattle dealers speak of the different accents in the moos of cows, depending on the part of England in which they were found.

Diario de León:

¿Son diferentes en sus entonaciones una vaca morucha salmantina y una cachena gallega? La respuesta, afirmativa, parecía encontrarse ayer en la sección de ciencia de la web de la BBC: bajo el titular «Las vacas también tienen acentos regionales» se informaba de que los especialistas apuntaban a que estos animales emitían sonidos ligeramente distintos en función de su procedencia geográfica.
Are a morucha (breed of) cow from Salamanca and a Galician cachena (breed of cow) different in their intonations? The answer "yes" seems to be found in the science section of the BBC web site: under the headline "Cows have regional accents" we learn that specialists indicate that these animals emit slightly different sounds as a function of their geographical origin.

The Ciencia section of Terra:

-Me parece que esa vaquita es de Buenos Aires.
-Te equivocas, si bien camina con paso rápido como si estuviese apurada por llegar al trabajo, su mugido es típico de las vacas santafecinas.

-I think this cow is from Buenos Aires.
-You are mistaken; although it walks rapidly as if it were late for work, its moo is typical of cows from Santa Fe.

No se asuste, este diálogo es ficticio pero según las más recientes teorías de los especialistas en lenguaje, esta conversación podría ser tranquilamente parte de la realidad, ya que las vacas, al igual que los seres humanos, parecen tener acentos regionales.
Don't worry, this dialog is fictitious, but according to the most recent theories of language specialists, this conversation could easily be real, because cows, just like humans, seem to have regional accents.

Los expertos decidieron ahondar en el tema después de escuchar que varios ganaderos hablaban de la diferencia de acentos en los mugidos de su ganado vacuno, dependiendo de qué parte del país provenía.
The experts decided to probe into the subject after hearing several British cattle dealers speak of the different accents in the moos of their cattle, depending on what part of the country they came from.

...Cortesía BBC Mundo
...Courtesy of BBC World

Die Welt:

Das Interesse der Wissenschaftler weckte die Tatsache, dass sich Kühe unterschiedlich „äußerten“ – je nachdem, aus welcher Herde sie kamen. John Wells, Professor für Phonetik an der Uni in London, erklärt: Dieses Phänomen war zuvor von Vögeln bekannt.
The fact that cows "express themselves" differently depending on what herd they come from has aroused the interest of scientists. John Wells, professor of phonetics at the university in London, explains: This phenomenon has long been known in birds.

Die BBC berichtet, dass Farmer in Somerset das Phänomen als erste beobachtet hätten.
The BBC reports that farmers in Somerset were the first to observe the phenomenon.

TGCOM Mondo: ("Singolare scoperta di alcuni scienziati" -- "Singular discovery of some scientists")

Proprio come gli uomini anche le mucche quando muggiscono rivelano la loro provenienza. Hanno insomma un accento caratteristico che cambia da regione a regione. Lo sostiene uno studio britannico di specialisti del linguaggio che hanno sottoposto a screening numerose mandrie del Paese dopo che alcuni allevatori avevano segnalato loro questa curiosa caratteristica.
Just like men, cows also reveal their origin when they moo. Thus they have a characteristic accent that changes from region to region. This is the conclusion of a British study by language specialists, who have subjected to screening numerous of the country's herds, after some breeders had noticed their peculiar characteristics.

Un analogo studio condotto da John Wells, professore di fonetica all'Universita' di Londra, confermerebbe l'ipotesi anche per gli uccelli. "In questo caso è assolutamente provato -ha detto alla Bbc- uccelli appartenenti alla stessa specie cinguettano in modo diverso da zona a zona. E lo stesso potrebbe valere anche per le mucche".
An analogous study carried out by John Wells, professor of phonetics at the University of London, confirmed the hypothesis also for birds. "In this case it is absolutely proved", he said to the BBC, "birds belonging to the same species chirp in different ways from place to place. And the same thing could be true for cows."

I'll leave it to the reader to verify that similar stuff can be found worldwide, with the BBC's fingerprints on much of it.

Some other BBC science silliness that we've noted, in passing, over the years:

"Parrot telepathy at the BBC" (1/28/2004), "Stupid fake pet communication tricks" (1/29/2004)
"More junk science from the BBC" (3/10/2004), "The decline of the BBC" (3/10/2004)
"Chatnannies debunked" (3/31/2004), "Chatnannies update" (4/3/2004), "We are all Big Brother" (4/15/2004)
"Talking chimp" (4/7/2004)
"The most untranslatable word" (6/23/2004)
"Transmutation of wood chips at the BBC" (8/28/2004)
"Enhance breast size by 80%" (4/9/2005)
"Tudor linguistic homogeneity" (7/29/2005)
"The Agatha Christie Code: stylometry, serotonin and the oscillation overthruster" (12/26/2005)
"The brave new world of computational neurolinguistics" (12/27/2005)
"Linguists have different brains" (4/7/2006), "How much do those red and blue jellybeans predict about linguistic ability?" (4/17/2006)
"Maurice Saatchi, cognitive neuroscientist" (6/23/2006)
"We feel sad because we say ü" (7/21/2006)
"Vicky Pollard's revenge" (1/2/2007)

Posted by Mark Liberman at 12:00 PM

August 25, 2006

Make Very Excellent Mnemonics: Just Start Using Noggin!

As Interplanetary Linguistics Week continues here at Language Log, let's return to Geoff Pullum's post about planet mnemonics back on Sunday, when it appeared that the International Astronomical Union might add three new planets to the current lineup. The IAU chose instead to banish Pluto, leaving us with just Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. So what's a good Pluto-less mnemonic to replace old ones like "My Very Excellent Mother Just Sent Us Nine Pizzas"?

A few science educators are reporting that they're switching to:

My Very Excellent (or Educated, or Elegant) Mother Just Sent Us Noodles

Boooring! Surely we can do better. My first suggestion was:

My Very Excellent Mother Just Sent Us Naugahyde

...but somehow I doubt that will catch on. Let's see some other attempts.

Here's a selection from the comments section of the Scientific American blog:

My Very Excellent Mother Just Sent Us Nowhere

My Very Evil Mother Just Sent Us Nothing

My Very Educated Mother Just Said "Um, No"

Many Venetian Explorers Might Just Sail Until Nightfall

My Very Elegant Mnemonic Just Stops Under Nine

And here are some submissions to the mnemonic contest:

My Vision, Erased. Mercy! Just Some Underachiever Now
(as spoken by Pluto discoverer Clyde Tombaugh)

Most Vexing Experience, Mother Just Served Us Nothing!

Molesting Very Excitedly, Michael Jackson Sucks Underage Nipples

Most Virgins Eventually Marry Jocks So Unscrupulously Naughty

Some submitted haikus like this one:

Most Vegans Envy
My Jovian Silhouette,
Usually Not

The contest also accepted mnemonics for the old nine-planet lineup from contributors who wanted to protest the demotion of Pluto. Indeed, the winner and one of the runners-up stuck with the old arrangement:

My! Very Educated Morons Just Screwed Up Numerous Planetariums

Many Very Earnest Men Just Snubbed Unfortunate Ninth Planet

As for the eight-planet lineup, the memory aid that seems to have the greatest chance of success is this one, which has appeared in a few places:

Mary's Violet Eyes Make John Stay Up Nights

Some say this evocative mnemonic actually dates back to the era before the discovery of Pluto. Afterwards, it was extended by the addition of another word like Period, Plenty, Pining, Planning, Praying, or Pondering. Now John can go back to staying up nights without any extraneous words.
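For readers who want to vet their own candidates, the rule all of these mnemonics follow (one word per planet, each word starting with the same letter as its planet, in order of distance from the sun) is easy to check mechanically. Here is a minimal sketch in Python; the names `PLANETS`, `initials`, and `fits` are made up for illustration and come from no particular library.

```python
import re

# The eight planets in order of distance from the sun (the post-Pluto lineup).
PLANETS = ["Mercury", "Venus", "Earth", "Mars",
           "Jupiter", "Saturn", "Uranus", "Neptune"]


def initials(phrase):
    """Return the upper-cased first letter of each word in the phrase."""
    # A word starts with a letter and may contain apostrophes ("Mary's").
    words = re.findall(r"[A-Za-z][A-Za-z']*", phrase)
    return [w[0].upper() for w in words]


def fits(phrase, planets=PLANETS):
    """True if the phrase has one word per planet, with matching initials."""
    return initials(phrase) == [p[0] for p in planets]


if __name__ == "__main__":
    candidates = [
        "Mary's Violet Eyes Make John Stay Up Nights",
        "My Very Excellent Mother Just Sent Us Nine Pizzas",
    ]
    for phrase in candidates:
        print(fits(phrase), phrase)
```

Passing `PLANETS + ["Pluto"]` as the second argument checks a phrase against the old nine-planet lineup instead. The check says nothing about whether a mnemonic is memorable, of course, which is the hard part.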

[Update, 8/30: On the American Dialect Society mailing list, Barry Popik reports finding early evidence for "Mary's Violet Eyes" and another planet mnemonic:

14 March 1934, Winnipeg Free Press, pg. 8:

Do you find it hard to remember the order of the planets in their distance from the sun?

If so, you may find it helpful to use a "memory sentence." Here is such a sentence: "Men Very Early Made Jars Serve Up New Potatoes."

The initial letter in each word in that sentence is the first letter in the name of a planet. Thus we may write it this way: "Men (Mercury), Very (Venus), Early (Earth), Made (Mars), Jars (Jupiter), Serve (Saturn), Up (Uranus), New (Neptune), Potatoes (Pluto)."

If you get the sentence clearly in mind, I think you will have little trouble in remembering the order of the planets, Mercury being closest to the sun and Pluto farthest away. There are two "M's" but Men starts with "MR," the same as Mercury; and Made with "MA," the same as Mars.

A similar memory sentence, told to me by a friendly reader, was used in the old days before the planet Pluto was known. The sentence was, "Mary's Violet Eyes Made John Stay Up Nights."]

Posted by Benjamin Zimmer at 11:28 PM

Dwarf planets and California lilacs

Ben Zimmer tackles the new technical term dwarf planet (denoting Pluto, Ceres, Xena, and others on the way), noting that some astronomers -- Owen Gingerich, in particular -- are offended that, with the new definitions, dwarf planets are not planets, which runs against our expectation that an English compound of the form A+B is a hyponym of B (so that, in this case dwarf planets WOULD be planets).  Ben considers, and dismisses, one class of compounds where hyponymy doesn't hold (ironyms).  But in fact ironyms are a special case of a more general phenomenon.  Is there a place for dwarf planet there?

What Ben wrote:

So the fact that the IAU would like us to think of dwarf planets as distinct from "real" planets lumps the lexical item dwarf planet in with such oddities as Welsh rabbit (not really rabbit) and Rocky Mountain oysters (not really oysters). In a 2004 article in American Speech, Larry Horn dubbed such formations ironyms, since they "represent lexical irony."

Should we think of dwarf planet as the latest ironym, then? I doubt the astronomers in Prague really had lexical irony in mind...

In the larger class of compounds to which ironyms belong, the denotation of A+B doesn't involve B´ (the denotation of B) directly, but rather picks out a class of things r(B´) that RESEMBLE the things in B´ in some specific way; (A+B)´ is then a subset of r(B´) -- rather than of B´ -- related in some way to A´. Let's get concrete: look at daylily, rockrose, and California lilac (three types of plants that are all over the place here in northern California).

A daylily (genus Hemerocallis) is not a lily (genus Lilium), but it looks pretty much like one.  A rockrose (genus Cistus) is not a rose (genus Rosa), but its flowers are very rose-like.  A California lilac (genus Ceanothus) is not a lilac (genus Syringa), but it's a shrubby plant with lilac-colored flowers in clusters; that is, a California lilac is a lilac-like plant that's connected in some way to California.

There's no irony here, just the conveying of some resemblance, and there are huge numbers of examples.  (Ironyms have the component of resemblance, PLUS an ironic overtone.)

So, is dwarf planet like California lilac?  It could have been, except for the fact that dwarf is one of a small set of nouns -- giant and monster are two others -- that have developed conventional, and productive, uses as size modifiers of nouns: unfortunately, dwarf X is already specialized with the meaning '(very) small X', to the extent that modifying dwarf is starting to push into the syntactic territory of adjectives (and is so classified in some dictionaries).  At least in horticultural usage, it can conjoin with clear adjectives:

Unusual dwarf, bushy, tufted habit and spectacular foliage! The leaves of 'Shaina' are the same dark red all summer... (link)

and occur predicatively:

Nothing is dwarf if the spot you put it in is too small. (link)

and be compared:

Compact & slow growing, it is more dwarf than Okushimo. (link)

In fact, dwarf is already used in astronomy as a size modifier, in the technical term dwarf star.  Dwarf stars ARE stars, stars that are (among other things) small.

Given all this, dwarf planet was a really bad choice of terminology, pretty much guaranteed to sow confusion.  But would the astronomers consult a linguist?  Noooo.

[It would please me to write no more on this topic.  Every single time I tried to type "dwarf", I typed "drawf" first.  Ack.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 05:12 PM

New planetary definition a "linguistic catastrophe"!

Owen Gingerich, chairman of the International Astronomical Union's Planet Definition Committee, is quite distressed about the resolution passed by the IAU's General Assembly in Prague yesterday, wherein "planets" (encompassing Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune) are distinguished from "dwarf planets" (the newly demoted Pluto, along with Xena and Ceres, with many more on the way). Gingerich told the BBC that the resolution was reworked after a "revolt" by planetary dynamicists, who felt "terribly insulted" that the original definition of a planet was drafted with a focus more on geology than dynamics. What's more, only 424 of the 2,700 attendees in Prague ended up voting (Gingerich himself didn't vote since he had to catch a plane back to the U.S.), or about 4 percent of the IAU's total membership of 10,000.

Gingerich laid out his objections in stark terms to the Guardian:

"We now have dwarf planets which are in fact not planets. I consider this a linguistic catastrophe. I think the union is going to get a lot of flak for this, in doing it in such a muddy way."

As I mentioned in my post yesterday, distinguishing "dwarf planets" as non-planetary runs counter to our expectations of hyponymy in English. In a compound noun of the form A-B, we generally assume that the compound is composed as a hyponym, a particular type of a more general category B, which in turn is called a hyper(o)nym. So alley cats are types of cats, rocking chairs are types of chairs, bay windows are types of windows, and so forth. The impulse towards hyponymic compounding is so strong that we sometimes form such compounds out of thin air, just to make sense out of a semantically opaque word — consider such folk etymologies as sparrowgrass, reinterpreting asparagus as a type of grass, or crayfish, reinterpreting French (é)crevisse as a type of fish (or at least a fishy thing).

So the fact that the IAU would like us to think of dwarf planets as distinct from "real" planets lumps the lexical item dwarf planet in with such oddities as Welsh rabbit (not really rabbit) and Rocky Mountain oysters (not really oysters). In a 2004 article in American Speech, Larry Horn dubbed such formations ironyms, since they "represent lexical irony."

Should we think of dwarf planet as the latest ironym, then? I doubt the astronomers in Prague really had lexical irony in mind — rather, the planet vs. dwarf planet distinction emerged as a somewhat messy compromise between different stances on planetary redefinition. Though the result may not be quite the "linguistic catastrophe" that Gingerich envisions, it still opens the door to some unintended possibilities. For instance, if Dutch people get offended at pejorative ironyms like Dutch courage meaning 'false courage (brought on by drunkenness)', then will dwarf humans take offense at the dwarf planet distinction, on the grounds that it implies that dwarfs are somehow not quite human? I haven't seen any such outcry yet, though some famous fictional dwarfs did release their own statement yesterday.

[Update #1: I see Gingerich himself raised the question about dwarf humans, as quoted in the Washington Post:

"Pluto is a dwarf planet, but we are now faced with the absurdity that a dwarf planet is not a planet," Gingerich retorted. "Is a human dwarf not a human?" ]

[Update #2: Betsy McCall writes:

Even worse than defining a "dwarf planet" as not-a-planet, the difference between a dwarf planet and a planet has NOTHING to do with its size! We could - and are likely to - find objects in the Kuiper Belt the size of Mars and they would be called dwarf planets even though they are significantly bigger than Mercury.

Indeed — what makes a dwarf planet a non-planet according to the IAU resolution is that it "has not cleared the neighbourhood around its orbit" (something of a contentious point among astronomers right now). This is apparently what happens when you let dynamicists define the terms.

And Robert Cumming of the Stockholm Observatory writes:

One thing that no one seems to have mentioned yet is that hyponymy in the solar system is nothing new. The name 'minor planet' has been more or less synonymous with 'asteroid' for a very long time. So it seems to me pretty insane to complain about any ambiguity or risk for confusion with the introduction of 'dwarf planet'.

I've actually seen this point raised in some of the coverage of the IAU vote — for instance, in this New Scientist report. As I understand the history, astronomers of the early 19th century wanted to classify the first discovered asteroids — such as Ceres, the largest — as planets, and the minor planet label simply stuck around even after asteroids were classified as distinct from planets. The difference with "dwarf planets" is that they're being defined as non-planets from the outset, rather than reflecting a bygone terminological system. In any case, it looks like minor planet has been eliminated by the IAU. Ceres now joins the dwarf planets Pluto and Xena, while all other former "minor planets" are to be known as "small solar system bodies."]

[Update #3: Anatoly Vorobey objects to the hyponymy argument:

A sea cow is not a cow. A guinea pig is not a pig (nor is it from Guinea). A sea lion is not a lion, and a sea horse is not a horse.
A koala bear is not a bear. A buffalo bison is not a buffalo. An aardvark is not a pig (vark).
All of these examples are perfectly ordinary and aren't really perceived as oddities by anyone except when using them to make a joke. In fact, some of them are formed rather regularly, like the "sea" compounds. Hyponymy in compounds may be the usual case in English, but that doesn't mean that non-hyponymic compounds are somehow problematic or unintuitive - we're living with many of them and using many of them daily without giving them a second thought. Insisting that they must in some way be ironic seems fatuous and unnecessary.
Is it really a problem that "dwarf planets" aren't really planets? Of course not, no more than the fact that a sea lion is not a lion. Is it going to confuse people? Well, are they regularly confused by guinea pigs?
Why make a mountain out of a molehill? Especially considering that it's not really a hill at all.

This is a fine and valid point. But I think we're able to recognize that, say, sea lion and sea cow are not compositionally typical hyponyms because we know that actual lions and cows aren't sea creatures. Thus we have sufficient knowledge to interpret those compounds metaphorically. (Except, perhaps, for Jessica Simpson, who was notoriously stumped by "Chicken of the Sea.") But how are laypeople supposed to grasp that dwarf planets are not to be considered planets at all, despite seeming to fall in the same natural class (which can't be said for pigs and guinea pigs, and so forth)? I don't think minor planet was known well enough outside of astronomical circles for this to have become an issue with that term, but now dwarf planet is getting massive exposure because of the demotion of Pluto. So I agree with Owen Gingerich that confusion will continue to reign over this terminological distinction (even if that confusion isn't "catastrophic").]

Posted by Benjamin Zimmer at 08:12 AM

Who done it?

It's August, and the world's news media are apparently being managed by the night-shift cleaning staff. In today's Guardian, there's an article under the headline "A tale of many tongues" that presents an astonishing statement:

The four most often spoken languages in the world are, in order, Mandarin, English, Hindustani and Spanish. Spanish is fast rising in importance and there are now more Spanish speakers in the United States than English. [emphasis added]

Now, the facts in this matter are not hard to find. The U.S. Census publishes a data set on Language Spoken at Home, which is one reasonable way to define who is a "Spanish speaker" as opposed to an "English speaker", and according to the data from the 2000 census, 10.71% of households use Spanish, as opposed to 82.105% who use English.

In fact the Census bureau asks a number of other questions about Language Use, and publishes the results. So we also know that of the 28.1 million people who spoke Spanish at home, 14.3 million speak English "very well". If we subtract these from the Spanish-speaking group, the proportion of Spanish-speakers in the U.S. drops to under 5%. (The overall proportion of the U.S. population who speak English "very well" -- or English only -- was 92%.)

These are easy numbers to find -- it took me one Google search and a couple of clicks on resulting links. But before doing the research, I already knew that the statement in the Guardian -- "[T]here are now more Spanish speakers in the United States than English" -- had to be a preposterous falsehood. What I learned from my 30 seconds of research was just the specific numbers -- 82 to 11, or 92 to 5, or whatever, depending on how you define your terms. Anyone with a working brain and basic knowledge of the modern world must have had the same reaction. You certainly don't need to live in the U.S. to have this basic common-sense understanding of how things probably are -- I got a link to the Guardian article in email from John Wells, who lives in London.

So how did this spectacular piece of nonsense get into the pages of the Guardian, which generally attempts to pass itself off as a serious publication, rather than what in the U.S. we call a "supermarket tabloid"?

Frankly, I'm baffled. We can't directly blame the (admittedly often slipshod and credulous) research practices of journalists, because the author of the article, Alan Smithers, is "director of the centre for education and employment research at the University of Buckingham", and thus not a journalist at all. On the other hand, we can't be sure that this is just one of the (often careless and even dishonest) talking points of public intellectuals, because the article was edited at the Guardian, and might well have been changed substantially from the text that Prof. Smithers submitted.

It's that old problem of attributional abduction. My best guess is the one I started with -- the Guardian's entire editorial staff is on vacation, and has delegated its duties to the night office-cleaning crew, who are having a little competition among themselves to see who can slip the most extravagant falsehoods into print.

[No disrespect to janitors is intended. When I was in college, I worked for about a year as a night-shift office cleaner. None of the clients was a newspaper, worse luck -- I would have enjoyed the opportunity. Let's see: "Paris is closer to Istanbul than it is to London"; "Americans spend more on coffee than they do on gasoline"; "Women speak twice as fast as men do"; "More British teenagers play World of Warcraft than cricket". I coulda been a contender! ]

[Update -- Brett Reynolds writes:

Perhaps 'English' was supposed to refer to people from England rather than speakers of English, although clearly the former is not the salient interpretation.

Maybe so. Then the statement would only be wrong by a factor of 2 or so. But I think that an editing glitch is more likely: perhaps Prof. Smithers wrote something like "there are now more Spanish speakers in some U.S. cities than...", and a rushed and undercaffeinated sub-editor, trying to make a word-count target, cut carelessly.]

[Update 2 -- John Wells wrote to Alan Smithers, who responded:

Many thanks. The statement as it appears is ludicrous. The fault is entirely mine and The Guardian is blameless.

The thought that was in my mind when I wrote that part of the sentence was `there are now more Spanish speakers in some of the United States than English', and I didn't notice in the read-through that I hadn't written it this way. (I was taking native speaker to be implied by the context.)

The Guardian will be printing a correction.

John later sent him a link to this post, and he responded:

Many thanks for this also. Mark Liberman is too kind to me. It was me reading what I thought I had written rather than what was on the page that led to the false statement. The Guardian newspaper, especially its night office-cleaning crew, are blameless on this occasion.

Prof. Smithers is courageously taking the blame on himself, but I submit that he's being too kind to the Guardian's editorial staff. We often hear about the "multiple layers of checks and balances" in the editorial process of the big-time media, which ought to catch such simple slips of the pen.

With respect to the facts of the case, I'm not convinced that there are any U.S. states where native speakers of Spanish now outnumber native speakers of English. I'll ask Prof. Smithers what measures he's relying on, but I can't see how to square this with (for example) the census bureau's table of Language Spoken at Home and Ability to Speak English by Nativity for the Population 5 Years and Over by State. ]

[Update: more here, and the Guardian's correction is here, three days later and not linked from the original article... ]

Posted by Mark Liberman at 06:58 AM

The mathematical Itô and the phonological Itô

The obvious question most theoretical linguists will ask on learning that Kiyosi Itô has been announced as the first winner of the new Gauss Prize in mathematics (a major $11,500 prize established by the International Mathematical Union, comparable to a Nobel for mathematicians) is whether perhaps — given that unusual circumflex accent used to mark length on the final vowel — he might be some relation of Junko Itô, my phonologist colleague in the Department of Linguistics at the University of California, Santa Cruz (and my department chair until this past June 30). And the answer is yes. It is Junko's dad. Junko flew to Kyoto to be with her family before the award became public, and the plan was for the whole family to fly from there to Spain to be present at the public award ceremony at the International Congress of Mathematicians in Madrid this week. However, Kiyosi Itô (who is 90 now) was not in good enough health for such a long flight, so Junko went to the ceremony to receive the award from the King of Spain on her father's behalf, and thus became one of the very few linguists to receive a congratulatory handshake from a reigning monarch (there is a photo of it on page 5 of the August 23 ICM Daily News). Best wishes from Language Log to the whole Itô family: Kiyosi's stunning work in creating stochastic differential equations and founding the study of mathematical operations on stochastic processes (the Itô calculus, originating as far back as the mid-1940s) is long overdue for recognition at this level. And if there was a special award for being a fine phonological theorist, a dedicated administrator, and a great colleague, then Junko would be collecting a prize for herself.

Update: The above was modified when I learned after posting it that the family's plans had changed and Kiyosi had not been able to attend the ceremony in Madrid.

Incidentally, this meant that two honored awardees were absent from the congress of the International Mathematical Union, as you'll read in the newsletter: the chosen winner of the Fields Medal (awarded every four years to a mathematician under 40 who has done truly ground-breaking work) was Grigori Perelman of St Petersburg, Russia, but he simply refused to come to the congress, or accept the medal at all, despite extraordinary efforts by the president of the IMU to persuade him. Perelman doesn't want a medal. It is not even clear that he will accept the Clay Institute prize of $1 million for his solution to the Poincaré conjecture if it is offered to him. He simply wants to have correctly proved the conjecture. No one has ever declined the Fields Medal before.

Posted by Geoffrey K. Pullum at 02:12 AM

Where's the beef?

Sweet product placement for Ruth's Chris Steak house ("No ordinary restaurant!" says their website) in a CNN online article  this week:

It's all about relevance and order, two of Grice's maxims (no, not these maxims), the rules which make pragmatic theory tick. Put two things next to each other, and the human mind just can't help drawing a connection. Like the fact that you took the previous sentence ("Put two things...") to be a clarification of the sentence before ("... make pragmatic theory tick.") I didn't say explicitly that there was any connection at all between them, but you knew I intended it. This is obviously not just a matter of human communication. You might say that relevance is a special case of the human proclivity for making associations, a word that in the good old days, when psychologists first became the statistic-grinding lab robots we know and love, was all the rage in the theory of cognition.

Yum, Yum! Doesn't that steak look juicy? An old Aztec recipe, no doubt....

[PS. Thanks to Mark and Geoff for debugging this post.]
Posted by David Beaver at 12:27 AM

August 24, 2006

Obscenicons in the workplace

Here's the latest example of cartoon meta-commentary on cursing characters (let's call 'em obscenicons). It's the most recent winner of the New Yorker's weekly caption contest.

"The hours here are obscene."

(Hat tip to Arnold Zwicky, whose hours at Language Log Plaza are apparently too obscene for him to post this on his own.)

[Update, 8/26/06: Amusingly enough, this caption actually grew out of the anti-caption contest, the objective of which is to find "the worst possible caption" for each week's uncaptioned New Yorker cartoon. Daniel Radosh explains the history of the "obscene hours" caption here. (Thanks to Matthew Hutson for the tip.)]

Posted by Benjamin Zimmer at 09:12 PM

Mutating netlore, from "fuck" to "snakes on a plane"

That Indian spiritual figure Osho may have known how to work a crowd, but his grammatically questionable lecture on the utility of the word fuck is nothing more than a bit of musty netlore. The piece on fuck he's cribbing from has been circulating on the Net in one form or another since at least 1985, and may go back even further in xeroxlore.

A new mutation recently appeared as a tie-in to the movie Snakes on a Plane. Here you'll find all of the same creaky grammatical humor, now repurposed for the catchphrase of the moment.

For anyone who needs a primer on the history of Snakes on a Plane as an Internet meme, see the informative Wikipedia entry. (This is what Wikipedia was made for.) The idea of applying the phrase "snakes on a plane" to a variety of contexts having little or nothing to do with the movie evidently grew out of an Aug. 17, 2005 blog post by screenwriter Josh Friedman, an early disseminator of SoaP buzz:

Snakes on a Motherfucking Plane

It's a title. It's a concept. It's a poster and a logline and whatever else you need it to be. It's perfect. Perfect. It's the Everlasting Gobstopper of movie titles

In fact ... I become obsessed with the concept. Not as a movie. But as a sort of philosophy. Somewhere in between "Cest la vie", "Whattya gonna do?" and "Shit happens" falls my new zen koan "Snakes on a Plane".

WIFE: "Honey you stepped in dog poop again. "
ME: "Snakes on a Plane..."
DOCTOR: "Your cholesterol is 290. Perhaps you want to mix in a walk once in a while."
ME: "Snakes on a Plane..."
WIFE: "Honey while you were on your cholesterol walk you stepped in dog poop again."

You get the picture.

Sad to say, what seemed wonderfully absurdist a year ago has long since been driven into the ground, particularly through the endless propagation of the "Snakes on an X" snowclone. (Snakes on a Blog has been the main clearinghouse for variations on the SoaP theme.) It all reached tiresome levels long before the movie's release last week. One diamond in the rough, however, is puzzlemaster Francis Heaney's Snakes on a Sudoku, which started off as a one-off puzzle idea on Heaney's blog and eventually turned into a full-fledged book. (It is, in fact, "the official Snakes on a Plane puzzle book." Accept no substitutes.)

[Update #1: Dave Wilton points out that Osho was better known as Bhagwan Shree Rajneesh, the leader of the highly controversial Osho-Rajneesh movement. He died in 1990, and the YouTube video is in fact marked as copyright 1984, a year before the earliest Usenet appearance of the piece on the versatility of fuck.]

[Update #2: Ron Hogan wonders if Osho's discourse on the F-bomb might owe something to George Carlin. The piece does bear a family resemblance to Carlin's analysis of shit and fuck in his controversial 1973 monologue, "Filthy Words" (which itself was a revamped version of the equally controversial "Seven Words You Can Never Say on Television"). The transcript of "Filthy Words" can be found in the appendix to FCC vs. Pacifica Foundation, the landmark obscenity case sparked by the broadcast of Carlin's routine on Pacifica's New York radio station, WBAI. Note, however, that Carlin never bothered with the grammatical nonsense pervading the netlore/Osho discussion of fuck.]

Posted by Benjamin Zimmer at 05:18 PM

God dead; "fuck" now the most important word in the language

Yes, it's an irretrievably silly idea, but a wonderful example of linguifying (a phenomenon named here by Geoff Pullum on July 3).  And by the Indian spiritual figure who calls himself Osho, so it has something of a higher imprimatur.  Having floated this remarkable proposal, Osho goes on, in a YouTube video, to riff at length on the uses of the word "fuck", exhibiting along the way the tenuous grasp of grammatical terminology that has so often nettled us here at Language Log Plaza.

"Osho - Strange Consequences" begins with an eye-catching screen:

When Friedrich
Nietzsche declared
"God is dead"

F*CK became the most
important word in the
English language

Osho then appears (clad in white and gray, and sitting on a silver-patterned throne-like chair) to announce, "If God is dead, then you lose the most important word in your language.  And you will need a substitute."  God (or perhaps "God", there's no way to tell) is one extreme, Osho explains, and when one extreme disappears, you will inevitably fall to the other extreme.  (Why don't you just promote the #2 item at that extreme, I wondered.  But I'm just a linguist, not a seer, so I probably don't understand these matters deeply enough.)  And so, he tells us, "fuck" has become the most important word in our language.  Beautiful linguification.

After some reflection on how Nietzsche would take this development, should he return from the dead, Osho undertakes to provide a report on the uses of the word "fuck".  And gives us another item in the rich genre of playing in public with the word.

First comes the grammatical prelude.  Osho tells us that it can be a transitive verb ("John fucked Mary" -- all these examples are his, by the way), an intransitive verb ("Mary was fucked by John" -- oh dear, well, passive verbs lack direct objects, but the label "intransitive" would normally be reserved for things like "Mary fucks like crazy"), or a noun ("Mary is a fine fuck").  Or "as an objective", as in "Mary is fucking beautiful".  This last bit of terminology isn't borderline; it's totally fucked.  But we are forever noting here on Language Log that most people -- including a lot of English professors and other people who really ought to know better -- are pretty hazy on grammatical terminology, using it primarily as intellectual parsley, an attractive garnish to the things they really care about.  Why should we expect more of a holy man?  After all, Osho did get "transitive verb" and "noun" right; he's batting at least .500, which is a pretty good average for connecting with grammatical terminology these days.

Then comes a long riff on uses of "fuck", to convey aggression, suspicion, enjoyment, request, and much more -- there's no point in trying to take this inventory of uses seriously -- and concluding advice to begin the day by meditatively repeating the mantra "Fuck you!" five times.  And he grins mischievously.  The audience goes wild.  (Why can't I get audiences like this?)

[Thanks to Vishy Venugopalan for the pointer.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:18 PM

ATM languages

The bank nearest to my house -- a Washington Mutual, or WaMu, as it often refers to itself -- has installed a spiffy new ATM, which moves beyond the standard language choices (English and Spanish) and offers two more.  Any guesses about what they are?

Well, language #3 is Russian and #4 Chinese.  These struck me as sensible choices, given the likely clientele for WaMu's services at this location.  Surely, I thought, the choices are different at other locations, where the population mix is different.  But no, as googling on <"Washington Mutual" ATM language> shows: it's Russian and Chinese everywhere.  As Andrew Leonard wondered on Salon on 27 April, why Russian?

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:02 PM

The communicative power of silence

While driving around yesterday, I caught the tail-end of this piece on All Things Considered. At first I could only hear enough of what was being said to figure out that it was a story about the planetary status of Pluto, something we've all been reading about here on Language Log (for more and less language-related reasons). So I began to listen more closely, and this is what I heard.

Robert Siegel is interviewing "Kepler College professor and astrologer Robert Hand" -- the piece is entitled "Astrologers Join Debate of Pluto's Planetary Status". But I don't figure this out until the third sentence of Hand's reply to Siegel; I have to admit that I was completely thrown off by the subtle distinction Hand makes between planets not being capable of indicating what a person can do vs. what a person should do in order to achieve some goal.

But the most interesting bit, in my opinion, is how much information is communicated by the long silence toward the end of Hand's reply. (The clip really must be heard to fully appreciate this.)

Siegel: You know, one of our listeners heard our correspondent David Kestenbaum's piece yesterday about this controversy, and sent us an e-mail saying that he had had his chart done some years ago, and the position of Pluto led the astrologer to conclude that he would never marry. Therefore, he wonders whether indeed that forecast might be rescinded based on the new status of Pluto, should it change.

Hand: It's not necessary to rescind that forecast; it should never have been made. No planet is capable of indicating absolutely that a person can't get married. All a planet can do is indicate what a person has to do in order to get married. [Siegel: Aha.] And sometimes that requires so much work on the part of a person that they're not likely to do it. But it isn't actually the planet that's preventing it, it's the person's own inclinations. I consider a forecast like that to be malpractice. [Silence: 3.6 secs.] And I have a lot of company.

Siegel: Well, Mr. Hand, thank you very much for talking with us. [Hand: You're welcome.] Robert Hand is an astrologer who spoke to us from his home in Northern Virginia just outside Washington D.C.

What do you think is being communicated by this long silence? Submit your answer here.

Posted by Eric Bakovic at 11:24 AM

Pluto is a dwarf planet, but not a planet

The International Astronomical Union has spoken, and Pluto is no longer to be classified as a "planet." It is, however, still considered a "dwarf planet." Don't be fooled by any preconceptions you might have about English hyponymy: a dwarf planet is not, in fact, a planet. A proposal to encompass "classical planets" and "dwarf planets" under the same "planet" umbrella failed, so Pluto has been definitively cast out of the planetary club.

Yet another resolution was confirmed supporting the creation of "a new category of trans-Neptunian objects" of which Pluto is the prototype. Even though this gesture to assuage Plutophiles passed, we'll have to wait to find out what to call members of the new category. There was no clear majority in favor of the leading candidate, "plutonian objects," so it will take a longer procedure by the IAU to establish how to refer to this anonymous family of Pluto-ish thingamajigs.

(The text of the resolutions, as well as up-to-date coverage from Prague, can be found at Rob Britt's LiveScience blog.)

[Update: See here for backlash against the confusing planet vs. dwarf planet distinction.]

Posted by Benjamin Zimmer at 10:17 AM

Where are moo from?

Below is an email from John Wells, sent in response to my query about the great outpouring of cow-dialect stories.

Experts have backed a claim by Somerset dairy farmers that cows moo with a regional accent. The phenomenon was noticed by members of the West Country Farmhouse Cheesemakers group, who put it down to the close bond between farmer and cow. The group also noted similar accent shifts in Midlands, Essex, Norfolk and Lancashire moos. John Wells, Professor of Phonetics at the University of London, said: "This phenomena is well attested in birds. You find distinct chirping accents in the same species around the country."

What I actually said was "This phenomenon...", but no matter. The words put into my mouth continue.

"This could also be true of cows. In small populations such as herds you would encounter identifiable dialectical variations which are most affected by the immediate peer group."

And there you can see from the strange word dialectical (= dialectal) that those are not my words at all but the inventions of a public relations firm.

They had been engaged by a cheese manufacturer, West Country Farmhouse Cheesemakers, to publicize their regional varieties of cheese. They telephoned me to ask whether there was any possibility that cows' moos might vary geographically. I told them I thought it highly unlikely; but that there was well established scientific evidence that several species of bird exhibit regional variability in their calls, so you could not entirely rule out the possibility. (To see some evidence re birds, do a Google search on "avian dialects".)

Cows, of course, do not in general form stable isolated populations such as would presumably be necessary to allow such regional diversity to develop. On the contrary, cattle are bought and sold and trucked around the country and indeed internationally.

The next thing I knew was that the PR people had put out a press release with their selective and garbled version of what I had said. It was embargoed until midnight Tuesday/Wednesday. Soon after midnight a radio station rang me and set up an interview for 00:55. Less than five hours later I was woken by a call from Australia pursuing the same matter, and for the next twelve hours my phone hardly stopped ringing. Ever compliant, I gave over a dozen radio interviews and made three television appearances. In every one I poured cold water on the suggestion of bovine dialects, while suggesting from time to time that if any funding body cared to come up with five million pounds I would be happy to direct a research project into this vital issue.

The story appeared in various places on the web: on the BBC, the British commercial television channel ITV, among the serious newspapers in the Guardian, and in The Register. A correspondent even sent me an article in Polish.

Which all goes to show that in August the media love a silly story, however implausible. If only my publishers could be similarly inventive when my new book comes out.

John Wells

That's it! Let's find out who this PR firm is, and get a consortium of language-related scholarly and scientific societies to engage their services! They figured out that the right way to focus public attention on the "protected designation of origin" for their clients' cheeses is to get the public thinking that West Country cows even moo differently from other cows. Perhaps a bit of the bobo fascination with terroir will, in turn, rub off on linguistics? (And I wonder if this PR firm was the same outfit that figured out that "email lowers IQ more than pot"...)

By the way, John's new book is English Intonation: an Introduction. It's coming out in Britain on Aug. 31, and at some later time in the U.S. I've already pre-ordered a copy from

[Update -- in today's Guardian, a letter from Peter Stockill in Middlesbrough:

I once worked on a farm at Glastonbury (Cows moo with an accent down on the dairy farm, August 23). The cows were able to wander around the orchard where they ate apples that had fallen from the trees. These apples fermented into scrumpy in contact with digestive juices. Watching the cows stagger was the most hilarious thing I have ever seen. Yes, Somerset cows have a drawl - it is because they are drunk. ]


[Update #2 -- Martin Torres writes:

I thought you might be interested to know that the story has made it all the way to Spain. I found it on the local newspaper of my city, San Sebastian, right on the last page. It's the one reserved for irrelevant curiosities such as Siamese twins or eccentric millionaires buying something in the Seychelles islands.

It is in this page that I found this brief piece, which I recognised instantly thanks to the Language Log. This is the link to the web version of El Diario Vasco: (link).

I'll point out the last sentence, regarding Wells. The line is pretty ambiguous on whether he accepts bovine accents or not, saying only that "he remarked that these variations have been found in birds as well".

What is more surprising is the first sentence, which places the "blame" of these studies on a group of British linguists. From your articles I gathered that Wells was for the most part the only one acquainted with linguistics in the whole story...

In any case, and since the news piece is credited to the EFE agency, it's probably on every newspaper.

I expect that there are versions of this story in French, German, Russian, Chinese and so on, around the globe. I hope someone got a bonus at that PR firm -- has a local variety of cheese ever gotten so much free publicity for no particular reason? Now, if we could only figure out how to harness that kind of PR power to turn interest in cheese into interest in linguistics, instead of the other way around... ]

Posted by Mark Liberman at 06:14 AM

August 23, 2006

Oh, the moos you can moo

A couple of years ago it was ducks, and now it's cows. According to the BBC:

Cows have regional accents like humans, language specialists have suggested.

They decided to examine the issue after dairy farmers noticed their cows had slightly different moos, depending on which herd they came from.

I would normally refer this directly to the Language Log humor writers, who value BBC science stories above rubies. But the apparent source of this one is a real and serious scientist:

John Wells, Professor of Phonetics at the University of London, said regional twangs had been seen before in birds.

A bit later on, the story quotes Prof. Wells making a suggestion about mechanisms:

Prof Wells felt the accents could result from their contemporaries.

He said: "This phenomenon is well attested in birds. You find distinct chirping accents in the same species around the country.

"This could also be true of cows.

"In small populations such as herds you would encounter identifiable dialectical variations which are most affected by the immediate peer group."

Unfortunately, the BBC's reporter didn't give Prof. Wells space to provide any details, for which the reporter relied on another source:

The farmers in Somerset who noticed the phenomenon said it may have been the result of the close bond between them and their animals.

Farmer Lloyd Green, from Glastonbury, said: "I spend a lot of time with my ones and they definitely moo with a Somerset drawl.

"I've spoken to the other farmers in the West Country group and they have noticed a similar development in their own herds.

"It works the same as with dogs - the closer a farmer's bond is with his animals, the easier it is for them to pick up his accent."

John Wells is one of the world's most eminent phoneticians, and the author of a terrific blog that I read regularly. I've written to invite him to tell us more about the bovine (and canine?) versions of British regional speech. I'm hoping for audio clips, transcriptions, spectrograms, scatter plots! (The IPA may not be up to the task, though perhaps the International Phonetic Association already has a committee working on the extra symbols needed for ruminant regional variation.)

Then again, it could be that the BBC reporter just called John for a comment, and John said the sort of thing that I might have said: "Well, I'm not sure about cows with Somerset drawls. But there certainly can be regional variation in animal vocalizations --cases of that sort of thing have been studied among birds for more than 60 years (e.g. the review in R. B. Payne, "Population structure and social behavior: models for testing the ecological significance of song dialects in birds", in R.D. Alexander and D.W. Tinkle, Eds., Natural selection and social behavior, 1981). However, for this to happen with cows, it would have to be the case that they learn to moo from the other members of their herds, the way that some songbirds learn to sing from the songs that they hear in the nest. I haven't heard this suggested, though -- the literature on ruminant dialectology is, shall we say, a little thin, [blah, blah]." And then I could have been the expert cited to back up Farmer Green's intuitions about how Bessie and Bossy speak just like him.

On the other hand, maybe there's more to moos than I knew. We'll see.

[Hat tip to Julia Hockenmaier]

[Update -- Geoff Nathan sent a link to a CNN publication of a Reuters newswire story, which suggests that it's a farmers' group (the "West Country Farmhouse Cheesemakers") that contacted John Wells:

Dom Lane, spokesman for a group called the West Country Farmhouse Cheesemakers to which Green belongs, said it contacted John Wells, Professor of Phonetics at University College London, who said that a similar phenomenon had been found in birds.

"You find distinct chirping accents in the same species around the country. This could also be true of cows," Wells said on the group's Web site.

According to Lane, accents among cows probably develop in a similar way as among humans, and resulted from spending time with farmers with differing accents.

"Apparently the biggest influence on accents is peer groups -- on children in the playground, for example," he said. "Herds are quite tight-knit communities and don't tend to leave the area."

He added that more scientific research was needed to prove what was just an anecdotal theory at this stage.

If John Wells is quoted on the West Country Farmhouse Cheesemakers web site, neither a bit of browsing nor a search via Google was able to find it. In any case, alas, I don't think we're going to get the audio clips, transcripts, spectrograms and scatter plots for a while yet! However, as a lover of cheese and rural life, as well as a devoté of animal communication, I'm open to offers from anyone who'd like to sponsor some of that needed scientific research.

Seriously, this is a once-in-a-lifetime opportunity to update the old joke about the out-of-work physicist who took a job as a consultant to a dairy cooperative, and produced a report that began "Consider a perfectly spherical cow, radiating milk isotropically." I have an agent-based model for social convergence on shared word pronunciations that I've been meaning for a while to extend to cover dialect development. "Consider a herd of markovian cows, generating moos randomly..."]
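For the curious, here is what a herd of markovian cows might look like in a few lines of Python. This is a toy sketch only, not the agent-based model mentioned above: the function names, the one-number "moo" per cow, the starting values, and the update rule (a listener moves part of the way toward a randomly chosen herd-mate) are all assumptions made for illustration.

```python
import random

def simulate_herd(moos, steps=2000, rate=0.5, seed=0):
    """Toy convergence model (hypothetical): at each step one random cow
    shifts its moo part of the way toward another cow in the same herd.
    The next state depends only on the current one, hence 'markovian'."""
    rng = random.Random(seed)
    moos = list(moos)
    for _ in range(steps):
        listener, speaker = rng.sample(range(len(moos)), 2)
        moos[listener] += rate * (moos[speaker] - moos[listener])
    return moos

def spread(moos):
    """Range of moo values within a herd."""
    return max(moos) - min(moos)

# Two herds that never meet; the starting values are invented.
somerset_start = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
norfolk_start = [2.0, 2.2, 2.4, 2.6, 2.8, 3.0]

somerset = simulate_herd(somerset_start, seed=1)
norfolk = simulate_herd(norfolk_start, seed=2)

print("Somerset spread before/after:", spread(somerset_start), spread(somerset))
print("Norfolk spread before/after:", spread(norfolk_start), spread(norfolk))
print("Herds still distinct:", max(somerset) < min(norfolk))
```

Because each update moves a cow to a point between two existing moos, a herd's values can only draw together, while herds that never mix stay apart. Whether real cows learn their moos from herd-mates at all is, of course, exactly the open question.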

[Update: John has blogged about his encounter with the fourth estate:

It's the silly season -- August, when there is not much political news, so the newspapers print stories that are not altogether serious.

I was telephoned by a public relations consultant on behalf of a cheese manufacturing company in Somerset. Was it possible, they asked, that the local cows might moo with a west-of-England accent? I told them that I thought it was highly unlikely, but that there had been serious research showing that various species of bird exhibit geographical variation in their calls. And if birds and human beings have local accents, you can't entirely rule out that cows might too.

The PR company issued a press release. They showed it to me only after they had sent it out, which meant that it was too late for me to protest that they had put into my mouth the solecism "This phenomena is...". Of course I would always say only "This phenomenon is..." or "These phenomena are".

The press release was embargoed until midnight. At half past midnight yesterday my phone rang: it was a call from BBC Radio Five Live setting up a telephone interview for 00:55. I snatched a few hours sleep, but was woken by a call from Australia, about bovine dialects, at about 05:45. From then on my phone hardly stopped ringing all morning.

In all yesterday I did twelve phone interviews plus three television interviews, one of them transmitted live from Vauxhall City Farm in central London in the rain, alongside a disconsolate heifer.

You can read on-line reports from the BBC, ITV, the Guardian, and something called The Register.

I shall be invoicing the PR company for a full day's work.

If you'll pardon the comment, John, perhaps you aren't thinking about this on a large enough scale. Rather than merely "a full day's work", how about an international multi-site investigation, to correlate the vocalizations of domestic mammals with the taste of local cheeses, while also taking account of the influence of soil and weather as indicated by carefully-monitored physiological reactions to a selection of regional wines?]

[Update: more from John here, and some general commentary on the global diffusion of this story here.]

Posted by Mark Liberman at 04:17 PM

Iceballs, revisability, language, and intelligent life in the universe

It's looking bad for Pluto. The International Astronomical Union is facing an upwelling of protest and a new motion that would deny planethood to the tilt-angled weird-orbited iceball from the ragged fringes of the Kuiper belt that was irresponsibly added to the planet roster in 1930. Of course, you may be wondering, why is this an issue of interest to linguists? Why are you reading about it on Language Log, where you normally turn for a brief respite from astronomy, ferret training, farm machinery insurance, whatever? Well, Adrian Morgan wrote to me with these ruminations on the connection:

It occurs to me that there is, to some extent, a parallel between the long-running debate about the planetary status of Pluto and the controversy between certain competing ideas about grammar. One side thinks it all-important that what people were taught in school should remain true forever (hence, Pluto is a planet). The other side thinks that classifications should be based on observable facts about what the universe really does, and revised when necessary (hence, Pluto is a Kuiper belt object). Sound familiar? I thought so.

It certainly does sound familiar. I think the parallel is spot on. For people who want to make sure the material they were taught in elementary school and high school stays unchanged forever, the path is clear: stay away from all intellectual activity, avoid contact with anyone who is intellectually curious, live a dull and unexamined life. You'll be fine. But astronomers expect that evidence should be relevant to decisions about how to apply scientific concepts, and that in the light of new discoveries it will sometimes be necessary to revise previous decisions about how to apply them. The thing about linguists is that they take that attitude toward language.

You don't have to, though. Really, you don't. You can just stay away from Language Log and continue to believe that faith is a verb and that split infinitives can and should be avoided, if that's what you'd like. And Pluto can be forever a planet.

Posted by Geoffrey K. Pullum at 12:58 PM

Good-bye plutons, hello Plutonian objects

The latest news from Prague is that the International Astronomical Union is now leaning towards demoting Pluto from planethood rather than elevating three new candidates. Also, instead of creating a classification of not-so-planety planets called plutons, the IAU is now considering throwing the sadly downgraded Pluto a bone by declaring it the prototype of a new class of subplanetary bodies. Pluton seems to be out of the running as a name for members of the category, while plutoid and plutonoid are faring no better. The New Scientist reports that Plutonian object is currently the "least unpopular choice." (Quite a ringing endorsement!) Owen Gingerich, chairman of the IAU's Planet Definition Committee, explains, "The purpose of this is to give a nod to those people who are great Pluto fans."

That's right, Pluto fans! You might still get a modicum of respect after all. Don't let the Plutophobes get you down.

(Pluto's proposed new status reminds me of the scene in The Life Aquatic with Steve Zissou where a petulant Klaus interrupts Team Zissou's "lightning strike rescue op" to complain about being relegated to the B Squad. Steve patiently explains, "You might be on B Squad, but you're the B Squad leader!" So Pluto will now get to be the B Squad leader.)

The term pluton is evidently losing out in large part due to the objections of geologists that I discussed in my last post. As Gingerich told Nature, the IAU members did not appreciate that pluton was a common geological term:

Gingerich ... says that they were aware of its usage amongst geologists, but unaware of its importance to the field. "Since the term is not in the MS Word or the WordPerfect spell checkers, we thought it was not that common," Gingerich wrote in an e-mail to The geologic definition of the word does appear in common dictionaries, including the Oxford English.

(Yes, the astronomers really did rely on spellchecker software when assessing pluton. Deutsche Presse-Agentur reports, "One panel member quipped that geologists should attack software-maker Microsoft, not astronomers, for the 'pluton' oversight because the word did not appear in the panel's spellchecker.")

Bucking the anti-pluton sentiment, Gingerich still thinks the two definitions of pluton could happily coexist: "We think words can (and frequently do) have alternative meanings - for example, is there mercury on Mercury?" (To which I'd add: there's definitely earth on Earth.) Despite this sensible argument for polysemy, it looks like pluton is a non-starter.

(See also coverage in the New York Times, which mentions mnemonics that astronomers were devising for the proposed twelve-planet lineup, though sadly there's no mention of Geoff Pullum's attempt. And to satisfy a request from Vili at Vertebrate Silence, here's a suggestion for the Pluto-less arrangement: My Very Excellent Mother Just Sent Us Naugahyde.)

[Update, 8/24/06: Good-bye plutonian objects, hello ... ??? The IAU shot down "plutonian objects" as the name for the new class with the now nonplanetary Pluto as its prototype. Details here.]

Posted by Benjamin Zimmer at 11:01 AM

August 22, 2006

Excessive activity leads to loss of intelligence

As readers well know, Language Log has been hot on the trail of false claims about language and gender recently (here, here, and here). Now it's time for the Geriatric Division to kick in because, I suppose, as we grow older we worry about our loss of mental capacity. How to guard against this comes from Pope Benedict XVI himself, who advises that work is the major culprit leading to IDS (intelligence deficiency syndrome). As proof, the Pope cites the 12th century writings of St. Bernard, who made this point:

Watch out for the dangers of an excessive activity, whatever ... the job you hold, because many jobs often lead to the 'hardening of the heart,' as well as 'suffering of the spirit, loss of intelligence.'

I don't know about you but I never knew this before. Since St. Bernard penned this for the benefit of popes, one might conclude that it relates only to them. But no, says the "79-year-old" Pope Benedict. He says it's valid for every kind of work, including, I would think, linguistics and even Language Loggers.

Whether or not the bosses upstairs have noticed, I want them to know that I've been relatively silent recently simply because I've been trying hard to preserve my spirit and intelligence. Okay, so we don't have any empirical evidence that the Pope is right about this one. Who needs evidence? It matters little in an age when people can make claims about virtually anything with little or none of it.

P.S. And while I'm at it, I'd like to know why the Associated Press seems to feel it necessary to tell us that the pope is 79 years old. Any suggestions?

Posted by Roger Shuy at 02:40 PM

Bringin ablative from Latin

Some hip-hop artists might call themselves Linguistics, but others actually rap about the subject. Straight outta Vancouver, BC (Canada, y'all) come some linguistics students/artists with Schwhat's up? and Morphologilistic. Check 'em out.

Selections from the lyrics to Morphologilistic:

I'm bringin ablative from Latin
my nominative is smoother than a kitten made of satin
and I'm flattenin your vowels from the bottom to the top
and to your sloppy laterals I drop a glottal stop

My civic duty's to inform you about all things velar
if you need a consonant then consider me your dealer
or not. You've got a lot of sounds of your own
I wouldn't want your business even if you had an allophone

You down wit S O V?
yeah, you me know
from the CP to the DP I flow
know this, though, I make no apology for my zeal
from etymology to phonology
it's a quality that consumes all o me constantly

but I'm heartfelt and emotive, listening to your vowels
when we're through with gentle ones we're movin on to howls
hooting like owls cause I'm a linguistic freak
"oh you're a linguist, eh? how many languages do you speak?"

[ Thanks to Ryan Field for sending the link. ]

[ Comments? ]

Posted by Eric Bakovic at 11:34 AM

More on rats and men and women

In response to my post on Leonard Sax's account of the "emerging science" behind the claim that "Girls and boys ... see the world differently", David Hilbert sent some additional information. David argues that Sax's account of sex differences in vision is even more misleading, and also more internally incoherent, than I had understood. Below, I've reproduced David's note, with his permission.

We're getting pretty far away from speech and language, it's true, but this started because David Brooks picked up on Sax's misleading description of some shaky findings on sex differences in perception of emotional faces, and concluded that boys and girls need to be given different books to read. I've also just discussed Sax's claim that "Girls and boys... hear differently", and therefore need different teaching styles and different classroom environments.

[Guest post by David Hilbert]

In reading your discussion of Sax's book I was struck by his purported explanation for the claimed difference in retinal thickness: "That's because the male retina has mostly the larger, thicker M cells while the female retina has predominantly the smaller, thinner P ganglion cells."  Although I'm a philosopher, not a visual scientist, my research does require basic knowledge of retinal anatomy and physiology and I know that there are 7-9 times as many P-cells as M-cells in the primate retina.  (My source here is a useful textbook, Wandell, B. A. (1995). Foundations of Vision. Sunderland, MA, Sinauer Associates, pp 121-122.) This statistic is hard to reconcile with the claimed predominance of M-cells in males.  I have been unable to find any authoritative source for the claim that there are sex-differences in the ratio of M-cells to P-cells although I have found it repeated elsewhere.  The only evidence I have found offered is to work backwards from differences in performance between males and females to claims about their inferred physiological basis.  These kinds of inferences are notoriously tricky and I have not been impressed with the reasoning I have encountered.  I went to Borders to look at Sax's book, I don't want to buy it, and the only source in the vicinity of the claim is the second of the rat papers.  The only relevant material in that paper is the claim that, "cytoarchitectonic studies are warranted to determine the number, type and density of cells as well as the thickness of each of the cellular layers before such hypotheses can be validated." (p. 135)  In other words they don't know the reason for the differences in retinal thickness that they measured.

I was able to find a paper that had data on the area of the magnocellular and parvocellular layers of the lateral geniculate nucleus of the thalamus in humans broken down by sex.  (Andrews, T. J., S. D. Halpern, et al. (1997). Correlated Size Variations in Human Visual Cortex, Lateral Geniculate Nucleus, and Optic Tract. J. Neurosci. 17(8): 2859-68.) These layers in the LGN are the recipients of the projection from the M-cells and P-cells in the retina.  Although the paper doesn't discuss sex differences I did compute the average ratio of the areas of the M- and P-layers for all subjects, female subjects and male subjects.  The overall ratio was .324; for females .329 and for males .318.  This very small sample (10 female hemispheres, 9 male hemispheres), contradicts Sax's claim although the difference is probably not significant.  I am not an expert on this topic and there may be better data that I have not been able to find.
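The three ratios above can be checked against one another. The following is a minimal sketch, assuming that the overall figure is a mean in which each of the 19 hemispheres counts equally; the variable and function names are made up for illustration, and only the counts and per-sex ratios come from the text.

```python
# Hemisphere counts and mean M/P area ratios as quoted in the text.
n_female, ratio_female = 10, 0.329
n_male, ratio_male = 9, 0.318

def pooled_mean(groups):
    """Weighted mean of group means; groups is a list of (count, mean).
    Assumes every hemisphere contributes equally to the overall figure."""
    total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total

overall = pooled_mean([(n_female, ratio_female), (n_male, ratio_male)])
print(f"pooled M/P ratio: {overall:.3f}")  # prints 0.324
print(f"female minus male: {ratio_female - ratio_male:.3f}")  # prints 0.011
```

Under that assumption the pooled value rounds to .324, matching the overall ratio reported, and the female ratio is the larger of the two by about .011, the opposite direction from a male predominance of M-cells.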

The citation to rat data for this type of claim is more irresponsible than you indicate.  It would be hard to pick a sighted mammal with a visual system more different from the human one than the rat.  Rats are nocturnal animals with rod dominated retinae.  The human retina is cone-dominated and also is characterized by a number of specializations found only in primates.  Among these are the P-cells which are not found in mammals other than primates (although there are analogues in the cat which partly explains why cats provide a much more useful model for human vision).  The most bizarre claim in Sax's discussion of vision is found in the table on the page following the one you quote from.  There he asserts that the M-cells receive input from the rods while the P-cells receive cone inputs.  Notice that given his early claim about sex differences this would mean that males have primarily rod-driven vision.  We now have an explanation for the insistence of men on wearing their sunglasses indoors.

[Guest post by David Hilbert]

[Here's the table that David is referring to, from p. 22 of Leonard Sax's book Why Gender Matters:

Cell type: | P cells | M cells
Are wired predominantly to . . . | Cones | Rods
Are located mostly in . . . | The center of the retina (center of the field of vision) | All throughout the retina (entire field of vision, peripheral and central)
Are best adapted to detect . . . | Color and texture | Location, direction, and speed
Answer the question: | "What is it?" | "Where is it now? Where is it going? How fast is it moving?"
Ultimately project to: | Inferior temporal cortex | Posterior parietal cortex
Predominate in: | Females (more P cells than M cells) | Males (more M cells than P cells)

(Warning to unwary readers: some of the "facts" in this table are clearly false, and none should be relied on without checking in more reliable sources.)

I'd like to point out that Sax's presentation of this material is getting very wide distribution, and is having a significant impact on the way that serious people think about educational policy today. ]

Posted by Mark Liberman at 10:33 AM

Leonard Sax on hearing

Leonard Sax's recent book, Why Gender Matters: What Parents and Teachers Need to Know about the Emerging Science of Sex Differences, presents an impassioned argument for single-sex education. His basic message is that "Girls and boys play differently. They learn differently. They fight differently. They see the world differently. They hear differently." He claims that these are not stereotypes, but scientific facts. And therefore, boys and girls should be educated differently -- and separately.

In an earlier post, we took a look at Sax's argument that girls and boys "see the world differently", and found that he builds his case by presenting data about sexual dimorphism in rat retinas as if it were data about human retinas, when the (easily available) comparable human data is not at all like the rat data. In this post, we're going to consider what Sax has to say about how girls and boys "hear differently". Part of the problem here is the usual rhetorical move of claiming that boys and girls (or men and women) are essentially different as groups, when in fact the difference in average values between the sexes is small relative to the within-sex variation. But in this case, bizarrely, one of the two pieces of research that Sax cites actually found a difference in means that's in the opposite direction from what he claims.

On the web site for Sax's book, he offers a page on sex differences in hearing, which expands on the content of the corresponding section of his book. Below, I'll quote some passages from that web page, in blue, with interspersed commentary, with some figures and quotations from one of the key sources that Sax cites.

The first systematic evaluation of the hearing of girls and boys was performed by Professor John Corso of Penn State University in the late 1950's and early 1960's. Dr. Corso simply used a soundproof booth, headphones, and a tone generator. He consistently found that the girls hear better than boys do, especially in the range of frequencies above 2 kHz. See John Corso, Age and sex differences in thresholds, Journal of the Acoustical Society of America, 31:489-507, 1959; also John Corso, Aging and auditory thresholds in men and women, Archives of Environmental Health, 6:350-356, 1963.

The title of Corso 1959 is actually "Age and sex differences in pure-tone thresholds", and its page numbers are actually 498-507, not 489-507. These bits of bibliographical sloppiness are trivialities, but they're an initial indication that Sax is not applying a very high standard of care here, despite the centrality of this work to his argument. There's a much more important problem with the way that Sax represents the content of Corso's work. The difference that Corso found between average male and female hearing thresholds is generally about 1/4 to 1/2 of a standard deviation, depending on age and frequency. As a result, the distributions are so heavily overlapped that (as I put it in another post on the same topic), "[i]f you pick a man and a woman (or a boy and a girl) at random, the chances are about 6 in 10 that the girl's hearing will be more sensitive -- but about 4 in 10 that the boy's hearing will be more sensitive." It's misleading, at best, to describe this by saying that "[Corso] consistently found that girls hear better than boys do".
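To see where a figure like "6 in 10" can come from, here is a minimal sketch. It assumes, purely for illustration, that thresholds in both sexes are normally distributed with the same standard deviation; nothing quoted here from Corso's papers says so. Under that assumption, the chance that a randomly chosen girl has the more sensitive hearing is Φ(d/√2), where d is the difference in means in standard-deviation units. The function name is hypothetical.

```python
import math

def prob_superiority(d):
    """P(a random draw from group A beats a random draw from group B),
    assuming two normal distributions with equal SDs whose means differ by d SDs."""
    # The difference of two independent normals has SD sqrt(2), so the
    # probability is Phi(d / sqrt(2)), which equals 0.5 * (1 + erf(d / 2)).
    return 0.5 * (1.0 + math.erf(d / 2.0))

for d in (0.25, 0.5):
    print(f"d = {d}: {prob_superiority(d):.2f}")
```

For differences of 1/4 and 1/2 of a standard deviation this gives roughly 0.57 and 0.64, which brackets the "about 6 in 10" figure.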

Now comes the bizarre part.

Pediatric audiologists Barbara Cone-Wesson, Glendy Ramirez, and Yvonne Sininger have done careful studies of the hearing of newborn babies. When any baby or child (or adult for that matter) hears a sound, there's an immediate reaction, called an acoustic brain response.

Actually, it's called the "auditory brainstem response". (As we'll see, this terminological sloppiness is followed by some extraordinarily careless -- i.e. backwards -- reading of the reported results.) It's also a bit misleading to call the ABR an "immediate reaction". The infants are not "reacting" in the sense of perking up and looking around -- the tests were done while they were sleeping. ABRs are event-related potentials, "very small electrical voltage potentials originating from the brain recorded from the scalp in response to an auditory stimulus", and representing the automatic passage of signals from the cochlea via the 8th cranial nerve through various brainstem structures. "Wave V", which is what Cone-Wesson et al. rely on, "is believed to originate from the vicinity of the inferior colliculus".

Cone-Wesson and her colleagues decided to measure the acoustic brain response of more than 60 newborn girls and boys.

For the record, they measured the "auditory brainstem response" of 72 neonates.

For a 1500 Hz tone played to the right ear, they found that the average girl baby had an acoustic brain response about 80% greater than the response of the average baby boy. Here are the references for those studies: Barbara Cone-Wesson and Glendy Ramirez. Hearing sensitivity in newborns estimated from ABRs to bone-conducted sounds, Journal of the American Academy of Audiology, 8:299-307, 1997.
Yvonne Sininger, Barbara Cone-Wesson, and Carolina Abdala. Gender distinctions and lateral asymmetry in the low-level auditory brainstem response of the human neonate. Hearing Research, 126:58-66, 1998.
The fact that girl babies have a more sensitive threshold for very quiet sounds does not by itself prove that girls hear better than boys when the sound is louder;
[emphasis added]

Here's where it gets weird. What these (eminent and deservedly respected) researchers found in their (exemplary) study was exactly the opposite of the way that Sax describes it in the emphasized phrase above. In their own words, Yvonne Sininger, Barbara Cone-Wesson, and Carolina Abdala, "Gender distinctions and lateral asymmetry in the low-level auditory brainstem response of the human neonate", Hearing Research, 126:58-66, 1998, wrote:

This study revealed three significant and surprising results. First, ABR thresholds are lower for male newborns than for females. Second, wave V amplitudes elicited with low-level stimuli are larger when the right ear is stimulated in male and female neonates. Third, wave V amplitudes elicited by low-level stimuli are larger in female than male neonates but only in the right ear.

In other words, the boy babies had a "more sensitive threshold" than the girl babies, on average, not the other way around! Repeating it another way, the infant boys showed a statistically significant ABR "wave V" response to softer stimuli, on average, than the girls did. Of course -- as usual -- the differences were small compared to the within-sex variation:

Fig. 1. Mean ABR thresholds for clicks and tone bursts plotted as a function of gender (A) and ear stimulated (B). Gender and ear effects are condensed across frequency in bar graphs at right. Only gender effects reach significance (P<0.05). Error bars indicate ±one standard deviation.

[For the geeks among you, SPL is "sound pressure level" in dB relative to a reference level (typically 20 µPa), and peSPL is the "peak equivalent" SPL, defined as the SPL of a 1000-Hz tone whose maximum amplitude is the same as the peak value of a transient signal like a click.]
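For readers who want the SPL definition in concrete terms, here is a minimal sketch using the standard formula, 20 times the base-10 logarithm of the ratio of the sound pressure to the reference pressure. The function and constant names are hypothetical, and the sketch does not attempt the peak-equivalent (peSPL) calculation.

```python
import math

P_REF_PA = 20e-6  # reference pressure of 20 micropascals, as in the note above

def spl_db(pressure_pa, p_ref_pa=P_REF_PA):
    """Sound pressure level in dB relative to p_ref_pa."""
    return 20.0 * math.log10(pressure_pa / p_ref_pa)

print(round(spl_db(20e-6), 1))   # the reference pressure itself
print(round(spl_db(200e-6), 1))  # ten times the reference pressure
```

A sound at the reference pressure comes out at 0 dB SPL, and each tenfold increase in pressure adds 20 dB.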

At signal levels where responses were well established, girls' ABRs were (on average) somewhat greater in amplitude than the boys' ABRs were (in terms of voltages measured through the scalp, whatever that really means). Again, as usual, the between-sex differences were small relative to the within-sex variation:

Fig. 5. Mean wave V amplitude from neonates in response to low-level stimuli. Latency is plotted by gender (A) and by ear (B). Bar graph at right condenses data across frequency. Error bars indicate ±one standard deviation.

And the researchers note that the combination of lower thresholds and lower amplitudes for boys vs. girls is "paradoxical". So when Sax writes that "the average girl baby had an acoustic brain response about 80% greater than the response of the average baby boy", we move from carelessness to misrepresentation. Let's ignore the fact that he's got the terminology wrong. The cited researchers tested the babies' responses to clicks and to tones at four different frequencies. He ignores (in the book) or inverts (on the web page) the threshold differences. It's only for one of the five stimulus conditions -- the 1500 Hz tone -- that there's much of a difference between the sexes in the response amplitude. The overall mean Wave V amplitude (which is the only kind of amplitude data reported) is roughly .2 μV for the boys and .25 μV for the girls. I make that 25% greater, not 80% greater. The difference in means for the 1500 Hz tones might be 80%, but what about all the other stimulus conditions? (And look at the error bars in the 1500 Hz condition!)
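The arithmetic behind "25% greater, not 80% greater" can be sketched as follows. The two amplitudes are the rough overall means quoted above, and the function name is hypothetical.

```python
def percent_greater(a, b):
    """How much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0

boys_uv = 0.20   # rough overall mean wave V amplitude for boys, in microvolts
girls_uv = 0.25  # rough overall mean wave V amplitude for girls, in microvolts

print(round(percent_greater(girls_uv, boys_uv)))

# For comparison: the girls' mean that an 80% advantage over the boys would require.
print(round(boys_uv * 1.8, 2))
```

With these rough values the girls' mean is about 25% greater than the boys', and an 80% advantage would need a girls' mean of about .36 μV.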

[A couple of caveats are in order here. While I'm not an expert in this technology, it strikes me that there are a lot of things besides the brain in the electrical pathway between the brainstem and the electrodes measuring the voltage -- blood, bone, flesh, and skin among other things -- and it's not clear to me how we can be sure that the voltage differences didn't result from differences in these non-neurological components. There are some other questions to explore about this experiment as well -- thus "female infants were tested at an average of 127 h post partum whereas male infants were on average 78.3 h old", and "The slope of the regression line for the threshold hours post partum function is positive for both male (+0.032×hours) and female (+0.027×hours) infants." The study's authors test and reject this explanation on statistical grounds, but I'd feel better about it if I could explore their raw data a bit.]
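To get a rough sense of how large that age confound could be, here is a minimal sketch. It assumes that threshold rises linearly with age at the quoted slopes, and that the slopes are expressed in threshold units (presumably dB) per hour; both are assumptions for illustration rather than claims about the authors' analysis. The names are hypothetical.

```python
female_age_h = 127.0  # average age of female infants at testing, in hours
male_age_h = 78.3     # average age of male infants at testing, in hours

# Quoted regression slopes: threshold units (presumably dB) per hour post partum.
slopes = {"male": 0.032, "female": 0.027}

age_gap_h = female_age_h - male_age_h

def predicted_shift(slope_per_hour, hours):
    """Threshold change predicted by a linear age effect over the given hours."""
    return slope_per_hour * hours

for sex, slope in slopes.items():
    print(f"{sex} slope: {predicted_shift(slope, age_gap_h):.2f}")
```

On these assumptions, the roughly 49-hour age gap would by itself predict a threshold difference of something like 1.3 to 1.6 units, which is why the age question seems worth checking against the raw data.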

This study goes a long way toward establishing the conclusion that its authors came to see in its data. They began with the assumption that the well-known sex differences in hearing were due to nurture, not nature:

Surveys of auditory sensitivity have consistently shown adult females to have superior hearing (lower thresholds) above 1000 Hz (Corso, 1963; Glorig and Roberts, 1965; Axelsson and Lindgren, 1981) whereas low frequency sensitivity (500 Hz and below) seems to be better in males (Jerger et al., 1993). Hearing decrement with age is also more rapid in males than females (Pearson et al., 1995; Kryter, 1983). Sex-related differences in auditory sensitivity are often attributed to greater occupational and recreational noise exposure in males (Kryter, 1983) because no significant differences in hearing related to sex were found when studying adult tribal populations in non-industrialized areas free of high levels of noise (Rosen et al., 1962; Goycoolea et al., 1986). This led to the conclusion that no inherent differences existed in sensitivity between males and females and left the assumption that gender-related differences in hearing could be attributed to differences in susceptibility to noise-induced hearing loss alone.

As a result, they did not really plan their study for the purpose of studying sex differences:

It is important to note that the number of subjects in this study is relatively small. The original purpose of this study was to characterize ABR thresholds in newborns. Whereas gender and ear randomizations were included in the original design, it was not anticipated that distinctions in ABR threshold based on these factors would be found.

However, they did find sex differences in their data, and they make a convincing argument that the differences are probably due to "nature" (i.e. hormonal or sex-linked gene expression differences) rather than "nurture" (i.e. different life experiences). BUT, crucially, the differences between the sexes are so small relative to the within-sex variation that no possible conclusions can be drawn about sex-related educational policy -- and, as it happens, the differences in thresholds are in the opposite direction from Sax's description.

Because the between-groups differences are so small, and the situation with newborns is apparently so different from the situation even a short time later in life, these results have no logical bearing at all on matters of educational policy. However, Sax's mistake in describing the direction of the threshold effect does matter, at least to his rhetoric if not to his logic. The point of all this, for Sax, is not just that boys and girls are different, but that they're different in particular ways that happen to coincide with a common set of sexual stereotypes about sensitivity vs. insensitivity (and other metaphorically-related oppositions). Sax writes (Why Gender Matters, p. 18):

The difference in how girls and boys hear also has major implications for how you should talk to your children. I can't count the number of times a father has told me, "My daughter says I yell at her. I've never yelled at her. I just speak to her in a normal tone of voice and she says I'm yelling." If a forty-three-year-old man speaks in what he thinks is a "normal tone of voice" to a seventeen-year-old girl, that girl is going to experience his voice as being about ten times louder than what the man is hearing. [...]

The gender difference in hearing also suggests different strategies for the classroom. More than thirty years ago, psychologist Colin Elliot demonstrated that eleven-year-old girls are distracted by noise levels about ten times softer than noise levels that boys find distracting. ... One reason for that difference, of course, is that eleven-year-old girls hear better. If you're teaching girls, don't raise your voice .... [but] the rules are different when you're teaching boys.

But one of the major pieces of research that Sax cites actually finds that newborn boys have somewhat more sensitive hearing, on average, than newborn girls do! I haven't tracked down the references that Sax gives to support his striking statements about girls perceiving voices and noises as "ten times louder" than boys do, but I'm willing to guess what I'll find if I look into it. [Update: and, alas, I guessed right. See "Girls and boys and classroom noise" (9/9/2006).]

My conclusion from all this is that Leonard Sax has no serious interest in the science of sex differences. He's a politician, making a political argument. For all I know, his political goal -- single-sex education -- might be a good thing. But he should stop pretending that he's got science on his side, or else he should start paying some minimal attention to what the science actually says.

[A list of Language Log posts on Leonard Sax and Why Gender Matters:

"David Brooks, cognitive neuroscientist" (6/12/2006)
"Are men emotional children?" (6/24/2006)
"Of rats and (wo)men" (8/19/2006)
"Leonard Sax on hearing" (8/22/2006)
"More on rats and men and women" (8/22/2006)
"The emerging science of gendered yelling" (9/5/2006)
"Girls and boys and classroom noise" (9/9/2006)]


Posted by Mark Liberman at 06:18 AM

August 21, 2006

No plural shifting term

I just popped down to the Starbucks on the ground floor at Language Log Plaza and found when I got back to my corridor that the place is buzzing with a new problem from Caroline Henton and Wade Dowdell: the (supposed) problem of finding a term for shifting the "s" to the end in plurals like "WMDs" (compare "Weapons of Mass Destruction", which some people seem to think might suggest one would have expected the "s" before the "M"). I wish people would just wait for me to come back up with my latte and ask me. I have the answer. It is that there is no term, and there won't be one, and there shouldn't be, because there is no shifting here. The problem arises purely from a misapprehension.

There is a process of forming new nouns by concatenating initial letters of phrases. The process is called "initialism" in The Cambridge Grammar of the English Language (chapter 19, pp. 1632-1634). There are varieties of initialism: acronyms like SARS or UNESCO and abbreviations like BMW or WMD. But both kinds simply form nouns (they're discussed in our chapter on lexical word formation). So once WMD is a noun, its plural is formed in the usual way: "s" on the end. There is no "transference of position" for the -s suffix. That's why there's no special name for such transfer. WMD is just a noun pronounced "double-you-em-dee". Its plural is regular, and naturally, it sounds like "double-you-em-deeze". In the written form, you just write the abbreviation (or acronym) in capitals and add a lower-case "s" on the end.

(In rare cases an apostrophe is used too, if the look of it would be highly confusing otherwise: you don't write "is" for the plural of the word "i", a one-letter initialism denoting the 9th letter of the alphabet, because it would look like the 3rd singular present tense form of BE; so you would write "I remember there were two i's in his name." But that is the exception. Most plurals written with apostrophes are merely a consequence of — there's no gentle way to say it — illiteracy.)

Just ask, OK, guys?

Posted by Geoffrey K. Pullum at 06:20 PM

Is "singular they" verbally and plenarily inspired of God?

Back in October of 2004, Geoff Pullum discussed a wall inscription where they had a singular definite antecedent: "This person is not ignorant. They are a prophet." Geoff wrote that

The pronoun form they is anaphorically linked in the discourse to this person. Such use of forms of they with singular antecedents is attested in English over hundreds of years, in writers as significant as Chaucer, Shakespeare, Milton, Austen, and Wilde. The people (like the perennially clueless Strunk and White) who assert that such usage is "wrong" simply haven't done their literary homework and don't deserve our attention.

But Geoff left out the single most compelling example.

In the King James Version, Deuteronomy 17:

2: If there be found among you, within any of thy gates which the LORD thy God giveth thee, man or woman, that hath wrought wickedness in the sight of the LORD thy God, in transgressing his covenant,
3: And hath gone and served other gods, and worshipped them, either the sun, or moon, or any of the host of heaven, which I have not commanded;
4: And it be told thee, and thou hast heard of it, and inquired diligently, and, behold, it be true, and the thing certain, that such abomination is wrought in Israel:
5: Then shalt thou bring forth that man or that woman, which have committed that wicked thing, unto thy gates, even that man or that woman, and shalt stone them with stones, till they die.
[emphasis added]

Many people believe that the King James Version is "God's preserved word in English", or "verbally and plenarily inspired of God", or some similar formulation. Thus the Doctrinal Statement of the Emmanuel Baptist Theological Seminary says: "We believe that the Hebrew and Greek manuscripts which underlie the King James Version (the Masoretic text of the Old Testament and the Textus Receptus of the New Testament) are the preserved words of God. Furthermore, we believe that the King James Version of the Bible is God's preserved word in English and therefore, it shall be the official and only translation of the Holy Scriptures used by this Church and all of its ministries." And the Sword of the Lord's listing of What We Believe asserts that "We believe the Bible, the Scriptures of the Old Testament and the New Testament, preserved for us in the Masoretic text (Old Testament) Textus Receptus (New Testament) and in the King James Bible, is verbally and plenarily inspired of God. It is the inspired, inerrant, infallible, and altogether authentic, accurate and authoritative Word of God, therefore the supreme and final authority in all things." (See this compilation for more examples.)

As Ben Zimmer recently observed, singular they "is old hat and hardly worth remarking on". But in defiance of authoritative scholarship and everyday usage, some pockets of stubborn prescriptivist resistance remain, and it's a comfort to know that we can count on the Sword of the Lord to help mop them up.

[Hat tip: Steg]

[Update -- Coby Lubliner writes:

The use of singular "they" in the KJV is the Jews' fault, since the original has "u-s'qaltam [and thou shalt stone them] ... va-metu [and they shall die]". More modern versions try to weasel around it. The New International Version has "and stone that person to death"; the New Living Translation uses (horrors!) the passive: "then that man or woman must be taken to the gates of the town and stoned to death"; the English Standard Version: "you shall stone that man or woman to death with stones"; the New King James Version: "shall stone to death that man or woman with stones"; the New Life Version: "and kill the man or woman with stones"; and so on. (I used Where does that "stone to death with stones" come from, anyway?

Well, it's the preserved words, like they said. Deal with it, ye prescriptivists.]

[Update #2 -- Wayne Leman has more at Better Bibles Blog.]

[Update #3 -- Adam Booth writes:

In your update, Coby Lubliner asked, "Where does that 'stone to death with stones' come from, anyway?" The answer is: Greek. I'm pretty sure (that scholars have settled that) the KJV translation of the Hebrew scriptures was done not from the Hebrew text at all, but mainly from the Greek LXX (with some support from Jerome's Latin and earlier English translations). The LXX rendering of the verse can be found here:

and has, literally, "you (sing.) stone-throw them with stones (dat.)" (and a plural third person "they shall die").

The plurals here are a little odd; the noun phrase which is meant to serve as subject for "they shall die" is an eta-coordinated noun-phrase, where eta coordinates "the man that" and "the woman that" (in a word-for-word translation, which is a perfectly natural way of saying "that man" and "that woman" in Greek). Eta can function as a coordinator (like "or") or as a coordination marker (like "either") in Greek. When singular NPs are coordinated by eta to form a subject, the VP normally takes a singular verb (cf. Matt 18:8 for an example, which shows this and eta playing both its roles).

So, I don't think it's quite clear from this verse alone that the KJV translators were all that into singular 'they' themselves -- they simply translated Greek plurals into English plurals, even when those Greek plurals are a little non-standard.

The Wikipedia article on the King James Version claims that "The Old Testament of the King James Version is translated from the Masoretic Hebrew Text", though I have no idea whether that's true. In this particular case, both the Hebrew and the Greek versions seem to have the same plural pronouns as the KJV English does.

Whatever their textual sources, I prefer to think that Lancelot Andrewes and the First Westminster Company were, if not divinely inspired, at least moved by the spirit of the English language.]

[Update #4 -- more here]

Posted by Mark Liberman at 06:10 PM

Term for shifting plural s to the end of initialisms and acronyms?

One of Wade Dowdell's friends asked him, and he asked Caroline Henton, and Caroline asked us. I don't know the answer, so I'm asking the readers of Language Log, in the hopes that someone out there can tell me, so that I can tell ... well, you get the idea.

A friend has asked me the following question and I don't know the answer. Weapons of Mass Destruction is abbreviated WMDs, not WsMD. My friend wants to know if there is a name for this kind of transference of the plural ending to the end of the abbreviation, and if there is, what it is.

My only substantive contribution here is to observe that we could avoid the whole problem, in this case, by joining Ali G and Pat Buchanan in rendering the abbreviation for "weapons of mass destruction" as BLTs.

Posted by Mark Liberman at 05:22 PM

Translating leadership, creating verbiage

Every now and then you see some apparently simple text in your own language that you realize you simply do not understand. Gerbig Management has the slogan "Translating thought leadership...creating business results", six words with no plausible parse that I can detect ("translating thought"? "thought leadership"?); and the four things the company does are described as: "Structuring working partnerships that team with client resources, operations, and infrastructures"; "Assessing and approaching variegated causes and root concerns with targeted analysis solutions"; "Developing and managing complex information technology lifecycles"; and "Assisting client partners by managing risk and navigating an increasingly complex regulatory environment." Yes, one wants to say, but what do you do when you get to the office? It all makes me realize that the Language Log corporation just doesn't have enough sloganware, mission-statementry, or marketspeak to belong to the modern business world. We just sit around writing stuff about language (or in young Bakovic's case, playing World of Warcraft when he thinks I am not watching his screen through the crack of his office door). It is sooo yesterday. We should be structuring linguistic resources to team with client communicational information infrastructures, assessing root neuropharmopsychosociolinguistic concerns for targeted linguistic lifecycle technologies, managing linguistic risk quotients to assist partners in navigating increasingly complex morpholexicosyntactic environments... That sort of thing. I think I just made my head hurt.

By the way, "thought leadership" is not a new concept; it's just new (indeed, non-existent) in our organization. The phrase gets 7.42 million Google hits. We are so far behind the business-speak curve here that it's pathetic. We need... thought leadership. If I may re-cycle some wise words that Scott Adams once quoted in one of the Dilbert books, we need a change that will allow us to better leverage our talent base in an area where developmental roles are under way and will strategically focus us toward the upcoming Business System transition where Systems literacy and accuracy will be essential to maintain and to further improve service levels to our customer base going forward. That's what I think, anyway.

Posted by Geoffrey K. Pullum at 02:13 PM

Earnest snowclone of the month

Although the Eskimos are horseless, alert readers will sense them hovering in the background as Lawrence Scanlan reviews J. Edward Chamberlin's How the Horse Has Shaped Civilizations and Margaret E. Derry's Horses in Society: A Story of Animal Breeding and Marketing Culture, 1800-1920. Scanlan's lead:

The Blackfoot of the Plains had more than 100 words for the colours of horses, the Kazaks of central Asia 62 for bay shades alone. These are not just numerical curiosities from old horse societies, but signs of a human watchfulness and a deep connectedness to the natural world that was the norm, and is now rare. ["The horses we rode in on", Globe and Mail, posted online 8/19/2006]

If you know enough Blackfoot or Kazak to evaluate these claims, or can find a relevant reference, let me know. [Hat tip: Patrick King.]

[Update -- Alexander Jabbari writes:

As I'm sure you know, the "Eskimo words for snow" issue stems partially from Inuit's richly agglutinative morphology (and partially from bad journalism). Kazak is similar in that it is also very agglutinative and forms new nouns by adding a number of affixes to a root noun. It's thus very plausible that it has 62 "words" for different shades, but these are no more distinct words than "dark brown" is a distinct word from "light brown" in English; they're merely compounded into "darkbrown" and "lightbrown".

While I'm not familiar with Blackfoot to any degree, it is also an agglutinating language (according to Wikipedia) so I suspect it's another case of a writer's inability to understand polysynthesis.

So the 62 Kazak words for "bay shades" are like "light bay" and "dark bay" and "pretty dark bay" and "somewhat darkish bay but not really that dark, actually", and so forth? On that basis, surely there are going to be many more than 62. Can anyone supply some information about the specific vocabulary that actually might be used for horse colors in these languages? ]

[Update #2 -- Bridget Samuels writes:

Finally, LL reports on a snowclone that I know something about! Well, that is to say, I don't know anything about Kazak or Blackfoot, but the claims seem plausible even for English. I have a big book about the genetics of horse color (aptly called Horse Color) stashed away somewhere in my folks' house, so when I go visit in a couple of weeks I'll be able to give you a more accurate report. But, having memorized much of the book when I was a [more] horse-crazed kid, I have a feeling there are more than 100 colors. It does depend how loosely you're going to define "color," though. Do you want to count all the different appaloosa and pinto coat patterns? Do you want to count "with flaxen mane & tail" as a separate color variant for each relevant coat color, or the various white markings on the legs and face, each of which has a specific name? If so, I can't imagine that there are fewer than 100 distinct "horse colors" for which we have names in English. In my experience, quite a bit of the terminology is only used these days by people who are into specific breeds prized for their unusual color patterns, if at all, but it does exist in theory. (I remember one time in pony club when I was five or six, I'd just read Horse Color for the first time, and the trainer asked me to name the color of a specific pony. I contemplated the stout little appy for a moment and spouted out a long stream of words that the woman had apparently never heard before. She didn't quite know what to say.)]


[Update #3 -- Peter Austin writes:

David Harrison has written a nice account of Tuvan yak naming conventions in a paper called 'Ethnographically informed language documentation' published in "Language Documentation and Description Volume 3" (2005) available from

There is a hierarchy of terminology for head markings, body patterns and body colours that is more suggestive of complexity than mere lists of lexical items for colour.

The Sasaks of Lombok, Indonesia, have a complex system of describing water buffalo that includes both body colour and patterns, as well as shape and directions of horns.]


[Update #4 -- Gary Daine sent in a link to a page that "shows many of the specific terms used for the colouring of 'toros bravos', with illustrations". So we have English terminology for horse colors, and Tuvan terminology for yak colors, and Spanish words for bull colors -- but we're still missing the actual Blackfoot and Kazak horse color vocabulary. ]

Posted by Mark Liberman at 12:58 PM

Piling on "pluton"

As we wait nervously for the General Assembly of the International Astronomical Union to decree how many planets are in the solar system, we are also kept on the edge of our seats about the status of the term pluton. The IAU is considering the creation of the category pluton to cover any planet beyond Neptune, taking more than 200 Earth years to complete its orbit around the Sun. (Or as phrased more pungently by Geoff Pullum, a "planet that is really nothing more than a God-forsaken frozen slushball way out beyond Neptune drifting around in the Kuiper belt and damn lucky to be called a planet at all.")

Pluton has been kicking around astronomical circles since at least 1991, when it showed up in a Science News cover story ("Plutos galore: ice dwarfs may dominate the solar system's planetary population," 9/21/91). Though the article generally refers to Pluto-like objects as Plutos, one quoted source, George W. Wetherill of the Carnegie Institution, called them Plutons instead, using criteria a bit looser than the IAU's:

There's really no adequate theory for the formation of Uranus and Neptune, and I really can't see how one can speak too intelligently about Triton and Pluto and these 1,000 'Plutons' without some framework for the whole origin of that part of the solar system.

(Triton, Neptune's largest moon, would not make the grade for the IAU definition of pluton, nor would the thousand or so ice dwarfs discussed in the article. So far the only transneptunian qualifiers are Pluto; Charon, formerly considered a moon of Pluto; and 2003 UB313, soon to be officially named but now traveling under the moniker Xena.)

There are scattered uses of pluton or Pluton in the relevant sense throughout the 1990s. For instance, the astronomer Tom Burns wrote a column for the Columbus Dispatch on June 8, 1997 in which he stated his preference for the label Pluton. And it has entered science fiction as well: according to a 1996 discussion on the Usenet newsgroup sci.astro, Fred Pohl used pluton as a generic term for Pluto-type objects in his 1992 novel Mining the Oort. (The word has had other sci-fi uses, though. Michael Quinion observes that Robert Heinlein used it as the name for a plutonium-based Earth currency in his novellas Gulf [1949] and Tunnel in the Sky [1955].)

One scientific group is not too pleased with the proposed designation: geologists, who already use the term pluton to refer to igneous rock intrusions. On SciAm Observations, the blog of Scientific American, George Musser considers the "appropriation" of pluton to be "a remarkable violation of professional courtesy." In another post, he continues his gripe:

It's bad enough when popular culture pilfers scientific terms, such as "quantum leap", "epicenter", and "light-year", and twists their meaning. But it's all the more galling when one scientific discipline appropriates the terminology of another.

Musser reports that University of North Carolina-Chapel Hill geologist Allen Glazner has petitioned the Geological Society of America and the American Geophysical Union to draft a cease-and-desist order of sorts, telling the IAU to lay off pluton. Glazner has asked his colleagues to join him in "trying to get the IAU to come up with a new word, instead of hijacking our perfectly good one and causing endless confusion."

Is polysemy among scientific disciplines really such a terrible offense? Glazner wonders, "What happens when we find a pluton on a pluton?" I think context should be more than enough to obviate the "endless confusion" that Glazner foresees. Do geologists also worry about the fact that veins can refer to both rock fractures and blood vessels, or that a delta can be either a sedimentary deposit or a finite mathematical increment?

A more damaging objection about pluton, to my mind, is that it's a bit too similar to the root form Pluto, especially when you consider that the French term for Pluto is... Pluton. Similarly, Spanish uses Plutón and Italian has Plutone. How would we be able to say in such languages, "Pluto is a pluton"? Linguablogger Gheuf suggests that the plutons would be better served with some other suffix:

But if the Plutons are given a suffix, what suffix should it be? I am partial to the diminutives, but it would be hard to choose between the contending charms of Plutitos, Plütchen, and Plutoncini; the last has a certain gastronomic appeal. Or we could give up the Roman gods, and name them "Planettes". But to please the sober tastes of the Scientists, I suggest "Plutonoids" (Pluto-like), which has a learned air, and is not too badly formed either.

I do like plutitos, plütchen, and plutoncini, but there's a problem with diminutivizing Pluto: astronomers have already done that with plutinos, a term they use to refer to various small objects in the inner part of the Kuiper Belt up to and including Pluto. (If I understand the distinction correctly, Pluto itself is the only plutino that would also fit the IAU definition of pluton.) Plutonoid sounds good though, and I wonder why the IAU hasn't considered it (or the similar plutoid). Perhaps it was deemed inappropriate because the -oid suffix implies that such objects only resemble Pluto, which might make it hard to justify including Pluto itself in the same category.

Whatever the IAU's decision ends up being, we'll probably all learn to live with it, regardless of current terminological grumbling. And we can always fall back on "God-forsaken frozen slushballs" (GFFSBs for short).

[Update, 8/23/06: It looks like the astronomers are backing off pluton. Details here.]

Posted by Benjamin Zimmer at 12:11 PM

August 20, 2006

New planet mnemonic: Language Log is there for you

If the change to the definition of the term planet goes through at the International Astronomical Union in Prague this week, an old mnemonic phrase won't work any more: My Very Excellent Mother Just Sent Us Nine Pizzas will not give a clue to the names of the planets in their correct order proceeding outward from the sun (originally Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune, and Pluto, but I suspect that after August 25 you can soon kiss that list goodbye). Fixing up the old memory aid to allow for the addition of Ceres (between Mars and Jupiter), Charon (which circles around Pluto, and we'll list the two in alphabetical order), and Xena (way outside Pluto's orbit) is clearly a job for Language Log (though we're a little bit behind the leading edge of research here: the linguistically unqualified people at Scientific American got onto this just before we did, and there are already about fifty suggestions posted; God, I hate to work alongside amateurs...). So don't worry: we're on the case. Just learn this: My Very Excellent Mother Could Just Send Us Nine Cheerleaders Playing Xylophones. Piece of cake.
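For readers who would rather check a candidate mnemonic mechanically than by squinting, here is a minimal Python sketch. It assumes the twelve-body ordering described above (Ceres after Mars, Charon listed before Pluto, and the unofficial nickname Xena last); the helper names `initials` and `matches` are invented for illustration.

```python
# Proposed outward ordering from the post; "Xena" is only an unofficial nickname.
BODIES = ["Mercury", "Venus", "Earth", "Mars", "Ceres", "Jupiter",
          "Saturn", "Uranus", "Neptune", "Charon", "Pluto", "Xena"]

MNEMONIC = ("My Very Excellent Mother Could Just Send Us "
            "Nine Cheerleaders Playing Xylophones")


def initials(words):
    """Return the upper-cased first letter of each word."""
    return [w[0].upper() for w in words]


def matches(mnemonic, bodies):
    """True when the mnemonic's initials spell out the bodies' initials in order."""
    return initials(mnemonic.split()) == initials(bodies)


print(matches(MNEMONIC, BODIES))
```

If Xena gets a different official name, only the last entry of `BODIES` and the last word of the mnemonic need to change together.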

Actually, "Xena" is an unofficial nickname at the moment and will almost certainly not be approved, so the cheerleaders will be playing something other than xylophones. Language Log will keep you informed.

Those who recall other mnemonics (like "Many Very Early Men Just Sat Up Nights Playing") will be handled separately; please don't call us, just wait to be contacted by the Mnemonics Division of Language Log.

Posted by Geoffrey K. Pullum at 11:57 PM

Crocodiles are fish in Australia

an Australian crocodile

The press is making much of the Australian government's redefinition of crocodiles as fish in the Agriculture, Fisheries and Forestry Legislation Amendment (Export Control and Quarantine) Bill 2006, as if this were on a par with the stupidity of the attempt by Indiana legislators in 1897 to redefine π as 3.2, 3.232, 3.236, or 9.24 (all in the same bill!), or the nefarious silliness of the Reagan Administration's redefinition of ketchup as a vegetable in school lunches. As the digest of the bill explains, the purpose of this is to ensure that the law regulates the export not only of fish in the usual sense but of prawns, clams, mussels, and crocodiles.

The relevant portion of the bill amends the law to include the definition:

fish means aquatic vertebrates and aquatic invertebrates but excludes mammals and birds

The purpose is perfectly sensible, but the mechanism seems perverse. Why not just define the scope of the law as "aquatic invertebrates and aquatic vertebrates excluding mammals and birds" and thereafter refer where necessary to "the regulated organisms" or whatever? The way legislators write and amend legislation seems to me very similar to the way inferior computer programmers develop software.

Posted by Bill Poser at 06:38 PM

Translingual ironic snowclone of the day

Another snowcanard, submitted by Rob Malouf.

The title is "Fokke & Sukke feel just like Eskimos", and they're saying more or less: "As experienced Lowlands attendees, we can distinguish between at least... ...twenty different kinds of puke". (Lowlands is a famously raunchy music festival.)

(More on Fokke & Sukke here.)

Posted by Mark Liberman at 01:18 PM

Bad headline pun of the week

"Flipper only a deep C student" -- Newsday/Scripps Howard.

Posted by Mark Liberman at 01:17 PM

Quantifier domain restriction and gel-filled bras

As Mark Liberman noted, security expert Bruce Schneier had some fun with this line from the Transportation Security Administration's byzantine list of prohibited carry-on items:

We encourage everyone to pack gel-filled bras in their checked baggage.

Schneier's riposte:

Everyone? Do I have to as well? Where should I go buy one?

But he wasn't the only online wag to crack this joke. Compare these two other iterations:

Everyone? First order of business, go buy a gel-filled bra. (Dvorak Uncensored)

Everyone means everyone. Even you. (Digg)

Nobody ridiculed the TSA for matching up an ostensibly singular quantifier (everyone) with an ostensibly plural personal pronoun (their). That sort of thing is old hat and hardly worth remarking upon (though Lord knows that doesn't stop us from remarking upon it, e.g., here, here, here, here, here, here, here, here...). The humor, such as it is, instead hinges on a supposed ambiguity in determining the domain of the quantifier everyone.

For the bloggers' joke to work, we have to toss out our Gricean maxims and apply an extremely uncooperative reading to the TSA's injunction. In a more explicit metalanguage, we could render the obviously intended proposition as:

For every person x such that x is traveling by air in the U.S. with gel-filled bras y, we encourage x to pack y in x's checked baggage.

The willfully obtuse interpretation given by the bloggers would go something like this:

For every person x such that x is traveling by air in the U.S., we encourage x to pack gel-filled bras y in x's checked baggage.
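The difference between the two paraphrases is only where the restriction on x sits, which can be sketched as two filters over the same set of people. The following Python is an illustration, not a piece of formal semantics: the `Traveler` class and the three passengers are hypothetical, invented purely for the example.

```python
from dataclasses import dataclass


@dataclass
class Traveler:
    name: str
    has_gel_filled_bra: bool


# Hypothetical passengers, invented purely for illustration.
travelers = [
    Traveler("A", True),
    Traveler("B", False),
    Traveler("C", False),
]


def intended_reading(people):
    """Domain of 'everyone' restricted to travelers who have the item."""
    return [p.name for p in people if p.has_gel_filled_bra]


def obtuse_reading(people):
    """Domain of 'everyone' is every traveler, item or no item."""
    return [p.name for p in people]


print(intended_reading(travelers))
print(obtuse_reading(travelers))
```

On the intended reading only traveler A is being encouraged to do anything; on the obtuse reading B and C are addressed as well, which is where the joke comes from.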

How do we know how to restrict the domain of a quantifier like everyone? We rely on a variety of contextual clues. In this case, since the sentence appears in a list of instructions for passengers, we know that everyone must at least be limited to the set of people traveling via U.S. airports, as even the disingenuous reading recognizes. Elsewhere in the TSA text, those domestic air travelers are given instructions through a series of law-like statements addressed to a generic you, e.g., "You are permitted to bring solid cosmetics and personal hygiene items as such lipstick, lip balm and similar solids," or this bizarre bit of advice: "We also ask that you follow the guidelines above and try not to over-think these guidelines." (I tried not to over-think but failed. Sorry, TSA -- over-thinking is de rigueur here at Language Log Plaza.)

In the ridiculed sentence, the TSA text shifts to a different type of indirect speech act. Instead of second-person address directed to a generic you, an "encouragement" is given to a generic everyone. But only a subset of passengers was meant to be selected by the use of everyone: those traveling with gel-filled bras, who must then decide where to pack them. The writer of the text could have made this shift of address more explicit by specifying the object of encouragement as "everyone traveling with gel-filled bras," but then that pesky coreference of everyone with their might have gotten in the way:

We encourage [everyone x] traveling with [gel-filled bras y] to pack [them y] in [their x] checked baggage.

The possible confusion between the two different coreferents of them and their would have been spared if the writer had chosen a singular pronoun her to match the antecedent everyone: "pack them in her checked baggage." But then the sudden change of addressee from all passengers to only female ones might have sounded a bit unusual. Of course, the non-gender-specific alternative his or her is no help here in avoiding the gender-bending ridicule of Schneier et al. In any case, I'm pretty sure the TSA wants to insulate itself from the tricky question of whether only women might be packing gel-filled bras. (This might explain why singular their was used in the first place.) It looks like the whole sentence would need to be recast in order to bypass all of these pitfalls.

The wisecrack at the TSA's expense actually strikes at the heart of a long-standing conundrum in the philosophy of language: how do we determine the intended domain of everyone and similar quantifiers? In the Anglo-American philosophical tradition, the question has been posed as a challenge to Bertrand Russell's theory of descriptions. Russell's theory is not well-equipped to deal with incomplete descriptions where quantifier domains must be inferred from pragmatic context. For more on the subject, see the April/June 2000 issue of Mind & Language, featuring "On Quantifier Domain Restriction" by Jason Stanley & Zoltán Gendler Szabó, with commentaries by Kent Bach and Stephen Neale. Some of the same ground is covered in a recent paper by Scott Soames, as discussed by Brit Brogaard on her blog Lemmings.

Posted by Benjamin Zimmer at 12:10 PM

The Planet Tlahuizcalpantecuhtli

Astronomers have recently been debating the definition of a planet and whether Pluto really is one, with some astronomers reluctant to adopt a definition that would increase the number of planets in the solar system. User Friendly suggests that they will adopt a linguistic tactic for keeping down the number of planets recognized.

Posted by Bill Poser at 11:34 AM

Another step towards gender equality

The Transportation Security Administration has advised that

We encourage everyone to pack gel-filled bras in their checked baggage.

Bruce Schneier asks

Everyone? Do I have to as well? Where should I go buy one?

This reminds me of a point about intonation and meaning. About 35 years ago, I heard Martin Kay describe a sign on the London Underground that read (as he performed it) "DOGS must be carried". (Here's a more modern version of the sign, courtesy of Annie Mole's London Underground Tube Blog.) I've never figured out a really convincing explanation for why stressing "dogs" seems to encourage the interpretation "everyone must carry a dog", while stressing "carried" encourages the interpretation "if you have a dog, you must carry it".

Posted by Mark Liberman at 09:37 AM

August 19, 2006

Of rats and (wo)men

I've just read a couple of recent books on the science of sex differences: Leonard Sax's "Why Gender Matters: What Parents and Teachers Need to Know about the Emerging Science of Sex Differences", and Louann Brizendine's "The Female Brain". As I explained in previous posts, I was disappointed and even shocked at these authors' cavalier treatment of issues in speech, language and communication ("Are men emotional children?"; "Neuroscience in the service of sexual stereotypes"; "Sex and speaking rate"; "Sex-linked lexical budgets"). And as I've read more, I've learned that at least some of the core neuroscience in both books is equally shaky. More precisely, both authors erect some robust and convincing rhetorical structures on some very weak scientific foundations.

That's not to say that every piece of science that these books present is faulty -- I've only checked a few things. But what I've found, in the cases where I've checked, suggests that you'd be wise not to accept anything from such sources without looking into it carefully for yourself. There's a spectacular example in chapter 2 of Sax's book, under the heading "The Eye of the Beholder" (pp. 18-29).

Sax starts by telling us a bit about the retina -- rods and cones, P and M ganglion cells -- and gives this functional account of the ganglion cells:

P cells and M cells have very different jobs. M cells are ... essentially simple motion detectors. ... You can think of the M cells as being wired to answer the questions "Where is it now and where is it going?" ... You can think of P cells as answering the question "What is it?" P cells compile information about texture and color; M cells compile information about movement and direction.

The P cells send information via their own special division of the thalamus to a particular region of the cerebral cortex that appears to be specialized for analysis of texture and color. The M cells send their information via a separate pathway to a different region of the cerebral cortex, a region that is specialized for analysis of spatial relationships and object motion. And guess what? Every step in each pathway, from the retina to the cerebral cortex, is different in females and males.23
[emphasis original]

Cool. You can see where he's going with this: women's eyes and brains are specialized for a certain set of tasks, men's eyes and brains for a very different set of tasks. But if you follow his end-note 23 to its reference, you find: Tamas Horvath and K.C. Wikler, "Aromatase in Developing Sensory Systems of the Rat Brain," Journal of Neuroendocrinology, 11:77-84, 1999. So we might have expressed that italicized phrase differently: "Every step in each pathway, from the retina to the cerebral cortex, is different in female rats and male rats." Except, we're all mammals, right? with the same basic hormones and hormone responses? Sort of. Hold that thought, because now the emerging sexual science gets really hot:

The real surprises have come from microscopic analyses of the eye performed in the past five years. Using recently developed techniques, scientists have found that the human retina is full of receptors for sex hormones.24 Anatomist Edwin Lephart and his associates have found that the male retina is substantially thicker than the female retina.25 That's because the male retina has mostly the larger, thicker M cells while the female retina has predominantly the smaller, thinner P ganglion cells.

We're not talking about small differences between the sexes, with lots of overlap. We're talking about large differences between the sexes, with no overlap at all. Every male animal had a thicker retina than any female retina, due to the males having more M cells (see the accompanying graph).

That's some serious differentiation, alright, and Sax shows us the graph to prove it:

(This is not exactly Sax's figure, since I took it from his source rather than from his book, but it's the same data.)

Pretty impressive, right? I was completely blown away when I saw it, I'll confess. The only thing is, if we follow up that end-note 25, we find: David Salyer, Trent D. Lund, Donovan E. Fleming, Edwin Lephart, and Tamas L. Horvath, "Sexual dimorphism and aromatase in the rat retina", Developmental Brain Research, 126:131-136, 2001.

Rats, foiled again! (Or maybe that should be "Foiled, rats again!")

[Just for completeness, the actual figure from the Salyer et al. paper is shown and explained at the end of this post. You'll also learn there what "Oil Females" means.]

But Sax is not just a journalist or a politician -- he has an M.D. and a Ph.D, so surely he wouldn't use this figure as part of an argument about how boys and girls see the world in completely different ways, if humans weren't the same as rats in this respect. After all, humans have the same sex hormones that rats do, and the same sort of eyes and overall visual neurology, and scientists use rats all the time as model organisms for learning about human physiology. Still, it would be nice to see some data about retinal thickness in our own species, so I hit up Google Scholar with "retina thickness male female", and in about five minutes I had an answer.

Yoshikatsu Wakitani et al., "Macular thickness measurements in healthy subjects with different axial lengths using optical coherence tomography", Retina 23(2) April 2003.

Purpose: To evaluate the retinal thickness of the macula in healthy subjects with different axial lengths.
Methods: Included were 203 healthy subjects (116 males and 87 females). The axial length of the eyes ranged from 22.68 to 30.22 mm. Four optical coherence tomograms were obtained in a radial spoke pattern centered on the central fovea. Retinal thickness was calculated from the inner and outer retinal boundaries. The average retinal thickness in three circular areas surrounding the central fovea (350, 1,850, and 2,850 μm in diameter) was determined.

[The "macula" is an area of the retina around the fovea, where vision is most acute.]

And indeed, Wakitani et al. did find a significant mean difference in retinal thickness between human males and females, as this table shows:

But wait a minute -- we're supposed to be seeing completely separable distributions for males and females here, with "no overlap at all", so that "every male ... had a thicker retina than any female". However, in this table, the differences between the male and female means are only about a third to a half of a standard deviation: in area A it's a 9 μm difference vs. a 20-21 μm standard deviation; in area B it's a 7 μm difference vs. a 16-18 μm s.d.; in area C it's a 6 μm difference vs. a 13-15 μm s.d. (Note that these differences are also small relative to the means -- about a 4% relative difference, which is significantly less than other mean differences in linear dimensions between men and women, such as the 8% difference in height; and much less than mean differences in measurements where there is really a significant amount of sexual dimorphism in humans, such as the 60% difference between men and women in vocal cord length.)
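To put a rough number on how far this is from "no overlap at all", here is a back-of-the-envelope Python sketch using only the differences and standard deviations quoted above. Two things in it are my assumptions rather than anything in Wakitani et al.: it takes the midpoint of each quoted s.d. range as a common standard deviation, and it treats both distributions as normal curves with that same spread.

```python
import math

# (mean difference, quoted s.d. range) in micrometres, from the text above.
AREAS = {
    "A": (9.0, (20.0, 21.0)),
    "B": (7.0, (16.0, 18.0)),
    "C": (6.0, (13.0, 15.0)),
}


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def effect_size(diff, sd_range):
    """Mean difference in s.d. units; the midpoint of the range is an assumption."""
    sd = sum(sd_range) / 2.0
    return diff / sd


def overlap(d):
    """Shared area of two equal-variance normal curves whose means differ by d s.d."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


for area, (diff, sd_range) in AREAS.items():
    d = effect_size(diff, sd_range)
    print(f"area {area}: d = {d:.2f}, overlap = {overlap(d):.0%}")
```

Under those assumptions, the male and female curves share more than 80 percent of their area in all three regions, which is about as far from non-overlapping as a statistically significant difference can get.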

If we want a more visceral sense of what this means, Wakitani et al. provide some scatter plots:

Fig. 4. Relationship between axial length (mm) and retinal thickness (µm). A, Area A (350 µm in diameter, including the foveola). B, Area B (1,850 µm in diameter, including the fovea). C, Area C (2,850 µm in diameter, including the parafovea). Linear regression analysis that included both male and female data indicated no statistically significant change in retinal thickness with increasing axial length of the eye (P = 0.10 for area A, P = 0.39 for area B, and P = 0.12 for area C).
From: WAKITANI: Retina, Volume 23(2).April 2003.177-182

Not exactly "no overlap at all", would you say?

Sax connects these "non-overlapping" retinal differences with lots of other facts about sex differences (or perhaps I should say "facts", subject to tracking down the rest of his references): women "interpret facial expressions better"; "girls draw nouns, boys draw verbs"; "females and males use fundamentally different strategies" for "geometry and navigation"; at ages when "little children ... had not a clue which gender they belonged to", "nine-month-old boys strongly preferred 'boy toys' such as balls, trains, and cars", while "nine-month-old girls preferred 'girl toys' such as dolls and baby carriages". Putting the retinal differences together with other "brain differences" (including a version of the hearing differences that Louann Brizendine also features), Sax concludes that

Girls and boys play differently. They learn differently. They fight differently. They see the world differently. They hear differently.

And he draws the lesson that girls and boys should be educated separately.

For all I know, his conclusions might be correct. There are certainly plenty of differences between boys and girls. But so far, every aspect of his book that I've looked into seems to be riddled with the sort of overinterpretation and flat-out misrepresentation that was on display in his treatment of emotional maturation and self-expression, and is displayed again in this treatment of retinal thickness.

[As promised, here's the full figure from Salyer et al. "Oil" means the control group, which was injected with plain peanut oil (since peanut oil was the carrier for the other injections); "Flu" means the group that was injected with flutamide, which blocks testosterone receptors; and "Test" means the group that was injected with testosterone. The results clearly show that the retinal changes in these rats are (mostly) caused by testosterone.]

Fig. 1. The influence of prenatal administration of oil, flutamide (Flu), and testosterone (Test) on retinal thickness expressed as the mean (±S.E.M.) collected at postnatal day 30. Sample size for each group is indicated at the base of each bar. *Significantly larger than oil-treated females or flutamide-treated females (P<0.05).

I'm not completely sure why the ranges of retinal thicknesses in the two experiments are so different (30-38mm for the rats; 162-228μm for the humans). The species are different, of course, but the measurement techniques were also very different, and I think this probably explains the otherwise preposterously thick retinas of those rats -- 38mm might well be thicker than the rats' whole heads! In the Salyer et al. study, the technique was:

Extracted eyes were embedded in paraffin and sectioned at 25 μm using an American Optical 820 rotary microtome. Five sections from the center of each eye were mounted onto slides and stained with hematoxylin and eosin. Stained retinal sections were projected to a screen a standard distance of 227 cm from a Ken-O-Vision microprojector using a 10 mm NA .25 lens and measured with calipers (precision to ±0.1 mm).

while Wakitani et al. used "optical coherence tomograms" of living subjects.

[OCT] is a new imaging technology with ophthalmologic applications based on the principles of laser interferometry. Its high depth resolution (10 μm) makes it possible to measure retinal thickness more accurately. Moreover, measurements in the axial direction do not depend on the axial length or refraction. Thus, thickness values obtained by OCT are not affected by these parameters. We think that OCT is to date the best technique for measuring retinal thickness in eyes with various axial lengths.

Thus the Wakitani measurements are probably really retinal thicknesses, while the Salyer (rat) measurements are the results of projecting images on a screen and measuring distances in the projected image, without attempting to calculate back to the real-world values.]

[A discussion of Sax on sex differences in hearing is here, and more on Sax on rat vision is here.]

Posted by Mark Liberman at 10:13 PM

"Minority are" in the wild

Following up on my informal research on majority and minority: the other day, a friend of mine uttered the following sentence:

There's a small minority within this group that are primarily Spanish speakers.

Note that are cannot be replaced with is here, unless it's changed to something like There's a small minority within this group that is primarily Spanish-speaking. Most instructive to those who insist that a minority must be singular no matter what, of course you couldn't say There's a small minority within this group that is primarily a Spanish speaker.

[ Comments? ]

Note: Please don't comment on what you learned in school, and don't insult other commenters for freely expressing their English-speaking intuitions. This is Language Log, not Richard Lederer.

Posted by Eric Bakovic at 12:18 PM

More on Macaca-gate

The story of George Allen's odd epithet for an Indian-American opponent just keeps getting odder, at least linguistically. According to a post at National Journal's Hotline:

Three Virginia Republicans confirmed to the Hotline that several Allen campaign aides and advisers are telling allies that the word was a made-up, off-the-cuff neologism that these aides occasionally used to refer to tracker S.R. Sidarth well before last Saturday's videotaped encounter.

According to two Republicans who heard the word used, "macaca" was a mash-up of "Mohawk," referring to Sidarth's distinctive hair, and "caca," Spanish slang for excrement, or "shit."

Said one Republican close to the campaign: "In other words, he was a shit-head, an annoyance." Allen, according to Republicans, heard members of his traveling entourage and Virginia Republicans use the phrase and picked it up.

Note how mash-up has come into general use as a term for "blend" or "combination". I wonder whether this word was supplied by the two bloggers who signed the post (Jonathan Martin and Marc Ambinder) or the "two Republicans" that they quote.

The Democrats seem to be lagging in lexicographic creativity -- at least there aren't any recent examples of novel name-calling from Democratic politicians. However, on the front lines of sentence structure, the same Hotline post quotes Kristian Denny Todd, James Webb's communications director, as follows:

I don't know what's worse; calling this innocent 20-year-old a "shit head" or a racist slur that was debatable that it wasn't.

Exercise for the reader: explicate the syntax and semantics of that last noun phrase.

[Update -- email from Zeno:

I grew up with the words macaca and macaco. Portuguese-speaking households use them to upbraid youngsters when they're misbehaving (or, more to the point, acting like monkeys). When used as a disciplinary epithet, it's not a particularly big deal. However, I never heard an adult use it on another adult. It would be extremely demeaning and a grave insult. In particular, it would be grievously offensive if used by a light-skinned person against a dark-skinned person, because the racist connotation is unavoidable.

Perhaps the taint of racism in that usage was picked up in the U.S. (where I was born into an Azorean-Portuguese community that was immersed in the dominant Anglo culture), since simian metaphors for black people are considered deeply racist here, but I don't think so. The revelation that macaca is a racist slur in French-speaking north Africa and elsewhere in the vicinity of the Mediterranean suggests that this aspect of the insult is deeply rooted. Since Sen. Allen has a direct family connection via his mother to French Tunisia, no one needs to construct an unwieldy excuse about campaign neologisms to explain how macaca came into his vocabulary. It's an ex-post-facto cover-your-ass rationalization that lacks credibility. Given more time, perhaps they could have come up with a better excuse, but Mohawk+crap is ridiculously lame.

The sensible explanation is that Sen. George Felix Allen is a casual racist to whom race-baiting epithets come naturally.

This "sensible explanation" makes sense to me as well, though I believe that the French word would be "macaque" rather than "macaca". My original reaction was well expressed by Ann Althouse, who (like me) had no previous experience with the term "macaca":

The mere fact that he looked at a dark-skinned man and said "Welcome to America and the real world of Virginia" is repugnant. And it turns out that "macaca" is an offensive racial term. It's hard to believe that's mere chance taken in conjunction with the "Welcome to America" stupidity.

It's hard to get around that, whatever influences there might have been from "mohawk" and "caca".]

Posted by Mark Liberman at 08:52 AM

August 18, 2006

Just in case

An American acquaintance who had used the phrase just in case twice in a technical article about the philosophy of linguistics was asked twice by the copy editor what he meant by it. A reasonable question if (as I suspect) the copy editor was British. There is an unusually nasty dialect difference in connection with this phrase.

In British English (and, let me say by way of update, also for most ordinary Americans), just in case always means "lest". Thus We'll take an umbrella just in case it rains means nothing other than "We'll take an umbrella in order to insure ourselves against the unfortunate possibility that it might rain." In the Everly Brothers' song "Just In Case", on their superb 1960 album It's Everly Time, the title clearly has this sense.

But in American English as used by those trained in the formal sciences and philosophy (and perhaps also now by some British writers who have picked up the usage of American scholars), just in case has acquired a new idiomatic sense: "in, and only in, the case where", or in other words, "if and only if". So a sentence like An integer greater than 1 is a prime just in case it has no divisors other than 1 and itself means "An integer greater than 1 is a prime if and only if it has no divisors other than 1 and itself."
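The logician's reading treats the sentence as a definition that holds in both directions. A minimal Python sketch of that biconditional follows; the function names are mine, and the brute-force divisor check is chosen for transparency, not speed.

```python
def has_only_trivial_divisors(n):
    """True when no integer strictly between 1 and n divides n."""
    return all(n % k != 0 for k in range(2, n))


def is_prime(n):
    """An integer greater than 1 is prime if and only if it has no
    divisors other than 1 and itself."""
    return n > 1 and has_only_trivial_divisors(n)


print([n for n in range(2, 20) if is_prime(n)])
```

For every integer greater than 1, the two functions agree, which is exactly what "just in case" asserts on the "if and only if" reading. On the "lest" reading there is nothing to compute at all.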

I have always thought this is a disastrous piece of dialect divergence. Imagine an American computation theorist talking to a British English speaker who has lived a sheltered life:

A: A set of strings has a decidable membership problem just in case it and its complement are both recursively enumerable.

B: What, you mean the set acquires decidability to guard against the worrying eventuality that it and its complement might turn out to be r.e.? You must be bonkers.

Or imagine a British English speaker planning a picnic with an American logician whose horizons are entirely limited to within logic and philosophy (someone totally unacquainted with the Everly Brothers or other ordinary people):

B: We'll take an umbrella just in case it rains.

A: What? Material equivalence holds between us taking an umbrella and it raining? What're you talking about? I don't see an implication in either direction, given that the second proposition is a contingent claim about the future; let alone a biconditional... You gotta be kidding.

I do not normally have much regard for the old saying about the British and the Americans being two nations divided by a common language; but it really does come to mind in this restricted case.

[Please note the green updates above. I am not saying that no American sources have the "lest" sense. I am saying that American logicians, mathematicians, philosophers, and such have departed from it. Try a Google search on this pattern:

"just in case"

That will bring up a slew of examples of the "if and only if" meaning.]

Posted by Geoffrey K. Pullum at 01:57 PM

How to baffle Welsh cyclists

By law, road signs in Wales must be printed in both English and Welsh. But let's hope the highway authorities generally do a better job with creating bilingual signs than they did with this unfortunate example between Penarth and Cardiff. Dymchwelyd may mean 'dismount' (among other things), but llid y bledren has nothing to do with cyclists — in fact, it means 'bladder inflammation.' Whoops! The mistranslation was first reported by icWales and was soon picked up by BBC and AFP for the entertainment of the rest of the world. The best guess among red-faced local officials is that someone accidentally entered cystitis instead of cyclists into an online English-to-Welsh translator.

I can see how an online translator could have been partially to blame, but how exactly does one accidentally enter cystitis for cyclists? I wonder if this might have been caused by the Cupertino effect: first cyclists was mistyped, then "corrected" by an automatic spellchecker to cystitis, and then finally misrendered into Welsh. If an online translation program were really involved, I think a more likely explanation would be an error in the lexicon, with the translation equivalent for cystitis erroneously duplicated for cyclist(s) (especially if they're adjacent headwords). Or the whole thing could have been human error, with the sign-writer consulting an English-Welsh dictionary and taking the gloss from the wrong headword on the page.

Even if the correct translation equivalent for cyclists had been chosen, the instruction on the sign would still not have been grammatically well-formed, as one Welsh language activist told icWales:

Owain Sgiv, an officer for the Welsh language campaign group Cymdeithas yr Iaith Gymraeg, explained: 'Roughly translated, llid y bledren dymchwelyd means bladder disease has returned.
'But I have to stress that the order in which the words have been placed means the sentence makes no sense whatsoever.
'It certainly does not mean anything like cyclists dismount.'
'Cyclists dismount is an awkward sentence to translate as there is no Welsh word for dismount,' he added.
'But the correct translation would be something like dim beicio, which means literally no cycling, or man disgyn i feicwyr, which means fall-off area for cyclists.'
Aran Jones, of Welsh Language group Cymuned, was equally baffled - although not for the first time.
He said: 'Llid y bledren means inflammation of the bladder.
'This sentence structure makes no sense, but dymchwelyd means return.
'This is a real peach. Road signs are mistranslated on an enormously regular basis, usually because people use online translators.
'But we don't often get them quite as insane as this.'

So dymchwelyd doesn't map precisely to dismount in the relevant sense, and furthermore, Welsh doesn't follow English in allowing a noun phrase and uninflected verb to be construed as an imperative (i.e., "Cyclists dismount" is understood as shorthand for "Cyclists must/should dismount from their bicycles"). Such is the danger of attempting a dictionary-aided word-for-word translation without having any knowledge of the syntax of the target language. And it's a danger that's clearly not limited to Chinese menus.

[Update #1: John Wells writes in to say:

Actually, "Aran Jones, of Welsh Language group Cymuned" is wrong. DYMCHWELYD doesn't mean 'return' (that's DYCHWELYD). It means 'overthrow, upset, subvert', which I suppose is a lot closer to 'dismount' than 'return' would be.
But yes, LLID Y BLEDREN certainly doesn't mean 'cyclists', but (literally) 'inflammation [of] the bladder'. The word for bladder is PLEDREN, but since it is feminine its initial consonant mutates after the definite article.]

[Update #2: This isn't the first time a mistranslated Welsh road sign has made the news. In January the BBC reported on this sign in Cardiff — unfortunately, the Welsh translation tells pedestrians to look to the left (chwith) instead of to the right.]

[Update #3: Regarding John Wells' comment above, Aran Jones emails to say that he received a media request inquiring about a road sign reading llid y bledren dychwelyd and responded accordingly. So it was evidently icWales that got dymchwelyd and dychwelyd confused.]

Posted by Benjamin Zimmer at 01:32 PM

Denial of service attack

   <church sign>

We saw a sign just like that the other day. Outside a bar rather than a church, but I'm in love with a church sign generator. The sign actually looked more like this one, but without the bit about the animals:

   <no shoes no service image>

What animal would you leave outside? But I digress. When my son noticed the sign, the following brief exchange took place...

Noah: Daddy, why can't you have shoes and shirts and service?
Me: Huh?
What I should have said: Well, son, the first two Noun Phrases are understood disjunctively, and the third is understood as the consequent of an implicit conditional which takes wide scope. You mistakenly analyzed both missing connectives as conjunctions. (Whack!) The same is true for this sign I once saw outside a hairdressers, denying curling service to scruffy would-be customers.

   <no pain no gain>

OK, I lied. Like most good things in life, I found it on the web. It's probably not from a hairdressers, and obviously is intended to be understood as involving implicit conjunctions. Randomly pick a natural occurrence of two noun phrases in a row, and I'll give you good odds they're understood conjunctively:

   <no retreat no surrender>

As far as I can tell, the implicit disjunction is more unusual than the implicit conditional, though conditional readings are often indistinguishable from conjunctions. Take the earliest no-no construction I came across, from our pal Shaxper's Timon of Athens:

Flavius: No care, no stop! so senseless of expense,
    That he will neither know how to maintain it,
    Nor cease his flow of riot: takes no account
    How things go from him, nor resumes no care
    Of what is to continue: never mind
    Was to be so unwise, to be so kind.

Flavius is complaining about his overgenerous master, Timon, who doesn't care about his rapidly emptying coffers, and (hence) doesn't stop spending money and giving stuff away. Is no care no stop a conjunction or a conditional?

But it's easy to pick up bona fide implicit conditionals on the street:

   <no pain no gain>

And summarizing the blogger's life:

   <no cure no pay>

Oddly, the No cure, no pay refrain is more common in the Netherlands than in the UK or US. The meaning is sufficiently bleached that it's standardly used to describe a lawyer taking on a case paid only on condition of winning. I've also seen the oxymoronic eggcorn no cure no pain. Something else about no cure no pay: it's not clear that pay is a noun. Verbs following no are not uncommon, though perhaps more common in dialectal or non-standard Englishes, as with this protest from Zambia:

   <no fish no eat>

We also find adjectives. On the web I found No Edie, No Happy! and No food no happy. There's often something mock-Asian about no combined with a non-noun, like the web example: No likee, no clickee. And this Korean movie

   <no sexy no happy image>

is rendered in English translation No sexy, no happy, though I need the services of a helpful web denizen to tell me what the original says literally. Asian salaciousness may be part of the subplot of the no-no construction (though one can't really make such a prognosis on the basis of the web, which filters everything through porn-tinted spectacles):

"no money no honey"

And to provide a musical frame for the no-no construction, here's the same thing in blues.

But I didn't.

Posted by David Beaver at 10:48 AM

August 17, 2006

A short judicial shift

In her recent U.S. District Court decision in the ACLU v. NSA case, Hon. Anna Diggs Taylor wrote:

As long ago as the Youngstown case, the Truman administration argued that the cumbersome procedures required to obtain warrants made the process unworkable. The Youngstown court made short shift of that argument and, it appears, the present Defendants’ need for speed and agility is equally weightless. The Supreme Court in the Keith, as well as the Hamdi cases, has attempted to offer helpful solutions to the delay problem, all to no avail. [p. 42]

"Made short shift" may well have been some anonymous law clerk's typographical error for the expected archaic idiom "made short shrift". Then again, the Eggcorn Database's entry for short shift already includes citations from the Guardian, the Stanford Daily, and the Johns Hopkins Newsletter. The reflexes of the verb to shrive in modern English are few and far between -- how many know what "Shrove Tuesday" is really all about? "Short shrift" would probably not survive at all if it hadn't been used by Shakespeare; and "short shift" could mean a number of semi-sensible things, from a limited period of duty to a compact gear lever or an abbreviated chemise.

As the OED explains, shrift has a long past but a doubtful present and future, with no citations for the "short shrift" idiom between William Shakespeare in the 16th century and Sir Walter Scott in the 19th:

[OE. scrift m., corresp. to OFris. skrift m. and f., MDu. schrift (schricht) f. and n., (Du. schrift), OHG. scrift f. (MHG., G. schrift), ON. skript, skrift f. (Sw., Da. skrift), vbl. n. f. SHRIVE v.
The meanings ‘penance’, ‘confession’ are confined to English and Scandinavian, arising app. from an original meaning of ‘prescribed penalty’. The other languages have only the senses ‘writing’, ‘graphic art’, ‘scripture’, ‘written character’.]

1. Penance imposed by the priest after confession; chiefly in phr. as to take, nim shrift; to do shrift; to give shrift. Obs.

9. short shrift: orig. a brief space of time allowed for a criminal to make his confession before execution; hence, a brief respite; to give short shrift to, to make short work of.

1594 SHAKES. Rich. III, III. iv. 97 Make a short Shrift, he longs to see your Head.
1814 SCOTT Ld. of Isles V. xxxii, Short were his shrift in that debate... If Lorn encounter'd Bruce!

Still, "made|make|makes|making short shift" has only 203 Google hits, to 25,800 for "made|make|makes|making short shrift".

[Hat tip to Fernando Pereira.]

Posted by Mark Liberman at 10:26 PM

Gay marriage and counting the planets

When I first wrote on Language Log about the issue of same-sex marriage, I thought, and still think, that there was some dirty work afoot in talking about the issue as if it was a linguistic one. It is not about defining what the word "marriage" will stand for, I argued; there are already dictionary entries allowing a sense in which there can be a same-sex marriage, so it's not a contradiction to talk about such a union (the way it would be a contradiction to talk about a kind of triangle that had four corners). It's not about word meaning at all; it's about whether certain people will get certain rights. The laws of my home state currently deny my friends Susannah and Kirstin the right to file joint tax returns or be considered each other's default inheritor of property, and so on. Writing a law that ensures they will be denied those rights cannot honestly be downplayed as harmless democracy in a matter of mere lexicography (recall President Bush's remark that "Marriage ought to be defined by the people, not by the courts"). It's interesting to compare this with the current issue of how many planets there are. That truly is just a matter of lexicography.

Next week astronomers will vote on a newly proposed definition (not redefinition — we've never really had a definition before). There are various positions out there, including a restrictive one that says Pluto is an eccentrically-orbited snowball from the Kuiper belt that should never have been included, so there are 8 planets, and a very inclusive one that allows at least 53 spherical sun-orbiting bodies to count as planets (an idea some astronomers refer to contemptuously as "No Snowball Left Behind"). The official proposal is to say that a planet is a body that meets the following four conditions.

  1. it has sufficient mass to assume near-spherical shape because of its own gravity, and
  2. it is in orbit around a star, and
  3. it is not itself a star, and
  4. it is not a satellite of a planet in the sense of having an orbit that goes around a center of gravity that is located inside a body that is independently a planet.

Under this definition, the number of planets will go up from 9 to at least 12, and unless a committee keeps a lid on things, the number could go way up from there to 53 or more.

The last of the four clauses quoted looks a little bit gerrymandered, doesn't it? It turns out to allow Charon to be a planet: Charon looks a lot like the largest moon of Pluto, but in fact it does not orbit around a center of gravity inside Pluto; it revolves around a center of gravity determined by both bodies and located in the space between them, so you can see the two of them as a pair of small planets orbiting the sun but at the same time slowly twirling around each other like a couple waltzing around the outer edge of a ballroom. (Our own moon's orbit is around a center of gravity located deep within the earth, so the moon definitely can't count as a planet.)
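The arithmetic behind clause 4 can be sketched in a few lines of Python. This is only a rough illustration: the masses, separations, and radii below are approximate published figures that I am supplying as assumptions (none of them come from the proposal or from this post), and the function names are made up for the example. The center of gravity of a two-body pair lies at a distance from the larger body's center equal to the separation times the smaller body's share of the total mass.

```python
# Approximate figures, supplied here as assumptions for illustration only.
# Masses in kg, distances in km.
PAIRS = {
    "Earth-Moon": {
        "m_primary": 5.972e24, "m_secondary": 7.342e22,
        "separation_km": 384_400, "primary_radius_km": 6_371,
    },
    "Pluto-Charon": {
        "m_primary": 1.303e22, "m_secondary": 1.586e21,
        "separation_km": 19_600, "primary_radius_km": 1_188,
    },
}


def barycenter_offset_km(m_primary, m_secondary, separation_km):
    """Distance from the primary's center to the pair's center of mass."""
    return separation_km * m_secondary / (m_primary + m_secondary)


def counts_as_satellite(m_primary, m_secondary, separation_km, primary_radius_km):
    """Clause 4, roughly: the center of mass lies inside the primary."""
    offset = barycenter_offset_km(m_primary, m_secondary, separation_km)
    return offset < primary_radius_km


for name, pair in PAIRS.items():
    offset = barycenter_offset_km(
        pair["m_primary"], pair["m_secondary"], pair["separation_km"]
    )
    verdict = "satellite" if counts_as_satellite(**pair) else "not a satellite"
    print(f"{name}: center of mass {offset:.0f} km from primary's center -> {verdict}")
```

With these rough inputs the Earth-Moon center of mass falls well inside the earth, while the Pluto-Charon one falls outside Pluto, which is the distinction the clause turns on.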

Given clause 4, the peculiar anomaly of Charon's almost-independence of Pluto scrapes in as an extra planet. Pluto retains planet status too. (One astronomer who thinks it shouldn't is Neil deGrasse Tyson. He said recently, in a remark that is a clear example of linguifying, that fans of Pluto get lucky with the proposed official definition: "It is one of the few that allow you to utter Pluto and Jupiter in the same breath.")

The other two objects that get into the hall of planetary fame are Ceres (which was first counted as a planet and then demoted in modern times to being considered a very large and unusually spherical object of the asteroid belt between Mars and Jupiter), and 2003 UB313, unofficially known as Xena (yes, it's been jokingly named after the warrior princess on TV).

But in all of this, no planet loses any rights. Orbits stay the same, like masses, distances, year lengths, and so on. No census issue is raised: the very same objects are, uncontestedly, out there, governed by the same laws of physics. If the new definition gets the votes at the International Astronomical Union on August 25, all that will happen is that we will start using the word "planet" somewhat differently (if we decide to go along with astronomers' usage: we don't have to).

That's what it's like when a truly terminological issue comes up. When the International Astronomical Union makes its decision, I predict there will be no lawsuits. Only the definition of a word is at issue. It's not that way with the issue of whether you count as married to the dying domestic partner whose doctor you want to consult with; the marriage issue has real consequences for people's lives and is not just a matter of how we define a word.

Oh, and one other piece of lexicography is on the table: astronomers are thinking of introducing the word pluton, meaning something like "planet that is really nothing more than a God-forsaken frozen slushball way out beyond Neptune drifting around in the Kuiper belt and damn lucky to be called a planet at all". I paraphrase, but you get the general drift. Pluto and Charon and Xena may get to be called planets, but they are not quite to be compared with the gas giants (Jupiter, Saturn, Uranus, Neptune) or the inner planets that are warm enough that there has been speculation about the existence of life on them (Mercury, Venus, Earth, Mars). (One of the latter group, Earth, is actually known to have developed blogs, pepperoni pizza, cell phones, and intimate massage, all of which are considered prime indicators of the probability of some kind of life.)

Posted by Geoffrey K. Pullum at 08:32 PM

To prudently recreate

Something about those days recently when President George W. Bush was riding his mountain bike around his ranch near Crawford, Texas, and making occasional remarks to the press about the crisis-ridden Middle East and very high oil prices, reminded me of many years ago, when his father President George H. W. Bush was tooling around the bay off Kennebunkport in a high-powered speedboat during an earlier time of war in the Middle East. Reporters asked Bush (Sr.) whether he thought it was right, during a time of great national anxiety about the interrupted oil supply, to be spending precious gasoline on riding a huge boat around the Maine coastline for fun. And I remember that President Bush said, a tad defensively, that he thought it was perfectly appropriate for Americans "to prudently recreate". The barbarous phrase did convince me of the elder Bush's capacity to hideously back-formate.

Umm... Forgive me my little linguist joke, which I suppose I had better ruin by explaining it, for we do not like to be obscure here on Language Log. The issue is not that he used a split infinitive; split infinitives are fine — they're grammatical, and always have been. No, it was the word recreate, with the first syllable pronounced like wreck. The existence of the noun recreation, meaning "pursuit of non-work-related activities with a view to fun or relaxation", you see, is not a guarantee of the existence of a related verb recreate. (True, there is a completely different verb that I'll write as re-create, in which the first syllable is the prefix also seen in re-educate or re-enter. But that means "create again", and it is not the word I'm talking about.)

Assuming that there must be a verb because of some noun that appears to have been formed from it, and inventing that verb when in fact it is ahistorical, is a familiar process known to historical linguists as back-formation. Some back-formations catch on and become fully respectable; edit is one.

One can use the term back-formation to illustrate its meaning: there is no verb *back-formate, just as the noun formation is not derived from a verb *formate. If someone were to mistakenly assume there was such a verb, as I pretended to do at the end of my first paragraph, they would have coined a back-formation.

If you did not get my joke first time round, perhaps now that I've ruined it, you can (just for recreation) re-create it. Prudently, of course.

Update: Of course, just because something struck my ear as new in 1991 or whenever it was, that doesn't mean it was. People have written to me with evidence of other, earlier citations for recreate as a verb meaning "take recreation" or "amuse oneself", so it is perfectly possible that GHWB did not imprudently back-formate. A few of the citations are hundreds of years old. The noun recreation does seem to be older than the verb; and the verb is regarded by the OED as now mainly American; but I'm probably wrong about GHWB having been a back-formator, which ruins my little joke even more, doesn't it? I'm dyin' out here. Who writes this stuff for me? Get me rewrite.

Posted by Geoffrey K. Pullum at 07:14 PM

Hungarians wanted

If you're a native speaker of Hungarian, know how to make an audio recording on your computer (or would like to learn), and are willing to devote a few minutes to science, please contact me. The goal is to do a small experiment to test the relative speaking and subvocalizing rates of Americans and Hungarians. (If you can recruit a few Hungarian-speaking friends to join the group, so much the better.)

Posted by Mark Liberman at 08:03 AM

August 16, 2006

Those fast-talking Hungarians Marketing Researchers

According to the New York Times, some scientists found that

Hungarians are far better than Americans at recalling long prices; on average, they can recall 19 to 24 syllables with decent accuracy, while Americans can recall only 13. The authors suggested that this was because Hungarians speak 41 percent faster, both out loud and when repeating sounds to themselves “subvocally.”

I expressed skepticism, and promised to say more when the paper came out. Google Scholar found me an online preprint: Marc Vanhuele, Gilles Laurent, and Xavier Drèze, Consumers' Immediate Memory for Prices, and Eszter Hargittai (who blogged about it at Crooked Timber) pointed me to the final version online here. (This discussion is mostly based on the preprint, but a scan of the final version suggests that the experiments haven't changed much, except that the researchers appear to have added a subvocalization-rate experiment on American subjects.)

What I learned from reading the manuscript, alas, is that my skepticism was amply warranted.

1. The American and Hungarian experiments on price recall involved different stimuli and different experimental methods, so that cross-language comparison of the number of syllables remembered becomes problematic.
2. Specifically, the subjects' "syllable span" in recall of prices was not measured in the same way in the two languages. Instead, it was inferred from averages combining quite different sets of factors in the different languages, reflecting the different experimental designs. Given the differing structures of the experiments and the low percent of variance accounted for by regression models of the data, these inferred estimates of digit-string memory shouldn't be compared.
3. The researchers never measured the actual speaking rates of either the American or the Hungarian participants, nor (as far as the paper tells us, anyhow) of any other Americans or Hungarians. Instead, they measured the subvocalizing (i.e. imaginary speaking) rates of Hungarian speakers and French speakers. (In the final version, they compare subvocalization rates of Hungarian speakers and American English speakers.) This is a problematic thing to do: how do you really tell when someone starts subvocalizing and when they finish? The authors don't tell us their method, or give any assurance that the same method was used in France and in Hungary. Just as important, the strings being subvocalized were not comparable in the two languages, being samples of the differing stimuli used in the two different experiments.

The American and Hungarian stimuli were different on purpose -- that was the whole point of using Hungarians:

"...given our previous demonstration of the effect of the number of syllables of the price on price recall, we are interested in examining how consumers in countries with currencies with high face value deal with the challenges of price recall. Our two previous recall tests used monetary units with comparable face values -- the euro and the dollar -- but in some other currencies, the prices of the products include many more digits. The Hungarian marketplace is ideal to examine both dimensions. The Hungarian forint is pegged to the euro at an exchange rate of approximately 250 forint per euro (202 forint per dollar at the time of the study). In addition, price endings in Hungary follow a restricted number of patterns, which enables us to manipulate the visual "usualness" of prices in the Arabic format. Hungarian prices do not use decimals, and though coins of 1, 2, and 5 forint exist, unit prices that are three digits long (e.g. for the candy product category used in our study) always end in a 5 or 9. Prices with five or six digits (e.g., digital cameras) always end in 90 or 00."

In other words, the Americans saw prices like $3.91 or $543, whereas the Hungarian prices for the corresponding items would have been 789 or 109,690 (presumably with whatever the forint sign is, and I suppose with periods instead of commas marking the thousands). We'd have to look at the full range of stimuli and their likely pronunciations in English and Hungarian to figure out what effects this might have -- but in any case, the stimuli were quite different in kind as well as in detail.

The experimental conditions were different, it seems, out of practical necessity. In America, "ninety-one U.S. undergraduate business students participated for course credit in this online experiment", presumably participating individually, at times and in places of their choosing. In Hungary, "Due to practical constraints participants could not be tested individually and were tested in groups, which precluded a full randomization of prices across word lengths and presentation and test orders. Participants viewed a PowerPoint slide show with the instructions, study, and test screens." I believe that the Hungarians were tested in class; at least, this is explicitly what was done with the French participants in a pilot experiment.

Could these differences make a difference? Well, from what I know about Penn undergraduates (and as faculty master of Ware College House, I live among them), I'd guess that some of the American participants might have taken part in "this online experiment" while simultaneously listening to music (their own or their roommate's), participating actively or passively in a conversation, and otherwise multi-tasking. As for the Hungarians, they might have been distracted in other ways, but they also might have (even without wanting to) been able to see the answers written down by their neighbors in class.

Note that these experiments were fairly complex, and in different ways:

The main experimental variables of interest are the price length of the target product in number of syllables and the price length of the other products on the study screen (3-8 syllables each). Recall performance might also be influenced by the number of products per study screen (2 or 3), the presentation order (from left to right, 1, 2, or 3), and the question order (1, 2, or 3), which affects the time the consumer has to rehearse an item before recalling it. Therefore, we introduce these factors as control variables. ...

For each product, we created a price we labelled "short" (S) and another we labeled "long" (L). For study screens that displayed two products, we used the ... four price/length combinations: SS, SL, LS and LS [sic -- probably should be LL]. For three-product screens, we provided eight combinations from SSS to LLL. ... To limit the length of the task, we used a fractional factorial design in which we showed each subject only two of the three product categories ... Each subject therefore viewed a total of 24 screens with 64 product-price pairs to recall, and the product sequence, price length conditions were randomized for each subject.

That's for the American subjects, who were tested on three product categories (candy, DVDs and digital cameras). The Hungarians were tested in groups, and in a different way, using "the staircase method in which the total number of syllables to be remembered on each subsequent screen increases". They were tested on only two categories, candy and digital cameras. The candy trials escalated from 8 to 21 syllables, and the camera trials escalated from 21 to 42 syllables. (The Americans would have seen screens with a total of between 6 and 24 syllables, a much lower range.) As a result, in the Hungarian experiment,

...there is a high correlation between the number of syllables of the target and the other price(s) in a given trial (candy prices are three digits long; camera prices are six digits). In the logistic regression, we therefore used only one variable for syllable length to code the total syllable length of all prices in a given trial.

And see the paper for the details of an additional complication: the variation in the Hungarian experiment between "usual" prices (3-digit prices ending in 0 or 9, 6-digit prices ending in 90 or 00) and "unusual" ones (all the other possibilities). One group of subjects got only "usual" prices; the other group got 50% "usual" prices and 50% unusual ones. This is important, because it means that 3/4 of the Hungarian prices -- all the prices for the "usual" group, and half the prices for the "unusual" group -- contained significantly less information than the number of digits involved would indicate. In the case of the three-digit "usual" prices, there were only 200 possibilities instead of 1,000 (about 7.6 bits instead of 10 bits), and in the case of the six-digit "usual" prices, there were only 20,000 possibilities instead of 1,000,000 (about 14.3 bits instead of 19.9). Combined with the fact that the length-in-syllables conditions were randomized in the English-language experiment, but always yoked stepwise for the Hungarians, it starts to look like an information-theoretic measure of the difficulty of the Hungarian and English tasks might end up nearly the same, despite the apparent difference in the number of digits in the prices.
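For anyone who wants to check the bit counts, here is a minimal Python sketch. It assumes, as the figures above do, that every digit string of the given length is equally likely, so the information content is just the base-2 logarithm of the number of possibilities; the variable names are mine, not the paper's.

```python
import math


def bits(n_possibilities):
    """Information in one choice among equally likely possibilities."""
    return math.log2(n_possibilities)


# Three-digit prices: all strings vs. "usual" ones (2 of 10 final digits allowed).
three_digit_all = 10 ** 3
three_digit_usual = three_digit_all * 2 // 10

# Six-digit prices: all strings vs. "usual" ones (2 of 100 final digit pairs allowed).
six_digit_all = 10 ** 6
six_digit_usual = six_digit_all * 2 // 100

for label, n in [
    ("3-digit, unrestricted", three_digit_all),
    ("3-digit, usual", three_digit_usual),
    ("6-digit, unrestricted", six_digit_all),
    ("6-digit, usual", six_digit_usual),
]:
    print(f"{label}: {n:,} possibilities, {bits(n):.1f} bits")
```

The uniform-distribution assumption is a simplification; real prices are surely not uniformly distributed, so these are upper bounds on the information actually carried.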

OK, enough. I'm not convinced by the proposed inferences about comparative memory spans, whether measured in digits or in syllables. The purpose of this experiment was to "[examine] how consumers in countries with currencies with high face value deal with the challenges of price recall", and I'm happy to grant that it succeeds in doing so. It convinces me that Hungarians have no more trouble than Americans or French people do in remembering the prices of candy and digital cameras, despite the fact that their currency exchange rates make typical prices for these items take more syllables to pronounce than corresponding American or French prices.

But the researchers didn't measure the digit spans or speaking rates of Americans and Hungarians (or French either). They measured the ability to remember the prices of candy, digital cameras (and for the Americans, DVDs), using different stimuli, different experimental tasks, and different experimental settings in the experiments on subjects in different countries. Specifically, they grouped objects by 2s and 3s, but grouped the stimuli in different ways, presented the stimuli by different methods, and recorded the responses in different ways in the different countries. Then to estimate the number of syllables in a price that subjects were likely to remember 50% of the time, they averaged across the (many) experimental conditions other than the number of syllables in the target price -- conditions which were very different for the Hungarian participants and for the English-speaking (and French-speaking) participants. The result is interesting but extremely unlikely to yield reliable or comparable numbers for the subjects' ability to recall a given number of syllables.

In a more ideal universe, the people who write for a major paper like the New York Times would be able to figure such things out. I guess we have to blame people like me (by which I mean the faculty of American universities) for failing to educate them properly.

Posted by Mark Liberman at 10:47 PM

Majority/minority complex

Time to comment on the results of last week's Language Log poll concerning the words majority and minority. Unlike the majority of folks who commented on the poll, I'm not going to try to explain the results, and I'm certainly not going to tell you which choice is correct. This is Language Log, not ... well, not someplace else.

I don't know what got me thinking about all this in the first place, but a couple of weeks ago I came to what I thought was an odd conclusion about my own native English-speaking intuitions: that I would fill in the blanks this way.

The poll shows that a majority of people are against the war.

The poll shows that a minority of people is against the war.

(FWIW: at least two commenters agree with me, but at least two commenters don't. Go figure.)

My intuitions are a little more subtle than this, though. In the majority case, I'm very certain, and substituting is for are sounds wrong to me. In the minority case, on the other hand, I'm significantly less certain, and substituting are for is sounds OK -- though is is preferable enough for me to make the choice.

As of 3:50pm PDT today, 71% of you (930 of 1316) chose are in the case of majority, vs. 29% or 386 of 1316 who chose is. In the case of minority, 63% of you (711 of 1129) chose are, vs. 37% or 418 of 1129 who chose is. Given comments like this one ("That's what I learned, and I'm sticking to it."), I'm thinking that the is answers in both cases are somewhat overrepresented. In any event, there's clearly a strong preference for are in both cases, and that preference is stronger in the case of majority, which loosely matches the difference in my intuitions in the two cases.
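The percentages can be recomputed from the raw counts with a few lines of Python. This is only a sketch of the arithmetic; the function name `share` and the dictionary layout are made up for illustration.

```python
def share(count: int, total: int) -> int:
    """Percentage of total, rounded to the nearest whole percent."""
    return round(100 * count / total)

# Raw vote counts quoted in the paragraph above.
poll = {
    "majority": {"are": 930, "is": 386},
    "minority": {"are": 711, "is": 418},
}

for noun, votes in poll.items():
    total = sum(votes.values())
    for verb, count in votes.items():
        print(f"{noun:8s} {verb:3s} {count:4d} of {total}: {share(count, total)}%")
```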

Some quick-and-dirty googling on several variations reveals an even more interesting picture. I did two types of searches, one with a wildcard in object-of-of position and another with that same wildcard plus another one in prenominal position (to catch examples such as a significant majority of ...), and I grouped the results together. I didn't do anything fancier than that, and so these numbers are not as reliable as one would like, and I was too lazy to try any other verbs besides is/are. Anyway, here are the results when the determiner is a alongside the results when the determiner is the:

search string      | % ghits, x = a | % ghits, x = the
x-maj-of-are total | 76.38%         | 72.18%
x-maj-of-is total  | 23.62%         | 27.82%
x-min-of-are total | 66.43%         | 54.84%
x-min-of-is total  | 33.57%         | 45.16%

If you leave out the of-phrase and just search for a/the (*) majority/minority are/is, things get really interesting. The percentages get a little closer to each other in the case of the (*) majority are/is, and they flip almost completely in the case of a (*) majority are/is. The minority cases flip, too, but the a case looks like a flip of the previous the case (closer to 50-50) and vice-versa (closer to 60-40).

search string   | % ghits, x = a | % ghits, x = the
x-maj-are total | 29.12%         | 62.55%
x-maj-is total  | 70.88%         | 37.45%
x-min-are total | 45.52%         | 37.49%
x-min-is total  | 54.48%         | 62.51%

The flipping observed here is predicted by the theory in this first comment (and others): that it's the (presence and) proximity of the plural object of of that establishes a preference for are in the former set of cases with the of-phrase. (Why flipping fails in that one case beats me, though.) You might think that a (as opposed to the) would tend to pull things toward is, but this is only true in one case (the completely flipped case vs. the nonflipped case, so there may be something else going on here). I have to admit, I'm baffled.

Many commenters are also baffled, but by something else: why anyone in their right mind would choose are in either of these cases (which, as I've just summarized, most people do in most cases, so everyone seems to be out of their mind). For example, Emily writes:

I'm stunned (stunned!) that there could be any disagreement on this topic at all. Both are singular nouns! One can have "a majority."

"A lot" is an exception. It may have once denoted exactly one "lot" of people, but since that noun became old-fashioned (hie ye hence, the lot of ye!!), it's become a phrase like "many." (You can tell because people keep trying to write "alot," the same way they often write "alright.")

I think what's confounding people is not the semantic "manyness" of the words but the proximity of the "of people" modifier. This is a classic SAT error we teach all our students to avoid. ("They stuff it with words to confuse you!" we tell them. "don't be seduced!")

Because really, which would you prefer: a language where verbs agree with any damn nouns they're next to, or one with heat-seeking verbs that see through entire clauses to lock unshakably onto their subjects?

There's the proximity-of-the-plural-of-object argument again. While this indeed seems to be part of what's going on, there has to be more to it: even without the of-phrase, there are still plenty of folks who insist on using are.

The "a lot" example is (I think) a reference to this immediately preceding comment:

Suppose you replaced 'a majority' with 'a lot'. I'd expect all those who apply this singular rule to suddenly retract and go for the plural form of 'be'. I mean, "A lot of people is opposed to the war" sounds utterly aweful (sic), yet it is just as singular an entity as 'a majority'.

This is an interesting tactic: bringing up an example about which virtually nobody disagrees to talk about an example about which there's significant disagreement. This reminds me of the arguments that are sometimes used to convince people that you shouldn't say "Me and so-and-so did such-and-such". You would never say "Me did such-and-such", right? It's obviously "I did such-and-such", so it must be "So-and-so and I did such-and-such". It makes a certain amount of sense, but it's clearly not very relevant to the facts: there's disagreement in one case but not in the other, and what we should be trying to figure out is why there is this difference in the two cases.

Well, I've about exhausted myself with this one. I'll bet that a significant minority of you will comment. (How's that for avoiding the issue?)

Posted by Eric Bakovic at 06:57 PM

Omit needless commas

In a recent Wired interview, Bart Kosko explains why he's given up commas:

Q: I noticed there aren’t any commas in your book. Is this your way of cutting back on punctuation noise?
A: Commas are a kind of channel noise. You’re not getting to the verb fast enough. Why make us wait? The comma is on its way out. Use small words. The perfect illustration is a swear phrase: Go to hell! Screw you!

Hell, why not leave out the spaces, too, andgettothoseverbsevenfaster? Kosko is plugging his new book Noise, and I guess he's newly converted to commalessness, since his previous book, Fuzzy Thinking, was full of them, nineteen on the first page alone.

This calls to mind Michel Thaler and his 2004 verbless novel Le Train de Nulle Part ("The Nowhere Train"), dedicated "à tous les partisans de la décolonisation de l'écrit et de la mise à mort ... du verbe" ("to all the partisans of decolonization of writing and of putting the verb to death"). And Gertrude Stein was all over the punctuation issue 70 years ago: "when I first began writing I was completely possessed by the necessity that writing should go on and on and if writing should go on what had colons and commas to do with it". Though Kosko doesn't want writing to go on and on, apparently, he wants it to get quickly to the verb and stop. Anyhow, the California Digerati seem to be taking on the rationally irrational quirks of 20th-century French intellectuals, without even following Stein from California to Paris. [Hat tip to Phil Resnik, who was too lazy to blog it himself]

Posted by Mark Liberman at 11:06 AM

Hungarian speech rate and the tribunal of revolutionary empirical justice

A couple of days ago, a reader ("aka darrell") sent me a link to a New York Times article citing a claim that "Hungarians speak 41 percent faster" than Americans do (Alex Mindlin, "For a Memorable Price, Trim the Syllables", 8/14/2006):

Consumer researchers know that people are terrible at remembering store prices: two seconds after taking a product from a shelf, the average person has roughly a 50 percent chance of remembering how much it cost. But few researchers have examined why some prices are more memorable than others.

According to a new study, it is a matter of syllables. Each extra syllable in the price reduces the chances of it being recalled by 20 percent, according to the study, which will be published in the September issue of The Journal of Consumer Research. In other words, someone faced with a $77.51 camera (eight syllables) and a $62.30 bookshelf (five syllables) is about 60 percent more likely to forget the camera’s price than the bookshelf’s, after half a minute.

“The way information goes from the environment to your memory, there is this phonetic loop which is a two-second buffer,” said Xavier Drèze, one of the study’s authors.

Hungarians are far better than Americans at recalling long prices; on average, they can recall 19 to 24 syllables with decent accuracy, while Americans can recall only 13. The authors suggested that this was because Hungarians speak 41 percent faster, both out loud and when repeating sounds to themselves “subvocally.”

The cited article in The Journal of Consumer Research won't be available for a month or so, and so my first thought was to leave this on the to-blog list until then. But I've posted recently about "Sex and speaking rate", and there are some other points here worth discussing; so I'll take a shot at it now, and come back again when the promised journal article appears. [Update: I found a preprint online, and Eszter Hargittai blogged about this work at Crooked Timber and pointed me to an online version of the final paper, so further discussion is here.]

It's plausible, to start with, that the ability to remember an exact price depends inversely on the length of the phrase required to name the price -- though I'll bet, on common-sense grounds, that memory for approximate prices behaves quite differently. But the piece of this that attracted my attention is the assertion that "Hungarians speak 41 percent faster" than Americans.

I don't have a lot of experience listening to Hungarian, but such experience as I have doesn't leave me with the impression that it's spoken especially fast compared with English. However, it's not going to be easy to measure the difference, if there is one. First, anyone who has ever taken a look at speech rate, or even thought about it much, knows that people vary in their habitual rates, and that lots of factors affect how fast any given individual talks (emotional state, cognitive load, fatigue, etc.), and that a given individual in a given situation can consciously choose to speak faster or slower. Any one of these factors can easily make a difference of 50% or so. Therefore, a believable cross-language comparison would need a lot of subjects, and you'd have to be certain that you were comparing comparable samples from comparable populations of subjects in comparable settings doing comparable things. Second, because languages have different sound inventories, different word and syllable structures, and different densities of information per word or syllable or whatever, it's not entirely clear how to denominate speaking rate in units that can fairly be compared across languages.

The work referenced in the NYT article seems to be using syllables as the units of measure, so let's take a few trivial measurements to give us an idea of how this is likely to work out. From a Google search for Magyar audio, I picked a recent hir-TV segment ("Hankiss Elemér a szolidaritásról", July 24, 2006), in which Ókovács Szilveszter interviews Hankiss Elemér. In the interviewer's first 10 breath groups, I counted 114 syllables in 26.4 seconds, giving us 232 msec. per syllable, or 259 syllables per minute. From a web search for English-language interviews, I picked an MSDN segment from 2004 in which Shawn Morrissey interviews Steve McConnell. I counted 117 syllables in the first 28.08 seconds of McConnell's first answer, for a rate of 240 msec/syllable, or 250 syllables per minute.

OK, so the Hungarian guy was 3.6% faster. Not exactly 41%, but it's in the right direction.

Not so fast, though -- McConnell is an author, not a professional talker, and he's responding to a question, not reading a prepared statement or question. So I picked a Kai Ryssdal segment ("Good day on Wall St."), from yesterday's Marketplace radio show. Ryssdal's opening bit involves 83 syllables in 17.971 seconds, for 217 msec/syllable and 277 syllables/minute. That's 6.9% faster than the Hungarian interviewer. And the start of the body of Ryssdal's segment had 119 syllables in 24.826 seconds, for a smokin' 209 msec/syllable and 288 syllables/minute, or 11% faster than the Hungarian.
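The arithmetic behind these rates is easy to reproduce. The sketch below, in Python, takes the syllable counts and durations quoted above and converts them to milliseconds per syllable and syllables per minute. The sample labels and the helper names `rates` and `percent_faster` are invented for illustration, and the comparison uses the rounded syllables-per-minute figures, which appears to be how the percentages in the text were obtained.

```python
def rates(syllables: int, seconds: float) -> tuple[int, int]:
    """Return (msec per syllable, syllables per minute), each rounded to a whole number."""
    return round(1000 * seconds / syllables), round(60 * syllables / seconds)

def percent_faster(rate_a: float, rate_b: float) -> float:
    """How much faster rate_a is than rate_b, in percent."""
    return 100 * (rate_a / rate_b - 1)

# (syllables, seconds) for each sample described in the text.
samples = {
    "Hungarian interviewer": (114, 26.4),
    "McConnell": (117, 28.08),
    "Ryssdal, opening": (83, 17.971),
    "Ryssdal, body": (119, 24.826),
}

spm = {}
for name, (syllables, seconds) in samples.items():
    ms, per_minute = rates(syllables, seconds)
    spm[name] = per_minute
    print(f"{name:22s} {ms} msec/syllable, {per_minute} syllables/minute")

hungarian = spm["Hungarian interviewer"]
print(f"Hungarian vs. McConnell: {percent_faster(hungarian, spm['McConnell']):.1f}% faster")
print(f"Ryssdal opening vs. Hungarian: {percent_faster(spm['Ryssdal, opening'], hungarian):.1f}% faster")
print(f"Ryssdal body vs. Hungarian: {percent_faster(spm['Ryssdal, body'], hungarian):.0f}% faster")
```

Swapping in other samples shows how sensitive the comparison is to which speaker and which stretch of speech one happens to pick.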

This is silly, of course. We can't conclude anything meaningful from one or two short samples from one or two speakers of each language, even in the more-or-less consistent communicative context of a broadcast interview. But at least we can see that even if there's really a speaking-rate difference between Hungarian and English, the essentializing phraseology of the NYT article ("Hungarians are 41% faster") is profoundly misleading.

The reader who sent the NYT link wrote that "[t]his struck me as being the sort of thing you often like to refute". Really, though, I don't enjoy refuting things, of this or any other sort. I'm a positive kind of person, temperamentally inclined to the enthusiastic suspension of disbelief. But I subscribe (out of reluctant conviction) to what Cosma Shalizi (following Ernest Gellner) calls the "Anglo-Austrian Tribunal of Revolutionary Empirical Justice":

The procedure of the court was as follows: the accused was blindfolded, and the magistrates then formed a firing squad, shooting at it with every piece of possibly-refuting observational evidence they could find. Conjectures who refused to present themselves might lead harmless lives as metaphysics without scientific aspirations; conjectures detected peaking out from under the blindfold, so as to dodge the Tribunal's attempts at refutation, were declared pseudo-scientific and exiled from the Open Society of Science. Our best scientific theories, those Stakhanovites of knowledge, consisted of those conjectures which had survived harsh and repeated sessions before the Tribunal, demonstrated their loyalty to the Open Society by appearing before it again and again and offering the largest target to refutation that they could, and so retained their place in the revolutionary vanguard until they succumbed, or were displaced by another conjecture with even greater zeal for the Great Purge.

Like other political ideologies, this is a sort of secular religion. At least, it's hard to avoid echoing religious phrases in talking about it. Cosma observes that this ritual of refutation "is very reminiscent of The Golden Bough", talks about "our wanderings from the Goshen of superstition to the Canaan of statistical inference", and quotes Popper's self-consciously sacrilegious quip "better our hypotheses die for our errors than ourselves".

Fine points of doctrine aside, I'm a believer. And so when I lead a hypothesis before the Tribunal, it's an act of respect and devotion, not prejudice and hostility.

[Let me add that I'm also skeptical of the article's assertion that

Hungarians are far better than Americans at recalling long prices; on average, they can recall 19 to 24 syllables with decent accuracy, while Americans can recall only 13.

While there are well-documented differences in digit span among languages (whose explanation is, as I understand it, somewhat disputed), this difference seems too large. There could be something funny in the selection of subject populations, or (more likely) a misunderstanding on the part of the journalist, Alex Mindlin; but we'll have to wait until the article comes out to see.]

[Update -- I found a preprint, and Eszter Hargittai pointed me to the official version. The subject selection seems OK (basically business-school students in both countries), and the reporter seems to have taken the claims fairly directly from the paper. The problems are elsewhere -- see here for more discussion.]

Posted by Mark Liberman at 07:38 AM

Eskimo ballplayers have 108 words for slump

So (jokingly) suggests Ken Arneson over at Baseball Toaster:

We definitely could use some more precision when talking about slumps. If Eskimos can have N words for snow, why can't we have some more words to describe slumps? This would be especially useful for fantasy baseball players, because you'd want to drop a player if he's in one kind of slump, but keep him if he's in another. So I'm going to make up some more terms.

So when you want to know why Jorge Posada just go 0-for-25, you could answer, "Oh, I think he's in a...":

Ken creates and defines the terms Gauss, Byrnes, Gomes, Chavez, Miller, Crosby and Blass, and invites his readers to contribute the other 101. Of course, the link to the Language Log post "More rhetorical abuse of the Eskimo lexicon" is in Ken's original. To recycle Karl Marx's overused remark about Hegel on history: stereotyped rhetoric repeats itself, first as cliché, then as irony. Now we need a word for the self-consciously ironic use of a snowclone.

A few web examples, cut from the same bolt of cloth as mine:

History, Karl Marx might have observed had he been more savvy about public relations, repeats itself first as documentary, then as a panel discussion.
Revolution repeats itself, first as tragedy, then as entertainment.
James Taranto notes that this incident (if true) shows that history indeed repeats itself first as tragedy, then as Farsi.
History repeats itself first as tragedy then as fashion.
History repeats itself: first as collegiate high jinks, then as TV movie.
History, Karl Marx might have said, repeats itself: first as Thatcherism, then as bling bling.
History repeats itself, first as triumph, then as muddle.
Toy history repeats itself -- first as plastic merchandise, then as licensed interactive media.
No less a philosopher of historical dialectic than Karl Marx once claimed that history repeats itself first as tragedy and then as disco.
History repeats itself, first as irony then as law.
History repeats itself, first as the movie, then as the Austin Powers movie.
As Karl Marx should have said, history repeats itself: first as tragedy, then as pop culture.
History repeats itself, first as tragedy then as plagiarism.
I do reach back sometimes because history repeats itself -- first as tragedy, then as Presidency.
History Usually Repeats Itself First as Tragedy, and Then as an Existing Visual Amenity.
History repeats itself; first as Tom Clancy novel, then as farce.

[Update -- Ben Zimmer points out some similar examples of ironic snowclones/canards from Slate:

Middle East. Fighting intensifies in Lebanon as dozens of innocents die, but President Bush senses a "moment of opportunity." Linguists note that in Chinese, the character for "opportunity" also means "quagmire." And "Hezbollah" means "Party of Mel Gibson." ]


Posted by Mark Liberman at 07:33 AM

Makaku, macaco, macaque, macaca...

By now everyone's no doubt heard about Virginia Senator George Allen's unfortunate appellation for S.R. Sidarth, a 20-year-old of Indian descent working for Allen's Democratic opponent James Webb: [məˈkɑːkə], generally represented in press accounts and blogs as Macaca. (See the well-traveled YouTube video, reports in the Washington Post and the New York Times, and blogospheric reactions from Wonkette, Sepia Mutiny, Andrew Sullivan, Josh Marshall, and Ed Kilgore.) Allen addressed Sidarth this way twice: "This fellow here, over here with the yellow shirt, Macaca, or whatever his name is... Let's give a welcome to Macaca here." The Allen campaign claims that Macaca was just a silly name bestowed on Sidarth, who had been tracking Allen's appearances with a video camera, based on his Mohawk hair style. (Sidarth says he has a mullet, not a Mohawk.) But Macaca is also a genus of monkeys encompassing the macaques, and macaque and its cognates are evidently used as epithets for dark-skinned people in various European languages.

I won't speculate on whether Allen had any previous knowledge of the monkey genus or macaque-related racial slurs. (Some liberal bloggers are convinced that he must have known about the racial import of macaque, since it's used by some insensitive Francophones to disparage North Africans, and Allen's mother happens to be French Tunisian.) As far as I can tell, Allen's usage was nothing more than a dumb ad-hoc label for "funny-looking foreign guy," which is offensive for reasons having nothing to do with monkeys. Nonetheless, the etymology of macaque and related forms is interesting in its own right.

According to the Oxford English Dictionary, the name for the monkeys originates in (unspecified) Bantu languages where makaku is a plural form of kaku 'mangabey':

The form kaku is the name for the mangabey in a number of Bantu languages of southern Gabon and the Congo, and is generally regarded as imitative of the animal's cry. The plural is kaku, bakaku, or makaku, according to the language.

Around 1650 the word began appearing in European accounts of the Congo as makaku (in a vocabulary copied by the Flemish missionary Joris van Geel) and as macaquo (in Historiae Rerum Naturalium Brasiliae by Georg Marggraf/Marcgrav). The term settled into Portuguese as macaco, and by 1680 it appeared in French as macaque. In 1798 the French taxonomist Bernard Germain Étienne de la Ville Lacépède dubbed the genus Macaca, though the scientific name also appeared throughout the 19th century in the more Latin-sounding form Macacus.

Though English borrowed macaque and the genus name Macaca from French, echoes of the earlier Portuguese form macaco continued into the 19th century. John Camden Hotten's Slang Dictionary of 1874 contains an entry for murkarker mentioning a "famous fighting monkey" named "Jacko Macauco, or Maccacco":

The "vulgar Cockney pronunciation" given by Hotten as murkarker is evidently a non-rhotic pronunciation spelling intended to represent [məˈkɑːkə] — which, as it happens, is just about what Senator Allen was caught saying 132 years later.

Courtesy of the New York Public Library Digital Gallery, here is an engraving of a fight between Jacko Maccacco and "Mr. Thos. Cribb's well known bitch 'Puss'":

[Update, 8/16/06: The Hotline gives the latest rationale for Macaca from the Allen camp:

According to two Republicans who heard the word used, "macaca" was a mash-up of "Mohawk," referring to Sidarth's distinctive hair, and "caca," Spanish slang for excrement, or "shit."
Said one Republican close to the campaign: "In other words, he was a shit-head, an annoyance." Allen, according to Republicans, heard members of his traveling entourage and Virginia Republicans use the phrase and picked it up.

In the words of Wonkette, "Shittiest. Explanation. Ever."]

[Update, 8/19/06: Mark Liberman weighs in on the new explanation here.]

Posted by Benjamin Zimmer at 01:15 AM

August 15, 2006

Commas for kids

Finger-wagging Lynne Truss has reached down to the kindergarten set (ages 4-8), to put them straight on commas, in Eats, Shoots & Leaves: Why, Commas Really Do Make a Difference! (Putnam Juvenile, 2006, with illustrations by Bonnie Timmons; 32 pages).  Now that Strunk & White has been illustrated, can we expect a kids' version of it?  Go Away, Needless Words!

Language Log works hard to keep you abreast of hot developments in the publishing world -- the latest Dan Brown, Lynne Truss for the afternoon-nap set, the hiphop Syntactic Structures, with a CD jam-packed with hot tunes (ok, I made that one up).  Remember, you read it here first.

According to the book description:

Illuminating the comical confusion the lowly comma can cause, this new edition of Eats, Shoots & Leaves uses lively, subversive illustrations to show how misplacing or leaving out a comma can change the meaning of a sentence completely.

This picture book is sure to elicit gales of laughter -- and better punctuation -- from all who read it.

I raise an eyebrow at the better punctuation part.  Let's see some evidence, I say.  Not that I'm particularly enthusiastic about correct punctuation as a high priority for young children.  But here's a positive Amazon review from reader Robert Schmidt of Honolulu:

Examples: "Eat here, and get gas," versus "Eat here and get gas."

"The student, said the teacher, is crazy," versus "The student said the teacher is crazy."

Now I'd say this is a book for, say, 4-6 year olds. No more than 6. I think this is a book to be read to kids, not necessarily to have them read to themselves. One, it is fun to read, and kids and adults can joke about even more interesting examples of clever, contorted meanings. Two, it is a book that plants a seed for kids and adults seeing signs or other writings during the day that they can tease each other about. And if it makes kids AND adults more aware of the power of commas, so be it!

(I'm afraid that the comma really doesn't help the first example; the causal-sequence reading is still easy to get.)

I haven't seen the book yet, so I'm still hoping the ratio of fun to fodder-for-teasing is high.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:47 PM

Here we go again

Reuters tells us: "Snow White's dwarfs more famous than US judges" (8/14/2006).

Exercise for the reader: find a couple of relevant Language Log posts (e.g. "Freedom of Speech: more famous than Bart Simpson"; or "Selling ignorance") and adapt them to the occasion. I'd give it a shot myself, but it'd be a violation of the First Law of Blogging, which is "If it's not fun, don't do it". OK, it's not fair to ask you to do it either. So just re-read those old posts, and consider the Reuters story in that context.

Posted by Mark Liberman at 09:26 AM

Liberate the BBC

There was an interesting piece yesterday in the Independent about Ashley Highfield, "the decidedly ungeeky head of the [BBC]'s web operation". The headline says most of it: "Ashley Highfield: '99 per cent of the BBC archives is on the shelves. We ought to liberate it.'" In the body of the article, we learn that Highfield's operation has an annual budget of £400m and a staff of 1,500, and that

He plans to use this power base to put Britain at the forefront of internet-based technology and to transform all our lives by giving us access to the entire video archive of the BBC, a treasure trove of 1.2 million hours of film, where and when we want, and for free.

This sounds wonderful. And Highfield's goals go way beyond online archive access:

There is no reason, he believes, why the BBC - in co-operation with other British players - cannot exploit this video-led era to put Britain in a far more advanced position in the online world than it currently occupies. It does not, he says, have to stand back and give centre stage to US-based concerns such as Microsoft, Google, Yahoo! and Apple. "It's wide open. These [US] companies are only a few years old. There's no reason why we couldn't actually be the companies that come out on top of this second wave of the digital revolution. It is not too late."


The key weapon in this transition, he believes, will be the BBC film archive. "We've got one of the world's largest archives, if not the largest archive. And yet, because we've got so few channels - routes to our audience - inevitably 99.9 per cent of that content stays on the shelves. We ought to liberate it and make it available, how, when and where our audience would like to consume it."

Highfield's enterprise seems a lot more credible than what's been happening across the channel. His analysis of why Britain has lagged behind the U.S.? It's that old devil the two cultures:

"The streaming of people in England into arts and science means that people who can explain technology are few and far between. It's so rare in the creative industries to find creatives who are interested in technology, because a lot of them look down on it. It wouldn't happen in America or Germany," he says. "It's very rare as well to find technologists who have been taught how to sell their ideas. It's one of the reasons why the entrepreneurial culture here hasn't made many dotcom successes."

But if the article quotes him accurately (always an issue, alas), I'm not sure that Highfield himself gets it:

But now he faces his biggest challenge as the head of the BBC's newly-created Future Media and Technology division, as the internet starts to move into a second stage, widely referred to as 'Web 2.0', characterised by interactive, highly-visual, user-led sites such as MySpace and Bebo.

"I think we are about to go from the predominantly text-based, predominantly static world into the video-rich, dynamic, two-way engaging environment. That for me is when it starts to get really interesting. It's more than putting a newspaper online it's where you can really start to empower people and give them total control over their media consumption."

Consumption? Take a look at myspace or facebook and guess again, dude.

[Update -- Tom Phillips writes:

Just as an addendum to your post on Ashley Highfield's plans for the BBC, and whether or not Highfield "get's it", you might be interested in this post from Tom Coates, who used to work at the BBC. He's very much of the opinion that highfield doesn't get it, or that if he does, he doesn't know how to make it happen. There's an interesting debate in the comments between various UK media geek types, some defending Highfield and others agreeing with the criticisms.

Interesting. Here are some of the sceptical passages:

...[N]ot all parts of the organisation were similarly dynamic, despite the often amazing number of talented people working within them - specifically, in my opinion, Central New Media under the direct management of Ashley Highfield.

You'll have heard a lot of announcements coming out from his part of the organisation over the last few years, but surprisingly few of them have amounted to much. They all made headlines at the time, but they've all rather disappeared. Do you know what happened to the grand plans of the Creative Archive or the iMP? They were both being talked about in press releases in 2003, but the status of the iMP now appears to be a closed content trial and the Creative Archive has amounted to nothing more than a truncated Creative Commons license used by several orders of magnitude less people and a few hunded short clips of BBC programmes. Highfield's most recent speeches from May this year are still talking about these projects, with him showing mock-ups of potential prototypes for the iMP replacement the 'iPlayer' that could be the result of a collaboration with Microsoft. Are you impressed by this progress? I'm not.

And then there's BBC Backstage - a noble attempt to get BBC APIs and feeds out in public. What state is that in a couple of years down the line? Look at it pretty closely - despite all the talk at conferences around the world - and it still amounts to little more than a clumsy mailing list and a few RSS feeds - themselves mainly coming from BBC News and BBC Sport. There's nothing here that's even vaguely persuasive compared to Yahoo!, Amazon or Google. Flickr - a company that I don't think got into double figures of staff before acquisition - has more public APIs than the BBC, who have roughly five thousand times as many staff! This is what - two years after its inception? Even the BBC Programme Catalogue that came out of this part of the organisation a while back has gone into a review phase (do a search to see the message) without any committment or indication when it's going to be fully opened up.

I'm sure - in fact I know - that there are regulatory frameworks that get in the way of the BBC getting this stuff out in public, but these long lacunae go apparently unnoticed and unremarked - there's an initial announcement that makes the press and then no follow-up. If Ashley Highfield really is leading one of the most powerful and forward-thinking organisations in new media in the UK, then where are all these infrastructural products and strategy initiatives today? And if these products are caught up in process, then where are the products and platforms from the years previous that should be finally maturing? It's difficult to see anything of significance emerging from the part of the organisation directly under Highfield's control. It's all words!

And that's just the past. This is a man who decides to embrace social software and the wisdom of crowds in 2006 - clearly waiting for Rupert Murdoch to buy MySpace and show the self-appointed R&D lab of the UK new media industry the way. His joy for this space is expressed in lines like, "The 'Share' philosophy is at the heart of 2.0 ... your own thoughts, your own blogs and your own home videos. It allows you to create your own space and to build around you", which is ironic given that earlier last year he stated in Ariel that he didn't read any weblogs because he wasn't interested in the opinions of self-opinionated blowhards.

Well, there's a lot more where that came from, and none of it's pretty. Maybe the BBC is more like the BNF than I thought. Still, if Highfield can really make 1.2 million hours of video easily accessible for free, that'll make up for a lot. We'll see. ]

Posted by Mark Liberman at 07:45 AM

Don't accept nothing less, y'all

This morning, Kerim Friedman sent in a link to a California-based hiphop group called Linguistics.

Their self-description:

A new breed of MC's, Linguistics is here to entertain you with true real hip-hop, amazing rhymes and beats, along with intense cuts and scratches by top-notch DJ's like Dj Step 1, Dj Solo, and Dj 3rdi. Linguistics consists of 3 MC's - Kasper, IQ, and Entity. All solo acts, Kasper and Entity joined together with a very talented group of MC's called Kastlevania in the 1990's and early 2000. They won the San Diego music award for best new hip-hop, but soon the group started breaking up and going their own ways. IQ was busy working on a mix CD with his company "Intelligence Records" that he founded with his cousin Jay War. Jay was a long time friend of Kasper and hit him up to see if he wanted to get involved with the Intelligence Records project, the mix CD featuring some of the best up-and-coming hip-hop artists around Southern California. Eventually Kasper and IQ we're creating many songs together and decided to put one together with Kasper's old friend Entity. At that studio session, they realized how strong they were as a group, and decided to make an album together. 3 different styles from 3 different places in Southern California. Entity from Texas originally bringin the southern style, Kasper growin up with east coast hip-hop and IQ growin up with west coast influence somehow combine to form the ultimate group with a unique sound that is sure to appeal to any true hip-hop fan. Add to that the amazing ground-breaking producers and you have something truly special to listen to and enjoy. It's all about the hip-hop.

Some of their stuff is here. I think their choice of name means that the field of linguistics is so out that it's in -- a point I've been making for a while, in my own quieter way. Here's how they set the name of their group (and our field):

        X               X               X                           X
X       X       X       X       X       X             X             X            X
X X X X X X X X X X X X X X X X X X X X X X  X    X   X   X X    X  X  X X  X    X     X
L I N   G       U I S   T   I   C   S  yes we're the best  don't accept nothing less y'all

[Audio clip for this refrain is here. And here's another track with an upbeat message, "No turnin back".]

Posted by Mark Liberman at 06:46 AM

August 14, 2006

Yet another sex-n-wordcount sighting

Last week, I tried without success to locate some empirical support for the widely-reported difference in daily word usage between men and women ("Sex-linked lexical budgets", 8/6/2006). I started with Louann Brizendine's claim that "A woman uses about 20,000 words per day while a man uses about 7,000". I soon found a stunning array of other claimed numerical comparisons -- 50,000 vs. 25,000; 30,000 vs. 15,000; 30,000 vs. 12,000; 25,000 vs. 12,000; 20,000 to 24,000 vs. 7,000 to 10,000; 8,000 to 9,000 vs. 2,000 to 4,000; 7,000 vs. 2,000; 6,000 to 8,000 vs. 2,000 to 4,000 -- but no evidence that any of the numbers were not simply pulled out of the air (or perhaps out of some less salubrious place).

Today, Peter Seibel wrote in with another sighting, this time all the way from New Zealand. According to a "national news story" headlined "When young girls go bad", an Australian "psychologist and media pundit" named Michael Carr-Gregg has written a book (The Princess Bitchface Syndrome: Surviving Adolescent Girls) with yet another pair of numbers:

Brain differences make girls trickier to manage than boys in another way: they're cleverer. "Girls speak 5000 words a day whereas boys, with a good wind behind them, will speak maybe 2500," says Carr-Gregg. "Girls are much more manipulative and they use language more cleverly, so you have to be so much more aware of the techniques they use."

Peter writes:

I don't know if he says anything about this in his book and/or if he has any evidence to back up his claim.

I'm not willing to order a copy from Australia in order to find out, but my guess is that Dr. Carr-Gregg's book either just asserts the claim with no evidence, or else frames it with an empty appeal to authority, like "research shows that ..." Can any readers in Australia or New Zealand leaf through a copy in a bookstore, and let us know?

[Let me add this. It might well be true that women use more words per day than men do, on average. It might even be true that this is a direct result of chromosomal or hormonal effects on the brain, whether during development or in the mature state, rather than being a result of different life experiences and different social contexts. Then again, it might be true that men use more words per day than women do, on average. The problem is that none of the people making assertions about this actually seem to have bothered to collect any evidence at all, one way or the other. There's a technical term that philosophers use to describe the practice of asserting things without caring much about whether they're actually true or not: they call this bullshit.

Though I freely admit that I don't have any quantitative evidence either, I'll make a common-sense prediction about what the numbers will be like, if and when someone collects them. Whatever the average between-sex difference turns out to be, there will also be an enormous amount of within-sex variation in talkiness, and there will also be big differences, for any given individual, in talkiness from one setting to another. And I'll bet that the within-individual and within-group variation will turn out to be large compared to whatever the between-sex average difference turns out to be.]

[Update -- David Nash writes from down under:

When I asked for the cited book in the lunchtime crowd just now, the bookshop person said it is quite popular at the moment and pointed to it immediately (and it was low down and not on display).

"... my guess is that Mr. Carr-Gregg's book either just asserts the claim with no evidence, or else frames it with an empty appeal to authority, like "research shows that ..."

I couldn't find the quote from the author's interview that you quote, and the book even has an index (of sorts). The closest I could see is pp.6-7 "their [girls'] general superior communication skills ... tend to make them more articulate than their male counterparts ..." (no numbers).

On p.8: "... [MRI] has shown that in girls the corpus callosum ... has approximately thirty percent more connections than is the case with boys" though not explicitly linked to language in this passage. The author indeed has the odd "research shows that ..." but does sometimes name the researcher or the institution.

Well I suppose we (well, you...) could ask the author (

It looks like I might owe Michael Carr-Gregg an apology -- the assertion in question was attributed to him as a direct quotation in a review of his book, but apparently it wasn't quoted from the book, but was either taken from the interview or made up by the journalist. So following David's advice, I'll write to the author directly and ask about it.]

(I'll pick up the MRI/corpus callosum issue another time.)]

Posted by Mark Liberman at 07:53 PM

Ask Language Log: confessions of an interpolated sub-head

In response to my post "Sharps, Sharks and Gentlemen" (8/13/2006), Eric Christopherson wrote:

[Y]ou quote something with the title "Confessions of an--Played the World Over---Old-Time Gambler". What is going on with that title? Specifically, a) is it normal in English to use _an_ rather than _a_ before a parenthetical expression beginning with a consonant, if the first non-parenthetical word starts with a vowel sound? And b) does this mean the gambler _has_ played the world over (I would read it to mean the gambler _is_ played the world over, which doesn't really make sense)?

Well, here's an image of the top of page 6 of The National Police Gazette, June 20, 1903 (courtesy of ProQuest's American Periodicals Series):

The interpolation of "Played the World Over" into the middle of "Confessions of an Old-Time Gambler" seems weird to me, too. However, it was standard practice in The National Police Gazette at that time. I picked another 1903 issue of the same publication at random -- August 1 -- and immediately found another example:

This one doesn't have the a/an problem, but the result is equally ill-formed from a modern point of view: "Jeff Supremely Confident -- Neglects His Training to Go Hunting -- of Whipping Corbett". As far as I can tell, this sort of thing was purely a matter of headline-writing style -- probably motivated by the way that the periodical was displayed for sale -- and never happened within the text of a story. I don't know when such interpolated sub-heads began, or when they ended (if they did end -- I haven't seen any recently, but maybe I just read the wrong periodicals...).

As for Eric's second question, my own interpretation would be that "played the world over" involves the preterite form of play, shortened from "He played the world over", and thus is neither short for "has played" nor for "is played".

[By the way, the headlined confidence was vindicated. On 8/14/1903, James J. Jeffries knocked out James J. Corbett in the 10th round, retaining his heavyweight championship. Was that the only championship fight between two men with the same first name and the same middle initial?]

[Update -- several people have written to me with variations on the theme "it's not an interpolated phrase, you dope, it's just a funny way of laying out a sub-heading". That's exactly what I thought I was saying about it, actually, but clearly I wasn't clear enough.]

Posted by Mark Liberman at 05:42 PM

More on "Israelis Killed, Lebanese Die"

There is an additional point to make about the fact that a newspaper headline says that Israelis were killed but that Lebanese died. The difference in choice of words may reflect the fact that the Israelis were intentionally murdered while the Lebanese were killed accidentally. Hezbollah does not restrict itself to targeting legitimate military objectives - it is just as happy to kill civilians as to kill soldiers or destroy military installations and equipment. This is in line with its policy, which is genocide. Its leader, Hassan Nasrallah, in a commencement speech said: "if they [Jews] all gather in Israel, it will save us the trouble of going after them worldwide." (The Daily Star, October 23, 2002). In contrast, Israel is targeting legitimate military objectives: rocket launchers, headquarters, and means of transportation. It is not targeting civilians, and indeed is taking measures to avoid civilian casualties, such as dropping leaflets in advance warning civilians to evacuate the area.

That deaths are due to collateral damage rather than intent is not much consolation to the dead and to those they leave behind, but the difference is significant morally and legally, and it is a difference that has a linguistic reflex. The use of the passive "be killed" presupposes the existence of an agent or instrument and thus raises in our minds the possibility that the event was the intended result of an agent's volitional action. The use of a basic intransitive like "died" does not presuppose any agent.

Consider the headlines "Rapist dies", "Rapist killed", "Rapist executed", and "Rapist murdered". "Rapist executed" and "Rapist murdered" indicate that the rapist was intentionally killed. They differ in that one treats the killing as lawful, the other as unlawful. "Rapist dies" and "Rapist killed" could both be used to describe either an intentional killing or an accident, but I think that there are subtle differences in which seems more appropriate in which circumstances. If the rapist is murdered by another inmate, for example, "Rapist killed" seems more appropriate, whereas if the rapist dies from inadvertently eating a bit of peanut butter, to which he is allergic, "Rapist dies (of fatal allergic reaction)" seems more appropriate. "Rapist dies in exercise yard brawl" suggests that his death was an accident in the sense that the fighting just got out of hand, while "Rapist killed during exercise yard brawl" suggests that the rapist was intentionally killed by someone using the brawl as cover.

Such nuances are often subtle and they aren't easy to study objectively in part because there isn't a clearcut difference in acceptability between the alternatives, but I think that they are there, and they, together with the facts of what is happening in the Middle East, very likely explain the choice of wording in the headlines.


Reader Alex McGee points out another couple of factors that may have affected the choice of wording of the headline. One is that editors will avoid repeating the same term to avoid monotony. The other is that headlines are subject to severe length restrictions, which may override accuracy. In this particular case I suspect that the linguistic factors I mentioned are at work, in part because the headline is fairly easily reformulated to use only a single verb if one wants to avoid monotony and keep it short, e.g. "M Israelis, N Lebanese killed", but it is certainly true that these are relevant factors and that they may often suffice to explain what seem to be curious headlines.

One reader seems to think that my use of a rapist in examples was intended to suggest that Hizbollah are rapists. That is not the case. I just needed something that would be natural in a sentence about execution. The fact that the word "rapist" appears in the same post as discussion of Hizbollah doesn't associate one with the other. If you wish, replace "rapist" with "killer" or "prisoner" or "John Doe". Hizbollah are murderous bigots but to my knowledge they are not particularly prone to rape. (Indeed, Hizbollah actually seem to hold relatively progressive views on the status of women, in comparison, e.g., to the Taliban.) Furthermore, why would anyone think that it is Hizbollah who are implicitly tarred as rapists and not the Israelis? Nothing in the passage suggests an association with one rather than the other. The association is in your mind, not my text. Food for thought, eh?

Other readers have political objections to my description of the situation in the Middle East. I don't want to go into the politics in detail since this is about language, and actually, it doesn't matter, as far as explaining the headline is concerned, whether I am right or wrong. What matters is that I am far from alone in this perception and that it is quite possible that the author of the headline shares it.

Some objections aren't actually to what I wrote but to what their authors perceive to be my overall stance on Israel and the Middle East. I'm not going to debate you on this because I didn't say anything about it and it isn't relevant. For the sake of argument, I could stipulate that Israel is scum, the Palestinians the most horribly wronged people in history, and fundamentalist Islam the best thing since sliced bread and it would make no difference to my linguistic point. The only relevant aspect of the political situation, and the only one I said anything about, is whether the two sides intend civilian deaths.

A few people complain that I shouldn't raise political issues in a language blog. Normally, I don't, but sometimes there is an unavoidable connection. When the question is what underlies a headline about current events and whether it reflects political attitudes, political issues are likely to be relevant, aren't they?

In any case, it seems to me that the above description of the situation is indisputable. Reasonable people can differ in their evaluation of how good a job Israel has done of minimizing civilian casualties and where the tradeoff should be between attaining military objectives and causing collateral damage, but that Israel is aiming at military targets and not purposely killing civilians is crystal clear. Does anyone honestly think that if a force with the resources and reputation of the IDF were targeting civilians the casualties would be in the hundreds, as they are, rather than the tens if not hundreds of thousands? If Israel is attempting to kill civilians, it's the most incompetent attempt in history.

The objections to the characterization of Hizbollah as genocidal are insubstantial. They don't dispute the fact that Hizbollah targets civilians both with their rockets and their suicide bombers, and they don't dispute the explicit statements of policy such as the one cited above. At best, you can argue that Hizbollah doesn't mean what it says and isn't truly genocidal. That Hizbollah targets civilians is as far as I can see beyond dispute. A small minority have real points to make about the interpretation of Hizbollah's genocidal statements, but mostly the objections come from people who just aren't comfortable with the fact that their political beliefs conflict with reality. My advice is: when the facts conflict with your beliefs, change your beliefs. That is equally good advice in politics and in linguistics.

Posted by Bill Poser at 11:35 AM

Billions for X-ray machines and we're not any safer

Rick Lyman ("My Liquid-Free Flight Abroad", NYT Week in Review 8/13/06, p. 4) quotes for us an example of what looks like a WTF coordination:

Even the cabdriver got into the game, barking back at the radio.  "What about the X-ray machines that don't work?  What about the billions we spent and we're not any safer?"

Some of our earlier discussions of WTF coordination specifically involved questions, but the interrogative character of this example is irrelevant.  As a whole, the sentence is an instance of a verbless question type What about NP?, as in the movie title "What About Bob?" and in these quotations, from the Google help site and the sci.lang faq site, respectively:

What about privacy? ...
What about spam?

What about those Eskimo words for snow? (and other myths about language)
What about artificial languages, such as Esperanto?

All the grammatical action in the cabdriver's outraged question is in the NP, which has a head "the billions" followed by the relative clause (lacking a relativizer):

we spent and we're not any safer

This is a coordination, of a clause with an object gap in it -- "we spent ___" -- and a clause with no gap in it -- "we're not any safer".  In the terminology of classic transformational grammar, the object "the billions" has been extracted from one conjunct (the first in this case), just the sort of thing that Ross's Coordinate Structure Constraint was supposed to forbid.

The cabdriver's outcry could be recast as an exclamatory construction (also involving extraction), but still with a relative clause in which the CSC is violated:

The/Those/What/So many   billions we spent ___ and we're not any safer!

Or as a WH question (again, involving extraction) in which the CSC is violated:

How many billions did we spend ___ and we're not any safer?

None of these seem nearly as bad to me as some of the classic CSC violations (e.g., "What book did John buy and read the magazine?"), which I talked about here two years ago in connection with a relative clause somewhat similar to the cabdriver's:

Hyatt Rickeys, which will be demolished and the property turned into a residential development

This one has a subject (rather than object) gap in the first conjunct, plus some additional grammatical action in the elliptical second conjunct.  It's not so bad.  Back in 2004 I referred to Andy Kehler's book Coherence, Reference, and the Theory of Grammar (2002), which suggests viewing CSC violations in discourse-structural, rather than purely syntactic, terms.

This view is particularly attractive for a class of CSC violations with an object gap in the SECOND conjunct, like

some milk I ran down to the corner store and bought ___ for breakfast tomorrow

where the two conjuncts together describe a single coherent event (with its parts ordered in time); the parts are expressed via the syntax of coordination, but the event itself has a subsidiary subevent (the running down to the corner store) followed by a main subevent (the buying of the milk).

For the cabdriver sentence, we have two subsituations -- the event of spending billions and the state of not being safer than we were before -- which are closely tied, both temporally and logically.  The logical connection can be seen from the fact that "but" (which is more explicitly contrastive than "and") is possible in the coordination, as well as an explicitly contrastive "still" in the second conjunct:

What about the billions we spent but we're (still) not any safer?

and from the fact that the contrastive subordinator "though" (which would of course not give rise to a CSC violation) is also possible:

What about the billions we spent, though we're not any safer?

So I've made some kind of coherence account plausible here, but there's a lot of work still to be done, since varying bits of the cabdriver example -- the exclamatory character of the example, the negative second conjunct, for instance -- produces examples that don't strike me as quite as good as the original.

I'm thinking about the billions we spent and we're not any safer.
What about the billions we spent, and we're now a lot safer?

Maybe I'm being hypersensitive.  Unfortunately, real-life examples like the cabdriver sentence aren't easy to come by; even coordinations of VPs with a gap in the first conjunct are not all that easy to find (below are a couple supplied to me by Chris Potts, with VPs set off in red and with gap sites marked)

Then he took the family phone apart. Finally, he figured it out to his satisfaction. "This was a fantastic high, something I could get absorbed in ___ and forget that I had these other social problems."
(Tracy Kidder. 1981. The Soul of a New Machine. Back Bay Paperback edition, 2000, p. 93)

It's one of those rare books that you read ___ and think, I know that woman. She's me.
(Ad for the book Girls' Poker Night, The New Yorker, June 17 and 24, 2002, p. 62)

and coordinations of CLAUSES with a gap in the first conjunct are even rarer.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 11:11 AM

John Wells' phonetic blog

I've got no time to post this morning -- my breakfast hour was spent doing laundry and catching up with other post-vacation chores. So go and read John Wells' blog. This morning's post is about the intonation of event sentences.

Posted by Mark Liberman at 08:41 AM

August 13, 2006

Sharps, sharks and gentlemen

A couple of days ago, during the recent Language Log eggcorn fest that was sparked by Mark Peters' article in the Chronicle, Alan Hogue wrote in with a suggestion that hasn't made it into the eggcorn database yet: card shark for card sharp. This is an especially interesting case. For one thing, this eggcorn (if it is one) is winning: according to Google, card shark (with 318,000 hits) has outpaced card sharp (with 167,000) by almost two to one. And the success of card shark is understandable: "shark" has developed a general slang sense "A person unusually skilled in a particular activity"; and the relevant sense of sharp, perhaps the same one involved in "sharp practice", is rare if not obsolete. So maybe card shark isn't an eggcorn after all, or at least maybe it sort of isn't one completely. And there are some interesting bits of linguistic and literary flotsam along the way to figuring this out.

The Wikipedia entry for card sharp says that:

The etymology of the term "card sharp" is debated. A popular theory is that it comes from the German word Scharper, which in one sense means swindler. Another theory, which is likely fake etymology, is that card sharp is a degenerate form of card shark, which itself is an analogy to the term pool shark. In actuality, the reverse is probably true: card sharp is the original term, and card shark is a back formation.

(Let's pass over in silence the mistaken use of the term "back formation" here -- whatever card shark is, it's not "a neologism [created] by reinterpreting an earlier word as a derivation and removing apparent affixes".) Card sharp certainly seems to be about a half a century older than card shark. The OED's first two citations relevant to card sharp are:

1870 Daily News 20 Apr., Two men..were charged with.. card-sharping in a railway carriage.
1884 HARTE On Frontier 273 To make a card sharp out of him.

And the OED doesn't give card shark at all.

Searching the ProQuest American Periodicals Series allows us to back card sharp up to 1858. "Epsom Course, Derby Day, England", in Ballou's Pictorial Drawing-Room Companion, Boston, Oct. 16, 1858.

The large engraving which occupies the whole of the last page, will serve to give the American public some idea of the motley crowd assembled on the Epsom race-course on the Derby Day. [...] The Derby Day may even be compared to the saturnalia of ancient Rome; for at Epsom, for one day in the year at least, the rich and the poor, the nobs and the snobs, the patricians and the plebians, are on an equality. Mark the scene on the "hill." All Bohemia seems to have emptied its floating population upon this portion of Epsom Downs. Mountebanks with monkeys, and dancers on stilts; Punch-and-Judy men, with panpipes complete; card-sharps, Ethiopian serenaders, troubadours, dark gipsey fortune-tellers; grooms, porters, postillions, cab-drivers, stable-boys, racing-touts, beggars, costermongers, newspaper reporters, policemen and pickpockets, are all mixed up with the lords and the ladies, the guardsmen and the dandies, the great betting men, and the young ladies with long ringlets; and, as accessories to the motley tableau, we have a heterogeneous salmagundi of lobster-salad, champagne, pale ale, betting-books, race-cards, opera glasses, cold lamb, crinoline, pigeon-pies, smelling-bottles, whistles, penny-trumpets, jacks-in-the-box, white kid gloves, white top-coats, brown stout and beer.

The next hit from APS, for card-sharper, involves a well-known author, and picks up on the class consciousness implicit in the 1858 quote from Ballou's. It's from a serial novel by Anthony Trollope, "Sir Harry Hotspur of Humblethwaite" (Chapter XXII), Lippincott's Magazine of Literature, Science and Education, Dec. 1870:

On the second of last month Mr. George Hotspur met two men, named Walker and Bullbean, in the lodgings of the former at about nine in the evening, and remained there during the greater part of the night playing cards. Bullbean is a man well known to the police as a card-sharper. He once moved in the world as a gentleman. His trade is now to tout and find prey for gamblers.

And the uneasy association between card-sharpers and gentlemen is maintained in G. Colmache, "Gentilhomme and Gentleman", Lippincott's Magazine of Literature, Science and Education, Jan. 1876.

It would be impossible to explain the difference which exists between the "gentilhomme" and the "gentleman". It is felt and understood, but cannot be described. The term "gentleman" itself is conventional. Neither birth nor accomplishments, nor even gentle manners, are necessary for undisputed assumption of the title. The man who acts as a lawyer's clerk cannot be called a gentleman, according to Judge Keating's decision, because, the title having no place in the language of the law, if he chanced to be indicted for a criminal offense he would be denominated a "laborer." Serjeant Talfourd's sweeping theory, of the "gentleman" being legally applicable to every man who has nothing to do and is out of the workhouse, cannot be accepted, as it would of necessity include thieves, mendicants and out-door paupers. The American police have been compelled to defend the border-line of gentility against the encroachments of their vagabond gold-seekers, card-sharpers and ruffians, and confine the term to those of respectable calling. In California the term may be applied to every individual of the male gender and the Caucasian race, the line being drawn at Chinamen. An American writer contests the acceptance of the term in England as being too vague and uncertain for comprehension by foreigners, and suggests that some less conventional designation than those now in use should be found to indicate the idea. To the moral sense it would be natural to suppose that character rather than calling would be the most important point in the consideration of the question; but it is not so. In the four-oared race of gentlemen amateurs held last year at Agecrost in Lancashire the prize of silver plate was won by a crew taken from a club composed entirely of colliers, who had been allowed to row under protest, they not being acknowledged as "gentlemen amateurs". The race over and the prize won by the colliers, an investigation took place by the committee.
The result was unanimity of the vote against acceptance of the qualification of the winners. Here, then, occurred the best illustration of the comprehension of the term by the moderns, for the "gentlemen," deeming that money must be a salvo to pride in the bosom of all whose quality of gentleman remains unacknowledged, subscribed a handsome sum to be distributed amongst the disappointed crew. But here, again, the proof was given of the vague uncertainty of the term, for the crew of colliers were gentlemen enough to refuse the proffered gift with scorn.

Another famous author also used card sharp (well, "short-card sharp") in 1876 -- Bret Harte, "Gabriel Conroy", Scribner's Monthly, May 1876:

"Your suggestion, Peter," returned Jack, with dignity, "emanates from a moral sentiment debased by love-feasts and camp meetings, and an intellect weakened by rum and gum and the contact of lager beer jerkers. It is worthy of a short-card sharp and a keno flopper, which I have, I regret to say, long suspected you to be."

As for card shark, the earliest citation I've found (in a few minutes of searching, over my breakfast coffee from a vacation place in Florida) is almost 50 years later than the 1858 Ballou's Pictorial Drawing-Room Companion example of card sharp -- "Confessions of an--Played the World Over---Old-Time Gambler", The National Police Gazette, June 20, 1903:

After forty years of cheating at cards, during which time he has played in most of the cities of the world, made and lost a dozen fortunes, and out of the wreck saved enough to support in fine style a wife, three daughters and two sons at college, an old man of sixty years has retired; yet in his eager, alert mind there still dwells every secret known to the card shark.

It could be true that card shark was created by analogy to pool shark, but they seem to have appeared at about the same time. The earliest citation that I could find for pool shark was a story about "Grant Eby", from The National Police Gazette, Feb. 2, 1895, just 8 years before the card shark citation from the same periodical:

Grant Eby is the clever young pool player, better known as the "Springfield Kid," who recently defeated Champion De Oro in an offhand match at continuous pool, by the remarkable score of 200 to 99. He desires to play De Oro again, for the championship of the world. If he fails to negotiate this match, he will go to England and play John Roberts, the English Champion, at the latter's own game, English billiards, for any amount of money. Eby is a steady player, and a terror to the pool sharks who infest the country.

Curiously, if we combine the 1903 Police Gazette citation with the 1870 Trollope example, we get almost exactly the opposite of the claim made in this Wikipedia entry for card shark:

A card shark is an expert card game player who feasts on weaker "fish" players. A card shark is different from a card sharp, who uses deception for purposes of either card tricks or to cheat at a game like poker.

The "trade" of Trollope's 1870 card-sharper was "to tout and find prey for gamblers", while the 1903 Police Gazette article makes it clear that the "secrets" known to the card shark consist of methods for "cheating at cards". Still, there are two different occupations here, and it makes sense to have two different terms. And a pool shark is not someone who wins at pool by cheating, but rather someone who wins money from more naive players by being better at the game than they realize he is.

After the 1903 Police Gazette citation, the next card shark example that I found was from a work by another well-known author -- Nicholas Vachel Lindsay's "Adventures of a Literary Tramp", Outlook, Jan 9, 1909. Again, the crux of the matter is the association of the card sharp/shark and the gentleman. The context is fascinating, and worth quoting at length:

There was the smash, clang, and thud of making up a train. A negro guided me to the lantern of the freight conductor swinging in the midst of the noise. The conductor had the lean frame, the tight jaw, the fox nose, the Chinese skin of a card shark. He would have made a name for himself on the Spanish Main, some centuries since, by the cool way he would have snatched jewels from ladies' ears, and smiled when they bled. He did not smile now. He gripped his lantern like a cutlass, and the cars groaned. They were gentlemen in armor, compelled to walk the plank by this pirate with the apple-green eyes. We will call him Mr. Shark.

I put my pious letter into my pocket. "Mr. Shark, I would like to ride to Macon in the caboose."

Mr. Shark thrust his lantern under my hat-brim. I had no collar, but was not ashamed of that. He said, "I have met men like you before." He turned down the track, shouting orders. I jumped in front of him. I said: "You are mistaken. You have not met a man like me before. I am the goods. I am the wise boy from New York. I have been walking in every swamp in Florida, eating dead pig for breakfast, water-moccasins for lunch, alligators for dinner. I would like to tell you my adventures."

Mr. Shark ignored me, and went on persecuting the train.

Valdosta was a depot in the midst of darkness. I hated the darkness. I went into the depot. Vermont was offering Flagman the bottle. He drank.

Flagman asked me, "Can't you make it?"

"No. Grady turned me down. And the conductor turned me down."

Mr. Flagman said, "The sure way to ride in a caboose like a gentleman is to ask the conductor like he is a gentleman, and everybody else is a gentleman, and when he turns you down, ask him again like a gentleman." And much more, with that refrain. It was wisdom lightly given, profounder than it seemed. Let us remember the tired Flagman and engrave the substance of his saying on our souls.

I sought the Pirate again. I took off my hat. I bowed like Don Cæsar de Bazan, but gravely. "Mr. Shark, I ask you, just as one gentleman to another, to take me to Macon. I have friends in Macon."

Mr. Shark showed a pale streak of smile. "Come around at one o'clock."

By another coincidence, this took place near the start of Lindsay's walk from Jacksonville, FL, to Kentucky, in the summer of 1906. Specifically, he was in Fargo, GA, around 40 or 50 miles west of Amelia Island, FL, where I am now, a hundred years later.

Anyhow, it's plausible that card shark arose partly as an eggcorn derived from card sharp, and partly as a new formation, whether by analogy to pool shark or as a fresh shark metaphor.

[Update (from Jacksonville Airport) -- Ben Zimmer writes:

Here's an 1884 citation for "card shark" from Newspaperarchive:

1884 Perry (Iowa) Pilot 2 Apr. 8/3 Perhaps it is that the most picturesque and attractive men to be found in New York streets, are bunko men, card sharks, adventurers and dissipated club men, who live without visible means of support.

Gentlemen all, no doubt. Ben also points out that "'Card sharp/shark' was discussed in an eggcornological context on alt.usage.english last year".]

[Update #2 -- Jay Cummings wrote:

You did notice that there are about 16300 Ghits for "pool sharp" (subtracting out a number of "...customer pool. Sharp...") didn't you? Including, interestingly, one from Random House's "The Mavens' Word of the Day" in an article about a completely different phrase (spin doctor), and one from whose writers would hate to be called pool sharks (because pool sharks are no gentlemen) but celebrate being "sharp" pool players.

Pool shark has about 300000 Ghits.]


[Update #3 -- Ernst Mayer writes:

From The Newgate Calendar, circa 1821 ( gives the date range for the appendices section as 1800-1842, but the appendix in question cites the year 1821 in several places, so it certainly could not have been written before then):

There is a new firm of Greeks established at Cheltenham, who think themselves very snug. The proprietors of this firm are, a person of the name of K--, master of the rooms, a son of K--, who kept a Hazard table, in Jermyn-street, and Pall Mall: also a Mr B--, who was a Billiard sharp in London for years. This B--, was considered the best packer of cards at Rouge et Noir of any of them, and cogger of a dice on dice, so you may judge how the people are fleeced here.

An excerpt from Gilbert and Sullivan's The Mikado (1885):

The billiard-sharp who any one catches,
His doom's extremely hard--
He's made to dwell--
In a dungeon cell
On a spot that's always barred.
And there he plays extravagant matches
In fitless finger-stalls
On a cloth untrue
With a twisted cue
And elliptical billiard balls!

From P.G. Wodehouse, A Prefect's Uncle - Chapter 5 (1903):

Considering his age he was a remarkable player. Later on in life it appeared likely that he would have the choice of three professions open to him, namely, professional billiard player, billiard marker, and billiard sharp.

Returning briefly to the card table: Andrew Steinmetz's classic 2-volume The Gaming Table (1870) also makes specific use of the term "card-sharper."]


Posted by Mark Liberman at 12:18 PM

August 12, 2006

Mission Postposition

Speaking of postposed adjectives, can we spare a moment for "Target America: Terror in the Sky," the running header that MSNBC chose for its coverage of the recently uncovered plot to concoct bombs out of liquids smuggled onto aircraft, the upshot of which left me with sweat beading on my forehead as I tried to sneak a tube of toothpaste through the X-Ray machine while boarding a plane on Thursday ("The only way you'll take my Crest is if you pry it from my cold, dead fingers").

That's the same header that Frontline used for its special-edition broadcast on PBS shortly after the 9/11 attacks, and that Newsweek used (albeit with an all-important colon between target and America) for its August 16, 2004 cover story on the discovery of Al Qaeda's "Pre-Election Plot" to strike America, which for a while elevated the terror alert status to "Code Orange" -- and while we're at it, there goes another one. What's with these, anyway?

The pattern originated with German operation names in the waning years of World War I, with the N + Adj. word order almost certainly borrowed from French, which since the days of Louis XIV has provided the lexical and syntactic models of military nomenclature for other European languages (think of court martial, surgeon general and the like). That formula was adopted for operation names by the English and then American military during World War II. But as Lt. Col. Gregory Sieminski noted in a 1995 article, those names didn't become familiar to the general public until some of them were declassified after the war. That's what sparked the flood of movie titles beginning with Operation (e.g., ~ Manhunt, ~ Murder, ~ Crossbow, ~ Petticoat, ~ Dumbo Drop, and of course the 2001 SpongeBob SquarePants vehicle Operation Krabby Patty), not to mention about a gazillion other movies and thrillers beginning with Destination, Assignment, Target, Objective, Mission and other words of that agitated ilk.

Over the course of time, not surprisingly, the semantics of the construction has gotten a little blurry -- even by 1966, the impossible of Mission Impossible was something other than the name of a mission or a destination. By now, in fact, the headwords in these titles are usually little more than genre classifiers that evoke the portentous echoes of past thrillers. Adjective postpositions like these are the syntax you use to promise a whale of a tale, with nail-biting tension and seat-of-your-pants action sequences, even if the spectacle is actually just a fashion reality show called Project Runway. And it can't be an accident that postmodification has become a regular feature of the banners that the cable news networks assign to every major running story, whether they're headed by mission or target or by some other word. America Rising, America on Alert, Decision 2000, Boy in the Middle (remember Elian?) -- all suggesting that what they're giving us is an overarching narrative, and not just one damn thing after another. Not that anybody's going out of his or her way to scare us, just so we keep tuned.

Added August 13: I discovered an earlier antecedent for the MSNBC header in the 1989 TV doc Terrorism: Target USA, featuring the columnist Jack Anderson.

Posted by Geoff Nunberg at 09:10 PM

Israelis killed, Lebanese die

Dr. James Eitel wrote to the Oakland Tribune to point out that an Aug. 7 front page story headline in the paper had said, in large print, Rocket attack kills 12 Israelis, and then in smaller print, At least 16 in Lebanon die from airstrikes. Dr Eitel remarked: "This has an immediate bias that Israeli lives are more valuable than Lebanese lives." Does it? It's certainly not immediately clear.

To say that an event kills people is to say (roughly) that the event was the immediate cause of their death in a direct way. To say that people died from an event is to say (roughly) the same.

I think it might be more common to use "die from" with causes of death like diseases or effects of injuries that might be spread over time (The man had died from exposure/burns); and "kill" is more common with sudden events (The fall/gunshot killed him instantly), but I don't have statistical evidence for that, and one could hardly be surprised to read The man had died from a fall, or The cancer had slowly killed him. And anyway, the difference would hardly support a claim of implying greater value for the lives lost from events described one way rather than the other. Dr Eitel is right to be looking for evidence of bias one way or the other, but the linguistic analysis involved is subtle, and must be done with real care. (Recall this case where some claims about bias in the opposite direction — anti-Israeli — involved claims about passive clauses but the analysts could only identify passives correctly one time in three.)

Dr Eitel goes on to say that "in the text of the article, there were four paragraphs describing the location, circumstances, and human consequences of the rocket attack on Israel. There were exactly zero words describing the location, circumstances and human consequences of the Lebanese killed by Israeli weapons. Are Israeli lives more valuable than Lebanese?" They are not, of course. But I bet the exact locations and human consequences of an air strike north of the Lebanese border in a bombing zone are harder for a journalist in Israel to find out about than the exact locations and human consequences of a rocket hitting a kibbutz in the relative safety of Israel. One has to think about that too: reporters are located in specific places, and have just hours to find out interesting facts and file a story, and often don't want to just ride a motorcycle into a war zone and die for their paper. We can ask for some editorial judgment back home where the paper is composed, but we can't insist that nothing be printed until the reporters have defied death to get exactly the same amount of detail regarding the deaths of the Lebanese victims as they have regarding the deaths of the Israeli victims.

[Update: A correspondent points out to me that Dr Eitel might have been talking solely about the size of the letters in the two headlines as suggesting differential importance. If that were so, the above might be just an irrelevance, stimulated by a misreading. I note, though, that "Rocket attack kills 12 Israelis" has only 31 characters (including spaces). "At least 16 in Lebanon die from airstrikes" has 42. I don't know whether that length difference alone would have required the smaller type, and I also don't know whether one could argue that size of type in headlines symbolizes a newspaper's feeling of differential importance of the facts stated. Maybe. Again, this should be an empirical matter: it ought to be checkable by comparing (I suppose) editors' opinions with lengths of headlines and subheaders suggested. I wouldn't bet on what investigations of this topic might turn up.]

Posted by Geoffrey K. Pullum at 10:51 AM

Truth in captions?

In this morning's online New York Times, the text under the picture at the top of the page puzzled me:

The story that the picture and caption link to ("Israel's Wounded Describe Surprisingly Fierce, Well-Organized and Elusive Enemy", by Greg Myre, August 12, 2006) is a straightforward set of human-interest interviews with wounded soldiers in a Haifa hospital. So why does the caption tell us that "Wounded soldiers seem to have stories of fierce ground battles with Hezbollah"? Why not just put "Wounded soldiers have stories etc." under the picture on the front page?

Journalists and their editors seem to have several different motivations for adding words like seem. Sometimes it seems to be a signal that the unqualified statement may not be true, or at least seems to have no source other than the journalist's own observation.

(link) Whether in Toronto or London, police and spies seem to be getting more adept at handling "homegrown" terrorism cells.

Sometimes it seems to flag outright disbelief, whether on the part of the news outlet or of a third party:

(link) Some people protest that the “bang for the buck” they get from their pay packets seems to fall faster than reported inflation rates suggest; that money does not go as far as the official numbers say it should.

But Greg Myre's interviews don't seem to deserve either kind of qualification. Seasoned observers know that when journalists seem to paraphrase or even quote from interviews, even with named subjects, the results can be randomly inaccurate, and often present the story that the journalist wanted to tell rather than what the interview subjects actually said. Sometimes this even happens to quotes from powerful individuals in the context of important stories about interviews whose transcripts are independently available. But I'm not used to seeing this problem recognized, even implicitly, on the front page of the New York Times.

[Seriously, I guess that this seem was meant to indicate that Myre's story presents a sample of interviews, not a scientific survey. And I have no reason to believe that Myre's selection was misleading -- though the point of his story is a common thread in recent reporting from the Lebanese war, and I think we can assume that he had the shape of his story in mind before he began interviewing the wounded soldiers in Haifa, and selected from his interview notes in view of the points he wanted to make.]

[Update -- John Cowan wrote to suggest that the caption editor was adapting this sentence from the body of the story: "There are dozens of wounded soldiers here in northern Israel's main hospital, and all seem to have stories of unexpectedly fierce ground battles with Hezbollah." John's comment:

He simply botched it: it should have been either "All wounded soldiers seem to have stories" or (better) "Wounded soldiers have stories".

John also reminded us of what Hamlet told his mother: "Seems, madam? Nay, it is: I know not 'seems'", which John described as "a very famous linguification, the more interesting for being flatly self-contradictory". ]

Posted by Mark Liberman at 09:03 AM

August 11, 2006

Neocon "Islamic fascist" designation spreads to government

I have an addition to the assorted linguistic thoughts that occurred to me during my recent experience with air travel immediately following discovery of a huge terrorist plot against civil aviation.

While on the plane I read in the Oakland Tribune about President Bush's response to the announcement of the foiled terrorist plot. The linguistic point that interested me was his use of the term "Islamic fascists".

This is familiar neoconservative terminology. The phrase "Islamic fascism" gets over 200,000 Google hits now; the one-word version "Islamofascism" gets nearly 900,000. The terminology is about sixteen years old: as far as I know, neither version of the term was used before the publication of an article by Malise Ruthven in The Independent on September 8, 1990. The locution began to spread mostly after Christopher Hitchens started talking about Islamic fascism, during his journey from being primarily a Trotskyist to being primarily an enemy of Muslim theocracy.

Anthony Clark Arend at Georgetown University notes that the phrase has crept into the Bush administration's vocabulary very recently, starting last May 25, and he invites comment on the significance of this lexical change: "I will be interested in seeing how other commentators analyze this new language from the Administration." I will not indulge in political analysis, but I have just one remark about the lexical semantics as I understand it.

Arend may be right that the Bush administration is seeking a connection to the politics of the 1940s to promote its conception of the present anti-terrorism struggle as a war just like the 1939-1945 world war against the Axis powers. But it does not strike me as by any means inappropriate for the neoconservatives to use the term fascism in this context.

The word stereotypically connotes a combination of complete control of all institutions by a highly militarized authoritarian state headed by a charismatic leader. It is used for political systems that are radical, totalitarian, corporatist, and chauvinist. It is quintessentially opposed to liberalism — not liberalism in the (now much more common) sense that Geoff Nunberg's latest book talks about, where it is a kind of Republican term of abuse, but the older and more technical sense: individual rights, free-market economics, and a minimum of control by authorities of how people should live, worship, trade, interact, or express themselves.

"Fascism" is not a bad term to pick for the kind of nightmare that would probably result if a global Islamic caliphate were to be established by the sort of Waziristan cave denizens who issue taped messages encouraging disaffected young Pakistanis in Britain to go out and blow themselves and a few hundred passengers to pieces on a train or a plane to glorify Allah. (Yes, I despise this corrupt cult of mass slaughter and theocratic bigotry. Did you think I would be all latte-sipping gooey-relativist about it?) The jihadists' opposition to individual rights, free markets, choice in lifestyle, tolerance in religion, and expression of dissent is plangent.

So it may indeed be true that right now the Bush administration has a desire to forge a rhetorical connection to the struggle of the Allies against Mussolini and Hitler; but independently of any such desire, the term "Islamic fascism" seems to me a perfectly reasonable one to use when characterizing the movement in question.

Posted by Geoffrey K. Pullum at 08:16 PM

Majorities and minorities

Time for this week's Language Log poll!

The poll shows that a majority of people ___ against the war.
Which of these best fills in the blank for you?

The poll shows that a minority of people ___ against the war.
Which of these best fills in the blank for you?

Watch this space for discussion of the results sometime soon.

Posted by Eric Bakovic at 03:33 PM

Language without meaning at the airport

No jokes from me today. Not much was really funny about early morning air travel on the first full working day after the announcement of a Code Orange (and Code Red for the North Atlantic). At 4 a.m. today I was at the Oakland International Airport in California, and five hours later I had reached Seattle, where I am now. I have a few linguistically-related observations for you from my air travel experience today. Several things about the pre-takeoff experiences moved me to wonder whether expressions of English still have their literal meaning in an air travel context.

  • We were all sternly warned to be at the airport two hours before takeoff for domestic flights. No exceptions. I had a 6 a.m. takeoff. I duly slept (briefly) at a hotel near the airport and took the 4 a.m. shuttle over to the terminal. So did nearly everybody else with a similar takeoff time. We packed into that departure terminal like beef on the hoof at the start of a long cattle-drive. The lines snaked all over the terminal, the Delta line intersecting the Alaska line but facing the opposite way. But no one had even turned on the lights behind the counters. The check-in machines weren't even booted up. Not a single airline employee appeared for nearly three-quarters of an hour. By the time the first desultory attempts to get some passengers checked in began, it was getting on for 5 a.m. Why tell us "two hours before takeoff time" if they meant "about an hour" as usual?

  • Just after 5 a.m. the first desultory attempts were made to communicate with the waiting herd — about a thousand of us by now. A voice over a PA said something about reminding travelers about something. I have no idea what. It was inaudible. Why pay for a PA and a person to speak over it and not check whether the amplitude is going to be adequate to allow enough of the acoustic stream to reach the auditory processing centers of the brain? We did without the warning or welcoming or advice, whatever it was. We couldn't hear it.

  • The staff worked under red LED signs that said "POSITION CLOSED". All of them did. Even at 5:20, when I finally got checked in by one of the five or six counter staff, all twelve desks were still labeled "POSITION CLOSED". Why have LED signs with language on them if no one sets them up to say anything true?

  • I walked to the security area, which unlike the check-in area was properly set up for serious passenger flow (though it was not getting it; everyone was tied up in the chaos of the check-in area). I read all the signs as I snaked my way fairly rapidly along the xi-shaped path between the ropes. One of them said that guns had to be unloaded and declared to the airline. I looked around at my fellow passengers and wondered whether I could be sure they had all slipped the ammunition clips out of their handguns. One man travelling with his young daughter was using some Chapstick lip balm. Some nearby passengers jeered at him and told him jokingly that he wasn't going to be allowed on the plane with that, because it was clearly a gel. Was it, I wondered? Do we have a clear enough definition of terms like "liquid" and "gel" in terms of viscosity to permit decisions to be made? What about solid deodorant sticks, for example? Well, Chapstick man smiled and said he didn't dream he was going to get near the plane with such a dangerous object; and having finished doing both lips, sure enough, he tossed the tiny plastic lipstick-like tube into a waste bin. Doubtless he had also removed the bullets from his firearms.

  • We were told to present a photo ID as well as a boarding pass to the security guys before we could get near the hand-baggage X-ray machines and the body scanners. I handed over my California driver's license and my Alaska Airlines boarding pass, and the security man, his eyes continually downcast, looked at each and compared them and handed them back without raising his head. He never looked up. He never saw my face. He did not seem to understand that he was supposed to be doing something that only an entity with his mammalian binocular vision and uniquely human facial recognition capacity was equipped for: he was supposed to make sure it was me handing him those documents and not some 29-year-old religious fanatic who had knocked me down behind a stairwell and grabbed them from me. Why say "photo ID" if all the guy is going to do is compare letter strings on documents? A computer program could do that, and do it better. (Incidentally, as I struggled with shoes and laptop and jacket and bag I unaccountably forgot to divest myself of my spare change, belt buckle, and watch, so I walked through the metal detector with three-quarters of a pound of metal on me, far more than a box-cutter's worth. The machine did nothing. I walked on. Am I satisfied that we are being adequately safeguarded from hijacking, as the perennially interrogative Donald Rumsfeld would say to journalists in his familiar exasperated tone? No.)

Despite everything the would-be Heathrow terrorists could do to disrupt world aviation, Alaska (which is probably my favorite domestic airline) did a fantastic job and got the crowded plane pushed back from the gate at exactly 6 a.m. as promised. We flew into the grey blanket over Seattle early, and I got my checked baggage (with precious liquids and gels therein) immediately. I hopped on a city bus that was waiting outside, and for $1.50 rode into the city and walked two blocks to the Sheraton, where I am covering the Federated Logic Conference for Language Log (we go everywhere to bring you the language news). It was 8:45 a.m. I found the registration desk, and everything was efficiently organized there. The first workshop I was registered for started at 8:55. By 8:52 I was in the elevator to the 29th floor. At 8:54 I got out, and at 8:55 a.m., to the exact second, I entered the conference room. I had made it on time! And I found that the first speaker, the eminent Patrick Blackburn, who does terrific work on applying modal logic to such domains as the description of natural language syntax and semantics, had not arrived. He lives in France. The airline problems of the previous forty hours had defeated him. For now, the terrorists had won.

Posted by Geoffrey K. Pullum at 02:26 PM

Do magpies understand structural ambiguities?

No, this is not another discussion of animal communication research. We're talking about critters that are closer to home, a quintessentially modern subspecies of Homo sapiens: lawyers.

In my post "Lawyers in need of linguistic training" (8/7/2006), I suggested that lawyers ought to be better than they apparently are at detecting and correcting pernicious structural ambiguities in the language of contracts -- and that a few linguistics courses might help. There's a serious logical flaw in this argument, and I expected that someone would call me on it. Ambiguous contracts create disputes that require more lawyering, so (according to the standard argument-form of evolutionary psychology) we should expect lawyers to evolve a culture that maximizes ambiguity in their work product rather than one that minimizes it**. However, both the blogospheric discussion and my email correspondents let this one slip through.

But email from Laura Petelle raised another logical problem for my campaign to persuade lawyers to learn some basic skills in linguistic analysis. Laura took me to be arguing that lawyers need help in learning to write clearly, and observed in response that lawyers mostly don't write, they plagiarize, or rather borrow, or rather adapt, existing text:

... the truth is that lawyers are magpies. Drafting original language that says what you want it to say in precise terms AND fits the statute or case law is a pain in the butt. In trying to draft new language for a legal document, you're dealing with "terms of art" (legal jargon with technical meanings), plain language, shifting meanings of words, etc., etc., etc. It's a linguistic nightmare. So what most lawyers do is try to smash together paragraphs they already have and only change some words. Some of these paragraphs you store in your brain -- I could write six will recitals off the top of my head -- and others you store on your hard drive. When you have a document to draft that resembles one you already drafted, you go dig that one up. Lawyers are also quite generous about giving language to one another. When I started in solo practice, many established lawyers of my acquaintance sent me just reams of documents they thought would be useful to me so that I could steal the language. It was a gift beyond price. When I run up against a new and unique situation, I often e-mail other lawyers and say, "Do you have a document with this?" to get their language.

The motivation for this practice is not laziness or incompetence, it's money:

I drafted a will early in my practice wherein the client wanted to leave his gun collection to certain people. None of my existing documents (mine or the gifts from others -- or the reference books) had language for this, so I had to draft it from scratch to comply with the state gun ownership and transfer laws, AND with the state will laws, AND to be clear, precise, and understandable to a layperson. It took me almost three hours. Now, however, when I get wills clients with guns, it takes me 10 minutes to pull up the language from the older document and change appropriate nouns.

Most lawyers do not charge -- or downcharge -- creation of new language. (Or at least, most where I am. I suppose in big-city firms with pressure for 80-hour billing weeks, young lawyers must do so.) My client was charged 30 minutes for my 3-hour research and hair-tearing experience coming up with the fresh language for the gun bequest.

So when someone gives me pre-existing language that complies with the statute, I take it, even if it's archaic and painful and far too technical for the average reader. When I have time, I try to build my library of paragraphs in plain language and rewrite older paragraphs in simple, clear language. But I don't always have time, both because hours in the day are limited, and because sometimes documents are rushed. I recently turned out a prenup that was rife with legalistic language that could easily have been simplified, but they came to me two weeks before the wedding so it was a very high-speed drafting experience. There simply wasn't time.

This is all very sensible, but it's orthogonal to the original issue. You'll recall that the case of the two-million-dollar comma involved the provision:

[The SSA] shall be effective from the date it is made and shall continue in force for a period of five (5) years from the date it is made, and thereafter for successive five (5) year terms, unless and until terminated by one year prior notice in writing by either party.

The argument was about whether the phrase "unless and until terminated by one year prior notice in writing by either party" should be construed as modifying just the phrase about renewal ("and thereafter for successive five (5) year terms"), or the whole previous sentence and thus the contract as a whole.

I imagine that in the lives of corporate lawyers, renewable contracts like this are common as dirt. So the lawyers who drafted this one must have had plenty of prior examples to choose from. But whether they borrowed boilerplate from earlier contracts, or painstakingly drafted new provisions from scratch, they should have been able to look at the result and say to themselves, "wait a minute, does the notification of termination apply just to the renewal, or to the contract as a whole?" And if they'd paid attention in their linguistics courses, they would have learned that in general, such structures are ambiguous. For a good example of how linguists present such issues to students in elementary courses, see Heidi Harley's discussion here. Heidi argues that the structure in question is ambiguous, using the example

Jane will ask Bill to the dance, and John will ask Sue, unless Phil asks Sue first.

which she (plausibly) argues is structurally amenable to the interpretation where the unless-clause only modifies "John will ask Sue", as well as to the interpretation where it modifies everything that precedes it.

Laura suggested that the English of today's legal documents is a sort of linguistic breccia, composed of mingled bits from many historical layers:

[S]ince I primarily deal in wills, I see a lot of other lawyers' work on prior wills. OH. MY. LORD. Some of them shouldn't be trusted with a pencil, let alone a typewriter. The language abuse is amazing. Some of the paragraphs I see read like they've been around since about 1400 -- and I'm positive some of them have; will laws have changed very little since then, and some of the recitations are exactly the same -- with just a few adjustments for language drift. And I think that drift often goes in the direction of "less comprehensible over time" than "more comprehensible over time" since only bits and pieces get changed, and you end up with this horrific hodge-podge of legal Latin, Norman French, Shakespearian English, colonial English, and modern English. It's no wonder regular people can't read lawyer-speak. I'm a lawyer and I can't read lawyer-speak without giving myself a headache.

According to the story told by "construction grammar" and other exemplar-based theories, all languages are sort of like this. But whether new sentences are derived logically from abstract first principles, or reconstructed analogically from a collection of examples, their interpretation will still be subject to structural ambiguity. And you'd think that learning how to think and talk about this would be important to people whose job consists largely of creating and interpreting sentences.

**This is (sort of) a joke. The standard E.P. argument applies to individuals, not to groups, and requires that individuals with some particular behavioral trait should be more successful in engendering offspring. Lawyers whose legal documents require lots of follow-on lawyering are certainly helping to increase the share of the GDP that accrues to their profession, but it's not clear that they thereby create more lawyers who also write ambiguously or vaguely.

Posted by Mark Liberman at 11:52 AM

Language Log outage

Language Log was unavailable for a few hours this morning. At about 7:40, a power outage affected much of the Penn campus and left the server needing a reboot; since I'm in Florida, I had to call a friend and ask him to go give it a poke. All seems to be well now.

Posted by Mark Liberman at 11:51 AM

August 10, 2006

Moral panic in asterisking

Oddities in the U.S. iTunes asterisking scheme continue to turn up.  Now we see a moral panic incorporated in it, through the bleeping of "molest", "molester", "molestation", "pedophile", and "pedophilia".  Yes, the WORDS THEMSELVES are offensive.

[Other words too dirty to view: "scrotum" and "rapist" (but "rape" is ok, as I pointed out earlier).  "Masturbation" holds the title for greatest number of asterisks (ten), just barely beating out "molestation".  Thanks to Nassira Nicola for the pointers.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 10:31 AM

The dying adjective laureate

Would you like to see an utterly clear case of an error in The Cambridge Grammar of the English Language? I have one for you. There are very few errors in the grammar's 1860 pages, but I think it's important to point out that grammars do contain errors. It reinforces the fundamental point that a grammar is a set of claims about a language, and those claims can be wrong, and linguists expect evidence to be relevant to determining whether a grammar is wrong or not. Underlining that point is much more important to me than protecting the sanctity of one page of one grammar that I co-authored. The error I have in mind is certainly tiny — a single two-word example phrase that I now see was the wrong one to illustrate the point at hand. The point itself is correct. But what's interesting about this case is the sheer stunning weight of evidence saying our example was no good. It's really extreme. The topic at hand is those anomalous adjectives (there are just a few) that are required to follow the head noun when they play a modifying role in a noun phrase. And the adjective at issue is laureate, as in poet laureate.

We cited a number of adjectives that show the odd behavior of always being after the noun rather than before it. Galore, for example, is always after the noun (there are still mysteries galore), and it's an adjective as far as we can see. Likewise aplenty, and the political adjectives designate and elect, and one use of proper. And we added laureate to the list. It means "awarded a great honor such as the one that the ancient Greeks used to symbolize with a crown made out of a wreath of laurel leaves." Our mistake? We gave (at the bottom of page 560, in [21]) Nobel laureate as our illustrative example.

That was wildly, thumpingly wrong. The evidence indicates that Nobel laureate and poet laureate have totally different structures. (Marilyn Martin has very perceptively pointed out to me why: because "Nobel" is the name of the award, while "poet" is the name of the position. So a laureate can be a Nobel laureate if that is the prize awarded; while a poet can become a poet laureate if thus nominated; but while the poet laureate is a poet, the Nobel laureate is not a Nobel.)

In poet laureate (the title of whichever poet is currently designated as a kind of honorary official poet of the country), the head noun seems to be poet. So the plural is poets laureate, and that's what most people write.

But in Nobel laureate, for some reason, things have shifted. Laureate is the head. It has become a noun. (As several people have pointed out to me, we get phrases like "the laureates this year".) Nobel is an attributive modifier of that noun, as it is in Nobel prize. Hence the plural of Nobel laureate is Nobel laureates, a phrase which gets over four million Google hits. And Nobels laureate, as a plural NP, gets none. There are a very few occurrences of the phrase itself out there (about half a dozen plus a few duplicates, this page being empty when I checked it), but every one of these seems to be an instance of the rare practice of calling someone a "Nobels laureate" rather than a Nobel laureate, as if the modifier were Nobels (it isn't).

On poet laureate, by the way, usage is split: poets laureate is the commonest plural, with 66,800 hits, but poet laureates gets a healthy 44,400. That means about 40% of speakers have reanalyzed laureate as a noun in that phrase too. The adjective use of laureate may well be dying out.
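For readers who want to check the "about 40%" figure, here is a minimal sketch of the arithmetic. It uses only the two hit counts quoted above, and it assumes (as the paragraph does) that raw hit counts can stand in for speakers; the variable names are illustrative.

```python
# Hit counts as quoted in the post.
hits = {"poets laureate": 66_800, "poet laureates": 44_400}

# Share of plural uses that treat "laureate" as the head noun.
noun_share = hits["poet laureates"] / sum(hits.values())

print(f"{noun_share:.1%}")  # 39.9%
```

Search-engine hit counts are rough estimates, so the result is best read as "roughly two in five" rather than as a precise proportion.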

But on Nobel laureate there is no debate; the word is not an adjective at all. For one structure, with laureate as the head noun, we have 4,000,000 hits to illustrate it. For the other, zero. That is what I call overwhelming weight of evidence from usage. Our example was a bad pick. On behalf of Rodney Huddleston and myself (we wrote the chapter together), I offer my apologies for it. But remember what I told you: For rational people at least, grammars can be wrong.

Posted by Geoffrey K. Pullum at 12:23 AM

August 09, 2006

Overnegation as obfuscation

We've observed many times (most recently here, here, and here -- see also the list of links here) that multiple-negation constructions often seem to overload the parsing circuits of our poor brains. A recent comic strip from Penny Arcade takes a look at how negational confusion can be used as a tactic to befuddle consumers.

(Hat tip to Matthew Ringel.)

Posted by Benjamin Zimmer at 04:24 PM

A Medical Eggcorn

Doctors have had eggcorn stories for a long time, though not by that name. Patients often don't understand the terminology they hear and come up with comic reinterpretations. My favorite is the story about the woman who named her daughter "Sue Phyllis" and explained that she overheard it at the hospital and thought it was a pretty name.

For those who didn't get it, what she overheard was syphilis, the venereal disease.

Posted by Bill Poser at 11:47 AM


Reading Mark Peters on eggcorns, it donged on me that we haven't featured a new eggcorn here at Language Log for a while. [Well, for four minutes, to be precise, because Ben Zimmer was already on the case as I composed this post... and meanwhile Language Loggers Arnold Zwicky and Ben Zimmer have been regularly contributing new entries to Chris Waigl's Eggcorn Database.]

A few examples from the internets:

Then it donged on me, Broadcom is providing the source code, the vendors just add their tweaks and change some logo's.
After looking at trackback.php delete function it donged on me that its used from inside article pages.
At first it was just a fun thing to listen to your recordings, but then it finally donged on me that Victor is actually talking.
Dude! It just donged on me, that the comment you left me on March 5th of this year @ 10:33 a.m...I....I could be wrong...however, I think it might be gay.
Hey girl! you look like youre going mod in that picture! you know what just donged on me! WE ARE SO GOING TO BE JUNIORS NEXT YEAR!
OOOOOOOOH it just donged on me, I remember now! How I wish I could forget that Nov. nightmare!
Then it donged on me…she didn’t want to be associated with me in public, around people she works with, because I have gained some weight.
it kept falling apart so one day it finally donged on me that I could actually go BUY a new trashcan...
I'm embarrased to say that I've seen the original trilogy countless times, and this never donged on me before.
It just donged on me that the TIMER uses electricity and no timer, no boiler!
It just donged on me its 2006 . Time flies . . .
I finally got out then realized what had happened, he had been attacked by a shark, later we found out it was a bull shark, thats when it donged on me, omg, i was on the water, not just floating on a board, he could of took me much more easily...
Today I was cleaning the basement for this party a mentioned above.. and it donged on me.... I really have a lot to be thankful for!

An "oops" page notes this as one of those cases where someone has "taken a common phrase or word and mutilated it". The discussion highlights the poetic force of such mistakes:

The funny thing about this was that it was a former neighbor of mine whose name was Dawn. Putting aside the fact that she wasn't the brightest beam shining through the window, I think she hit on something here. I mean, when something dawns on us, isn't it often like a "DONG!" on the head?!?

Exactly. "Dawned on me" is a stale, flavorless metaphor. Now, there's nothing wrong with using such expressions. Most English words and phrases are the fossilized residue of dead metaphors, from a certain point of view, and if you seriously tried to follow Orwell's advice ("Never use a metaphor, simile, or other figure of speech which you are used to seeing in print"), you'd be reduced to pointing and grunting. But "donged on me" is fresh and vivid. The only trouble is that if you write or say it, people will laugh at you. So there's your choice: you can be boring or you can seem ignorant.

[I first learned about this particular eggcorn from Cynthia McLemore, who explained that she and the late Professor Harold Kane discussed it in an English class at the University of Colorado, several decades ago, but didn't have a name to give to the phenomenon.]

Posted by Mark Liberman at 10:13 AM

A jewel of an eggcorn

In honor of Mark Peters' article on eggcorns in the Chronicle of Higher Education, William Salmon shared a "particularly juicy" (albeit gross) eggcorn on the American Dialect Society mailing list: pus jewel for pustule. It's now been enshrined in the Eggcorn Database, complete with examples from the Web.

Just think: if this had been noted on Language Log a few years ago, we might all be referring to such reshapings as "pus jewels" instead of "eggcorns"...

Posted by Benjamin Zimmer at 10:09 AM

E*****h, German, and a little bit of F****h

The iTunes automatic asterisking program in the U.S. mostly doesn't recognize languages other than English.  But someone seems to have added one French word to the list of banned nasties, with the result that a track from !!! Chk Chik Chick's album "Louden Up Now" is listed as "S**t, Scheisse, M***e".

"Merde" is asterisked everywhere, but as far as I can tell, no other French word is; "pisse" gets by, though English "piss" doesn't, and "cul" gets by too.  (Of course "con" escapes asterisking, since otherwise plenty of innocent song titles in Spanish would get axed.)  Quebec French religious-based curse words, like "sacrament" or "tabernacle", are also untouched, for obvious reasons.  (Thanks to Jean Sebastien Girard for the pointer to sacre.) 

Nobody seems to care about Latin or German (see "Scheisse" above), but Spanish gets some attention: "maricon", "mierda", and "pinga" get by, but "chinga", "puta", and "puto" are asterisked.  This yields a listing for Coal Chamber's tune "Maricon P**o" (iTunes is erratic in its use of accent marks, by the way).  No doubt there is more amusement to be found in the Spanish entries.

Further entertainment: iTunes marks some tracks as EXPLICIT, but inconsistently.  This is clearly not automated, and a fair number of racy lyrics are not flagged.  Wonderfully, the asterisking and the warning signs are not aligned.  In particular, "merde" is always asterisked, but only "S**t, Scheisse, M***e" gets a warning label -- because of "shit", not because of "merde".

Even better: when you search on a word, you get little boxes featuring items related to that word.  Searching on "merde" produces a box for the audiobook version of Stephen Clarke's A Year in the Merde.  The book is decorously identified in the box as A Year in the M***e -- but with an accompanying image of the front cover, the most prominent feature of which is the word MERDE.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 09:24 AM

On beyond eggcorns

Mark Peters has a piece on eggcorns in the Chronicle of Higher Education today ("Like a Bowl in a China Shop", 8/9/2006). He points out that "[i]t's nice to have a way of explaining mistakes that doesn't make students feel stupid", and that "if students become eggcorn hunters, they would have to pay attention to not only what's being said but how it is articulated. They would have to question expressions that may seem perfectly acceptable and consult the dictionary to see whether 'throws of passion' or 'throes of passion' is correct. They would have to make fine distinctions, like the difference between an eggcorn and other kinds of mistakes, or between an eggcorn and a writer deliberately being clever. Surely such activities would exercise the reading and thinking muscles."

I agree. But to exercise all of the reading and thinking muscles, you'll want to go beyond eggcorns in teaching the skills of linguistic analysis. And maybe the exercise metaphor is the right way to get through to the modern mind: "Tone up those flabby appositives: just three sessions a week with your personal linguistics trainer will give you tight, attractive prose in less than a semester!"

Posted by Mark Liberman at 08:42 AM

August 08, 2006

They called Hillary a whaaa?

Asterisking objectionable words (as in the iTunes song titles discussed by Arnold Zwicky here, here, and here) requires a careful balance. The asterisker has to conceal enough of the word from impressionable eyes but not so much that it's no longer recognizable to the less faint of heart. So, for instance, motherfucker might be too opaque if it's represented as m***********, but most adult readers would be able to decipher it (given the proper context) if rendered as motherf***er, motherf*****, or even m*****f*****. Beyond the usual suspects of f***, s***, and so forth, it's a subjective task deciding which words get asterisked and how much asterisking of each word is needed.

The columnist Brett Arends faced such a challenge writing in the Boston Herald about nasty epithets slung at Sen. Hillary Clinton by likely voters in the New Hampshire Democratic primary (based on polling conducted by Dick Bennett's American Research Group). The selection of verbal abuse from likely voters begins:

"Lying b**** . . . shrew . . . Machiavellian . . . evil, power-mad witch . . . the ultimate self-serving politician."

(Yes, these are Democrats, not Republicans, using this invective.) In the context of Hillary-bashing, b**** is pretty transparent. We know that the disgruntled respondent isn't calling Clinton a beast or a bimbo, neither a beaut nor a brute. The only word taboo enough to fit the bill is, of course, bitch.

Moving through the litany:

"Criminal . . . megalomaniac . . . fraud . . . dangerous . . . devil incarnate . . . satanic . . . power freak."
And: "Political wh***."

This last one is not quite as obvious as b****, and the writer's use of two unasterisked letters out of five indicates that this is not one of the typical candidates for bowdlerization. But we can safely assume the respondent isn't commenting on Hillary's acumen as a "political whizz," her inexperience as a "political whelp," or her imposing presence as a "political whale." The word is indubitably whore, as bloggers such as Mickey Kaus and James Boyce wasted no time in spelling out.

I don't believe I've ever seen an asterisked version of whore before. Given the generally vicious nature of the characterizations collected by the pollster, this stands out as an unusually dainty concealment. A parenthetical comment by Arends may help explain the cautious asterisking:

(Note: I don't usually like reporting such personal remarks, but in this case you can hardly understand the situation without them. I have no strong personal feelings about the senator.)

If the political situation in New Hampshire in advance of the 2008 primary campaign can't be understood without reference to the abusive (and often misogynistic) anti-Hillary rhetoric among members of her own party, then there's no point in mincing words. We're all adults here — give it to us unexpurgated.

[Update: It's not an asterisking, but Levana Taylor notes that whore is bowdlerized in Rudyard Kipling's poem "The Sergeant's Weddin'," one of the Barrack Room Ballads:

Cheer for the Sergeant's weddin' —
Give 'em one cheer more!
Grey gun-'orses in the lando,
An' a rogue is married to, etc.]


Posted by Benjamin Zimmer at 11:14 AM

Out-X-ing X

A couple of days ago, I got email from Owain Evans about patterns of the form "out-X X":

Google gave examples of: Herod, Hamlet, Joyce, Woolf, Pound, Eliot, Shakespeare, Homer, Virgil, Dylan, Beethoven, Hendrix, Dante, Bush, Clinton.

I would be interested to see more creative examples, but I don't know a better way to Google it than just trying possibilities.

"Perhaps the most important reason for me is just pure aesthetic enjoyment. To me, the libertarian theory is beautiful. It out-mozarts Mozart, it's just a gorgeous thing, and I enjoy every day as a libertarian, it's just a big turn-on." Walter Block (Introduction to Libertarianism I)

It's easy to find more:

Dana Milbank, Washington Post White House correspondent whose work has shifted from "tough but fair" to "trying to out-Krugman Krugman" over the past two years or so, checks in with his first post-Election Day glowing profile of John McCain.

So, in order to out- Khomeini Khomeini, the General introduced his own brand of Islamisation.

And he can out-Einstein Einstein. I especially love his ability to translate a dog's growls and barks.

Fox is apparently attempting to out-Disney Disney in its marketing campaign for the animated feature Anastasia (1997)...

Jackson has proven that he can out-Spielberg Spielberg when it comes to over-the-top action, but anytime the fur isn't flying, he really needs an editor.

Frankly, it looked like they were trying to out-Hannity Hannity and out-O'Reilly O'Reilly all at the same time.

That would be like saying George Lucas is going to Out-Star-Wars Star Wars or Alan Moore is going to Out-Watchmen Watchmen.

LeAnn tries to out-Mariah Mariah here or out-Celine Celine and out-Britney Britney there.

There are also (fewer) examples of "out-X-ing X" and "out-X-ed X":

Then the Democrats were out bushing Bush in their fidelity to the "stay the course until we win" mantra.

Those including several former students, who viewed themselves partly as out-Chomskying Chomsky.

In Tennessee, Bob Clement out-Bushed Bush on some issues.

Rudalevige's book demonstrates how George W. Bush out-Nixoned Nixon.

For a word or a short phrase to be used in these patterns is a pretty good diagnostic for a certain kind of fame. How many different instantiations of the pattern do you suppose there are on the web? 1,000? 10,000? 100,000? This is an excellent example of the kind of search for which you really need a snapshot of the web and the right kind of software to scan it efficiently.
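Given such a snapshot as plain text, the scan itself could be quite simple. The following is only a rough sketch: the pattern and the function name `find_out_x` are my own illustration, and it handles only single-word X (optionally inflected with -s, -ed, or -ing), so it would miss cases like "Out-Star-Wars Star Wars".

```python
import re

# "out-" + X (+ optional -s/-ed/-ing), then a space, then X again.
# The backreference \1 is matched case-insensitively, so
# "out-mozarts Mozart" counts as an instance.
OUT_X = re.compile(r"\bout-([\w']+?)(?:s|ed|ing)? \1\b", re.IGNORECASE)

def find_out_x(text):
    """Return the X of every 'out-X X' instance found in text."""
    return [m.group(1) for m in OUT_X.finditer(text)]

sample = ("trying to out-Hannity Hannity and out-O'Reilly O'Reilly; "
          "George W. Bush out-Nixoned Nixon")
print(find_out_x(sample))
```

Counting the distinct values returned over a large corpus would give one estimate of how many instantiations of the pattern are in circulation.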

[Update -- Ben Zimmer passed along this relevant section from the OED's current database:

  23. In phrases where the compound verb in out- is cognate with its object: to outdo a person or thing in the sphere of action in which they have particular expertise or aptitude, or for which they are renowned; to reach a level of accomplishment in a particular quality or property superior to that normally associated with it.
  The earliest examples, formed from nouns and verbs, are from Shakespeare. The construction is rare in the 17th and 18th cents, but becomes common from the 19th cent., when phrases formed on adjectives also appear.    

    a. Formed on verbs, as to out-equivocate equivocation, to outfish fish.


    b. Formed on proper names: to outdo a person, nation, or sect in respect of the attribute for which they are renowned, as to out-Nero Nero, to out-Auden Auden. Cf. OUT-HEROD v. See also OUT-BABBLE v.
  N.E.D. (1903) remarks: 'The vast development of this, as of so many other Shakesperian usages, belongs to the 19th c., in which such expressions have been used almost without limit.'

1604 SHAKESPEARE Haml. III. ii. 14, I would haue such a Fellow whipt for o're-dooing Termagant, it out Herod's Herod [1603 It out, Herodes Herod], pray you auoyde it.
1655 T. FULLER Church-hist. Brit. VIII. ii. §24 Herein, Morgan Out-Bonnered even Bonner himself.
1737 Common Sense I. 309 Even to out-bentley Bentley.
1800 J. WOLCOT P.S. in Wks. (1812) IV. 338 In his accoutrements out-Alexandering Alexander.
1870 J. R. LOWELL Among my Bks. 1st Ser. (1873) 3 He..out-Miltons Milton in artifice of style.
1886 Referee 21 Feb. 7/4 If the Provost-Marshall has..out-Neroed Nero.
1941 P. LARKIN Let. 31 Dec. in Sel. Lett. (1992) 29 None of it will be of any value anyway, so it's no use short circuiting myself in an effort to out-Auden Auden or out-Lawrence Lawrence.
1995 Daily Tel. 12 Oct. 6/1 She managed to out-Thatch one of the greatest Thatcherites of them all, the Social Security Secretary, Peter Lilley.

    c. Formed on common nouns, as to out-villain villainy, to out-infidel the infidel.


    d. Formed on adjectives, as to out-old the old, to out-modern the moderns, to out-royal royalty.]



Posted by Mark Liberman at 04:26 AM

The emerging science of snowclones

For some reason, a phrase has been stuck in my head recently: "the emerging science of ___". It all started when I blogged about the book by Leonard Sax, "Why Gender Matters: What Parents and Teachers Need to Know about the Emerging Science of Sex Differences".

A Google search for {"emerging science of"} claims 180,000 hits, and the first 10 pages turn up emerging sciences of nanotechnology, spontaneous order, space weather, canopy ecology, the web, dietary components for health, very early detection of disease outbreaks, synchrony, artificial life, aspirin, conservation medicine, learning, body weight regulation, the internet, geometric integration, electromagnetic radiation, marine reserves, leadership, homeopathy, epigenomics, fructology, learnable intelligence, insecticide resistance, endocrine disruption, positive emotion, psychoacoustics, metabolomics, Asian American psychology, dam removal, wholeness, functional assessment, EMD (emergency dispatch), and (my favorite) forensic podiatry.

Exercise for the reader: what proportion of "emerging sciences" are in fact sciences? Certainly more than zero, but quite a bit less than one. I was hoping to be able to tell you that "the emerging science of linguistics" was absent from the net, but in fact there are a few hits. However, most of them deal with rather antique dates of emergence, such as 1820 or so:

Herder held that all thought (and consequently also a thinker’s mental life more generally) was essentially dependent on and bounded by the thinker’s capacity for linguistic expression; and that meanings consisted in word usages. Consequently, for Herder the route to discovering the nature of other peoples’ distinctive ways of thinking and meaning was through a careful examination of their distinctive languages and word usages. W. von Humboldt subsequently took over this position, making it his fundamental rationale for the emerging science of linguistics. He also developed it further by emphasizing, as Herder had not, that languages differ even at the very fundamental level of their grammatical structures.

Posted by Mark Liberman at 04:21 AM

Daw Aung San Suu Kyi

I've been asked several times about the name of Daw Aung San Suu Kyi ဒော္ အောင္ ဆန္း စု က္ရည္, the democratically elected leader of Burma, 1991 recipient of the Nobel Peace Prize, and prisoner of conscience. One question is, why so many names? Part of the answer is that she doesn't have quite as many as it seems. Daw ဒော္ is not a name; it's a respectful title for women. Its male counterpart is U ဦး, as in U Thant, the third Secretary General of the United Nations.

Even so, her name is indeed longer than most Burmese names. The first part, Aung San အောင္ ဆန္း "strange victory", is the name of her father, General Aung San, the hero of Burmese independence. The second part, Suu စု, is the name of her paternal grandmother; the final part, Kyi က္ရည္, is one of the names of her mother, Khin Kyi. The name by which she is known to those privileged to be on a first-name basis with her is Suu စု. Those on less intimate terms would address her as Daw Suu ဒော္ စု.

The other question is why the last syllable of her name is spelled <kyi> but pronounced [ʧi]. The reason is that <kyi> is not an English rendering of the pronunciation of her name but a transliteration. That is, it reflects the way her name is written in Burmese. Burmese spelling, like that of English, is archaicizing. At one time, Burmese had velar stops before /j/. Burmese subsequently underwent a sound change in which /kj/ merged with /ʧ/, but the spelling was not changed. Although there is no longer any phonetic distinction, Burmese retains an orthographic distinction between /kji/ and /ʧi/ which is reflected in the transliteration.

The plot is actually a bit thicker than this. If you use the transliteration favored by scholars of Burmese, က္ရည္ is transliterated <kri>, not <kyi>. That's because it was once pronounced /kri/. By another sound change, /r/ merged with /j/.

There's still a further complication. If you work out what the individual letters are, you'll discover that there are actually five: က   ္   ရ   ည   ္. က is /k/, ရ is /r/, and ည is /ɲ/. What about   ္  ? Well, Burmese is one of the many writing systems that has a default vowel. If a consonant letter is not followed by a vowel letter or an   ္  , it is assumed that it is followed by an /a/. The role of the   ္   is to indicate that no default vowel is to be inserted. For example, က by itself represents /ka/, while က္ represents /k/. So, the last of Daw Suu's names is actually, in a historicizing transliteration, kraɲ. Yet another sound change resulted in the rhyme /aɲ/ becoming /i/.
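The default-vowel rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the table covers just the three consonants discussed here, vowel letters are ignored entirely, the function name `transliterate` is mine, and I am guessing at the code points used in the spelling above (the sketch accepts either U+1039 or U+103A as the vowel-suppressing mark).

```python
# Assumed code points; only the three consonant letters discussed above.
CONSONANTS = {"\u1000": "k", "\u101b": "r", "\u100a": "ɲ"}
KILLERS = {"\u1039", "\u103a"}  # marks meaning "no default vowel here"

def transliterate(text):
    """Historicizing transliteration: each consonant gets a default /a/
    unless the next character is a vowel-suppressing mark."""
    out = []
    for i, ch in enumerate(text):
        if ch not in CONSONANTS:
            continue  # the mark itself (or anything unmapped) adds no sound
        out.append(CONSONANTS[ch])
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt not in KILLERS:
            out.append("a")
    return "".join(out)

print(transliterate("\u1000"))        # ka
print(transliterate("\u1000\u1039"))  # k
print(transliterate("\u1000\u1039\u101b\u100a\u1039"))  # kraɲ
```

The later sound changes (/r/ merging with /j/, /kj/ with /ʧ/, and /aɲ/ becoming /i/) are deliberately left out; the sketch stops at the historicizing transliteration.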

Posted by Bill Poser at 02:37 AM

August 07, 2006

C*m sancto spiritu

Yes, another triumph of the iTunes automatic asterisking program: the innocent Latin preposition "cum" 'with' loses its "u" because of its dirty homograph, as in Blowfly's song "Cum of a Lifetime" and Super 8 Cum Shot's self-titled album.  This wonderful fact from Barbara Partee, who downloaded "Carmina Burana" from iTunes and was confronted with "Si puer c*m puellula", which she would never have understood if it hadn't been for the work of the Taboo Avoidance Crew (also known as the Too Asterisked Crew) here at Language Log Plaza.

And there's fresh news from the TAC!  But first, a little puzzle for you to solve:

Here are six items that iTunes asterisks out that weren't in previous Language Log postings.  Identify the offending words.

1 c*********s [NOT "cocksuckers" or "cuntlickers"], 2 f******o, 3 f*****g [NOT "fucking"], 4 p******t, 5 s***m, 6 t**t

If you said "teat" for the last one, you're wrong; "teat" escapes asterisk-free, as do "nipple(s)", "boob(s)", and "breast(s)" (only "tit(s)" gets caught).  The right answer is "twat".  ("Muff" is ok though, even in "muff dive", "muff diver", and "muff diving".)  Number 5 is "sperm"; this is odd since, as you will recall, "semen" is ok on iTunes.  Number 4 is "pederast", 3 is "fisting", 2 is "fellatio" (an easy one), and 1 is "cunnilingus".  What on earth are the medico-legal "cunnilingus", "fellatio", and "pederast" doing on this list?

You might remember that "whore" is out; I can now report that "pimp", "hustler", "prostitute", "hooker", and "brothel" are ok.  The American iTunes seems not to appreciate the force of "arse" in British English, since the word gets no asterisks, while "ass" is a dirty word.  And as a result of the iTunes whole-word approach to asterisking -- it looks like someone has to enter a word into a list of asteriskables, and then the program searches for that whole word, and not for that letter sequence within a longer word -- "cockring" escapes punishment (while "cock" is, of course, verboten).

As Wendell Kimper pointed out to me in e-mail, there's a lot that iTunes misses because of the whole-word rule: "whore" is out, but "whorecast", "sacredwhore", "whoresulta", and even plain ol' "whoring" get by.  Kimper noted a podcast titled "A Cop and Two Wh***es", which startled me, because it's a violation of the first-and-last-letters-only rule; how did that "e" get preserved?  I then discovered that "cocks" becomes "c**ks", with the k preserved (though "fucks" becomes "f***s", not "f**ks").  Another Apple mystery.
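To make the whole-word rule concrete, here is a minimal sketch of how such an asterisker might work. Everything in it is an assumption for illustration: the word list is a tiny stand-in (the real list isn't public), the function names are mine, and it implements only the regular first-and-last-letters pattern, not the "Wh***es" or "c**ks" oddities noted above.

```python
import re

# Hypothetical stand-in for the list of asteriskables.
BANNED = {"whore", "cock", "cum", "merde"}

def mask(word):
    """Keep the first and last letters; asterisk everything between."""
    return word[0] + "*" * (len(word) - 2) + word[-1]

def asterisk_title(title, banned=BANNED):
    """Mask a word only if the WHOLE word is on the list."""
    def repl(match):
        word = match.group(0)
        return mask(word) if word.lower() in banned else word
    return re.sub(r"[A-Za-z]+", repl, title)

print(asterisk_title("Si puer cum puellula"))  # Si puer c*m puellula
print(asterisk_title("A Year in the Merde"))   # A Year in the M***e
print(asterisk_title("whoring cockring"))      # whoring cockring
```

Because the lookup is on whole words, longer words that merely contain a listed string pass through untouched, which is the behavior Kimper observed.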

Then Q. Pheevr wrote to say he'd blogged about iTunes's automatic asterisker a couple of times. And added still another complication: the Canadian iTunes store works differently from the U.S. store, at least to the extent of bleeping (of all things) "pole", which the U.S. store sensibly leaves untouched (since most of its uses in song titles are non-sexual).  Finally, another bit of marginal irregularity:

Also, the Canadian iTunes Store, which is the one I use, censors "bordel" in French, except in one song title ("Le p'tit bordel," by Marie France), but not "bordello" or "brothel" in English. The cross-linguistic inconsistency is amusing, if not entirely surprising, but the fact that one song seems to have escaped the automatic asterisker is a bit mysterious.

Pheevr's blogs note that while innocent eyes are protected from SEEING the nasty words, most of the time when the songs are sampled innocent ears can HEAR them.  There is, as yet, no automatic bleeping of the samples.

And to see the awful consequences of abandoning the whole-word rule, consider what happened when a British labor union opted for string searching.  As reported by Jon Henley in his "Diary" column in the Guardian of 1 August (passed on to me by Mark McConville), in the wake of Mel Gibson's drunken reviling of the "fucking Jews":

Our spirits are much lifted, though, by news that manufacturing, technical and skilled persons' union Amicus, which has already distinguished itself by monitoring employees' emails and hiring private detectives to keep an eye on its more unruly members, has started automatically filtering its internet discussion forums for unsuitable words, phrases and letter combinations. But while Scunthorpe understandably becomes "S****horpe", and Blackpool (a tad unnecessarily, to our mind) as "Black***l", it seems "entitled", "parse" and even the unspeakable "Saturday" survive intact. Are we alone in wondering how this can be?

You can see the system in action here.  But: POO??  I rushed to my iTunes to check on "poo", and while I was at it, "pee", and was relieved to see that these two nursery words are fit for American eyes.
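The string-searching approach that produced "S****horpe" is easy to imitate. The following is a minimal sketch, assuming a filter that blanks out any banned letter sequence wherever it occurs; the list `BANNED_SEQUENCES` and the function name are hypothetical, inferred only from the two examples in the Diary item, and nothing here is known about the union's actual software.

```python
import re

# Hypothetical banned letter sequences, inferred from the two quoted examples.
BANNED_SEQUENCES = ["cunt", "poo"]

def substring_filter(text):
    """Replace every banned letter sequence with asterisks, even inside longer words."""
    pattern = re.compile("|".join(map(re.escape, BANNED_SEQUENCES)), re.IGNORECASE)
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)

for place in ["Scunthorpe", "Blackpool", "Saturday"]:
    print(place, "->", substring_filter(place))
# Scunthorpe -> S****horpe
# Blackpool -> Black***l
# Saturday -> Saturday
```

"Saturday" survives here only because this toy list is so short. Which sequences the real list contained is exactly what the Diary item is wondering about.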

McConville also points out another case where the visual offense was judged worse than the auditory one (see above on iTunes titles and samples):

... on REM's 1992 album "Automatic For The People", a song which is quite clearly meant to be called "Fuck Me Kitten" is listed on the sleeve as "Star Me Kitten". I remember at the time one of the band saying that Warner refused to allow the  (written) word "fuck" to appear on the title list (this was the era of Tipper Gore and Prince's "You Sexy Motherfucker"), although they had no problem with the line appearing in the song itself.

Along the same line, the Nirvana single "Rape Me" was altered to "Waif Me" for Wal-Mart and Kmart releases of the album "In Utero" -- "an intentionally comical name chosen by Cobain himself", according to Wikipedia.  (Thanks to Martin Marks for the pointer.)  Comical, maybe, but also phonologically VERY close to "rape".

Various artists have used their very own avoidance characters, drawing from the set used in comic-strip generic curses (like "*$&!"), in song titles like these:

Can't F#%k Wit Me
F#@k the Creationists
I Wanna F##k U
F@#k Da Police
Don't F$%k With Us

As OSTENTATIOUS (and instantly interpretable) avoidance, these are hard to beat.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:10 PM

Dan Brown news update

The Dan Brown Language Issues department at Language Log Plaza is currently working on these stories, among others:

  • It was reported on NPR this morning that around two percent of the Britons polled about the Domesday Book think that it is a novel by Dan Brown. Two percent would translate to way over a million people in the British population at large. (The book is in fact an 11th-century catalog of taxable properties in Britain.) Let's face it, Dan Brown is now an unstoppable legend of such proportions that if he were claimed to have written Beowulf, no one would turn a hair. [Footnote: Dan Brown did not write Beowulf. It is an Anglo-Saxon epic poem more than a thousand years old.]

  • As foretold here on Language Log, Dan Brown's Angels and Demons (originally published before The Da Vinci Code and something of a rough sketch for it) is to be made into a film. It will at last become possible to see for oneself what it looked like when the constantly angry Commander Olivetti of the Vatican Guard "entered the room like a rocket", and when "His eyes went white, like a shark about to attack", and when a bit later he said something with "his insect eyes flashing with rage." Language is ill-suited to conveying such images; we need a visual medium, and Columbia Pictures is going to provide it.

  • Research by the team of interns employed at Language Log's Dan Brown Textual Analysis Desk has confirmed that Digital Fortress is not an exception to the stylistic principle that new characters will be introduced with NPs that begin with anarthrous occupational designations. On page 48, chapter 9 begins by introducing new character Phil Chartrukian thus: "Systems security technician Phil Chartrukian had only intended to be inside Crypto for a minute—just long enough to grab some paperwork he'd forgotten the day before." (I don't need to tell you old hands that Phil soon dies a hideous death, his body found burned and broken lying across an electrical generator.) On page 102, we get another new character: "Cryptographer Greg Hale stood in the opening." And one might also count page 260, where chapter 74 begins: "Director Leland Fontaine was a mountain of a man..." — but note that Director, unlike systems security technician, does get used as a title (like Captain or Chancellor or President), so this is not a clear case. The general rule stands, therefore: every Dan Brown novel introduces new characters with noun phrases in which an occupational term before a proper name occurs with no preceding article. And they then usually die horrible deaths.

We will keep you up to date with all breaking news about Dan and his works; just bookmark Language Log's main page and check every day at least twice.

Posted by Geoffrey K. Pullum at 04:25 PM

Boston's irreconcilable council(l)ors

Boston's City Council is hopelessly deadlocked over a grave matter: Should council(l)or be spelled with one L or two? According to today's Boston Globe, the council is about evenly split between one-L-ers and two-L-ers. The two-L-ers say tradition is on their side, as that's how city documents have long spelled the word. But the one-L spelling is preferred by "newer, younger councilors" (the Globe goes with one L, obviously) who see it "as a symbol of breaking from an old, hide-bound kind of politics." It's a delicious example of how orthography can be invested with weighty sociopolitical significance, even to the point of fetishization.

Here are a few quotes from the brash young one-L-ers:

"It just exemplifies that we are on the vanguard of change," said Jack Kowalski, spokesman for Sam Yoon, the council's first Asian-American.

"Why use two Ls when you can use only one?" said Councilor Michael P. Ross. "I believe in conservation — and brevity."

Salvatore LaMattina, the newest council member, said his staff had a discussion over the issue when he first took office in June.
"I just liked the one L," he said. "It's easier, and I wanted to be a little different for my district than the previous councilor." Paul J. Scapicchio, the previous councilor from District 1, used two Ls.

And on the other side are the two-L-ers, saving orthographic tradition from the impertinent whippersnappers:

"Those new young guys, they've just got no respect," said [Councilor John] Tobin, whose staff for several years mocked him by giving him the nickname "Double L."
"I will not be part of the dumbing down of the English language," he said.

"That's the proper way," said Councilor Charles C. Yancey, who spells it with two Ls. "I am aware that some of my colleagues are spelling it a different way. I should accept personal responsibility for not properly educating them. Either that or they've refused to listen."

The mayor, Thomas M. Menino, doesn't seem to favor one L or two, instead seeing the debate as symbolic of the council's inability to compromise. "If they can't agree on the spelling of councilor, how are they going to agree on anything else?" Menino told the Globe.

The Globe article identifies councillor as a British spelling and notes that the Oxford English Dictionary has it that way, while other dictionaries including Webster's New World prefer the one-L version. The current OED entry for councillor doesn't even mention the one-L variant, though the entry doesn't appear to have been revised at all over the past century (true for many of the OED entries near the beginning of the alphabet). Most contemporary American dictionaries list councilor and councillor as acceptable alternates, though councillor is sometimes marked as British.

The one-L spelling of councilor was apparently one of Noah Webster's many attempts at distinguishing American orthography from the British model. Webster's pioneering dictionary of 1828 only lists councilor without deigning to mention the traditional British spelling. But after Webster's death, successor dictionaries were not as adamant in imposing the one-L spelling. The revised unabridged dictionary of 1913 notes parenthetically that councilor is "written also as councillor." Webster's Third New International of 1961, which traditionalists derided as being overly permissive, actually lists councillor first, though both spellings are deemed equally acceptable.

Despite the lingering confusion over council(l)or, changing two Ls to one (particularly in inflected and derived forms of L-final root words ending in unstressed syllables) was actually one of Noah Webster's more successful areas of reform. Would the Boston council members who think that the one-L spelling of councilor represents "the dumbing down of the English language" feel the same of jeweler, panelist, or even councilor's soundalike, counselor? As a patriotic one-L-er might point out, our forefathers quarrel(l)ed with having our orthography model(l)ed after the British. Long live America's marvel(l)ous and unrival(l)ed spelling reforms!

Posted by Benjamin Zimmer at 12:45 PM

Lawyers in need of linguistic training

Estelle Hawa sent in a link to an article by Grant Robertson in yesterday's Globe and Mail, "A basic rule of punctuation":

It could be the most costly piece of punctuation in Canada.

A grammatical blunder may force Rogers Communications Inc. to pay an extra $2.13-million to use utility poles in the Maritimes after the placement of a comma in a contract permitted the deal's cancellation.

The controversial comma sent lawyers and telecommunications regulators scrambling for their English textbooks in a bitter 18 month dispute that serves as an expensive reminder of the importance of punctuation.

The problem? The contract said that the 2002 agreement "shall continue in force for a period of five years from the date it is made, and thereafter for successive five year terms, unless and until terminated by one year prior notice in writing by either party." Rogers meant the "unless and until" clause to modify the clause about renewal, but Aliant (managing rights to the utility poles) felt that it should apply to the whole agreement, and proceeded to give the required one-year notice early in 2005. The Canadian Radio-television and Telecommunications Commission (CRTC) agreed with Aliant.

An Aliant spokesperson commented that "This is a classic case of where the placement of a comma has great importance", but it seems to me that punctuation is secondary. Lawyers with a bit of common sense, combined with elementary skill in analyzing ambiguities of structure and interpretation, should have seen the problem coming, and re-worded that part of the contract so as to make it entirely clear who was entitled to cancel it when. Given the importance of such ambiguities of interpretation, in construing laws and judicial orders as well as contracts, I've always been puzzled that lawyers aren't routinely educated in basic practical syntax and semantics. In olden times, lawyers would have acquired (an approximation to) these skills in the course of learning dead languages. These days, I suppose that few of them get any educational help at all in such matters, and have to fall back on their native wit, such as it may be.

Ironically, we've learned a lot over the past century about the analysis of structure and meaning. It's too bad that lawyers and their clients so rarely get the benefit of that knowledge.

Alas, I suppose that $2.13 million (Canadian) is a drop in the financial bucket these days -- easily lost in the round-off error on the lawyers' bills for companies like Rogers, though still embarrassing for the careless contract drafters.

[Update -- Margaret Marks found the decision. The relevant portion reads:

The Commission is of the view that the wording in section 8.1 of the SSA is clear and unambiguous. The Commission notes that based on the rules of punctuation, the comma placed before the phrase "unless and until terminated by one year prior notice in writing by either party" means that that phrase qualifies both the phrases "[the SSA] shall be effective from the date it is made and shall continue in force for a period of five (5) years from the date it is made" and the phrase "and thereafter for successive five (5) year terms".


[Update -- Mary Blockley pointed me to a relevant passage in her book "Aspects of Old English Poetic Syntax: Where Clauses Begin" (2001):

The Canadian regulators were apparently unimpressed by the "superveniency of intention". ]

[Another update -- Heidi Harley at Heideas has an extended discussion of the linguistic issues. ]

Posted by Mark Liberman at 12:37 PM

Sex and speaking rate

After spending a couple of hours trying to track down a plausible source for Louann Brizendine's characterization of male and female speaking rates, I've come up empty. Worse than empty: the paper that she cites as the source for the claim contains no relevant information at all, and the rest of the literature on speaking rate not only fails to support her assertion, but also includes results that contradict it. If you can help me to do better, please let me know. Meanwhile, here's where I've gotten to.

On p. 36 of her new book The Female Brain, Prof. Brizendine writes: "Girls speak faster on average -- 250 words per minute versus 125 for typical males." In support of this assertion, the end-note in her book cites "Ryan 2000", which her bibliography lists as Bruce P. Ryan, "Speaking rate, conversational speech acts, interruption, and linguistic complexity of 20 pre-school stuttering and non-stuttering children and their mothers", Clinical Linguistics & Phonetics, 14(1), pp. 25-51 (2000). Its abstract:

This is the second in a series of reports concerning stuttering pre-school children enrolled in a longitudinal study; the first was Ryan (1992). Conversational samples of 20 stuttering and 20 non-stuttering pre-school children and their mothers were analysed for speaking rate, conversational speech acts, interruption, and linguistic complexity. Between-group analyses revealed few differences between either the two children or two mother groups. Within-group analyses indicated differences that involved conversational speech acts and linguistic complexity. Most stuttering occurred on statements (M = 32.3% stuttered) and questions (M = 20.9% stuttered). Stuttered and disfluent sentences had higher Developmental Sentence Scoring (DSS) (Lee, 1974) scores (M = 10.9, 12.9, respectively) than fluent sentences (M = 7.6). Multiple correlation analyses indicated that speaking rate of mothers (0.561) and normal disfluency of children (0.396) were major predictor variables.

The only (new) quantitative information about speaking rates to be found in this paper is in Table 1:

A few numbers from other studies are quoted in the literature review, but none of them breaks speech rates down by sex, and none of the cited numbers are either 250 or 125. I'm at a loss to see how Prof. B. can interpret anything in this paper as support for the view that "girls speak ... 250 words per minute versus 125 for typical males". Perhaps her research assistant pulled the wrong index card for that talking point? In fact, the whole idea is so odd that I wonder if it's all a misunderstanding of some sort from the start. Perhaps someone read from a column of average pitch measurements -- where female values might be roughly twice (post-puberty) male values -- instead of a column of average speech rate measurements. On the other hand, there are no pitch measurements in Ryan 2000, so we'd have to combine this with a mistake in the reference. A less charitable interpretation would be that the 250-vs.-125 numbers arose through the same sort of intellectual "telephone game" that operated in the infamous Eskimo snow-vocabulary case, and a harried assistant assigned to track down a reference for the numbers just fixed on Ryan 2000 because "speaking rate" occurs in its title.

As far as I can tell, none of the literature that actually addresses the question of sex differences in speaking rates finds anything remotely resembling the disparity that Brizendine claims. For example, Michael P. Robb, Margaret A. Maclagan, and Yang Chen, "Speaking rates of American and New Zealand varieties of English", Clinical Linguistics & Phonetics, 18(1) pp. 1-15 (2004), says:

Various acoustic measures of speaking rate were calculated for 40 adult speakers of New Zealand English (NZE). These measures were then compared to a group of 40 adult speakers of American English (AE). Results of the analysis identified significantly faster overall speaking rate and articulation rate for the NZE group compared to the AE group. No gender differences in speaking rate or articulation rate were found for either variety of English.

There's a brief mention of some sex differences in speaking rate in a paper that Jiahong Yuan, Chris Cieri and I will be giving at ICSLP 2006 in September: Jiahong Yuan, Mark Liberman and Chris Cieri, "Towards an Integrated Understanding of Speaking Rate in Conversation". The link is to a four-page "extended abstract" that will go in the conference proceedings; given the four page limit, we cut our remarks on sex to

Males tend to speak faster than females, as shown in Figure 6. The difference between them is, however, very small, only about 4 to 5 words or characters per minute (2%), though it is statistically significant. It might be due to things that we would not normally think of as speech-rate parameters, such as differences in word-frequency distributions. The opposite patterns of segment-length difference between males and females in Chinese and English are interesting, and need more study.

If you're curious, this is something that you can easily look into yourself -- just snag some of the speech floating around on the internet, transcribe it (or find a transcript made by someone else), count the words, and divide by the elapsed time. This can be a great topic for a research paper, because the data is easy to get, and the measure can be relevant to all sorts of things: rhetoric (how do speech-rate changes track the arc of an argument?), mood (how do actors use speech rate to convey emotion?), gender studies (how do male and female speech rates vary across circumstances?), or comparisons across age, social class, language or interactional context. There are a couple of tricks, though, which mean that you have to be careful about comparing rates across sources and contexts.

One is the question of whether and how to count silences. This came up a couple of years ago on Language Log, in a discussion of one of the 2004 presidential debates ("The Rhetoric of Silence", 10/3/2004):

I've pointed out that in Thursday's debate, John Kerry's sentences were 17.7% longer than George Bush's. Since the two men had the same amount of time to speak, you expect this to mean that Kerry used fewer sentences. And he did, 468 to 476. However, that's only a 1.7% difference. Kerry accounted for most of his greater sentence length not by using fewer sentences, but by packing more words into the same amount of time. 15.8% more, 7,136 to 6,165.

How'd he do that? New LexicoTardis® technology from the Rockridge Institute? Well, there are four obvious possibilities. First, Kerry might have talked faster. Second, he might have used shorter pauses. Third, he might have paused less often. Fourth, he might have used intrinsically shorter words.

A few quick and simple measurements suggest that the second of these four was the key factor. In the section of the debate that I examined, Bush used about the same number and frequency of pauses as Kerry did, but Bush's pauses were much longer. In between the pauses, Bush actually talked faster, but the pauses were so much longer that his overall speech rate was slower.

[...] Bush's overall speech rate was slower (155 words per minute vs. 167 words per minute), but while the two men were actually talking, Bush talked considerably faster (220 words per minute vs. 202 words per minute).
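The percentages in the quoted passage can be rechecked from the raw counts, and the two pairs of rates also imply how much of each man's time was filled with speech. Here is a small Python sketch of that arithmetic; the helper `percent_more` and the variable names are invented for illustration, and the talk-share figures apply only to the section of the debate that the rates were measured on.

```python
def percent_more(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a / b - 1) * 100

# Counts quoted in the passage above.
kerry_words, bush_words = 7136, 6165
kerry_sentences, bush_sentences = 468, 476

# Mean sentence length in words.
kerry_len = kerry_words / kerry_sentences
bush_len = bush_words / bush_sentences

print(round(percent_more(kerry_len, bush_len), 1))              # 17.7
print(round(percent_more(bush_sentences, kerry_sentences), 1))  # 1.7
print(round(percent_more(kerry_words, bush_words), 1))          # 15.8

# overall rate = words / total time; talking rate = words / time spent talking.
# Dividing the first by the second gives the share of time filled with speech.
bush_talk_share = 155 / 220
kerry_talk_share = 167 / 202
print(round(bush_talk_share, 2), round(kerry_talk_share, 2))
```

On these numbers Bush was actually speaking for roughly 70% of his time and Kerry for roughly 83%, which is the "longer pauses" effect expressed as a single ratio.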

If you're going to leave out the pauses, you have to decide what a pause is. Are you going to leave out all the little silences, no matter how small? The littlest ones aren't really pauses, they're aspects of speech like "stop gaps", like the tiny silence while your mouth is closed during the [p] of apple -- but these are variably prolonged at phrase boundaries, and you'd need to decide when a prolonged stop gap turns into a silent pause. What about "filled pauses", hesitations that the speaker fills with sounds like "uh"?

Luckily, these decisions don't make a big difference in speech rate numbers. The big effect is leaving out pauses or not leaving them out. But if you're going to take them out, then different practices about what to remove will still create a small effect, and so you'd better do it the same way across all the conditions you're interested in.
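To make the pause question concrete, here is a minimal Python sketch that computes a rate with pauses left in and a rate with pauses taken out, given word-level time stamps. Everything about it is an assumption for illustration: the input format, the function name `speech_rates`, and especially the 0.2-second threshold for what counts as a pause, which is an arbitrary choice rather than any kind of standard.

```python
def speech_rates(words, min_pause=0.2):
    """Return (gross_wpm, net_wpm) for one speaker.

    words: list of (start_sec, end_sec) intervals, one per word, in time order.
    min_pause: gaps at least this long (in seconds) count as pauses and are
        left out of the net rate; shorter gaps are treated as part of speech.
    """
    if not words:
        raise ValueError("need at least one word")
    total = words[-1][1] - words[0][0]
    pause_time = 0.0
    # Compare the end of each word with the start of the next one.
    for (_, prev_end), (next_start, _) in zip(words, words[1:]):
        gap = next_start - prev_end
        if gap >= min_pause:
            pause_time += gap
    gross = len(words) / total * 60
    net = len(words) / (total - pause_time) * 60
    return gross, net

# Toy data: four half-second words with a one-second silence in the middle.
toy = [(0.0, 0.5), (0.5, 1.0), (2.0, 2.5), (2.5, 3.0)]
gross, net = speech_rates(toy)
print(round(gross), round(net))  # 80 120
```

Raising or lowering `min_pause` changes which silences are removed, so whatever value is chosen should be held fixed across all the conditions being compared.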

Another issue comes up in dialogue and multi-party conversations. What are you going to do about overlapping speech? This can create a significant effect as well. Here's a quote from the previously-cited Yuan, Liberman and Cieri paper:

One indication of the nature of this problem can be seen by exploring various definitions of speaking rate applied to a modest-sized corpus that has been carefully aligned at the word level. This is the version of English Switchboard corrected and aligned at ICSI, comprising 2,438 conversations. If we calculate the overall speaking rate for this corpus, by simply adding up the number of words spoken by both participants, and dividing by the total elapsed time of the conversations, we find an overall average rate of 196 words per minute (WPM). For individual conversations, the rate measured in this way ranges from 111 WPM to 291 WPM. If we use the word alignments to exclude silences, non-speech noises and so on, we find an average “net” speaking rate of 236 WPM, with a minimum of 158 WPM and a maximum of 312 WPM. However, if we calculate the rate by adding up all of the turns for each speaker in order to get the total time, we find that there is so much overlap in the turn boundaries that the average turn-wise rate (total word count divided by the sum of segment times) is only 164 WPM, or 14% less than the rate calculated using the total conversational time.

For the "Switchboard" conversations, we thus found average rates of 196 WPM by one method (total words divided by total conversation time), 236 WPM by a second (total words divided by speech time excluding silent pauses), and 164 WPM by a third (total words for each speaker divided by the time allotted to that speaker's "turns", which included some short silences and some speech from the other parties). In my earlier post, I reported on average rates for males and females in a different corpus of conversational English, using the third method, but relying on turn boundaries established by a different crew of transcriptionists. This method produced average rates of 174.3 WPM for men and 172.6 WPM for the women. I'm fairly confident that this reflects different turn-segmentation practices by the transcriptionists, not a 6% faster speech rate in the second corpus.
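The three definitions can be laid side by side in a short Python sketch. The turn records and their numbers below are made up purely for illustration (they are not Switchboard data), the function name `three_rates` is hypothetical, and pooling both speakers' speech time for the "net" rate is an assumption about how that figure was computed.

```python
def three_rates(turns, conversation_seconds):
    """Return (overall, net, turn_wise) speaking rates in words per minute.

    turns: list of (speaker, start_sec, end_sec, n_words, speech_sec) tuples,
        where speech_sec is the time within the turn actually filled with speech.
    """
    total_words = sum(n_words for _, _, _, n_words, _ in turns)
    # Method 1: all words divided by the elapsed time of the conversation.
    overall = total_words / conversation_seconds * 60
    # Method 2: all words divided by speech time, silences excluded.
    net = total_words / sum(speech for _, _, _, _, speech in turns) * 60
    # Method 3: all words divided by the summed turn durations; overlapping
    # turns make this sum longer than the conversation itself.
    turn_wise = total_words / sum(end - start for _, start, end, _, _ in turns) * 60
    return overall, net, turn_wise

# Invented one-minute conversation in which the two turns overlap by 5 seconds.
toy = [("A", 0, 35, 70, 25), ("B", 30, 60, 60, 22)]
overall, net, turn_wise = three_rates(toy, 60)
print(round(overall), round(net), round(turn_wise))  # 130 166 120
```

Even in this toy case the same words yield three different rates, in the same order as the Switchboard figures: net highest, turn-wise lowest.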

There's no "standard" way to do all this, as far as I know. Just decide what to do, based on the goals of your study and the data you're working with, try to keep the practices consistent across the variables of interest to you, and document your decisions when you write it up. The one thing you shouldn't do? Don't just make up a couple of numbers claiming a 2-to-1 difference for the groups of people that you're interested in characterizing. Somebody might check.

[Update -- Lameen Souag points us to Ann Cutler and Donia Scott, "Speaker Sex and Perceived Apportionment of Talk", Applied Psycholinguistics 11 (1990): 253-272. It's not available on line, alas, but the description at ERIC reads:

Investigates whether listener bias contributes to the mistaken notion that women talk more than men. Perceptual effects (misjudgments of rates of speech) and attitudes to social roles and perception of power relations are suggested to be among the factors contributing to the misjudgment.

Lameen's description, based on reading about the paper in Jay Ingram's Talk Talk Talk:

Apparently, they took various dialogs and, for each dialog, recorded it being read by two women, by two men, by a man and a woman, or by a woman and a man. When a man and a man, or a woman and a woman, are reading the dialog, listeners judged each of them to be taking about equal time; when a man and a woman are reading the same dialog, listeners (male or female) consistently judged the woman to be talking more.


[ Update #2 -- speech rate information for females and males in Dutch conversation can be found in Diana Binnenpoorte, Christophe Van Bael, Els den Os and Lou Boves, "Gender in Everyday Speech and Language: A Corpus-based Study", Interspeech 2005. Their data is from 50 males and 58 females who participated in face-to-face conversations, and 40 males and 61 females who participated in telephone dialogues. Speech rate measurements can be found in their Table 1:

Converting from words per second to words per minute, that's 223 WPM for the males vs. 220 WPM for the females with pauses included, and 274 WPM for males vs. 266 WPM for females if pauses are excluded. Again, a (meaninglessly small) advantage for the males, in the opposite direction from Brizendine's claim. (Hat tip to Theo Vosse.)]

[Update #3: Paul Foulkes reminds me that in "Relations of sex and dialect to reduction", Speech Communication 15: 39-54 (1994), Dani Byrd checked the 630 speakers who read the two "calibration sentences" for the TIMIT database, and found that the male speakers were 6.2% faster on average. Though small, the effect was found to be statistically significant -- and again in the opposite direction from Brizendine's claim. ]

Posted by Mark Liberman at 10:05 AM

Eggcorns going mainstream?

Barbara Wallraff's Word Court column in the September 2006 Atlantic features a basket of eggcorns: "to step foot in", "baited breath", "free reign", "hone in", "ripe with mistakes", "LAN line" (for "land line"), "magnate school", "shipping magnet". She unapologetically refers to these as eggcorns -- well, she does explain that

Language geeks have given the name eggcorns to usages of this kind—“spontaneous reshapings of known expressions” which seem to make sense.

and some might take offense, but these days, X geek seems to have become an informal way of saying "expert in X", though perhaps with just a tinge of unhealthy obsession. And she cites Chris Waigl's eggcorn database, and gives the URL.

Web counts also suggest that eggcorn is making its way in the world (though there's more than the usual amount of noise in the numbers -- what's up with mondegreen, Yahoo, a splog nest in need of cleaning out?):

             Google     Yahoo       MSN
malapropism  121,000    101,000     15,485
mondegreen   106,000    1,110,000   10,308
eggcorn      56,500     26,200      4,436

Posted by Mark Liberman at 05:53 AM

MS Demo Backwash

Based on the 8/5/2006 and 8/7/2006 User Friendly strips, I'd say that the mud kicked up by the Big Redmond Demo Disaster is sticking to speech technology in general, more than to Microsoft's speech technology in particular:

That's too bad, given that it was really a bug in some general automatic gain control code for Vista audio "capture streams" (at least that's how I read the discussion here).

Posted by Mark Liberman at 05:39 AM

August 06, 2006

Sex-linked lexical budgets

[Update 7/3/2007: a list of links to other relevant posts can be found here. ]

This morning, I spent a fruitless hour trying to track down the source of Louann Brizendine's assertion that "A woman uses about 20,000 words per day while a man uses about 7,000". I found many similar assertions, with estimates of the male lexical allowance varying from 2,000 to 25,000, while assertions about the female daily word budget ranged from 7,000 to 50,000. But nowhere could I find any evidence that anyone has ever supported these assertions by actually counting words or measuring talking times. My current best guess is that a marriage counselor invented this particular meme about 15 years ago, as a sort of parable for couples with certain communication problems, and others have picked it up and spread it, while modulating the numbers to suit their tastes. This is what happened in the case that Geoff Pullum called The Great Eskimo Vocabulary Hoax, discussed here. If I'm wrong, and you know a source for Brizendine's numbers that isn't just passing along someone else's story, please tell me.

Here's what I've found so far --

We can start with this joke at

A husband looking through the paper came upon a study that said women use more words than men.
Excited to prove to his wife that he had been right all along when he accused her of talking too much, he showed her the study results. It read "Men use about 15,000 words per day, but women use 30,000".
The wife thought for a while, then finally she said to her husband "It's because we have to repeat everything we say."
The husband said "What?"

Some of my favorite jokes are well supplied with footnotes, but this one isn't.

A compilation of averages on a trivia page at says that "On average women say 7,000 words per day. Men manage just over 2000." There are no footnotes here either, needless to say.

The same numbers appear in a 2004 article by Hara Estroff Marano from Psychology Today, "Secrets of Married Men", which quotes "Rhode Island psychiatrist Scott Haltzman, M.D." as follows:

The average woman uses 7,000 words a day and five tones of speech, he points out. The average man uses 2,000 words and three tones. "Men are talk-impaired, relatively speaking," he says.

Haltzman has also written a book, also (according to the blurb at "emphasizing the biological differences between men and women", but if he's published in the scientific literature on this topic, Google Scholar doesn't know about it.

The 7,000/2,000 numbers also feature in Kevin Burke's one-man show "Defending the Caveman", according to a review by Adrienne Broaddus:

Did you know that men generally speak about 2,000 words a day and women 7,000? Men, Burke explains, bond and communicate by sharing long periods of silence and occasional name-calling, whereas women bond by gossiping, processing things and sharing emotional insights. That explains why men are never able to tell women details about their night out with the guys - they don't talk.

A 2004 CBS News segment quotes Kate White, the Editor-in-Chief of Cosmopolitan, citing numbers in the same range:

"We make the mistake as women sometimes of thinking that, because (men are) different, there is something wrong. The average guy speaks 2,000 to 4,000 words a day and the average woman 6,000 to 8,000. So we're different, but it doesn't mean there is something the matter with the relationship."

No footnotes, alas.

An undated article by Debbie Waitkus on the website of the magazine Ladies Golf Journey, "To Speak or Not to Speak, That is the Question", tells us that

Women, for the most part, love to talk. In fact, women speak an average of 30,000 words a day. Compare this with an average of 12,000 a day for men. Women can talk about anything, anywhere, anytime – golf course included.

A 1999 article in The Nationalist tells us:

Research shows that women can speak 20,000 to 25,000 words a day compared to men’s paltry 7,000 to 10,000.

Again no source is cited, although the article discusses a book by Allan and Barbara Pease in which the meme does occur, with different numbers. That book (Allan and Barbara Pease, "Why Men Don't Listen & Women Can't Read Maps", p. 80-81) not only gives different totals, but also breaks them down in an unusual way:

A woman can effortlessly speak an average of 6,000-8,000 words a day. She uses an additional 2,000-3,000 vocal sounds to communicate, as well as 8,000-10,000 facial expressions, head movements, and other body language signals. This gives her a daily average of more than 20,000 communications. That explains why the British Medical Association recently reported that women are four times more likely to suffer from jaw problems.

"Once I didn't talk to my wife
for six months," said the comedian.
"I didn't want to interrupt."

Contrast a woman's daily "chatter" to that of a man. He utters just 2,000-4,000 words and 1,000-2,000 vocal sounds, and makes a mere 2,000-3,000 body language signals. His daily average adds up to around 7,000 communication "words" -- just over a third the output of a woman.

The referents of these multiple counts ("vocal sounds", "body language signals") are kind of vague, to say the least; and all of the counts seem roughly as scientific as the count of months in the quoted joke.

Another variant of the Pease-derived version of this meme can be found in a compilation of sex-stereotype wisdom by Dan Bova from Redbook:

Why doesn't he want to talk about his day when he gets home?
"I just want to leave all the annoying crap of the day behind me and think about nothing for a while," says Jim, 31, a father of two from Beacon, NY. At the end of the day, men are tired of thinking, and, more important, we're tired of talking. "Studies show that women use 8,000 to 9,000 words a day. Men use 2,000 to 4,000 words a day on average," explains communication expert Allan Pease. "By the time they come home from work, they've used up their words. And women have 5,000 left to go."

And here's another Allan and Barbara Pease bit, in the transcript of a 2004 CNN interview plugging their book "Why men don't have a clue and Women Always Need More Shoes" (yes, they've got a million of 'em, it seems -- they're serial stereotypers):

MCINTYRE: All right, so help us out. Help us clueless men out. How do we crack the code?
A. PEASE: Well, first of all, to understand that the brain scans that we talk about in our book show very clearly that men and women are different. Now, it's politically correct, I know, to go around pretending that men and women are exactly the same now. But the reality is we're not. And everybody knows this. We're scientifically different. And so, first of all, is to accept that.
Secondly, to understand that women can speak 20,000 to 24,000 words a day versus a man's top end of 7,000 to 10,000. And where this becomes apparent is at the early evening when you're having dinner, because most men have done their 10,000, right? She might still have 15,000 to go, and someone's got to hear them.

We can also see here the introduction of "brain scans" into this world. The Peases have clearly recycled this meme many times over the years, and it's possible that they're actually its source, though so far I don't have any references that antedate the citations coming up next.

James Dobson wrote in his Focus on the Family column, June 2004:

Research makes it clear that little girls are blessed with greater linguistic ability than little boys, and it remains a lifelong talent. Simply stated, she talks more than he. As an adult, she typically expresses her feelings and thoughts far better than her husband and is often irritated by his reticence. God may have given her 50,000 words per day and her husband only 25,000. He comes home from work with 24,975 used up and merely grunts his way through the evening. He may descend into Monday Night Football, while his wife is dying to expend her remaining 25,000 words.

Apparently this was recycled material, since Christianity Today quotes it from a book first published in 1993:

Dr. James Dobson hits on the wiring problem in his book, Love for a Lifetime. He writes: "Research makes it clear that little girls are blessed with greater linguistic ability than little boys, and it remains a lifelong talent. Simply stated, she talks more than he." Dobson suggests that God may have given Mrs. Cell Phone 50,000 words per day while Mr. Computer may average 25,000.

I don't have access to the 1993 edition of this book (though ads for the current version feature the same quote), so I'm going to be tentative in naming Dobson as one of the two earliest publications of the sex-linked vocabulary allowance idea. And Dobson says that "God may have given" particular lexical allowances to a hypothetical (if prototypical) woman and man, which is fair enough, so we can't blame him for the fact that the quantities are unsourced.

Dobson also brings out one of the interesting secondary features of (some versions of) this meme: the notion that we are dealing not with a behavioral average (like, say, the number of steps someone takes in the course of a day), but rather with a sort of behavioral budget. You get X number of words, and once you've said that many, you've got to stop. Sorry, too bad if you've got more to say, your word tank is empty. Try again tomorrow. (This is in contrast to a version in which women talk more just because they're more verbally adept or more verbally inclined than men are.)

In the obviously serious volume Counseling Criminal Justice Offenders, Ruth E. Masters wrote "Females use an estimated 25,000 words per day and males use an estimated 12,000 words per day". And wonder of wonders, she gives a reference: (Smalley, 1993)!

Alas, (Smalley, 1993) turns out to be a little pamphlet made to be handed out by marriage counselors -- Gary Smalley, "Connecting with your husband", p. 18-19:

Studies show that the average male uses about 12,000 words a day, the entire day, and most of those are spent relating to people while on the job. Remember, most men are aggressive and driven. They will talk at length in the workplace in order to successfully complete an assignment, project, or task.

A woman, on the other hand, averages 25,000 words per day. Now these aren't just any words but words that must connect with people or emotions. In other words, when a woman spends her day in the workplace, generally there are few opportunities for her to really dig in and use her allotment of words.

Here's the problem. At the end of the day -- whether the woman works in an office or in the home -- there is huge difference between the man's word count and the woman's. A man has spent nearly all his words. He comes home tired and drained, looking for a place to recharge for the next day's battle at the office.

A woman, however, is just warming up. She has thousands of words left to speak, and since her husband's word count is depleted, the conversations often wind up sounding like nothing more than question-and-answer sessions.

If this passage is in the 1993 edition (the one that I was able to search online is more recent), then Smalley is more or less tied with Dobson for priority on this idea. Smalley also features the "lexical budget" or "word tank" version of this meme. This is such a bizarre idea that it probably has a common source. Either Dobson got it from Smalley, or Smalley got it from Dobson, or both of them got it from some third party. The question is, did Louann Brizendine really just pick the idea up from this demi-monde of pseudo-scientific urban legends in order to deploy it in her pop neuroscience, or is there actually some sort of half-way respectable research in its history somewhere?

Looking through Brizendine's book, I couldn't find this factoid other than in the jacket copy. That's too bad, since in general she gives references for the assertions in the book. [update 9/2/2006: Brizendine makes the assertion on p. 14 of the book, and offers a list of references that includes Pease, A. and A. Garner (1997) Talk Language: How to use conversation for profit and pleasure, which is presumably where she found it.] The word-budget business comes up in this passage from one of her online lectures:

And some of you may know this interesting fact. By female- adulthood, females speak on average about twenty thousand words a day, and ... My sixteen-year-old son will sometimes say to me, "Mom, I've already said my seven thousand for the day. Call up your girlfriend!" [laughs]

[I should say, if it's not already obvious, that I'm skeptical about the truth of this claim. For either sex, the variance of daily word-production counts will be enormous, and even for a particular individual, the count will depend massively on variable aspects of life circumstances. At best you'd be able to say that in a certain range of comparable circumstances, there were different mean values for men and women; and even in that sort of controlled comparison, I'd be very surprised if the within-group variation wasn't much larger than the across-group difference. So far, I haven't found any evidence that there is any empirical warrant for saying even that much.]

[Update -- Clay Beckner reminded me of the review by Deborah James and Janice Drakich, "Understanding Gender Differences in Amount of Talk: A Critical Review of Research", in Deborah Tannen, ed., Gender and Conversational Interaction, Oxford U Press, 281-312 (1993); and sent the abstract:

It is shown that the widely held belief that women talk more than men is unsupported in the literature. Of the studies reviewed that examined mixed-sex interaction, the majority found either that men talked more than women, or that there was no difference between men & women in amount of talk. Approaches to understanding these findings are explored, with one theory - status characteristics theory - highlighted as most helpful in understanding gender differences in amount of talk. The effect of the research activity on the amount of talk in each study is explored, with studies divided into those that used formal task activities, informal task & nontask activities, & formal nontask activities. Most studies reported either that men talked more than women, either overall or in some circumstances, or that there was no difference between the genders in amount of talk. In each of these contexts, the findings are explored in light of the status characteristics theory. It is concluded that rather than viewing the overwhelming tendency of males to talk more than females as further evidence of domination & exploitation of power over women, the different goals for interaction, to which both men & women are socialized, should be considered in the context of social structure.

The cited research is not quite a refutation of the Dobson/Smalley/Pease/Brizendine/etc. claim, since it monitors what happens during a period devoted in some sense to interaction. It's still possible that when not forced into socializing, men uniformly clam up (e.g. in front of football games or behind newspapers) while women uniformly seek out someone to talk with, so that average daily word counts would still favor women by the nearly three-to-one margin that Brizendine claims. But I doubt it.

And in fact there is a body of material that tests the sexual word budget directly. Clay also reminded me of Rayson, P., Leech, G., and Hodges, M., "Social differentiation in the use of English vocabulary: some analyses of the conversational component of the British National Corpus". International Journal of Corpus Linguistics. 2(1) pp 133-152 (1997).

This data comes from 153 demographically balanced speakers, 75 female and 73 male, who "were equipped with a high-quality Walkman sound recorder, and recorded any linguistic transactions in which they engaged during a period of two days", obtaining permissions from their interlocutors as well. The total number of speakers recorded by this method included 561 females and 536 males.

TABLE 1 Distribution of the Conversational Corpus between Female and Male Speakers
                               Female Speakers    Male Speakers
Number of speakers                         561              536
Number of turns                        250,955          179,844
Number of words spoken               2,593,452        1,714,443
Number of turns per speaker             447.33           335.53
Number of words per turn                 10.33             9.53

Unfortunately, this particular table covers all the contributions by all those involved in any of the conversations, not just the people who were recording all of their output during two days. However, we can look at the results in a couple of ways that bear on the sexual word budget question.

1. Average words per day per speaker (both produced and heard):
    1/2 days * 4,552,555 words / 153 speakers = 14,878
2. Average words spoken per day, assuming equal conversational shares:
    14,878/2 = 7,439
3. Ratio of words per female speaker to words per male speaker (recorder-wearers or not):
    (2593452/561)/(1714443/536) = 1.45    (i.e. 45% more per speaker for females)
4. Inferred female and male daily word totals:
    male = 14,878/2.45 = 6073 words per day.
    female = 14,878-6073 = 8805 words per day.
[Note that this is a doubly indirect estimate -- but the data is out there, so we should be able to get exact counts, including an estimate of the overlap in the female vs. male distributions!]
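For anyone who wants to retrace steps 1-4, here is a minimal Python sketch of the same arithmetic. The variable names are illustrative, and the rounding (daily total to a whole word, ratio to two decimals) is an assumption chosen to mirror the figures as printed above. The total of 4,552,555 words is taken directly from step 1; it is larger than the sum of the two "words spoken" cells in Table 1, and the sketch does not try to account for that gap.

```python
# Figures copied from the text and from Table 1.
TOTAL_WORDS = 4_552_555      # corpus total used in step 1
RECORDER_WEARERS = 153       # speakers who carried a recorder
DAYS = 2                     # recording period per wearer

FEMALE_WORDS, FEMALE_SPEAKERS = 2_593_452, 561
MALE_WORDS, MALE_SPEAKERS = 1_714_443, 536

# Step 1: words per day per recorder-wearer (produced and heard).
per_day_total = round(TOTAL_WORDS / DAYS / RECORDER_WEARERS)

# Step 2: words spoken per day if each party contributes half.
per_day_spoken = per_day_total / 2

# Step 3: per-speaker female/male ratio, rounded as in the text.
ratio = round((FEMALE_WORDS / FEMALE_SPEAKERS) / (MALE_WORDS / MALE_SPEAKERS), 2)

# Step 4: split the daily total in the proportion ratio : 1.
male_per_day = round(per_day_total / (1 + ratio))
female_per_day = per_day_total - male_per_day

print(per_day_total, per_day_spoken, ratio, male_per_day, female_per_day)
```

Using the unrounded ratio instead of 1.45 shifts the split by only a dozen words or so, which is well inside the uncertainty of an estimate this indirect.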

The table also indicates that the female participants took an average of 33% more turns (per speaker) than the male participants, and used 8.4% more words per turn, as well as using 45% more words. I'm not sure why this result is different from those of the studies surveyed by James and Drakich (where males generally took more turns and used more words), though perhaps a clue can be found in the fact that the eight words "most characteristic of male speech" (by χ2 value) in the BNC conversational corpus were fucking, er, the, yeah, aye, right, hundred, fuck, while the corresponding list for the women was she, her, said, n't, I, and, to, cos. In other words, I suspect a difference in formality, and probably also in class, between the BNC collection and the (much smaller) studies surveyed by James and Drakich. (I should say also that these counts seem a bit low to me. At overall conversational speech rates (from all parties in a given conversation) of about 200 WPM, 15,000 words would be only about an hour and fifteen minutes of talking time. Unless the participants led rather lonely lives, perhaps they didn't have their Walkmen turned on all the time, or didn't get permission from all their interlocutors?)

In any case, 6,073 vs. 8,805 is a far cry from 7,000 vs. 20,000 -- and if we looked into the details further, I suspect we'd find a very large overlap between male and female daily word counts in the BNC conversational corpus. (This will be easy enough to check, as mentioned above.) But in any case, the differences between this result and the studies surveyed by James and Drakich remind us that these numbers represent not the automatic consequence of chromosomes and hormones, but rather contingent actions of complex individuals in varied contexts of interaction. Who are influenced by their genes and hormones, sure enough. But still.]

[Update: more on this here; and a list of links to other relevant posts can be found here.]

[Update 2/24/2007 -- Jeff Allen writes:

There is a quote by James Dobson, from a June 2004 column, and then back to his 1993 book. I recall hearing James Dobson use the same example as the cited June 2004 text in a radio broadcast sometime during the period of 1984-1988.]


Posted by Mark Liberman at 09:22 AM

Neuroscience in the service of sexual stereotypes

It's recently fashionable for books and articles to enlist neuroscience in support of the view that men and women are essentially and unavoidably different, not just in size and shape, but also in just about every aspect of the way they see, hear, feel, talk, listen and think. These works tend to confirm our culture's current stereotypes and prejudices, and the science they cite is often overinterpreted, and sometimes seems simply to have been made up. I recently discussed an example from Leonard Sax's book Why Gender Matters ("Are men emotional children?", 6/24/2006), which David Brooks has used to support an argument for single-sex education. The latest example of this genre, released August 1, is Louann Brizendine's book "The Female Brain".

Here's what its jacket blurb says:

Every brain begins as a female brain. It only becomes male eight weeks after conception, when excess testosterone shrinks the communications center, reduces the hearing cortex, and makes the part of the brain that processes sex twice as large.
Louann Brizendine, M.D. is a pioneering neuropsychiatrist who brings together the latest findings to show how the unique structure of the female brain determines how women think, what they value, how they communicate, and who they’ll love. Brizendine reveals the neurological explanations behind why
• A woman uses about 20,000 words per day while a man uses about 7,000
• A woman remembers fights that a man insists never happened
• A teen girl is so obsessed with her looks and talking on the phone
• Thoughts about sex enter a woman’s brain once every couple of days but enter a man’s brain about once every minute
• A woman knows what people are feeling, while a man can’t spot an emotion unless somebody cries or threatens bodily harm
• A woman over 50 is more likely to initiate divorce than a man
Women will come away from this book knowing that they have a lean, mean communicating machine. Men will develop a serious case of brain envy.

I looked through the book to try to find the research behind the 20,000-vs.-7,000-words-per-day claim, and I looked on the web as well, but I haven't been able to find it yet. Brizendine also claims that women speak twice as fast as men (250 words per minute vs. 125 words per minute). These are striking assertions from an eminent scientist, with big quantitative differences confirming the standard stereotype about those gabby women and us laconic guys. The only trouble is, I'm pretty sure that both claims are false.

With respect to the speech rate claim, I've just run a script on a corpus of 5,202 transcribed and time-aligned telephone conversations, involving native speakers of American English with a wide variety of ages, regions and backgrounds. The average speech rate for the males was 174.3 wpm, and the average speech rate for the females 172.6 wpm. I assume that Brizendine didn't just concoct her figures about male vs. female speech rates out of thin air -- she must have gotten them from a study that someone did somewhere, sometime, or at least from some other author plugging another work in the flourishing genre of pop gender studies -- but let's say, at least, that it ain't necessarily so. I'll post something more about Brizendine's striking speaking-rate and words-per-day claims as soon as I can figure out what evidence she based them on. [More on female and male speaking rates is here, and more on the number of words men and women typically speak per day is here.]
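The script and the corpus format are not described here, so the following is only a rough Python sketch of how a words-per-minute comparison can be computed from time-aligned transcripts. It assumes a hypothetical layout in which each segment is a tuple of (speaker group, start time in seconds, end time in seconds, transcribed text), and it pools words and speaking time within each group; averaging per speaker or per conversation instead would be an equally plausible choice and could give slightly different numbers.

```python
from collections import defaultdict

def speech_rates(segments):
    """Return words per minute for each speaker group.

    segments: iterable of (group, start_sec, end_sec, text) tuples.
    This layout is a hypothetical stand-in for a real corpus format.
    """
    words = defaultdict(int)
    seconds = defaultdict(float)
    for group, start, end, text in segments:
        words[group] += len(text.split())   # whitespace-delimited word count
        seconds[group] += end - start       # time spent speaking
    return {g: 60.0 * words[g] / seconds[g] for g in words if seconds[g] > 0}

# Toy data, invented purely for illustration.
toy_segments = [
    ("F", 0.0, 2.0, "well I think so too"),
    ("M", 2.0, 4.0, "yeah that sounds about right"),
    ("F", 4.0, 6.0, "okay"),
]
toy_rates = speech_rates(toy_segments)
print(toy_rates)
```

A real comparison would also need decisions this sketch skips, such as how to treat filled pauses, partial words and overlapping speech.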

There certainly are psychological and neurological differences between men and women, sometimes big ones. But even when they aren't promoting their ideas on the basis of "facts" that are apparently false, authors like Sax and Brizendine use a set of rhetorical tricks that tend to make sex differences seem bigger and more consequential than they really are. You can do it too, if you want -- just choose phenomena that emphasize differences, leaving out the ones where the sexes are more similar; pick studies that find stereotypic differences, leaving out the ones whose results disagree; and in all cases, talk and write as if (even relatively small) differences in group averages were essential characteristics of every member of each group.

There's a great example of that last trick in the following passage from a lecture by Dr. Brizendine called "The Teen Girl Brain", available in video form on her UCSF web site:

We have a lot more sensitivity and emotional awareness than the typical male,
and- and- and hugely better ability to read facial expressions,
um and also tone of voice.
At birth, baby girls are able to hear a lot more um
um in a lot larger range
uh in the human voice range of frequency
in the human voice
than boys are able to. So ((you)) remember that story that I told you about "Don't touch"?
Sometimes a- a mother will be yelling at their boy
or at least- sh- how many mothers in the room have had to like scream at your boys to stop doing something?
He may not have heard you! I mean, real- he may not just be being bad, or whatever -- he may just not literally have heard you when your voice was a little bit quieter.
So he may just *need* a louder voice.
Um and there are more talking and listening circuits in the female brain.

Let's zero in on the business about differences in hearing sensitivity. In her book, Brizendine puts it like this (p. 17): "Just as bats can hear sounds that even cats and dogs cannot, girls can hear a broader range of sound frequency and tones in the human voice than can boys." If we take this literally, it's nonsense. In the first place, it's simply false that girls' frequency range compares to boys' like bats to dogs, and as far as I can tell, none of the sources that she cites even suggest anything of the kind. In the second place, all the communicatively-relevant information in speech is well within the frequency range even of normal adults, who have started to lose high-frequency hearing compared to children of both sexes. But let's give Brizendine the benefit of the doubt, and interpret her as talking about a sex difference in auditory sensitivity across the shared frequency range of normal hearing.

This sex difference really exists. It's been known for half a century that girls and women have more sensitive hearing, on average, than boys and men. But those two little words "on average" are crucial. If you pick a man and a woman (or a boy and a girl) at random, the chances are about 6 in 10 that the girl's hearing will be more sensitive -- but about 4 in 10 that the boy's hearing will be more sensitive. Not only that, but the expected value of the sensitivity difference is extremely small: at 1,000 Hz, our randomly-selected girl's threshold will be about 1.1 dB lower than our randomly-selected boy's threshold; at 1,500 Hz, the difference will be about 2 dB. By comparison the JND ("just noticeable difference") for soft sounds is about 1 dB.

So if boys are really less attentive to their mothers than girls are, the difference is not very likely to be due to differences in hearing sensitivity. (And given how cavalier Dr. Brizendine is about the audiology, I'm not prepared to trust her generalizations about gendered behavior without checking and evaluating the references.) This is exactly the same type of statistical sleight-of-hand that Leonard Sax used to argue that men fail to mature emotionally in the way that women do ("Are men emotional children", 6/24/2006). Sax started from a small (and statistically marginal) experiment that claimed a small difference in average rates of change with age, where individual differences were very large compared to the claimed group difference, and blew it up into a claim about all individual men and all individual women. Brizendine has taken a well-established but small difference in average auditory sensitivity, where the between-group difference in means is about half a standard deviation, and turned it into a claim about the essential nature of all individual human males and females.
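The link between "about half a standard deviation" and "about 6 in 10" can be checked with a few lines of Python. This is a sketch under simplifying assumptions that are not Corso's own analysis: both groups are modeled as normal distributions with the same standard deviation, and the function names are made up for the illustration. Under those assumptions, the difference between two independent draws is itself normal, so the chance that a random member of the more sensitive group beats a random member of the other group is the normal CDF evaluated at d divided by the square root of 2, where d is the gap between the means in standard-deviation units.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_random_pair_favors_group(d):
    """Chance that a random member of the lower-threshold group has a
    lower threshold than a random member of the other group.

    Assumes two normal distributions with equal standard deviations
    whose means differ by d standard deviations.
    """
    return normal_cdf(d / math.sqrt(2.0))

p_half_sd = prob_random_pair_favors_group(0.5)
print(round(p_half_sd, 2))
```

With d = 0.5 the result comes out at roughly 0.64, and with d = 0 it is exactly one half, which is the sense in which a half-standard-deviation gap leaves the two groups mostly overlapping.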

Unless you're interested in the details, you can tune out now -- but for the record, here are the facts.

The standard early reference on sex differences in auditory sensitivity is John F. Corso, "Age and Sex Differences in Pure-Tone Thresholds", The Journal of the Acoustical Society of America, 31(4), pp. 498-507 (1959). The more recent studies are consistent with his results, and he gives a lot of detailed numbers, so I'll rely on the data in his paper. Corso measured how loud tones of different frequencies had to be for subjects to hear them, across a range of frequencies from 250 Hz to 8,000 Hz. He tested a large number of males and females of different ages from the students, faculty and staff at Penn State.

Corso didn't examine infants and children -- his lower age bracket is 18-24. It's hard to test the hearing of babies and toddlers -- it's generally done using techniques like "auditory brainstem response" (ABR), "auditory evoked potentials" (AEPs), "spontaneous otoacoustic emissions" (SOAEs) and "click-evoked otoacoustic emissions" (CEOAEs). (For a survey, see Dennis McFadden, "Masculinization Effects in the Auditory System", Archives of Sexual Behavior, Vol. 31, No. 1, February 2002.) None of these techniques makes it easy to quantify differences in hearing sensitivity in a way that would let us draw conclusions about how likely a one- or two-year-old would be to respond to a sound of a given intensity. I haven't been able to find any studies suggesting that the sex difference in auditory sensitivity is greater for toddlers than it is for college students, so I'll take Corso's 18-24-year-old group as representative of the likely differences for children. After eliminating people who had a history of excessive noise exposure, or other indications of possible environmental hearing loss, Corso had 62 men and 146 women in that age bracket.

Here's his summary table:

Here's a plot showing the distribution of thresholds expected on the basis of Corso's parameters for sensitivity to tones at 1,000 Hz in the 18-24-year-old age range, assuming normal distributions:

In fact there is evidence in his paper that the distributions are somewhat skewed towards the higher-threshold end; but this will not change the degree of overlap much.

[More here.]

Posted by Mark Liberman at 07:33 AM

August 05, 2006

Free verbs

In a comment at Language Hat's site, Eimear Ní Mhéalóid observes that "In Irish, the equivalent of the passive voice is referred to as an briathar saor, "the free verb", a far more appealing term I think." As Eimear goes on to suggest, the Irish "autonomous" or "free" verb is not quite equivalent to the English passive. Greene (quoted here) writes:

All tenses of the verb, however, have an impersonal form usually called in Irish grammars the autonomous, which often corresponds to an English passive, though its more exact equivalent can be found in French and German constructions with on and man respectively; briseadh an fhuinneog means ‘the window was broken (by somebody or something)’.

and I gather that both in terms of word order and in terms of case marking, the associated noun (here "the window") remains an object, with the subject being unexpressed. But independent of the analogy to Irish, the term "free verb" has a lot of potential. Almost all the common "free" terms have positive lexical associations, even when their referents are controversial: "free enterprise", "free jazz", "free love", "free market", "free software", "free speech", "freestyle", "free thinker", "free verse", "free will", "free world". The only (partial) exceptions that come to mind are "free loader", "free lunch", "free radical" and "free ride". And the trail to "free" renaming has been blazed by "liberty cabbage" and "freedom fries". The direct and vigorous free verb. Liberated from the accusative tyranny of the object. I like it.

Posted by Mark Liberman at 08:48 AM

The direct and vigorous hyptic voice

Big close-up of a man's face, "his short hair parted neatly in the middle and combed down over his forehead, his eyes blinking incessantly behind steel-rimmed spectacles as though he had just emerged into strong light, his lips nibbling each other like nervous horses, his smile shuttling to and fro under a carefully edged mustache."

No, it's not Jack Nicholson in the opening shot of The Shining II. This is Will Strunk, professor of English at Cornell University in 1919, as his student Elwyn Brooks White remembered him in 1979.

The dream sequence continues:

"Omit needless words!" cries the author on page 23, and into that imperative Will Strunk really put his heart and soul. In the days when I was sitting in his class, he omitted so many needless words, and omitted them so forcibly and with such eagerness and obvious relish, that he often seemed in the position of having shortchanged himself — a man left with nothing more to say yet with time to fill, a radio prophet who had out-distanced the clock. Will Strunk got out of this predicament by a simple trick: he uttered every sentence three times. When he delivered his oration on brevity to the class, he leaned forward over his desk, grasped his coat lapels in his hands, and, in a husky, conspiratorial voice, said, "Rule Seventeen. Omit needless words! Omit needless words! Omit needless words!"

This is a curiously self-refuting oration. Since Professor Strunk had already distributed his list of rules in printed form to the class, he could simply have written "Rule 17" on the blackboard, pointed to his booklet, and impressed the students with that display of blinking, lip-nibbling and smile-shuttling. If speech were really needed after all the facial calisthenics, one repetition of "omit needless words" might have sufficed. Performing the injunction three times suggests that words can be useful even when they are strictly speaking redundant.

But Strunk specialized in self-refuting advice. When he told us that "Many a tame sentence of description or exposition can be made lively and emphatic by substituting a transitive in the active voice for some such perfunctory expression as there is, or could be heard", he framed the advice in a sentence whose verbal core "can be made" is not only in the passive voice, but is also exactly analogous to the deprecated wording "could be heard".

Does Strunk follow his own Rule 11, "Use the active voice", elsewhere in the 1918 edition of Elements of Style? Well, let's take a look at what he says about Rule 9, "Make the paragraph the unit of composition":

If the subject on which you are writing is of slight extent, or if you intend to treat it very briefly, there may be no need of subdividing it into topics. Thus a brief description, a brief summary of a literary work, a brief account of a single incident, a narrative merely outlining an action, the setting forth of a single idea, any one of these is best written in a single paragraph. After the paragraph has been written, it should be examined to see whether subdivision will not improve it.

Ordinarily, however, a subject requires subdivision into topics, each of which should be made the subject of a paragraph. The object of treating each topic in a paragraph by itself is, of course, to aid the reader. The beginning of each paragraph is a signal to him that a new step in the development of the subject has been reached.

There are 11 tensed verbs in these two paragraphs, and only two of them are "transitive in the active voice". One is an active intransitive, five are passives, and the remaining two are forms of be, one in the deprecated there-construction. In fact, nowhere in the 1918 Elements of Style does Strunk try very hard to implement his assertion that "The active voice is usually more direct and vigorous than the passive". The end of the section on paragraphing, for example, features four passives, two copulas, and one (weakly intransitive) active-voice verb:

As a rule, single sentences should not be written or printed as paragraphs. An exception may be made of sentences of transition, indicating the relation between the parts of an exposition or argument.

In dialogue, each speech, even if only a single word, is a paragraph by itself; that is, a new paragraph begins with each change of speaker. The application of this rule, when dialogue and narrative are combined, is best learned from examples in well-printed works of fiction.

It seems that Will Strunk, who was certainly a direct and vigorous writer, was at least as fond of the direct and vigorous hyptic voice as his direct and vigorous contemporary Winston Churchill was. Vigor is as vigor does.

Posted by Mark Liberman at 07:13 AM

August 04, 2006

Who died and made you the king of snowclones?

Mary Karr, in the NYT Book Review of 7/30/06, p. 4, rebuking Ben Kunkel for his "rant against memoir":

Who died and put Ben Kunkel in charge of what memoirists are supposed to do?

Ah, a Who Died And snowclone, with an especially interesting semantics.

The form of the snowclone is

Who died and VPAST X Y?

where VPAST is

put, left, made, appointed, named, elected,...

and X denotes a person, group of people, or institution, Y a position of power and/or responsibility, often described hyperbolically ("king", "God").  (If the verb is "put", then Y is almost always "in charge (of Z)".)  The figure conveys roughly 'X is not, should not be, does not deserve to be Y', in a context where X is seen to be claiming to be Y or others are claiming this on X's behalf.  In the Karr example: 'Ben Kunkel is not in charge of what memoirists are supposed to do (and shouldn't act as if he is)'.

Some more examples:

David Reinhard of the Portland Oregonian addressed a relevant question to Bill Keller this past week: "Who died and left you president of the United States?" (link)

Who Died and Made Google King? (link)

Who died and made us the world's policeman? (link)

Who died and appointed him eternal King and Lord Emporer over the rest of you lesser folk? (Note: that was a rhetorical question; please do not flood my ... (link)

Pardon my indignation, but who died and named you the Chief Statistician? (link)

Who died and elected Junior McGee as dictator of Milwaukee? (link)
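
The template above can be approximated with a regular expression. What follows is only a rough sketch: the verb list is the open-ended one given above, the names VPAST, WHO_DIED_AND and find_who_died_and are invented for illustration, and the pattern makes no attempt to separate X from Y, which would take real parsing.

```python
import re

# Verbs listed above for the VPAST slot. The original list ends in "...",
# so treating it as closed is an assumption made for this sketch.
VPAST = ["put", "left", "made", "appointed", "named", "elected"]

# Hypothetical pattern: "who died and" + VPAST + everything up to the "?".
# The remainder holds both X and Y; splitting them is not attempted here.
WHO_DIED_AND = re.compile(
    r"\bwho\s+died\s+and\s+(?P<vpast>" + "|".join(VPAST) + r")\s+(?P<rest>[^?]+)\?",
    re.IGNORECASE,
)

def find_who_died_and(text):
    """Return (verb, remainder) pairs for each candidate match in text."""
    return [
        (m.group("vpast").lower(), m.group("rest").strip())
        for m in WHO_DIED_AND.finditer(text)
    ]

examples = [
    "Who died and put Ben Kunkel in charge of what memoirists are supposed to do?",
    "Who Died and Made Google King?",
    "Pardon my indignation, but who died and named you the Chief Statistician?",
]
for sentence in examples:
    print(find_who_died_and(sentence))
```

A pattern like this will miss variants with other verbs or intervening material, so it is better suited to collecting candidates for inspection than to counting instances.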

The Who Died And snowclone is of the form of a wh question with "who" as subject and a coordinate VP, with conjuncts "died" and VPAST X Y.  So its literal interpretation would be

who is the x such that x died and x VPAST X Y?
e.g., who is the x such that x died and x made Google king?

But this can't be right, since it's not x but x's death that would cause X to be Y; in its literal interpretation, the question has to be understood metonymically, as

who is the x such that x's death VPAST X Y?
e.g., who is the x such that x's death made Google king?

Of course, the question isn't understood literally; as one of the writers above points out, it's a rhetorical question, conveying a denial:

there is no x such that x's death VPAST X Y
e.g., there is no x such that x's death made Google king

This, in turn, is meant as an understatement; not only did SOMEONE's DEATH not V X Y, NOTHING VPAST X Y:

there is no event that VPAST X Y
e.g., there is no event that made Google king

which then conveys the still stronger:

X is not Y, and there is no cause/reason for X to be Y
e.g., Google is not king, and there is no cause/reason for Google to be king

In contexts where X claims to be Y (or is claimed to be Y by other people), this conveys:

X should not act as if X is Y
e.g., Google should not act as if it is king

or even:

X should stop acting as if X is Y; that is, X, stop acting as if you're Y!
e.g., Google should stop acting as if it is king; that is, Google, stop acting as if you're king!

When Y is hyperbolic, the hyperbole has to be unpacked as well.  In the King Google example, for instance, we will understand "be king" as conveying something like 'be in charge of everything', so that "Who died and made Google king?" conveys something along the lines of 'Google, stop acting as if you're in charge of everything!'

We are finally home, after several steps of calculating conveyed meaning.  This is a marvel of indirect conveying of meaning.

As with any snowclone, there's a history to be uncovered, one going back at least to a MODEL formulaic expression (cliché, striking quotation, proverb or saying, catchphrase, slogan, or memorable name or title) that served as the basis for the generalization to other Vs, Xs, and Ys -- perhaps "Who died and made you king?", uttered or written in some memorable context.   At the moment, I haven't a clue about what the model was and what events caused it to spread and so become widely available for generalization.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:29 PM

Great news from Google research

See here: "All our N-gram are belong to you".

Posted by Mark Liberman at 01:00 PM

When men were men, and verbs were passive

Over the past few weeks, we've been discussing America's growing anxiety about passivity. That's the verbal voice, not the attitude towards life, though the composition mavens sometimes get the two mixed up. Arnold Zwicky found that the Avoid Passive rule originated in U.S. composition handbooks early in the 20th century (perhaps first in Strunk's 1918 Elements of Style), along with a metaphorical association between passive verbs and weakness. Today, after three generations of anti-passive propaganda, most American students are taught to "strengthen your verbs" to stimulate "active thinking and writing", and to avoid the "excessively wordy, weak" prose (and hairy palms?) caused by "the first deadly sin: passive voice".

Because George Orwell recommended Passive Avoidance in his essay "Politics and the English Language", Geoff Pullum quoted an ironic observation from The Merriam-Webster Dictionary of English Usage: "Bryant 1962 reports three statistical studies of passive versus active sentences in various periodicals; the highest incidence of passive constructions was 13 percent. Orwell runs to a little over 20 percent in 'Politics and the English Language.'" Since Strunk & White provide another of the streams feeding the massive river of contemporary anti-passivity, I checked a couple of pages of E.B. White's prose, and found 21% passives. Yesterday, as I read Winston Churchill's The River War in search of collective nouns, I was struck by the frequency of passive verbs. And as you'll see below, the numbers back me up -- in the passages I checked, Churchill uses passive verbs about as often as active ones. But Churchill, even more than Orwell, Strunk and White, is a model of forceful eloquence. Should 21st-century composition teachers reverse course, and advise their students to bulk up on passives so as to develop powerful, muscular prose?

The opening paragraph of The River War has about nine tensed verbs, depending on how you count (does "is drained and watered" count for one or two?). Of these, four (or perhaps five) are passive (in red below), three are active (in blue below), and two involve forms of "to be" that perhaps should not count one way or the other:

The north-eastern quarter of the continent of Africa is drained and watered by the Nile. Among and about the headstreams and tributaries of this mighty river lie the wide and fertile provinces of the Egyptian Soudan. Situated in the very centre of the land, these remote regions are on every side divided from the seas by five hundred miles of mountain, swamp, or desert. The great river is their only means of growth, their only channel of progress. It is by the Nile alone that their commerce can reach the outer markets, or European civilisation can penetrate the inner darkness. The Soudan is joined to Egypt by the Nile, as a diver is connected with the surface by his air-pipe. Without it there is only suffocation. Aut Nilus, aut nihil!

One of the three active verbs (lie) is intransitive and has no passive counterpart. So depending on how we count things, this is something between 4/9 (44%) and 5/7 (71%) passive verbs. Let's be conservative and say that 44% of the tensed clauses are headed by a passive verb, while 3/9 (33%) have an active verb as head.

But wait, you say -- Churchill is just laying out the geography. Once he starts describing the actions of men at war, those active verbs will surely spring up on every side. But not so. Consider this vigorous and forceful passage:

The known strength of the Khalifa made it evident that a powerful force would be required for the destruction of his army and the capture of his capital. The use of railway transport to some point on the Nile whence there was a clear waterway was therefore imperative. [...] The route via Abu Hamed was selected by the exclusion of the alternatives. [...] The plan was perfect, and the argument in its favour conclusive. It turned, however, on one point: Was the Desert Railway a possibility? With this question the General was now confronted. He appealed to expert opinion. Eminent railway engineers in England were consulted. They replied with unanimity that, having due regard to the circumstances, and remembering the conditions of war under which the work must be executed, it was impossible to construct such a line. Distinguished soldiers were approached on the subject. They replied that the scheme was not only impossible, but absurd. Many other persons who were not consulted volunteered the opinion that the whole idea was that of a lunatic, and predicted ruin and disaster to the expedition. Having received this advice, and reflected on it duly, the Sirdar ordered the railway to be constructed without more delay.


Lieutenant Girouard, to whom everything was entrusted, was told to make the necessary estimates. Sitting in his hut at Wady Halfa, he drew up a comprehensive list. Nothing was forgotten. Every want was provided for; every difficulty was foreseen; every requisite was noted. The questions to be decided were numerous and involved. How much carrying capacity was required? How much rolling stock? How many engines? What spare parts? How much oil? How many lathes? How many cutters? How many punching and shearing machines? What arrangements of signals would be necessary? How many lamps? How many points? How many trolleys? What amount of coal should be ordered? How much water would be wanted? How should it be carried? To what extent would its carriage affect the hauling power and influence all previous calculations? How much railway plant was needed? How many miles of rail? How many thousand sleepers? Where could they be procured at such short notice? How many fishplates were necessary? What tools would be required? What appliances? What machinery? How much skilled labour was wanted? How much of the class of labour available? How were the workmen to be fed and watered? How much food would they want? How many trains a day must be run to feed them and their escort? How many must be run to carry plant? How did these requirements affect the estimate for rolling stock? The answers to all these questions, and to many others with which I will not inflict the reader, were set forth by Lieutenant Girouard in a ponderous volume several inches thick; and such was the comprehensive accuracy of the estimate that the working parties were never delayed by the want even of a piece of brass wire.

25 passives, 12 actives, 11 copulas: we have 25/48 = 52% passives, 12/48 = 25% actives.

OK, you'll say, people are making decisions and plans in that passage; but what about the fighting? Won't we get more active verbs then? Maybe. Here's the start of the battle of the Atbara:

During the halt the moon had risen, and when at one o'clock the advance was resumed, the white beams revealed a wider prospect and, glinting on the fixed bayonets, crowned the squares with a sinister glitter. For three hours the army toiled onwards at the same slow and interrupted crawl. Strict silence was now enforced, and all smoking was forbidden. The cavalry, the Camel Corps, and the five batteries had overtaken the infantry, so that the whole attacking force was concentrated. Meanwhile the Dervishes slept.

At three o'clock the glare of fires became visible to the south, and, thus arrived before the Dervish position, the squares, with the exception of the reserve brigade, were unlocked, and the whole force, assuming formation of attack, now advanced in one long line through the scattered bush and scrub, presently to emerge upon a large plateau which overlooked Mahmud's zeriba from a distance of about 900 yards.

It was still dark, and the haze that shrouded the Dervish camp was broken only by the glare of the watch-fires. The silence was profound. It seemed impossible to believe that more than 25,000 men were ready to join battle at scarcely the distance of half a mile. Yet the advance had not been unperceived, and the Arabs knew that their terrible antagonists crouched on the ridge waiting for the morning. For a while the suspense was prolonged. At last, after what seemed to many an interminable period, the uniform blackness of the horizon was broken by the first glimmer of the dawn. Gradually the light grew stronger until, as a theatre curtain is pulled up, the darkness rolled away, the vague outlines in the haze became definite, and the whole scene was revealed.

10 passives, 15 actives, and 3 copulas: we're down to 10/28 = 36% passives versus 15/28 = 54% actives.

For another sample, here's Churchill's description of the action at Om Debreikat that finishes the Khalifa:

After about an hour the sky to the eastward began to grow paler with the promise of the morning and in the indistinct light the picquets could be seen creeping gradually in; while behind them along the line of the trees faint white figures, barely distinguishable, began to accumulate. Sir Reginald Wingate, fearing lest a sudden rush should be made upon him, now ordered the whole force to stand up and open fire; and forthwith, in sudden contrast to the silence and obscurity, a loud crackling fusillade began. It was immediately answered. The enemy's fire flickered along a wide half-circle and developed continually with greater vigour opposite the Egyptian left, which was consequently reinforced. As the light improved, large bodies of shouting Dervishes were seen advancing; but the fire was too hot, and their Emirs were unable to lead them far beyond the edge of the wood. So soon as this was perceived Wingate ordered a general advance; and the whole force, moving at a rapid pace down the gentle slope, drove the enemy through the trees into the camp about a mile and a half away. Here, huddled together under their straw shelters, 6,000 women and children were collected, all of whom, with many unwounded combatants, made signals of surrender and appeals for mercy. The 'cease fire' was sounded at half-past six. Then, and not till then, was it discovered how severe the loss of the Dervishes had been. It seemed to the officers that, short as was the range, the effect of rifle fire under such unsatisfactory conditions of light could not have been very great. But the bodies thickly scattered in the scrub were convincing evidences. In one space not much more than a score of yards square lay all the most famous Emirs of the once far-reaching Dervish domination. The Khalifa Abdullah, pierced by several balls, was stretched dead on his sheepskin; on his right lay Ali-Wad-Helu, on his left Ahmed Fedil.

10 passives, 11 actives, 5 copulas and similar things. 10/26 = 38% of the tensed clauses are headed by passives; 11/26 = 42% by actives. Grand total for the samples of Churchill in this post, counting the opening paragraph conservatively: 49 passives (44%), 41 actives (37%), 21 copulas (19%).
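
For anyone who wants to check the arithmetic, the per-sample counts quoted in this post can be tallied mechanically. This is a small sketch, not part of the original analysis: the sample labels and the helper name shares are made up for illustration, and the opening paragraph is entered with the conservative count of 4 passives, 3 actives and 2 forms of "to be".

```python
# (passives, actives, copulas) for each sample, as counted in the post.
# The labels are informal names chosen for this sketch.
samples = {
    "opening paragraph": (4, 3, 2),   # conservative count: 4 passives of 9
    "Desert Railway": (25, 12, 11),
    "Atbara": (10, 15, 3),
    "Om Debreikat": (10, 11, 5),
}

def shares(passives, actives, copulas):
    """Return each count as a whole-number percentage of the sample total."""
    total = passives + actives + copulas
    return tuple(round(100 * n / total) for n in (passives, actives, copulas))

for name, counts in samples.items():
    print(name, counts, shares(*counts))

# Column-wise sums across all four samples.
totals = tuple(sum(column) for column in zip(*samples.values()))
print("all samples", totals, shares(*totals))
```

Counting "is drained and watered" as two passives instead of one would raise the passive total by one and shift the percentages only slightly.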

I'm not seriously advising composition students to increase their use of passive verbs. They should write clearly, and let the verbs fall where they may. But the passive voice definitely needs some better PR, if only among writing teachers.

Perhaps we should start with a lexical make-over. We could try replacing the word passive with a completely new borrowing from a classical language, like the "hyptic voice". (Greek ὕπτιος meant "laid on one's back; turned upside down; backwards", and was also sometimes used to refer to the passive voice of verbs.) This might work -- hyptic is a little weird, but there are useful resonances with hip and hypnotic. Or we could try a positive-sounding name based on the value of the passive in focusing different thematic roles -- "thematic verbs" or "the focusing voice". We could say, "use thematic verbs to maintain the velocity of your narrative". Or, "seize and hold your readers' attention with the focusing voice".

I'm not very good at this naming business, so let's have a Rename the Passive contest. If you've got a great idea, let me know. The winner gets a year's subscription to Language Log, a lifetime supply of by-phrases, and other exciting prizes.

Posted by Mark Liberman at 07:33 AM

August 03, 2006

Been anything so long it looks like not to me

Joe Boyd reports that on the CBC radio program The Current, a guest expert recently said

"If Cuba's been nothing, it's been stable these last ten years."

As Joe observes, this is probably a blend of "Cuba's been nothing if not stable these last ten years", and "If Cuba's been anything, it's been stable these last ten years" (though it might also be a shortened form of "If Cuba's been nothing else, it's been stable these last ten years".) It's also at least technically an overnegation, since you can fix it by changing "nothing" to "anything".

I couldn't find any other similar examples on line. But the blend in the opposite direction seems to be fairly common -- in this case, it would have yielded "Cuba has been anything if not stable these last ten years". Some examples caught in the wilds:

But President Bush has been anything if not a political contrarian on the state of the US economy.
... stalwart jazz pianist Brad Mehldau has been anything if not prolific, with 2005's Day Is Done making it an even dozen albums...
Clinton has been anything if not inconsistent on a whole host of issues - most importantly her wavering 'me too' stance on the war...
For the past month, rather all year, my two Daniels have been anything if not predictable.

OK, everybody, crunch time. Listen up:

Do you not believe that, through the act of immigration, our nation has not been anything if not improved?

Which side is this guy on? I wouldn't like not to think it wasn't mine.

Posted by Mark Liberman at 02:54 PM

Each had his own strange tale to tell

I agree with Sally Thomason that Winston Churchill didn't write the passage attributed to him in a recent Dear Abby column. However, I wasn't sure whether Churchill reliably conformed with the prescription against use of they with each, which Sally takes as determinative evidence. He might, after all, have written like Jane Austen. So in order to look into the question further, I devoted my breakfast reading this morning to Churchill's "The River War: An Account of the Reconquest of the Sudan", described by the Wikipedia entry as an "1899 book written by Winston Churchill while he was still an officer in the British army, a first-hand account of the conquest of the Sudan by the English-Egyptian force under Lord Kitchener".

In this work, everyone usually takes he, and so does "each N":

Everyone held his breath.

... thereafter each man saw the world along his lance, under his guard, or through the back-sight of his pistol; and each had his own strange tale to tell.

At the mosque two fanatics charged the Soudanese escort, and each killed or badly wounded a soldier before he was shot.

I did find one example of everyone with they:

[The steamer] had already arrived, and the sight of the funnel in the distance and the anticipation of a good meal cheered everyone, for they had scarcely had anything to eat since the night before the battle.

This example is interesting, in that substitution of he for they would be incoherent:

??[The steamer] had already arrived, and the sight of the funnel in the distance and the anticipation of a good meal cheered everyone, for he had scarcely had anything to eat since the night before the battle.

However, there weren't enough examples of either sort in this one book to compile a clear enough picture, and my breakfast coffee ran out before I had time to consult any other works. But this is part of the larger question of what pronouns to use with nominal expressions that are grammatically singular but (from some angles) semantically plural, discussed in earlier Language Log posts such as "Collective nouns with singular verbs" (2/5/2005). So to look at Churchill's habits in this area (at least as filtered through the editing process), I looked at what he does with the noun army, which is very common in The River War.

Churchill uses "the army was" and "the army were" equally often in this work -- four times each. For example:

The army was by then occupying Dongola, and was in actual expectation of a Dervish counter-attack, and it was evident that the military operations could not be suspended or arrested.

For two hours the army were the only living things visible on the smooth sand, but at seven o'clock a large body of Dervish horse appeared on the right flank.
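
Counts like these are easy to reproduce on any plain-text copy of the book. Here is a minimal sketch, assuming the text is already loaded as a string; the function name count_collective_agreement is invented for illustration, and the sample string is abridged from the two quotations above rather than taken from the full text.

```python
import re
from collections import Counter

def count_collective_agreement(text, noun="army"):
    """Count 'the <noun> was' versus 'the <noun> were' in text."""
    pattern = re.compile(
        r"\bthe\s+" + re.escape(noun) + r"\s+(was|were)\b",
        re.IGNORECASE,
    )
    return Counter(m.group(1).lower() for m in pattern.finditer(text))

# Abridged from the two example sentences quoted above.
sample = (
    "The army was by then occupying Dongola, and was in actual expectation "
    "of a Dervish counter-attack. For two hours the army were the only "
    "living things visible on the smooth sand."
)
print(count_collective_agreement(sample))
```

The pattern only catches a verb that immediately follows the noun, so cases with intervening material ("the army which the Khedives maintained in the Delta was") would need to be found by hand or with a parser.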

Similarly, he sometimes uses the singular it in connection with "the army", but slightly more often, he uses plural they. Examples with it:

Within three months of its formation the army had its first review.

All the time that the army was operating on the Atbara it drew its supplies from the fort at the confluence...

With the cool of the evening the army left its bed of torment on the ridge and returned to Umdabia.

The army which the Khedives maintained in the Delta was, judged by European standards, only a rabble. It was badly trained, rarely paid, and very cowardly.

By these movements the army, instead of facing south in echelon, with its left on the river and its right in the desert, was made to face west in line, with its left in the desert and its right reaching back to the river.

And with "they":

The army of the Government approached slowly. Their leaders anticipated an easy victory.

In December the army returned to Gallabat, which they commenced to fortify, and their victorious general followed his grisly but convincing despatch to Omdurman, where he received the usual welcome accorded by warlike peoples to military heroes.

The army had now passed beyond the scope of a camel, or other pack-animal, system of supply, except for very short distances, and it was obvious that they could only advance in future along either the railway or a navigable reach of the river, and preferably along both.

The army were now dependent for their existence on the partly finished railway, from the head of which supplies were conveyed by an elaborate system of camel transport.

The distance, ten miles, was accomplished in five hours, and the army reached Hudi in time to construct a strong zeriba before the night. Here they were joined from Atbara fort by Lewis's brigade of Egyptians...

Communications with the Atbara encampment and with Cairo were dropped, and the army carried with them in their boats sufficient supplies to last until after the capture of Omdurman, when the British division would be immediately sent back.

It seems to me (post hoc) that Churchill is choosing it or they according to the feeling of the passage, though I'm not sure that I can make a convincing argument for this view.

Whatever the verdict on they with grammatically singular antecedents, I agree with Sally that Dear Abby's mock-Churchill passage is too clumsy to have been written by the same man who wrote The River War. As stylistic evidence, I chose a sample of passages that mix current relevance with shockingly casual racism. Let's start with Churchill's look at the two sides of nation building:

What enterprise that an enlightened community may attempt is more noble and more profitable than the reclamation from barbarism of fertile regions and large populations? To give peace to warring tribes, to administer justice where all was violence, to strike the chains off the slave, to draw the richness from the soil, to plant the earliest seeds of commerce and learning, to increase in whole peoples their capacities for pleasure and diminish their chances of pain--what more beautiful ideal or more valuable reward can inspire human effort? The act is virtuous, the exercise invigorating, and the result often extremely profitable. Yet as the mind turns from the wonderful cloudland of aspiration to the ugly scaffolding of attempt and achievement, a succession of opposite ideas arises. Industrious races are displayed stinted and starved for the sake of an expensive Imperialism which they can only enjoy if they are well fed. Wild peoples, ignorant of their barbarism, callous of suffering, careless of life but tenacious of liberty, are seen to resist with fury the philanthropic invaders, and to perish in thousands before they are convinced of their mistake. The inevitable gap between conquest and dominion becomes filled with the figures of the greedy trader, the inopportune missionary, the ambitious soldier, and the lying speculator, who disquiet the minds of the conquered and excite the sordid appetites of the conquerors. And as the eye of thought rests on these sinister features, it hardly seems possible for us to believe that any fair prospect is approached by so foul a path.

And here Churchill muses on the degeneration of the Sudanese movement founded by the self-proclaimed Mahdi, Muhammad Ahmad, whom he admired:

All great movements, every vigorous impulse that a community may feel, become perverted and distorted as time passes, and the atmosphere of the earth seems fatal to the noble aspirations of its peoples. A wide humanitarian sympathy in a nation easily degenerates into hysteria. A military spirit tends towards brutality. Liberty leads to licence, restraint to tyranny. The pride of race is distended to blustering arrogance. The fear of God produces bigotry and superstition. There appears no exception to the mournful rule, and the best efforts of men, however glorious their early results, have dismal endings, like plants which shoot and bud and put forth beautiful flowers, and then grow rank and coarse and are withered by the winter. It is only when we reflect that the decay gives birth to fresh life, and that new enthusiasms spring up to take the places of those that die, as the acorn is nourished by the dead leaves of the oak, the hope strengthens that the rise and fall of men and their movements are only the changing foliage of the ever-growing tree of life, while underneath a greater evolution goes on continually.

And this is his description of "the situation in the Soudan for several centuries":

The qualities of mongrels are rarely admirable, and the mixture of the Arab and negro types has produced a debased and cruel breed, more shocking because they are more intelligent than the primitive savages. The stronger race soon began to prey upon the simple aboriginals; some of the Arab tribes were camel-breeders; some were goat-herds; some were Baggaras or cow-herds. But all, without exception, were hunters of men. To the great slave-market at Jedda a continual stream of negro captives has flowed for hundreds of years. The invention of gunpowder and the adoption by the Arabs of firearms facilitated the traffic by placing the ignorant negroes at a further disadvantage. Thus the situation in the Soudan for several centuries may be summed up as follows: The dominant race of Arab invaders was unceasingly spreading its blood, religion, customs, and language among the black aboriginal population, and at the same time it harried and enslaved them.

Coby Lubliner points out that in the last sentence of the last quote, race takes it while population takes them. Churchill was clearly not one of Emerson's "little minds", at least as far as pronominal reference to collective nouns is concerned.

Posted by Mark Liberman at 08:47 AM

August 02, 2006

How language sausages are made

Like so many other older scholars, until fairly recently I had mistakenly thought that blogs were things written by teenagers to communicate with other teenagers. Then one day when my friend, John Lawler, was visiting us here in Montana, he introduced me to Language Log. I loved it, of course, and was thrilled when a few months ago Geoffrey Pullum invited me to post with this group.

Although my enlightenment is increasing, I was surprised at what I learned when I read "Law-Related Blogging Starting to See a Coming of Age" in the August 1, 2006 Chicago Lawyer, written by Douglas A. Berman, a law professor at Ohio State University's Moritz College of Law (see here). What surprised me was that Berman has published over 50 law review articles and commentaries and he estimates that these have been cited only about a half-dozen times in judicial opinions. In contrast, his Sentencing Law & Policy Blog (here) has been cited in more than a dozen cases. He reports:

My blog is my most-cited work, by far. Certainly, it is more widely read than any of my scholarship ... It's all part of the power of the blog ... Blogs help make the legal world move a lot faster. Within a matter of minutes, I can take a new legal development, make it available to the world, and comment on it quickly.

Law-related blogs began only about six years ago. Since then they have caught fire, with over 1,300 of them now up and running. One of Berman's law students even got academic credit for maintaining his own blog. And a law professor claims that he obtained his position at Boston University's school of law because of his blog.

Two of the many things lawyers like about their blogs are that they make their field more collaborative and that they level the playing field by increasing interaction between the well-known legal academics and judges and the less well-known law practitioners.  One of the latter, whose blog is named Ernie the Attorney, calls blogging a "coffee table discussion" on a range of topics:

The big word here is transparency. It's better to let people see how sausages are made, what's going on.

As Language Log readers know by now, we try very hard to help people see how language sausages are made.

Posted by Roger Shuy at 06:10 PM

Did Winston Churchill Use Singular "They"?

I was catching up on pop reading this morning -- specifically the July 31 edition of The Missoulian -- and in Dear Abby's column I found this quotation attributed to Winston Churchill, which Dear Abby offered as part of her answer to an advice seeker:

To each there comes in their lifetime a special moment when they are figuratively tapped on the shoulder and offered the chance to do a very special thing, unique to them and fitted to their talents. What a tragedy if that moment finds them unprepared or unqualified for that which could have been their finest hour.

The sentiment expressed here sounds awfully sappy for Churchill, though for all I know he lapsed into sappiness occasionally (I'm not a serious student of his writings). But it would surprise me if this master of English prose really used all those plural pronouns (their, they, them) with a singular referent (each [person]): Churchill died in 1965, before the feminist movement that helped make the generic use of he unpopular; and even people in my generation wouldn't use "singular they" in formal prose or speech. Well, O.K., pedantic people like me don't use it; normal people in my generation probably do. (If you wanted to use "generic he" in the quotation, you'd replace the plural pronouns with his, he, him and change the plural verb are to singular is.)

So my question was, did Churchill really write this? As usual, I turned to Google for help. The quotation got 27 hits. Most were different newspapers with the same Dear Abby column, but a few were religious and other sites that featured the quotation. Most of them attributed the quotation to Churchill, but without giving a specific source; one site did mention "elaborating on Churchill's words", and that "elaboration" may be the ultimate source of the whole quotation. The end of the quotation had already made me suspect that someone who is not an eminent prose stylist had lifted it from one of Churchill's greatest speeches and used it as the punch line of a passage that never came from Churchill's pen. The closing sentence of Churchill's famous speech of 18 June 1940, at a dark period of World War II, goes like this:

Let us therefore brace ourselves to our duties and so bear ourselves that if the British Empire and its Commonwealth last for a thousand years, men will still say, `This was their finest hour'.

So it is likely that Dear Abby is unfairly maligning Churchill in attributing to him the prose and the sentiment in her column.

The vigorous spread of "singular they" usage is interesting in its own right. It's not new, even in Standard English -- see Geoff Pullum's post on Shakespeare (link below) -- but its use in formal prose has expanded greatly in the past twenty or thirty years. Even though I will (I think) never adopt it myself in formal contexts, it's an excellent solution to the problem of generic usage, now that many or most of us find "generic he" unacceptable. Often you can rewrite a passage with plural referents so that the non-gender-specific plural pronouns are grammatical even in the stuffiest formal Standard English (as in "To all people there comes in their lifetime..." or, possibly better, "To all of us there comes in our lifetime..."); but sometimes the plural makes for awkward prose. When I have to use singular pronouns, for instance in giving word meanings in the Montana Salish dictionary I'm compiling, I sometimes use s/he, but that's not pretty, especially if the gloss has a possessive in it (as in "s/he hit people with his/her fists"). So I look forward to the not very distant future in which "singular they" will be universally accepted as Standard English in both speech and writing. In the meantime, I will probably continue to impose what I perceive as the still-current standard on undergraduates, though with a twinge of conscience. I console myself with the fact that I stopped correcting, or rather "correcting", split infinitives about twenty years ago.

P.S. Maybe I should add a hedge to one assertion up there: do pedants out there object to "their lifetime"? If so, I'm prepared to duke it out in favor of "lifetime" over "lifetimes" in that sentence.

P.P.S. And I can't resist adding two examples from Lucy Thomason's file of weird pronoun usage. The first is a quotation she got from Ives Goddard, who heard it on a news program: The President and the National Security Advisor should be able to express his or her views frankly. And the second is from Jasper Fforde's hilarious book The Eyre Affair: One of the group had their hand up and was determined to have his say. Examples like these seem to me to provide conclusive evidence that the pitfalls of proper pronoun usage are causing many self-conscious users of Standard English to lose their grip entirely. Not Language Loggers, of course; we're above pronominal insecurity. We just say what we want and declare them to be correct.

P.P.P.S. Pronoun usage is hardly a new topic on Language Log. We love to argue about it in Language Log Plaza. See especially the following earlier posts on the general topic:

Shakespeare Used They with Singular Antecedents So There

Singular They with Known Sex

Singular They and Plural He/She/It

Collective Nouns with Singular Verbs and Plural Pronouns

They Are a Prophet

All Lockers Must Be Emptied of Its Contents

Posted by Sally Thomason at 02:25 PM

Effing or fucking?

Yup, to boost LL readership, here's more on Mel Gibson and the "Fucking Jews" tirade. Or should that be "f---ing" or "f***ing" or just a bleep or what? A new one for me in this CNN clip: the reporter quotes Gibson (37 secs into the clip) by uttering "effing," while on the screen we see "F*****g Jews", and in the accompanying article we get "F---ing Jews." The reporter describes himself as giving a direct quote: "allegedly saying, quote, effing jews", but if other news sources are to be believed, Gibson certainly didn't say "effing," but "fucking." Presumably if CNN'd bleeped over the reporter, we'd have the impression that the reporter had himself used a bad word, rather than "effing", which apparently is acceptable. I also like the fact that on the clip "F*****g Jews" is visually superimposed on a torn paper strip shape, delicately suggesting the veracity of a quote pulled in direct from an on-the-spot reporter's notepad.

I wonder how the reporter would have quoted someone saying "motherfucking"? Or for that matter how he would quote Mel's use in the same eloquent improv passage of "sugar tits" to refer to a female officer, a remark which for some reason hasn't been taken up by the press as misogynistic, and for which Gibson has not directly apologized. A schwa for the "i"? "Sugar mammary glands"? Presumably with accompanying "t*ts" on-screen.

And BTW, was Jesus H. Fucking Christ the first Fucking Jew? I bet the Egyptians had some pretty damned fruity locutions for Moses by the time they got to plague 4 ("Arov", "flies" or some such), but history, alas, does not record them. Perhaps in Mel's next carefully researched movie we'll find out.

Full disclosure: I'm a fucking jew, but I don't have sugar tits.

Posted by David Beaver at 11:06 AM

The very cypher of an excuse

Email from Richard Kingston points out something interesting in Mel Gibson's recent apology for his drunk driving and antisemitic tirades. Gibson has blamed loss of control due to alcohol -- normally he keeps his opinions better disguised -- but he spoke about his loss of control in a doubly indirect way. He might have said "I was completely out of control", or he might have moved back a step from acknowledging his responsibility for his loss of responsibility and said "I acted like I was completely out of control". But in fact he moved back two steps, and said "I acted like a person completely out of control."

Well, he's an actor, after all. I suppose he hopes, like all of us, to be judged by the depersonalized principles that Angelo explains in Measure for Measure:

Condemn the fault, and not the actor of it,
Why euery fault's condemnd ere it be done:
Mine were the verie Cipher of a Function
To fine the faults, whose fine stands in record,
And let goe by the Actor.

Richard (citing Shakespeare) wrote:

The psychological motivation for inserting the extra words is easy to see. But does it have a name (alteregofication?). And are there other examples?

It seems like a familiar rhetorical move, but none of the immediately-obvious search strings turn up many excuses besides Mel's.

Posted by Mark Liberman at 08:06 AM

August 01, 2006

Speech tech up and down

A couple of weeks ago, David Pogue wrote in the NYT about speech recognition in a way that might lead you to believe that all the problems are solved ("Like Having a Secretary in your PC", 7/20/2006). On the other hand, this more recent video of a demo gone awry might lead you to believe that the technology doesn't work at all:

The truth, as far as I can tell, is somewhere in between. The technology is getting better all the time, and most people can use dictation software effectively if they want or need to. But the technology is also rather fragile, and unexpected acoustic conditions can wreak havoc with speech-to-text systems in cases where normal human ears don't detect any particular problem. I would be surprised to learn that there is any significant difference in basic recognition performance between the Dragon system that Pogue raved about and the Microsoft system that experienced the embarrassing demo failure.

[Hat tip: Spreeblick.]

[Update -- there are a couple of weblog entries from people involved in speech development at Microsoft, explaining what happened: Rob Chambers' post "Vista SR Demo failure -- and now you know the rest of the story" (7/29/2006), and Larry Osterman's post "Wait, that was my bug? Ouch!" (7/31/2006). Here's Larry Osterman's explanation:

About a month ago (more-or-less), we got some reports from an IHV that sometimes when they set the volume on a capture stream the actual volume would go crazy (crazy, for those that don't know, is a technical term). [...] The annoying thing about it was that the bug wasn't reproducible - every time he stepped through the code in the debugger, it worked perfectly, but it kept failing when run without any traces.

If you've worked with analog audio, it's pretty clear what's happening here - there's a timing issue that is causing a positive feedback loop that resulted from a signal being fed back into an amplifier.

It turns out that one of the common causes of feedback loops in software is a concurrency issue with notifications - a notification is received with new data, which updates a value, updating the value causes a new notification to be generated, which updates a value, updating the value causes a new notification, and so-on...

The code actually handled most of the feedback cases involving notifications, but there were two lower level bugs that complicated things. The first bug was that there was an incorrect calculation that occurred when handling one of the values in the notification, and the second was that there was a concurrency issue - a member variable that should have been protected wasn't (I'm simplifying what actually happened, but this suffices).

As a consequence of these two very subtle low level bugs, the speech recognition engine wasn't able to correctly control the gain on the microphone, when it did, it hit the notification feedback loop, which caused the microphone to clip, which meant that the samples being received by the speech recognition engine weren't accurate.

There were other contributing factors to the problem (the bug was fixed on more recent Vista builds than the one they were using for the demo, there were some issues with way the speech recognition engine had been "trained", etc), but it doesn't matter - the problem wouldn't have been nearly as significant.

This confirms my view that the problem had nothing to do with the basic quality of the speech recognition software, where Microsoft's systems are no doubt roughly comparable with the Dragon software that Pogue liked so much (and for all I know, might even be better). All contemporary commercial speech-to-text systems use the same basic design, though the implementations are of course different. As a result, they have roughly the same strengths and weaknesses.
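For readers who want to see the shape of the notification feedback loop that Osterman describes, here is a minimal single-threaded sketch in Python. Everything in it is an assumption made for illustration: the `GainControl` class, the doubling "miscalculation", and the re-entrancy flag are hypothetical and are not taken from the Vista audio code, and the sketch does not attempt to reproduce the concurrency bug that was part of the real failure. It shows only how a handler that writes a value in response to a change notification can drive that value to its ceiling, and how ignoring self-caused notifications breaks the loop.

```python
class GainControl:
    """Hypothetical gain control that notifies listeners whenever its value is set."""

    def __init__(self, gain=1.0, max_gain=64.0):
        self.gain = gain
        self.max_gain = max_gain
        self.listeners = []
        self.notifications = 0

    def set_gain(self, value):
        # Clamp to the maximum, then tell every listener that the value changed.
        self.gain = min(value, self.max_gain)
        self.notifications += 1
        for listener in list(self.listeners):
            listener(self)


def run_naive(control):
    """Handler that (wrongly) raises the gain on every notification it receives."""
    def handler(ctrl):
        if ctrl.gain < ctrl.max_gain:
            # Writing the value fires another notification, which calls us again.
            ctrl.set_gain(ctrl.gain * 2)

    control.listeners.append(handler)
    control.set_gain(1.5)
    return control.gain


def run_guarded(control):
    """Same handler, but it ignores notifications caused by its own write."""
    updating = False

    def handler(ctrl):
        nonlocal updating
        if updating:
            return
        updating = True
        try:
            ctrl.set_gain(ctrl.gain * 2)
        finally:
            updating = False

    control.listeners.append(handler)
    control.set_gain(1.5)
    return control.gain


if __name__ == "__main__":
    print("naive handler final gain:  ", run_naive(GainControl()))
    print("guarded handler final gain:", run_guarded(GainControl()))
```

In the naive version the gain runs all the way up to `max_gain`, which is the software analogue of a microphone driven into clipping; in the guarded version the handler makes one adjustment and stops.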

It would be interesting to listen to the audio stream as received by the ASR engine during that demo. Though the result of the gain-control bug might have been totally unintelligible noise, it also might well have been seriously distorted-sounding, but still intelligible to human listeners. But massively clipped or otherwise distorted audio causes big problems for all current speech analysis methods. You could think of this as being a sort of unplanned audio CAPTCHA puzzle, and in the general case, computer algorithms are (so far) no better at this for understanding sounds than they are for understanding images. ]
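To give a rough sense of what "massively clipped" means, here is a small Python sketch that applies a gain to one cycle of a sine wave, hard-limits the result, and reports what fraction of the samples were altered. The function name, the gain values, and the choice of a pure sine wave are all assumptions for illustration; nothing here is derived from the actual demo audio or from any real recognizer's front end.

```python
import math


def clipped_fraction(gain, n=1000, limit=1.0):
    """Fraction of samples of one sine cycle that are altered by hard clipping."""
    clipped = 0
    for i in range(n):
        sample = gain * math.sin(2 * math.pi * i / n)
        # Hard clipping: anything beyond +/- limit is flattened to the limit.
        out = max(-limit, min(limit, sample))
        if out != sample:
            clipped += 1
    return clipped / n


if __name__ == "__main__":
    for gain in (1.0, 2.0, 20.0):
        print(f"gain {gain:5.1f}: {clipped_fraction(gain):.1%} of samples clipped")
```

At unit gain nothing is clipped; at a runaway gain nearly every sample is flattened to the limit, so the waveform approaches a square wave and most of the spectral detail that a recognizer relies on is destroyed.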

Posted by Mark Liberman at 04:43 PM

Between good and evil

I sat down to read today's Science Times, thinking that this article on sibilants (D5 in print) might be interesting (it wasn't). Instead, what I found interesting were the letters to the editor on the facing page (D4 in print), in particular the large number of letters concerning this article in last week's Science Times (which I hadn't read until after reading the letters). The article is "Faith, Reason, God and Other Imponderables", by Cornelia Dean, and is one of those collective, superficial reviews of several recent books clustering around a certain topic, the topic in this case being the (supposed) conflict between religion and science. I comment here on four of those letters, two of which involve interesting examples of linguification.

One of the letters I found interesting was a very short one from Doug Fox from Trumbull, CT. Mr. Fox refers to the following passage from the article:

Of course, just as the professors of faith cannot prove (except to themselves) that God exists, the advocates for atheism acknowledge that they cannot prove (not yet, anyway) that God does not exist.

Mr. Fox writes:

Since the existence of God can be neither proved nor disproved, the only possible position for a scientist is agnosticism.

This seems like a perfectly logical argument: until something is proven or disproven to exist, a rational scientist should believe neither in its existence nor in its nonexistence. (Let's put aside the fact that beliefs and hunches are often, if not invariably, what lead scientists to go about coming up with ways to prove or disprove things -- in other words, science arguably progresses because of the beliefs of scientists, not despite them.)

However, there's a presuppositional catch to Mr. Fox's argument: that the existence or nonexistence of God, as opposed to anything else, is particularly deserving of attention. As Derek Lessing of Erdenheim, PA points out in his letter, there are plenty of imaginable things the existence of which can be neither proven nor disproven, yet nobody bothers coming up with fence-riding positions (or terms) like agnosticism to characterize their views on the matter.

(In general, Mr. Lessing's letter challenges the perceived substance of the claim that the existence of God can be neither proved nor disproved, citing an interesting thought experiment proposed by Carl Sagan: we cannot prove or disprove the existence of "a special dragon in [Sagan's] garage -- invisible, incorporeal, breathing fire with no heat and floating in the air", but still we feel that there's a significant difference between belief and disbelief in such a dragon: disbelief is rational, belief is irrational. What's different about God?)

In another letter, from William Horwitz of Irvington, NY, we find the first interesting example of linguification (emphasis added):

It is striking that in "Faith, Reason, God" the term "agnostic" was never used. To many who embrace the as yet incomplete state of human knowledge, the role of humanity in forging a mature morality of its own and an acceptance of the mystery of existence, religion and atheism are two sides of the same coin, both rooted in an epistemology that posits unwarranted certainty and both worthy of a pox on both their houses. It's a shame the article made no attempt to distinguish between atheism and agnosticism.

It's true that "agnostic" appears nowhere in the article (the linguified claim in Mr. Horwitz's first sentence), and it's also true that no distinction between atheism and agnosticism (or between faith and agnosticism, for that matter) is explicitly made in the article (the other claim in Mr. Horwitz's last sentence); assuming that the claim in the last sentence underlies the linguified claim in the first one, then this is presumably a case of linguification that not even Geoff Pullum would take issue with.

[ Yikes! As I was finishing that last sentence, Geoff passed right by my cubicle in the staff writers' room here at Language Log Plaza. He does that every so often to "keep us on our toes". Lucky for me he didn't linger at my cubicle and look over my shoulder, because I didn't have time to hit the key combination that brings up a text editor with one of his old posts in it (randomly selected) -- a trick we learned that invariably fools Geoff into thinking we're hard at work, after which he shuffles excitedly out of the room, mumbling something that sounds like "insanely great". ]

However, if Mr. Horwitz's linguified claim is really that agnosticism is given no characterization in the article at all, that claim is false. It's particularly clear from Mr. Fox's letter quoted above that agnosticism, or the belief "that nothing is known or can be known of the existence or nature of God" (according to the definition in the New Oxford American Dictionary conveniently installed on my Mac), is a major theme of the article.

Finally, consider the linguified semi-rhetorical question asked in this letter from William Payne of Overland Park, KS (emphasis added):

In "Faith, Reason, God," Richard Dawkins is quoted comparing faith to a disease yet pointing to Steven Weinberg's statement that for "good" people to do "evil" it "takes religion." If God does not exist, what do terms like good and evil really mean? Do they mean anything an individual wants them to mean? And if they can mean anything, don't they ultimately mean nothing?

Apparently for Mr. Payne, "good" and "evil" can only have meanings in the context of a God that gives them those meanings; what is "good" is what God says is good, and what is "evil" is what God says is evil -- and presumably, the only way we mere mortals can know whether to do good or to do evil is to consider what God says will happen (for example, if we do good, we go to heaven; if we do evil, we go to hell). This naive view of word meaning reminds me of an exchange between Jim McCloskey and a student in a class that I was an undergraduate reader for at UCSC:

McCloskey: Where do words come from?
Student: The dictionary.
McCloskey: Ah, but where does the dictionary come from?
Student: [hesitates a little] God?

I'm especially amused by Mr. Payne's slippery-slope conclusion that without God, we have word-meaning chaos that "ultimately" leads to nothingness. It reminds me of these lyrics from "Zero" by Smashing Pumpkins:

Emptiness is loneliness, and loneliness is cleanliness
And cleanliness is godliness, and god is empty just like me

-- Eric Baković, smiling politely

More discussion of linguification on Language Log (all within the last month):

Linguifying (7/3/06)
Four more examples of linguifying (7/5/06)
Classical linguifying (7/5/06)
So ignorant, as that they know not the name of a rope (7/7/06)
A linguification from an unusual source (7/7/06)
The dictionary of fools (7/9/06)
Snowclones of linguification (7/9/06)
Underlying claim false, linguified claim true (7/10/06)
Throughout the ranks of left-wing bloggers (7/11/06)
Not a Slip of the Tongue (7/17/06)
It's hard not to read this and not do a double-take (8/1/06)

[ Comments? ]

Posted by Eric Bakovic at 03:33 PM

Loan words as "evasive language"

Russell Lee-Goldman at Noncompositional has some interesting things to say about the Japanese approach to regulating foreign loans:

Back when I was in Japan in 2003-4, the National Institute for Japanese Language (国立国語研究所, or 国研 Kokken) released its second list of suggested rewordings of gairai-go (外来語言い換え提案), or loan words mostly from western languages. However, unlike the efforts of some other national bodies, Kokken (or rather, the Gairaigo committee) does not wish to purge the Japanese language of evil foreign influences (yet! mwa ha ha), but instead encourage understanding and discourage evasive language. They point out that often the use of gairaigo is more about increasing ease for the writer or speaker (who can just import a foreign concept without explaining it), as opposed to increasing understanding for the reader or listener. Their suggestions are also, well, suggestions, rather than written-in-stone law.

In the case of English, I think of foreign borrowings -- say, karate -- as increasing precision of expression rather than fostering communicative evasion. Consider the case of anime, which the Japanese borrowed from English "animation", and we then borrowed back from them. (Or the stranger case of hentai, where a Japanese borrowing is used in the west for a genre of anime which the Japanese don't call "hentai", instead using foreign terms such as "H anime" or "eroanime".) In these cases, it looks to me like each stage of the process increases precision rather than evading it. One of the commenters here is explicit about this:

I like the word "hentai" because I can use that keyword in google to find English sites. If I want Japanese hentai from Japan I use the word "eroi", a word which doesn't exist in English. So it's very convenient.

But maybe other English loans in Japanese are used evasively?

Posted by Mark Liberman at 06:29 AM

It's hard not to read this and not do a double-take

Here's the latest dispatch on the overnegation front... Over on Slate, Christopher Hitchens takes on Mel Gibson's "Jew-hatred," observing that Gibson has never disowned his father's anti-Semitic comments. Hitchens contrasts this with former White House press secretary Scott McClellan, who had to distance himself from his father's book claiming that Lyndon Johnson was responsible for the JFK assassination. If you click over to the Amazon link for Barr McClellan's Blood, Money & Power: How L.B.J. Killed J.F.K., you can see an approving blurb from Walt Brown, editor of JFK / Deep Politics Magazine:

It's hard not to read this work and not shout 'Guilty as hell'.

There's one too many not's there. Presumably Brown intended to write either:

It's hard not to read this work and shout 'Guilty as hell'.

or even better:

It's hard to read this work and not shout 'Guilty as hell'.

The wildly overstated claim that it's difficult to read this book without shouting 'Guilty as hell' resembles Geoff Pullum's snowclones of linguification. It's probably pretty easy to read the book without shouting that phrase, but Brown uses a snowclone for rhetorical effect. Unfortunately, it's a snowclone that's particularly prone to overnegation. Here are some more examples of the form "It's hard not to do X and not do Y" where one not or the other can be safely omitted:

It's hard not to walk into a press conference these days and not hear, at some point, "With scholarships where they are today..." (Univ. of Michigan Daily)

But it's hard not to read Olney's book and not appreciate the key members of the team that dominated baseball for half a decade. (Deseret News)

It's hard not to walk into a place like that and not be overwhelmed by the sheer craftmanship and care with which it was put together. (Australian Broadcasting Corporation)

Indeed, it's hard not to view this build and not believe that Microsoft is absolutely back on track. (Paul Thurrott's SuperSite for Windows)

[In researching the period] it's hard not to look at 1910 and not see what's coming down the road. (Provincetown Banner)

But at the same time, it's hard not to look at this image of greatness and not feel a sense of awe. (Eurosport)

Another overnegated variant is "It's hard not to do X without doing Y":

It's hard not to think of the art of New Mexico without thinking of Georgia O'Keeffe. (Tucson Weekly)

It's hard not to tell somebody without having them freak out and get upset. (NY Daily News)

It's hard not to watch that without feeling. (CNN)

He's a man on a tightrope and it's hard not to watch him without worrying about him. (Austin Chronicle)

Indeed, it's hard not to think about Lincoln without writing your own, rococo history in your head. (Weekly Wire)

With so much success this season, it's hard not to discuss Notre Dame baseball without mentioning the College World Series. (Notre Dame Observer)

In both of these variants, there seems to be confusion over the scope of negation for "It's hard not to..." when this opening formula is followed by two conjoined VPs. When in doubt, people frequently overnegate. Sometimes it's hard not to construct a multiple-negation sentence without falling back on vernacular patterns of negative concord.
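For anyone who wants to go hunting for more specimens, the two variants above can be approximated with a crude surface pattern. The Python sketch below is only a heuristic under stated assumptions: the regular expression and the function name are my own illustration, it looks only at word strings within a single sentence, and it knows nothing about the actual scope of negation, so it will miss paraphrases and can flag sentences in which the second negative is perfectly legitimate.

```python
import re

# "hard not to ..." followed, before the sentence ends, by "and not" or "without".
OVERNEGATION = re.compile(
    r"\bhard not to\b[^.?!]*?\b(?:and not|without)\b",
    re.IGNORECASE,
)


def looks_overnegated(sentence):
    """Return True if the sentence matches either overnegated variant on the surface."""
    return OVERNEGATION.search(sentence) is not None


if __name__ == "__main__":
    samples = [
        "It's hard not to read this work and not shout 'Guilty as hell'.",
        "It's hard to read this work and not shout 'Guilty as hell'.",
        "It's hard not to watch him without worrying about him.",
    ]
    for s in samples:
        print(looks_overnegated(s), "|", s)
```

The first and third samples are flagged and the second, with its single not, is passed over. Any hit still needs a human reader to decide whether one of the negatives is really superfluous.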

Posted by Benjamin Zimmer at 01:05 AM