Language Log: October 2005 Archives

October 31, 2005

More Dowdese

Fascinating. The "orange croqueted halter dress" that originally appeared in Maureen Dowd's piece in the New York Times Sunday Magazine has magically changed to "orange crocheted halter dress" in the online edition. (The Times can't hide from the Nexis, Proquest, and Factiva newspaper databases, however, all of which have archived the original version with croqueted.) The change was made without even a perfunctory correction note, typically found at the end of an online article. Perhaps this is how the Times usually deals with correcting typos that are not strictly errors of fact.

We'll need to see whether croqueted or crocheted appears in Dowd's soon-to-be-released book Are Men Necessary?, from which the Sunday Magazine essay was adapted. If croqueted is there, that means the goof slipped by the combined editorial forces of G.P. Putnam's Sons and the Times Magazine.

It's very possible, though, that Dowd's editors didn't initially correct the croqueted error because they simply assumed it was an example of the writer's usual playful take on the English language. Dowd's pop-culture-laden wordplay, which pleases some and annoys others, was evident throughout Sunday's essay. Here are a few annotated examples of the latest Dowdese.

After Googling and Bikramming to get ready for a first dinner date, a modern girl will end the evening with the Offering, an insincere bid to help pay the check.

Bikramming refers to Bikram Yoga, which involves vigorous exercises performed in a heated room. (The style of yoga was conceived by the entrepreneur Bikram Choudhury, who has trademarked the "Bikram" name.) Dowd first tried out Bikramming in an Aug. 29, 2001 column about "The Offering," where it appeared transitively: "A thoroughly modern young lady might be found Paxiling herself, Googling her date, Bikramming her body and pondering The Offering." Paxiling would refer to self-medication with the antidepressant Paxil. Clearly Dowd is a fan of creating verbal nouns from brand names, taking her cue from such recent neologisms as Googling and Botoxing.

Dowd continues with her reflections on women letting men pay for dates, and what men might expect in return (again cribbing from her Aug. 2001 column):

Jurassic feminists shudder at the retro implication of a quid profiterole.

"Quid profiterole" (which Chris Waigl describes via email as "eyebrow-raising") is a ham-handed pun, blending quid pro quo and profiterole. Not one of Dowd's best efforts — one can imagine her searching through the pro- entries in the dictionary hoping to find an appropriate food term.

The essay wraps up with this line, describing an imagined world twenty-five years from now:

With no power or money or independence, they'll be mere domestic robots, lasering their legs and waxing their floors — or vice versa — and desperately seeking a new Betty Friedan.

Here Dowd returns to the language of body care and its peculiar verbal nouns, such as lasering to refer to the process of laser hair removal. This is one of the better examples of Dowdian wordplay, as the throwaway "or vice versa" cleverly suggests an absurd chiasmus. And "desperately seeking" manages to evoke both Desperate Housewives and its cinematic predecessor in the bored-housewife genre, 1985's Desperately Seeking Susan. When Dowd isn't trying too hard, her mots can be quite bon.

Posted by Benjamin Zimmer at 04:40 PM

The merits of true minds

If you checked www.whitehouse.gov this morning as I did, shortly after President Bush's nomination of Judge Samuel Alito for a seat on the U.S. Supreme Court, you may have noticed an interesting slip of the ear memorialized in the transcript of his remarks:

Judge Alito's reputation has only grown over the span of his service. He has participated in thousands of appeals and authored hundreds of opinions. This record reveals a thoughtful judge who considers the legal matter -- marriage carefully and applies the law in a principled fashion. He has a deep understanding of the proper role of judges in our society. He understands that judges are to interpret the laws, not to impose their preferences or priorities on the people. [emphasis added]

The president's text must have read "a thoughtful judge who considers the legal merits carefully". He said "the legal matter" by mistake, stopped, and substituted "merits". The person recording the presentation instead transcribed "marriage", which is much closer to "merits" phonetically than orthographically: ['me.ɹɪdʒ] vs. ['me.ɹɪts] in IPA -- at least for people like me (and President Bush?) who pronounce merry, Mary and marry the same way.

By the time I checked again at 10:30, the transcript had been corrected to read "merits", as it should. But I wondered why the error had been made in the first place. This is not one that we can blame on an errant spellchecker. I briefly considered the possibility that some West Wing stenographer might be spending too much time thinking about gay marriage. Then I realized that this was almost certainly the result of one of the real-time transcription technologies -- either stenotyping (which uses a special chording keyboard) or voicewriting (which uses automatic recognition of shadowed speech). Everyone knows stories of speech recognition gone wrong, and there can often be mistakes in the results of stenotyping as well. Talking in real time is hard enough -- transcribing in real time is even harder.

Posted by Mark Liberman at 12:41 PM

Buckingham Browne and Nichols

Daniel Barkalow mails me with the story of another piece of awkward coordinative nomenclature resulting from a merger of schools in my current city of residence. There's a fairly well-known private school in Cambridge called "Buckingham Browne & Nichols", abbreviated as BB&N. The interesting oddity, Daniel points out, is that the name doesn't have any commas in it. (It contrasts, coincidentally, with another Cambridge institution, the research company Bolt, Beranek and Newman, abbreviated BBN, named by a ternary coordination.) BB&N was formed by a binary merger of the Buckingham school (for girls) and the Browne & Nichols school (for boys). When they agreed to merge, it is said, they wanted to avoid undervaluing either of the component schools, and they chose to signal this by just writing the names together, without any conjunction or punctuation. (This is common in the corporate world, of course, and there is an epidemic of it in publishing.) Daniel suggests that the effect is the opposite of what they wanted, making "Buckingham" look like a mere attributive modifier of "Browne & Nichols". I agree that the effect is not quite right, but I am not sure I agree about why. For me, the ampersand is salient enough to suggest that it marks the major break — that the first part is "Buckingham Browne" and the second part is "Nichols". In fact the effect is so strong that I can see Buckingham Browne in my mind's eye, his bow tie prominent against his bright pink shirt, his expensive suit impeccably unrumpled as he steps out of his Rolls Royce: "Hello; Buckingham Browne at your service. Mr Nichols, I presume?"

Posted by Geoffrey K. Pullum at 11:38 AM

Brigham and Women's: it could have been worse

Bill Poser stopped to chat at the water cooler in 1 Language Log Plaza the other day, and remarked that the ugly coordinative name "Brigham & Women's Hospital", which appears to coordinate items in different grammatical functions (attributive modifier and determiner, respectively), could have been worse, much worse. What is now the Brigham and Women's Hospital results from a series of three separate binary mergers of what were originally four hospitals:

— the Peter Bent Brigham Hospital
— the Robert Breck Brigham Hospital
— the Boston Lying In Hospital
— the Free Hospital for Women

If you merged those last two you might have had "the Free and Boston Lying In Hospital for Women", or "the Boston Lying In and Free Hospital for Women", or any of a number of other awkward names. A merger of the first and third might have produced "the Peter Bent and Boston Lying In Hospital" or "the Boston Lying In and Peter Bent Hospital"... So perhaps we were lucky.

Posted by Geoffrey K. Pullum at 07:55 AM

Needling the Times

On the subject of spellchecker artifacts in the New York Times, the Boston Globe's Jan Freeman emails the following oddity from Maureen Dowd's piece in the Sunday Magazine, "What's a Modern Girl to Do?":

Cosmo is still the best-selling magazine on college campuses, as it was when I was in college, and the best-selling monthly magazine on the newsstand. The June 2005 issue, with Jessica Simpson on the cover, her cleavage spilling out of an orange croqueted halter dress, could have been June 1970.

Freeman writes: "I can't figure out what typo a spellchecker might turn into 'croqueted,' but I'm having trouble believing a human (two, three humans!) could miss this, too."

I'm not so sure a spellchecker can be faulted for this one. Dowd clearly meant crocheted, but using that word probably would not have bothered the spellchecker since it's common enough to be in the Microsoft Word custom dictionary (unlike, say, truthiness or DeMeco). And there's no likely candidate for a misrecognized typo here (unlike aquainted getting changed to aquatinted or amature to armature) — unless the typo was crocqueted, which is already pretty close to croqueted. Of course, if croqueted was in Dowd's copy to begin with, you can't blame the spellchecker for not recognizing that it was inappropriate for the context (though we can perhaps imagine a day when a superintelligent spellchecker would raise a red flag at collocating croqueted with halter dress).
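To make the "misrecognized typo" reasoning concrete, here is a minimal sketch of how a spellchecker might rank suggestions by edit distance. This is an assumption made for illustration, not a description of how Microsoft Word actually works; the function names and the two-word candidate lists are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest(typo: str, candidates: list[str]) -> str:
    """Return the candidate closest to the typo (hypothetical helper;
    ties go to whichever candidate is listed first)."""
    return min(candidates, key=lambda word: edit_distance(typo, word))


# Tiny illustrative candidate lists, not a real spellchecker lexicon.
print(nearest("amature", ["amateur", "armature"]))        # armature
print(nearest("crocqueted", ["crocheted", "croqueted"]))  # croqueted
```

On this simple measure, amature is one edit from armature but two from amateur, which is the kind of near miss that produces a spellchecker artifact. Likewise crocqueted is one edit from croqueted and two from crocheted, while aquainted is one edit from both acquainted and aquatinted, so a checker could plausibly offer either.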

Rather than spellchecker interference, this appears to be a simple mixup between two similar words of French origin. It's more of a malapropism than an eggcorn, since it's hard to imagine a semantic link between needlework and lawn games. There is in fact a rather obscure etymological connection: both crochet and croquet ultimately derive from Old French croche meaning "hook" (in one case referring to the hooked crochet needle and in the other to the crooked stick used in early forms of croquet). That at least explains the surface resemblance of the two words, differing only by digraphs (-ch- and -qu-) representing single consonants.

Still, as Freeman suggests, it is a bit surprising that this slipped past the copy editors at the Times. It's not an uncommon error in unedited text, as Google and Yahoo turn up scores of "croqueted sweaters" and the like — even after filtering out word lists, random text spam, and legitimate examples of croqueted (e.g., "I should have croqueted the Queen's hedgehog just now" from Alice's Adventures in Wonderland). A search on eBay auctions currently finds 14 crocheted items described as croqueted, with another 21 sold by eBay Stores. But we do hold Maureen Dowd and her editors to a higher standard than eBay vendors. What's more, this is in a piece for the Sunday Magazine, where the editing process is presumably more deliberate than it is for the daily paper.

Bonus dormitat Homerus. Far more outrageous errors have been published and corrected by the Times, as lovingly recounted in the book Kill Duck Before Serving. Here are two of my favorites (both also noted by Slate's Jack Shafer):

May 30, 1993
Because of a transmission error, an interview in the Egos & Ids column on May 16 with Mary Matalin, the former deputy manager of the Bush campaign who is a co-host of a new talk show on CNBC, quoted her incorrectly on the talk show host Rush Limbaugh. She said he was "sui generis," not "sweet, generous."

April 7, 1995
Because of a transcription error, an article about Senator Alfonse M. D'Amato's remarks about Judge Lance A. Ito misquoted the Senator at one point. In his conversation with the radio host Don Imus, he said: "I mean, this is a disgrace. Judge Ito will be well-known." He did not say, "Judge Ito with the wet nose."

[Update, 4:40 PM: Perhaps someone on the Times editorial staff reads Language Log, since croqueted has been changed to crocheted in the online version of Dowd's essay. Details here.]

Posted by Benjamin Zimmer at 12:35 AM

October 30, 2005

The first "Fitzmas"

Tracking neologisms in American English has a long and distinguished tradition. An early master was Dwight Bolinger (1907-1992), who began keeping tabs on the latest words and phrases in 1937 with his regular column "The Living Language" in the journal Words. In 1941 he moved the feature to American Speech (the journal of the American Dialect Society) under the title "Among the New Words," where it continues to this day, currently entrusted to Wayne Glowka. One able inheritor of Bolinger's mantle is Grant Barrett, project editor of the Historical Dictionary of American Slang and keeper of the entertaining and illuminating website Double-Tongued Word Wrester.

As a repository of "undocumented or under-documented words from the fringes of English," DTWW offers both completed entries and a queue of new citations that may eventually warrant full treatment in the entry section. One recent item in the queue is an unusually successful neologism, exploding out of nowhere into seeming omnipresence in a matter of a few weeks: Fitzmas.

Pending a full DTWW entry, let's turn to Wikipedia for a definition:

Fitzmas is the name given by some liberal American bloggers to the atmosphere of excitement and anticipation primarily among Democrats and some others preceding the announcement of results of Patrick Fitzgerald's investigation of the Plame affair. On October 28, 2005 Fitzgerald announced that the grand jury had indicted Lewis Libby, who was then the Chief of Staff and assistant for National Security Affairs to Dick Cheney, Vice President of the United States (Libby also served as Assistant to the President). The word "Fitzmas" is a portmanteau of Fitzgerald's name and "Christmas".

Staggeringly enough, Fitzmas currently yields in the neighborhood of half a million hits from both Google and Yahoo. So who coined Fitzmas? The DTWW queue lists an Oct. 6 blog entry from 2Millionth Web Log, while the most recent version of the Wikipedia entry cites a different source from Oct. 6, a forum post on Democratic Underground. Clearly what we need here is a tick-tock, as they say in the news business. And thanks to blog trackers like Technorati and Google Blog Search, we can provide just that. (Keep in mind, however, that only the first two entries specify the time zone!)

Oct 6, 2005, 1:04 AM PDT: Markos "Kos" Moulitsas Zúniga, on his immensely popular left-leaning blog Daily Kos, starts the ball rolling:

I get up at around 8:00 a.m. pacific time, or 11 a.m. Eastern. I hope I wake up to good news. This makes me feel like the night before Christmas:

The federal prosecutor investigating who leaked the identity of a CIA operative is expected to signal within days whether he intends to bring indictments in the case, legal sources close to the investigation said on Wednesday. [etc.]

Oct 6, 2005, 1:07 AM PDT: A few minutes after the Kos post, a contributor in the comments section named "Bob" takes a stab at what would be the first of many parodies of Clement C. Moore's "A Visit from St. Nicholas," with other commenters following suit:

'Twas the night before Christmas
And all through the house
Not a creature was stirring
Not even the louse...

Oct 6, 2005, 4:20 AM: A blogger named "Attaturk" writes a longer parody of "The Night Before Christmas" in a post on the Rising Hegemon weblog:

KOS mentioned that this story made him feel like the night before Christmas.

So...

'Twas the night before Fitz talked,
when all through the house
Not a creature was stirring, not even a louse;
[etc., etc., with words and pictures]
But I heard Fitz exclaim, ere they drove out of sight,
"You're crooks one and all, and each I'll Indict."

Oct 6, 2005, 8:24 AM: "Michael" on 2Millionth Web Log links to the Rising Hegemon post and provides the first known example of Fitzmas:

The Night Before Fitzmas, or a Visit From The Special Prosecutor
Rising Hegemon interprets an old classic.

Oct 6, 2005, 11:46 AM: In an apparently unrelated development, "seemslikeadream" gets into the Christmas spirit (inspired by Kos and his commenters?) in a forum post on Democratic Underground:

The 12 Days of Christmas Indictments
I wonder if I could get a little help on this, been thinking of it since last night
On the first day of indictments my true love Fitz gave to me
One Rovian Frog March

Oct 6, 2005, 12:56 PM: In a follow-up post in the Democratic Underground forum thread, "SpiralHawk" offers the second known attestation of Fitzmas:

These are the 12 Days of Fitzmas
Get your Fitzmas On !

So it looks like this is a case of independent invention. Once the "Christmas" meme was planted by Kos, it spread along two different branches, both of which happened upon the Fitzmas portmanteau in short order. The Democratic Underground branch probably had more to do with the continued dissemination of the term after Oct. 6, as other forum posters continued coming up with Fitzmas parodies (e.g., on Oct. 9, "Botany" offered "Happy Fitzmas," a play on John Lennon's "Happy Xmas (War Is Over)," as well as yet another version of "The Night Before Fitzmas").

But the Fitzmas explosion didn't really hit until Oct. 18 when the meme returned to Daily Kos, this time in a widely linked post by "georgia10" with the title "Dealing With Fitzmas." Anticipation about impending indictments was high at that point, and it would only increase as bloggers and the news media continued to speculate about the timing and nature of Fitzgerald's announcement. On her own blog, "georgia10" commented on Oct. 22 about the spread of Fitzmas (a term she credited to Democratic Underground), noting that a Google search just four days after her Daily Kos post already yielded a whopping 51,500 hits.

We've come a long way from the days of Bolinger, when neologism-hunting was a laborious enterprise requiring eagle-eyed readers scouring newspapers and magazines for the latest lingo. Of course, not all coinages will deliver up their provenance as easily as a blog-driven term like Fitzmas. And it goes without saying that this kind of blogospherese may have an exceedingly short shelf life. Now that the announcement of the Libby indictment has passed, I would expect that Fitzmas will die out quickly — unless, of course, an indictment of Karl Rove or another high-level official is in the offing, in which case be prepared for Fitzmas II: A New Beginning.

[Update, 10/31/05: Goodness, those Wikipedists move fast. The Wikipedia entry has already been revised to include the 2Millionth Web Log citation.]

Posted by Benjamin Zimmer at 06:56 AM

October 29, 2005

Scalia on the meaning of meaning

In the November issue of First Things, Justice Antonin Scalia has a review of Law’s Quandary by Steven D. Smith. One of the review's central issues is the meaning of meaning. Scalia writes that

The portion of Smith’s book I least understand—or most disagree with—is the assertion, upon which a regrettably large portion of the analysis depends, that it is a “basic ontological proposition that persons, not objects, have the property of being able to mean.” “Textual meaning,” Smith says, “must be identified with the semantic intentions of an author—and . . . without an at least tacit reference to an author we would not have a meaningful text at all, but rather a set of meaningless marks or sounds.” “Legal meaning depends on the (semantic) intentions of an author.”

Scalia disagrees: for him, meaning has to do with understanding texts or utterances, not with intending to use them to communicate:

Smith confuses, it seems to me, the question whether words convey a concept from one intelligent mind to another (communication) with the question whether words produce a concept in the person who reads or hears them (meaning).

Even a phonetician like me knows that this issue had an important role in 20th-century philosophy of language. I present it to students in Linguistics 001 as the distinction between speaker meaning and sentence meaning, framed by a quote from Peter Strawson's essay "Meaning and Truth" (reprinted in his 1971 collection Logico-Linguistic Papers):

What is it for anything to have a meaning at all, in the way, or in the sense, in which words or sentences or signals have meaning? What is it for a particular sentence to have the meaning or meanings it does have? What is it for a particular phrase, or a particular word, to have the meaning or meanings it does have?
[ . . .]
I am not going to undertake to try to answer these so obviously connected questions. . . I want rather to discuss a certain conflict, or apparent conflict, more or less dimly discernible in current approaches to these questions. For the sake of a label, we might call it the conflict between the theorists of communication-intention and the theorists of formal semantics. According to the former, it is impossible to give an adequate account of the concept of meaning without reference to the possession by speakers of audience-directed intentions of a certain complex kind. . . The opposed view. . . is that this doctrine simply gets things the wrong way round. . . the system of semantic and syntactical rules, in the mastery of which knowledge of a language consists -- the rules which determine the meanings of sentences -- is not a system of rules for communicating at all. The rules can be exploited for this purpose; but this is incidental to their essential character. It would be perfectly possible for someone to understand a language completely -- to have a perfect linguistic competence -- without having even the implicit thought of the function of communication
[. . .]
A struggle on what seems to be such a central issue in philosophy should have something of a Homeric quality; and a Homeric struggle calls for gods and heroes. I can at least, though tentatively, name some living captains and benevolent shades: on the one side, say, Grice, Austin, and the later Wittgenstein; on the other, Chomsky, Frege, and the earlier Wittgenstein.

I'm not sure that Chomsky is accurately classified here, but it's certainly fun to think of him as being on the same virtual debating team as Scalia.

On a more serious note, I'm curious about a different cultural divide -- the apparent separation over the past century or so between philosophy of law and philosophy of language, despite the evident overlap of issues. I haven't read Smith's book, but there is no 20th-century philosophy of language in the "scholarship, from ancient to modern, bearing upon the philosophy of law" that Scalia cites Smith as reviewing -- the list skips from Socrates and Plato to "Holmes to Pound, Llewellyn to Dworkin, Posner to Bork (and Scalia, honored as I am to be condemned in such eminent philosophical company)".

Scalia's specific arguments that meaning is something that (people perceive that) texts have, not something that people do (for the purpose of affecting other people), seem to me to raise some interesting non-linguistic issues. For example, he proposes this parable:

Two persons who speak only English see sculpted in the desert sand the words “LEAVE HERE OR DIE.” It may well be that the words were the fortuitous effect of wind, but the message they convey is clear, and I think our subjects would not gamble on the fortuity.
[...]
As my desert example demonstrates, symbols (such as words) can convey meaning even if there is no intelligent author at all.

This is a clear rebuke to William Dembski's information-theoretic approach to Intelligent Design Theory. If Scalia believes that the letters "LEAVE HERE OR DIE" sculpted in the desert sands are a plausible example of symbols with "no intelligent author at all", Dembski's notion of "specified information" (as evidence for an intelligent designer) has surely not impressed him.

When it comes to the meaning of meaning, Scalia not only focuses on the interpreter rather than the creator of a signal, he gives absolute power to semiotic convention:

If the ringing of an alarm bell has been established, in a particular building, as the conventional signal that the building must be evacuated, it will convey that meaning even if it is activated by a monkey.

I question the implication that people who hear a fire alarm interpret its meaning without paying any attention to theories of how it was activated: on Scalia's theory, how can we make sense of the everyday concept "false alarm"? When I'm told that a particular alarm is due to a circuit fault (or a mischievous monkey, though monkeys are thin on the ground in Philadelphia), I don't conclude that the conventional meaning of the fire alarm signal has changed, but I do join everyone else in aborting the evacuation. Back in January, Geoff Pullum told a real-world story that bears on this issue.

(And of course, the genuine human response to Scalia's LEAVE HERE OR DIE example would be to reason, at least briefly, about the author of the message and the intent behind it -- a threat? a prank? an improbable accident? the slogan of a defunct weight-loss camp, missing its last letter? If I'm one of the "persons" seeing the message, and the other one explains to me "Oh, that's just one of my cousin's tasteless jokes", I haven't learned anything new about the conventions of the English language, but the news alters my interpretation of the message profoundly.)

Scalia also argues (or rather asserts) that group exegesis is less ambiguous than group authorship, explaining that multiple authors "may intend to attach various meanings to their composite handiwork", while we can "ordinarily tell without the slightest difficulty" what the meaning of that handiwork was to its multiple contemporary readers:

What is needed for a symbol to convey meaning is not an intelligent author, but a conventional understanding on the part of the readers or hearers that certain signs or certain sounds represent certain concepts. In the case of legal texts, we do not always know the authors, and when we do the authors are often numerous and may intend to attach various meanings to their composite handiwork. But we know when and where the words were promulgated, and thus we can ordinarily tell without the slightest difficulty what they meant to those who read or heard them.

By citing three points where Scalia's arguments didn't convince me, I don't mean to invalidate the review as a whole, which struck me as an intelligent and interesting attempt to address important questions, and which persuaded me to order Smith's (rather expensive) book. However, I wonder whether Scalia has considered and rejected the ideas of the past century of philosophy of language, or whether he's simply never encountered them. It's obvious that the concerns of legislators, lawyers and judges overlap significantly with the subject matter of linguistics and language-related philosophy, and I've always been puzzled about why the real-world interactions between the disciplines and their practitioners seem to be so limited. Reading Scalia's review left me more puzzled than before.

If someone like Scalia were to want a reading list for philosophy of language since Plato, what should be on it? Works by Strawson's heroes and shades would not be at the top of my list of suggestions, but I'm not the one to compose such a list in any case. Send me your nominations, or (better) blog about it and send me the link, and I'll summarize the results in a week or so.

I gather from Scalia's review that Smith's perspective is at least as strongly represented among contemporary legal scholars as Scalia's is. I won't presume to characterize the philosophical state of play on these questions, but let me say that as a practical matter, linguists generally find it necessary to think about both kinds of meaning, in something like the relationship suggested by Wilson and Sperber's "Relevance Theory". This gives ontological houseroom to "linguistic meaning", but considers it one of the factors on the basis of which (a normally more consequential) "speaker's meaning" is inferred:

According to the code model, a communicator encodes her intended message into a signal, which is decoded by the audience using an identical copy of the code. According to the inferential model, a communicator provides evidence of her intention to convey a certain meaning, which is inferred by the audience on the basis of the evidence provided. An utterance is, of course, a linguistically coded piece of evidence, so that verbal comprehension involves an element of decoding. However, the linguistic meaning recovered by decoding is just one of the inputs to a non-demonstrative inference process which yields an interpretation of the speaker's meaning.

And let me add that it's often necessary to consider other aspects of the causal chain as well, such as transmission-channel noise and possible slips of the tongue, pen or ear. Along with the intrinsic ambiguities of the signals involved, this means that "decoding" itself is a non-trivial process, usually seen as a form of Bayesian reasoning that crucially depends on assumptions about the a priori probability of alternative (linguistic) messages as well as on the available evidence about the signal being decoded. I suppose that legal texts are generally carefully composed and proofread, but errors must occasionally creep in -- and do obvious typos or malaprops then have the force of law?
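As a rough illustration of the Bayesian view of decoding, here is a minimal sketch that combines a prior over candidate messages with evidence from the signal. Every number in it is invented for illustration, and the function name is hypothetical; the candidate words are borrowed from the "merits" versus "marriage" transcription slip.

```python
def posterior(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Bayes' rule over a finite set of candidate messages:
    P(message | signal) is proportional to P(message) * P(signal | message)."""
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    total = sum(unnormalized.values())
    return {m: p / total for m, p in unnormalized.items()}


# Assumed prior: how plausible each word is in the frame
# "considers the legal ___ carefully" (made-up values).
prior = {"merits": 0.95, "marriage": 0.05}

# Assumed likelihood: how well each word matches a noisy signal that,
# taken alone, sounds slightly more like "marriage" (made-up values).
likelihood = {"merits": 0.4, "marriage": 0.6}

post = posterior(prior, likelihood)
for message, p in sorted(post.items(), key=lambda item: -item[1]):
    print(f"{message}: {p:.3f}")
```

With these assumed values the contextual prior outweighs the acoustic evidence, so "merits" comes out far ahead. A decoder that relied on the signal alone, or that had a weaker prior, would choose "marriage".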

[Update: more here and here. And some blawg discussion by Eh Nonymous at Unused and Probably Unusable. And a very relevant Georgetown Law Journal article by Larry Solan, "Private Language, Public Laws: The Central Role of Legislative Intent in Statutory Interpretation", for which a preprint is available here.]

Posted by Mark Liberman at 11:58 PM

Coordinative naming botches

I have noticed while walking about Cambridge, Massachusetts, that it has at least two institutions with names that result from mergers that took place in their pasts and ended up being close to ungrammatical. One is the Cambridge Rindge and Latin School. Another, in the news this weekend, is the Brigham and Women's Hospital. Both these coordinate names sound strikingly weird to me. It is worth trying to diagnose the syntactic reasons.

Cambridge Rindge and Latin is the only public high school in Cambridge, and I think its name sounds wrong because it coordinates a proper-name modifier associated with the name of the founder (Frederick Hastings Rindge) with a modifier that appears to designate a subject matter taught (the Latin language). So it's odd in the same way that it would be odd if a place called Bagley Farm merged with a place called The Dairy Farm and the result was called "the Bagley and Dairy Farm"; or if the two units at Harvard called the Kennedy School of Government and the Harvard Divinity School were to merge into something called "the Kennedy and Divinity School". (That one looks bad enough that mentally I place an asterisk in front of it.)

Cambridge Rindge and Latin resulted from a merger of the Rindge School of Technical Arts with the Cambridge High and Latin School. But of course the latter is itself a coordinative naming botch; the Cambridge High and Latin School was formed as a result of an earlier merger of Cambridge English High School and Cambridge Latin School. "High and Latin" is a coordination of an adjectival modifier with a proper-noun modifier, and sounds just as weird. (We're lucky we didn't get "Cambridge Rindge, High, and Latin", a coordinative amalgamation of all three.)

Brigham and Women's Hospital was in the news over the last few days because Luk Van Parijs, the MIT associate professor who has just been fired for faking data in several immunology papers, did some of his research there.

The linguistic problem with "Brigham and Women's" seems even worse than "Rindge and Latin" or "Kennedy and Divinity" to me. It's a coordination of a proper name modifier (as in the first word of London pride, Budapest Restaurant, or California girls) with a genitive noun phrase determiner (as in the first word of Ken's pride, Alice's Restaurant, or our girls). In general, it seems to me that you should expect any attempt to do this kind of coordination to make something crunchingly ungrammatical. See what you think (the square brackets indicate the coordinate constituents):

*[London and Mary's] pride is what you're dealing with.
*Let's go to [Budapest and Alice's] Restaurant.
*I like both [California and our] girls.

I think they should have called in a linguist when they were discussing these mergers. You don't want your institution to get stuck with an ungrammatical name. It's the same as when you are coining a new word that you hope will catch on, or making an assertion about what phrases occur in current discourse. Language Log Plaza is happy to provide linguistic consultants on such matters. Our fees are reasonable, and our linguistic taste is guaranteed: if there are any problems with the new name we create, we are prepared to give you your old name back.

Posted by Geoffrey K. Pullum at 08:22 PM

Don't read it as something more than it's not

The punditocracy and the blogosphere, from right to left, have generally been impressed by Patrick J. Fitzgerald's news conference yesterday. I share this positive evaluation, but I want to use it as background for a different point. Speaking demands skill; explaining something complicated in public to a large audience is stressful; and when the large audience is poised to interpret every nuance to the nth degree, with enormous stakes riding on the results, it's amazing that anyone ever manages to bring it off without mistakes.

Well, the truth is that almost no one ever does, and yesterday's performance by Mr. Fitzgerald was no exception to this generalization.

First, let's take a quick look at the reaction. Andrew Sullivan's evaluation:

WOW: Just a comment on the press conference. Fitzgerald is more than impressive. His focus, grasp of the relevant facts, clear enunciation of what he is doing and dignified way in which he refused to speculate on anything else were, to my mind, deeply encouraging for anyone who cares about public life. He's an antidote to cynicism. The Jesuits who educated him should be very proud today. It will be very hard to slime him; and the administration would be very foolish to even think about it.

Pejman Yousefzadeh at Redstate.org was less effusive but similarly positive:

I thought that Fitzgerald's television appearance was very impressive. He was restrained but principled, he knew the case inside and out and he was clearly at the top of his game in answering the reporters' questions (in addition to showing a great deal of patience with stupid questions like the very last one asked).

I agree, but let's look at some details of his performance at the press conference. (Quotes are taken from the transcript on the NYT site; on a quick check, they seem to match the recording. Any boldface and/or italics was added by me.)

As evidence of how carefully Fitzgerald was monitoring what listeners might make of his words, consider this Q & A:

QUESTION: Mr. Fitzgerald, do you have evidence that the vice president of the United States, one of Mr. Libby's original sources for this information, encouraged him to leak it or encouraged him to lie about leaking?

FITZGERALD: I'm not making allegations about anyone not charged in the indictment.
Now, let me back up, because I know what that sounds like to people if they're sitting at home.
We don't talk about people that are not charged with a crime in the indictment.
I would say that about anyone in this room who has nothing to do with the offenses.
We make no allegation that the vice president committed any criminal act. We make no allegation that any other people who provided or discussed with Mr. Libby committed any criminal act.
But as to any person you asked me a question about other than Mr. Libby, I'm not going to comment on anything.
Please don't take that as any indication that someone has done something wrong. That's a standard practice. If you followed me in Chicago, I say that a thousand times a year. And we just don't comment on people because we could start telling, Well, this person did nothing wrong, this person did nothing wrong, and then if we stop commenting, then you'll start jumping to conclusions. So please take no more.

Fitzgerald's first speech error occurs back at the very start of his presentation, in his second sentence:

FITZGERALD: Good afternoon. I'm Pat Fitzgerald. I'm the United States attorney in Chicago, but I'm appearing before you today as the Department of Justice special counsel in the CIA leak investigation.
Joining me, to my left, is Jack Eckenrode, the special agent in charge of the FBI office in Chicago, who has led the team of investigators and prosecutors from day one in this investigation.

As the papers explained, Eckenrode is actually from Philadelphia:

Mr. Fitzgerald announced the charges with John C. Eckenrode, Special Agent-in-Charge of the Philadelphia Field Office of the FBI and the lead agent in the investigation.

and Fitzgerald of course knows that, as he made clear later in the session:

We, as prosecutors and FBI agents, have to deal with false statements, obstruction of justice and perjury all the time. The Department of Justice charges those statutes all the time.
When I was in New York working as a prosecutor, we brought those cases because we realized that the truth is the engine of our judicial system. And if you compromise the truth, the whole process is lost.
In Philadelphia, where Jack works, they prosecute false statements and obstruction of justice.
When I got to Chicago, I knew the people before me had prosecuted false statements, obstruction and perjury cases.

Why then did he call Eckenrode "the special agent in charge of the FBI office in Chicago"? Well, it was obviously just a slip of the tongue -- Chicago persisted from the previous sentence, and intruded into a place where it didn't belong.

Fitzgerald committed another type of performance error in his answer to the first question:

OK, is the investigation finished? It's not over, but I'll tell you this: Very rarely do you bring a charge in a case that's going to be tried and would you ever end a grand jury investigation.

At least for me, the italicized sentence is somewhere between terminally awkward and out-and-out ungrammatical. If he were writing the answer out, I'm sure he would have backed up and reworded it. But he's speaking, and so he has to keep going and work it out somehow.

[If you care about the details... What he's saying, it's clear, is that a prosecutor normally keeps a grand jury involved during the period between indictment and trial, so that new charges can be brought if appropriate. He starts out by putting this in a negative way: the contrary would happen "very rarely". What would happen very rarely is something like "you bring a charge and you end the grand jury investigation". Having started with "very rarely", he inverts the subject and auxiliary of the first clause: "very rarely do you bring a charge in a case that's going to be tried". So far so good, but now he's stuck: not inverting the second conjunct would be very odd, while inverting it is hardly any better. Furthermore, he feels the need to stick in would so as to emphasize that the whole thing is hypothetical, since he apparently doesn't want to give any detailed facts about which grand juries are looking into what.]

Another type of error comes up in answer to a later question (pointed out to me by Eric Bakovic):

FITZGERALD: You couldn't walk in and responsibly charge someone for lying about a conversation when there were only two witnesses to it and you talked to one. That would be insane.
On the other hand, if you walked away from it with a belief that that conversation may have been falsely described under oath, you were walking away from your responsibility.
And that's why, when the subpoenas were challenged, we put forward what it is that we knew and we let judges pass on it.
So I think people shouldn't read this exceptional case as being something more than it's not.

As Arnold Zwicky observed via email, this apparently is a blend of

(as being something) that it's not

(as being something) more than it is

Let me emphasize again that I found Fitzgerald's presentation clear and impressive. My goal in pointing out some of his errors is not to cast a shadow on his considerable skill as a public speaker. On the contrary, these examples underline the fact that even impressive public speakers make lexical, grammatical and semantic mistakes in extemporaneous speaking, especially when the interpretive stakes are high.

This is a lesson that Jacob Weisberg and the other promoters of the Bushisms industry haven't learned, or more likely don't care to learn. As we keep explaining, people shouldn't read W's verbal blunders as being something more than they're not.

[Update: John Lawler emailed:

The first error I twigged to was one in the first selection you posted, but you didn't comment on it:
"We make no allegation that any other people who provided or discussed with Mr. Libby committed any criminal act."
Conjunction reduction has overapplied to the first disjunct VP; probably he meant to say 'provided information to' or something, but the rest of the VP with 'provide' got sluiced away, leaving 'Mr. Libby' as its erstwhile object. It's a good example of precisely what you're talking about in the post. Also a good example of how cooperative we listeners and readers are, even when we're on the lookout for mistakes.

Exactly. ]

Posted by Mark Liberman at 12:18 PM

October 28, 2005

Is splanchnic just another word for schmuck?

Over on the Logic and Language blog, my recent postings on "splanchnic" have elicited the quoting (by "logician") of a wonderful passage from Jonathan Safran Foer's Everything Is Illuminated that allows us to connect words in "-nik" (which "splanchnic" is not, though Roz Chast chose to treat it as if it were) with the World Famous Eskimo Snowclone. Here's the text, very lightly edited:

October 27, 2005

400 Words for Splanchnik

Arnold Zwicky over at languagelog has had a couple of posts about the word "splanchnik" recently... and given languagelog's fondness for jokes about the Eskimo Vocabulary Hoax,... I couldn't resist posting this excerpt from the novel I'm reading:

(Context: the Jewish-American hero-abroad is having a conversation with his Ukrainian guide. I've retained the non-standard layout of conversations from the original.)

"And I want to see what it's like now. I don't think there are any Jews left, but maybe there are. And the shtetls weren't only Jews, so there should be others to talk to." "The whats?" "Shtetls. A shtetl is like a village." "Why don't you merely dub it a village?" "It's a Jewish word." "A Jewish word?" "Yiddish, like schmuck." "What does it mean schmuck?" "Someone who does something that you don't agree with is a schmuck." "Teach me another." "Putz." "What does that mean?" "It's like schmuck." "Teach me another." "Schmendrik." "What does that mean?" "It's also like schmuck." "Do you know any words that are not like schmuck?" He pondered for a moment. "Shalom," he said, "which is actually three words, but that's Hebrew, not Yiddish. Everything I can think of is basically schmuck. The Eskimos have 400 words for snow, and the Jews have 400 words for Schmuck." I wondered, what is an Eskimo? (p. 60 of Everything is Illuminated, Jonathan Safran Foer.)

Meanwhile, here in Palo Alto I had breakfast yesterday with Jane Robinson (the computational linguist) and Elizabeth Daingerfield Zwicky (the firewalls and systems administration guru) and the utterly adorable Opal Eleanor Armstrong Zwicky (the expert on language acquisition), during which Jane announced that she'd just learned a new word: yes, "splanchnic". Elizabeth and I gaped at her for a moment (Opal wasn't paying attention, since she was busy exercising her recently developed ability to compose sentences of more than two words), and then I defined the word, to Jane's amazement. How did I know that, she wanted to know. We explained.

Have I mentioned that "splanchnic" is on the tip of everybody's tongue these days? Hardly a day goes by...

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:05 PM

In or under

It's a working assumption of linguists that when there are alternative expressions, the choice between them is neither completely free nor completely determined. Extreme prescriptivists would like to maximize determination, and to make the basis for determination explicit: in any context, only one alternative should be acceptable; there should be a good reason for this choice; and it should be possible to articulate that reason. Pick either "in the circumstances" or "under the circumstances", and be ready to justify your choice as the right one.

Linguists, studying language in actual use, point out that there is an enormous amount of variation in these choices (variation between speakers, and within a given speaker, variation in different linguistic and non-linguistic contexts); that the choices are mostly matters of (rather subtle) preferences rather than crisp decisions; that these preferences interact with one another in complex ways; and that hardly any of these preferences are easily accessible via conscious reflection. You can ask people whether they would choose "in" or "under", and why, and with the options presented side by side this way, they're likely to express a preference and to produce some rationale for it, but there's absolutely no reason to think that their accounts are reliable. You have to look at what they actually do. (Judgments based on side-by-side comparisons are tricky and context-dependent even in situations where you might think they'd be straightforward, like the Pepsi Challenge discussed by Malcolm Gladwell in Blink, pp. 158-9.)

In my recent posting on "in the circumstances" vs. "under the circumstances", I started to move beyond side-by-side comparisons and into some (admittedly crude) statistics on actual use, uncovering a modest preference, in a Google web search, for "under" as opposed to "in", contrary to the advice of some prescriptivists. Now I've gone a bit further, and it looks like there's some interesting texture to the variation in this preposition choice.

A brief recap: the raw Google webhit figures were:

in the circumstances: 3,310,000
under the circumstances: 3,980,000 (in/under ratio: 0.83)
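For anyone who wants to check the ratios quoted here and in the tables that follow, a minimal Python sketch of the arithmetic. The function name in_under_ratio is invented for illustration, and the counts are simply the Google figures reported in this post, not fresh searches.

```python
def in_under_ratio(in_hits: int, under_hits: int) -> float:
    """Return hits for "in ..." divided by hits for "under ...".

    A value above 1 means "in" is more frequent; below 1, "under" is.
    """
    return in_hits / under_hits


# Raw web-hit counts as reported in the post.
print(f"{in_under_ratio(3_310_000, 3_980_000):.2f}")  # the circumstances -> 0.83
print(f"{in_under_ratio(24_600, 174):.2f}")           # reduced circumstances -> 141.38
```

The ratio is only a rough index of preference: it says nothing about who wrote the pages or how the search engine estimated its counts.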

Now, I'm quite sure that there are between-speaker differences in the choice of "in" vs. "under" with "circumstances", but corpus searches like this aren't going to find them. And for individual speakers, I suspect that there is variation according to modality, style, and a number of other "external" factors, but again searches like this won't turn them up. What we can look at are factors that have to do with linguistic context, and here we strike paydirt very quickly.

First, MWDEU claims that when "circumstances" means 'financial situation', "in" is the preposition of choice, and "under" is rare. Googling supports this claim:

in reduced circumstances: 24,600
under reduced circumstances: 174 (ratio: 141.38)

This overwhelming preference for "in" extends to "circumstances" in the sense of 'personal situation' in general. With some possessive determiners:

in their circumstances: 83,800
under their circumstances: 751 (ratio: 111.58)

in your circumstances: 98,400
under your circumstances: 1,940 (ratio: 50.72)

in my circumstances: 36,000
under my circumstances: 1,470 (ratio: 24.49)

(There might be something in the difference between the pronouns, but I'm not exploring that today.)

On the other hand, the modest preference for "under" extends to PPs with the demonstrative "these" (my thanks -- I guess -- to Elizabeth Zwicky for suggesting that I look at determiners other than "the"):

in these circumstances: 2,850,000
under these circumstances: 3,020,000 (ratio: 0.94)

and then balloons for interrogative/relative "which" and interrogative "what":

in which circumstances: 61,400
under which circumstances: 134,000 (ratio: 0.46)

in what circumstances: 318,000
under what circumstances: 1,680,000 (ratio: 0.19)

Now a surprise. For demonstrative "those", "in" is preferred:

in those circumstances: 1,070,000
under those circumstances: 646,000 (ratio: 1.66)

As it turns out, the figures for "in" here are somewhat inflated by occurrences of "those circumstances" with a following relative clause in "where" or "in which", a context in which -- another surprise -- "in" is almost categorically preferred to "under":

in those circumstances where: 74,100
under those circumstances where: 538 (ratio: 137.73)

in those circumstances in which: 16,000
under those circumstances in which: 240 (ratio: 66.67)

Removing these occurrences from the overall "those" count leaves:

in those circumstances: 979,900
under those circumstances: 645,222 (ratio: 1.52)

There's still a fairly sizable preference for "in" with "those", in contrast to "these".
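The adjustment for "those" can be spelled out as a short calculation. This is only a sketch of the subtraction described above, using the counts reported in this post; the variable names are made up for illustration.

```python
# Overall counts for "in/under those circumstances" as reported above.
in_those = 1_070_000
under_those = 646_000

# Occurrences followed by a relative clause in "where" or "in which".
in_relative = 74_100 + 16_000
under_relative = 538 + 240

# Remove the relative-clause occurrences from the overall counts.
in_adjusted = in_those - in_relative          # 979,900
under_adjusted = under_those - under_relative  # 645,222

adjusted_ratio = in_adjusted / under_adjusted
print(f"{in_adjusted:,} / {under_adjusted:,} = {adjusted_ratio:.2f}")
```

The adjusted ratio comes out at about 1.52, down from 1.66, so the relative-clause cases account for only a small part of the preference for "in".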

(There are only small numbers of other relative clause types with "those circumstances", for example:

in those circumstances under which: 37
under those circumstances under which: 12

in those circumstances which: 566
under those circumstances which: 497

in those circumstances for which: 66
under those circumstances for which: 34

I wouldn't try to draw any conclusions from these small differences.)

When we turn to quantity determiners, there seems to be a general preference for "in" over "under":

in all circumstances: 1,300,000
under all circumstances: 981,000 (ratio: 1.33)

in some circumstances: 2,850,000
under some circumstances: 1,760,000 (ratio: 1.62)

which becomes very strong for "many":

in many circumstances: 298,000
under many circumstances: 48,500 (ratio: 6.14)

and almost categorical for "a few":

in a few circumstances: 21,300
under a few circumstances: 462 (ratio: 46.10)

but is utterly reversed, in favor of "under" (almost categorically), for "no":

in no circumstances: 289,000
under no circumstances: 6,260,000 (ratio: 0.05)

In summary: the Google data suggest that "under" is preferred to "in"

(modestly)
    with determiners "the" and "these"
(more strongly)
    with determiner "which"
(very strongly)
    with determiner "what"
(almost categorically)
    with quantity determiner "no"

but that "in" is preferred to "under"

(almost categorically)
    when "circumstances" means 'personal situation'
(strongly)
    with determiner "those" in general
(almost categorically)
    with determiner "those" plus certain following relatives
(modestly)
    with quantity determiners "all" and "some"
(strongly)
    with quantity determiner "many"
(almost categorically)
    with quantity determiner "a few"

This just scratches the surface of the phenomenon, but it's enough to indicate that several effects are probably going on. As usual, the facts of usage are complex, subtle, sometimes surprising, and not easy to derive from first principles.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 05:33 PM

The longue durée is not our forte

An article about English usage by Candace Murphy in the Oct. 25 edition of "Inside Bay Area" (a publication of the Oakland Tribune) underscores the pitfalls of the "Recency Illusion" that Arnold Zwicky has eloquently blogged about in this space (here, here, and here). The article, entitled "Good Words Gone Bad," takes the typical hell-in-a-handbasket approach to "language abuse," despite objections from the very experts that Murphy quotes.

The article begins as follows:

The word is harmless enough. It has five letters. It doesn't have an inordinate number of consonants or vowels. It is commonly used in conversation.

But nearly always, it is pronounced incorrectly.

And that rankles true verbivores.

"Oh, yes. 'Forte,'" says Paul Brians with a perceptible sigh, pronouncing the word meaning a person's strength as it should be, monosyllabically without a flourishing finish on the word's final vowel. "I've given up on that one. It's a dead issue. If you went around saying 'FORT,' people wouldn't know what you're talking about. It's an error that has become a non-error."

The use, or some might say, abuse, of the English language, and the standardization of non-standard English is endless. A joke is hysterical when it's really hilarious. People buy a chaise lounge when it's really a chaise longue. Forte, probably confused with the musical dynamic of the same spelling that's borrowed from Italian, gains an extra syllable.

The skeptical reader's warning bells should be set off right away with the assertion that a particular word, in this case forte, is "nearly always ... pronounced incorrectly." It is the same alarmist approach that led a writer more than a century ago to proclaim that "every journalist and every author wherever the English language is written" is guilty of "sloppy usage" because they insist on pluralizing person as people. Everyone is wrong, and only by recognizing our linguistic condition of original sin can we hope to atone for our wayward usage and get on the straight and narrow.

A second warning bell should go off upon reading the supposition that "true verbivores" are "rankled" by such atrocities as the two-syllable pronunciation of forte in its sense of "strength." Paul Brians, the keeper of the comprehensive "Common Errors in English" website, is certainly a verbivore of excellent standing, but he shows little sign of outrage about this putative linguistic transgression (besides the "perceptible sigh" that the reporter would like us to think is telling). What does Brians actually say? "It's a dead issue." "It's an error that has become a non-error." In fact, Brians has specifically listed the two-syllable pronunciation of forte on his page of "non-errors," a position he has further emphasized in a thread on the alt.usage.english newsgroup.

It's difficult to know exactly how long English speakers have been conflating French-derived fort and Italian-derived forte (both from Latin fortis meaning "strong"), but it's safe to say it's not a new phenomenon. The Oxford English Dictionary shows that the spelling of the "strength" sense as forte rather than fort has been in use since the 18th century (probably simply an adoption of the feminine form of the French word at the expense of the masculine, akin to other Gallic borrowings like locale and morale). The two-syllable form evidently developed some time after that as a spelling pronunciation, but it has long been recognized as the primary pronunciation of the word in both American and British English.

From this spurious example, Murphy makes a great leap in logic:

It's enough to make a lexicologist wonder: Are we all just a bunch of idiots?

"I don't think so," says Brians, a professor at Washington State University who manages the Web site "Common Errors in English" and who recently wrote a companion book called "Common Errors in English Usage" (William, James and Co., $15). "People have always abused language. The specific abuses they commit just change with time, so we notice the new ones."

But today, it's quite likely that those abuses and misuses are worming their way into standard English usage at a quicker rate.

Here we see a fascinating tension between Brians — the authority who Murphy has brought in for the article (and author of a new book that provides her news hook) — and Murphy herself, who has clearly already made up her mind about the degenerative state of English usage. Brians plainly lays out the Recency Illusion as it applies to matters of disputed usage, but the reporter chooses to ignore her own source to put forward the unfounded claim that "abuses and misuses are worming their way into standard English usage at a quicker rate." (The claim is prefaced by the journalistic weasel words "it's quite likely," which we should translate as "I have nothing but anecdotal evidence for what I am saying.")

Murphy doesn't enlist any actual subscribers to the degenerationist point of view (say, a Robert Fiske or John Simon) but instead relies on Brians and the equally reasonable Erin McKean, editor of the Oxford American Dictionary. Both make the point that non-standard usage may appear to be on the rise because we are exposed to so much more of it in written form due to the explosion of online communication. (Coupled with the rise of Google and other search engines, this easy availability of an endless array of non-standardisms has spawned such pastimes as eggcorn-collecting and peeveblogging.) Brians further notes in the alt.usage.english thread that "such reading as people do is now often not professionally edited, which is a marked change from the past." This certainly heightens the Recency Illusion, though it doesn't necessarily imply what Murphy calls "the seeming increase in language abuse" (again note the weasel word "seeming").

But wait! Not only are all sorts of non-standard words and expressions spreading like wildfire across cyberspace, they're even entering our hallowed dictionaries!

Courtesy the Internet, those misuses, abuses, slang and shorthand are broadcast to the world, or at least, the World Wide Web, where they earn a standard usage of their own. Combine that with the fact that the new young crop of dictionary editors are more attuned with the Internet than their predecessors — McKean is one of the youngest, at 33, while the Oxford English Dictionary is helmed by Jesse Sheidlower, 36 — and it suddenly makes perfect sense why "podcast" is now in the dictionary.

"All the lexicographers I know who are in their 30s spend an unconscionable amount of time on the Internet," says McKean. "We're looking for language. Language is there."

That's where McKean has found words like farb (not authentic, badly done), nomenklatura (non-literally; by analogy), drabble (a short story of 100 words or fewer), haxie (a hack for the Macintosh operating system) and swancho (a combination poncho/sweater).

Though they're not in the dictionary yet, they may be coming soon to one near you. Each word is categorized by McKean as "on the brink." None of them may be right, correct, proper or even real. But McKean calls them innovations. And innovation, she says, is the essence of language.

Darn those young Turks of lexicography and their blasted Internet! Murphy should know, though, that the whippersnappers of the dictionary world are not a bunch of radicals hellbent on tossing out the old reference books in favor of Urban Dictionary. In fact, despite their penchant for online research, they tend to take the long view, patiently observing that in every generation there are those who decry new usage as barbaric, as somehow not "right, correct, proper or even real." And often that "new usage" is not so new after all, since the same bugaboos, like the pronunciation of forte as for-tay, keep getting hauled out time and time again as evidence of our linguistic degeneration.

The Recency Illusion is powerful, though, and it's very easy to ignore the longue durée in favor of a kind of naive presentism. For some reason, that presentism is particularly alluring when it comes to the shifting sands of English usage.

[Update, 10/31/2005: If you were baffled by the mention of "nomenklatura (non-literally; by analogy)" in the list of Internet-derived innovations, Languagehat has the explanation straight from Erin McKean: "she had been talking about a nonliteral use of the word nomenklatura itself, 'that is, one that referred to people that weren't Russians, but were metaphorically similar to the Russian nomenklatura.'"]

[Update, 11/18/2005: The curmudgeonly comic strip character Mallard Fillmore, who previously railed against apostrophe abuse, took up the (lost) cause for monosyllabic forte in the Nov. 13 strip.]

Posted by Benjamin Zimmer at 03:29 PM

Trademarking -ix

The Associated Press reports that a European Union court is about to rule on a trademark infringement suit filed by Les Editions Albert René, publisher of the Astérix comic books, against the mobile telephone firm Orange. What does Astérix have to do with mobile phones? Orange wants to register the trademark Mobilix for telephone services. Les Editions Albert René objects on the grounds that because Mobilix ends in ix people are likely to think that it is somehow associated with Astérix.

I don't know EU trademark law, but in the United States and Canada trademarks are only valid within a particular market sector, so in North America at least the publisher would have no case at all unless, contrary to fact, so far as I know, they had registered the trademark Astérix for telecommunications or could show that their mark is a so-called "famous name". Orange is not proposing to use the name Mobilix for a series of comic books that might compete with the Astérix books.

The linguistically interesting point here has to do with what association it is that the ending /iks/ conjures up. The publisher of Astérix claims that the association is with its series of books, in which, in addition to Astérix, a number of other characters have names ending in /iks/, such as his friend Obélix, the druid Getafix, and Astérix's dog Dogmatix. I suspect that there is actually something else going on.

The source of the suffix -ix in the Astérix books is the string rix in which the names of many actual Gaulish chieftains ended, such as Dumnorix, Vercingetorix, Orgetorix, Sinorix, Ambiorix, and Adiatorix. This rix consists of two morphemes: /rik/ "king" and /s/ "nominative singular case". It is the Gaulish cognate of Latin rex, whose stem is /reg/, as we see in forms such as the accusative singular regem and the nominative plural reges. The Astérix books have reanalyzed the ending rix as ix.

What I wonder is whether the association of words ending in ix is really specifically with the Astérix books or whether it is as much or more an association with things Gaulish. Names such as Vercingetorix are familiar to anyone with a classical education, in which one reads Caesar's memoir of the Gallic Wars, and should be familiar to most Western Europeans from their studies of European history. However, few people learn anything about Gaulish or for that matter the other Celtic languages, so people acquainted with such names may very well be unclear as to whether the suffix includes the /r/.

Posted by Bill Poser at 09:51 AM

Prosodically (in)correct

From Frank DeFord's commentary about the new NBA dress code, 10/26/2005 on NPR's Morning Edition, this semantico-phonetically interesting passage:

So too did the Minnesota Vikings institute a dress code not long ago [breath]
and yet the Vikes remain as [breath]
beHAViorally INcorrect as they may be [breath]
sarTORially COrrect.

This is a matched pair of playful allusions to the phrase "politically (in)correct". The first few pages of web search results for incorrect yield the other adverbially (in)correct echoes conservatively, patriotically, therapeutically, commercially, musically, spiritually and environmentally, and there must be dozens if not hundreds more of them out there.

In DeFord's case, the paired allusions create a nice example of double contrastive focus and the use of phrasing for emphasis. The feature that caught my attention, though, was the intonational highlighting of the prefix in- and the word fragment co-.

A commoner sort of contrastive focus is exemplified by DeFord's contrast between behaviorally and sartorially. As part of the performancy of this contrast, he highlights intonationally the syllable of each word that is its normal main stress. This is the usual effect of contrastive focus in English: the pitch-contour effect is strongest on the main-stressed syllable of a focused word or phrase. But in the case of the contrast between incorrect and correct, the syllables that mainly inherit the phonetic effect are not the normal main stresses.

This sort of thing was discussed in Ron Artstein, "Focus below the word level", Natural Language Semantics 12(1): 1-22, 2004. Artstein cites an analogous example from Bolinger (1963):

natural REgularity (“in a context that implied an opposition to IRregularity”)

Here's a display showing a pitch contour, a spectrogram and a waveform for DeFord's contrasts:

This display was created with WaveSurfer, an "Open Source tool for sound visualization and manipulation" that I recommend for casual users because it's relatively easy to learn to use in simple ways. Another excellent open-source program, Praat, offers a much wider range of functions, and is also better for producing publication-quality graphics rather than simple screenshots like that shown above.

Posted by Mark Liberman at 04:52 AM

Miered in doubt

Now that she has withdrawn her name from the Supreme Court nomination process, what will the linguistic legacy of Harriet Miers be? Will she be remembered as a supposed stickler in matters grammatical who ran afoul of subject-verb agreement in her first public statement as a nominee? Or will history record Miers' punctuation style, either her "trouble with commas" in written responses to Senate questions or her exuberant use of exclamation points in correspondence with President Bush when he was governor of Texas?

Perhaps Trent Lott is right to wonder, "In a month, who will remember the name Harriet Miers?" But an Associated Press article suggests that Miers' lasting legacy will only be her name, converted into a verb in the manner of Robert Bork, her predecessor in nomination termination (or "SCOTUS interruptus," as several wags have termed it). Like Bork, Miers has been eponymized primarily in the passive voice: Bork got Borked, Miers got Miered. Though Miered lacks the phonesthemic punch of Borked, it does of course have the benefit of being a pun on mired. Semantically there's a distinction too, according to the blogospheric sources quoted by the AP:

A contributor to The Reform Club, a right-leaning blog, wrote that to get "borked" was "to be unscrupulously torpedoed by an opponent," while to get "miered" was to be "unscrupulously torpedoed by an ally."

S.T. Karnick, co-editor of The Reform Club, elaborated.

"If you have a president who is willing to instigate a big controversy, the prospect of being 'borked' will be the major possibility," he said. "But if you have a president who is always trying to get consensus, then it's much more likely that nominees will get 'miered.'"

On The National Review Online, a conservative site, a contributor suggested that "to mier" means "to put your own allies in the most untenable position possible based upon exceptionally bad decision-making."

You don't have to fail in the Supreme Court nomination process to get your own verb. Justice David Souter was also eponymized, though it was well after his elevation to the Court. When the reticent John Roberts was announced as Bush's choice to fill Sandra Day O'Connor's anticipated vacancy, there was talk of him being Soutered. As CNN explained, "to 'Souter' has come to mean to pick a candidate without knowing much about him." It is presented as the opposite of getting Borked; where Bork had a voluminous record of legal opinions, Souter was an enigma at the time of his nomination by Bush Sr. and thus was difficult to attack during Senate questioning. But conservatives have also used Soutered to refer to the betrayal they felt when Souter joined with moderate to liberal opinions on the bench. "Won't get Soutered again," they vowed.

I note that one of the potential replacements for Miers, according to the New York Times, is Judge Diane Sykes of the Seventh Circuit. Is it too early to wonder if conservatives would be Syked about her nomination?

Posted by Benjamin Zimmer at 01:00 AM

October 27, 2005

Forbes Special Report

Forbes.com "Home Page for the World's Business Leaders" has taken a break from the accumulation of capital and oppression of the masses to publish a special report on Communicating with pieces by some well-known authors. There are a couple by Noam Chomsky On the Spontaneous Invention of Language and Why Kids Learn Languages Easily. You can hear his voice if you like here.

Legendary primatologist Jane Goodall has a piece on Why Words Hurt and another on The Dangers of Email. If you're interested in more monkey business there is an essay by Carl Zimmer entitled Can Chimps Talk?.

Steven Pinker has a discussion of Why We Have Language and an audio clip on What We Don't Know about Language.

The scope of the issue extends beyond linguistics, with pieces by writer Kurt Vonnegut, magician David Copperfield, futurist Ray Kurzweil, and even Walter Cronkite.

Posted by Bill Poser at 07:54 PM

The generous village of Yuwawer

For months now my local NPR radio station (WBUR at Boston University) has asserted in a pre-recorded announcement that they play at least once an hour: "Our support comes from Yuwawer listeners." I imagined this place, some small village or township maybe, out in the Massachusetts countryside, or maybe up in New Hampshire, where the people were extraordinarily generous. Perhaps Yuwawer just happened to have residents who were mainly retired bond traders or Internet billionaires who made their money in the 1990s and got out before the downturn. I had no idea where Yuwawer was, but it seemed that the listeners there gave so much that the rest of us hardly needed to.

Only in the last few days did it finally dawn on me that there was no such place. "Our support comes from you, our listeners" is what they were saying. I got the message. I called the station and pledged. Do likewise. You know you should. The NPR fall fund drive is on now. Go pledge. My Republican friend Paul thinks NPR is outrageously biased in favor of the most extreme liberal Democrat ideas; my anarchist friend Jim thinks NPR is disgustingly subservient to right-wing Republican corporate and political pressures. So it really can't be all that bad, can it? Go to the phone now, and pledge generously. They do need the money. NPR is like Language Log, there to entertain and inform you every day, except that they have serious, major, ongoing staff expenses to cover.

Language Log is in a very different position. It is supported by grants of imagination from linguists across the country, and the construction of our office tower at Language Log Plaza was paid for by a generous contribution from the Yuwawer Retired Billionaires Foundation.

Posted by Geoffrey K. Pullum at 12:28 PM

Artifacts of the spellchecker age

The New York Times has yet to issue a correction for the joke-ruining error in its Oct. 25 review of "The Colbert Report." This is somewhat surprising, since the Times prides itself on eventually rectifying even the most minuscule of typos that slip by the copy editors. According to Slate media critic Jack Shafer, there has been a "corrections culture" at the Times since the '70s, which "seems to revel in correcting every misspelling, transposed digit, historical inaccuracy, and boner." (The Times even authorized a book collecting its most amusing corrections, called Kill Duck Before Serving.)

But another recent error that did get corrected sheds some unexpected light on the situation.

In the Oct. 26 edition, we find this doozy of a correction for a sports article by Ray Glier:

Because of an editing error, a sports article in some copies on Sunday about the University of Alabama's 6-3 football victory over the University of Tennessee misstated the given name of a linebacker who is a leader of the Alabama defense. He is DeMeco Ryans, not Demerol.

Ouch! Warren St. John, one of Glier's fellow sports reporters at the Times, saw the correction and had this to say on the blog for his Rammer Jammer Yellow Hammer site:

All rightee then.

While we're at it, we've been meaning to run the following correction for a while: a previous post on the RJYH blog misstated the name of Alabama's head coach. It is Mike Shula, not God-I'm-dying-for-a-bourbon-and-water Shula.

How could Glier possibly have come up with Demerol instead of DeMeco? Could it have been some inside joke about the narcotic qualities of Alabama's defensive line? I doubt it. I think Glier innocuously wrote DeMeco in his story, but then a nefarious force changed the spelling for him: a runaway spellchecker. If you type DeMeco in a Microsoft Word document, you'll get the telltale squiggly red underline that indicates the word is not in the custom dictionary. And if you ask for a suggestion, the very first alternative provided is none other than Demerol.

So what about Alessandra Stanley's goof, replacing Stephen Colbert's sublimely silly truthiness with the pedestrian trustiness? I had assumed the error was a sort of anticipatory assimilation, since the word trust appears later in the same paragraph. But that may just have been a coincidence. If you type truthiness in MS Word, sure enough you're given trustiness as the first suggestion (followed by trashiness, frothiness, and trotlines, the last of which refers to a type of fishing line).

We don't know if Glier and Stanley applied the wayward spellcheckers themselves or if some intervening copy editor is to blame. Either way, this sort of spellchecking artifact is all too common these days, as anyone who has had to slog through college term papers can attest.

Some of the substitutions are rather startling. Back in 1996, this example was noted by a contributor to the Usenet newsgroup alt.usage.english:

This happened a few weeks ago to the menu of a well-to-do restaurant here in San Francisco. The menu was spell-checked, printed, and a copy displayed in the window of the restaurant (as is the custom here). Nobody noticed that the spell-checker turned "warmed spring salad greens with prosciuto" into "warmed spring salad greens with prostitutes."

I'm guessing that the restaurant menu actually had singular prostitute in place of the intended prosciutto (or prosciuto, if the writer missed the extra t) — amazingly enough, the custom dictionary in my copy of MS Word accompanying Office XP still doesn't recognize the spiced Italian ham and suggests prostitute instead. (There's a Sopranos joke in there somewhere.) Others have fallen prey to the same unfortunate replacement, as in this recipe appearing on a message board for Italian food:

Crumble bread sticks into a mixing bowl. Cover with warm water. Let soak for 2 to 3 minutes or until soft. Drain. Stir in prostitute, provolone, pine nuts, 1/4 cup oil, parsley, salt, and pepper. Set aside.

Some spellchecker artifacts only show up when a particular typo is made. In another case noted on alt.usage.english, the misspelling of acquainted as aquainted has caused some spellcheckers to suggest aquatinted instead. (That word, by the way, refers to etchings made using aquatint, a process that makes a print resemble a water color.) Thankfully, it appears that MS Word has fixed this one, as aquatinted now comes in second place to acquainted in the list of suggestions. But the damage has been done, as evidenced by thousands of Googlehits. (Another example of this kind showed up not too long ago on the Eggcorn Database: amature, a misspelling of amateur, is often transformed by spellcheckers into armature.)
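One way to see why these near-misses get suggested is to count the single-character edits separating the typo from each candidate. The sketch below is only an illustration using plain Levenshtein distance; it is not a description of how MS Word or any other spellchecker actually ranks its suggestions, and the function name edit_distance is made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Typo/candidate pairs taken from the examples in this post.
pairs = [
    ("aquainted", "acquainted"),
    ("aquainted", "aquatinted"),
    ("amature", "armature"),
    ("amature", "amateur"),
    ("truthiness", "trustiness"),
]

for typo, candidate in pairs:
    print(f"{typo} -> {candidate}: {edit_distance(typo, candidate)}")
```

By this measure aquainted is a single edit away from both acquainted and aquatinted, so a checker has no purely orthographic reason to prefer the intended word, and amature is actually closer to armature (one edit) than to amateur (two).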

One very odd type of substitution started popping up a few years ago among users of Yahoo Mail. If an email in HTML format was sent to a Yahoo address and contained the string eval, it would mysteriously get changed to review when the message was received. So medieval became medireview, retrieval became retrireview, primeval became primreview, and so forth. (In French messages, the word for horse, cheval, would become chreview.) It turned out that eval was one of a number of strings that Yahoo's security filter automatically replaced in order to prevent cross-site scripting attacks. (Bizarrely, it also replaced mocha with espresso and expression with statement.) This was far more insidious than spellchecker substitutions, as the replacements were made automatically, without the user's knowledge. Yahoo eventually changed its security filter, but once again the Googlehits live on. Future generations will no doubt chuckle at our technologically medireview times.
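The mangled words are what you would expect from a filter that replaces a flagged string wherever it occurs, with no check for word boundaries. Here is a minimal sketch of that behavior; the substitution table comes from the examples above, but the function naive_filter and everything else about it is an assumption for illustration, not Yahoo's actual code.

```python
# Substitution pairs reported in this post. How the real filter stored or
# applied them is not known; this table is purely illustrative.
SUBSTITUTIONS = {
    "eval": "review",
    "mocha": "espresso",
    "expression": "statement",
}


def naive_filter(text: str) -> str:
    """Replace every occurrence of each flagged string, even when it sits
    inside a longer, harmless word."""
    for flagged, replacement in SUBSTITUTIONS.items():
        text = text.replace(flagged, replacement)
    return text


for word in ["medieval", "retrieval", "primeval", "cheval"]:
    print(word, "->", naive_filter(word))
```

A filter that matched only whole tokens would have left medieval alone, though it would presumably have been easier for an attacker to slip past.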

[Update, 11/1/05: The Times has issued a correction for the truthiness/trustiness confusion. The spellchecker remained blameless.]

Posted by Benjamin Zimmer at 01:00 AM

October 26, 2005

Preposition the circumstances

In my posting on "in terms of", I said in passing:

There is a more or less constant pressure to bulk up simple prepositions for the purposes of emphasis; just last week I caught someone saying "within the circumstances", presumably to improve on the simple "in". Brevity is not the only virtue.

I was then moved to look at the frequency of "within the circumstances" -- more than I thought, but way less than the frequency of "in/under the circumstances" -- and to recall that I was taught in grade school that only "in the circumstances", and not "under the circumstances", was correct (because "circum-" means 'around'), which led me to look at MWDEU's informative and entertaining entry on "circumstances".

Here are the raw Google webhit figures:

within: 16,200
in: 3,310,000
under: 3,980,000

The 16,200 figure is not to be sneezed at, but it's totally dwarfed by the others, which are 200-250 times as large. So far as I know, usage manuals do not yet complain about "within" in this context, but now that I've pointed out this minority option, maybe they soon will. Sigh.
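For anyone who wants to check the arithmetic, here is a small sketch that recomputes the ratios from the three webhit counts above. The counts are the ones reported in this post; the variable and function names are just for illustration.

```python
# Raw Google webhit counts for "<preposition> the circumstances",
# as reported in this post.
hits = {"within": 16_200, "in": 3_310_000, "under": 3_980_000}


def times_as_frequent(prep: str, baseline: str = "within") -> float:
    """How many times as many hits prep has as the baseline preposition."""
    return hits[prep] / hits[baseline]


for prep in ("in", "under"):
    print(f"{prep}: {times_as_frequent(prep):.1f} times as many hits as 'within'")

print(f"under vs. in: {times_as_frequent('under', 'in'):.2f}")
```

The two ratios come out at a little over 204 and a little under 246, which is where the "200-250 times" range comes from; "under" has about 1.2 times as many hits as "in".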

As for "in" vs. "under": "under" is somewhat more frequent than "in", and the OED2's cites have "under" appearing before "in", but not enormously long before, so we'd conclude that the two prepositions are just stylistic options, with maybe a bit of an edge for "under". OED1 claims to see a meaning distinction between the two prepositions here -- 'mere situation' for "in" vs. 'action affected' for "under" -- and this claim was carried over into OED2, but few commentators now agree with it (or even understand it). It might be that the most important difference between the two is that "in" is normally unaccented, "under" accented. It certainly seems to be that some people tend to prefer one and some the other. But not much is known about the details of the choice between "in" and "under".

What makes the MWDEU entry so entertaining is the history of the proscription of "under the circumstances". First, it's an instance of a subtype of the Etymological Fallacy, which we here at Language Log Plaza comment on so frequently: the combinatory possibilities for "circumstances" are being dictated by the etymology of the word. Second, the proscription (with its EF underpinning) appears to have been a sheer invention, possibly by Walter Savage Landor in 1824. About a century after this, critics began to notice the issue, but for the most part allowed both prepositions. Nevertheless, the proscription of "under" would not die -- it's another zombie rule, this time one specifically not endorsed by Fowler (or, for that matter, Garner, who fancies himself Fowler's present-day heir) -- and it continues to surface every so often in people who passionately disapprove of "under the circumstances". I myself am a victim of this zombie rule: though I don't object to "under" -- I know too much about the facts to do that -- I use "in" almost exclusively, as far as I can tell, because that's what was ground into me in childhood. Another sigh.

[Update, 10/27/05: Oh, it gets even better. Reader Joe Heininge writes to suggest that the "original meaning" of "under" included a sense 'among, in the midst of', which would mean that "under the circumstances" would have an etymological pedigree as good as "in the circumstances". I don't know about the original meaning of "under" (or of any word), but OED2 supplies sense 6a -- "With reference to something which covers, clothes, envelops, or conceals; passing into the sense of 'within'" -- which would seem to fill the bill pretty nicely. Citations with a 'within, inside (of)' sense go back at least to the 15th century. Here's a modern quote, from the Habits of Good Society (1859): "If you do not wear silk stockings under your boots".]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 09:15 PM

Splanchnic on view

The Roz Chast cartoon panel for the word "splanchnic" (mentioned not long ago here) is now, thanks to the kindness of Erin McKean, viewable via this link.

Posted by Arnold Zwicky at 07:52 PM

More playful alluding

In my posting on the Eye Guy figure, I observed some places where playful allusions to formulas, idioms, titles, quotations, etc. were especially dense -- among them titles of porn flicks and in headlines on feature stories, especially those about science. Now, in mail from Don Porges, a link to a transcript of a Saturday Night Live sketch about a team from "Velvet Productions" searching for titles for gay porn films. And, from today's NYT "Science Times" section and from the front matter of the 10/14/05 issue of Science, some playful heads (science with a light touch!).

The SNL sketch has a team faced with naming porn versions of recent popular movies. In order, they come up with:

The X-Men >> The Sex-Men
Lord of the Rings >> Lord of the Rims
Sweet Home Alabama >> Sweet Home Alan's Butthole
Bend It Like Beckham >> Bend Over Like Beckham

At this point they are baffled, just baffled, as to how to rework "The Pianist" or "Holes".

Meanwhile, in the NYT "Science Times" section there's a feature ("Observatory", by Henry Fountain) that presents brief research notes, almost all with playful heads. On 10/26/05, we get "Judging Craters" (Judge Crater, who of course has nothing to do with craters on Jupiter's moon Europa), "The Bird Next Door" (the boy next door), "Antifreeze: Fleas Do It" (Cole Porter's line "Birds do it, bees do it, even educated fleas do it" from "Let's Do It, Let's Fall in Love" -- a line that, I see from Googling, has been widely used by writers on science, sex, and many other topics), and "Lambchop Genome" (which has me puzzled; it's like the porn flick titles, I often just don't get it). Just above Fountain's column is a piece on arachnid taxonomist Norman Platnick, titled "The Exciting Adventures of Spider Man: From the Vial to the Tree of Life". There's more, but let's turn to Science, most of which is forbiddingly technical (as befits the journal of the American Association for the Advancement of Science).

However, the front section of each issue has brief pieces, about research and about the political and social setting of science, and these often have punchy heads. On page 207, we get "Doing the Splits", about a database on cell division processes, and on page 227 "Brane Teaser", about a puzzle for string theorists concerning surfaces called "branes". Thigh-slappers both. (Page 227 also has "A Lying Matter", about the white matter in the prefrontal cortex of compulsive liars. Once again, I feel so stupid at not getting the reference.)

Even the Atlantic Monthly gets into the act when it reports on research. In the October 2005 issue, a brief article on a study of investing and gender (p. 46) is juiced up by the title "Stocks and Blondes".

I'm waiting for the cross-fertilization of the genres. What would you title a gay porn flick about the craters of Europa (no points for switching the setting from Jupiter to Uranus), or about cell division (meiosis? mitosis?), or about investing and gender?

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:24 PM

French syntax is (in)corruptible

One of the most striking ideologies of linguistic uniqueness is the belief that French exactly mirrors the inner language of logical thought. A few minutes of research led me to the conclusion that the source of this meme, or at least its earliest example, is an essay by Antoine de Rivarol, "L'Universalité de la langue française". In 1783, the Berlin Academy held a competition for essays on the subject of the widespread usage of French, and its prospects for continuing as the lingua franca of European intellectuals. Apparently nine submissions argued that French would continue; nine that it would be replaced by German; and one that Russian would win out. (English got no votes.) Antoine de Rivarol shared the prize with Johann Christoph Schwab.

De Rivarol's essay is the source of the often-quoted phrase Ce qui n'est pas clair n'est pas français ("What is not clear is not French"). My (doubtless faulty) translation of the relevant passage is below the jump.

(The paragraphing has been added for easier online reading.)

What distinguishes our language from other ancient and modern languages is the order and the structure of the phrase. This order must always be direct and necessarily clear. French names the subject of the discourse first, then the verb which is the action, and finally the object of that action: there is the logic that is natural to all men; there is what constitutes common sense.

Now this order, which is so favorable, so necessary to reasoning, is almost always contrary to sense impressions, which name first the object that is first to strike them. That is why all peoples, abandoning the straightforward order, have had recourse to more or less adventurous turns of phrase, as their sense impressions or the associations of the words lead them to do; and inversion has prevailed on earth, because man is more urgently governed by passion than by reason.

The French language, by unique privilege, is the only one to remain faithful to the straightforward order, as if it were all reason, and it doesn't matter if we disguise this order by the most varied movements and all the resources of style, it necessarily still exists; and in vain do the passions overwhelm us and urge us to follow the order of our sense impressions: French syntax is incorruptible.

This is the source of that admirable clarity that is the eternal foundation of our language. What is not clear is not French; what is not clear is still English, Italian, Greek or Latin.

To learn the languages with inversions, it is enough to know the words and their inflections; to learn the French language, we must also retain the word order. We could say that a totally elementary geometry, a simple straight line, forms the French language; and that curves in their infinite varieties preside over the Greek and Latin languages. Our [language] rules and guides thought; their [languages] rush into the labyrinth of sense impressions and lose their way, following all the caprices of association; thus were [those languages] marvelous for oracles, while our language has completely disparaged them.

My comment: M. de Rivarol, puis-je vous présenter M. Derrida? But this is unfair. A century after de Rivarol's prize essay and a century before Derrida flourished, the idea that ce qui n'est pas clair n'est pas français had died a natural death among intellectuals, and Mallarmé performed its funeral oration. What Degas famously said, leaving one of Mallarmé's lectures, was "Je n'y comprends rien, rien!" not "Ce n'est pas français!"

Here's the original passage from de Rivarol's essay:

Ce qui distingue notre langue des langues anciennes et modernes, c'est l'ordre et la construction de la phrase. Cet ordre doit toujours être direct et nécessairement clair. Le français nomme d'abord le sujet du discours, ensuite le verbe qui est l'action, et enfin l'objet de cette action : voilà la logique naturelle à tous les hommes ; voilà ce qui constitue le sens commun.

Or cet ordre, si favorable, si nécessaire au raisonnement, est presque toujours contraire aux sensations, qui nomment le premier l'objet qui frappe le premier. C'est pourquoi tous les peuples, abandonnant l'ordre direct, ont eu recours aux tournures plus ou moins hardies, selon que leurs sensations ou l'harmonie des mots l'exigeoient ; et l'inversion a prévalu sur la terre, parce que l'homme est plus impérieusement gouverné par les passions que par la raison.

Le français, par un privilège unique, est seul resté fidèle à l'ordre direct, comme s'il était tout raison, et on a beau par les mouvemens les plus variés et toutes les ressources du style, déguiser cet ordre, il faut toujours qu'il existe ; et c'est en vain que les passions nous bouleversent et nous sollicitent de suivre l'ordre des sensations : la syntaxe française est incorruptible.

C'est de là que résulte cette admirable clarté, base éternelle de notre langue. Ce qui n'est pas clair n'est pas français ; ce qui n'est pas clair est encore anglais, italien, grec ou latin.

Pour apprendre les langues à inversions, il suffit de connoître les mots et leurs régimes ; pour apprendre la langue française, il faut encore retenir l'arrangement des mots. On diroit que c'est d'une géométrie toute élémentaire, de la simple ligne droite que s'est formée la langue française ; et que ce sont les courbes et leurs variétés infinies qui ont présidé aux langues grecque et latine. La nôtre regle et conduit la pensée ; celles-là se précipitent et s'égarent avec elle dans le labyrinthe des sensations, et suivent tous les caprices de l'harmonie : aussi furent-elles merveilleuses pour les oracles, et la nôtre les eût absolument décriés.

Posted by Mark Liberman at 07:02 AM

Truthiness or trustiness?

The New York Times has bigger headaches to deal with right now, but they blew the punchline to some linguistic humor that appeared in last week's premiere of "The Colbert Report," Stephen Colbert's satirical spinoff of "The Daily Show with Jon Stewart" on Comedy Central.

First, here is a transcript of what Colbert actually said on the Oct. 17 show, in his deadpan sendup of the "talking points" that Bill O'Reilly is so fond of spewing at viewers of his Fox News show. (Colbert captures O'Reilly's pseudo-demotic inanities perfectly, complete with sleek but pointless graphic accompaniments.)

And on this show, on this show your voice will be heard... in the form of my voice. 'Cause you're looking at a straight-shooter, America. I tell it like it is. I calls 'em like I sees 'em. I will speak to you in plain simple English.

And that brings us to tonight's word: truthiness.

Now I'm sure some of the Word Police, the wordanistas over at Webster's, are gonna say, "Hey, that's not a word." Well, anybody who knows me knows that I'm no fan of dictionaries or reference books. They're elitist. Constantly telling us what is or isn't true, or what did or didn't happen. Who's Britannica to tell me the Panama Canal was finished in 1914? If I wanna say it happened in 1941, that's my right. I don't trust books. They're all fact, no heart.

(A video clip of this segment is available here and here.)

Now here is how Colbert's monologue was paraphrased in the Times' Oct. 25 review of the show by Alessandra Stanley:

On his regular feature "The Word," Mr. Colbert routinely mocks the kind of anti-intellectual populism perfected by Fox News. "Trustiness" was his word of the day, he told viewers with a poker face, sneering at the "wordanistas over at Webster's" who might refute its existence. "I don't trust books," he explained. "They're all fact and no heart."

It's easy to see how Stanley accidentally substituted trustiness for truthiness, since it anticipates Colbert's blowhard assertion, "I don't trust books." But it completely ruins the joke, of course, since the "wordanistas over at Webster's" wouldn't have a problem with trustiness — nor would their colleagues at American Heritage, Encarta, et al. Even non-wordanistas would likely recognize trustiness as a rather unremarkable nominalization of the everyday word trusty (though it's certainly not as common as the similar nominalization trustworthiness).

But just in case Stanley didn't kill the humor entirely, let me finish the job by pointing out that truthiness wouldn't necessarily offend the Word Police either, since it actually appears in the Oxford English Dictionary. The OED has an entry for truthy, marked "rare or dialectal" and defined as "characterized by truth; truthful, true." The derived form truthiness (meaning "truthfulness, faithfulness") follows, supported by this citation:

1824 J. J. GURNEY in Braithwaite Mem. (1854) I. 242 Everyone who knows her is aware of her truthiness.

While I'm deflating lexicographical jokes, let me tackle the old grade-school gotcha game, "Did you know gullible isn't in the dictionary?" There's a kernel of truth to this, actually. As Donna Richoux noted in a discussion on the Usenet newsgroup alt.folklore.urban back in 2002, gullible doesn't appear in Noah Webster's 1828 American Dictionary of the English Language. (Here is one online transcription of the relevant page.) It turns out that gullible is a relatively recent word — it was still quite new when Webster published his dictionary, as the OED's earliest citation is from 1825. Also, the word was formed by a rather circuitous route, according to the OED's etymological information. It was evidently a back-formation of gullibility (dated to 1793), which in turn was an alteration of cullibility (1728), ultimately from cull (1698), meaning "a dupe". It's a puzzlingly roundabout derivation, considering that gull meaning "to dupe" dates back to c.1550.

That's all from this wordanista.

[Update, 10/27/05: Stanley may have been a victim of her spellchecker. Details here.]

[Update, 11/1/05: The Times has issued a correction.]

[Update, 1/6/06: The American Dialect Society has selected truthiness as Word of the Year.]

[Update, 1/10/06: Colbert gets personal.]

[Update, 1/13/06: More feuding over truthiness.]

[Update, 1/16/06: The OED may have erred in dating its citation to 1824 — it looks like it's actually from 1837.]

[Update, 1/19/06: Is the truthiness train finally grinding to a halt?]

Posted by Benjamin Zimmer at 01:00 AM

Software Libre in Cambodia

According to The South China Morning Post, the Open Forum of Cambodia is translating FLOSS software into Khmer in order to make computers accessible to the average Cambodian. Most Cambodians do not know English and cannot begin to learn even basic tasks such as word processing, email, and web surfing without taking expensive English lessons. An additional motivation is national pride. Ourn Bora, chief of cabinet for Stung Treng province, is quoted as saying:

This is about national dignity to see Khmer script on a computer. It is historical.

The use of FLOSS software is attractive for this project for two reasons. Its low cost makes it accessible to everyone even in a country like Cambodia with a per capita GDP of US$2,000. Even more important, Cambodians can modify FLOSS software and distribute their modifications. Proprietary software leaves the user at the mercy of its owner. Microsoft has said that it has no intention of providing a Khmer translation of its software.

The software being localized includes the OpenOffice.org suite (a word processor, spreadsheet, presenter, drawing program, and database), the Firefox browser, and the Thunderbird email program. The project is also developing Khmer equivalents for technical computer terms so as to make them easier to learn for the average person, for whom English terms are difficult.

Posted by Bill Poser at 12:34 AM

October 25, 2005

Better Not Use Q and W

A Turkish court has fined 20 Kurds 100 lira (US$74) for holding up placards containing the letters Q and W at a New Year's celebration, according to a Reuters report. These letters are used in Kurdish but not in Turkish. Using them therefore violates the law of November 1, 1928 on Adoption and Application of Turkish Letters, whose purpose was to change the writing system of Turkish from the Arabic-based Ottoman system to the Roman-based system developed under the secular modernizing regime of Mustafa Kemal "Atatürk".

Although this represents a technically correct application of the statute, it is nonetheless selective, as these and other letters not used in Turkish are commonly used in advertising without incurring prosecution. Here, for example, is the web site of Xerox Turkey, which uses the letter X, and here is a page containing information about the program Quark Express, which uses both Q and X. Turkey is evidently having some difficulty in fully implementing the commitment it made to the European Union to terminate its suppression of the Kurdish language. If I didn't know that there is no relationship between the structure of a language and the culture and character of its speakers, I would wonder how it could be that such a beautiful language as Turkish could have such turkeys among its speakers.

Posted by Bill Poser at 11:12 PM

Frankenstrunk, by Jan Freeman

God bless Jan Freeman. At least there is one newspaper writer on language who has, in addition to great style and humor, good research and a real sense of what is important and what is not. And in the case of Sunday's column, a better critique of Strunk and White ("this aging zombie of a book") in its stupid new full-color illustrated edition ("a colorful shroud on a corpse that's overdue for burial") than I could imagine writing myself. I will say nothing more about the October 23 "The Word" column in the Boston Sunday Globe, headed "Frankenstrunk", other than this: go and read it.

Posted by Geoffrey K. Pullum at 02:06 PM

Special linguistic providence

The idea that the English language is special, especially in its willingness to adopt or invent vocabulary, is favored in the popular imagination and therefore in popular writings on language. But assertions that "English is special" run immediately afoul of two prejudices of modern linguists: that all languages are roughly the same on all evaluative dimensions, and that asserted generalizations ought to be supported by evidence.

A few days ago, Bill Poser complained when Daphne Bramham observed in the Vancouver Sun that "English is the most idiosyncratic and wordiest of all languages". Back in June of 2004, Eric Bakovic took Richard Lederer to task for asserting that "the essential reasons for the ascendancy of English lie in the internationality of its words and the relative simplicity of its grammar and syntax". Eric's discussion, including an exchange of emails with Dr. Lederer, is good fun, well worth reviewing.

Other languages have popular notions of specialness as well: many Japanese are convinced that Japanese is uniquely difficult; some Chinese believe that Chinese is especially succinct and efficient; Umberto Eco mentions the seventh century Irish grammarians who "said that the Gaelic language was created after the confusion of tongues by the 72 wise men of the school of Fenius... so that the best of every language was selected and retained in Irish, which was perfect because it preserved the original isomorphism between words and things".

Outsiders and insiders generally see the same language in terms of different stereotypes. Thus native speakers of English are unlikely to resonate with Jean-Claude Sergeant's notion that English is "rigidly structured", "characterized first by an extreme concern for coherence and for explicitness approaching redundancy", and required to "avoid all ambiguity as to the identity of agents intervening in a phrase". And Steve Thorne found that the "pleasantness" of 20 English accents was rated very differently by natives and non-natives. (For example, British listeners found Birmingham speech 'boring', 'wrong', 'irritating', 'grating', 'nasal', and 'whingey', while non-native listeners found it 'nice', 'melodic', 'lilting' and 'musical'.)

Languages are certainly different, and they do actually differ in most of the ways touched on by these stereotypes. For example, Chinese text in fact seems to be significantly more compact than corresponding English text, though it's not clear whether this is a fact about the languages or about their writing systems. (In some specific cases, such as the Olympic slogan discussed by Victor Mair, Chinese versions are longer phonetically, morphologically, lexically, orthographically, cybernetically and even conceptually.) And I've been told that some languages do indeed resist lexical borrowing, for language-internal reasons rather than for reasons of cultural preference.

The trouble with this whole area of discussion, in my opinion, is not that assertions of linguistic difference or linguistic specialness are always false. Rather, the problem is the low ratio of fact and insight to glib and evocative generality. There is little concern for facts as opposed to anecdotes, little interest in distinguishing differences in language from differences in writing systems or more general cultural differences, and hardly any effort to avoid confirmation bias and to control for the profound effects of ethnic stereotypes, exoticism and nationalist or anti-nationalist agendas. Generalizations about evaluative differences among languages and dialects are less pernicious than ethnic, racial and sexual generalizations are, but they're just as difficult to study objectively and fairly.

On a lighter note, linguistic and non-linguistic stereotypes combine cheerfully in John Cowan's Essentialist Explanations, where we'll leave them all for now.

Posted by Mark Liberman at 06:33 AM

Peeveblogging

Lately it seems as if everywhere you look there are practitioners of what Deborah Cameron has called "verbal hygiene" (all manner of activities "born of an urge to improve or 'clean up' language," as Cameron puts it in her book of that title). From best-selling authors to cartoonists, from op-ed columnists to Supreme Court nominees both approved and waiting in the wings, everyone's getting into the act. Now verbal hygienists are exploring a new ecological niche: the blogosphere.

Of course, the blogosphere is already much more than an ecological niche — it's a thriving ecosystem unto itself (a blogobiosphere?), able to support its own abundant array of niches, subniches, and subsubniches. Thus it's not surprising that we now find "peeveblogging" — weblogs slavishly devoted to particular points of grammar, punctuation, or usage. Here are two examples, though no doubt there are many more lurking out there in the underbrush.

Literally, A Web Log: "an English language grammar blog tracking abuse of the word literally." According to the creators, Patrick Fitzgerald and Amber Rhea, "it started as a nit-picking distraction, grew to a frustrating obsession, and finally resulted in the creation of this blog." Examples are found in the wild and categorized on the blog as "correct," "incorrect," or "unnecessary." "Incorrect" items are those where literally intensifies a figurative expression, while "unnecessary" items are those where it is used as a more general intensifier. (The American Heritage Dictionary, which includes both of these senses in its entry for literally, observes that critics have complained about the seemingly contradictory usage of this word for more than a century.)

Apostrophe Abuse: "links and visuals illustrating a grammar pet peeve." This is quite a common gripe, as we have noted before — the blog even links to a British organization calling itself The Apostrophe Protection Society. Blog entries lean heavily on the greengrocer's apostrophe, with numerous photographs of offending signs. A recent example:

(The blogger, who goes only by Chris, wryly speculates, "Maybe there is a 'Thai' Herb, and this is his restaurant.")

The granddaddy of peeveblogging actually predates the era of rampant blogification. Begun in the antediluvian year of 1996, The Gallery Of "Misused" Quotation Marks was faithfully curated by the visionary Evan "Funk" Davies. Sadly, it seems that Davies stopped updating the site in 2000 (a broken link to CDnow, long since overtaken by the Amazon behemoth, sits forlornly at the top of the page). But the Gallery's spiritual descendants live on in the new cyberecology. Let a thousand grousers grow.

Posted by Benjamin Zimmer at 01:00 AM

October 24, 2005

In terms of recency,...

Here's a reporter ranting about the phrase "in terms of" in an op-ed piece in his newspaper:

In terms of "in terms of," I just wanted to put this in terms that you might understand. In terms of "in terms of," everything is so much in terms of "in terms of" these days that the terms have stopped making sense...

"In terms of," I think, should be terminated...

In terms of historical perspective, "in terms of" is the new "like" -- or, for the Northern Californians among us, the new "hella."...

Who is this grammatical curmudgeon? Is this one of the seasoned journalistic complainers -- maybe Philip Howard, James Cochrane, or William Safire? And is "in terms of" some new affliction, a recently spreading noxious weed in the garden of English?

The journalist is Jake Wachman, a Stanford senior (majoring in Science, Technology and Society), writing in the 10/13/05 issue of the Stanford Daily (p. 4). He seems to have just noticed the idiomatic "in terms of" and so thinks it's a recent thing and is all over the place -- the Recency and Frequency Illusions I've described here on Language Log. But in fact the usage has been around with some frequency since roughly the time Wachman's grandparents were teenagers: MWDEU suggests that it became popular following World War II and notes that it has been widely deplored in the advice literature on English grammar and usage at least since 1954.

(The Recency Illusion has been getting some press recently. Jan Freeman's "The Word" column in the Boston Globe for 10/9/05 looked at examples from her correspondence, under the heading "Losing our illusions", concluding: "When you spot what looks like an upstart usage, it's probably later than you think." And several conference papers by members of the Stanford ALL Project -- Buchstaller & Traugott at the Studies in the History of the English Language 4 conference in Flagstaff a few weeks ago, Buchstaller & Deeringer and Rickford et al. at the New Ways of Analyzing Variation 34 conference in New York City over the weekend -- mention it prominently. Full disclosure: I am a member of the Stanford ALL Project -- since the Rickford et al. paper appeared on the program as being by "Stanford ALL Project", I've begun thinking that all five of the faculty members involved should start introducing ourselves as "Professor Stanford A. Project" -- but I didn't write the bits that mention my Language Log postings.)

The standard objections to "in terms of" are that it's wordy, three words where one preposition ought to do (OMIT NEEDLESS WORDS, as Strunk tells us, succinctly and sternly), and that it's imprecise. MWDEU notes (with relevant examples) that the imprecision can be a virtue and that replacing "in terms of" with a simple preposition can be a tricky task, sometimes requiring major reworking of other parts of the sentence. It could have also noted that the three words of "in terms of", one of them a noun that is normally accented, might have some value too: the phonological weight of the expression puts some emphasis on it, highlighting its (relationship) semantics. There is a more or less constant pressure to bulk up simple prepositions for the purposes of emphasis; just last week I caught someone saying "within the circumstances", presumably to improve on the simple "in". Brevity is not the only virtue.

Wachman's rant focuses on one use of "in terms of" that most of the manuals don't separate from the larger bulk of uses: sentence-initial topic-marking "in terms of":

"In terms of office hours," I recently heard a teaching assistant say, "they are from two until four." In terms of better ways to say that same thought, he could have said, "Office hours are from two to four." That would have half as many words, four fewer syllables and a much better sound to boot.

I'm not sure what Wachman's metric for goodness of sound is, but I suspect that he's just repeating his objection to "in terms of", expressed now as an objection to its very sound. (Linguistic pet peeves are like that. They start to sound ugly to the peeved.) Otherwise, we're back to brevity. But the teaching assistant wasn't just providing information about office hours. He was announcing that he was going to say something about office hours, and then he said it. He was making the topic explicit, instead of relying on the implicit association between subjecthood and topicality (as in "Office hours are from two to four"). This is often a good thing to do in discourse; it's helpful to the people who are listening to you or reading you. It would be nice if English had a grammaticalized topic marker, something like the famous wa of Japanese (and parallel items in vast numbers of other languages), but it doesn't, so we press various idioms into service:

About office hours: they're from two to four.
Concerning/Regarding office hours, they're from two to four.
As concerns/regards office hours, they're from two to four.
As for office hours, they're from two to four.
With respect/regard to office hours, they're from two to four.
As far as office hours are concerned, they're from two to four.

All of these are standard idioms. The last of these has a much-deprecated non-standard variant

As far as office hours, they're from two to four.

(studied in detail by Rickford, Wasow, Mendoza-Denton, & Espinoza in Language 71.1.102-31 (1995); yeah, I know, another Stanford plug), though I can't resist pointing out that all that's going on here is the omission of unnecessary words. And Left Dislocation (see the extended discussion in Birner & Ward, Information status and noncanonical word order in English (1998)) provides yet another, and even briefer, way of marking topics:

Office hours, they're from two to four.

This might seem a bit abrupt, and in any case LD is widely viewed as informal, conversational, and/or non-standard, hence not acceptable in formal standard written English. Still, it can be skillfully deployed, as in this excerpt from an interview with gay sex/relationship columnist Jason Steele (in The Advocate, 10/25/05, p. 24):

There's one article I submitted recently about how I personally can't stand gay dance music. I'm more into Bravery or the White Stripes. Any straight person who's reading these columns, I don't want them to think all gay people go out to gay dance clubs like those in Queer as Folk.

The textbooks won't let you use LD or plain "as far as" for topic marking in formal writing, but there are still plenty of choices. "In terms of" is just another one in this set. Yet it gets singled out for calumny, probably because it is perceived as being a recent (and therefore unnecessary) addition to the set and because it is perceived as being "overused". These themes are explicit in the crusty James Cochrane's discussion (Between You and I, p. 67):

In terms of is still occasionally used correctly..., but in recent times it has started to behave like an irresistible virus, destroying and replacing such old familiar words and phrases as concerning or as regards or in view of or in the light of or even simply about, to the extent that one seems to hear or read it a hundred times a day.

Contagion, invasion, and conquest! An occasion for a moral panic!

Putting the drastically overheated imagery aside, there are two factual claims here: that "in terms of" is recent, and that it's frequent, more frequent than its competitors. The first claim is just false. It's not even true for sentence-initial topic-marking "in terms of", since I recall heartfelt puzzled complaints by colleagues at the University of Illinois and Ohio State University, back in the 60s, about their students' affection for "in terms of" as a topic-marking device in compositions and papers. (My colleagues were not at all pleased when I told them that the whole thing probably came down to a difference in preferences for topic-marking devices, so that if they wanted their students to replace all those occurrences of "in terms of" by "as for" or "with respect to" or whatever, they were just going to have to confess that some people are unaccountably annoyed by "in terms of" so maybe you should avoid it, or at least use it sparingly. I didn't touch on the possibility that the different topic-marking devices might actually be doing slightly different things.)

As for frequency -- I'm a big "as for" user, by the way -- I have no idea what the facts are (though, having been sensitized anew to "in terms of" by Wachman's piece, I'm still not coming across many occurrences), but I'm sure that Cochrane is deeply ignorant on the subject and is merely behaving like someone in the grip of the Frequency Illusion (as well as the Recency Illusion). If someone is willing to slog through a lot of data to figure out who uses topic-marking "in terms of", how often, in what contexts, and for what purposes, I'd cheer them on and welcome the results. Warning: this is not a quick and easy project.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 09:25 PM

The wordiness of English

In reference to Daphne Bramham's observation that "English is the most idiosyncratic and wordiest of all languages", I can't resist quoting James D. Nicoll's 5/15/1990 post to rec.arts.sf-lovers:

The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary.

I've corrected the original (eggcornic) "riffle" to "rifle". Some further discussion is here, including a quoted email from James Nicoll: "If I had only known that was going to be my fifteen minutes of fame, I'd have run that sucker through a spell checker and taken more care while writing the surrounding material."

I'm not sure whether Nicoll's implicit claim is true: does English borrow words more avidly than your average language? In any case, we're certainly not shy about it. And sometimes a phrase is just too good to hold back for fact-checking.

Posted by Mark Liberman at 10:14 AM

It's a choriamb, folks

Over the past couple of days, an old Language Log post ("An Internet Pilgrim's Guide to Accentual-Syllabic Verse") has been seeing an unexpected amount of traffic: 281 of the past 4,000 visitors came to that page, making it the highest single entry page besides "main". What's this sudden upsurge of interest in metrics, I wondered? It's not big enough for a slashdotting, but perhaps there's a controversy over at Lambda the Ultimate about Eskimo words for iambs, or a flap at Metafilter about Trochees of Mass Destruction? What fun, I thought to myself, checking the top-ranked referring pages to find the discussion in question. Alas, there isn't one.

I'm still not certain what's going on, but a glance at the sorted list of recent strings from search engine referrals suggests the truth.

Here's the top of the list:

39 dan brown 1.0%
37 in classical poetry a metrical foot consisting of two short syllables between two long ones 0.9%
23 language log 0.6%
18 metrical foot two short syllables between two long ones 0.5%
18 in classical poetry, a metrical foot consisting of two short syllables between two long ones 0.5%
18 classical poetry metrical foot two short syllables between two long ones 0.5%
11 metrical foot of two short syllables between two long ones 0.3%
9 in classical poetry a metrical foot of two short syllables between two long ones 0.2%
8 upskirting 0.2%
8 metrical foot consisting of two short syllables between two long ones 0.2%
7 word sex 0.2%
7 classical poetry a metrical foot consisting of two short syllables between two long ones 0.2%
7 butt crack 0.2%
7 a metrical foot consisting of two short syllables between two long ones 0.2%
6 log 0.2%
5 pirate language 0.1%
5 incall 0.1%
4 wedding vowels 0.1%
4 sic meaning 0.1%
4 ku language 0.1%
4 in classical poetry,a metrical foot consisting of two short syllables between two long ones 0.1%
4 how now brown cow 0.1%
4 hobbesian choice 0.1%
4 happy bunny 0.1%
4 adjectives 0.1%
3 whorf < 0.1%
3 trepidatious < 0.1%
3 starbucks coffee sizes < 0.1%
3 sic < 0.1%
3 russian tennis players < 0.1%
3 private sex < 0.1%
3 poetry metrical foot two short syllables between two long ones < 0.1%
3 outcall < 0.1%
3 moreso < 0.1%
3 metrical foot 2 short syllables between 2 long ones < 0.1%
3 mens rea < 0.1%
3 language < 0.1%
3 hominin < 0.1%
3 harry potter porn < 0.1%
3 hangul < 0.1%
3 chinese menus < 0.1%
3 "if you will" < 0.1%
2 yoda < 0.1%
2 web crow < 0.1%
2 un official languages < 0.1%
2 typhoon long wang < 0.1%
2 teach english quebec < 0.1%
2 starbucks sizes < 0.1%
2 starbucks cup sizes < 0.1%
2 snow in inuit < 0.1%
2 simlish < 0.1%
2 shannon's equation < 0.1%
2 sex pro < 0.1%
2 sarcastic phrases < 0.1%
2 rasheed wallace quotes < 0.1%
2 prejudice accents < 0.1%
2 obstinant < 0.1%
2 o tempora o mores < 0.1%
2 neanderthal language < 0.1%
2 mocosoft < 0.1%
2 metrical foot two short syllables between two long ones classical poetry < 0.1%
2 marriage vowels < 0.1%
2 loggerheads < 0.1%
2 like < 0.1%
2 languagelog < 0.1%
2 language of brazil < 0.1%
2 language blog < 0.1%
2 kenosha kid < 0.1%
2 kennings < 0.1%
2 inclimate < 0.1%
2 in poetry a metrical foot two short syllables between two long ones < 0.1%
2 in classical poetry metrical foot two short syllables between two long ones < 0.1%
2 in classical poetry a metrical foot of 2 short syllables between 2 long ones < 0.1%
2 in classical poetry a metrical foot consisting two short syllables between two long ones < 0.1%
2 in classical poetry ,a metrical foot consisting of two short syllables between two long ones < 0.1%

I bet that some popular crossword puzzle recently featured as a clue "in classical poetry, a metrical foot consisting of two short syllables between two long ones". Most of the hits come from IP addresses in Great Britain, so it's probably a puzzle from a British Sunday paper.
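To see how much of the list is really one query in many spellings, the variants can be folded together before counting. Here is a minimal sketch in Python, using a handful of rows copied from the list above. The `is_clue_variant` heuristic and the "crossword clue" label are my own assumptions for illustration, not anything the referral log itself provides.

```python
import re
from collections import Counter

# A few (count, query) rows copied from the referral list above.
REFERRALS = [
    (39, "dan brown"),
    (37, "in classical poetry a metrical foot consisting of two short syllables between two long ones"),
    (23, "language log"),
    (18, "metrical foot two short syllables between two long ones"),
    (18, "in classical poetry, a metrical foot consisting of two short syllables between two long ones"),
    (11, "metrical foot of two short syllables between two long ones"),
    (8, "upskirting"),
    (3, "metrical foot 2 short syllables between 2 long ones"),
]

# Hypothetical heuristic: treat a query as a variant of the clue if it
# contains all four of these words, whatever else surrounds them.
CLUE_WORDS = {"metrical", "foot", "short", "long"}


def is_clue_variant(query):
    """Return True if the query looks like a rewording of the clue."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return CLUE_WORDS <= words


def tally(referrals):
    """Sum referral counts, folding all clue variants into one bucket."""
    totals = Counter()
    for count, query in referrals:
        key = "crossword clue" if is_clue_variant(query) else query
        totals[key] += count
    return totals


totals = tally(REFERRALS)
for query, count in totals.most_common():
    print(count, query)
```

Even on this small sample, the folded clue bucket comes out well ahead of "dan brown", which is the pattern the raw list only hints at.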

Unfortunately, the page everyone is getting to ("An Internet Pilgrim's Guide to Accentual-Syllabic Verse") doesn't give the answer, which is choriamb.

Soon enough, this flurry will die down, and we'll be back to our usual set of web search referrals: the aficionados of Dan Brown, upskirting, word sex, butt crack and incall.

[Update: apparently it's a Mail on Sunday crossword, with a cash prize.]

Posted by Mark Liberman at 09:23 AM

Semantic entanglements

Back in July, Mark Liberman wrote that "the Valerie Plame story is all about referential opacity and felicity conditions for speech acts and other issues in philosophy of language." Since the release of New York Times reporter Judith Miller from jail and that paper's attempt to report reflexively on Miller's involvement in the Plame affair, semantic obscurity has remained the order of the day.

First we were left to puzzle over what the vice president's chief of staff, Lewis "Scooter" Libby, meant when he wrote to a still-imprisoned Miller the following bit of purple prose:

Out West, where you vacation, the aspens will already be turning. They turn in clusters, because their roots connect them. Come back to work—and life.

Since the leak of this letter, bloggers have had a field day speculating that Libby was sending a coded message to Miller, coaching her to give favorable testimony about him to the grand jury. (Satirists Bruce Kluger and David Slavin take the Dan Brown approach and break "the Da Libby Code" via anagrams.) Miller, for her part, when asked about the "turning aspens" letter by special counsel Patrick J. Fitzgerald, described her mysterious "last encounter" with Libby:

It came in August 2003, shortly after I attended a conference on national security issues held in Aspen, Colo. After the conference, I traveled to Jackson Hole, Wyo. At a rodeo one afternoon, a man in jeans, a cowboy hat and sunglasses approached me. He asked me how the Aspen conference had gone. I had no idea who he was.
"Judy," he said. "It's Scooter Libby."

So by mentioning "aspens" Libby was making an oblique reference to the city of Aspen and a conference Miller attended there? Miller's "explanation" only confuses matters further. Others have conjectured that Libby's esotericism points to the legacy of neoconservative godfather Leo Strauss (Strauss was a teacher of Paul Wolfowitz at the University of Chicago, and Wolfowitz in turn taught Libby at Yale). As John Dickerson writes in Slate:

Part of Strauss' teaching is that ancient philosophers wrote on two levels: for the mumbling masses, but also, and often in contradiction of the literal message, on an "esoteric" level that only initiates could make out. Some Straussians have adopted this code themselves. So, where Homer Simpson would interpret Libby's note as ham-handed fawning over Judy, a Straussian close reader might discern something more devious: a literary file in the cake for both of them.

It might take an initiate in Straussian hermeneutics to make sense of some of the linguistic somersaults that have appeared recently in the New York Times as it gingerly deals with its Judy Miller problem. Take, for instance, the matter of Miller's "security clearance," as she herself called it in her Oct. 16 "personal account." In a follow-up story a few days later, Miller was forced to amend her wording when doubts were raised about whether she had anything more than the nondisclosure agreement signed by other "embedded" reporters:

In a telephone interview Wednesday, Ms. Miller said this so-called nondisclosure form was precisely what she had signed, with some modifications, adding that what she had meant to say in her published account was that she had had temporary access to classified information under rules set by her unit.

So when Miller wrote "security clearance," what she had meant to say was "nondisclosure form," naturally enough. As NYU's Jay Rosen pointedly remarked, "What the New York Times has not figured out yet is that Judith Miller is an extreme example of the unreliable narrator. She increases our doubt in the story as she tells it."

More unreliable narrativity came in the Oct. 22 edition of the Times, in an article reporting on a scathing memo about Miller distributed to the Times staff from executive editor Bill Keller. Keller accused Miller of misleading (or at least "seem[ing] to have misled") Washington bureau chief Philip Taubman, since Miller did not tell Taubman that she was a recipient of the Plame leak. Miller's response to Keller's memo is an intriguing study in speech acts: "I certainly never meant to mislead Phil, nor did I mislead him." In the parlance of speech act theory, Miller denies that she intended to commit the "illocutionary act" of misleading Taubman, and she further denies the "perlocutionary effect" that he was actually misled. But wouldn't we need to ask Taubman himself whether he was misled?

In the very next paragraph of the article, Miller manages to make an even more confusing assertion regarding the Keller memo:

She wrote that as she had said in an account in The Times last Sunday, she had discussed Mr. Wilson and his wife with government officials, but "I was unaware that there was a deliberate, concerted disinformation campaign to discredit Wilson and that if there had been, I did not think I was a target of it."

If Miller was unaware that there was a campaign to discredit Joe Wilson, then how would she be able to think one way or the other about whether she was a target of a campaign of which she was not aware? There are Zen koans that are easier to decipher.

Given the tangled web of semantics and pragmatics that the Plame/Wilson/Libby/Miller affair has turned into, it should perhaps be no surprise that the word "entanglement" is itself part of the story. In his memo to the Times staff, Keller wrote:

But if I had known the details of Judy’s entanglement with Libby, I’d have been more careful in how the paper articulated its defense, and perhaps more willing than I had been to support efforts aimed at exploring compromises.

Miller shot back: "As for your reference to my 'entanglement' with Mr. Libby, I had no personal, social, or other relationship with him except as a source." It seems that Miller is taking grave offense at Keller's use of the word "entanglement" for reasons perhaps known only to her (though the long-standing rumors about Miller's intimacy with powerful Washington men might have something to do with it). The waters were further muddied by Times public editor Byron Calame. In his Sunday column on "the Miller mess," Calame reports a similar statement by Keller about Miller and Libby, only with the word "engagement" instead of "entanglement." Calame provides the full context of Keller's remarks on his web journal:

But if I had known the details of Judy's engagement with Libby, I'd have been more careful in how the paper articulated its defense, and I'd have been better equipped for the third turning point, below.

According to Calame, Keller sent him this statement in an e-mail on Wednesday, Oct. 19, and Keller then sent the Times staff "much the same message" in a memo on Friday, Oct. 21. But between Wednesday and Friday, Keller decided to revise "engagement" to "entanglement" (along with many other redactions). "Engagement" would have obviously been a less felicitous choice of words—we can imagine Times staffers sarcastically asking where the happy couple is registered. But connotatively speaking, "entanglement" is not much better, at least from Miller's perspective.

Perhaps some day soon the Office of the Special Counsel will help unsnarl this baffling morass. But I suspect we're in for many more weeks of entangled discourse. More grist for the Language Log mill, at the very least!

[Update, 11/9/05: See this followup on Judith Miller's departure from the Times.]

Posted by Benjamin Zimmer at 07:22 AM

October 23, 2005

Invariably followed by the phrase

Says Nancy Franklin in The New Yorker (October 24, 2005, p. 90), apropos of the new political TV drama "Commander in Chief", with Geena Davis playing the first woman president of the USA:

No one seems to be able to talk about "Commander in Chief" without also talking about Hillary Clinton, whose name is invariably followed these days by the phrase "who may or may not run for president in 2008."

But you don't have to just swallow this. Language Log is here to check such things for you. Journalists seem to love to make unchecked (but readily checkable) claims about the linguistic record in order to back up their unverified intuitions about what is on the public mind. Here are the facts:

Word string	Google hits
`"Hillary Clinton"`	6,150,000
`"Hillary Clinton, who may or may not run for president in 2008"`	0
`"Hillary Clinton, who may or may not run for president"`	0
`"Hillary Clinton, who may or may not run"`	0
`"Hillary Clinton, who may or may not"`	1

And that single hit for "Hillary Clinton, who may or may not" at the end there is not about running in 2008. It's from a 1998 article about Cherie Booth (wife of UK prime minister Tony Blair) wearing a necklace said to contain magic crystals barring harmful computer rays and stress. The sentence runs: "One of Booth's few fellow-travellers is Hillary Clinton, who may or may not have suggested the stress-busting necklace but who has reputedly sought the aid of spiritual guides, summoning up Eleanor Roosevelt from the White House ectoplasm." So it would appear that absolutely no one has ever followed the words "Hillary Clinton" with the words "who may or may not run for president in 2008" in any forum that Google can find. The statement is hugely, monstrously, grotesquely false.
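The same kind of check can be run on any collection of text one happens to have. What follows is a minimal sketch in Python, not the Google queries used for the table above: the function `followed_by_rate` and the three-sentence `SAMPLE` are hypothetical, with one sentence shortened from the 1998 article and the other two made up for illustration.

```python
import re


def followed_by_rate(texts, name, continuation):
    """Count mentions of `name`, and how many are immediately followed
    (allowing commas and spaces in between) by `continuation`."""
    name_re = re.compile(re.escape(name), re.IGNORECASE)
    follow_re = re.compile(r"[\s,]*" + re.escape(continuation), re.IGNORECASE)
    total = followed = 0
    for text in texts:
        for m in name_re.finditer(text):
            total += 1
            # Anchor the continuation pattern right after the name.
            if follow_re.match(text, m.end()):
                followed += 1
    return total, followed


# Illustrative sample only; a real check needs a real corpus.
SAMPLE = [
    "One of Booth's few fellow-travellers is Hillary Clinton, who may or "
    "may not have suggested the stress-busting necklace.",
    "Hillary Clinton, who may be a presidential candidate in 2008, spoke today.",
    "No one can discuss the show without mentioning Hillary Clinton.",
]

print(followed_by_rate(SAMPLE, "Hillary Clinton", "who may or may not"))
print(followed_by_rate(
    SAMPLE, "Hillary Clinton", "who may or may not run for president in 2008"))
```

A claim of "invariably followed by" predicts that the two numbers returned will be equal; anything short of that is a counterexample.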

It is not that this is some big factual issue. There are plenty of hits for phrases like "Hillary Clinton, who may be a presidential candidate in 2008" and "Hillary Clinton, who may look to become the first female president in American history" and "Hillary Clinton who may still enter the fray." What I am puzzled by is the practice of needlessly making true claims about nonlinguistic matters into false claims about linguistic material. Nancy Franklin could have said that no one seems to be able to talk about "Commander in Chief" without also talking about Hillary Clinton, who is constantly being referred to these days as a possible contender for president in 2008.

We have seen this before. I commented last year on a claim by Mark Bauerlein that "references to ‘right-wing think tanks’ are always accompanied by the qualifier ‘well-funded’." The claim is not just false but absurdly far from the truth: hardly any references to right-wing think tanks are accompanied by that phrase. I said there that I wished journalists wouldn't "spoil their presentations ... by including ridiculous claims about public use of language that can be falsified in seconds." But the practice goes on. I have no idea why.

Posted by Geoffrey K. Pullum at 07:59 PM

Tropical Storm Alpha

Tropical storms have been in the news lately and now, finally, there's a linguistic aspect to the news: for the first time the main list of names for tropical storms in the Atlantic region has been exhausted and it has been necessary to fall back to the secondary list: the current tropical storm is prosaically named Alpha. Tropical storm Alpha is only the 22nd tropical storm of the year; the tropical storm name list for the Atlantic contains no entries for Q, U, X, Y, or Z.

The National Hurricane Center of the US National Weather Service has information about the naming of tropical storms. There are different lists of names for different regions. The lists vary in length. The Eastern North Pacific lists, for example, contain 24 names. They have entries for X, Y, and Z but like the Atlantic list have none for Q and U. The only lists to have entries for all 26 letters of the alphabet are those for the Southwest Indian Ocean. If you live there you can enjoy storms with names such as Qiqita and Quincy, Ula and Usta, Willem and Wilby, Xaoka and Xanda, Yelda and Yuri, and Zuza and Zoelle. The shortest lists are those for the Papua New Guinea region, which contain only eight names, and those for the Central North Pacific, which contain only 12 names, presumably reflecting the fact that the names are chosen from the indigenous languages of the respective regions, whose phonological inventories tend to be small.

In some ways the most interesting list is the one for the Western North Pacific. It cycles through names contributed by the various countries of the region: Cambodia, China, North Korea, Hong Kong, Japan, Laos, Macau, Malaysia, Micronesia, Philippines, South Korea, Thailand, the United States, and Vietnam. (I've expanded abbreviations and changed the official names to those commonly used in American English. The original names are in alphabetical order.) There is some interesting politics here. Hong Kong, which is part of China, is treated as a country, as is Macau, while Taiwan is excluded, presumably at the insistence of China.

According to the US National Weather Service, in the West Indies hurricanes were originally named after the saint on whose day they occurred. The current system originated in WWII when US military meteorologists began naming storms after their wives and girlfriends. Currently the names are assigned by regional committees of the World Meteorological Organization. The names of particularly severe storms, like the numbers of athletes, are retired.

Posted by Bill Poser at 01:16 PM

Chebyshev and Murphy in Iraq

OK, one more try at interpreting Matthew D. LaPlante's sentence

It is rare, Hamblin knows, for these kinds of situations to end better than they normally do.

My first idea was that "better" was being taken to imply "good", whereas the situations in question almost always end badly, so that even if about half the outcomes are literally better than average, few of them would qualify as good enough to be called "better". The next idea, due to Kenny Easwaran and some other readers, was that the author intended some kind of de dicto interpretation. As another option, Fernando Pereira suggests that "better" here means something like "significantly better" or "perceptibly better":

The Hamblin sentence could be taken to be about the rarity of large deviations from the norm. Let X be the random variable "outcome of these situations". Then, what Hamblin knows is

P(X - EX >= ε) <= δ

where ε is an appropriate (maybe just discernible) difference in outcomes, and δ is an appropriately small probability, and P and the expectation E are over the distribution of "these situations". The relationship between ε and δ is the subject of many well-known theorems. For example, in the one-tailed Chebyshev inequality, δ = 1/(1+ε²/Var(X)).

This makes a lot of sense.
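To make the bound concrete, here is a minimal Python sketch of the one-tailed Chebyshev inequality as stated above. The outcome scores are invented purely for illustration, and the function names are my own, not anything from Fernando's note.

```python
def population_variance(outcomes):
    """Variance of a finite list of outcomes, treated as the whole population."""
    mean = sum(outcomes) / len(outcomes)
    return sum((x - mean) ** 2 for x in outcomes) / len(outcomes)


def cantelli_bound(eps, variance):
    """One-tailed Chebyshev bound (delta) on P(X - E[X] >= eps), for eps > 0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    if variance < 0:
        raise ValueError("variance must be non-negative")
    # Algebraically the same as 1 / (1 + eps**2 / variance),
    # written so that variance == 0 does not divide by zero.
    return variance / (variance + eps ** 2)


def upper_tail_fraction(outcomes, eps):
    """Fraction of outcomes that end at least eps better than the mean."""
    mean = sum(outcomes) / len(outcomes)
    return sum(1 for x in outcomes if x - mean >= eps) / len(outcomes)


# Hypothetical outcome scores: most situations end badly, one ends well.
outcomes = [0, 0, 0, 0, 10]
eps = 8

delta = cantelli_bound(eps, population_variance(outcomes))
observed = upper_tail_fraction(outcomes, eps)
print("bound on P(X - EX >= eps):", delta)
print("observed fraction:", observed)
```

With these made-up numbers the observed fraction of much-better-than-average endings does not exceed the bound, which is all the inequality promises: large positive deviations are rare, whatever the shape of the distribution.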

Anyhow, LaPlante's reporting from Iraq for the Salt Lake Tribune is some of the most consistently interesting stuff I've seen. In addition to the piece on Sgt. Ozro Hamblin, there's another on non-military interpreters. His most recent report deals with that "hillbilly armor" we all heard about during the last election campaign -- guess what, some of it's still there. And you should definitely read his Oct. 10 essay on sex in the military.

Posted by Mark Liberman at 01:14 PM

Rarely better than de re

A week ago, I puzzled over the sentence "It is rare, Hamblin knows, for these kinds of situations to end better than they normally do." Kenny Easwaran wrote to suggest that this might be a sort of de dicto vs. de re ambiguity, citing as an analogous example

"What these people don’t understand is that He built the world to make us think the earth is older than it really is."

or the shorter, if less interesting, classroom favorites like "Kim thinks Leslie is taller than she is."

De dicto vs. de re is a classical distinction in the interaction of reference and modality. I'm not convinced that it works as an account of the Hamblin sentence, but back in February I went so far as to suggest that understanding the de re/de dicto distinction might help keep journalists out of trouble, so let's give it a shot.

I'll start with last February's example. At the World Economic Forum in Davos, Eason Jordan, then CNN's chief news executive, said some things that got him into hot water. Rony Abovitz, who was at the session, blogged it in these terms on 1/28/2005:

During one of the discussions about the number of journalists killed in the Iraq War, Eason Jordan asserted that he knew of 12 journalists who had not only been killed by US troops in Iraq, but they had in fact been targeted. He repeated the assertion a few times, which seemed to win favor in parts of the audience (the anti-US crowd) and cause great strain on others.

Controversy ensued, and Jordan resigned (as he put it) to "prevent CNN from being unfairly tarnished by the controversy over conflicting accounts of my recent remarks regarding the alarming number of journalists killed in Iraq". The disagreements were not only over what Jordan said, and what the facts in Iraq were, but also about how to interpret whatever it was he said. And some of these interpretive differences hinged on the classic de re/de dicto ambiguity. Suppose that Jordan said something like

U.S. forces in Iraq have intentionally killed 12 journalists.

This isn't an exact quote -- some people believe that Jordan resigned to prevent the transcript from being released -- but whatever he said was apparently subject to the same ambiguity. Jay Rosen, from NYU's department of journalism, put it like this:

"The original account was too ambiguous for me. It had him saying United States soldiers targeted journalists, and then claiming that's not what he meant. He later explained it as: the soldiers were trying to kill these people, but did not know they were shooting at journalists."

That interpretation is the de re ("about the thing") reading. The soldiers were trying to kill certain people, without knowing that their targets would turn out to be journalists. And Oedipus wanted to marry Jocasta, without knowing that she was his mother. Here the belief or desire is all about the thing referred to -- the targeted person, the spouse, whatever -- and the description comes from outside.

The alternative is the de dicto ("about the saying") reading. Here the belief or desire is all about the description: the soldiers want to go out and kill some journalists; Oedipus is an adoptee who wants to find his mother and marry her.

According to the Stanford Encyclopedia of Philosophy,

The idea of the systematic distinction between the readings de dicto (in sensu composito) and de re (in sensu diviso) of modally qualified statements was introduced into medieval discussions in Abelard's investigations of modal statements (Super Periherm. 3-47, Dialectica 191.1-210.19), and was often mentioned, as in the Dialectica Monacensis, in discussions of the composition-division ambiguity of sentences.

Despite his understanding of the de re/de dicto distinction, Abelard came to a more troubled end than Eason Jordan did. All the same, his legacy includes the University of Paris, an enduring story of love in adversity -- and one of the few bits of Latin that remain in common philosophical use.

One example of the prevalence of the traditional use of modal notions can be found in the early medieval de dicto/de re analysis of examples such as ‘A standing man can sit’. It was commonly stated that the composite (de dicto) sense is ‘It is possible that a man sits and stands at the same time’ and that on this reading the sentence is false. The divided (de re) sense is ‘A man who is now standing can sit’ and on this reading the sentence is true.

In the middle of the 20th century, W.V. Quine re-analyzed this distinction as a matter of the scope of logical operators, and applied it to propositional attitude terms such as believe. As the Stanford Encyclopedia of Philosophy explains his analysis:

[1] Ortcutt believes that someone is a spy.
This could mean just that
[2] Ortcutt believes that there are spies
or that Ortcutt has more interesting information:
[3] Someone is an x such that Ortcutt believes that x is a spy.
The distinction here can be seen as a distinction of scope for the existential quantifier. In [2], the existential quantifier is interpreted as having small scope, within the propositional clause of the belief attribution.
[2*] Ortcutt believes: ∃x, x is a spy.
In [3], however, the existential quantifier has large scope, selecting an individual and then ascribing a belief that relates Ortcutt to that particular individual.
[3*] ∃x, Ortcutt believes that x is a spy.

(The backwards E is the "existential quantifier", so that the de re version ∃x, Ortcutt believes that x is a spy is read "there exists an x such that Ortcutt believes that x is a spy".)

OK, how could this help us with "It is rare, Hamblin knows, for these kinds of situations to end better than they normally do"?

Let's start with a scope ambiguity involving a comparative and a verb like want, which will make a good Quinian de re/de dicto example:

Kim wants to score higher than Leslie scored.

This could mean that there exists some score L (which we describe as the score Leslie got) such that Kim wants to score higher than L. That's the de re reading -- it's all about the numerical score. Kim doesn't care that it was Leslie's score, and maybe she doesn't even know who Leslie is.

Alternatively, the same sentence could mean that Kim wants her score to beat Leslie's, regardless of what it is. That's the de dicto reading -- it's all about Leslie's level of achievement.

We can get a similar scope ambiguity with a predicate like rare.

It was rare for Kim to score higher than Leslie did.

Translated into "heavy English", this might mean something like

"There was an L=Leslie's score, such that it was rare that there was a K=Kim's score and K was greater than L."

or it might mean something like

"It was rare that there was an L=Leslie's score and a K=Kim's score such that K was greater than L."

Here the de re/de dicto distinction doesn't arise from the interpretation of a traditional modal operator (like "necessarily") or a propositional attitude verb (like "wants"), but instead in reference to a statistical sampling process. We're not talking about whether necessity, belief or desire applies to a thing per se or to a thing under a given description. Instead, we've got a scope ambiguity having to do with a sampling process: do we fix Leslie's score and then look at the statistics of Kim's scores relative to it? or do we look at the statistics of the relationship between Kim's scores and Leslie's scores?
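The two sampling procedures can be sketched in a few lines of Python. The scores below are invented, and the function names are just labels I made up for the two readings; the only point is that the same data can make "rare" true on one reading and false on the other.

```python
def rate_against_fixed_score(kim_scores, reference):
    """Wide-scope reading: fix one score L first, then ask how often Kim beats it."""
    return sum(k > reference for k in kim_scores) / len(kim_scores)


def rate_pairwise(kim_scores, leslie_scores):
    """Narrow-scope reading: on each occasion, compare Kim's score with Leslie's."""
    pairs = list(zip(kim_scores, leslie_scores))
    return sum(k > l for k, l in pairs) / len(pairs)


# Hypothetical scores on five exams.
kim = [70, 85, 60, 90, 75]
leslie = [65, 80, 55, 95, 70]

# Wide scope: take L to be the one score Leslie got on the fourth exam.
fixed_L = leslie[3]
wide = rate_against_fixed_score(kim, fixed_L)

# Narrow scope: L varies along with K, exam by exam.
narrow = rate_pairwise(kim, leslie)

print("Kim beats the fixed score L on", wide, "of the exams")
print("Kim beats Leslie's same-exam score on", narrow, "of the exams")
```

In this toy data set Kim never beats the fixed score, so "rare" is true on the first reading, while she beats Leslie's same-exam score on most occasions, so it is false on the second.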

However, I'm not sure that the original example

"It is rare ... for these kinds of situations to end better than they normally do."

involves a coherent scope ambiguity of this type, because the reference point "how these situations normally [end]" comes out of the same sampling process referred to by the phrase "it is rare".

The idea seems to be that how these situations normally end is "badly", and if we substitute "badly" for "how they normally do" in

It's rare for these situations to end better than they normally do.

then we get

It's rare for these situations to end better than badly.

which is awkward but coherent. I agree that this is probably what the writer had in mind, and it does seem analogous to some of the medieval de re/de dicto examples, but I don't see a coherent reconstruction in terms of a Quinian scope difference. However, that's probably because I'm not a real semanticist, I just play one occasionally on Language Log.

Posted by Mark Liberman at 07:44 AM

October 22, 2005

English the most idiosyncratic and wordy?

Daphne Bramham's column [subscription required] in today's Vancouver Sun is entitled:

Keeping up with the English race:
The most idiosyncratic and wordiest of languages acquires and sheds words with stunning speed
For the most part, the column is an innocuous discussion of words that have recently entered the language. My favorite is ignoranus, for a person who is both ignorant and an asshole. I can think of many uses for that one. What struck me as peculiar are the claims that of all languages English is the "most idiosyncratic" and "wordiest". These are repeated in the body of the article:

English is the most idiosyncratic and wordiest of all languages. It has none of the rigidity of form that is the hallmark of German. And in all of its creative exuberance, unlike French, there's been no need for official word police.

The claim that English is the wordiest language has a fairly straightforward interpretation. "Wordy" means "using or containing too many words", so the wordiest language would be the language that, on average, uses the most words to express the same content. We can't really evaluate this claim without making it more precise - we need to have consistent cross-linguistic notions of "word" and "same content" - but I strongly suspect that insofar as we can make the claim precise enough to test, it will turn out not to be true. One reason is that, other things being equal, we should expect wordiness to be greater in isolating languages, in languages spoken by people with simple and unspecialized technology, and in languages whose relatively recent history has been such that they have not acquired multiple layers of vocabulary via language contact. Since English is not at the extreme isolating end of the morphological spectrum, is associated with complex and specialized technology, and has multiple lexical strata, we would not expect it to be particularly wordy.

Another reason is that both I and others who have some experience with translation have the distinct impression that documents generally lengthen noticeably when translated from English into French but not conversely. I don't know if this topic has been studied rigorously, but I did a quick check that seems to bear it out. I counted the words in the English and French versions of the decisions of the Supreme Court of Canada. The French versions contained 19,687,757 words, the English versions 18,682,563, for a ratio of 1.054. By this measure French is about 5% wordier than English.
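For anyone who wants to check the arithmetic, here is the calculation as a small Python sketch. The word counts are the ones given above; the variable names are mine.

```python
# Word counts for the decisions of the Supreme Court of Canada, as reported above.
french_words = 19_687_757
english_words = 18_682_563

# Ratio of French to English word counts for the same content.
ratio = french_words / english_words

# Percentage by which the French versions are longer.
percent_longer = (ratio - 1) * 100

print(f"French/English ratio: {ratio:.3f}")
print(f"French versions are about {percent_longer:.1f}% longer")
```

This is of course only a crude measure: it depends on how the two texts were tokenized into "words", which is exactly the cross-linguistic problem mentioned earlier.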

I suspect, however, in light of her remarks on "creative exuberance", that Bramham means something different, namely that the English lexicon contains more words than that of any other language. That may be true, if one can get past the very sticky problems of defining and counting words, but in my experience wordy when applied to a language cannot mean "having a large lexicon".

More problematic still is the claim that English is the most idiosyncratic language. To begin with, what does this mean? It must mean that English deviates more from the linguistic norm than any other language. What norm, and how do we quantify deviation? Is idiosyncrasy some sort of statistical measure of deviation from central tendency or is it to be based on notions like markedness of parameter settings and/or presence or absence of peripheral constructs? And insofar as we have a measure of idiosyncrasy, how does anyone know which language is most idiosyncratic, much less someone whose acquaintance with languages appears to be restricted to English, French, and German? How are we to know that one of the other 6,000+ languages isn't more idiosyncratic? Indeed, if French and German are taken to represent the norm, virtually all of the native languages of British Columbia are by any reasonable standard more different than English.

My purpose here is not to pick on Daphne Bramham. She's an experienced, award-winning journalist (profile) whose column I generally like. She's knowledgeable and on the side of truth, justice, and the Canadian way. She may even read Language Log: she quotes from a post by Mark Liberman later in the same column. My point is that for some reason otherwise sensible people seem to feel free to toss off dubious statements about language without much thought, investigation, or careful phrasing. This column of Daphne Bramham's is a minor offender: the dubious statements don't affect the main point of the column and Bramham does not present herself as having any expertise on linguistic matters.

More disturbing are journalists who make serious mistakes on major points of their articles, especially those who write with some frequency about language and consider themselves to have expertise. Geoff Pullum has pointed out the deficiencies of a piece by BBC science reporter Alex Kirby and a report by CBS reporter Bob Simon. Mark Liberman has dealt with The Atlantic's Cullen Murphy and The Boston Globe's John Powers. An example that I've discussed is New York Times science reporter Nicholas Wade's lack of understanding of the rudiments of historical linguistics. As I've suggested here before, the poor performance of journalists writing about linguistics probably reflects the generally low level of knowledge about language. It may not be reasonable to expect journalists to learn much about linguistics, but they could do a lot better just by taking the subject more seriously and doing a little more research and thinking before they write about it.

Posted by Bill Poser at 11:49 PM

Expressing concern

I was heartened by this NYT article today, reporting that interim Cornell president Hunter R. Rawlings III finds the movement to have intelligent design taught in science classrooms "very dangerous". But I was also disheartened to read this at the end (bold emphasis added):

John G. West, a senior fellow at the Discovery Institute in Seattle, which is a leader in the intelligent design movement, said he was concerned that Cornell's president was "fanning the flames of intolerance."

"A college president is in a unique position to create an atmosphere of free speech," Mr. West said. "If he's implying that faculty don't have the right to discuss ideas, I'm very concerned."

The problem I have is that this bolded statement really doesn't entail anything significant at all, even though it appears to. All we can derive from this statement is that Mr. West believes that President Rawlings is "implying that faculty don't have the right to discuss ideas", and that Mr. West is "very concerned" should this belief turn out to be fact. (He might be "very concerned" either way, but that's a somewhat separate issue.)

This is not what Mr. West technically said; in fact, he could easily deny that this is what he meant by this statement. In the end, though, a critical connection is made between the comments made by President Rawlings and the fragility of academic freedom, which is (I'm sure) exactly what the ultimate purpose of Mr. West's statement was.

Speaking of (the fragility of) academic freedom ...

A couple of commenters on one of my posts from earlier this month took me to task for misrepresenting Intelligent Design (ID). I had mistakenly equated belief in ID with belief in a "young Earth", when in fact the NYT article that I was commenting on in that post says specifically that "[e]ven the intelligent design movement, which argues that evolution alone cannot explain life's complexity, does not challenge the long history of the earth." So yes, my bad.

However, after reading yet another relevant NYT article from a few days ago, I've decided to just go ahead and forgive myself for this misrepresentation. I'm convinced that the ID "movement" is being (over)run by creationists who are trying their best to get around, in whatever way they can, the 1987 Supreme Court decision on Edwards v. Aguillard.

The article is about Prof. Michael J. Behe, the "biochemist at Lehigh University [who] is the first expert witness for the school board of Dover, Pa." It's short and worth reading in its entirety, but here's what I found particularly noteworthy:

In two days on the stand, Professor Behe has insisted that intelligent design is not the same as creationism, which supports the biblical view that God created the earth and its creatures fully formed. [...] The cross-examination of Professor Behe on Tuesday made it clear that intelligent-design proponents do not necessarily share the same definition of their own theory. [...] [A]n excerpt from the ["Of Pandas and People"] textbook [says]: "Intelligent design means that various forms of life began abruptly through an intelligent agency with their distinctive features already intact, fish with fins and scales, birds with feathers, beaks and wings, etc." [...] [C]ouldn't the words "intelligent design" be replaced by "creationism" and still make sense? Professor Behe responded that that excerpt from the textbook was "somewhat problematic," and that it was not consistent with his definition of intelligent design.

So whose definition of ID are we to take to be the definition of the "movement", and more importantly, which definition is going to be taught in the science curriculum (or in schools more generally), should it come to that? This seems to be the heart of the matter, and yet we're getting conflicting points of view from what's supposed to be a unitary (and non-creationist?) group.

The article goes on to report that Prof. Behe was asked why he didn't object to this excerpt when he reviewed "Of Pandas and People" for publication:

Professor Behe said that although he had reviewed the textbook, he had reviewed only the section he himself had written, on blood clotting. Pressed further, he agreed that it was "not typical" for critical reviewers of scientific textbooks to review their own work.

No kidding. But look who's impressed with Prof. Behe's testimony:

Listening from the front row of the courtroom, a school board members [sic] said he found Professor Behe's testimony reaffirming. "Doesn't it sound like he knows what he's talking about?" said the Rev. Ed Rowand, a board member and church pastor.

Mr. Rowand said the "core of the issue" is, "Do we have the academic freedom to tell our children there are other points of view besides Darwin's?"

I think I understand now why evolution is so often reduced to "Darwin's point of view", "just a theory", etc. A point of view is exactly all you have if you only review your own work. If your work is subjected to more penetrating external inquiry, however, you (might) have something more. Perhaps folks like Mr. Rowand think that all scientific work is "reviewed" in the same way that "Of Pandas and People" was? Is this some folks' definition of academic freedom?

Posted by Eric Bakovic at 03:35 PM

How Do They Come Up With These Things?

Much of the spam that I get seems to be randomly generated and makes no sense at all, but from time to time I get something interesting. Here is the latest:

adposition a spicy exotic fruit of youthful beauty is now at your fingertips

I eagerly await the author's observations on other parts of speech, perhaps:

adjective a pox-ridden old hag appears all too frequently

Very likely the source of such quasi-well-formed bits of spam is a random text generator. We notice bits like this more readily than word salad, and messages with such text are more likely to make it through content-based spam filters.

Posted by Bill Poser at 02:35 PM

Whomever controls language controls politics

From Michaele Shapiro's review of George Lakoff's Don't Think of an Elephant, sent in by John Lawler:

Lakroff [sic] believes in the tie between language and politics: whomever [sic] controls language controls politics. He contends that the specific words people use to communicate, and the framing they use, is [sic] crucial to the future of the nation: the language used in American politics is a precision tool which shapes our political future.

The language used in on-line book reviews, on the other hand...

John makes the connection to Geoff Pullum's 9/10/2004 post "The coming death of whom: photo evidence".

I can't resist quoting, yet again, from James Thurber's Ladies' and Gentlemen's Guide to Modern English Usage:

The number of people who use "whom" and "who" wrongly is appalling. The problem is a difficult one and it is complicated by the importance of tone, or taste. Take the common expression, "Whom are you, anyways?" That is of course, strictly speaking, correct - and yet how formal, how stilted! The usage to be preferred in ordinary speech and writing is "Who are you, anyways?" "Whom" should be used in the nominative case only when a note of dignity or austerity is desired. For example, if a writer is dealing with a meeting of, say, the British Cabinet, it would be better to have the Premier greet a new arrival, such as an under-secretary, with a "Whom are you, anyways?" rather than a "Who are you, anyways?" - always granted that the Premier is sincerely unaware of the man's identity. To address a person one knows by a "Whom are you?" is a mark either of incredible lapse of memory or inexcusable arrogance. "How are you?" is a much kindlier salutation.

The Buried Whom, as it is called, forms a special problem. That is where the word occurs deep in a sentence. For a ready example, take the common expression: "He did not know whether he knew her or not because he had not heard whom the other had said she was until too late to see her." The simplest way out of this is to abandon the "whom" altogether and substitute "where" (a reading of the sentence that way will show how much better it is). Unfortunately, it is only in rare cases that "where" can be used in place of "whom." Nothing could be more flagrantly bad, for instance, than to say "Where are you?" in demanding a person's identity. The only conceivable answer is "Here I am," which would give no hint at all as to whom the person was. Thus the conversation, or piece of writing, would, from being built upon a false foundation, fall of its own weight.

A common rule for determining whether "who" or "whom" is right is to substitute "she" for "who," and "her" for "whom," and see which sounds the better. Take the sentence, "He met a woman who they said was an actress." Now if "who" is correct then "she" can be used in its place. Let us try it. "He met a woman she they said was an actress." That instantly rings false. It can't be right. Hence the proper usage is "whom."

In certain cases grammatical correctness must often be subordinated to a consideration of taste. For instance, suppose that the same person had met a man whom they said was a street cleaner. The word "whom" is too austere to use in connection with a lowly worker, like a street-cleaner, and its use in this form is known as False Administration or Pathetic Fallacy.

You might say: "There is, then, no hard and fast rule?" ("was then" would be better, since "then" refers to what is past). You might better say (or have said): "There was then (or is now) no hard and fast rule?" Only this, that it is better to use "whom" when in doubt, and even better to re-word the statement, and leave out all the relative pronouns, except ad, ante, con, in , inter, ob, post, prae, pro, sub, and super.

Posted by Mark Liberman at 10:05 AM

October 21, 2005

Miers dementia unlikely

A reader suggested that I check out the flap at the Volokh Conspiracy over Harriet Miers' commas. Jim Lindgren, the responsible conspirator, flagged the corpus delicti with boldface in this quote from p. 50 of Miers' response to the Senate questionnaire:

My experience on the City Council helps me understand the interplay between serving on a policy making board and serving as a judge. An example, of this distinction can be seen in a vote of the council to ban flag burning. The Council was free to state its policy position, we were against flag burning. The Supreme Court’s role was to determine whether our Constitution allows such a ban. The City Council was anxious to encourage minority and women-owned businesses, but our processes had to conform to equal protection requirements, as well.

But I'd be a hypocrite to join in abusing Miers for misdemeanors of proofreading. Geoff Pullum has put so much time into fixing the errors in my Language Log posts that UCSC gave him a sabbatical as compensation^*, and just yesterday, Chris Waigl caught me misspelling my own senator's name as "Spector".

On the other hand, Americans look for a higher level of qualification in SCOTUS nominees than "it's not so bad, really" and "everybody makes mistakes", and in the case of Harriet Miers, they're not getting much help. [The most coherent case for Miers comes from Michael Bérubé, of all people!] So I'm going to buck the storm surge that's been sloshing from right to left across the political spectrum, and say something positive about her. Not only that, but my encomium will make testable predictions based on well-established science. Ready for it? OK, here goes: her prose style indicates that Harriet Miers is unlikely to get Alzheimer's. Surely this is an important factor in the case of a lifetime judicial appointment.

I base this prediction on the results of the famous "Nun Study". Briefly,

Two measures of linguistic ability in early life, idea density and grammatical complexity, were derived from autobiographies written at a mean age of 22 years. Approximately 58 years later, the women who wrote these autobiographies participated in an assessment of cognitive function, and those who subsequently died were evaluated neuropathologically. [...] Low idea density and low grammatical complexity in autobiographies written in early life were associated with low cognitive test scores in late life. Low idea density in early life had stronger and more consistent associations with poor cognitive function than did low grammatical complexity. Among the 14 sisters who died, neuropathologically confirmed Alzheimer's disease was present in all of those with low idea density in early life and in none of those with high idea density. CONCLUSIONS--Low linguistic ability in early life was a strong predictor of poor cognitive function and Alzheimer's disease in late life.
[D. A. Snowdon et al., Linguistic ability in early life and cognitive function and Alzheimer's disease in late life. JAMA 275 (7), 1996.]

Unfortunately, we don't have a sample of Miers' writing at the age of 22. But we can extrapolate from measurements taken now, because according to Kemper et al., "Language decline across the life span: findings from the Nun Study", Psychol Aging, 16(2):227-39 (2001),

Idea density averaged 5.35 propositions per 10 words initially for participants who did not meet criteria for dementia and declined an average of .03 units per year, whereas idea density averaged 4.34 propositions per 10 words initially for participants who met criteria for dementia and declined .02 units per year.

I measured "idea density" in a sample of sentences from p. 50 of Harriet Miers' response to the Senate Questionnaire, and the result was 4.77 "ideas" per 10 words. My (possibly faulty) understanding of the estimation method came from reading the cited sources, which are Kintsch, W. (1972) "Notes on the structure of semantic memory", in E. Tulving and W. Donaldson (eds) Organization of Memory, pp. 247–308. New York: Academic Press; and Kintsch, W. and Keenan, J. (1973) "Reading rate and retention as a function of the number of propositions in the base structure of sentences", Cognitive Psychology 5, 257–74. I also looked carefully at the examples given in papers by Snowdon and others.

The participants in the "Nun Study" were originally tested at a mean age of 22; Harriet Miers is 60; so if she were in the non-demented group, her score at age 22 should have been 4.77 + .03*38 = 5.91. Even adjusted by the .02 units per year found in the demented group, her age-22 score should have been 4.77 + .02*38 = 5.53. In either case, this is well above the (unadjusted) mean of the non-demented group.
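For anyone who wants to check the arithmetic, here is a minimal sketch of the extrapolation in Python. The function name and its parameters are my own invention for illustration; the decline rates (.03 and .02 units per year) are the ones quoted from Kemper et al. above, and the straight-line back-projection over 38 years is the same simplifying assumption used in the paragraph.

```python
def extrapolate_to_age_22(score_now, decline_per_year, age_now=60, age_then=22):
    """Add back the decline accumulated between age_then and age_now (linear assumption)."""
    years = age_now - age_then
    return score_now + decline_per_year * years


miers_now = 4.77  # my (possibly faulty) estimate from the questionnaire sample

# Rate observed in the group that did not meet criteria for dementia
non_demented_rate = extrapolate_to_age_22(miers_now, 0.03)
# Rate observed in the group that did meet criteria for dementia
demented_rate = extrapolate_to_age_22(miers_now, 0.02)

print(round(non_demented_rate, 2), round(demented_rate, 2))
```

Both results can then be compared against the initial mean of 5.35 reported for the non-demented group.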

But wait, there's more. Miers' "idea density" score also predicts that she'll be good at solving "unstructured problems", according to R. Davidson et al., "Using linguistic performance to measure problem-solving", Accounting Education, 9(1) 53-66 (2000). They define "a structured problem as one for which solution procedures are known, objectives are clearly defined, and there is usually an identifiable single correct answer", while "an unstructured problem ... is likely to require heuristic solution methods as well as intuition and experience, since there may be multiple objectives that are not clear-cut and the ‘goodness’ of the results is difficult to evaluate". The results?

We found that idea density does appear to be related to the ability to solve unstructured problems, but not to the ability to solve structured problems, even after controlling for the effects of other variables. We found that grammatical complexity was not significantly related to either problem-solving ability.

Some of this study's results were surprising, at least to me:

... we found a significant positive correlation ... between idea density and unstructured problem-solving, but no significant correlation with structured problem-solving ... We found no significant correlation between grammatical complexity and structured or unstructured problem-solving ... In addition, there was a significant negative correlation ... between measures of grammatical complexity and idea density ... The two measures of problem-solving performance were significantly correlated...

The mean "idea density" of the (prose samples written by the) 90 college-age participants in the Anderson study was 4.17, with a s.d. of 0.39, so Miers' age-60 score of 4.77 was more than 1.5 standard deviations above the mean. Her extrapolated age-22 score of 5.91 would be almost 4.5 standard deviations above the mean.
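The standard-deviation comparison is just a z-score. A small sketch, with a helper name of my own choosing and the mean and s.d. taken from the figures above:

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in units of standard deviation."""
    return (score - mean) / sd


SAMPLE_MEAN = 4.17  # mean idea density of the 90 college-age participants
SAMPLE_SD = 0.39    # their standard deviation

z_age_60 = z_score(4.77, SAMPLE_MEAN, SAMPLE_SD)  # measured score
z_age_22 = z_score(5.91, SAMPLE_MEAN, SAMPLE_SD)  # extrapolated score

print(round(z_age_60, 2), round(z_age_22, 2))
```

This treats my single small sample as if it were comparable to the study's samples, which is exactly the assumption that deserves suspicion.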

I need to say at this point that you should not trust me on this at all. My sample of Miers' writing was small -- less than 200 words. I have a lot of questions about how to apply the "idea density" method, and I have no confidence that my practice was the same as that of Snowdon et al. or Anderson et al. When I was unsure, I gave Miers credit for an extra "idea", a term that I'm putting in scare quotes because I feel that it's inexcusably tendentious: as far as I can tell, the factor most strongly influencing "idea density" is the frequency of modification and phrasal conjunction. The correlations in the Anderson study were not very high -- r = 0.3 for the relationship between "idea density" and "unstructured problem solving". And it's clear that different styles and genres of text from the same person will have systematically different "idea density" measures.

But still, if you're looking for the bright side of Miers' nomination, I'm here to help.

^*:-).

Posted by Mark Liberman at 04:47 PM

Oh, all right, micturition

A lot of mail is being received at Language Log Plaza concerning the issue of whether the word micturation, as found here and here and here, is some sort of error for the standard medical term micturition. On the one hand, there are far more Google hits for the latter than the former. And although Douglas Adams uses micturation, his use is (i) found in a piece of appalling poetry putatively written by a member of a highly unsavory alien race from another galaxy, which sort of lessens its value as evidence, and (ii) occurs in the poem as a count noun, in the plural form micturations, apparently meaning "urination events" (or possibly "products of urination events"). But on the other hand, the people at Language Log Plaza are not very inclined to get all pissy about technical words for pissing. Especially not when there is an almost totally productive process of deriving nouns ending in -ation from verbs ending in -ate.

On the third hand, though, Moray Allan has come up with a nice case of a (rare) word ending in -ition related to a (very rare) verb ending in -ate that is hardly ever paralleled by a noun ending in -ation: ebullition has 265,000 Ghits, and although ebullate exists (with 1,460 Ghits), ebullation only gets 90 hits, which means it is pretty close to being nonexistent. In cases of this kind, it seems that the -ition word came (from Latin) into English first, and the verb was formed later. Verb/noun pairs like ignite/ignition seem to be extremely rare, so people tend to give the verb the -ate suffix. Let me summarize this inconclusive post with this recommendation: by all means use micturition as the abstract noun for micturating if you want. If you are in medical school, then always use it ("Excuse me a moment; I need to stop by the restroom for purposes of micturition before we go ingest nutritive material"). But don't regard the parallel existence of micturation as indicating some kind of an error. These things happen. Lexical word formation is not one of the domains (if there are any) in which language is neat and orderly and logical.

P.S.: David Pesetsky writes from MIT with this observation:

In J.K.Rowling's Harry Potter novels, a wizard who teleports to another location is described as apparating, but the process is consistently called apparition -- with an "i". See, for example http://en.wikipedia.org/wiki/Apparate#Apparition. That always seemed odd to me as I read the books, but I just assumed that wizards know the right spells to transform theme vowels.

So that would be another case of an -ition word related to an -ate word where there is no -ation word, one would have thought. Except that apparation gets 68,000 ghits. Go figure. I have never heard of such a word, and I am not going to slog through 68,000 pages to see whether all of the occurrences are misspellings or commercial coinages or whether it's merely most of them; I'm delegating that to one of the research assistants at Language Log Plaza. Oh, by the way, there is an opening for a research assistant at Language Log Plaza at the moment. Ph.D. in linguistics or a related field preferred. No riff-raff. Send your resumé with your $150 non-refundable application fee to: Human Resources Department, Language Log Plaza, 3650 Spruce St., Philadelphia, PA 19104-6024.

Posted by Geoffrey K. Pullum at 02:28 PM

Micturation in Vogon Poetry

Semantic Compositions expresses surprise that in my discussion of Michael Tortorello's use of the verb to micturate I:

failed to mention what is without a doubt the single most common reference to a derived form of "micturate"

namely its occurrence in Vogon poetry [WARNING: this link leads to examples of Vogon poetry. Following it may be hazardous to your health.] in Douglas Adams' The Hitchhiker's Guide to the Galaxy. Since Vogon poetry is full of words that are either very rare or non-existent I suppose that Adams' use of micturation adds marginally to my point that it is a rare, high-falutin word, but it isn't worth an awful lot as evidence. Its rarity on the web is much better evidence. Since being forced to listen to Vogon poetry is an established form of torture (in The Hitchhiker's Guide to the Galaxy), I thought that the addition of this marginal bit of evidence was not worth the trauma it would inflict on readers with any literary taste.

Posted by Bill Poser at 01:00 AM

October 20, 2005

How Many Holocausts?

I've recently noticed a few comments on Language Log posts that I hadn't noticed or paid attention to before. One of them is a comment last year by Semantic Compositions on my post on the genocide in Darfur, on which in turn there is a long and interesting comment by Steve of Language Hat and a response to that by Semantic Compositions. Between them they raise a couple of issues. The first, raised by Steve, is the appropriateness of the topic for a language blog. Well, it's true that my post was mostly about the fact that genocide keeps happening with people doing very little about it, which isn't a core linguistic topic, but I thought it was nonetheless appropriate, for two reasons.

The first is that one approach to evading the problem in Darfur has been to quibble about whether what has happened there is, strictly speaking, genocide. That's a use of language to obscure reality and prevent right action, and I think it's just as appropriate to discuss such issues as more technical linguistic matters. To my mind, my post on game-playing over the meaning of the term anti-Semitism is of the same character.

The other reason is that some blogs are very narrowly focused and others are not. Language Log is of the latter sort, so I don't feel too constrained about bringing in topics that are somewhat marginal. You won't find me posting here about my views on, say, the lack of universal health care in the United States (barbaric), punk rock (horrible), Le Comte de Monte Cristo (lovely bedtime reading), manga (booooring), or the softwood lumber dispute (Canada is right), because these really don't have anything to do with language, but topics that have some connection, even if a bit marginal, are fair game.

The other point, the one that Semantic Compositions raised, is whether it is appropriate to refer to holocausts in the plural, or whether we should take the position that there has been only one Holocaust, namely the Nazi attempt to exterminate us Jews. SC's view is that the Holocaust was unique and that it diminishes its uniqueness to use the same term for it as for other instances of mass murder. SC and Language Hat between them give a good summary of the issues. You can typologize mass murder according to the number of people killed, the percentage of the target population killed, whether the intention of the killers was extermination per se or merely being rid of the target population (which might be satisfied by driving them out rather than killing them) and various other factors. These aren't entirely quibbles since some of these distinctions imply different degrees of culpability and since making these distinctions can give insight into exactly what happened and why.

For example, when you look carefully at what the Spanish and other European colonists in the Caribbean were up to, although it is true that their coming had the effect of virtually exterminating the indigenous population, this was not their intention. The Indians died largely because they had no resistance to European diseases and secondarily because they were overworked as forced laborers. The colonists had no particular bias against the Indians - their motivation was not like that of the Nazis - they just wanted to enrich themselves and didn't much care at whose expense they did it. They would actually have been delighted for the Indians to stay alive - it would have spared them the considerable expense of importing slaves from Africa. If you want to understand what happened, you need to differentiate between genocide and slaving. From a moral point of view I'm not sure that it matters very much - they're both beyond the pale. That's why I tend to side with those who think that for the most part fine typologizing of mass murder differentiates phenomena that are so similar that they should be regarded as falling into the same category.

One reason I'm not interested in too fine a parsing of what should be called a holocaust is that I have an ulterior motive: to save lives. For a variety of reasons the Holocaust of the Jews has name-recognition. Vast numbers of people know about that holocaust and consider it a symbol of a great evil that should have been prevented. Most other mass murders are not nearly as salient and they don't have names. Adolf Hitler famously asked: "Who remembers the Armenians?". His example was good up to a point, in that the memory of the Armenian genocide did little to prevent others, but in a way it was necessarily an imperfect example: he had to choose an example that his audience would recognize. If he had asked: "Who remembers the Dzungarians?", few people would have known what he was talking about.

Because the Jewish Holocaust has name-recognition, assimilating other mass murders to it by using the same term for them serves to make them more salient, more familiar, and more horrible, which, I hope, stimulates action against them. If extending the term holocaust to what is happening in Darfur brings people to equate the Janjaweed with the Nazis and the people of Darfur with the Jews and helps to overcome the attitude that what is happening is too remote and is happening to people who are too different from us for us to feel more than nominal sympathy for them, that's a good thing. Speaking from a Jewish point of view, faced with the decision whether to emphasize the uniqueness of our holocaust or to emphasize its universality, if the latter might save even a single life, there is no question as to what choice to make. This is the way in which the commandments are fulfilled.

Posted by Bill Poser at 04:22 PM

Never anything but less than precise

Something in the DC air has slips of tongue and pen flying thick and fast. The Volokh Conspiracy is in a tizzy over Harriet Miers' "trouble with commas" -- though being a stickler for spelling, grammar and punctuation was supposed to be one of her strengths. And this morning's Morning Edition quotes Senator Patrick Leahy overnegating his colleague Arlen Specter:

I've never known him to be anything but less than precise [862 msec.] on discussing cases.

The context is Specter's public difference of opinion with Miers about what was said in their private interview:

David Welna:	Specter is also at odds with Miers over their differing versions of a private conversation they had Monday, on abortion-related rulings. Specter said Miers recognized these rulings as valid precedents. She contradicted that the same day, an experience that Specter said he'd never had with other court nominees.
Arlen Specter:	And I've never walked out of a room and had a disagreement as to what was said. And as I have said publicly, I accept her version.
David Welna:	But Leahy vouched for Specter's version.
Patrick Leahy:	I've never known him to be anything but less than precise [0.862] on discussing cases. [0.693] And I've never known him [0.560] to make a mistake [0.663] on what he heard.

Presumably what Leahy meant was either "I've never known him to be less than precise" or "I've never known him to be anything but precise". This could almost make you believe in Freud's analysis of what he called Fehlleistungen, and what everyone else calls Freudian slips.

Since I gave Harriet Miers credit for signaling her Oval Office agreement error with an extra-long pause, I've noted in the transcript that Leahy pauses for a fairly long time after precise. However, Leahy's post-error pause does not seem to be as outsized relative to his other pauses as Miers' was, and it might just as plausibly be analyzed as the "post sound bite pause" of a well-trained political speaker.

[Leahy's sound bite is here, if you don't want to listen to the whole NPR story. And the 10/19/2005 Leahy/Specter news conference is available from CSPAN here; Amy Ridenour has a transcript of (most of?) it here, but Leahy's comment is fixed up:

"I've dealt with Senator Specter on a lot of legal issues over the years, certainly on Supreme Court cases and nominees, I've never known him to be anything less than precise on discussing cases and I've never known him to make a mistake on what he heard."

]

Posted by Mark Liberman at 11:52 AM

What is so rare as a day in qiqsuqqaqtuq

Today's NYT article about changing Arctic weather conditions ("The Big Melt: Old Ways of Life Are Fading as the Arctic Thaws") has the obligatory reference to Eskimo snow vocabulary:

Across the Arctic, indigenous tribes with traditions shaped by centuries of living in extremes of cold and ice are noticing changes in weather and wildlife. They are trying to adapt, but it can be confounding.

Take the Inuit word for June, qiqsuqqaqtuq. It refers to snow conditions, a strong crust at night. Only those traits now appear in May. Shari Gearheard, a climate researcher from Harvard, recalled the appeal of an Inuit hunter, James Qillaq, for a new word at a recent meeting in Canada.

One sentence stayed in her mind: "June isn't really June any more."

In other news, lawyers for Woden asked for an injunction against unsanctioned use of their client's name in referring to a phase of the traditional Mesopotamian astrological cycle. "No license was ever obtained", said spokesman Louis Dewey. "Some bible translators just started using Woden's Day in rendering Greek originals that didn't mention our client at all." But according to a highly-placed lexicographic source, "these days no one even knows who Woden is unless they look up the etymology of Wednesday".

Posted by Mark Liberman at 04:57 AM

Censure/Censor Again

Reader Robert Lane Green has pointed to another instance of the erroneous use of censure in place of censor, this one in this New York Times article about the trial of Saddam Hussein.

The television feed from the courtroom, housed in the former Baath Party headquarters, was set up with a 20-minute delay so officials can censure any sensitive images or testimony.

The article is by John F. Burns and Edward Wong, so it doesn't seem likely that the error can be attributed to "bleed through" during translation. There probably wasn't any translation involved, and if there was, it would have been from Arabic, not a European language in which a verb that looks like censure has the meaning of English censor. I'm afraid that it looks like this is an error by the Times reporters or an editor. O tempora, o mores!

Posted by Bill Poser at 01:14 AM

October 19, 2005

Innocent Bongo

Since I am residing in Cambridge (Massachusetts) at the moment on sabbatical leave, I was the obvious choice to represent Language Log at the Ig Nobel Prizes awards ceremony earlier this month in Harvard's Sanders Theater. I did go. It is important that the organizers should know that their support comes not only from the ragtag band of tired old Nobel Prize-winning scientists who obligingly join the platform party each year, but also from really serious and important organizations like Language Log. We support them because if it were not for the Ig Nobel committee the business of bringing science into disrepute by publishing amusing nonsense would be left almost entirely to BBC science reporters.

For me, the greatest pleasure in the ceremony was to find that the Ig Nobel Literature Prize had been awarded to such a worthy group of recipients: "the Internet entrepreneurs of Nigeria." (They were announced as "unable, or unwilling, to be with us here tonight.") They have indeed done astonishing work. It is not too much to say that they have tarnished the whole field of creative writing (not to mention the name of Nigeria; spam filters will ditch a message for just mentioning the N-word, so the entrepreneurs are now cycling through the names of various West African countries). Do you want to know the name of the latest literary genius who has written to me to try and get help in extracting a windfall from an account in Cote d'Ivoire? His name is Innocent Bongo. I am not kidding. I know I have often been guilty of posting hoax material and random humorous nonsense on Language Log, but I swear I am not making this up. A confidence trickster named Innocent Bongo wants me to help shift 18 million bucks of his father's lifetime savings into the USA via my Harvard credit union account. "I am soliciting for your assistance," he says, "to help me lift this money out from Abidjan to your safe account abroad so that we should invest it in any meaningful lucrative business in your country because this is my only hope in life." Don't underestimate your life's prospects, Mr Bongo. Whether you know it or not, you are the co-recipient of an Ig Nobel prize.

Posted by Geoffrey K. Pullum at 09:26 PM

Omit needless noises

When I said that I was looking forward to Strunk and White, The Movie, I was kidding. But in today's NYT, Jeremy Eichler's review of the new illustrated edition reports on Strunk and White, The Song Cycle. And I'm afraid that this one is serious:

She [Maira Kalman] explained that while she was painting her illustrations, she found herself singing the words and dreaming of a Strunk and White opera, or even a ballet. She turned to Mr. Muhly, whom she had known for more than a decade as a family friend and co-conspirator in various neo-Dadaist adventures. (Ms. Kalman once ran a Rubber Band Society - for people who love rubber bands, naturally - and invited Mr. Muhly to compose a work scored for rubber bands, which he did.) “I knew that Nico and I would have an immediate conversation in shorthand about humor and imagination, and that he’d completely get it,” Ms. Kalman said.

Mr. Muhly, 24, is a talented and audacious graduate of the Juilliard School who has worked with Philip Glass and Bjork. His Strunk and White songs are eloquently scored for soprano, tenor, viola, banjo and percussion. They also include parts for Ms. Kalman’s friends and family, who will make “little gentle noises” through amplified kitchen utensils (vintage eggbeaters and meat grinders) and a set of dice shaken in a bowl.

Apparently Ms. Kalman is the author of the New Yorkistan New Yorker cover, which gives her a lot of whuffie, at least in my personal accounting. I haven't heard the Strunk and White Song Cycle, but I suspect that her balance may plummet when I do.

Luckily the (apocryphal) ballet version is safely stashed in the vast virtual file of defunct witticisms:

White's granddaughter Martha White ... quoted from a letter White wrote in 1981: "You might be amused to know that Strunk and White was adapted for a ballet production recently. I didn't get to the show, but I'm sure Will Strunk, had he been alive, would have lost no time in reaching the scene, to watch dancers move gracefully to his rules of grammar."

Heaven forfend.

[via Chris Waigl at Serendipity, who points out that "Dadaism was a reaction to utter despair"]

[Update 10/30/2005: here's a link to a page describing a performance of the song cycle at the NYPL.]

Posted by Mark Liberman at 08:18 PM

Jeopardy Doing Okay

Since I've pointed out several linguistic goofs by the Jeopardy folks, I have to let you know that this evening not only were all five of their answers correct (they had an entire Languages and Dialects category), but I did not know the answer to one question, namely that Latgalian is the dialect of Latvian spoken in Latgale, which according to the web site referenced is:

an ancient and singular state in the eastern part of Latvia

and one of the four administrative areas into which Latvia is currently divided. The others are: Vidzeme, Kurzeme and Zemgale. (Indulge me: I evidently need a remedial course in Latvian geography.)

Posted by Bill Poser at 07:33 PM

My big fat Greek snowclone

Following on my fretting about the Eye Guy figure -- snowclone, playful allusion, or what? -- I've been offered a couple more variants of Queer Eye for the Straight Guy, plus a note from Jay Cummings pointing out a pun in the title: the straight guy is in fact the "straight man" for the comedy in the show. And then I returned to a message from Ken Callicott (back on 12 October), who started by wondering why some expressions lend themselves to so much variation while others seem not to, and then discovered that the stuff in the second group got messed with too. And noted that the Wikipedia list of catch phrases was a goldmine of snowclones, which made him wonder about "a (veritable) goldmine of X". Oi.

From Marc Ettlinger and Wendell Kimper, independently, a spin-off show Straight Plan for the Gay Man (like QESG, but with a reversed premise). I think I was briefly aware of this short-lived show, but averted my eyes from its obvious awfulness and then wiped it from my memory. And from "acw", a wonderful Eye Guy riff, "black eye for the white guy", suggesting that "some cool black dudes are going to give hopeless white geeks some fashion advice" -- but, ouch, wait, it sounds like the white guy's going to get a black eye!

On to Callicott, who distinguished formulas like

I'm not X, but I play one on TV.
I'm shocked, shocked to X.
Holy X, Batman!

(which he thought were easily generalized from their originals) from expressions like

The devil made me do it!
Show me the money!
Play it again, Sam!

(which he thought resisted substitution of novel words). Call these group A and group B.

But then he (and I) started googling. The things in group B turn out to have lots and lots of variants:

Made Me: X made me do it, where X = my antidepressant, my genes, the VC, Satan, God, the dark side of The Force, video games,...

Show Me: Show me the X, where X = monkey(s), science, numbers, gold, poetry, value, jobs,...

Play It Again: Play it again, X, where X = Mick, Bud, Pete, Pac-Man, RIAA, Gus, (MC) Shan, Maurice, Sledge,...

It Again Sam: X it again, Sam, where X = wear, repair, replay, pay, read, say, use, sell, pitch, tax, write, parse, knit,...
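To make the "fixed stuff plus an open slot" idea concrete, here is a rough sketch of how one might pull the X out of candidate phrases for the Show Me formula. The pattern and the function name are hypothetical choices of mine, not anything Callicott or I actually used in our googling, and a real search would need to cope with far messier text.

```python
import re

# Hypothetical template: the fixed words "show me the" followed by one open slot X,
# with optional trailing punctuation.
SHOW_ME = re.compile(r"^show me the (?P<x>.+?)[!.?]*$", re.IGNORECASE)


def slot_filler(phrase):
    """Return the word(s) filling the X slot, or None if the phrase doesn't fit the template."""
    match = SHOW_ME.match(phrase.strip())
    return match.group("x") if match else None


for phrase in ["Show me the money!", "Show me the monkeys!", "Play it again, Sam!"]:
    print(phrase, "->", slot_filler(phrase))
```

The first two phrases fit the template and yield their slot fillers; the third belongs to a different formula and yields None.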

Callicott also googled up these marvels:

Harvard, we have a problem.
Frankly, my dear, I don't give a spam.
"Frasier" has left the building.
Gag me with a Grammy.
It's people! The internet is made of people!

Then, on my own, I looked for variants of My Big Fat Greek Wedding. Vast numbers. For "my" you can get pretty much any determiner: any possessive, "a", "the", "no", "this". And "big" can be, um, expanded to "great big". But the real action is in substitutions for "Greek" and "wedding":

big fat X wedding, where X = gay, lesbian, dyke, queer, Jewish, Italian, Irish, New York, California, Florida, Texas, family, church,...

big fat Greek Y, where Y = Buckeyes [Ohio State fraternity/sorority members], supper, omelet, sandwich, stuffed peppers, feast, diet, restaurant, cafe, shop, place, island, retreat, vacation, festival, party, adventure, architecture, quiz, Emmy, employee benefit, Olympic dream, life,...

big fat XY, where XY = independent movie, American summer, lousy screencast, queer life, physics project, class action, geek plate, Italian Thanksgiving,...

(Of course, I excluded examples where "big fat" could easily be taken literally.) It looks like this one has developed into a Big Fat snowclone.

Callicott concluded there was no real difference between groups A and B, but I'm not so sure. I think that what we should conclude is that almost any model can be playfully varied, but that some have turned into relatively fixed prefab figures, with open slots: Play One, and probably Big Fat. And I suspect that Shocked Shocked and Holy Batman are still in group B, with the other playful allusions, though maybe if they practice hard they can make it to the majors.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:55 PM

And they're just as ignorant as it used to do

Barbara Partee notes an odd sentence on the front page of today's NYT, in an article by Eric Lipton entitled "Number Overstated for Storm Evacuees in Hotels":

"FEMA still does not know any more about what it was doing last week than it was a month ago," Representative David R. Obey of Wisconsin, the ranking Democrat on the House Appropriations Committee, said. "It is still, as far as I am concerned, an incompetent agency."

Barbara writes:

I had to backtrack twice on this sentence -- first I thought there was just a temporal peculiarity involving a 'relative' use of what I thought was an 'absolute' deictic, "last week" -- but to get the reading I thought they were after (if "last week" could be used relatively) it would have to say "than they did a month ago", not "than it was a month ago". And after realizing that they hadn't said that, I had to backtrack again and finally concluded there was no possible parse for the sentence and it must be ungrammatical, although I don't think it 'felt' ungrammatical to me, just "huh?". I suppose they meant what I first thought they must have said. This seems similar to modifier-induced number agreement errors, making a matrix verb agree with the 'closest' NP instead of its subject, as in " [the cause of layoffs such as these] are not the taxes" (from Francis 1986, cited in this paper: http://uts.cc.utexas.edu/~tls/2001tls/Pfau.pdf ) Only this time it is a VP-deletion error, incorrectly targeting the nearest preceding VP rather than that of the intended, matrix, clause. Is that a well-known processing error too? I never noticed such a thing before.

I think that Barbara's analysis is exactly right. An example of "agreement with closer" was widely discussed a few days ago, when Harriet Miers said

The wisdom of those who drafted our Constitution and conceived our nation as functioning with three strong and independent branches have proven [1330 msec. pause] truly remarkable.

It's not clear whether she was misled by the string-wise closest plural noun phrase "three strong and independent branches", or by the semantically salient "those who drafted our Constitution", but one of these intervening plurals diverted her attention from the actual subject "the wisdom of those who ... branches". The extra-long pause after "have proven" shows, I argued, that she recognized the problem as soon as the words were spoken.

As Barbara suggests, if we change Representative Obey's "than it was" to "than it did" then his verb-phrase ellipsis makes sense:

FEMA still does not know any more about what it was doing last week than it did a month ago.

Alternatively, if his first clause had been constructed a bit differently, his second clause would have worked out:

FEMA still was not any more competent in its activities last week than it was a month ago.

One caveat: Miers' error was documented in the official recording of her remarks, while so far we only have Lipton's word for what Rep. Obey said. Judging by my previous investigations of NYT quote transcription, there's roughly a 70% chance that any particular word of this quote is accurate. I assume that Obey's quote came from an interview with Lipton, so we'll never know the truth.

And I have a different sort of complaint in this case. Even if this is really what Obey said and not just what Lipton transcribed or copied from his notes, why not give the guy a break and use a different quote, or give him a chance to fix it? Surely the story here is FEMA's administrative confusion, not Obey's speech error.

The NYT's code of ethics says that

If a subject’s grammar or taste is unsuitable, quotation marks should be removed and the awkward passage paraphrased.

Unless there's a subtext to undermine the reputation of the source quoted -- as there clearly is in many quotations of "Bushisms" -- why feature a politician's slip of the tongue in the third paragraph of a front-page story that's about another topic entirely?

Posted by Mark Liberman at 04:26 AM

October 18, 2005

Critical tone for a new snowclone

If you're reasonably up on your American pop culture, you'll have caught the echo of the Bravo television show Queer Eye for the Straight Guy. But if you missed the allusion and didn't take "a critical tone for a new snowclone" to exemplify any sort of formulaic language, you still understood the phrase pretty much as I intended. I can supply plenty of examples that are much closer to the original -- "Queer eye for the dead guy" (#2 below, way below) -- as well as some in an intermediate zone: "Queer Eye for the Scruffy Dog" (#12), "Straight Eye for the Consumer Guy" (#17), "Bubba eye for the Brahmin guy" (#8). The question is then: are these instances of a new Eye Guy snowclone? Or just allusions to an expression? What counts as a snowclone, anyway?

In clear examples of snowclones, like the Play One figure I recently discussed here (based on the model "I'm not a doctor, but I play one on TV"):

the figure contributes some meaning of its own; appreciating the full import of (my own words) "I am not a semanticist, though I play one at Language Log Plaza" depends on getting the acting metaphor, which you can do by active interpretation (crediting me with some cleverness) or by recognizing the half-frozen metaphor in a formula, but either way you treat the expression as figurative, and the figure as meaningful (implicating some less-than-full qualifications for filling some role);

the figure has form as well as content; "Though I'm not actually qualified to talk about semantics authoritatively, I make pronouncements on meaning on Language Log anyway" is not an instance of the Play One snowclone, though it communicates much the same thing as "I am not a semanticist, though I play one at Language Log Plaza";

this form is neither completely fixed (as in frozen idioms like "by and large") nor subject to many variations (as in take-offs on book or movie titles, a topic I'll get to shortly); like many idioms, it has a lot of fixed stuff and some variable slots (the idiom/snowclone borderline I'll take up on another occasion);

you can use the figure without much thought; you get it "off the shelf", and real creativity (even at the level of the pun) is not required;

you can use the figure without any appreciation of its origin; in fact, for many snowclones the original model is hard to determine.

The Eye Guy examples I've already given do have form, but as far as I can tell the figure supplies no content of its own; the figure is, as I've already suggested above, enormously variable; I read all the examples of it I've found as consciously crafted, rather than fleshings-out of prefab skeletons; and it's hard to imagine anyone using the figure without an appreciation of its origin. The Eye Guy figure is certainly formulaic language, but it's not a good snowclone at all. I'd put it in the category of "playful allusions", a category that includes puns and variations on fixed expressions.

Playful allusions are heavy on the ground in certain contexts: they're much used in advertising (a lot of the data that Elizabeth Zwicky and I collected for our 1986 piece on imperfect puns -- "Imperfect puns, markedness, and phonological similarity: With fronds like these, who needs anemones?", Folia Linguistica 20.3-4.493-503 -- came from advertising rather than jokes), and headline writers for feature stories are fond of them (perhaps over-fond, as Geoff Pullum keeps pointing out about stupid headlines on stories about language). The people who name porn flicks are also in love with playful allusions to idioms, titles, and other fixed expressions. Here's a sampling of such allusions from the Adam Gay Video Directory 2005 (Knight Publishing Corp., Los Angeles):

mostly simple puns, alluding to idioms, clichés, common collocations, and occasionally proper names and titles, and introducing sexual content to them: Ace in the Hole, Ace of Spades (which I'm sorry to say features black men), Big Ben, Camp Out, Giant, Grand Slam (starring Max Grand), Hard at Work, Longshot, Moving Men, Perfect Fit, Quarterback Sack, Red Hot Pokers, Service Trade (punning sexually on both words), Standing Erect, Stiff as a Board, Stone Fox (starring Eddie Stone), Stud Farm, Take It Like a Man, Uncut Timber;

imperfect puns, mostly on titles: Bang of Brothers (Band of Brothers), Hot Throb ("heart throb"), Rear Factor (Fear Factor), A Rim with a View (A Room with a View), Semper Bi (semper fi), Zak Attack ("Mac attack", and starring Zak Spears);

variations by substitution, often with punning as well: The Agony of Ecstasy (probably alluding to, and blending, both "the agony of defeat" and The Agony and the Ecstasy), American Porn Star (explicitly modeled on American Idol), Dear Dick (the main character is an advice columnist), Greek Holiday (Roman Holiday), A Man's Tail (A Boy's Tale, plus doubly punning "tail": "tail" 'posterior, ass, butt', and a merman, complete with tail, as a major character).

The Eye Guy examples are a lot like this last set of playful allusions. But before I trot out more Eye Guys, a few words about the title Queer Eye for the Straight Guy.

"Queer eye for the straight guy" is a little poem, one line of highly structured tetrameter. It has the form

Adj1 N1 for the Adj2 N2

exhibiting syntactic parallelism, and within the paired nominals, having Adj1 and Adj2 semantically paired, by opposition (queer vs. straight), and N1 and N2 phonologically paired, by rhyme (/aj/). The expression is dense with linguistic organization.

(Why, you ask, "queer" instead of "gay"? At least three possible motivations. One, "queer" is trendier than "gay"; there are a lot of people who think that "queer" is the successor to "gay" in much the same way that "gay" was the successor to "homosexual". Two, some people think that "queer" is more inclusive and less sexualized than "gay". Three, "queer" and "straight" are similar in phonological structure -- both with initial consonant clusters (/kw/ and /str/) and a closed syllable (in /r/ and /t/) -- while "gay", with its single initial consonant and open syllable, is less similar to "straight". Ok, I grant that "gay" and "straight" have the same vowel, but I think that "queer" wins on points.)

Next, there's a double sense to the definite article in "the straight guy"; it's both generic and individual. First, the Fab Five propose to improve the lives of straight men in general. But, then, on each show they make over one particular man, the straight guy for that show.

Finally, an oddity of "queer eye for the straight guy": it's understood as having an indefinite article, 'a queer eye for the straight guy'. Singular count nouns like "eye" normally can't occur without a preceding determiner (not "I saw eye", but instead "I saw a/one/the/this/its eye"). However, one place where articles, indefinite or definite, can be omitted from NPs with singular count heads is at the very beginning of (but not inside) titles: Band of Brothers, Day of the Dead, Dead Ringer, End Zone, Pretty Woman, Roman Holiday, Small World. Omitting the indefinite article in Queer Eye for the Straight Guy signals that it's a title, not an ordinary expression of English. It also allows the title to begin with an accented syllable, which might be seen as making it "punchier" -- more masculine, in fact. (A major subtext of the show is how much the Fab Five share with the guys they make over, just by virtue of all being men.)

Ok, if you're going to vary Queer Eye for the Straight Guy, what's available? Some small stuff: you can make the indefinite article explicit, and one or both of the Ns can be plural rather than singular (if N2 is plural, then it no longer has to have the definite article). Both of these adjustments are made in

(1) A queer eye for Nazi guys Why is the London Lesbian and Gay film festival celebrating the works of a Nazi film-maker? B Ruby Rich Friday March 12, 2004 (link)

in which we also see a substitution for Adj2, "Nazi" for "straight". Adj2 is also replaced in:

(2) "Queer eye for the dead guy" (caption for New Yorker cartoon, p. 650 in Complete Cartoons)

(3) "Queer Eye for the Santa Guy" (title of piece in The Advocate, 12/21/04, p. 58, by Carson Kressley, one of the Fab Five)

(4) "'Queer Eye' for Wine Guys (and Gals)" (Fox News article featuring Queer Eye's Ted Allen offering wine advice)

(5) Queer Eye for the Green Guy. Yes, clothes really do make the activist. By Lou Bendrick. 03 Mar 2005. (link)

(6) Queer Eyes for the Spanish Guys (2003 gay porn video from Big City Video; note that here "eye for" has an appreciation sense, as in "have an eye for something")

In a step further away from the model, both Adj1 and Adj2 can be replaced:

(7) "Homosapien eye for the Neanderthal guy" (caption for 1/24/04 "Speed Bump" cartoon)

(8) Playing the Daddy card was part of the Kerry makeover by the Clintonistas -- Bubba eye for the Brahmin guy. (Maureen Dowd op-ed column, "Getting Junior's Goat", New York Times 10/7/04, p. A31; the make-over sense is maintained, but advice-giving from someone alluded to by the subject NP is not)

Still sticking fairly close to the model, N2 can be replaced, in the examples below losing the rhyme to N1:

(9) "Queer Eye for the Straight Girl" (Bravo spinoff of the original)

(10) "Queer eye for the straight pimp" cartoon (link)

Or the whole second NP can be replaced, either by something that still rhymes with "eye" --

(11) Queer Eye for the GI (All Worlds Video gay porn film; appreciation "eye for" again)

or, more distantly, by something that does not:

(12) ... the 96-page glossy, cocktail-table magazine, New York Dog, debuted, featuring a dog psychology advice column, dog horoscopes and dog obituaries, along with such articles as the makeover-inspiring "Queer Eye for the Scruffy Dog." (Chuck Shepherd, "News of the Weird", Funny Times of January 2005, p. 15)

Then there's a whole series of variations in which "straight" and "queer" are (I hate to say it) inverted, sometimes preserving N1 and N2 --

(13) Straight Eye for the Queer Guy Watch thousands of short films; see exclusive interviews with big-name directors, producers and writers. (link)

sometimes replacing N2 by another noun --

(14) Straight eye for the queer gals Why do men love to see women kissing? It's about self-loathing -- and the lusciousness of the female body. (link -- appreciation sense again)

and sometimes switching the two NPs as wholes --

(15) Straight guys for gay eyes (Unzipped ad, October 2005, p. 69, for "straight porn for gay men" at sg4ge.com; more appreciation, now involving goal/purpose "for")

Finally, two occurrences of "straight eye for the Y guy", with "straight" as Adj1 and some Y other than "queer" as Adj2:

(16) Listed below are links to weblogs that reference Straight Eye for The Straight Guy: ... (link)

(17) Straight Eye for the Consumer Guy Dan Friedman. The second most obvious trend in television -- next to the sudden popularity of so-called 'Reality' TV -- has ... (link)

These are just fortuitous finds, plus some examples on only the first few pages of the results of googling on "queer eye for" and "straight eye for", which of course will yield only variants pretty close to the model. Eventually more distant variants will turn up, I have no doubt. I certainly don't see any template emerging in the data.

My impression is that the function of the Eye Guy figure is largely to allude to the title Queer Eye for the Straight Guy. A fair number of the variations alter the semantics of the model significantly, so that the main effect of using the figure is to call attention to the writer's cleverness; several are transparently jokes, and most of the rest seem to be more eye-catching than informative. Just the sort of thing that so annoyed Geoff Pullum in those headlines from science writing, and not particularly snowclonish.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:36 PM

Typhoon Long Wang

On the Tonight Show tonight Jay Leno made fun of the fact that the latest typhoon in China is typhoon "Long Wang", saying that the Chinese are evidently running out of names. (For our non-native English-speaking readers, the joke is that wang is a slang term for the male member.) What Jay probably doesn't know is that "Long Wang" is 龍王 "Dragon King". It's a very appropriate name for a typhoon since it's the epithet of the Rain God. And it's best not to make fun of dragons.

Posted by Bill Poser at 12:12 AM

October 17, 2005

The interpreter shortage

Mark's post on the shortage of interpreters in Iraq is yet another reminder of this persistent problem. There aren't enough interpreters in Iraq or in the various intelligence agencies and the FBI, which can't process all the material that they collect. You might think that the lack of attention to linguistic matters is just one facet of the general incompetence of the Bush administration, along the same lines as the failure to provide armour for vehicles and soldiers, but that isn't the whole of the problem. The stupid and bigoted policy of firing translators who disclose that they are gay doesn't help, but it's a small part of the problem. The fact is that the US government has shown a disturbing lack of interest in foreign languages for decades. A particularly salient example is the incident in 1980 after the Soviet invasion of Afghanistan when a Soviet soldier took refuge in the US embassy in Kabul and asked for asylum. Communication with him was difficult - none of the embassy staff could speak Russian.

The government does operate four language schools: the Defense Language Institute (military), the Foreign Service Institute (State Department), the CIA Language School, and the NSA Language School, which have a reputation for quality. Indeed, one of my high school friends, who had fallen in love with Arabic, after investigating the various ways in which he could study the language, joined the army in order to attend the Defense Language Institute. The Defense Department and the intelligence agencies have long sponsored Natural Language Processing research. However, there clearly aren't enough resources devoted to language training, and what is most curious, there appears to be little interest in encouraging soldiers to learn languages on their own initiative. As the Salt Lake Tribune article about Sergeant Hamblin mentions, "he gets no special perks for his special skills". It's a mystery to me why there aren't incentives for soldiers to learn languages, and why there aren't more extensive materials made available to help them.

The military seems to have taken language skills much more seriously during the Second World War. My father went directly from being a buck private in basic training to Master Sergeant in an intelligence position because he could speak French, Flemish, and German. The Army recognized that the ability to speak these languages was useful for interviewing civilians and interrogating enemy soldiers.

The military also made a point of training soldiers in Chinese and Japanese. In addition to the intensive courses in these languages, they produced materials for self-instruction. I have in my collection a book entitled Japanese for Military and Civilian Use by Richard D. Abraham (Instructor of Foreign Languages, Philadelphia Public Schools) and Sannosuke Yamamoto (Instructor of Japanese, Philadelphia Navy League School), published in 1944. It begins with 14 grammatically-based lessons of the sort that one would find in books in the Teach Yourself series and the like. These are followed by 28 topically oriented lists of sentences and then 11 appendices, most of which are devoted to the vocabulary for particular topics. The first, for example, covers the parts of airplanes, which are illustrated. It concludes with short Japanese-English and English-Japanese dictionaries.

Much of the vocabulary introduced is related to combat or its immediate aftermath. It includes military ranks and organization, weapons, distances and directions, medical care, questioning the enemy and statements to captors. At the same time, the authors evidently had in mind the future occupation of Japan. The book covers restaurants, cabaret and geisha, getting a haircut, having laundry done, and sending a telegram.

The soldiers I have spoken to report that they are not provided with such materials, nor given any real encouragement to learn Arabic. That's a shame since it would likely be of immediate military advantage, make a good impression on Iraqis, and reduce the number of incidents that result from misunderstanding, all too often with fatal consequences.

Posted by Bill Poser at 01:17 PM

Things that are rarely better than they normally are

Matthew D. LaPlante is a reporter who has been "traveling in Iraq with Utah-based military units", and writing about it in the Salt Lake Tribune. Last week he contributed an interesting story about the shortage of translators, and more generally of Arabic-language skills, among Army units in Iraq: "Interpreters in high demand in Iraq."

The story is focused on the experiences of a "steely eyed sergeant" named Ozro Hamblin, who is "the only member of the Utah-based 222nd Field Artillery who speaks Arabic".

Hamblin is not, in fact, a translator by trade. A Middle Eastern studies major at the University of Utah, he has been learning Arabic for a few years. He speaks enough to get by, and, he hopes, to help Iraqi citizens confronted by his unit to feel a little more at ease.

He gets no special perks for his special skills. And when he leaves this unit, as he is hoping to do soon, there will be no one to take his place.

According to the story, "the Army didn't pay for [Hamblin's] training and, during pre-deployment training, would not offer him or anyone else Arabic classes."

"I honestly thought there would be a lot more people that would speak at least a little bit of the language, and I thought there would be interpreters," said Hamblin, who is on his second tour of duty in Iraq. "The first time I was over here we never had an actual interpreter the entire time. It blew my mind. You'd think the big Army could fix these problems, but they can't."

The Defense Language Institute has made a big investment in increasing its Arabic classes, though as far as I know they are still offering only Modern Standard Arabic, and the graduates are unlikely to wind up on the streets in units like Hamblin's. Eventually perhaps CASL will make a difference, but not quickly. I doubt that there are any quick solutions here, though I share with Hamblin a sense that the folks in charge ought to be working harder on the problem.

Meanwhile, I was taken aback linguistically by the third sentence of the article:

RAMADI, Iraq - Like greyhounds out of the gate, the soldiers spring from the armored doors of their Humvees, and pour down the road toward the approaching car.

Behind their ranks, a steely eyed sergeant named Ozro Hamblin jogs to catch up. It is rare, Hamblin knows, for these kinds of situations to end better than they normally do.

Now, the reporter might have had in mind a distribution of outcomes for such situations in which it's literally true that a very small fraction of the results are above the mean (or some other interpretation of the "normal" outcome). However, I don't think this is what he meant.

Instead, I think that this is a subtle instance of the average = bad confusion, involving equivocation between the statistical and moral interpretations of words like "better" and "normal". This was the basis of Garrison Keillor's little joke about Lake Wobegon, where "all the children are above average", and it was behind the complaint that "[t]he [No Child Left Behind] tests being used are formulated so that 50 percent of the test-takers will fall below the median score -- in effect setting school districts up for failure no matter how much preparation students receive."

In this case, I think the reporter meant that "these kinds of situations" normally end badly -- another instance of the military expectation of SNAFU -- and rarely end well enough to deserve being described as having ended "better than" this expected result. More simply, I suspect that he wanted us to know that the outcome of such situations is usually bad and is rarely good, and that Hamblin knows this.

Why didn't he just say so, instead of exposing himself to the Escherian complexities of the comparative construction? I don't know, but there may be a connection to the phenomenon of overnegation. This involves backwards interpretation of sentences with multiple negatives, some of which are usually negative words with phrasal complements, as in "No head injury is too trivial to ignore", or "Don't fail to miss this spectacular event".

Posted by Mark Liberman at 10:16 AM

October 16, 2005

Omit needless pictures

In Friday's WSJ OpinionJournal, David Gelernter was unhappy about the new illustrated edition of Strunk and White:

Maira Kalman's illustrations are the occasion for the 2005 edition and attendant hoopla. They are a well-meaning mistake.

According to Gelernter,

The problem with these pictures is their strange relation to the text. A section on pronouns includes a sample sentence that mentions "Polly." On the facing page is a loud, large picture of Polly--who has nothing to do with the topic under discussion. Ms. Kalman's pictures are like a kibitzer's random observations during a conversation among friends.

I'm looking forward to the movie version, myself.

I admit that the book itself is low in dramatic tension, and completely bereft of car chases, but there are some compelling images in the background. I'd gladly pay my $8.50 to see Robin Williams twitching his way through Strunk as described by White:

From every line there peers out at me the puckish face of my professor, his short hair parted neatly in the middle and combed down over his forehead, his eyes blinking incessantly behind steel-rimmed spectacles as though he had just emerged into strong light, his lips nibbling each other like nervous horses, his smile shuttling to and fro under a carefully edged mustache.

Strunk has some good speaking lines as well:

In the days when I was sitting in his class, he omitted so many needless words, and omitted them so forcibly and with such eagerness and obvious relish, that he often seemed in the position of having shortchanged himself — a man left with nothing more to say yet with time to fill, a radio prophet who had out-distanced the clock. Will Strunk got out of this predicament by a simple trick: he uttered every sentence three times. When he delivered his oration on brevity to the class, he leaned forward over his desk, grasped his coat lapels in his hands, and, in a husky, conspiratorial voice, said, "Rule Seventeen. Omit needless words! Omit needless words! Omit needless words!"

And so does White:

To me no cause is lost, no level the right level, no smooth ride as valuable as a rough ride, no like interchangeable with as, and no ball game anything but chaotic if it lacks a mound, a box, bases, and foul lines.

More raw material for the screenplay can be found here, and perhaps Geoff Pullum would be available as a consultant.

[By the way -- is it WSJ house style never to use the title "Dr.", or did they just goof in calling the author "Mr. Gelernter"?]

Posted by Mark Liberman at 02:15 PM

Playing one 3

The story so far: in tracing back the Play One snowclone -- the model for which is "I'm not a doctor, but I play one on TV" -- I have separated the commercial that seems to have contributed most to its spread (for Vicks Formula 44 cough syrup, beginning in 1985, using actors who played doctors on soap operas) from a similar commercial (for Oral-B toothbrushes, beginning in 1982, using an actor who was framed as a dentist) that probably helped. In updates in my last posting, I suggested that people who recall the line as "I'm not a dentist, but I play one on TV" have probably blended elements from the two ads.

But at least three loose ends remain: several sources suggest the involvement of Robert Young, star of "Marcus Welby, M.D." (which played from 9/69 through 5/76) in the snowclone; several sources recall the commercial as being from the 60s or 70s, not the 80s; and several sources recall a pain reliever, aspirin in particular, as the product being hawked. I'm now prepared to say that, basically, everybody's right, except for a certain amount of blending of memories. There were THREE commercials contributing to the snowclone.

Ok, let's start with Robert Young. A few of my e-mail correspondents recall him as having uttered the line in a television commercial, and several websites agree, for instance this one:

Traditional marketing is all about Impostors - BzzAgent isn't. From the first radio ad that began with a cheery announcer calling you "friend" to the attempt to make every TV spot about people just like you, brands have always relied on two simple facts. You trust people you know and you buy from people you trust. They don't know you so they have made an art form of creating instant trust relationships.

If they use a celebrity you recognize as a spokesperson they reason you feel you already know them - so you will trust them. When Robert Young, then the star of TV's Marcus Welby M.D. said, "I'm not a doctor but I play one on TV" while wearing a lab coat (!!) people bought aspirin. They felt they knew him - he was like their doctor - they would certainly do what their doctor recommended.

and this one:

Remember the old advertisements that featured actor Robert Young from Marcus Welby, when he'd say, "I'm not a doctor, but I play one on TV?" Well, for too long we've had Presidents who weren't fiscal conservatives, but who played one on TV! Together this President and this Congress have shown the American people that old categories and labels don't make much sense anymore--it's actions that count.

But other sources don't claim that Young uttered the line, merely that he was featured, in his Dr. Welby persona, in a commercial, as here:

Sometimes advertisers create lines or scenes so memorable that they're cemented into the culture. Here's one: "I'm not a doctor, but I play one on TV." That was Robert Young, who played Marcus Welby, M.D., with such credibility that a drug company hired him to put on a white lab coat and hawk its product in a commercial.

and in e-mail from Lori Levin:

You pointed out that the Wikipedia doesn't provide a reference for the Marcus Welby commercial in the 1970's. I also can't provide a reference, but I remember it pretty clearly. My husband does too. (We are old enough to remember the 1970's.) Can't remember what the product was, just that it was strange to be asked to believe an actor who played a doctor.

We now have most of the ingredients of a full account, which is clarified here, on "Steve's Primer of Practical Persuasion and Influence" (by Steve Booth-Butterfield, in Communication Studies at West Virginia University):

I am old enough to remember the TV series, "Marcus Welby, M.D." The actor, Robert Young, portrayed a friendly, wise, and incredibly available physician who never lost a patient except when it would increase the show's Nielsen ratings.

Most interesting was the fact that Robert Young parlayed his fame as Dr. Marcus Welby into a very productive sideline. He sold aspirin on TV ads. And he sold aspirin, not as Robert Young, the actor, but as Dr. Marcus Welby.

There were enough lazy thinkers out there that they did not realize that the guy on the ad selling aspirin was merely an actor and not the real thing. It didn't matter. Robert Young looked and acted like an authority. And sales of his brand of aspirin increased.

Eventually the federal authorities got wise to this gimmick and cracked down on it. It is now illegal to use an actor in this way. So what have advertisers done? Their response and its impact is so amazing to me that it stands as the best example of how lazy we can be.

Here's the new trick. The advertisers will still use a popular actor to sell their aspirin and stay legal with their ads. Here's what happens. The famous TV doctor looks at the camera and says, "I'm no doctor, but I play one on TV and here's the aspirin I recommend." And sales of that aspirin increase.

So: apparently Robert Young did make a commercial as Welby, in the 70s (those who recall it as in the 60s are probably just illustrating the old observation that a lot of what we think of as "the 60s" actually happened in the early 70s), and for some brand of aspirin (which brand it was is not particularly important for the story, but Benita Bendon Campbell's hazy recollection that it was Bayer aspirin is probably right, to judge from the snippet of a 1974 article by Dennis Baron in the Journal of Popular Culture that I googled up). But it seems that he didn't actually say "I'm not a doctor, but I play one on TV"; instead, he just continued playing a doctor, but now in a commercial instead of a dramatic show. And some of us have conflated aspects of this commercial with aspects of the later Oral-B and Vicks commercials.

Several sources suggest, in fact, that for Young himself the boundary between Young and Welby was none too sharp, so that he would have found it unnatural to separate himself from his television character. He later returned to this persona in another commercial, as described by Ronald Pine (in Philosophy at Hawaii Community College) here, on his "Essential Logic" website:

Sometimes the actor being paid has acted previously as an authoritative personality relevant to the product being endorsed. Actor Robert Young was best known for his roles in the TV shows Father Knows Best and Marcus Welby, M.D. In both shows, he played the role of a very stable and wise person that people could turn to in times of confusion and agitation. Later, in a Sanka coffee commercial, he seemed to play the same role endorsing the caffeine-free benefits of this product in the commercial. Although there was no direct reference to him being a doctor, he wore the same clothes and acted the same as he did in Marcus Welby, M.D., endorsing Sanka as a cure for upset people who were about ready to strangle their dogs or kids.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:13 PM

October 15, 2005

Web search counts: half empty or half full of __?

Heidi Harley at Heideas has posted some thoughts on Scalar Adjectives with Arguments, illustrated by a cartoon.

Heidi observes that "half empty of X" seems bad to her, though "empty of X" is fine; and she points out that Google counts support her judgments, since the ratio of the frequencies of "half full of" and "half empty of" is much higher than the ratio of the frequencies of "half full" and "half empty" -- by a factor of more than 100.

This reminded me of the on-going concerns about the usefulness of web counts for linguistic analysis. One way to evaluate this is to look at the consistency of the numbers across different web search engines, as I did in the post just linked to. As discussed in that post (and especially in the posts by Jean Veronis cited there), there are several reasons for inconsistencies across search engine counts:

differences in what's indexed, especially in how duplicate documents and nests of fake "search engine optimization" documents are excluded;
differences in how much of each document is indexed;
differences in how counts are estimated (since the numbers are often an extrapolation from a small sample).

The advantage of the web search engines is that they index a lot of documents, so that you can get a reasonable sample size for fairly small corners of the language. The disadvantage is that (despite their best efforts) they index a lot of crap, and (at least in some cases) their counts may be estimated by methods that are not very accurate.

In order to offer a small numerical window on some of these issues, I thought I'd repeat Heidi's experiment with two other search engines (Yahoo and MSN in addition to Google), and with three other corpora of various types and sizes.

I added the GigaWord corpus of English-language newswire available from the Linguistic Data Consortium, which currently comprises about 2.5 billion words (2,458,744,437 words in 5,710,419 documents, to be exact); the LDC's collection of English conversational telephone speech (CTS), which currently comprises about 26 million words (26,151,602 words in 28,274 conversations, to be exact); and the World Edition of the British National Corpus, which includes about 100 million words (100,467,090 orthographic words in 4054 texts, to be exact).

	Google	Yahoo	MSN	GW (words)	GW (docs)	CTS	BNC
full	1,890,000,000	2,000,000,000	367,645,836	524,609	436,154	2,577	28,215
empty	137,000,000	192,000,000	37,086,190	71,923	64,181	280	5,379
full / empty ratio	13.8	10.4	9.9	7.3	6.8	9.2	5.2
half full	2,030,000	4,090,000	683,765	2,111	2,031	14	66
half empty	1,580,000	2,530,000	384,730	1,966	1,908	8	35
half full / half empty ratio	1.28	1.62	1.78	1.1	1.1	1.8	1.9
half full of	248,000	397,000	63,841	123	119	2	22
half empty of	1,910	1,420	1,682	3	2	0	1
half full of / half empty of ratio	130	280	38	41	60	NA	22
(half full of / half empty of) / (half full / half empty) meta-ratio	101	173	21	38	56	NA	12
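The bottom row of the table can be recomputed from the raw counts. Here is a minimal Python sketch, assuming the meta-ratio is the "half full of" / "half empty of" ratio divided by the plain "half full" / "half empty" ratio (that is the reading that reproduces the figures in the bottom row). The names `COUNTS`, `ratio`, and `meta_ratio` are made up for this illustration.

```python
# Counts copied from the table above, in the order:
# (half full, half empty, half full of, half empty of)
COUNTS = {
    "Google": (2_030_000, 1_580_000, 248_000, 1_910),
    "Yahoo": (4_090_000, 2_530_000, 397_000, 1_420),
    "MSN": (683_765, 384_730, 63_841, 1_682),
    "GW (words)": (2_111, 1_966, 123, 3),
    "GW (docs)": (2_031, 1_908, 119, 2),
    "CTS": (14, 8, 2, 0),
    "BNC": (66, 35, 22, 1),
}


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def meta_ratio(half_full, half_empty, half_full_of, half_empty_of):
    """Ratio of the 'of' ratio to the plain ratio; None if either is undefined."""
    plain = ratio(half_full, half_empty)
    with_of = ratio(half_full_of, half_empty_of)
    if plain is None or with_of is None:
        return None
    return with_of / plain


for source, counts in COUNTS.items():
    value = meta_ratio(*counts)
    print(source, "NA" if value is None else round(value))
```

The CTS column comes out as NA because it has no instances of "half empty of", so the ratio is undefined rather than merely large.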

I've got four points to make about this tiny little experiment.

The first point is that Heidi's results are validated: it's clear that "half empty of" is a lot less frequent than "half full of". This doesn't seem to be a matter of complete ungrammaticality, since some of the hits certainly seem perfectly fine to me:

(BNC) We swayed down the long baggage car, which was half empty of freight and very noisy, and George, having told me to remove and lay aside my waistcoat in case I got oil on it, unlocked the door at the far end.
(GW) When Mike Tyson and Buster Mathis Jr. finally entered the ring Saturday night, the 18,000-seat Spectrum was half empty of spectators and fully barren of suspense.

What's going on? See below, and look at Heidi's post and the comments from Q. Pheevr and Lance Nathan for some thoughts.

The second point is that size matters. A corpus of 26 million words (the LDC CTS corpus) is too small to address Heidi's question in a reliable way. There are only 2 instances of "half full of", and no instances of "half empty of", and from this we can determine little other than that neither of the patterns involved is terribly common. A corpus of 100 million words (the BNC) is not a great deal better. There are 22 instances of "half full of", and 1 instance of "half empty of". This is enough to support a judgment that the first pattern is genuinely more frequent than the second, but not enough to support much research into why this is or what it means. A corpus of 2.5 billion words (the LDC GW newswire corpus) is again not a great deal better: the string counts of 123 and 3 give us even better confidence that there's a difference in expected frequency, but not a great deal to go on in determining what factors are involved in discouraging or permitting the infrequent pattern.

The third point is that the welcome size of web indices has a price: web search results are sometimes heavily weighted with crap of various sorts. In this case, nearly half of the web search counts for "half empty of" are from the typo "half empty of half full" (for "half empty or half full"): 836 of 1,910 for Google, 573 of 1,420 for Yahoo, and 725 of 1,682 for MSN. There are other problems as well:

Crisis management would be far too Glass-is-half-empty of a term to describe life since the relocation to LA ...
Only the most "glass half empty" of HR professionals will perceive that their careers are at risk
.. the large bottle, half empty, of diet cola, reminder of the previous evening's riotous revelry...
"Half empty, of course," said Pamela.

Still, we can find plenty of relevant examples to suggest ideas about why the pattern is relatively infrequent and nevertheless sometimes validly used. One clue is that many of the valid examples involve parallel contrast with "half full", e.g.

It’s half full of water and half empty of water.
...is it half empty of sadness or half full?
...remember this: when a glass is half empty of water, it's also half full of hot air!

Many of the other examples seem to be cases where "half empty" means "half emptied", referring to a point in a process of emptying, whether literally or as a metaphoric evocation of loss:

Due to the basic laws of physics, by the time your reel is half empty of line, the drag has effectively doubled.
The glass is half empty of a brown liquid, two ice cubes float in the mix, waiting to melt and become part of the whole.
The moon is half-empty of white. The houses and roads are serrated.
She looked around the familiar scene -- two glasses half empty of soft drink, overflowing ashtrays, the TV flickering with the sound off.
Sure enough, when I got back, the tub was half empty of water.
The Veneroso number suggests that central bank vaults are “one-third to one-half empty” of their reported gold.

This supports Q. Pheevr's suggestion that "The contexts in which I would use half empty are mostly ones in which the container being described is expected to be full, or was recently full and has been partially emptied". In any case, web search results need to be examined carefully, and counts need to be considered in the light of such examination.

The fourth point is that web search counts are (as we already knew) quantitatively unreliable. As I've observed before, we can tell this from the instability of ratios of counts across different search engines -- and of course ratios of ratios of counts are even less stable. We can look at the issue in a different way, given exact counts from corpora whose properties we know better (very few duplicates, little or no garbage text, decent proofreading). In particular, we can use the exact (word and document) frequency counts from the GW corpus to (crudely!) estimate the number of documents indexed by the web search engines for comparable searches. Note that the structure of the GW corpus (2,458,744,437 words in 5,710,419 documents) means that it averages about 431 words per document, which should be the same order of magnitude as the web documents that the search engines index.

The word full occurs in 436,154 GW documents out of 5,710,419, which corresponds to a rate of about 76.4 per thousand documents. If the rate for the search engines were the same, and their (document) counts were accurate, then MSN would be indexing about 4.8 billion documents; Yahoo about 26 billion documents; and Google about 25 billion documents. If we do the same extrapolations for empty, we get MSN at 3.3 billion, Yahoo at 17 billion, and Google at 12 billion. Doing it for "half full" gives us MSN at 1.9 billion documents, Yahoo at 11.5 billion, and Google at 5.7 billion. The estimates for "half empty" are MSN at 1.2 billion, Yahoo at 7.6 billion, and Google at 4.7 billion.
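The extrapolation in the paragraph above is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic, under the same crude assumption the paragraph makes (that a term's document rate on the web matches its rate in the GW corpus). The names `QUERIES` and `estimated_index_size` are invented for the example; the counts are copied from the table.

```python
GW_TOTAL_DOCS = 5_710_419  # documents in the GW newswire corpus

# For each query: (GW documents containing it, reported hit counts per engine)
QUERIES = {
    "full": (436_154, {"Google": 1_890_000_000, "Yahoo": 2_000_000_000, "MSN": 367_645_836}),
    "empty": (64_181, {"Google": 137_000_000, "Yahoo": 192_000_000, "MSN": 37_086_190}),
    "half full": (2_031, {"Google": 2_030_000, "Yahoo": 4_090_000, "MSN": 683_765}),
    "half empty": (1_908, {"Google": 1_580_000, "Yahoo": 2_530_000, "MSN": 384_730}),
}


def estimated_index_size(engine_hits, gw_docs_with_term, gw_total_docs=GW_TOTAL_DOCS):
    """Estimate how many documents an engine indexes, assuming the term
    appears in web documents at the same rate as in GW documents."""
    rate = gw_docs_with_term / gw_total_docs
    return engine_hits / rate


for query, (gw_docs, hits) in QUERIES.items():
    for engine, count in hits.items():
        billions = estimated_index_size(count, gw_docs) / 1e9
        print(f"{query:12} {engine:7} {billions:5.1f} billion")
```

Run over all four queries, the estimates for each engine shrink as the query gets rarer in GW, which is the pattern that matters here rather than any single figure.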

All of these are order-of-magnitude consistent with the general idea that text searches on the web are now indexing roughly 10 billion documents. But there's an interesting trend: as we look at words or phrases whose document frequency in the GW corpus gets smaller, the resulting estimates of the size of the search engine's collections also get smaller. I don't think this gives us any special insight into the search engines' true index sizes, but it does suggest that their methods for estimating counts might have increasing positive bias with increasing frequency.

Posted by Mark Liberman at 05:51 PM

Suffices to say

I guess I'm getting interested in eggcorns after all.

Yesterday I had lunch with a couple of nonlinguist friends, and one of them asked me whether "suffice it to say" or "suffices to say" is grammatically correct. I had never heard (or read) "suffices to say" before, but I had some educated guesses about how to answer my friend's question, which were later generally confirmed with a few quick Google searches.

First, my educated guesses. I think "suffice it to say" is the right idiomatic phrase, but it's easy to see why it might be misanalyzed as "suffices to say". For one thing, "suffice" is not a commonly-used verb in English outside of this idiom. Thinking it over since the conversation over lunch yesterday, I can only think of one other common use: the phrase- or sentence-final use that is typically preceded by a modal; e.g., "a simple phone call would suffice". The available evidence thus does not make it clear that this verb can be used transitively, the "it" of "suffice it to say" not being particularly referential. Moreover, the subjectless subjunctive is also not very common in English anymore; there's no reason to think that the covert subject is third person singular, and thus that the verb is third person indicative, which in the present tense is "suffices", which sounds sufficiently like "suffice it" to complete the misanalysis.

[ Update, Oct. 17: Rich Alderson writes to offer the following better-educated alternative analysis:

In what way is "suffice it to say" to be taken transitively, or to have a covert subject? It is, of course, a frozen subjunctive, with nonpersonal "it" as subject, and the typical inverted subject-verb ordering of older hortatory subjunctives in English (as in the example you quoted, "be that as it may").

Add in the slightly more archaic "suffice to say", and the common conversion of frozen subjunctives to indicatives ("it suffices to say"), and we have the situation with which your friend presented you.

Absolutely right -- don't know what I was thinking by calling that "it" an object. (Can you say "that suffices me" as an alternative to "that is sufficient to/for me"? I think not.) I need to work on my syntactic analysis skills, or maybe I just need to buy the Cambridge Grammar.

Incidentally, further perceptive comments on this post can be found here. ]

Now, the results of my Google searches. {"suffice it to say"} gets over 2 million hits, whereas {"suffices to say"} gets just under 100,000. More interestingly, the vast majority of the examples of "suffices to say" (roughly 90%) are subparts of {"it suffices to say"}, which indicates to me that, among most people who use "suffices to say", the misanalysis is fairly complete: roughly put, if they were just misanalyzing "suffice it" as "suffices" but otherwise just memorizing the phrase as an idiomatic, subjectless indicative clause, there would be no motivation for adding the subject "it". (Incidentally, I also got a literal handful of 5 hits for {"he suffices to say"}, and none for {"she suffices to say"}.)

Finally, the first hit in my search for "suffice it to say" was for a Random House "Word of the Day" page. (The link to the original page appears to be dead, so I relied on Google's cache thereof "as retrieved on Oct 8, 2005 18:17:28 GMT".) The word of the day on July 14, 1997 was apparently "suffice", but the text on the page (copied below) is a response to someone's question about the idiomatic phrase "suffice it to say". Details aside, the parallels between what's said in this response and what I thought about it over lunch yesterday are pretty remarkable, given that I'm not a grammarian of English -- though of course this may say more about this particular Random House Word of the Day columnist's abilities than about mine ...

Allison Payne writes:

I hear people say "suffice it to say..." To my way of thinking, this should be "let it suffice to say that..." Is there a rule about this or am I getting irritated for no reason?

You're getting irritated for no reason. There are a few things going on here, and the easiest thing to do would be to say it's just an established idiom, but we can look at it in some more detail.

Suffice has several meanings, of which the most important, and the only truly current one, is the intransitive 'to be enough or adequate': "Two hours should suffice"; "Why need I volumes, if one word suffice?" (Emerson).

In the expression suffice it to say, the word suffice is a subjunctive. In other words, it does mean "let it suffice to say...." In the past there were various ways suffice could be used in the subjunctive ("Suffise, that I haue done my dew in place"--Spenser; "My designs/Are not yet ripe; suffice it that ere long/I shall employ your loves"--Beaumont and Fletcher), but now it is effectively found only in this set phrase. This example is known as the formulaic subjunctive: an invariant expression found chiefly in independent clauses. Some other examples of the formulaic subjunctive are the phrases "Be that as it may..." (i.e. "let that be..."); "Come what may...," "God save the Queen!" (i.e. "may God save the Queen"), and others, none of which excite any controversy.

The it in suffice it to say is an impersonal or indefinite pronoun, one that functions as a grammatical placeholder without supplying much real meaning. Relevant examples, which are assigned to various complex subcategories by grammarians, are "it's raining," "go it alone," or "it behooves you," where behoove itself is an impersonal verb we discussed last year.

[ Comments? ]

Posted by Eric Bakovic at 02:29 PM

Pot and Kettle

Geoff Pullum's comments on Michael Tortorello's complaint that Supreme Court nominee Harriet Miers' bland writings are grammatically and stylistically poor omit one point that I found salient: Tortorello's complaint is itself stylistically weird. His statement that:

Miers hasn't just micturated on the grave of Strunk and White; she has desecrated the whole cemetery.

uses the verb to micturate. I bet quite a few people either had to look that one up or inferred its meaning from the context but were previously unfamiliar with it. The word is rarely used. Here are the results I obtained from Google for micturated and various synonyms:

Term	Ghits
pissed	6,790,000
peed	1,740,000
urinated	427,000
went/made wee wee	971
micturated	509

Since pissed also occurs in the non-synonymous expression to piss off, the count reported above is the number of Ghits on pissed less the number of Ghits on pissed off.

Here are the counts I got for V on his/her/the grave:

Term	Ghits
pissed	606
peed	127
urinated	30
went/made wee wee	0
micturated	0

There were 3 gross hits (GrGhits?) on micturated, but one of them was to Tortorello's article, and two were to different posts of this article where it is in square brackets, intended as a substitute for the putatively offensive term actually used by the person quoted. Since nobody really used this collocation, I have not counted it.

As it happens, the word to micturate is part of my vocabulary, but that is the result of having had a fair amount of exposure to medical terminology. Even so, it is hard to imagine a context in which I would find it the natural word to use. Even in a formal medical context to urinate would be entirely appropriate and much more natural. I strongly suspect that there are few if any native English speakers for whom this is normal usage outside of a formal, medical context, and even there I suspect that it is rare. To me, using it in a less formal non-medical context is very odd; to use it in the expression to piss on X's grave is bizarre.

I'm not sure whether Mr. Tortorello was being extremely prissy or whether he decided to spice up his writing by going to a thesaurus, but either way I consider this poor usage. This seems to me to be a case of the pot calling the kettle black.

[Addendum: reader Steve Jones suggests:

We might piss, and lawyers and even High Court Judges have been known to urinate from time to time, but a potential Supreme Court Justice can't possibly do anything less hi'-fallutin' than 'micturate'.

He's got a point, though given Tortorello's attitude toward Miers I am doubtful that this was his reason for using to micturate.]

Posted by Bill Poser at 12:20 PM

What Kind of Error?

A CBC News article about a Danish newspaper's solicitation and publication of pictures of the Muslim prophet Mohammed contains the following interesting error:

Flemming Rose, cultural editor, denied the newspaper was being provocative. Instead, the call for pictures was a reaction to the rising number of situations in which artists and writers censure themselves out of fear of radical Islamists, he said, according to Jyllands-Posten.

The correct word is of course censor; censure means "to criticize" with connotations of "strongly", "harshly", or "officially". I'm a bit surprised to see this sort of error in a news article as one expects journalists to have a large vocabulary, but what I find more interesting is the question of what sort of error it is. Does the author (or possibly editor) not know the difference between the two words, or is it a spelling error?

[Addendum: Reader Gaston Dorren suggests a third possibility, namely that the article, or the statement by Flemming Rose, was translated from a Danish original and that the translator mistakenly carried over into English the Danish verb censurere. That's quite possible, though it still leaves the question of why the presumably native-English-speaking editor didn't catch it.]

Posted by Bill Poser at 11:16 AM

October 14, 2005

Grammatical micturation??

I'm not getting this. Michael Tortorello read the David Brooks New York Times piece on Harriet Miers' bland writings in the Texas Bar Journal and commented:

Miers hasn't just micturated on the grave of Strunk and White; she has desecrated the whole cemetery.

Now first, I don't believe these oft-linked gentlemen ended up in the same grave. Strunk died in 1946 and White almost 40 years later in 1985. Perhaps Tortorello means the grave of the book. But unfortunately it is not dead; a new illustrated edition (can you believe that?) is coming out this month. But never mind that. What's this suggestion about Miers' writing constituting a metaphorical peeing on the putative grave of The Elements of Style? That's what I'm interested in. Let's take a look at this writing, shall we?

Here are the quotes from Miers that David Brooks provides:

"More and more, the intractable problems in our society have one answer: broad-based intolerance of unacceptable conditions and a commitment by many to fix problems."

"When consensus of diverse leadership can be achieved on issues of importance, the greatest impact can be achieved."

"An organization must also implement programs to fulfill strategies established through its goals and mission. Methods for evaluation of these strategies are a necessity. With the framework of mission, goals, strategies, programs, and methods for evaluation in place, a meaningful budgeting process can begin."

"We have to understand and appreciate that achieving justice for all is in jeopardy before a call to arms to assist in obtaining support for the justice system will be effective. Achieving the necessary understanding and appreciation of why the challenge is so important, we can then turn to the task of providing the much needed support."

Now, don't get me wrong. I'm as much opposed to intellectual mediocrity, and the kind of vacuous drivel that gets written in newsletter editorials, as the next man. This is a rather rare case of me agreeing with Ann Coulter, in fact: Miers should not even be allowed to play a Supreme Court judge on TV. The above excerpts are evidence of a profoundly dull and cliché-clogged mind. But I'm interested in the idea that Strunk and White's stupid and old-fashioned little collection of platitudes and warnings about writing English is in some way dishonored here. As far as I can see, after several almost unbearably dull readings, there is absolutely nothing in the above excerpts that contravenes prescriptive grammar at its most benighted, or Strunk/White style at its most clipped. The writing is pathetic; but there is simply nothing wrong with its grammar or the formal aspects of its style. Do not bring this woman up before the grammar court. Release her. She is dull-witted, yes. She will never win prizes, but grammatically and even stylistically, she is innocent.

Posted by Geoffrey K. Pullum at 09:26 PM

No Inuktitut words for no ice?

According to a BBC Radio 4 segment today on environmental crisis in the Arctic:

A report published last month made the shocking claim that the Arctic ice cap could disappear by the end of the century. How is the Inuit population is coping with the reality of a melting permafrost?

About five minutes into the segment, they get into the inevitable words-for-X issue. Up there in Nunavut, it's either feast or famine, lexicographically speaking. Last time around, it was the problem of robins: now it's extreme weather event and similar phrases.

BBC Announcer: This new information is posing something of a problem for the Inuit language, Inuktitut. There's simply no words to describe terms like greenhouse gas emissions or concepts like global warming. Members from across the territory have been meeting in Iqaluit to see if they can't agree on some new ones.

(Inuit elder) Nick Amautinuaq (sp?): We have resource people to try to explain the definition of the word, in English, and we try to create uh Inuktitut word. We try to create one word instead of explaining the meaning of the whole uh meaning of uh extreme weather event.

BBC Announcer: Did you come up with a word for- for extreme weather event?

Nick Amautinuaq: We select / silau tsamanuktularininga / so that's why uh we uh try to create some uh new Inuktitut words.

I've given an approximate transcription in roman orthography, and linked to a sound file -- I'll look forward to some Inuktitut scholar telling me how to spell it properly, and what its morphological analysis and interlinear gloss should be. This sort of thing is pretty hard to predict, but I'm feeling one of those scholarly hunches that we trained linguists get sometimes: do you think it might turn out to mean something like, um, "extreme weather event"?

[I understand, of course, that it's entirely reasonable to put some systematic effort into deciding on a standard translation of technical vocabulary into a language that lacks it. What's unreasonable in this case is the BBC's all-too-predictable assumption that this normal and commonplace process represents a "problem" for Inuktitut as a language, or for the Inuit as a culture.]

[BBC radio tip from Richard Cox]

Posted by Mark Liberman at 06:21 PM

We are everywhere

We here at Language Log Plaza are astonished and gratified to have reached a new level of recognition: citation in an advertisement, for the HIND Motion Sensor Sport Bra Top for moderate impact sports:

The Motion Sensor bra may [give] a whole new meaning to tighty whities unless the linguistics experts get involved. In any case, the Motion Sensor is sure to give support in all the right places. Built for B to D cups, the molded liner wicks, separates, and successfully battles the bounce. Also available in show-off colors.

Yes, that link is to my first Language Log posting on tighty whities. The sports bra sells for $46.95, by the way.

We Are Everywhere.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:12 PM

Stupid title blather for language articles

I have commented before (here) on decent articles about linguistic topics getting stupid titles stuck on them by editors who seem to have one-track minds limited to words and symbols and tongues and noises you make with your mouth. Here's another one, by Michael Erard. The topic? How the USA is gravely short of language teachers, and of course short of people who are expert in the use of foreign languages. Erard's article discusses it sensibly and seriously. The goofy editor's title? "Tongue Tied." What the hell does Erard's topic have to do with being tongue-tied? Nothing. It's as if editors reach into a small box labeled "Stupid punning titles for language articles", pick out a random slip of paper, and just paste it on without even reading it.

By the way, Jim Gordon has just reminded me that they don't do so well in dealing with articles that are not about language, either. He noticed one of today's stories in the Washington Post under the headline "Asian Bird Flu Found in Turkey". The virus was not, of course, found in a turkey. Jim says he was on the edge of referring to the headline writers as turkeys, but then he decided it would be an insult to a noble bird.

Posted by Geoffrey K. Pullum at 10:48 AM

October 13, 2005

Playing One 2

My e-mailbox overflows with offers of information about the origin of the Play One snowclone. What seems pretty clear now is that the model was "I'm not a doctor, but I play one on TV." But then we have to distinguish who first said this publicly and whose use of it caused the quote to take off as a popular formula. As historical linguists put it, we have to distinguish between actuation -- first uses -- and spread. And then we have to confront the fact that some of the information on the net is, surprise surprise, not entirely accurate.

I have heard so far from, in order, Ben Zimmer (three times), Don Porges, Rob Malouf, Benita Bendon Campbell, Sean Williford, Carrie Shanafelt, Ken Callicott (three times), Mike Albaugh, Cody Boisclair, and Jim Toth. I've been offered accounts from the everything2 site, the IMDB, and the Wikipedia. All these sources say that the line was first uttered, in a 1980s commercial for Vicks Formula 44 cough syrup, by the actor Peter Bergman, who played Dr. Cliff Warner on the soap opera "All My Children" (and then moved in 1989 to the non-medical role of Jack Abbott in "The Young and the Restless").

There is no question that Bergman did utter the line in that commercial, but as Zimmer discovered, he was not the first actor to utter it in that commercial. That, it turns out, was Chris Robinson (in 1985), who played Dr. Rick Webber on "General Hospital". When Robinson was convicted of income tax evasion, he was replaced by Bergman (in 1986).

In any case, the Formula 44 commercial -- a clever device to defuse the criticism that viewers might take its advice to be coming from a real M.D. -- seems to have been what triggered the spread of the catchphrase, which was then riffed on in various ways ("parodied in many pop culture references", as the Wikipedia article on notable events of 1986 puts it). But was this the first use?

The Wikipedia article says no. It tells us that the phrase was "first used in the early 1970s by Robert Young of 'Marcus Welby, M.D.' fame". This is entirely plausible, but the article provides no citation to back up the claim.

Two of my correspondents remembered (but without much assurance) the TV commercial as being for a pain reliever rather than a cough syrup, an idea echoed by philosopher Michael Connelly in his webpage on informal fallacies, where under "Appeal to Authority", he refers to

the commercial for a 'popular' pain reliever which is endorsed by an actor who is "not a doctor, but I play one on TV"

Another correspondent recalls the commercial as being from the early 60s rather than the 80s, but I've found no evidence for this. Memory is a tricky thing.

Finally, Shanafelt locates the commercial in the 80s, but recollects, in some detail, a "dentist" rather than "doctor" version of it:

Isn't this snowclone derived from that terrible Oral-B commercial in the 80's? The original script had a guy in his bathroom, facing the camera, saying, "I'm not a dentist, but I play one on TV." He then went on to explain his preference for Oral-B toothbrushes. The oddest thing was that I didn't remember the actor ever having played a dentist on television, or there even being a television character who was a dentist. I never even figured out what the guy's point was about playing a dentist and knowing anything about toothbrushes.

This specific a recollection is hard to dismiss out of hand. And Connelly seconds Shanafelt's report, in a passage that maddeningly assumes that his readers know what he's talking about:

the Oral-B toothbrush ad- the fellow in the towel may well be a dentist- but who knows if most dentists believe the claim?

I haven't been able to google up any other references to such an Oral-B commercial. It seems to lack the fame of the Vicks Formula 44 commercial. Maybe it was a second-generation take-off on the cough syrup ad, in which case it wouldn't have been necessary for the actor to be someone who played a dentist on TV.

There is, however, at least one significant television character who was a dentist: Jerry Robinson (played by Peter Bonerz) on "The Bob Newhart Show". But I can't find any evidence that Bonerz appeared in an Oral-B commercial, or indeed in a commercial for dental products of any sort.

[Updates: Don Porges points out Jerry Robinson's predecessor in situation-comedy dentistry, Rob Petrie's neighbor Jerry on the "Dick Van Dyke Show".

And... Rob Malouf notes a website that describes the "Rob the Dentist" Oral-B commercial from the 80s: "The ad featured a man in his towel with his back to the camera and a John Laws voice over with the memorable tagline 'The toothbrush most dentists use'." The campaign was launched in 1982 and lasted for nearly ten years.

So my current guess is that Shanafelt blended two 80s commercials, the Vicks ad (actor facing camera and disavowing being a doctor) and the Oral-B ad (actor with back to camera and presented as a dentist).]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:35 PM

All verbal assassins speak in the passive voice

So wrote Peter Porcupine in March of 1797, attacking Noah Webster for "grammatical inaccuracy" in a froth of phrases like "illiterate booby", "inflated self-sufficient pedant", "very great hypocrite", and even "something of a traitor".

The Oct. 8-14 issue of the Economist includes a review of recent books from the rightward end of the political spectrum. Their titles (Surrounded by Idiots, Liberalism is a Mental Disorder, 100 People Who Are Screwing Up America, The Vast Left Wing Conspiracy) support the review's first sentence: "Political debate in America seems to grow less civil by the hour." However, I happen to have been reading the 18th-century political pamphlets of Peter Porcupine (the pen name of William Cobbett), and I wonder whether even Mike Gallagher, Michael Savage, Bernard Goldberg and Byron York have yet matched the standard of invective that he set.

Cobbett's attack on Webster is one of his milder sallies, but it has special linguistic interest, both because of its target and because of its content. Below the fold, I give some background, and quote from the exchange between the two men (courtesy of the facsimile .pdfs at ProQuest's American Periodicals Series, 1741-1800).

The background:

Noah Webster: born 1758 in Hartford, Connecticut; graduate of Yale; wrote "the blue-backed speller" (A Grammatical Institute of the English Language, 1783) that was still selling a million copies a year in 1861; founded New York's first daily newspaper, American Minerva, in 1793. A dedicated member of the Federalist Party.

William Cobbett: born 1763 in Surrey, England; non-commissioned officer in the British Army 1784-1792; educated himself in barracks by "reading the contents of the circulating library of [Chatham], and getting up by heart Lowth's English Grammar"; fled to France after making charges of corruption against his former officers; moved to America in 1793 after Louis XVI was executed and Britain declared war on France; in Philadelphia, he published Porcupine's Political Censor (1796-1797), and Porcupine's Gazette (1797-1800). A leading Federalist wingnut.

During the European wars that began in 1793, the Federalist party favored Britain, while the Republicans favored France. Cobbett was passionately anti-French and pro-British. Writing as "Peter Porcupine", he excoriated the partisans of the French, and also scorned those in his own party, like Webster, who were less than totally committed to the British side.

Noah Webster in the Minerva of New York, March 21, 1797:

In a late paper, we inserted sentiments of this kind, that the putting up in the Coffee House, a card, on which was painted the English flag, was a low pitiful business, equalled only the meanness of putting up a French flag, and that it is servile to be bandied about between the flags of different foreign nations. We ought to unite under our own flag and learn to be a nation.

Peter Porcupine has copied the paragraph with disapprobation, and says it contains more of vulgar prejudice, and mistake, than of justice or good policy. [...]

No comment will be made on the insinuation of "Vulgar Prejudice," against the Editor of the Minerva. When Peter becomes acquainted with the editor's real character, he will learn, that in a combat of that kind, he himself must certainly be the loser.

But we contest Peter's principles. It was strongly suspected many months ago, that his principles are not very friendly to the independence of America, and still less to the form of our government. This suspicion has been greatly increased by the manner in which his gazette has been conducted. [...] We observe also whole columns of some of the first numbers of Peter's gazette, filled with "apologies for the old government of France," that is, for the feudal system, though in a relaxed state, and for as corrupt a system of despotism as Europe ever witnessed.

William Cobbett responded:

To Mr. Noah Webster of New-York
Porcupine's Political Censor, March 1797.

Vain, fickle, blind, from these to those he flies,
And ev'ry side of wav'ring combat tries;
Large promise makes, and breaks the promise made;
Now gives the Grecians, now the Trojans aid.
Pope's Homer, Lib. V.

Some days ago I promised you an answer to your Address (or whatever else you may please to call it) of the 21st of March. It luckily matters little how this answer begins. Aware I suppose of the uncouth manners of the man you were about to assail, you kindly contrived that the rudeness of your attack should furnish an ample apology for his want of politeness.

Your Address treats of your important self, of me, and of the proposed alliance between the United States and Great Britain. This alliance is a subject of two [sic] much consequence to be blended with an enquiry into your and my character, principles, and conduct, I shall therefore reserve it for a separate letter; not losing, however, the present opportunity of declaring, that your reasoning, instead of convincing me that I was mistaken, has strengthened, as far as any thing in itself contradictory can strengthen, the opinion which gave so much offence to your wisdom.

[...]

Had the same stupid admiration of the French, that prevailed, and that you participated in, for several years; had this admiration and its concomitant partiality still existed, you would never have dared (with all your heroism) to call the hoisting of their flag "a low pitiful business:" you would prudently have left that to a writer of less caution and more sincerity, reserving to yourself the agreeable task of endeavoring to disfigure his motives and blast his fame.

And, was it then such a heinous offence to quote a writer of your stamp "with disapprobation," or apply to him the charge of vulgar prejudice? It would be curious to hear, on what it is that you ground your right of exemption from all censure and criticism. Besides, to say that a man has adopted a vulgar prejudice, is calculated to give offence to no one but an illiterate booby, who does not know the meaning of the words, or a captious, inflated self-sufficient pedant. Yet it is this phrase, and this alone, that has provoked you to seek retaliation, and retaliation, too, of the most base and malicious species. —"We contest (say you, after declaring that I am unable to cope with you) "We contest Peter's principles. It was strongly suspected many months ago, that his principles are not very friendly to the independence of America, and still less to the form of our government."

The grammatical inaccuracy of this last sentence, though fallen from the pen of a language-maker, it would be foreign to my purpose to remark on: it is the slander it conveys, that it is my duty to expose. —"It was strongly suspected." This is the true gossiping, calumniating style. All verbal assassins speak in the passive voice, that, what they cannot prove, they may at last throw on public report. If you had said, "I suspected many months ago," though it would have led to a detection, you would have acted more like a man; and this might have been expected too, in a volunteer of your "determined zeal and firmness."

However, as you are very fond of the pompous plural number and passive voice, perhaps it is but fair to suppose, that you mean to intimate, that you suspected my principles many months ago; and, if this was really the case, pray how came you to recommend my pamphlets to the perusal of your readers, as the best antedote to the anarchical principles of the enemies of the government. How many months ago was it that your penetration made the grand discovery? When I proposed publishing a paper, which was no more than about six weeks anterior to the date of your Address, you told the public in an exulting manner, that I should "prove a terrible scourge to the patriots," meaning Bache, Greenleaf, and all the antifederal crew. Six weeks, 'Squire Webster, is not many months. If you really suspected my enmity to the government, and to the independence of America, you were a very great hypocrite, if not something of a traitor, to applaud my undertaking; and, if on the other hand, you had no such suspicion, and have now feigned it merely for the purpose of revenging what your haughtiness has construed into an affront, I leave the public to determine what name you are worthy of.

These are merely brief samples: Webster's remarks occupy five printed pages, while Cobbett's "letter" takes up 26.

As a sample of Cobbett's less restrained invective, I reproduce some bits of his "last will and testament", written "in the name of fun", and published in the same 1797 issue of his periodical as his response to Webster. In addition to the sneers at Noah Webster (not a RINO, but a "centaur") and William Thornton, there is a casually racist item for Thomas Jefferson (referring to Haitian generals and to the mathematician Benjamin Banneker), and violent animosity towards Republicans Franklin Bache, James Monroe, and Tom Paine.

SINCE I took up the calling that I now follow, I have received about forty threatening letters; some talk of fisticuff, others of kicks, but far the greater part menace me with out-right murder. Several friends (whom by the bye I sincerely thank) have called to caution me against the lurking cut-throats; and it seems to be the persuasion of everyone, that my brains are to be knocked out the first time I venture home in the dark.

Under these terrific circumstances, it is impossible that Death should not stare me in the face: I have therefore got myself into as good a state of preparation as my sinful profession will, I am afraid, admit of; and as to my worldly affairs, I have settled them in the following Will, which I publish, in order that my dear friends, the Legatees, may, if they think themselves injured or neglected, have an opportunity of complaining before it be too late.

IN the name of Fun, Amen. I PETER PORCUPINE, Pamphleteer and News-Monger, being (as yet) sound both in body and in mind, do, this fifteenth day of April, in the Year of our Lord, one thousand seven hundred and ninety seven, make, declare, and publish, this my LAST WILL AND TESTAMENT, in manner, form, and substance following; to wit:

In primis, I LEAVE my body to Doctor Michael Lieb, a member of the Legislature of Pennsylvania, to be by him dissected (if he knows how to do it) in presence of the Rump of the Democratic Society. In it they will find a heart that held them in abhorrence, that never palpitated at their threats, and that, to its last beat, bade them defiance. But my chief motive for making this bequest is, that my spirit may look down with contempt on their cannibal-like triumph over a breathless corps.

Item, To T_____ J_____son, Philosopher, I leave a curious Norway Spider, with a hundred legs and nine pair of eyes; likewise the first black cut-throat general he can catch hold of, to be flead alive, in order to determine with more certainty the real cause of the dark colour of his skin: and should the said T_____ J_____son survive Banneker the Almanack-Maker; I request he will get the brains of said Philomath carefully dissected, to satisfy the world in what respects they differ from those of a white man.

Item, To the Philosophical Society of Philadelphia, I will and bequeath a correct copy of Thornton's plan for abolishing the use of the English Language, and for introducing in its stead a republican one, the representative characters of which bear a strong resemblance to pot-hooks and hangers; and for the discovery of which plan, the said society did, in the year 1793, grant to the said language maker 500 dollars premium. —It is my earnest desire, that the copy of this valuable performance, which I hereby present, may be shown to all the travelling literati, as a proof of the ingenuity of the author and of the wisdom of the society.

Item, To my dear fellow labourer Noah Webster, "gentleman-citizen," Esq. and News-man, I will and bequeath a prognosticating barometer of curious construction and great utility, by which, at a single glance, the said Noah will be able to discern the exact state that the public mind will be in the ensuing year, and will thereby be enabled to trim by degrees and not expose himself to detection, as he now does by his sudden lee-shore tacks. I likewise bequeath to the said "gentleman citizen," six Spanish milled dollars, to be expended on a new plate of his portrait at the head of his spelling-book, that which graces it at present being so ugly that it scares the children from their lessons; but this legacy is to be paid him only upon condition that he leave out the title of 'Squire, at the bottom of said picture, which is extremely odious in an American school-book, and must inevitably tend to corrupt the political principles of the republican babies that behold it. And I do most earnestly desire, exhort and conjure the said 'Squire-news-man, to change the tittle of his paper, The Minerva, for that of The Political Centaur.

Item, To Tom the Tinker, I leave a liberty cap, a tri-colored cockade, a wheel-barrow full of oysters, and a hogshead of grog: I also leave him three blank checks on the Bank of Pennsylvania, leaving to him the task of filling them up; requesting him, however, to be rather more merciful than he has shown himself heretofore.

Item, To the editors of the Boston Chronicle, the New York Argus, and the Philadelphia Merchants' Advertiser, I will and bequeath one ounce of modesty and love of truth, to be equally divided between them. I should have been more liberal in this bequest, were I not well assured, that one ounce is more than they will ever make use of.

Item, To Franklin Bache, editor of the Aurora of Philadelphia, I will and bequeath a small bundle of French assignats, which I brought with me from the country of equality. If these should be too light in value for his pressing exigencies, I desire my executors, or any one of them, to bestow on him a second part to what he has lately received in Southwark; and as a further proof of my good will and affection, I request him to accept of a gag and a brand new pair of fetters, which if he should refuse, I will and bequeath him in lieu thereof—my malediction.

Item, To citizen M___oe, I will and bequeath my chamber looking-glass. It is a plain but exceeding true mirror: in it he will see the exact likeness of a traitor, who has bartered the honour and interest of his country to a perfidious and savage enemy.

Item, To the republican Britons, who have fled from the hands of justice in their own country, and who are a scandal, a nuisance and a disgrace to this, I bequeath hunger and nakedness, scorn and reproach; and I do hereby positively enjoin on my executors to contribute five hundred dollars towards the erection of gallowses and gibbets, for the accommodation of the said imported patriots, when the legislators of this unhappy state shall have the wisdom to countenance such useful establishments.

Item, To Tom Paine, the author of Common Sense, Rights of Man, Age of Reason, and a letter to General Washington, I bequeath a strong hempen collar, as the only legacy I can think of that is worthy of him, as well as best adapted to render his death in some measure as infamous as his life: and I do hereby direct and order my Executors to send it to him by first safe conveyance, with my compliments, and request that he would make use of it without delay, that the national razor may not be disgraced by the head of such a monster.

An interesting picture of the period is available in Richard Rosenfeld's history/novel American Aurora, the first chapter of which is available on line here, and includes this passage about Cobbett:

William Cobbett, publisher of the Porcupine's Gazette, has, in one short and vitriolic year, made his radically conservative paper more popular and influential than any Federalist journal in the country except perhaps John Fenno's Gazette of the United States. William Cobbett started Porcupine's Gazette on the day John Adams took the presidential oath, and, from that day to this, "Peter Porcupine" (as Cobbett often signs his articles) has defended Mr. Adams with a knifelike quill his opponents are loath to match.

William Cobbett is a powerful man, six feet in height, of heavy build, fair-complected, gray-eyed, and possessing, as he says, "a plump and red and smiling face." Perhaps because he endured seven years as a British corporal in the backwoods of Canada, William Cobbett is dogged in his pursuits and intense in his anger. He is also British, very royalist, and possibly crazy.

Crazy! How else can one explain an Englishman--and he is an Englishman--who, eight months before starting an American newspaper, opened a bookstore opposite Philadelphia's Christ Church (the "English" church) on Second-street to sell loyalist, royalist, and Federalist propaganda, removed the building's shutters, painted its facade a bright blue, and decorated its front windows with portraits of royalty like Britain's King George III (from whom Americans won their independence) and France's King Louis XVI (whom the French Revolution had overthrown)?

In the first issue of Porcupine's Gazette, William Cobbett declared Benjamin Bache and the Philadelphia Aurora his enemies. Calling the Aurora a "vehicle of lies and sedition," the first Porcupine's Gazette opened with a letter to Benjamin Bache:

I assert that you are a liar and an infamous scoundrel ... Do you dread the effects of my paper? ... We are, to be sure, both of us news-mongers by profession, but then the articles you have for sale are very different from mine ... I tell you what, Mr. Bache, you will get nothing by me in a war of words, and so you may as well abandon the contest while you can do it with good grace ... I am getting up in the world, and you are going down. [F]or this reason it is that you hate me and that I despise you; and that you will preserve your hatred and I my contempt till fortune gives her wheel another turn or till death snatches one or the other of us from the scene. It is therefore useless, my dear Bache, to say any more about the matter...

Cobbett's NNDB page is here. Apparently he returned to England in 1800, fleeing from a libel judgment "for saying that Dr. Benjamin Rush, who was much addicted to blood-letting, killed nearly all the patients he attended". He agitated in England first as a Tory, then as a Radical. In 1802, he began publishing Parliamentary Debates, a series that he sold to Hansard in 1809. To escape his debts, in 1817 he fled back to the United States, where he wrote and published an English Grammar. He returned again to England, where he disinterred Tom Paine and tried to sell his bones as relics. He served in Parliament, was unsuccessfully prosecuted in 1830 for inciting to rebellion, and died in 1835.

[Update 11/6/2005: recent email from peterporcupine1776@yahoo.com informs me that

I am pleased to announce that the reports of my death are greatly exaggerated.
I very much enjoyed your piece on my will, and my deserved excoriation of that scoundrel, Noah Webster. I have a blog now, sir, and find that blogging is indeed the modern equivalent of my pamphlets and newspaper writings. While centered in Massachusetts rather than Philadelphia, I am still able to speak upon a wider stage as ever I did.
I invite you to visit http://capecodporcupine.blogspot.com and I hope you will enjoy my offerings there. Many are pieces about my contemporaries which may be of interest to you.

]

Posted by Mark Liberman at 09:34 AM

October 12, 2005

Kurdish to be Co-official in Iraq

Article 4 of the draft of the new Constitution of Iraq makes Arabic and Kurdish co-official languages. Both languages are to be used in legislative bodies and courts, schools, and official publications and correspondence. This is great news for Kurdish and the Kurds, whose language has never before had official status. Indeed, until very recently, the public use of Kurdish was banned in Turkey, which denied the very existence of a distinct Kurdish people, calling them "Mountain Turks". The ban was lifted only as a result of pressure from the European Union, which Turkey badly wants to join. Even so, Turkey has a hard time with anything related to the Kurds. Only last March Turkey announced that henceforth it would consider the scientific name of the Red Fox to be Vulpes vulpes rather than Vulpes vulpes kurdistanica.

In Iran under the Shah Kurdish was suppressed, as described in Margaret Kahn's book Children of the Jinn. Kurdish is now in public use to some extent, but the use of the language is banned in schools and the Kurds are still oppressed. Among other things, since most are Sunni Muslims they are not permitted to vote.

In addition, the new constitution guarantees parents who speak other languages, such as Turkmen, Armenian, and "Syriac", the right to have their children educated, in government schools, in that language. Private schools are permitted to use any language as the language of instruction. By "Syriac" they don't mean the language that linguists call Syriac, which is an extinct branch of Eastern Aramaic in which some important early texts of the New Testament are written. They mean the several Northeastern Central dialects of Aramaic presently in use in Iraq, such as Chaldean Neo-Aramaic. Of course, it remains to be seen whether the Constitution will actually be implemented. Article 15 of the Constitution of the Islamic Republic of Iran guarantees the rights of linguistic minorities but is in practice ignored.

Posted by Bill Poser at 07:52 PM

To snowclone or not to snowclone

Post about snowclones, and people send you more. Today from Pat Dreher, To X Or Not To X, specifically an article by Constance Weaver, Carol McNally, and Sharon Moerman (in the NCTE's Voices from the Middle 8.3 (2001)) entitled "To Grammar or Not to Grammar". That quickly led me to Bob Kennedy's posting (of 9/21/05) on the piloklok blog on this very snowclone, under the title "to clone or not to clone".

Kennedy had come across the sports headline "QB or not QB" and recognized it for what it was:

The to X or not to X structure is pretty clearly a snowclone, of the type frequently tracked on Language Log. This one's also pretty easy to date; it's at least as old as Shakespeare's Hamlet, but I suppose it's possible that the Bard may have lifted it from one of his contemporaries' work.

A google search of to * or not to * gets the following phrases on page 1:

to be or not to be
to Lariam or not to Lariam
to spank or not to spank
to pee or not to pee
to MBA or not to MBA
to hack or not to hack
to blog or not to blog
to breed or not to breed

And none of these actually discusses the play.

Regardless, the structure is so utterly common that its absence from the Google Meme Observatory is, IMHO, forgiveable.

Among the over sixteen million hits that came up for me today were the following, from just the first two pages:

to bundle or not to bundle
to /movelog or not to /movelog
to vaccinate or not to vaccinate
to rebuild or not to rebuild
to _root or not to _root
to outsource or not to outsource
to stopword or not to stopword
to circumcise or not to circumcise
to zap or not to zap
to shave or not to shave
to meow or not to meow

An utterly ordinary structure, indeed. Some occurrences might just be simple contrastive disjunction, though the most natural expression of this disjunction would be with ellipsis in the second disjunct: "to X or not (to)". You have to think that there's a lot of echoing of The Bard here.

The formulaic nature of some of these examples is made clear by the occurrence of an X that is straightforwardly a noun, not a verb. Dreher pointed this out about "To Grammar or Not to Grammar", and Kennedy's list has "Lariam" (an anti-malaria medication) and "MBA" on it.
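
For anyone who wants to sift a pile of hits for this formula mechanically, here is a rough sketch in Python. It assumes the candidates are already in hand as plain-text strings; the helper name `find_to_x` and the restriction of X to a single whitespace-free token are my own simplifications, not anything Kennedy or Google used. The backreference is what insists that the two Xs be identical, which separates the formula from ordinary disjunctions like "to stay or not to go":

```python
import re

# "to X or not to X", where both Xs must be the same single token.
# \S+ lets X be something like "/movelog" or "_root"; multiword Xs
# are not handled, and neither is "QB or not QB" (no "to").
TO_X = re.compile(r"\bto (\S+) or not to \1(?!\w)", re.IGNORECASE)

def find_to_x(text):
    """Return the X of each 'to X or not to X' in text (hypothetical helper)."""
    return [m.group(1) for m in TO_X.finditer(text)]
```

Note that this says nothing about whether X is a noun or a verb; sorting "Lariam" and "MBA" from "spank" and "blog" would still take a dictionary or a human.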

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:10 PM

Playing one

She's not a linguist, but she sometimes plays one in e-mail. That would be my old friend Benita Bendon Campbell, who's taken up snowclone hunting as a hobby and (on 10/6/05) turned up this one in Hendrik Hertzberg's Talk of the Town comment "Rain and Fire" in the 10/3 New Yorker (p. 39):

The film, "Last Best Chance," was a bit unusual, too. You might even say it isn't really a movie at all--it just plays one on TV.

We seem not to have commented on the Play One snowclone here on Language Log, though two of us have used it. Back on 5/7/05, in "Not to or to not", I asked for mercy via Play One:

I am not a semanticist, though I play one at Language Log Plaza, so go easy on me here.

And a year ago (on 10/15/04), in "Ceci n'est pas un Bushism", Eric Bakovic qualified his claims to expertise in a similar way:

... keep in mind, I'm not a syntactician (but I have been known to play one in the past).

A web Google search on "play one on tv" gets ca. 146,000 pages, "plays one on tv" another 27,000, "played one on tv" a further 14,200, and "playing one on tv" a final 905. Certainly a popular turn of phrase! Most of the hits are like

I'm not X, but/though I play one on TV.
I'm not X, I just play one on TV.

with a variety of subjects and a variety of verb forms in the second clause:

You're not a porn star; you just play one on TV
It's not real news, but it plays one on TV.

(Note that one as the anaphor in the second clause seems to be fixed, since it's used even when it's not appropriate for its antecedent: "real news".... "one".) Interrogative variants -- "Are you X, or do you (just) play one on TV" -- are also possible.

X is mostly the indefinite article a(n) plus a nominal denoting an identity, usually one that is either valued or scorned:

Leftist, programmer, soldier, blogger, doctor, worker, retarded person, nerd, reporter, attorney, politician, woman, porn star, hypocrite, journal, badass, real lawyer,...

though there are also occurrences of definite NPs ("the President"), bare NPs ("President"), and adjectives ("Russian", "Hawaiian"). For the adjectives, again, the anaphor one is used even though it's not strictly appropriate: "I'm not Russian, but I play one on TV."

Finally, though "on TV" (often not to be taken literally) is the usual location specified -- I assume that this was the original version of the formula, though I haven't traced the history -- all sorts of other locations are possible (on the net, on some specific newsgroup, mailing list, or blog, at work, etc.), and, as in the quote from Eric, it isn't even necessary to specify a location.
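
The core of the formula as described above can be written down as a pattern. What follows is only a sketch in Python, under assumptions of my own: the name `match_play_one` is made up, the negation has to be a free-standing "not" (so "isn't" is missed), and a comma or semicolon has to separate the two clauses (so Eric's parenthesized version is missed too). The location is deliberately left out of the pattern, since it's optional:

```python
import re

PLAY_ONE = re.compile(
    r"\bnot\s+(?P<x>[^,;.]+?)\s*[,;]\s*"   # "not X," or "not X;"
    r"(?:(?:but|though)\s+)?"              # optional connective
    r"(?:\w+\s+){0,2}?"                    # subject, optional "just"
    r"(?P<verb>play(?:s|ed|ing)?)\s+one\b",  # "one" is fixed
    re.IGNORECASE,
)

def match_play_one(text):
    """Return (X, verb form) for the first Play One match, else None."""
    m = PLAY_ONE.search(text)
    return (m.group("x"), m.group("verb")) if m else None
```

Because X is just "whatever comes between not and the punctuation", adjectives like "Russian" and mismatched antecedents like "real news" come through the same way as "a porn star".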

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 04:28 PM

Singular they and plural he/she/it

A recent National Journal article by Murray Waas exhibits, in one paragraph, a singular they and a plural he:

But the senior Justice official added that even in the absence of hard evidence of an obstruction, "a prosecutor is going to want to know why a subject of (the) investigation did not want a witness to co-operate, and why they would allow someone to linger in jail for more than eighty days, unless they had something to hide. That is going to lead many prosecutors to redouble his efforts."

Singular they, as we've repeated at tiresome length, has been sanctioned for centuries by the usage of esteemed writers, though it's deprecated by some. Plural uses of singular pronouns are rarer, and seem to be genuine mistakes, perhaps caused by hypercorrection in the face of confusion about which pronoun to use in reference to generic or universally-quantified antecedents whether singular or plural. I mentioned another example of this kind in one of the first Language Log posts:

All lockers must be emptied of its contents by August 22 at 5:00 p.m.

I don't see any easy way to search for examples of this phenomenon.

Note, by the way, that there is nothing necessarily illogical about using singular pronouns in such cases. The most obvious difference between singular they and plural he/she/it is that the first is commonly used while the second seems to be extremely rare. However, I don't know of any frequency estimates for either one.

[Steve at Language Hat points out by email that

I wonder if the sentence you quote started out as "That is going to lead many a prosecutor to redouble his efforts." I can easily imagine some copyeditor thinking "many a" was archaic or something and changing it to "many" + plural, without reflecting that this required a corresponding change in the subsequent pronoun. This kind of thing happens all the time. I find it difficult to believe the sentence was written as it stands; plural "he" is just too weird.

The same thought had occurred to me, but I thought I'd give the poor copy editors a break for a change.

Also, this is in a quote from an anonymous source, so the fault might be in Waas' original notetaking. However, one thing I've learned over the past couple of years, from correspondence with readers of this blog, is that a phrase that is "just too weird" for me is sometimes surprisingly unproblematic for others. ]

Posted by Mark Liberman at 08:56 AM

What is this Harvard?

On come the snowclones. Recently, in the pages of the New Yorker (October 10, 2005, p. 80), Malcolm "Blink" Gladwell reflected on his experience in applying for college in Canada, contrasting it with the American experience of applying to, and going to, elite colleges:

Once, I attended a wedding of a Harvard alum in his fifties, at which the best man spoke of his college days with the groom as if neither could have accomplished anything of greater importance in the intervening thirty years. By the end, I half expected him to take off his shirt and proudly display the large crimson "H" tattooed on his chest. What is this "Harvard" of which you Americans speak so reverently?

Two snowclone-attentive correspondents (Benita Bendon Campbell and Matthew Hutson) have pointed me to this Gladwell passage, harking back to my July Language Log postings on this formula, here and here. Nice to see this variant of it in the pages of the New Yorker.

I know, I know, this is just a tiny bulletin in the snowclone annals. But the fact is that I've been collecting comments on earlier snowclone postings, and nominations for new snowclones, for many, many months (don't ask how many), intending to assemble them all in one big summary posting. At this point, I'd have to carve out a whole day in my life to do it. So the above pathetically brief Gladwell reference is the best I have to offer right now. Now to work back through all the earlier stuff.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:36 AM

October 11, 2005

Naming names

A little while back, Victor Steinbok wrote me with a link to the NNDB page of nicknames granted by George W. Bush -- "Ali" for Barbara Boxer, "Camarones" for Paul Cameron, "Corndog" for John Cornyn, and so on. (You probably already know about "Turd Blossom" for Karl Rove.) The main effect of the page is to depict POTUS as an aging adolescent, and that has some entertainment value, though you shouldn't let that image cause you to misunderestimate the man.

The NNDB site is an odd assortment of biographical information about roughly 15,000 people, from all over the world, both dead and alive, focusing on celebrity (even rather modest celebrity) and notoriety, and promising links between entries. (Compare this with the roughly 100,000 entries in Who's Who in America: only in America, only living people, focusing on accomplishment, and linkless.) Stanford psychology professor Phil Zimbardo gets in, probably because of the famous "prison experiment". Joey Buttafuoco is in there, classified straightforwardly as a "criminal". "Girls Gone Wild mastermind" Joe Francis (aka Joseph Francis) also gets in, with his occupation down as "business" and a five-item rap sheet (child pornography, drug trafficking, obscenity, racketeering, and rape). A fair number of pornstars get in: at least one, Jeff Stryker, from the biggies of gay porn, and the Lindas include not only Blair, Ronstadt, Hamilton, Evangelista, Lavin, Evans, Lingle, Chavez, McCartney, Fiorentino, Gray, Tripp, Cardellini, Hunt, and Perry (notice the preponderance of actors and entertainers), but also Lovelace.

You will probably be pleased, then, to hear that not one of the Language Log bloggers is (yet) in the NNDB. Yes, I checked, starting with myself, of course. That's how I came across Phil Zimbardo, a companion in the Land of Zs.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:45 PM

Splanchnic!

After I posted about the appearance of the wonder word splanchnic in an article in Harper's, Kenny Easwaran wrote from Berkeley to say that he recalled seeing a poster in which splanchnic appeared among "about forty generally unfamiliar words, together with little one-frame cartoons 'defining' the words". His recollection was that the frame for splanchnic played on the Yiddish sound of the word.

Yesterday he wrote to say that he'd tracked it down, and provided a scanned-in copy for my delight. It's by the New Yorker's wonderful Roz Chast, and it came out in 2001 to advertise the first edition of the New Oxford Dictionary of American English (the second edition is now out, by the way). NOAD's editor, Erin McKean, will supply me with a high-quality image and approach Chast for permission to link to it here. While I'm waiting for this to happen, here's a brief description of the splanchnic frame:

Frustrated man, at wit's end, confronting a toaster that is not plugged in, cries out: "Why isn't this working? Oh, WOE IS ME!!!"

Woman in background thinks to herself: "What a splanchnic."

Note that splanchnic here is a noun, while the real splanchnic is an adjective. The Yiddish flavor comes in part from the -nic -- normally spelled -nik in Yinglish, of course.

For the record, there are 36 frames. Easwaran's memory is pretty damn good.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:53 PM

YAEVC

Kaitlin Johnson has submitted Yet Another Eskimo Vocabulary Cartoon:

She explains that "This one is my favorite (because zombies make everything funnier)".

Posted by Mark Liberman at 12:11 AM

October 10, 2005

Did which-hunting change the Laws of the Game?

Jim Gordon pointed out by email that the 2005 Laws of the Game (LOTG), as published by Fédération Internationale de Football Association (FIFA), include one provision that is incoherent in its English version (p. 40 of the .pdf):

Law 12 - Fouls and Misconduct
Decisions of the International F.A. Board
Decision 4
A tackle, which endangers the safety of an opponent, must be sanctioned as serious foul play.

As Jim observes, the commas after "tackle" and "opponent" turn an integrated or restrictive relative clause, referring to those tackles that are safety-endangering, into a supplementary or nonrestrictive relative clause, referring to all tackles. (For us Americans, this is about soccer, where tackles lack the sacramental status they have in American football, but are still a legal and even routine defensive technique.)

As evidence that what the English text says is not what FIFA means, Jim offers the versions in Spanish and in French:

Una entrada que ponga en peligro la integridad física de un adversario deberá ser sancionada como juego brusco grave.

Un tacle qui met en danger l’intégrité physique d’un adversaire doit être sanctionné comme faute grossière.

Could this be an example of what we might call "type 2 which-hunting"? In the commoner form of which-hunting, certain copy editors change which to that in restrictive relative clauses. An alternative method of enforcing this spurious "rule" would be to add commas around every which-clause, regardless of its meaning. There is some evidence that this principle has been applied unsystematically throughout the English version of the LOTG, since we have e.g.

(p. 15) In competition matches, only footballs which meet the minimum technical requirements stipulated in Law 2 are permitted for use.
(p. 33) Only procedures to determine the winner of a match, which are approved by the International F.A. Board and contained in this publication, are permitted in competition rules.

Another possibility, I suppose, is that the English versions of certain parts of the LOTG were written by a native speaker of German. FIFA renders the corresponding decision about tackling in German as:

Ein Tackling, welches die Gesundheit eines Gegners gefährdet, ist als großes Foul zu ahnden.

I wonder what the English wording of this decision has been across the various editions of this document.

[Update: Bertil van Zweeden has the textual facts:

According to a memo from the US Soccer Federation, the rule was created in 1998 in the following form:
"A tackle from behind which endangers the safety of an opponent must be sanctioned as serious foul play." Before, such tackles from behind were not explicitly mentioned in the rules (although in practice, players often got a punishment for them). Here is the memo:
http://www.kenaston.org/ussf/memo1998i.htm
Note that here is discussed if every tackle from behind should be punished.

In this '99 memo from the US Soccer Federation, the comma is introduced:
http://socref.net/docs/USSF%201999%20Memorandum.pdf
So, your hypothesis about an alternative form of which-hunting was probably correct.

In 2005, the "from behind" part was skipped:
http://socref.net/docs/USSF%202005%20Memorandum.pdf

However, I couldn't find any official FIFA memoranda on this rule.

My native Dutch is up there with English: all tackles should be punished in our football :) . Bad idea, if you ask me. The matches are already interrupted too often by the man in the black shirt.

]

[Update #2: Steve Jones suggests

You can blame it all on Microsoft Word (Geoff will no doubt be overjoyed to hear that!).
Put the offending clause, with 'which' and without the commas into Word 2000, and its grammar checker will underline it in green, and suggest you put in the commas or change 'which' to 'that'. Understandably, the grammatically illiterate choose putting in the commas as the lesser evil.

MSWord does act that way, but I doubt that this is a source of joy for Geoff Pullum, who would no doubt be happier if all grammatical advice, from whatever source, were valid. ]

Posted by Mark Liberman at 11:40 PM

October 09, 2005

Can humanists count?

A few days ago, I wrote about a strange conjunction from A(gence) F(rance) P(resse). Arnold Zwicky responded with a subtle psycho-grammatical deconstruction. Jean Véronis suggested that perhaps it was a careless translation of a French-language original (see the update for details). My reaction was to wonder whether some modern humanist scholars, like the Pirahã, are unwilling to count things.

I'm not talking about Arnold and Jean, who are empirically scrupulous and numerically expert. To follow my train of thought, we need to start with the excess of demonstratives in the French version of the AFP texts:

Les gagnants sont alors discrètement contactés avant la cérémonie pour leur laisser la possibilité de décliner cette offre. Mais en fait ils sont peu nombreux à résister à cette récompense et sont même beaucoup à venir recevoir leur prix en personne et à leurs propres frais.

The winners are discretely contacted beforehand to give them an opportunity to decline. It is a testament to the growing prestige of the event that very few turn down the offer and agree to attend at their own expense.

Now, I don't think that there's anything about French and English that forces this difference. At least, the English version could have used object-phrases like "this offer" and "this award", and French is sometimes happy enough with null complements, or with a form of le instead of a form of ce. However, all this reminded me of Jean-Claude Sergeant and his characterization of the English language, which seems to predict the opposite distribution of demonstratives:

L'anglais est une langue paradoxale. Assez rigidement structurée sur le plan syntaxique, on pourrait la qualifier de "no nonsense language", c'est-à-dire de langue où l'à-peu-près n'a pas sa place, au risque de voir cette appréciation immédiatement contredite par l'emploi surabondant de reprises anaphoriques – this, that et leurs formes plurielles – et par les multiples possibilités de raffinement lexical que lui permet le jeu des postpositions adverbiales. On approchera un peu mieux encore la vérité de la langue en rappelant que le terme "reasonable" renvoie davantage à ce qui relève du bon sens que de la raison.

English is a paradoxical language. Quite rigidly structured on the syntactic level, we could call it a "no nonsense language", that is to say a language where there is no place for more-or-less, at the risk of seeing this insight immediately contradicted by the overabundant use of anaphoric references -- this, that and their plural forms -- and by the multiple possibilities of lexical refinement permitted by the play of adverbial postpositions. We will come a bit closer to the truth of the language by recalling that the term "reasonable" refers more to common sense than to rationality.

In previous posts, I've spot-checked some of Sergeant's other claims about differences between French and English. In each case, a few minutes' research over morning coffee was enough to call his generalizations into question. It doesn't seem to be true that "we can hardly, in English, accumulate at the start of the phrase the elements of complementation as this is practiced in the French press", nor that "English will avoid all ambiguity as to the identity of agents intervening in a phrase", at least if we use web search to count instances similar to Prof. Sergeant's own examples of English/French comparison.

So the extra demonstratives in the French AFP text reminded me of a Sergeant claim that I haven't checked yet: that English is characterized by "overabundant use of anaphoric references -- this, that and their plural forms". And I thought I'd do another coffee-break experiment.

I picked three stories in each language from this morning's news about the earthquake in Pakistan. The French-language stories were from AFP, Reuters and Le Monde; the English-language stories were from AFP, Reuters and BBC Online.

The French stories turned up 16 demonstratives (8 ce, 7 cette, 1 cela) in 2251 total words, for a rate of 7.1 demonstratives per thousand words. The English stories turned up 9 demonstratives (3 that, 5 this and 1 these) in 2270 total words, for a rate of 4.0 demonstratives per thousand words. (Note that I had to check all the instances of that by hand, to distinguish the demonstratives from the complementizers. This prevented me from trying to evaluate the idea on a larger scale over breakfast.)
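
For anyone who wants to recheck the arithmetic, here is a minimal Python sketch. It uses only the counts and word totals reported above; the function name `per_thousand` is just an illustrative label, not part of any tool used for the original count.

```python
def per_thousand(count: int, total_words: int) -> float:
    """Return the number of occurrences per 1,000 words."""
    return 1000 * count / total_words

# Demonstrative counts reported above for the three stories in each language.
french = {"ce": 8, "cette": 7, "cela": 1}
english = {"that": 3, "this": 5, "these": 1}

french_rate = per_thousand(sum(french.values()), 2251)
english_rate = per_thousand(sum(english.values()), 2270)

print(f"French:  {french_rate:.2f} per thousand words")
print(f"English: {english_rate:.2f} per thousand words")
print(f"Ratio:   {french_rate / english_rate:.2f}")
```

The French rate comes out a little above 7 and the English rate just under 4, so in this small sample the French stories use demonstratives roughly 1.8 times as often as the English ones.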

I certainly don't claim that this little experiment -- based on a too-small sample of a single genre -- is definitive. However, I'll assert that it calls into question the idea that "this, that and their plural forms" are "overabundant" in English prose, at least in comparison to appropriately matched French prose.

More broadly, I'm starting to wonder: is there a subculture of humanistic scholars, embedded within modern western civilization, that has lost or rejected the idea that there is a connection between a generalization and our expectation about counts of its instances? The attitude of the Pirahã towards counting seems exotic and even improbable; but the attitude of scholars like M. le Professeur Sergeant puzzles me more.

Posted by Mark Liberman at 03:15 PM

Hangul Day

Today, October 9th, is Hangul Day (한글날), the celebration of the promulgation of the Korean alphabet Hangul 한글 by King Sejong the Great in 1446.

King Sejong (세종대왕 in Hangul, 世宗大王 in Chinese characters), the fourth king of the Choson dynasty, was born May 6, 1397 and ascended the throne in 1418 at the age of 21. He died May 18, 1450 aged only 53. He is known for his work improving Korea's defenses against Japanese pirates and invaders from Manchuria, for his patronage of scholarship (among other things, he founded the 집현전 集賢殿 (Jiphyeonjeon) or "Hall of Worthies", a kind of royal academy) and for his own scholarly work. He is credited with the invention of a rain gauge, a water clock, and a sundial. His literary works include the highly regarded Yongbi Eocheon Ga "Songs of Flying Dragons", Worin Cheon-gang Jigok "Songs of the Moon Shining on a Thousand Rivers", and Seokbo Sangjeol "Episodes from the Life of the Buddha". He also compiled the Dongguk Jeong-un "Dictionary of Proper Sino-Korean Pronunciation". Most of all, he is known as the creator and promulgator of Hangul.

Prior to 1446, the Korean language was rarely written at all. The written language used in Korea was Classical Chinese. The combination of the use of a foreign language with the large amount of memorization required to learn thousands of Chinese characters meant that only a small elite were literate, overwhelmingly men from aristocratic families. The great majority of people were illiterate. On the relatively rare occasions when Korean was written, it was written using Chinese characters, in part for their sound, in part for their meaning. This too was a complex system poorly suited for mass literacy. Hangul was the first writing system to make it easy for any Korean to read and write his or her native language.

I won't give a detailed exposition of Hangul because it would take a good bit of time and space and there are a number of good ones available. The Wikipedia article is quite good and contains additional links.

Hangul is an alphabet. Some letters represent consonants, such as ᄀ /g/, ᄂ /n/, ᄃ /d/, and ᄆ /m/. Others represent vowels, such as: ᅡ /a/ ᅦ /e/ ᅵ /i/ ᅩ /o/ ᅮ /u/. They are normally written in squarish groups that vaguely resemble Chinese characters. For example: 기 /gi/, 김 /gim/, 미 /mi/ and 민 /min/. If there is no initial consonant, a special "null consonant" letter is used; in normal writing, the isolated vowels exemplified above are therefore actually written: 아 에 이 오 우.

It is for this reason that Hangul is sometimes wrongly considered a syllabary. It isn't, since it is completely analyzable at the segmental level. In fact, the groups into which the letters are formed do not correspond exactly to syllables. The rule since 1931 has been that consonants that belong to the same morpheme must remain within the same block. Thus, "chicken" is underlyingly /dalg/, written 닭. In isolation, the /l/ is not pronounced, but both post-nuclear consonants are written. When the nominative case suffix /i/ is added, the /g/ resyllabifies into its onset and the word is pronounced /dal.gi/, so we might expect to write *달기. In fact, the correct spelling is 닭이.
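
One way to see that the blocks are analyzable into letters is through Unicode, whose canonical decomposition (NFD) splits each precomposed Hangul block into its component letters. What follows is a minimal Python sketch using only the standard `unicodedata` module. The helper name `letters` is an illustrative label, and Unicode's encoding is of course a modern convention, not part of the original design.

```python
import unicodedata

def letters(text: str) -> list[str]:
    """Split Hangul syllable blocks into their component letters (jamo)."""
    return list(unicodedata.normalize("NFD", text))

for word in ["기", "김", "민", "닭이"]:
    parts = letters(word)
    # Print each block sequence with the Unicode names of its letters.
    print(word, [unicodedata.name(ch) for ch in parts])

# Recomposing the letters (NFC) gives back the original blocks.
original = unicodedata.normalize("NFC", "닭이")
recomposed = unicodedata.normalize("NFC", "".join(letters("닭이")))
assert recomposed == original
```

Two details are worth noticing in the output. The null consonant of 이 shows up as a letter in its own right, and Unicode happens to encode the final cluster of 닭 as a single code point, so 닭이 decomposes into five code points in all.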

Hangul is considered a great achievement for several reasons. First and foremost, it is a perfect alphabet. It distinguishes all of the distinct sounds in Korean and makes no subphonemic distinctions. From the point of view of the reader, there are no ambiguities. From the point of view of the writer, there are a few ambiguities in that in certain environments syllable-final nasals may be written either as nasals or as the plain stops of the same point of articulation. This is not an error but reflects a decision to write at a higher level of abstraction than a classical phonemic representation. It makes things slightly harder for the writer but makes things easier for the reader, who is given more direct access to lexical representations.

Other reasons that Hangul is considered such a great achievement are found in details of its design. Hangul exhibits a significant degree of featural decomposition. That is, components of the letters correspond to phonological features. For example, the aspirated series of stops and affricates are written by adding a horizontal stroke to the letters for the plain series. Thus we have: 가 /ga/ vs. 카 /ka/, 다 /da/ vs. 타 /ta/, 바 /ba/ vs. 파 /pa/, and 자 /ja/ vs. 차 /ca/.

The shapes of the letters are based on the configuration of the articulators in making the corresponding sounds. This was truer in the original version of Hangul than it is now. The fairly modest changes over the past few centuries have in some cases slightly obscured this relationship.

Hangul is well designed for the reader, in that the letters are easily distinguished, and for the writer, in that the letters are simple. Even I can write legibly in Hangul.

Finally, Hangul is considered a great achievement because the perfection of its design and the care and insight with which it was designed are considered unique and to have had no precedent. Although Hangul certainly represents a very high point in the design of writing systems, it isn't true that its designers had no precedent to work with. They were familiar with Chinese phonological theory and to some degree at least with the writing systems of the Tibetans, the Mongols, the Japanese, and the Jürchen. We know that they were acquainted with alphabetic writing in the form of the Devanagari alphabet in which the Sanskrit of the Buddhist scriptures was written and in the form of the Phags-Pa alphabet. The idea that Hangul sprang from nothing is a bit of an exaggeration.

I submit that we should honor King Sejong as much for his reasons for creating a new writing system as for its technical beauty. Sejong was very explicit about why he considered it important to introduce Hangul. His goal was to make literacy accessible to everyone. This can be seen from the opening paragraph of the 訓民正音 (훈민정음), the document in which King Sejong promulgated the new writing system. Here is the original document, in Classical Chinese:

and here is a translation:

The sounds of our country's language are different from those of China and do not correspond to the sounds of Chinese characters. Therefore, among the stupid people, there have been many who, having something to put into writing, have in the end been unable to express their feelings. I have been distressed by this and have designed twenty-eight new letters, which I wish to have everyone practice at their ease and make convenient for their daily use.

Nowadays that isn't such a striking goal, but in his world it was remarkable. In 15th century Korea, as almost everywhere else in the world, literacy was restricted to a small elite; most people were illiterate. Furthermore, Korean society was extremely hierarchical. It consisted of three tiers: nobles, commoners, and slaves. It was almost impossible for a slave to become free, or for a commoner to become a noble. Until 1444, when King Sejong forbade the practice, a slave's owner had the right to kill him at whim.

The dominant ideology was Confucianism, a philosophy based on the relationships between ruler and subject, parent and child, older and younger, man and woman, and friend and friend, the first four of which are conceived as inherently unequal. Women could not inherit property. In short, 15th century Korea was a highly stratified society rigidly controlled by a small elite in which those who were not elite and not male had few rights.

Indeed, there was strong opposition to the introduction of Hangul on the part of King Sejong's court, so strong that they presented a memorial in opposition and debated with him verbally. The reasons they gave were in part that it was wrong to deviate from the Chinese way of doing things, and in part that such a simple writing system would lead to the loss of aristocratic privilege. Their motives may have been wrong, but they understood the effects of mass literacy all too well. After King Sejong's death, Hangul was very nearly suppressed. It took much longer to come into wide use than he had intended due to the opposition of the aristocracy.

For the king himself in such a society to create the means for mass literacy, knowing full well its liberating effect, is absolutely stunning. King Sejong was not merely a great scholar; he was a great humanitarian.

Hangul Day is celebrated on October 9th in South Korea and (under the name 조선글날 Choseongeul Day, using a different word for "Korea" in the compound "Korean writing") on January 15th in North Korea. From 1960 until 1991 it was a legal holiday in South Korea. Along with United Nations Day, it was reduced to a "commemorative day" due to pressure from large corporations, which wanted to increase the number of work days and saw Hangul Day as dispensable.

If you like to celebrate with a drink, I suggest some 막걸리 makkeolli, a kind of rice wine also known as 농주 nong-ju "farmer's wine". It is triply appropriate. It is particular to Korea, it is the drink of ordinary rural people, and its name is similar to that of the late Jim McCawley, one of the few non-Koreans to celebrate Hangul Day.

[Addendum 2005-10-10: A reader inquires about the difference in the dates between North and South Korea. In South Korea the date chosen is intended to approximate the day on which the 訓民正音 (훈민정음) was proclaimed. Internal evidence shows that it was proclaimed in the first ten-day period (上旬) of the ninth month of the old lunar calendar. October 9th corresponds to the tenth day of the ninth month. The date used in North Korea is believed to be the day on which the document was actually created. The fact that October 9th is the celebration of the anniversary of the Workers' Party may or may not be relevant. It is certainly plausible that a different date was found for Hangul Day so as not to conflict with Workers' Party Day, but I don't know which date was set first. ]

Posted by Bill Poser at 03:25 AM

Ape Brain Simulators Run Rampant

I couldn't resist mentioning this plug for the program Noble Ape Simulation that I just came across.

This remains the best landscape-orientated ape-brain simulator for Mac OS X.

The curious presupposition here is that there are a bunch of competing landscape-oriented ape-brain simulators for Mac OS X with which Noble Ape Simulation is holding its own. Without in any way meaning to suggest that Noble Ape Simulation is not a fine program, I am skeptical about the breadth of the field of competition here.

[Addendum 2005-10-12: I thought that everybody would realize that the review quoted on the Noble Ape site was probably tongue-in-cheek, but I am told that some people have not. Anyhow, it probably was. The NobleApe folks thought so. Incidentally, Noble Ape Simulation runs under Linux and MS Windows as well as Mac OS X, so non-Mac folks can give it a try.]

Posted by Bill Poser at 01:11 AM

Another theory

Other snow-words cartoons here, here and here. The truth is discussed here, here and here.

[via Michael at Translate This!]

Posted by Mark Liberman at 12:21 AM

October 08, 2005

Denotation switching

Mark Liberman puzzles over this passage from a story on the Ig Nobels:

The winners are discretely contacted beforehand to give them an opportunity to decline. It is a testament to the growing prestige of the event that very few turn down the offer and agree to attend at their own expense.

One way of characterizing what the problem is here is that turning down the offer is predicated of some small number of winners, denoted by "very few" -- call this small class of downturners T -- but the writer's intention is to convey that agreeing to attend at their own expense is predicated of the COMPLEMENT OF T, not T: it's the people who don't turn down the offer who agree to attend at their own expense. You just can't do this by conjoining the VPs "turn down the offer" and "agree to attend at their own expense" with the shared subject NP "very few", because "very few" must be understood as simultaneously denoting T and ~T -- a straightforward violation of what I called, in an earlier posting on failures of parallelism in reduced coordination, the Factor Constancy condition: in factorable coordination, the factor must have the same semantics in combination with each of the conjuncts.
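
The clash can be made concrete with a toy model in Python. This is only a sketch: the ten winners and the single decliner are invented for illustration, and set intersection is my own shorthand for the requirement that the shared subject denote one and the same class for both conjuncts.

```python
# Hypothetical winners; the names and the split are invented for illustration.
winners = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"}

T = {"A"}            # "very few": the winners who turn down the offer
not_T = winners - T  # the complement of T: the winners who accept

# The writer's intended reading predicates "agree to attend" of ~T, not of T.
agree_to_attend = not_T

# Factor Constancy requires the shared subject to pick out a single class
# that satisfies both conjoined VPs, that is, members of both T and ~T.
shared_subject = T & agree_to_attend
print(shared_subject)
```

Since T and its complement never overlap, no class of winners can serve as the subject of both VPs at once, which is why the coordination fails.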

That's not quite the end of things, however. In similar examples involving not coordination but anaphora (zero or overt), it's much easier to get away with this sort of denotation switching. Here's an instance I brought up on the American Dialect Society mailing list back in May, from the Palo Alto Daily News ("City OKs university land deal" by Jason Green) of 5/4/05, p. 74:

Although touted by university officials and city staff as a historic deal, not everyone was in favor of the agreement, including council member Yoriko Kishimoto. Of the six council members eligible to vote on the Mayfield agreement, Kishimoto cast the sole "no" vote.

In the first sentence from the PADN, the inclusion of council member Yoriko Kishimoto is predicated of some group not overtly mentioned in the phrase "including council member Yoriko Kishimoto". This is zero anaphora. So we search for an appropriate discourse referent for the anaphor, if possible one recently mentioned in the discourse. Well, the subject of the main clause of that first sentence is "not everyone", which predicates less-than-universality (within the universe of relevant people) of the class of those in favor of the agreement, and so indirectly introduces this class -- call it F -- as a discourse referent. Could F be the discourse referent that the zero anaphor picks up? No: the writer's intention is clearly to convey that Kishimoto is included in ~F (the class of people opposed to the agreement), not in F. Still, the sentence isn't so bad (though it was troublesome enough for me that I reflected on it). ADS-L posters ranged from those who found it plainly ungrammatical to those who had little problem with it.

At this point, as so often happens on ADS-L, Larry Horn stepped in to tell us that there's actually some literature about this phenomenon:

If anyone is really interested in this from a theoretical/empirical direction, there have been a number of publications by Linda Moxey and Tony Sanford, including their book Communicating Quantities (1993), on what they call "Comp-set" as opposed to "Ref-set" reference, i.e. cases in which the reference is to the complement of the set specified ("Few of the students passed, because they [= the ones who didn't] hadn't taken the test seriously" vs. "Many of the students passed, because they [=/= the ones who didn't]...).

Horn's example of Comp-set reference involves an overt anaphor, "they", but the Kishimoto sentence shows that the same thing is possible with zero anaphors, at least for some people.

Still another zero-anaphor example, again from PADN (Health column, "Fall of Atkins diet traced", 9/15/05, p. 45):

Although studies showed the diet helped followers lose weight -- and quickly -- the Atkins dropout rate was high. Few managed to stay on the diet for an entire year, complaining that, eventually, even an unlimited amount of steak and eggs can become boring. Others suffered unpleasant side effects...

Here, the complainers are (presumably) the many who didn't manage to stay on the diet, not the few who did. The same is true of a version with an overt anaphor:

Few managed to stay on the diet for an entire year; they complained that, eventually, even an unlimited amount of steak and eggs can become boring.

To sum up: Comp-set reference is possible, for at least some speakers in some contexts, for both overt and zero anaphors, but not (apparently) in (reduced) coordination. It's not customary to think of coordination as being in any way like anaphora, so this difference between the two domains is scarcely a surprise. But you COULD think of reduced coordination as being akin to zero anaphora, with the factor constituent overt with one conjunct but zero with the other, so that "saw Kim and Sandy" would have a structure like:

[saw Kim] [and ___ Sandy]

(where the underline marks the position of an omitted V, which is interpreted as picking up the reference of the preceding V "saw"). In case you're tempted by this proposal, the unavailability of Comp-set reference in reduced coordination should make you think twice.

But I'm not done yet. Ok, maybe reduced coordination isn't really much like zero anaphora, but it might be like another class of omitted-constituent constructions, namely those involving "functional control" (rather than anaphoric control), as in "Kim wants to leave", with a structure like:

[Kim] [wants [to ___ leave] ]

Just such a proposal was made to me, very tentatively, by Paul Postal in e-mail following my Language Log posting of 6/10/05, on "WTF coordinate questions" like:

Are you like most Americans, and don't always eat as you should?
Have you written a thesis, but have no idea what to do next?

There's a lot to work out about this idea, and I'm still playing with it, though I have to say that my thinking about these WTF coordinate questions has led me away from anything resembling functional control and towards still another class of omitted-constituent constructions, namely Initially Reduced Questions like:

Have no idea what to do next? 'Do you have no idea about what to do next?'
Anxious about your exams? 'Are you anxious about your exams?'

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:00 PM

Grammatical indoctrination at law reviews

One of the emails in response to my post on Coulter the Grammarian was from John W. Brewer, who suggested that we look for the roots of Coulter's feelings about that v. which in her experience on law review at the University of Michigan law school.

... a key part of law review culture is a hyperlegalistic concern with details of style and usage, and an almost pathological fear of exercising discretionary judgment among plausible alternatives. For any style/usage issue, the notion is that there must be a rule, and you can look the rule up in an authoritative source, and once you've done that you should follow the rule strictly, both in your own writing and especially in seizing opportunities to make petty corrections to the writing of others. The so-called "Blue Book" provides most of the obsessive-compulsive detail on matters of abbreviation and the like (should the U.S. Court of Appeals for the Second Circuit be abbreviated "(2d Cir.)" or "(2nd Cir.)"? For God's sake, don't guess! Look it up!). But it doesn't deal directly with many issues of prose style where people like this intuitively sense that There Must Be a Rule. For that, many law review types use as their authoritative source the Texas Law Review Manual of Usage and Style (MoUS), which I dimly recall from unhappy encounters with it circa 1990 as having a particularly obsessive and wrongheaded view of the that/which issue ...

In the absence of an anthropological study of law review culture, the legally-naive reader might take a look at Judge Richard Posner's "Against the Law Reviews" ("Welcome to a world where inexperienced editors make articles about the wrong topics worse").

With a few exceptions, law reviews are edited by law students rather than by professors or other professionals. The law reviews are numerous, are published bimonthly or at more frequent intervals, are edited without peer review, and are seemingly unconstrained in length. Their staffs are large, but the members, being students, are inexperienced both in law and in editing. With such abundant manpower and no reliance on peer review, law reviews do not forbid simultaneous submission or insist on brevity, and the interval between initial submission and final publication is much shorter than in other scholarly fields. The size of law review staffs enables them not only to check the author's citations but also to make many substantive comments and to engage in line-by-line copyediting.

Language Log readers will not be surprised to learn that the student copyeditors, in addition to imposing MoUS, sometimes favor idiosyncratic linguistic theories of their own invention. Judge Posner cites a striking example:

According to an article written by James Lindgren at Northwestern Law School in the Chicago Law Review, one law review editor "thought that many uses of the word 'the' in an article were errors. Following this bizarre rule of thumb, he took as many 'thes' out of manuscripts as he could, thus reducing many sentences to a kind of pidgin."

I suppose this might be a particularly obtuse application of the Omit Needless Words rule.

Commenting on Posner's article, Micah at Crooked Timber wrote that:

An average top-tier law review has a staff of about 80 students. Instead of engaging in pro bono work or their own research, those students spend—and this is a very conservative estimate—7000 hours per year editing the work of law professors. Now multiply that across the dozens, if not hundreds, of law journals out there.

If there are 143 law reviews operating at the cited scale -- a plausible particularization of "dozens, if not hundreds" -- that's a million hours a year of law-review editing. This is an extraordinary experiment in mass indoctrination. As John Brewer wrote,

In my experience, the law review experience may intensify preexisting tendencies toward bogus prescriptivism among well-educated young people, with longlasting negative effects. Liberal, conservative, and moderate students seem equally at risk. It's particularly sad if someone like Ann, who was otherwise able to be fiercely critical about the legitimacy of the overwhelming majority of what her conventionally left-of-center professors and fellow students in law school presented to her, buys into it. You (or another Language Log contributor?) have made the point before that the grammar of a natural language is helpfully thought of as a Hayekian spontaneous order. Since Ann is right-wing enough both to know what that means and to believe that such spontaneous orders are generally preferable to the bureaucratic diktats of graduates of elite law schools, the oft-lamented-on-LL failure of communication between the specialized linguistics culture and the general educated public is particularly regrettable in her case.

It was Glen Whitman who first suggested that grammar is a Hayekian spontaneous order, in discussing Geoff Pullum's "third position" between the "two extremes: on the left, that all honest efforts at uttering sentences are ipso facto correct; and on the right, that rules of grammar have an authority that derives from something independent of what any users of the language actually do." I just quoted and linked to Glen's post.

[While we're here, it occurred to me to check Friedrich Hayek's stance on which v. that. I looked in three of his on-line works, checking them in chronological order.

In "Economics and Knowledge" (a presidential address to the London Economic Club, 10 November 1936, first published in Economica, February 1937), I found that the first integrated-relative which occurs in the second sentence:

Its main subject is of course the role which assumptions and propositions about the knowledge possessed by the different members of society play in economic analysis.

In "The Use of Knowledge in Society", American Economic Review, XXXV, No. 4; September, 1945, pp. 519-30, we have to wait until the third sentence:

If we possess all the relevant information, if we can start out from a given system of preferences, and if we command complete knowledge of available means, the problem which remains is purely one of logic.

And in the Foreword to Economics as a Coordination Problem: (1977), Hayek's use of integrated which is delayed all the way to the fourth sentence:

It is a curious fact that a student of complex phenomena may long himself remain unaware of how his views of different problems hang together and perhaps never fully succeed in clearly stating the guiding ideas which led him in the treatment of particulars.

If Ann Coulter were here, I'd ask her whether Hayek so often committed "the sort of error that results from trying to sound 'Ivy League' rather than being clear", and whether his essays and speeches "would be a lot more convincing without all the grammatical errors"?]

Posted by Mark Liberman at 11:08 AM

Many are discretely called, but few decline and agree?

I almost missed this one, since my normal news outlets don't include AFP anymore, but Richard Hershberger sent the link along. The context is a 10/5/2005 story entitled "Ig Nobels to honour strange science".

The winners are discretely contacted beforehand to give them an opportunity to decline. It is a testament to the growing prestige of the event that very few turn down the offer and agree to attend at their own expense.

Don't be distracted by the simple spelling problem in the first sentence, where the writer probably meant that the winners are contacted discreetly, i.e. in a circumspect and private way, rather than discretely, i.e. separately rather than as an undifferentiated mass. The real fun is in the second sentence, which literally means something much further from what the writer must have intended.

I wonder whether this might have been the result of editing in a hurry. After all, these two phrases are pretty much synonymous in isolation:

a. Nearly all accept the offer.
b. Very few turn down the offer.

but the context of conjoined verb phrases changes things:

a. Nearly all accept the offer and agree to attend at their own expense.
b. Very few turn down the offer and agree to attend at their own expense.

Luckily for the writer, (s)he is anonymous. I guess this is also lucky for the editor, come to think of it.

[Update: Jean Véronis suggests another explanation: perhaps the errors result from hasty translation. My understanding is that multi-lingual news agencies like A(gence) F(rance) P(resse) produce most of their stories independently in their different language bureaus, rather than translating stories from one language to another -- so that these feeds are not a good source of genuinely parallel text for statistical machine-translation training -- but some pieces are sometimes translated, more or less loosely. In this case, Jean found a French-language AFP story with similar structure:

Les gagnants sont alors discrètement contactés avant la cérémonie pour leur laisser la possibilité de décliner cette offre. Mais en fait ils sont peu nombreux à résister à cette récompense et sont même beaucoup à venir recevoir leur prix en personne et à leurs propres frais.

This easily explains the spelling mistake with "discretely". The conjunction mistake in the next sentence is a little harder, since the French original (if the story indeed was originally in French rather than the other way around) shifts the quantified agent from the "few in number" who refuse the award to the "many" who come at their own expense. However, the French sentence uses a construction that is at best awkward in English: "in fact they are few in number to refuse this award, and indeed many to come and receive their prize in person and at their own expense." So perhaps the mistake resulted from trying to get the quantifiers into the subject position where English wants them to be.]

Posted by Mark Liberman at 07:50 AM

October 07, 2005

Ann Coulter, Grammarian

Suddenly everyone is a linguist. The latest to join the parade is Ann Coulter. In her 10/5/2005 column, "This is what 'Advice and Consent' Means", she offers the most creative socio-syntactic theory that I've encountered since Derek Bickerton suggested that our hominid ancestors developed language in order to scavenge dead elephants.

But you can't understand her hypothesis without some background, which I provide below the fold.

To start with, Ms. Coulter is not at all happy about the nomination of Harriet Miers for associate justice of the U.S. Supreme Court. She mentions President Bush's "Scottish Terrier Barney" as an alternative nominee, and complains that

Harriet Miers went to Southern Methodist University Law School, which is not ranked at all by the serious law school reports and ranked No. 52 by US News and World Report. Her greatest legal accomplishment is being the first woman commissioner of the Texas Lottery.

She explains that academic elites are normally to be shunned, quoting "William F. Buckley's line about preferring to be governed by the first 200 names in the Boston telephone book than by the Harvard faculty", but then she argues that "the Supreme Court is not supposed to govern us ... Being a Supreme Court justice ... [is] a real job."

Coulter further explains that

...if we were looking for philosopher-kings, an SMU law grad would probably be preferable to a graduate from an elite law school. But if we're looking for lawyers with giant brains to memorize obscure legal cases and to compose clearly reasoned opinions about ERISA pre-emption, the doctrine of equivalents in patent law, limitation of liability in admiralty, and supplemental jurisdiction under Section 1367 — I think we want the nerd from an elite law school. Bush may as well appoint his chauffeur head of NASA as put Miers on the Supreme Court.

OK, I think I've got it: we need to distinguish between governing and philosophizing, which we should assign to the graduates of lower-ranking schools; and "real jobs", which we should reserve for the elite. But some conservatives have not grasped this distinction:

One Web site defending Bush's choice of a graduate from an undistinguished law school complains that Miers' critics "are playing the Democrats' game," claiming that the "GOP is not the party which idolizes Ivy League acceptability as the criterion of intellectual and mental fitness."

I don't really see this as a partisan issue -- it seems to me that by the time someone gets to be 60 years old, you'd want to stop evaluating them by the rank of their alma mater, and instead evaluate their alma mater in terms of the quality of their accomplishments. Of course, I say this without having fully digested Coulter's distinction between governing or philosophizing and "real jobs"...

Anyhow, there's a fatal linguistic flaw in the presentation of the anti-elitist argument, according to Coulter:

(In the sort of error that results from trying to sound "Ivy League" rather than being clear, that sentence uses the grammatically incorrect "which" instead of "that." Web sites defending the academically mediocre would be a lot more convincing without all the grammatical errors.)

She's talking about the phrase "the GOP is not the party which idolizes Ivy League acceptability...", and what she's saying about it, I think, is:

(1) It's grammatically incorrect to use which in integrated ("restrictive") relative clauses.
(2) Misuse of which in such cases is a hypercorrection, caused by trying to sound "Ivy League".
(3) Committing such a hypercorrection in an anti-elitist discourse subverts the argument, by suggesting that the writer is a faux elitist-wannabe, neither securely elite nor secure in his non-elite identity.

This might be a subtle but devastating riposte -- if points (1) and (2) were true. Alas for Coulter, they're not.

Every so often, we have a little flurry of which vs. that posts here on Language Log. These are a representative sample:

Five more thoughts on the that rule (Arnold Zwicky, 5/22/2005)
What I currently know about which and that (Arnold Zwicky, 5/10/2005)
The people from the CCGW are here to see you (Arnold Zwicky, 5/7/2005)
Don't do this at home, kiddies! (Arnold Zwicky, 5/3/2005)
Which vs. that: integration gradation (Mark Liberman, 9/23/2004)
Which vs that: a test of faith (Mark Liberman, 9/20/2004)
Which vs. that: I have numbers (Geoff Pullum, 9/19/2004)
Sidney Goldberg on NYT grammar: zero for three (Geoff Pullum, 9/17/2004)

If you don't already know it, you'll learn from those posts that the prohibition against using which to introduce integrated relative clauses is a made-up "rule", unsanctioned by the usage of good writers in any era. Still, I think that there's a germ of sociolinguistic truth in Coulter's theory -- "which hunting" is a favorite sport of down-market American copy editors, so that the rate of which in integrated relatives is lower in American journalism than it is in British journalism. As a result, integrated which may indeed have an elitist flavor for those American readers who have noticed the difference.

Here's a table that Geoff Pullum quoted from work by Biber et al., making the point quantitatively:

	AmE news	BrE news
integrated relatives with which	800	2600
integrated relatives with that	3400	2200
supplementary relatives with which	1400	1400
supplementary relatives with that	0	0
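
To put those counts on a common scale, here is a minimal Python sketch that works out what share of integrated relatives is introduced by which in each variety. The counts are copied from the table above; the dictionary layout and the function name which_share are my own, chosen purely for illustration.

```python
# Counts copied from the table above (integrated relatives only).
counts = {
    "AmE news": {"which": 800, "that": 3400},
    "BrE news": {"which": 2600, "that": 2200},
}

def which_share(row):
    """Fraction of integrated relatives that are introduced by 'which'."""
    return row["which"] / (row["which"] + row["that"])

for variety, row in counts.items():
    print(f"{variety}: {which_share(row):.0%} of integrated relatives use 'which'")
```

On these figures, which introduces about 19% of integrated relatives in American news writing and about 54% in British news writing.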

But you shouldn't think that integrated-relative which in America is the exclusive preserve of Ivy League limousine liberals and their social-climbing acolytes. The great Conservative Communicator himself, Ronald Reagan, was fond of this construction.

His speech "To Restore America", delivered 3/31/1976, contains four examples:

Tonight, I’d like to talk to you about issues. Issues which I think are involved -- or should be involved in this primary election season.

It very quietly passed legislation (which the president signed into law) which automatically now gives a pay increase to every Congressman every time the cost of living goes up.

The truth is, Washington has taken over functions that don’t truly belong to it. In almost every case it has been a failure. Now, understand, I’m speaking of those programs which logically should be administered at state and local levels.

But there is one problem which must be solved or everything else is meaningless.

A few other examples selected at random from various other Reagan speeches, most of which contain several examples:

A freeze at current levels of weapons would remove any incentive for the Soviets to negotiate seriously in Geneva and virtually end our chances to achieve the major arms reductions which we have proposed.

Voices have been raised trying to rekindle in our country all of the great ideas and principles which set this nation apart from all the others that preceded it, but louder and more strident voices utter easily sold cliches.

The decayed and degraded state of moral and patriotic feeling which thinks nothing is worth a war is worse. The man who has nothing which he cares about more than his personal safety is a miserable creature and has no chance of being free unless made and kept so by the exertions of better men than himself.

Reagan also frequently quoted others using which in the way that Coulter thinks is ungrammatical:

And it was George Washington who said that “of all the dispositions and habits which lead to political prosperity, religion and morality are indispensable supports.”

Alexander Hamilton said, "A nation which can prefer disgrace to danger is prepared for a master, and deserves one."

Lord Acton of England, who once said, “Power corrupts, and absolute power corrupts absolutely,” would say of that document, “They had solved with astonishing ease and unduplicated success two problems which had heretofore baffled the capacity of the most enlightened nations. They had contrived a system of federal government which prodigiously increased national power and yet respected local liberties and authorities, and they had founded it on a principle of equality without surrendering the securities of property or freedom.”

Does Coulter think that Reagan, Washington, Hamilton and Acton routinely committed "the sort of error that results from trying to sound 'Ivy League' rather than being clear", and that their essays and speeches "would be a lot more convincing without all the grammatical errors"?

I can't resist quoting Coulter herself, though I'm not suggesting that her usage is definitive:

O'Connor took sadistic glee in refusing to overturn Roe v. Wade in the face of the unending strife it has caused the nation. (And it hasn't been easy on 30 million aborted babies either.)
She co-authored the opinion in Planned Parenthood v. Casey which upheld Roe v. Wade, gloating: "(T)o overrule under fire in the absence of the most compelling reason ... would subvert the Court's legitimacy beyond any serious question."

[Jokes aside, Derek Bickerton's idea (about human language as a consequence of counteractive niche construction for scavenging megafauna) is a serious scientific hypothesis, carefully thought through and supported by an elaborate body of evidence. Ann Coulter's idea (about conservative essayists subverting their own argument by misusing which) is a parenthetical cheap shot thrown into a bizarre proposal for a division of labor between graduates of lower-ranked schools in the legislative and executive branches, and graduates of top-ranked schools in the judiciary. Or something.

Sorry, Derek. But you can see why I couldn't resist, can't you?]

Posted by Mark Liberman at 12:01 AM

October 06, 2005

A wink or a nod?

I've been casually following some of the reporting on -- for lack of a better way to put it -- the whole Intelligent Design v. Evolution debate. Casually, because I simply can't be impartial about most of the issues involved in this debate and I'd rather not sit around with my blood boiling. But something I've come to appreciate is the sheer plurality of views on the matter, and how it's not as simple as one might think to know in advance what someone thinks about it; for example, there are biologists who are firm believers in some form of intelligent design, and there are theologists who accept the scientific evidence for evolutionary theory. (See, I told you I can't be impartial.)

But really, this is relevant to language. Or at least, to Language Log. Read on.

Today there was this article in the NYT about the supposed geological evidence for Intelligent Design. The article begins with "Tom Vail, who has been leading rafting trips down the Colorado River here [at Grand Canyon National Park] for 23 years", who is quoted as saying:

"You see any cracks in that?" he asked. "Instead of bending like that, it should have cracked." The material "had to be soft" to bend, Mr. Vail said, imagining its formation in the flood. When somebody suggested that pressure over time could create plasticity in the rocks, Mr. Vail said, "That's just a theory."

But the immediately following quotation is what struck me:

"It's all theory, right?" asked Jack Aiken, 63, an Assemblies of God minister in Alaska who has a master's degree in geology. "Except what's in the Good Book."

The problem is, there's no further reference to Mr. Aiken anywhere in the article. None. And I'm completely stumped as to what his position is on the matter, and therefore what the relevance of the quote is. On the one hand, he's a minister. On the other, he has a degree in geology. And as I noted above, a minister may accept evidence for the relevant scientific theory (in this case, the "Old Earth" theory of loooooong-term erosion) and a geologist may firmly believe in the intelligent design alternative (the "Young Earth" theory of God-did-it-fairly-recently). Even if Mr. Aiken was quoted directly and completely, his intonation could very well have been dismissive (imagine a rolling of the eyes, or a wink and a smile, as he says what's quoted above), or it could have been knowing (imagine a reassuring nod). Which is it, and why is Mr. Aiken being brought in at random to say it?

Posted by Eric Bakovic at 11:21 PM

Dear Mr. Pogue

Many of us can only wish we had David Pogue's job. Companies that make cool new gadgets send him said gadgets, he plays around with them, and then he discusses them in the New York Times and other venues. I'm sure that in many cases he gets to keep the gadgets -- and please don't correct me if I'm wrong about this; illusions are often all that I have. In any event, not a bad gig if you can swing it.

Mr. Pogue also maintains a blog (well, sort of -- it looks like any other page of the NYT online, but it's a blog): Pogue's Posts. Today, he posted this short entry, where he asks about the misuse of it's for its:

Are the people who can't figure this out unique to computers? Or is it that NOBODY knows when to use the apostrophe, and it only SEEMS like it's computer people because they're the ones whose stuff I see all the time?

I suddenly see my opportunity to speak directly to David Pogue, and to help him by answering his question! But what do I say? Do I recommend Lynne Truss, or do I refer him to Mark's two posts that comment on Menand's scathing New Yorker review thereof?

In the end, I decided to just refer Mr. Pogue directly to the experts here at Language Log Plaza, where we have not only cared enough to take Time magazine to task for a misplaced apostrophe, but also taken the time to correct a random blogger's misperceptions about the popular use of d'oh and even discussed the apostrophe as typographical evidence that the infamous Texas Air National Guard memos were forgeries. Not a bad place to turn for your questions about language, eh, Mr. Pogue? (Oh, and about orthographic conventions, too.)

Posted by Eric Bakovic at 10:35 PM

Plain speaking

As reported by Richard Norton-Taylor in the Guardian of 9/29/05, in a story on the appointment of Sir Richard Mottram as "the prime minister's top security and intelligence adviser":

Sir Richard put it more succinctly. He is said to have told a colleague [about a 2001-02 mess in Whitehall]: "We're all fucked. I'm fucked. You're fucked. The whole department's fucked. It's been the biggest cock-up ever and we're all completely fucked."

Passed on to me by Victor Steinbok, who characterized Sir Richard's forthrightness as "one of the best lines of recent British politics".

Note that the Guardian, unlike some other papers I could name, doesn't shrink from printing Sir Richard's remarks without any avoidance characters.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:17 PM

Linguistic mens rea

Did Harriet Miers know she done wrong? When I wrote yesterday about her mistake in subject-verb agreement, I suggested that she signaled her awareness of the error with an extra-long pause:

The wisdom of those who drafted our Constitution and conceived our nation as functioning with three strong and independent branches have proven [1330 msec] truly remarkable.

I don't like to leave that kind of statement hanging out there without some empirical support. So I transcribed her speech as a whole, and made appropriate measurements, duly reported below the fold.

The core of Miers' remarks -- omitting the thank-yous and such -- took 112.236 seconds ("From my early days" to "our constitutional system"), during which there were 17.84 seconds of silent pause, 16% of the total time. She paused 39 times. 33 of these pauses were internal to sentences and 6 were between sentences. The transcribed segment comprises 198 words, for a rate of 106 wpm overall, or 126 wpm during the non-silent regions.

For comparison, I also transcribed and measured a comparable segment of George W. Bush's preceding remarks -- 125.068 seconds (from "In our great democracy" to "the cause of justice"), during which there were 47.566 seconds of silent pause, or 38% of the total time. He paused 30 times, of which 17 were internal to sentences and 13 were between sentences. The transcribed segment comprises 284 words, for a rate of 136 wpm overall, or 220 wpm during the non-silent parts. In his prepared speeches, W uses long pauses, adding up to an unusually large fraction of the total time; but in between the pauses, he speaks pretty fast.
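
For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the pause share and both speaking rates from the raw numbers in the two paragraphs above. The function name speech_rates and the layout of the measurements dictionary are mine, for illustration only; the figures themselves are the ones just reported.

```python
def speech_rates(words, total_sec, pause_sec):
    """Return (pause share, overall wpm, wpm over non-silent time)."""
    speaking_sec = total_sec - pause_sec
    return (
        pause_sec / total_sec,        # fraction of the time spent silent
        words / total_sec * 60,       # words per minute, pauses included
        words / speaking_sec * 60,    # words per minute, pauses excluded
    )

# (words, total seconds, seconds of silent pause), as reported above.
measurements = {
    "Miers": (198, 112.236, 17.84),
    "Bush": (284, 125.068, 47.566),
}

for speaker, (words, total, pauses) in measurements.items():
    share, overall, non_silent = speech_rates(words, total, pauses)
    print(f"{speaker}: {share:.0%} pause, {overall:.0f} wpm overall, "
          f"{non_silent:.0f} wpm excluding pauses")
```

Rounded to whole numbers, this reproduces the figures in the text: 16%, 106 wpm and 126 wpm for Miers; 38%, 136 wpm and 220 wpm for Bush.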

Here's a boxplot that shows the distribution of durations of silent pauses in Miers' speech, and also in a comparable segment of George W. Bush's preceding remarks:

The line in the middle of the box in the leftmost column shows the median value of Miers' within-sentence pauses: 465 milliseconds. The bottom and top of the box show the 25th and 75th percentiles, respectively: 343 and 620 msec. The "whiskers" show the extreme values (excluding the pause after "have proven") -- 169 and 744 msec. The other columns of the plot show Miers' between-sentence pauses, and the comparable distributions for Bush's pauses.

The big red asterisk in the first column of the plot is the duration of Miers' pause after "have proven" -- 1330 msec. This is 2.9 times larger than her median within-sentence pause length, and 1.8 times larger than the next-longest example. In fact, it's larger than her longest between-sentence pause.
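
The two ratios are easy to verify, and the same numbers allow one further check. The sketch below recomputes the ratios from the values quoted above and then applies the conventional boxplot rule of thumb that flags anything beyond 1.5 interquartile ranges above the upper quartile. That rule is my addition here, an assumption about how one might formalize "unusually long", not something the plot itself was necessarily built on; the variable names are likewise just illustrative.

```python
# Within-sentence pause statistics quoted above, in milliseconds.
median, q1, q3 = 465, 343, 620
next_longest = 744      # longest within-sentence pause apart from the outlier
outlier = 1330          # the pause after "have proven"

ratio_to_median = outlier / median
ratio_to_next = outlier / next_longest

# Assumed rule of thumb (Tukey's fence): flag values above Q3 + 1.5 * IQR.
upper_fence = q3 + 1.5 * (q3 - q1)

print(f"{ratio_to_median:.1f} times the median")
print(f"{ratio_to_next:.1f} times the next-longest pause")
print(f"upper fence {upper_fence} msec; flagged as outlier: {outlier > upper_fence}")
```

The ratios come out at 2.9 and 1.8, as stated, and the 1330 msec pause lies well beyond the 1035.5 msec fence.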

In my earlier post, I theorized that Miers might have made the error in a distracted moment, and then regretted it. On reflection, I think it's just as likely that someone else wrote the speech for her, and she didn't notice the error until she read it out loud. In either case, it seems likely to me that the extra-long pause after the error is a symptom of regret. It must have been painful for her to make this mistake in such a public forum, given the reports that she is a fiercely detail-oriented editor of others' texts.

I'll also note in passing that the striking difference in rhetorical rhythm between the two speakers, noted in the waveform display in my earlier post, is displayed in a different way in these measurements.

Back in January, Geoff Pullum wrote about a similar sort of agreement error that he found in the newspaper. Some commenters in another web forum complained about this:

Sorry kids, you can't be an apple and an orange, and if you're a descriptivist, and someone honestly makes a sentence, that's an honest sentence in the language that actually is.

In response, Geoff explained the facts of grammatical life in simple terms. Some readers may again be puzzled about how linguists, who sometimes seem to feel that "whatever is, is right", can at other times be so certain that something is an error.

Concerns about grammar and usage fall into several categories. There are violations of nonsensical "principles" hallucinated by some self-appointed arbiter; there are differences between linguistic innovations and conservative norms; there are differences between standard and non-standard dialects; and finally, there are genuine errors.

The genuine errors -- slips of the tongue or the mind -- usually reflect some local confusion of the speaker or writer, who chooses a form or structure by mistake, as if absent-mindedly discarding the apple slices and putting the peels in the fruit salad. In English, one common confusion is using a plural verb form with a singular subject, under the influence of a nearby plural noun that is not the subject.

Language cranks treat all these categories of phenomena in the same way, as crimes against the laws of correct language. Linguists distinguish them according to their several kinds, and try to offer advice suitable to the situation. It may be prudent to avoid running afoul of imaginary strictures, though some may choose to defy them on principle; the choice of innovative or non-standard forms should be made mindfully; but mistakes are just mistakes.

Harriet Miers made a mistake, and she knew it as soon as the words were out of her mouth.

Posted by Mark Liberman at 11:32 AM

October 05, 2005

Jeopardy Gets It Wrong Again

The folks at Jeopardy are not doing at all well this week on the linguistic front. They just got another one wrong. The answer was a display of the country symbol PY together with the statement:

After Spanish, this country's most common language is Guaraní.

The corresponding question was: Paraguay.

The problem is that Guaraní is not the most common language of Paraguay after Spanish; Guaraní is the main language of Paraguay. Over 90% of Paraguayans are estimated to speak Guaraní, with about 40% of the population monolingual in Guaraní. About 6% are monolingual in Spanish. Roughly half the population is bilingual. That means that about 56% speak Spanish, compared to over 90% who speak Guaraní.
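
The totals follow from adding the bilingual group to each monolingual group. As a minimal Python sketch, taking "roughly half" as exactly 50% (a simplifying assumption of mine, as are the variable names):

```python
# Approximate shares of the population, in percent, from the paragraph above.
mono_guarani = 40   # speak only Guaraní
mono_spanish = 6    # speak only Spanish
bilingual = 50      # "roughly half"; rounded to 50 for this sketch

speaks_spanish = mono_spanish + bilingual
speaks_guarani = mono_guarani + bilingual

print(f"Spanish speakers: about {speaks_spanish}%")
print(f"Guaraní speakers: about {speaks_guarani}%")
```

With these rounded inputs the sketch gives 56% for Spanish and 90% for Guaraní; the "over 90%" estimate implies that the unrounded monolingual and bilingual shares are slightly higher.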

Politically, Spanish is the more important language. The monolingual Spanish minority still constitute much of the elite, and Spanish is used more widely than Guaraní in urban areas. Since the fall of the Stroessner dictatorship in 1989 Guaraní has assumed a more prominent role.

[The figures in the Ethnologue are somewhat different from what I have seen elsewhere. It gives 75% for Guaraní but only 4% for Spanish. I suspect that they have confused figures for ability to speak, monolingualism, and/or language preference.]

Posted by Bill Poser at 07:47 PM

Stickler Shock

Bruce Reed has an amusing piece in Slate entitled "stickler shock", musing on the fact that both Chief Justice John Roberts and Associate Justice nominee Harriet Miers are known to serve as self-appointed copy editors for texts that cross their desks. Reed adds, as an update:

Former Supreme Court clerk, future Supreme Court Justice, and fluent English speaker Robert Gordon reminds me that in her maiden speech yesterday, Miers got off to a rocky start in her bid to become Associate Grammarian:

"The wisdom of those who drafted our Constitution and conceived our nation as functioning with three strong and independent branches have proven truly remarkable."

Chief Justice Roberts might allow the controversial use of "proven" in place of the older and more established "proved," but any stickler who says "The wisdom ... have" listens too much to George W. Bush.

That's indeed what the NYT transcript says she says, and the White House transcript says the same thing, but I like to check these things, so I inspected a recording of the event, and this time, the transcripts are correct: that's what she said.

However, I noticed two other relevant things about the recording. First, the crucial verbal group "have proven" is followed by a long pause:

... independent branches [600 msec] have proven [1330 msec] truly remarkable.

Second, Ms. Miers' speech was not otherwise characterized by pauses of similar length. In fact, there's a striking difference in this respect between her and President Bush, as can immediately be seen in this low-res picture of the waveform of the entire appearance -- it's easy to see where Bush stops and Miers starts, because his portion includes so much more open air between phrases.

I conclude from this that when Ms. Miers used "have" in place of "has", she had momentarily lost the thread of sense in the speech she was reading, and was misled in the usual way by the plural noun phrase ("three strong and independent branches") intervening between her subject ("the wisdom") and her verb phrase ("has/have proven"). This sort of thing could happen to any of us. In fact, I caught myself doing something similar in this morning's lecture.

Reed's remark about "listens too much to George W. Bush" seems to me like a cheap shot, since I don't believe that there's any good evidence that President Bush is significantly more prone to this sort of error than the rest of us are.

[Update: more here.]

[Slate link via Ben Zimmer]

Posted by Mark Liberman at 04:31 PM

The cultural importance of jokes

Geoff Pullum has put it plainly, and even in bold face: YOU CANNOT draw conclusions about what a culture values, or what speakers perceive, or how a nation thinks, by selective comparison of the senses of a few lexical items. This is especially true if the lexical items are invented as a joke by someone who doesn't know the language under discussion. A case in point, via Benjamin Zimmer:

I wonder how many readers of Matt Bors' strip will recognize (as Ben Zimmer did) that the "Inuit" list in the top panel is taken from a satire due to Phil James, "The Eskimos' Hundred Words for Snow", which has been circulating on the internet since 1996 or so, and includes obviously fake (though funny) items like

tliyel          snow that has been marked by wolves
tliyelin        snow that has been marked by Eskimos
hiryla          snow in beards
tlayinq         snow mixed with mud
quinaya         snow mixed with Husky shit
quinyaya        snow mixed with the shit of a lead dog
puntla          a mouthful of snow because you fibbed
allatla         baked snow
fritla          fried snow
gristla         deep fried snow
MacTla          snow burgers
jatla           snow between your fingers or toes, or in groin-folds
ertla           snow used by Eskimo teenagers for exquisite erotic rituals
hahatla         small packages of snow given as gag gifts
warintla        snow used to make Eskimo daiquiris
mextla          snow used to make Eskimo Margaritas
mortla          snow mounded on dead bodies
ylaipi          tomorrow's snow
nylaipin        the snows of yesteryear ("neiges d'antan")
ever-tla        a spirit made from mashed fermented snow, popular among Eskimo men
huantla         special snow rolled into "snow reefers" and smoked by wild Eskimo youth
tla-na-na       snow mixed with the sound of old rock and roll from a portable radio
depptla         a small snowball, preserved in Lucite, that had been handled by Johnny Depp

I'm happy to agree that you can draw conclusions about what a culture values, or how a nation thinks, by looking at its myths. But what does the Eskimo-Snow-Words myth tell us about our own culture? I guess we can learn that a good story will always outrun the fact checkers -- but this is hardly unique to us. The snow-words myth underlines our love of exoticizing otherness, but there's a specificity to the ingrained idea of words as a window on the mind of the Other that remains puzzling.

Posted by Mark Liberman at 07:29 AM

Jeopardy Gets It Wrong

There was a rare gross error on Jeopardy yesterday. Jeopardy ruled correct a question that equated pow-wow and potlatch. That's wrong. They're entirely different events. A pow-wow is a gathering for fun. People go to them to meet other people and for the entertainment. They typically focus on dancing, much of which is competitive. There will also be food stalls and people selling crafts. Most people camp at the pow-wow site, so the socializing goes on day and night.

A potlatch on the other hand is serious business. Food is served, sometimes fairly modest, sometimes a real feast, and people may enjoy seeing their friends, but the purpose is not merely socializing and entertainment. The purpose of a potlatch is for business to be conducted in the presence of witnesses so that everyone can see what has been done and that it has been done properly. The business done may be clan-internal, such as the taking of a name. The holders of names are usually described in English as hereditary chiefs, but this is inaccurate since they aren't chiefs in the sense in which English speakers generally understand the word and their positions are not hereditary. Holding a name is more like holding a noble title.

Nowadays the most common kind of potlatch is a funeral potlatch, at which the clan of the deceased pays other people for the duties that they have performed related to the wake and funeral. Other potlatches have to do with the resolution of disputes. These are usually internal to the community, but sometimes take place between tribes. In 1861, for example, a long period of violence between the Tsetsaut and the Gitanyow in northern British Columbia came to an end with the transfer of the area around Meziadin Lake from the Tsetsaut to the Gitanyow as blood money.

Many sources describe the potlatch as a ceremony for the purpose of wealth display. That's true to an extent, but more so in some areas than others, and it probably isn't the whole story anywhere. It certainly is not true in the interior of British Columbia, where I am personally familiar with it.

Posted by Bill Poser at 03:52 AM

October 04, 2005

Worth a repetition

Let's just take this remark of Mark Liberman's (here) and repeat it in a bolder format for greater visibility, OK?

Someday, someone will explain to me why private theories about the logic of language so often turn into public crusades of moral awakening.

Exactly. That's what we want to know. How private worry about hyphens or apostrophes will so often turn a person into a punctuational avenging marauder, a ruthless grammatical Genghis Khan. Why does this happen? I wish I knew.

Posted by Geoffrey K. Pullum at 06:22 PM

Prescriptivism and national security

Another home run from Rob Balder at Partially Clips:

I once had a good friend who waged a long, tireless but lonely and fruitless campaign to persuade a major American corporation to enforce certain rules of hyphenation in all of its documents. His idea was that any left branching in complex nominals must be signaled by a hyphen within the constituent. Taking a few examples from this morning's papers, this would mandate not just two-bedroom house and hurricane-related damage, but also Supreme-Court choice, real-estate firm, Open-Content Alliance and so on. He was passionately convinced that his proposal was much more logical than any of the standard stylistic rules on the subject, which he argued were so complex and exception-ridden that no one could follow them in practice.

I gather that my friend formed his general rule as a misunderstanding of what he was taught in school. He was indignant to find that allegedly well-edited sources got this wrong all the time, and then felt betrayed and outraged when he discovered that they felt no remorse at this systematic violation of elementary rationality. After all, without some orthographic code for signaling the structure of these complex nominals, what else could a reader rely on?

Someday, someone will explain to me why private theories about the logic of language so often turn into public crusades of moral awakening. Perhaps it reduces to the previously-unsolved problem of why some people commit themselves so emotionally to other projects of rationalizing ethical norms with respect to a revealed system.

Meanwhile total-space-nut-hyphen-basket might be a useful term of art, despite being a little long. I'll try it in a couple of other posts, and see how it works.

Posted by Mark Liberman at 12:20 PM

English-Teaching Robots?

According to this news report, South Korea is about to test the use of robots for the teaching of English. The robots are to be placed in apartment complexes, where they will be available to help children with English pronunciation. The robots are said to be able to read aloud English stories and to correct children's pronunciation.

No explanation is offered of why robots will be used. I can see why computer tutoring might be helpful. Using a computer as a playback device is of course old hat. Computerized evaluation of language learners' pronunciation has been used for some time in Japan. Neither of these tasks requires a robot. The organization sponsoring the project is the Korea Advanced Intelligent Robot Association, an outfit supported by the Ministry of Information and Communication, whose newsletter provides further insight. It turns out that the goal of the project is to push robots, with English-teaching just one of a variety of activities of which they are capable. The robots have sensors that allow them to function as security alarms. They also have built-in vacuum cleaners.

What I find rather odd is that the functions that call for robots as opposed to computers are extremely limited. Other than vacuuming, for which much cheaper and smaller devices are available, the only function mentioned that calls for motion is that the robot can be programmed to go into the bedroom and wake people up. The last I knew, alarm clocks did a perfectly fine job of this. Robots have some very important uses. Industrial robots already do a lot of work on assembly lines, and bomb-disposal robots make the job of bomb-disposal units a good deal safer. These robots, however, seem to have little practical use.

This impression is confirmed by plans to introduce postal robots. One model, described as male, is to function as a security guard. The other, described as female,

will take care of customers by showing fun video clips to waiting clients on a built-in monitor.

Leaving aside the gender stereotyping, isn't this more than a bit ridiculous? If long lines at the post office are a problem, maybe they should increase staff or make postal operations more efficient. If I were in a long line at the post office, I doubt that being shown funny video clips would greatly improve the experience. And, after the initial novelty wears off, why will it be more enjoyable to see the video clips on a robot than on a television screen?

The impression I get is that this is the sort of snazzy but not really useful project that critics consider to be characteristic of the MIT Media Lab.

Posted by Bill Poser at 11:49 AM

Only 7% of variance shared across data from different web search engines?

Well, it's not really that bad. Except sometimes. Here's the story.

This starts with the most recent issue of Language, which has several excellent articles that should appeal to interested non-specialists. I wish (again!) that Language were an Open Access journal, so that those of you without institutional subscriptions could read it. And meanwhile, it would be nice to have the table-of-contents info on the journal's web site kept up to date, with an accessible link to the subscribers-only content at Project Muse -- the TOC for the September issue, sent to the printers several months ago and delivered to subscribers several weeks ago, is still not displayed as of this morning.

But enough complaining. I'd like to point readers to one of those excellent articles: Anette Rosenbach, "Animacy versus Weight as Determinants of Grammatical Variation in English", Language 81(3) 613-644. It was this article that reminded me of a disappointing fact about (some of) the counts returned from web searches.

Dr. Rosenbach starts from a traditional observation: the choice between the English expressions of possession "X's Y" and "the Y of X" is influenced by the animacy and the (linguistic) weight of X. If X is weightier -- e.g. longer -- then "the Y of X" is more likely and "X's Y" is less likely. And if X is animate or human, "X's Y" is more likely and "the Y of X" less likely. Here's a table of Google counts exemplifying the effect of animacy in a single context:

	the child	the ocean
the temperature of __	239	12,500
the __'s temperature	687	1,090
apostrophe-s proportion	74%	8%

And here's a related example showing the effect of length:

	the ocean	the Atlantic ocean	the North Atlantic ocean
the temperature of __	12,500	145	57
the __'s temperature	1,090	8	1
apostrophe-s proportion	8%	5%	2%
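For readers who want to check the arithmetic, here is a minimal Python sketch of how the "apostrophe-s proportion" rows in the two tables above can be computed. The function name `s_proportion` and the dictionary layout are my own illustrative choices, not anything used in the original analysis; the counts are simply copied from the tables.

```python
def s_proportion(of_count: int, s_count: int) -> float:
    """Share of "X's Y" hits among both possessive variants."""
    return s_count / (of_count + s_count)


# Google counts copied from the two tables above, as
# (hits for "the temperature of X", hits for "X's temperature").
counts = {
    "the child": (239, 687),
    "the ocean": (12_500, 1_090),
    "the Atlantic ocean": (145, 8),
    "the North Atlantic ocean": (57, 1),
}

proportions = {x: s_proportion(of, s) for x, (of, s) in counts.items()}

for x, p in proportions.items():
    # Format as a whole-number percentage, as in the tables.
    print(f"{x:26s} {p:.0%}")
```

The proportion is taken over the sum of the two variants only, so it says nothing about other ways of expressing the same relation.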

Dr. Rosenbach explores the interaction of these two factors (length and animacy) both in studies of reader judgments and in a corpus study, which uses the British component of the International Corpus of English (ICE-GB). Here's (a slightly revised form of) one of her tables from the corpus study, showing the interaction of these two factors in determining apostrophe-s proportion, for possessors between 1 and 4 words long:

	L(possessor)=1	L(possessor)=2	L(possessor)=3	L(possessor)=4
Human possessor	387/482 (80%)	370/491 (75%)	56/94 (60%)	1/15 (7%)
Inanimate possessor	14/44 (32%)	44/711 (6%)	3/244 (1%)	1/65 (1%)

She argues, based on statistical modeling of such data, that the two effects can't be reduced to epiphenomenal projections of a single underlying effect, as some have suggested. The paper is full of interesting examples and discussions -- by all means go read it, if you can get access to a copy.

As I mentioned, Dr. Rosenbach's numbers don't come from web search counts, but rather from a parsed, balanced -- but small -- corpus, ICE-GB, comprising only a million words of text. There are lots of advantages to such a corpus, but its small size can be a big disadvantage. So I thought I'd explore what we can learn about the distribution of possessive structures in a corpus the size of the web at large. In particular, I wondered whether there might be effects of phonological weight (the number or type of syllables in a word), of word frequency, and so on. And I was also interested in the different distributions for different head words. So I started with a set of city names, which can be found in phrasal pairs like "the population of Miami" vs. "Miami's population", or "Chicago's architecture" vs. "the architecture of Chicago".

Here's a sample with counts from Google, ranked in order of increasing 's proportion:

	Los Angeles County	Vienna	New York	Tallahassee	San Antonio	London	Moscow	Miami	Cleveland	San Francisco	Hong Kong	Chicago	Denver	Austin	Baghdad	San Diego
population of __	1,320	479	28,800	172	392	32,000	703	722	449	782	9,500	628	326	428	527	654
__'s population	441	161	10,800	80	257	22,100	654	684	461	837	11,300	838	443	641	869	122,000
's proportion	25%	25%	27%	32%	40%	41%	48%	49%	51%	52%	54%	57%	58%	60%	62%	99.5%

There's a considerable range here, from 25% to 99.5%, which is what we want if there's going to be an effect to explain. However, the highest point (San Diego) looks like an outlier. Indeed there's something funny going on: nearly all of the 122,000 instances of "San Diego's population" are from a set of pages at sandiego.merchantamerica.com ("San Diego Directory of Merchants"). This seems to be a legitimate site, not some sort of fake search-engine-optimization link-infestation (though I wonder, can there really be 122,000 different businesses serving San Diego's 1.2M people?), but it's irrelevant to this investigation in any case.

A glance at the rest of the hits doesn't turn up any similar catastrophes, but neither is very much of the variation attributable to any obvious linguistic trend. That's not too discouraging, since the explanation might involve several factors and this is a small sample. But whatever the explanation, how much signal is there here?

One way to look into this question is to compare results from a different search engine. Here's the same table with counts from MSN search:

	Los Angeles County	Vienna	New York	Tallahassee	San Antonio	London	Moscow	Miami	Cleveland	San Francisco	Hong Kong	Chicago	Denver	Austin	Baghdad	San Diego
population of __	1,280	630	11,887	79	594	9,939	1,161	1,428	952	3,339	2,774	2,801	602	665	1,666	1,481
__'s population	302	205	3,810	134	463	8,024	431	1,105	836	1,729	4,004	2,330	824	1,062	627	2,283
's proportion	19%	25%	24%	63%	44%	45%	27%	44%	47%	34%	59%	45%	58%	61%	27%	61%

The range of apostrophe-s proportions in the MSN counts is somewhat smaller, only 19% to 63%, though that's just because MSN's San Diego apostrophe-s count is more reasonable -- throwing out San Diego, the Google proportions were 25% to 62%. However, the remaining variation doesn't correlate terribly well, as this scatter plot of the proportions calculated from the two counts suggests.

In fact, the r² measure of correlation between the two sequences of numbers is only 0.27, meaning that only about 27% of the variance of the MSN ratios can be predicted from the Google ratios (and vice versa). And in fact, if we remove the (bogus) San Diego point, which happens by accident to be pulling the correlation up, r² for the estimated proportions drops to 0.20, or 20 percent of variance accounted for.

In other words, about 4/5 of the variance in our estimates of possessive-construction proportions for different city names is apparently due to the selection procedures used by different web search systems. This leaves only about a fifth of the variation to be accounted for by any factors relating to the actual city identities, whether these factors are linguistic or extra-linguistic. The situation is by no means hopeless -- there are a lot of city names out there, and so we might eventually be able to separate some gold from the dross. However, it's not going to be an easy task.
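The r² comparison of the two engines' proportions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: I take r² to be the squared Pearson correlation, I recompute each proportion from the raw counts in the two tables rather than from the rounded percentages, and the helper names (`r_squared`, `proportions`) are hypothetical.

```python
from math import fsum


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = fsum(xs) / n, fsum(ys) / n
    sxy = fsum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = fsum((x - mx) ** 2 for x in xs)
    syy = fsum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def proportions(of_counts, s_counts):
    """Apostrophe-s share of the two possessive variants, per city."""
    return [s / (of + s) for of, s in zip(of_counts, s_counts)]


# Counts copied from the tables above, in the same city order
# (Los Angeles County first, San Diego last).
google_of = [1320, 479, 28800, 172, 392, 32000, 703, 722,
             449, 782, 9500, 628, 326, 428, 527, 654]
google_s = [441, 161, 10800, 80, 257, 22100, 654, 684,
            461, 837, 11300, 838, 443, 641, 869, 122000]
msn_of = [1280, 630, 11887, 79, 594, 9939, 1161, 1428,
          952, 3339, 2774, 2801, 602, 665, 1666, 1481]
msn_s = [302, 205, 3810, 134, 463, 8024, 431, 1105,
         836, 1729, 4004, 2330, 824, 1062, 627, 2283]

g = proportions(google_of, google_s)
m = proportions(msn_of, msn_s)

r2_all = r_squared(g, m)                  # all 16 cities
r2_no_sd = r_squared(g[:-1], m[:-1])      # San Diego (last entry) dropped

print(f"r^2, all cities:        {r2_all:.2f}")
print(f"r^2, without San Diego: {r2_no_sd:.2f}")
```

Run on these counts, the two values should land close to the 0.27 and 0.20 quoted above; small differences are possible depending on whether the proportions are rounded before correlating.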

Not everything is quite so bleak: r² for the of-phrase counts is 0.97:

although for the apostrophe-s counts, the bogus San Diego point reduces it to 0.07 (this is where I got the 7% figure for the title):

If we leave out the San Diego estimates, these numbers become 0.97 for the of-phrase counts and 0.93 for the apostrophe-s counts. However, most of this (excellent) degree of correlation is due to variation in population and overall web presence, rather than to any linguistic factors.

Moral: interpret web counts with caution! They're often informative, but they're by no means a direct route to linguistic truth. At a minimum, you should verify that effects are shared across two or three search engines.

This is not some special flaw in web counts as data. The raw material of science often requires a significant amount of refinement. As an object lesson from another field, I recommend a fascinating post from last August at RealClimate about the interpretation of radiosonde data with respect to upper-air temperature trends and global warming.

Google counts: no worse than tropospheric temperature data.

[ Update: for a scholarly discussion of the question "what do search engines really index, anyhow?", see a lovely series of posts by Jean Véronis at Technologies du Langage, dealing with the controversy that arose back in August over the relative size of Google and Yahoo's indices:

http://aixtal.blogspot.com/2005/08/yahoo-missing-pages-1.html
http://aixtal.blogspot.com/2005/08/yahoo-missing-pages-2.html
http://aixtal.blogspot.com/2005/08/yahoo-missing-pages-3.html
http://aixtal.blogspot.com/2005/08/yahoo-missing-pages-4.html

Jean also muses here on how it is that Google managed to index 584,000 pages on his personal website!

For what it's worth, Yahoo finds 41,000 hits for "San Diego's population", and indeed all but 312 of them seem to be from merchantamerica.com. Was this a case where MSN was smart enough to avoid a doubtful site that both Google and Yahoo fell for? Or did MSN just index a smaller sample, and so missed this cache by luck rather than by design? I have no idea; but linguists need to be careful, out there in the back streets of the web. ]

Posted by Mark Liberman at 07:35 AM

October 03, 2005

Warmth and the French language

It is probably time for me to point out, to the one or two people who still have not noticed, that my post on the inadequacies of French as a Romance language for talking about romance was a joke. That's J O K E: witticism, pleasantry, jape, jest, spoof, rag, kidding... you know what I'm talking about? But at least it did lead to a correspondence about what one should really say about such things as that French doesn't appear to have a word for warmth. I'm grateful to Benoit Essiambre (yes, don't be surprised, I still have many French-speaking friends) for designing this cute graphic to illustrate something a little closer to what might be the true situation regarding a few of the words designating regions in the range from "cold" to "hot" in English and French:

You see what the chart suggests. Some of the senses sort of roughly line up, but nothing lines up perfectly between the two languages with respect to their series of vague adjectives for temperature. They have simply cut the spectrum of heat levels into a different set of regions. This does not mean anything about the French (or English) mind or the impossibility of translation or the inexpressibility of the notion of a warm friendship. I was in fact poking fun, in my ill-advised deadpan way, at the extraordinarily huge number of educated people who nonetheless believe that such conclusions follow. Listen to me: YOU CANNOT draw conclusions about what a culture values, or what speakers perceive, or how a nation thinks, by selective comparison of the senses of a few lexical items.

There. When I die it will still be the case that most educated people still do not appreciate the truth of what I just said, but I thought I would just say it once very seriously with appropriate emphasis. I will die without having convinced people that you can't conclude things about the savage mind by alluding to a few opaque Inuit words for snow, but at least I will not die never having written down the truth in boldface here at Language Log.

Posted by Geoffrey K. Pullum at 03:48 PM

Doing the undo

From this story in yesterday's NYT (emphasis added):

Regardless of how the criminal case unfolds, it is clear that Mr. DeLay's persona has produced a cottage industry of forces that trace his every step and draw negative public attention to it.

"I think it's entirely his own undoing, but the good-government groups definitely decided to focus on him," said Tom Matzzie, the Washington director of the liberal organization MoveOn, which spent hundreds of thousands of dollars running advertisements against Mr. DeLay and for his current Democratic opponent.

I've heard of something being (entirely) someone's (own) doing -- that person's responsibility or fault -- and I've heard of something being someone's undoing -- the thing or event that brings that person down -- but something that's entirely someone's own undoing? I'm guessing this is a mixture of the two idioms; assuming Matzzie was quoted correctly, he's trying to say that Mr. DeLay has brought his likely political demise on himself. But I think that's just an odd way to say it.

Posted by Eric Bakovic at 02:14 AM

October 02, 2005

Avoiding the obvious

Though I have a pretty extensive vocabulary, I occasionally come across an unfamiliar word, usually in technical contexts, but sometimes, mystifyingly, in more ordinary prose. Here's a passage -- from Frederick Kaufman's "Debbie does salad" (on television food shows as porn), Harper's Magazine, October 2005, p. 59 -- that sent me to the dictionaries a few days ago:

The primeval brain of the involuntary, the abdominal brain, the brain that controls sympathy and revulsion but not ratiocination, that is the brain of the wow.
When it comes to television, the theory becomes practice: Whether on the Hot Network, E! Entertainment Television, or CBS, the splanchnic response, not the lucubrations of the intellect but the primal gut reaction--that's what hauls in the ratings.

Wow, indeed: splanchnic. I'm down with ratiocination and lucubrations (though I find these word choices annoyingly fancy), but splanchnic would have been a total zero out of context. In this context it must mean 'of the gut, visceral', and the dictionaries confirm that it's a (Greek-derived) medical term with this meaning. But why did Kaufman use it? He can't really have expected many of his readers to be familiar with it.

Here's a guess. First, he decided to refer to the gut twice, for emphasis. (I would have counseled sticking to a single contrast to "the lucubrations of the intellect", or however this idea gets formulated, and then he never would have gotten into mining the far reaches of lexicography.) One of these references can just be with gut: "the primal gut reaction" above. To avoid mere repetition, the other one's going to have to be something fancier. The obvious candidate is visceral, but (a) it is, well, obvious, almost clichéd, and (b) it could be read as (somewhat) metaphorical, rather than as a literal reference to the viscera (though gut has the very same problem). So Kaufman hauls himself off to a thesaurus, or consults one of the experts on anatomy he interviewed, and unearths the shiny hundred-dollar word splanchnic. (If he had the word to hand already, then he's been doing way too many Expand Your Word Power exercises.) Of course, for most readers it doesn't actually contribute anything to the sentence and just causes them to get hung up in the middle of it. But it certainly does avoid the obvious.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 09:02 PM

October 01, 2005

Another overearnest comedy of fact checking

Thursday morning, while waiting in the San Jose airport, I wrote about Jean-Claude Sergeant's unflattering view of the English language:

Dans sa configuration actuelle, l'anglais courant se caractérise d'abord par un extrême souci de cohérence et d'explicitation proche de la redondance.

In its present configuration, current English is characterized first by an extreme concern for coherence and for explicitness approaching redundancy.

I dissected Prof. Sergeant's example of the English obsession with coherence -- the alleged inability of English journalists to match the flexible Gallic deployment of pre-subject "elements de complementation" -- and concluded that the factual content of his claim, at least in the neighborhood of his example, appears to be false. The boarding call for my flight came before I had time to get to his example of the redundancies resulting from the English concern for explicitness. Having just poured this morning's first cup of coffee, back at home in Philadelphia, I've got a few minutes to journey further into this interesting exercise in linguistic stereotyping.

Prof. Sergeant writes:

L'anglais évitera toute ambiguïté quant à l'identité des agents intervenant dans une phrase. Lorsque, par exemple, Flaubert évoque l'attention affectueuse, quoique un peu naïve, de M. Bovary pour Emma – "C'était une surprise qu'il réservait à sa femme : son portrait en habit noir" –, l'anglais ne se contentera pas de : "his portrait in a black dress coat", mais renforcera la relation à l'agent animé : "a portrait of himself in his black dress coat" (traduction G. Hopkins).

English will avoid all ambiguity as to the identity of agents intervening in a phrase. When, for example, Flaubert evokes the affectionate (if a bit naive) attention of M. Bovary for Emma -- "C'était une surprise qu'il réservait à sa femme : son portrait en habit noir" –, English will not be satisfied with "his portrait in a black dress coat", but will reinforce the relationship to the animate agent: "a portrait of himself in his black dress coat" (translation by G. Hopkins).

The choice between "his portrait in <article of clothing>" and "a portrait of himself in his <article of clothing>" was freely made by the translator picked by Sergeant, but the English language would have been well enough satisfied with the other alternative. And as a matter of fact, in the translation by Eleanor Marx-Aveling available from Project Gutenberg, the same sentence from chapter 6 is translated as

It was a sentimental surprise he intended for his wife, a delicate attention—his portrait in a frock-coat.

Now, what does the choice between "his portrait in a black dress coat" and "a portrait of himself in his black dress coat" have to do with "the identity of agents intervening in a phrase"? Neither the identity nor the agency of Charles Bovary is really in question here. Sergeant is apparently alluding to the fact that in general "his portrait" is ambiguous between "a portrait of X" and "a portrait by X", and the black dress coat might have been rented, borrowed or stolen. In this context, it's clear that Charles is the subject, not the painter, and that the coat is his own. However, Sergeant thinks that English, with its "extreme concern for ... explicitness approaching redundancy", will avoid all ambiguity about such things by introducing several superfluous words, which were omitted by the subtler and more rational French.

As Marx-Aveling's alternative translation makes clear, this is not necessarily true. And as we can learn from a few simple web searches, the sort of phrasing that Sergeant claims English "will avoid" is actually an order of magnitude more common.

Looking at web counts for the first part:

	his portrait	portrait of himself	her portrait	portrait of herself	my portrait	portrait of myself
Google	551,000	57,100	227,000	20,600	143,000	20,100
MSN	110,012	11,446	46,990	4,354	37,756	4,583

And the second part:

	in a black coat	in his black coat	in a black dress	in her black dress	in a cowboy hat	in his|her cowboy hat
Google	20,000	596	107,000	10,700	63,500	1,346
MSN	3,297	405	22,187	1,704	12,816	1,201
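The "order of magnitude" claim can be checked directly from the two tables of counts. The following Python sketch just divides each bare phrasing's count by the count for its more explicit counterpart; the variable names and the pairing of columns are my own bookkeeping, and the numbers are copied from the tables above.

```python
# Each entry: (bare phrasing, explicit phrasing, engine) -> (bare count, explicit count)
pairs = {
    ("his portrait", "portrait of himself", "Google"): (551_000, 57_100),
    ("her portrait", "portrait of herself", "Google"): (227_000, 20_600),
    ("my portrait", "portrait of myself", "Google"): (143_000, 20_100),
    ("his portrait", "portrait of himself", "MSN"): (110_012, 11_446),
    ("her portrait", "portrait of herself", "MSN"): (46_990, 4_354),
    ("my portrait", "portrait of myself", "MSN"): (37_756, 4_583),
    ("in a black coat", "in his black coat", "Google"): (20_000, 596),
    ("in a black dress", "in her black dress", "Google"): (107_000, 10_700),
    ("in a cowboy hat", "in his|her cowboy hat", "Google"): (63_500, 1_346),
    ("in a black coat", "in his black coat", "MSN"): (3_297, 405),
    ("in a black dress", "in her black dress", "MSN"): (22_187, 1_704),
    ("in a cowboy hat", "in his|her cowboy hat", "MSN"): (12_816, 1_201),
}

# How many times more frequent the bare phrasing is than the explicit one.
ratios = {key: bare / explicit for key, (bare, explicit) in pairs.items()}

for (bare, explicit, engine), ratio in ratios.items():
    print(f"{engine:6s} {bare!r} vs {explicit!r}: {ratio:.1f}x")
```

The ratios vary a good deal from pair to pair, so "about 10 times" is a rough summary rather than a precise estimate.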

So let's sum up.

Flaubert wrote "son portrait en habit noir", which word-for-word is "his portrait in tailcoat black";
One English translator rendered this as "a portrait of himself in his black dress coat";
Sergeant takes this as evidence that English must "reinforce the relationship to the animate agent" so as to "avoid all ambiguity";
Another English translator rendered the same phrase as "his portrait in a frock-coat", which reproduces Flaubert's relationships of animate agents exactly (and is also vague as to the coat's color);
Web counts of similar phrases suggest that the second translator's choices are about 10 times more common in English than the first one's are.

As long as we're trading in stereotypes, I can't resist quoting Adam Gopnik's explanation of a French intellectual's puzzled response to the concept of a "fact checker":

People don't speak in straight facts; the facts they employ to enforce their truths change, flexibly and with varying emphasis, as the conversation changes, and the notion of limiting conversation to a rigid rule of pure factual consistency is an absurd denial of what conversation ought to be. Not, of course, that the French intellectual doesn't use and respect facts, up to a useful point, any more than even the last remaining American positivist doesn't use and respect theory, up to a point. It's simply the fetishizing of one term in the game of conversation that strikes the French funny. Conversation is an organic, improvised web of fact and theory, and to pick out one bit of it for microscopic overexamination is typically American overearnest comedy.

So perhaps this little exercise in microscopic overexamination will give Prof. Sergeant a chuckle.

Although I enjoy Gopnik's morality play of national stereotypes, in fact I'm reluctant to believe that it's any more true in the end than Sergeant's linguistic stereotypes are. Empirical scrupulousness is as common among my French acquaintances and friends as among the Americans I know. Among intellectuals of all nationalities, there are some who are interested in rational inquiry, and others who prefer to strike poses.

[Update: several people have written to suggest that perhaps Sergeant was not concerned with whether Charles was the subject or the painter of the portrait, but rather with whether the subject was Charles or Emma, since the French "son portrait" is ambiguous in that respect. On the other hand, "his portrait" vs. "her portrait" already makes that distinction; and I was hoping to find some way in which agency entered into the matter, since Sergeant focuses on "the relationship to the animate agent". On the other hand, he may be using "agent" loosely, to mean simply "human referent". In general, this aspect of the article does not give the impression that Sergeant thought about it as carefully as my correspondents are doing. ]

Posted by Mark Liberman at 07:10 AM