Accelerando molto con micino

Courtesy of Michael Kats(evman), the lolsingularity now has its own domain,

Linda Seebach sent evidence that even Immanuel "lol" Kant has been drawn into the basin of this vast memetic attractor.

The linguist is the Sufi

From Reza Aslan's No god but God:

(The passage begins on the previous page, with "Consider the following parable originally composed by the great-")

Yes, it's the other sense of linguist -- someone who "speaks several languages fluently" (as the AHD4 puts it).  But, still, it's nice to have the quotation, "the linguist is the Sufi, who enlightens humanity".

(Thanks to Mae Sander, who told me about the passage and then scanned it in for me.)

Groundbreaking research with credulous primates

I think I would have said, had anyone asked, that reporting on animal communication stories couldn't get much dumber. There just wasn't any more room down there: we're at 87 on the FM dial of intelligence in reporting, and there's just about nowhere to go but up. I was wrong. Says John Berman of ABC News in a piece headed Groundbreaking Research Has Scientists Talking With Apes:

The Great Ape Trust in Des Moines, Iowa, is home to seven bonobos -- a close relative of the chimpanzee -- and three orangutans. But if you think Iowa might be a strange place for them to live, don't say it out loud — these apes understand English.

Really. No kidding.

I think it would be unfair to make you read through the drivel itself to reach the point where Berman interviews Kanzi, a bonobo, and we learn the full extent of Kanzi's success in EFL class, so here is the crucial bit:

I read Kanzi a series of words, and then without fail, he hit the corresponding lexigram symbol on a touch screen.
I said "Egg."
He pressed "Egg."
I said, "M and M."
He pressed "M and M."
Then Kanzi took control of the conversation and pressed the symbol for "Surprise!"

By "Surprise!", Kanzi actually means the closed box of candy beside Berman (it's Kanzi's keepers who have supplied the meaning SURPRISE for that lexigram, of course; they could have called it the CUT THE CRAP AND GIMME SOME OF THAT CANDY I KNOW YOU'VE GOT IN THAT BOX lexigram). Kanzi wants Berman to open it and give him some piece of candy. He's prepared to learn which symbols on an electronic board to press in order to induce humans to give him sweet treats. This is the evidence that Kanzi has learned English. Are you convinced?

Because there is more. Only I don't mean more from Kanzi, who is a clever ape, who has doubtless learned a remarkable number of new symbol-to-concept associations and lexigram-pressing behaviors but unfortunately has nothing to say to us. I mean there is more evidence concerning the incredible stupidity of the white, hairless primates doing the reporting. This is more dim-wittedly credulous than anything I have seen before.

"Moments like this", Berman says, referring to the above fascinating interspecies communing, "are proof that these conversations help scientists learn about apes, from the apes themselves." He quotes Rob Shumaker, one of the investigators at the Great Ape Trust:

"If we have some common means of communicating with each other," said Shumaker, " we suddenly have exponentially large number of topics that we can explore."

(There's nothing like a floating use of the term "exponential" to reveal a speaker's innumeracy, I always think. An exponential function of what, in this particular case? Never mind; it was only a rhetorical question.)

We learn more about Shumaker from this passage, just before the above quote:

Rob Shumaker has known Azy, a majestic, huge male orangutan, for more than 20 years. He talks to Azy, just like he would speak to one of his children, or a longtime friend.

"When I'm around them we just kind of talk normally," he explained. "I use my normal vocabulary, my normal voice my normal gestures."

Berman asks: "Sound beyond belief?" Well, since he asks, no, it doesn't. The point was meant to be about apes understanding English, not about humans like Rob Shumaker being able to speak it. What does Azy say back to Rob? Nothing, apparently. Berman tells us nothing more about Azy. He just babbles on breathlessly: "During a visit to the Great Ape Trust, I sat down with Kanzi the Bonobo -- the first Ape I have ever interviewed." And then he reports the interview above, where, as he accurately describes it himself, "I read Kanzi a series of words, and then without fail, he hit the corresponding lexigram symbol on a touch screen."

Another member of the staff at the Great Ape Trust is Bill Fields, who has been working with Kanzi for years. "To communicate," says Berman, "Fields speaks to Kanzi, who then points to the lexigrams to respond and demonstrate a level of understanding." And then he quotes Fields as saying this:

"Qualitatively, there is no difference between Kanzi's language and my language," Fields said. "It's a matter of degree."

Let's look at that first sentence Fields uttered. It has a preposed adverb in adjunct function (qualitatively) at the beginning of an existential clause with a singular postcopular noun phrase with a negative determiner (the determinative no) and a complement preposition phrase headed by between. The preposition between has as its complement a noun-phrase coordination in which the two noun phrases have contrasting genitive noun phrase determiners (Kanzi's and my, respectively). Bill Fields knows how to construct English clauses with typical syntactic sophistication.

Kanzi, on the other hand, sometimes presses the EGG lexigram on hearing a human say "Egg". (Read the testimony of Steve Jones about another day when Kanzi was not quite so cooperative.)

And Fields wants to tell us that there is simply no qualitative difference between his linguistic ability and Kanzi's. He must think he is talking to complete morons who will believe absolutely anything about interspecies communication.

He may be right, too. John Berman appears to have believed him. And doubtless millions of other people will.

I don't know about you, but I am just about... speechless. If I had a lexigram board in front of me, I wouldn't know which lexigram to press. The BULLSHIT lexigram, maybe? Is there a lexigram for that? Or an IDIOT lexigram? I think I need a stiff drink, actually; I may press the LARGE SINGLE MALT lexigram.

Oh, one other thing: at one point Berman reports this remark from Bill Fields:

"Language is culturally acquired. Its not learned," said Fields. "It's acquired in the immediate postnatal antogyny of the organisms life. [sic]

On the whole of the web, antogyny only got one hit as of yesterday, and it's Berman's piece. (Soon there will be this post as well.) The word does not, of course, exist. Berman heard Fields talking about ontogeny, but didn't know the term (it means development in the individual rather than evolution in the species), so he apparently guessed which lexigram button to hit. He guessed he was hearing anto- as in antonymy and -gyny as in misogyny. A sort of learned eggcorn. (What would "antogyny" mean? The property of being the opposite of a woman?) It's further clear evidence that Berman didn't know what he was seeing or hearing, and had no ability to understand the supposed story that someone had sent him to cover. A wandering human simpleton loose in a community composed of (i) apes who spend all day trying to persuade people to give them candy and (ii) human keepers trying to hype the non-existent linguistic abilities of the apes. It looks like the fire season and the silly season are both going to be bad ones this year.

Audio photoshopping at NPR

Last Friday, NPR's On the Media re-broadcast a segment about how NPR news and public affairs programming is edited ("Pulling Back the Curtain", 5/25/2007).

Ever wonder what goes on behind the scenes here at OTM? (Hint: Not everybody speaks as cleanly as it might seem.) A few years ago, we invited reporter John Solomon backstage to see how the sausage is made.

(The transcript is not on their site yet, but a bit of internet search turned up the version of the piece that was broadcast on 11/14/2003.) From my perspective, the most interesting part was what Solomon left out.

Nearly all of the 13 minutes is spent worrying about the relatively inconsequential topic of what you might call "audio airbrushing" -- editing out verbal blemishes like filled pauses, false starts and so on. Solomon also mentions that Car Talk edits in fake laughter from the co-hosts. (He didn't mention that editors -- so I've been told -- don't just remove false starts and dead air, they also sometimes add fake pauses and breath noises, as a form of verbal punctuation.)

All of this audio touch-up stuff is interesting, but hardly a matter for listener indignation (though I can imagine the flap if a politician's office were caught similarly touching up interviews or news-conference recordings!).

A few slightly more serious sorts of fakery are mentioned. One is studio dubbing of voice-overs on top of background sounds from the field:

Nothing is so identified with NPR News as its signature field pieces, and nothing is more of a cut and paste production collage. My first such story was a media workout day held for reporters by the National Football League's New York Jets at their practice facility. [JETS FOOTBALL PRACTICE AMBIENCE UP & UNDER] Before I left, the producer suggested I record ambient sound during the drills that could be mixed later into the piece underneath my narration. That confused me. As a listener, I had thought that the narration in field pieces was recorded by the reporter at the location. I guess I should have realized reporters wouldn't always be able to deliver their clear, concise, well-organized prose while also reporting the story.

Another is the creation of apparent real-time conversations that never took place. We only get a hint of this:

JOHN SOLOMON: It also may be worth examining whether there are normal production elements that may unnecessarily deceive listeners, for example this is how All Things Considered's Melissa Block introduced a report on President Bush's visit to Singapore last month.
MELISSA BLOCK: NPR's Vicki O'Hara is traveling with the president and joins us now from Singapore. Vicki, the center piece of this trip to Asia for the president was the annual...
JOHN SOLOMON: Kern and Dvorkin agreed that the use of "joins us" in an introduction strongly implies that the interview is live when in fact it may have been taped hours before.
JEFFREY DVORKIN: I think it's a morally slippery slope. I think it makes more sense to say "we spoke to Vicki O'Hara a few hours ago," - boom - and you take that. The idea of being live is a construct and almost a vanity of the electronic media now.

I can say from personal experience that such concocted conversations are not limited to pre-recorded field reports spliced in as if they were arriving in real time. At least sometimes -- and maybe often, I don't know -- pre-recorded "conversations" are cut-and-paste jobs, just as fake as a photoshopped picture of two people who never met.

I described (my memories of) one such experience in an earlier post ("Imaginary debates and stereotypical roles", 5/3/2006):

I wound up participating, on the air, in a vivid debate with my friend and colleague Cecil Coker in which we appeared to disagree fairly sharply on a topic that in fact we mostly agreed in being uncertain about. And curiously, though the [Radio Personality] spent half a day interviewing a half a dozen of us at length, the interviews had all been individual.

Thinking back over the experience, we realized that the RP had approached us from opposite sides of the question, and then stitched together bits of our answers. The interview technique was roughly like this:

RP: So, in short, we can say that it's now apparent that EITHER?
Me: Well, the answer isn't clear. To be fair, there's quite a bit of evidence pointing toward OR, such as X, Y and Z. At least some people think that way, though I don't find the arguments very convincing myself; and P and Q do seem to point towards EITHER.

[ . . . ]

RP: As I understand it, a lot of people have concluded that OR.
Cecil: Well, some people think so, but I'm not convinced. EITHER seems much more likely to me, because of P and Q.

and the broadcast "conversation" then consisted of a series of "exchanges" like this:

Me: There's quite a bit of evidence pointing towards OR, such as X, Y and Z.
Cecil: Some people think so, but I'm not convinced. EITHER seems much more likely to me, because of P and Q.

In that case, no real harm was done -- it's true that my views were distorted, but I doubt that anyone else cared or even noticed. Still, as NPR's Ombudsman Jeffrey Dvorkin said to John Solomon in 2003, in reference to the practice of splicing in pre-recorded segments from reporters in the field, this is "a morally slippery slope".

For centuries, print journalists have been asking leading questions and then presenting the answers in a completely different context; omitting crucial frames and qualifiers; juxtaposing quotes from different segments of an interview as if they had a rhetorical connection, thereby creating meanings that the speaker never intended; highlighting atypical asides from a long interview as if they were the subject's main reaction; and so on. And I'm not just talking about Fox News, but about the Washington Post and the New York Times.

It shouldn't come as a surprise that these techniques are also commonplace in the broadcast media. And the audio or video version can be even more misleading, because it seems to be a real person who is participating in a real conversation -- which never took place, or at least never took place in the form in which it's presented to listeners.

In news photography, analogous staging, cutting and pasting would be (I believe) a firing offense. But radio seems to be governed by the culture of the writer rather than the culture of the photographer -- for the obvious reason that considerable staging, cutting and pasting is usually required by the nature of the medium, in order to create a coherent program out of bits and pieces from various places and times.

In the case of print journalism, the better publications have well-defined standards, even if they're routinely violated in practice. There are similar words in (for example) the NPR News Code of Ethics:

8. NPR journalists make sure actualities, quotes or paraphrases of those we interview are accurate and are used in the proper context. An actuality from an interviewee or speaker should reflect accurately what that person was asked or was responding to. If we use tape or material from an earlier story, we clearly identify it as such. We tell listeners about the circumstances of an interview if that information is pertinent (such as the time the interview took place, the fact that an interviewee was speaking to us while on the fly, etc.). Whenever it's not clear how an interview was obtained, we should make it clear. The audience deserves more information, not less. The burden is on the NPR journalist to ensure that our use of such material is true to the meaning the interviewee or speaker intended.

(This applies specifically to NPR News -- I don't think it covers programs like On the Media, Marketplace, Day to Day, or All Things Considered. I haven't checked to see if these individual shows have similar statements of principle.)

I can't cite many examples of radio photoshopping that violate these standards, because I rarely have access to the raw materials from which the finished products were created. But standard audio production techniques -- the same ones involved in editing out stammers and flubs, or splicing pre-recorded reports into an apparently live feed -- make it as easy to do this sort of thing in radio as it is in print. And if you think that radio journalists haven't given in to the temptation from time to time, then you have a higher opinion of human nature than I do.

[Update -- David Warner sends in a link to this fictional example from Frontline, where the resulting radio version would have been perfect, but a small problem of wardrobe consistency spoils the video.]

Swimsuit semiotics

Posted by Mark Liberman at 07:50 PM

Block that organic metaphor!

From the 1923 Compton's Pictured Encyclopedia:

Language grows much as a tree grows--the big simple things first, like the roots of the tree; then more complicated things that reach up like the trunk and the branches; and next the thousands on thousands of little separate words, each like the others and yet different, like the leaves.

Unfortunately, I no longer have the volume of the encyclopedia from which this passage comes, so I can't say whether this is about the evolution of language, the history of an individual language, or the course of language acquisition, or several of these at once, nor do I know what are supposed to be the big simple, root-like, things and the more complicated, trunk- and branch-like, things in language.  But the writer is entirely clear that words come last.  This is a bizarre image.

Thanks to Steven Levine, who gifted me with this encyclopedia some time ago, so that I could plunder it for images and quotations to incorporate in the collages I make.  It's meant for children, and manages to be patronizing, sentimental, and breathlessly enthusiastic all at once.  Its coverage of other lands and cultures is heavy with Exotic Otherness, and it's especially taken with science and technology in the March of Progress.

Steven picks up odd things like this at garage and estate sales and gives most of them away to his friends.  Delightful.

Let me try to salvage a little something of linguistic interest from the encyclopedia passage: thousands on thousands of.  I would have used and rather than on for "extravagant doubling" of words referring to numerical units, though on doesn't strike me as incorrect.  It turns out that and is definitely the favored connector these days, followed by upon, then (usually) of, with on (usually) in last place, though of and on are pretty close.  In raw Google webhits for


The one surprise in here is the large number for billions of billions of -- most from discussions of astronomy, influenced no doubt by Carl Sagan's use of this version on several occasions.  Sagan really MEANT, and said, of, understood multiplicatively, rather than any of the other connectors, which are understood additively.  (He's often quoted, and parodied, as having said "billions and billions of stars", but he was really referring to a very much larger number than the few billions you'd get by adding some billions to some more billions.)

Dediu and Ladd again

I've been waiting to read the paper by Dediu and Ladd ("Linguistic tone is related to the population frequency of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin") that I mentioned a couple of days ago, but press coverage is increasing, and the paper doesn't seem to be up on the PNAS site yet. So I'll go ahead and make some initial observations based on their excellent "further information" site at Edinburgh, and my memories of a talk that Bob Ladd gave here at Penn a few months ago.

This research began (as I understand it) when Bob Ladd noticed that maps of the geographical distribution of two particular genetic traits (from a couple of articles published in Science in 2005) seemed similar to what he knew of the geographical distribution of lexical tone. So he worked with Dan Dediu to check this impression statistically. They looked at the geographical distribution of 26 linguistic traits (from Haspelmath et al.'s World Atlas of Language Structures) correlated with the geographical distribution of 1,000 genetic variants, yielding 26,000 correlations. The statistical distribution of these correlations looks like this:

[Figure not available: histogram of the 26,000 gene/language correlations, with an arrow marking the tone/gene correlation.]

And sure enough, as the arrow on the graph indicates, the correlation between the geographical distribution of lexical tone and the two genes of interest is way out in the tail of this bell-shaped curve of empirical gene/language correlations. This clearly tends to confirm Bob's original intuition.

But even in a distribution created by completely random effects, without any meaningful connections among the variables studied, something has to be way out in the tail.

There's a tricky point of statistical reasoning here, which is clear in general, although even experts often seem to wind up disagreeing about particular cases. If you started with 26,000 correlations, and no reason to expect any causal relations among the variables, and a situation (like geographical diffusion of genetic and cultural traits) where it's not clear what distribution of correlations to expect in the absence of causal connections, then you probably wouldn't draw any conclusions at all from the fact that some of the correlations were fairly high. On the other hand, if you thought that there might be some meaningful connections in the mix, then the tails of the distribution are a good source of examples to study further -- and this is the main conclusion that D & L draw from their results, quite properly.

If there is some initial reason to suspect a connection between two traits -- for example, if D & L had started from the discovery that the alleles in question had a significant effect on the relative psychophysical salience of pitch vs. timbre -- then the discovery of a strong geographical correlation would be a very strong result indeed. But in this case, what they started with was Bob's informal impression of a geographical correlation, and they confirmed this impression via a statistical demonstration that the correlation in question is indeed among the largest 1.4% of 26,000 correlations tested, i.e. among the top 360 or so. This proves that Bob Ladd has a good eye -- but have they in effect evaluated one trial of one hypothesis, or some much larger number of hypotheses and/or trials, as high as 26,000? Presumably it's somewhere in between.

An analogy may help to clarify this point. If I claim to have a trick (or lucky) penny, and we try it out together, and discover that it comes up heads 9 times out of 10, we'll conclude that my claim is probably valid. Theory aside, we could do this empirically by reference to a database of coin-toss experiments, using a wide variety of randomly chosen pennies, in which we find that 9 out of 10 heads or better only occurs about once in 100 trials.

But suppose, instead, someone points out to us that on a given Saturday during football season, in a collegiate league whose teams play in 10 games, 9 out of 10 of the referees' coin tosses came up heads, and asserts on this basis that the refs are crooked. Should we accept the argument? Probably not, since we're only checking this case because it struck someone as unusual. [More on this issue, in a more frivolous context, here.]
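For readers who want to check the arithmetic behind these two cases, here is a minimal sketch. It assumes fair coins and independent tosses; the function names and the values of `n_looks` are purely illustrative and do not come from any real database of coin-toss experiments.

```python
from math import comb


def tail_prob(n: int, k: int, p: float = 0.5) -> float:
    """Probability of at least k heads in n independent tosses with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def at_least_once(p_single: float, n_looks: int) -> float:
    """Probability that an event of probability p_single shows up in at
    least one of n_looks independent looks."""
    return 1 - (1 - p_single) ** n_looks


# One pre-announced test of one penny: 9 or more heads out of 10.
p = tail_prob(10, 9)  # equals 11/1024, i.e. about once in 100 trials

# Many league-Saturdays scanned after the fact (n_looks is a made-up number).
for n_looks in (1, 100, 1000):
    print(n_looks, round(at_least_once(p, n_looks), 3))
```

Under these assumptions a single set of 10 tosses reaches 9 or more heads with probability 11/1024, but if 100 independent sets are scanned, the chance that at least one of them does so is roughly two in three. That is the sense in which a case examined only because it looked unusual is weak evidence.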

Just to underline one aspect of this discussion, I did a trivial little simulation, in which individual mutations were placed at random on a 20x20 grid, and then died, reproduced and migrated at random for 35 generations. (The grid was configured as a manifold, so that if you migrate off of one side, you come back in on the opposite one.) I ran 100 traits independently, and then looked at the geographical correlations among their population frequencies:

(The number of correlations was actually less than the expected 4,900, because when a trait vanished from the population before the end of the simulation, I didn't include it in the final calculations. 26 out of 100 traits went extinct, so the number of correlations in the plot above is actually .5*74*74-74 = 2,664. This is also the reason for the preponderance of higher positive correlations over negative ones, I believe, since unusually successful traits tend to have a high frequency everywhere, whereas unusually unlucky ones simply vanished. The larger number of small negative correlations is (I think) the natural consequence of limited diffusion from geographically-random point mutations.)
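A simulation of this general kind can be sketched in a few lines. What follows is not the code behind the plot: only the 20x20 wrap-around grid, the 35 generations and the 100 independent traits come from the description above. The number of founders per trait, the offspring probabilities and the migration rule are assumptions chosen for illustration, so the share of traits that go extinct and the shape of the resulting distribution will differ from the figures quoted here.

```python
import random
from itertools import combinations
from math import sqrt

SIZE, GENERATIONS, N_TRAITS = 20, 35, 100  # values taken from the text
N_FOUNDERS = 5                             # assumption: initial carriers per trait
OFFSPRING_WEIGHTS = (0.2, 0.5, 0.3)        # assumption: P(0, 1, 2 offspring)
STEPS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # assumption: stay or move one cell


def simulate_trait(rng):
    """Return per-cell carrier counts (flattened grid) after GENERATIONS."""
    carriers = [(rng.randrange(SIZE), rng.randrange(SIZE)) for _ in range(N_FOUNDERS)]
    for _gen in range(GENERATIONS):
        next_gen = []
        for x, y in carriers:
            # Each carrier dies and leaves 0, 1 or 2 offspring at random.
            n_kids = rng.choices((0, 1, 2), weights=OFFSPRING_WEIGHTS)[0]
            for _kid in range(n_kids):
                dx, dy = rng.choice(STEPS)
                # Wrap around the edges, so leaving one side re-enters on the other.
                next_gen.append(((x + dx) % SIZE, (y + dy) % SIZE))
        carriers = next_gen
    counts = [0] * (SIZE * SIZE)
    for x, y in carriers:
        counts[x * SIZE + y] += 1
    return counts


def pearson(a, b):
    """Pearson correlation of two equal-length lists, or None if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


def run(seed=0):
    """Simulate all traits, drop extinct ones, correlate every surviving pair."""
    rng = random.Random(seed)
    traits = [simulate_trait(rng) for _ in range(N_TRAITS)]
    survivors = [t for t in traits if sum(t) > 0]
    rs = [pearson(a, b) for a, b in combinations(survivors, 2)]
    return len(survivors), [r for r in rs if r is not None]


if __name__ == "__main__":
    n_alive, rs = run()
    print(n_alive, "surviving traits,", len(rs), "pairwise correlations")
```

The sketch correlates raw per-cell counts rather than frequencies; since Pearson correlation is unchanged by rescaling, that makes no difference as long as each cell is assumed to hold the same total population. None of the traits is connected to any other by construction, so any large correlations in the output arise by chance alone.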

I suspect (though I haven't shown) that by jiggering the parameters of the simulation, you could get a frequency-distribution of geographical correlations rather like the one that Dediu and Ladd found, without assuming any meaningful connection at all between genes and linguistic traits. And of course there really are some connections, if only because of linguistic endogamy.

The point is not in any way to debunk Dediu and Ladd -- I don't think that they would disagree with what I'm saying here, and I certainly don't disagree with their conclusions, as I understand them from what I've read so far. Their results are suggestive, and well worth following up. Indeed, I think that the other few hundred top-ranked gene/language correlations should be investigated as well!

But unsurprisingly, the press response is full of headlines like "Speaking in tones? Blame it on your genes" and "Genes may help people learn Chinese". It's worth noting that if there's a causal connection here, it need not have anything to do with the relative ease of learning tonal distinctions. In the talk that he gave here, Bob Ladd speculated that perhaps there is a difference in the relative psychophysical salience of the "micromelodies" that are universal and inevitable consequences of consonant and vowel articulations in every language, leading to differences in the propensity to create tonal contrasts by re-interpretation of segmental contrasts, as happened repeatedly in the reconstructed history of East Asian and Meso-American tone languages.

[Update -- a response by Ladd and Dediu is here.]

Loop until kthxbye

Now there's lolcode:

		BTW this is true
		BTW this is false

In addition to the lolcode site itself, which of course includes a wiki, you should check out the scholarly appreciation at Notes from a Linguistic Mystic ("im in ur programmz, codin in ur dialect: LOLCode and Feline Dialectology", 5/29/2007):

Periodically, one goes through periods of deep metaphysical malaise. You look around at the world, wondering how such evil could flourish and such suffering could endure. You descend deeper into darkness, your faith in humanity waning, wondering why we were ever born into this cruel world. Then, suddenly, you realize that somebody has written a programming language based off of the dialect of Lolcats/Cat Macros, and your faith in humanity’s inherent good is completely restored.

Indeed. Breathes there a soul so dead, whose spiritual batteries are not recharged by reading

      YA RLY
          HALP "Var is too little!"!

So far, no one seems to have taken up the challenge to create an object-oriented lolcode ("lolcode++"?) or a functional lolcode ("lolcaml"?), but I'm not certain of my ability to track memetic evolution as we approach the lolsingularity.

[THX to approximately ℵ0 readers.]

[Update -- Chris Casinghino writes:

I saw your post about lolcode today. While I don't know of a complete functional language in this vein, you might enjoy this recent post to one of the Haskell mailing lists, which defines what might be called a lolmonad:

Several posts in the thread expand on the idea

And in the same vein, Zohar Kelrich writes:

In your recent Language Log posting on FOP (Feline-Oriented Programming) you mentioned the lack of functional-lolcode. Actually, research into functional cats has already started, as can be seen here.

This is clearly an active area of research, rapidly advancing towards the lolsingularity.]

Omit needless

My adventures in the Practical Survival Skills area of the advice literature on English -- material that gives you some small number of crucial mistakes to avoid in business or other professional writing, or just in writing in general -- have taken me to some curious places.  Occasionally I discover "mistakes" I hadn't noticed before.  For instance, in a website on "10 Grammar Mistakes That Make You Look Stupid" (version of 5/23/05), there are the ten mistakes -- almost all of them common spelling errors that a spellchecker won't find (like loose for lose, they're/their/there, and then for than) -- plus two peevish extras: hit and miss for hit or miss, and this one, appended to the then/than entry:

Note: Here's a sub-peeve. When a sentence construction begins with If, you don't need a then. Then is implicit, so it's superfluous and wordy:

No: If you can't get Windows to boot, then you'll need to call Ted.
Yes: If you can't get Windows to boot, you'll need to call Ted.

Omit needless then!

(A very big hat tip to Doug Kenter, who pointed me to the "Make You Look Stupid" site.)

First, a few words on hit and miss.  "Make You Look Stupid" attacks it on the basis of logic:

At some point, who knows when, it became common practice to say that something is "hit and miss." Nuh-UH. It can't be both, right? It either hits or it misses... "Hit OR miss."

Granted, it's a small thing, a Boolean-obsessive sort of thing. But it's nonetheless vexing because it's so illogical.

(By the way, appeals to "logic" in usage matters are almost always flags indicating gaps in reasoning, suppressed premises, and the like.)

A user of hit and miss can counter that this version of the idiom makes more sense than the version with or: if something is hit and miss, sometimes it hits AND sometimes it misses.   In fact, two recent dictionaries distinguish two different idioms:


hit-and-miss ... Sometimes succeeding and sometimes not.

hit-or-miss ... Marked by lack of care, accuracy, or organization; random.


hit-and-miss done or occurring at random ...

hit-or-miss ... as likely to be unsuccessful as successful ...

Whoops!  Essentially the same meaning distinction, but with the meanings assigned in opposite ways.  My guess is that we're looking at a lot of variation between speakers here, with some preferring one version (in both meanings), some preferring the other version (in both meanings), some differentiating the meanings one way, and some differentiating the meanings the other way.  As for Google, the raw webhits are pretty much a dead heat, with the two versions each a bit over a million hits.

In any case, we're dealing with idioms here, and idioms by definition aren't entirely "logical", that is, semantically compositional.  Not something you'd want to develop a "burning pet peeve" (in the blogger's words) over.

I've looked at a sampling of the advice literature for opinions on hit and/or miss, without finding anything.

A similar search for advice on conditional then is slightly, but only slightly, more successful.  Nothing in MWDEU, Garner's MAU, Bernstein's Careful Writer, the AHD Guide to Contemporary Usage and Style, Follett, Follett/Wensberg, any of the editions of Fowler (old original Fowler, Gowers's Fowler, Burchfield's Fowler), Morris & Morris, etc.  Not even in the 1918 advice of Prof. Omit Needless himself, Will Strunk.

Evans & Evans, Dictionary of Contemporary American Usage, has the mildest of warnings:

When a condition is introduced by if, the conclusion may be introduced by then, as in if he said it then it must be true.  As a rule, these sentences are more forceful when then is not used.

But Mark Davidson's Right, Wrong, and Risky (2006:309) just says no -- well, almost always no:

if-clauses should not be followed by then ... If you add the word then... you'd be adding a word that contributes nothing to your message.  And you might be contributing momentary confusion, because the word then in that type of sentence could be interpreted to mean either "as a consequence" or "at that particular time."

The then in such sentences is permitted by The Columbia Guide to Standard American English when the introductory if clause is "very long"...

Actually, what Wilson's Columbia Guide (1993:234) says is:

Then is required for the main clause of a complex sentence whose subordinate clause begins with if only when the subordinate clause is very long ...

so Wilson doesn't say it's permitted in these cases -- which would implicate that it's not permitted elsewhere -- but that it's required only in these cases -- which entails that it's permitted elsewhere.

Finally, Fiske's Dictionary of Disagreeable English (2006:189) is as forthright as Davidson in labeling conditional then as an error, though he softens the punch:


Solecistic for if. ... DELETE then.

In certain mathematical or computer expressions, if...then is the necessary expression; in prose, the understood then, when explicitly stated, is often an encumbrance to grace and elegance.

That's my current crop of opinions on conditional then.  "Make You Look Stupid", Davidson, and Fiske object to it, explicitly or implicitly, as needless, and Evans & Evans and Fiske express reservations on aesthetic grounds. 

Davidson also finds a potential ambiguity in conditional then, but my experience is that if you want to object to a usage, you can almost always find a potential ambiguity to complain about -- because potential ambiguities are everywhere, in almost every part of almost every sentence.

Turning now to Omit Needless Words (ONW).  Part of its popular appeal comes from the assumption -- a piece of language ideology, resting on the so-called Conduit Metaphor -- that what language is for, all that language is for, is to convey messages from one person to another.  Material that doesn't serve this purpose is then just so much dead weight; it "contributes nothing to your message" (Davidson), it's "superfluous" ("Make You Look Stupid"), so it should be excised.

Whenever you see an appeal to ONW, you should wonder what people are doing with those "needless" words.  Most of the time, those extra words are serving some function that conflicts with brevity; they're doing some work.  Conditional then marks the consequent clause of a conditional sentence in parallel to the way that if marks the antecedent clause, so what it does is to clearly indicate sentence and discourse organization -- and this can be a distinct service to the reader or hearer.

That's not to say that you should ALWAYS use conditional then.  That would be silly advice, just as silly as requiring that the complementizer that should ALWAYS be used (rather than omitted) in sentences like I know that I can't fly (vs. I know I can't fly).  The point is to have alternatives, which can do different things on different occasions.

I suspect that I'm not a very heavy user of conditional then.  Often I'm happy to do without.  I might even find it inelegant on occasion.  But there are times when it's a good thing.  Virginia Tufte, Artful Sentences, p. 226, gives a particularly "apt use" (her words) of if ... then conditionals in a passage from Ralph Cohen's "Do Postmodern Genres Exist?", which begins:

If we wish to understand ..., then we need names ...

If we wish to study ..., then genre study helps us ...

The parallelism thus established, Cohen continues without conditional then:

If we seek to understand ..., genre theory provides ...

If we wish to analyze ...,  genre theory provides ...

Artful indeed.

Liberty, Equality, Hypocrisy?

Reading Mark's post about France's refusal to implement any language-rights legislation reminded me of a bit of weirdness from the discussion of the UN's 'International Year of Languages' resolution, remarked on in the Eurolang article about it:

'Meanwhile, there was refreshing news for our Breton, Basque, Occitan and Corsican readers when the representative from France said, "The right to use your own language, the capacity to communicate and, therefore, to understand and be understood, the preservation of an inheritance that dates back centuries or even millennia, should be of prime importance to the United Nations." '

So the French government is perfectly happy to have their guys spouting pro-minority-language-rights rhetoric on the floor of the UN, and endorsing symbolic gestures like the declaration of an International Year of Languages, but refuses to implement any actual legislation to that effect!

The full (but short) article that the quote is from is now most accessible in the ILAT archives here.

Update: Reader Émilie Pelletier writes in to say,

Your post on Language Log today reminds me of something I read in Morvan Lebesque's Comment peut-on être Breton? I don't remember the exact words, but he was mentioning the big support France was giving, in the 1970's, to French-speaking Quebecers so that they could speak their own language. When Bretons said to the French government that their own situation was very much like that of Quebecers, the French authorities' alleged response was: "But no one is preventing you from speaking French!"
Posted by Heidi Harley at 01:19 PM

Genes and tones

An interesting paper by Dan Dediu and D. Robert ("Bob") Ladd, "Linguistic tone is related to the population frequency of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin", will be published today at PNAS (it doesn't seem to be up there yet). The authors have put an accessible explanation up on the web here -- this is a wonderful idea, something scientists should always do in the case of research that is likely to get significant play in the popular press.

I don't have anything to add to their discussion, pending close study of the paper and (more likely) observation of the press coverage, so you should just go read the whole thing.

Posted by Mark Liberman at 12:13 PM

A word for it

This morning, Dennis Preston wrote on the American Dialect Society mailing list:

Funny how people talk ways they claim they don't; I've even had them deny stuff I've recorded (or even shown them on spectrograms).

Well, not too funny when you consider the reasons.

To which I replied:

Probably everybody who's collected data systematically or taught a course in which variation played a role has come across this phenomenon.  Does anyone have a name for it?

(A related phenomenon -- Do As I Say, Not As I Do -- is familiar to anyone who's looked at the advice literature.  An adviser will sternly proscribe some variant -- restrictive relative which, for example, or logical rather than temporal since, or a pronoun with a possessive as antecedent -- and then use it, often within sentences of the proscription's formulation.)

But a label, doctor, a label!

So I answered my own question:

If we want a suitably scientific-sounding label, we could base it on anosognosia (coined by Babinski in 1914, as French anosognosie), which the OED defines as 'unawareness of or failure to acknowledge one's hemiplegia or other disability.'  (It's usually the result of right brain injury of some kind.  In my partner Jacques's case, the cause was radiation.)  The word has the parts:

negative a- + noso- 'disease' + gnos-(os) 'knowledge' + -ia

We can then replace the Greek stem noso- with some more appropriate one, like praxi- 'performance, practice'.  Ta-da!

apraxignosia 'unawareness of or failure to acknowledge one's actual practice'

Spread the word.

(Oh yes, the Babinski of plantar reflex fame.)

[Addendum: Roger Shuy writes about another kind of case:

I've found over and over again that people who have been surreptitiously tape recorded in sting operations often refuse to admit that they said the inculpatory stuff on the tape in spite of what the recording clearly shows that they said. They usually claim that the government tampered with the tape. In rare occasions, the tape has been tampered with, but 99% of the time it hasn't.

This looks like something a bit different from Dennis Preston's observation, where people don't seem to believe that they sound the way they were tape recorded (usually their phonetics, I assume, although perhaps grammar as well). In the sting tape cases, the person denies the CONTENT of the statements, claiming that they never could have said such things.

Your suggested label, apraxignosia, seems to fit both categories.

Preston was talking about phonetic variation, but the observation applies to all sorts of variants.]

[Further addendum: for those not inclined towards Greek-derived terms, Alex Baumans suggests calling it the Bart Simpson Syndrome: "I didn't do it!"  Meanwhile, Suzanne Kemmer suggests a name for the study of apraxignosia, a.k.a. Bart Simpson Syndrome: denialectology.  Wonderful.]

This is a country that has failed to implement any of the recommendations of a six-year-old report to a U.N. committee dealing with key human rights issues; a country that stands almost alone in refusing to ratify international agreements on the same topic. Are we talking about the United States? Or Israel? Or perhaps Iran, or Syria, or Zimbabwe?

No, it's France. At least I think this is what's going on -- the only reports that I've been able to find are a 5/22/2007 story by David Hicks on the Eurolang site ("France fails to implement UN recommendations on 'regional' languages"), which seems to be the same as an article dated 5/23/2007 on the site of EBLUL (the European Bureau for Lesser-Used Languages); and a 5/16/2007 story in Le Journal du Pays Basque ("Un rapport accablant sur la politique linguistique en France remis à l’ONU").

From the article by Hicks:

EBLUL France are at the UN in Geneva this week calling for the implementation of its 2001 Report on ‘regional’ language rights in France to the UN’s Committee on Economic, Social and Cultural Rights. Six years later and France has failed to implement any of the Report, despite UN recommendations to do so.

One of the specific recommendations in 2001 was to ratify the Council of Europe’s Framework Convention on the Protection of National Minorities (FCNM) and Charter on Regional or Minority languages (ECRML). According to Hicks, France is now one of three (out of 47) affected countries who have not done so: the other two are Turkey and the Principality of Andorra.

Turkey's festering issues with linguistic minorities are well known. A few months ago, Bill Poser noted that "Turkey continues suppression of Kurdish", and asked "Do they really think that no one will notice, or do they just not care about joining the European Union?" But maybe this stuff matters less to the Europeans than he thinks, or at least to the French. In the case of the Principality of Andorra, the crucial factor seems to be that "Andorra is a co-principality with the President of France and the Bishop of Urgell, Spain as co-princes", as the wikipedia article explains.

You can read more about this in the 2007 report of EBLUL-France. I don't know, really, how reasonable EBLUL-France's demands are, or how accurately they describe the French government's (lack of) response. Their side of the story certainly makes sense, but I haven't been able to find any response from the French authorities, or any discussion of the issues by third parties. As far as I can tell from a series of web searches, the report in Le Journal du Pays Basque is the only French-language news item on the EBLUL-France report that has appeared anywhere in the world. On behalf of the citoyens of la France, the journalistes have apparently decided that this is a non-story.

[Update -- Geraint Jennings (Maître-Pêtre des Pages Jèrriaises) writes:

One of the reasons why the EBLUL report hasn't made waves of tsunami proportions is that the constitutional prohibition on recognition of regional languages of France is well known. All the associations and activists, civil servants and politicians have been through the arguments so many times over the years. Out of the three frontrunners in the presidential election, basically Royal and Bayrou were in favour of ratifying the European Charter (by means of constitutional amendment if necessary), and Sarkozy against (while sitting on the fence by sounding positive about regional languages but denying the need for legislation or constitutional change).

The story is rumbling on in local media during the ongoing parliamentary elections, with local candidates being asked to declare the extent of their support or otherwise for the regional languages of the areas they're standing in. Two weekends ago, I myself was part of a delegation from Jersey invited by Norman-language associations to contribute to a debate on the language question with candidates standing in mainland Normandy (we were there really to be able to provide our experience of running an official state-funded language office).

Have you seen the website? Worth a look for the ongoing candidate signatories on the regional languages question.]


Posted by Mark Liberman at 07:53 AM

Structural Ambiguity In the Courts

One of the hazards of rearranging books is that it is nearly impossible to pick up a book without opening it and reading a bit. This may be pleasant and instructive, but it does rather slow the process down. Anyhow, while attempting to arrange it I opened Peter Hay's The Book of Legal Anecdotes (New York: Facts on File, 1989) and at p. 256 came upon his account of the following discussion of the structural ambiguity of English noun phrases in a South Carolina court at the end of the nineteenth century. The exchange is between attorney James L. Pettigrew and a judge, who interrupted Pettigrew as soon as he stood to speak:

"Mr. Pettigrew, you have on a light coat. You cannot speak, sir."

"May it please the court, Your Honor," Pettigrew replied, "I conform to the law."

"No, Mr. Pettigrew, you have on a light coat. The court cannot hear you."

"But, Your Honor," the lawyer insisted, "you misinterpret. The law says that a barrister must wear 'a black gown and coat', does it not?"

"It certainly does," said the judge.

"And does Your Honor hold that both the gown and the coat must be black?"

"Certainly, Mr. Pettigrew, certainly, sir."

"And yet it is also provided by law", Pettigrew continued, "that the sheriff must wear 'a cocked hat and sword', is it not?"

"Yes, yes", the judge replied impatiently.

"And does the court hold", Pettigrew went on calmly, "that the sword must be cocked as well as the hat?"

"Eh - er - h'm", mused His Honor, "you - er - may continue your speech, Mr. Pettigrew."

Note: Hay gives Pettigrew's name as Pettigrue, but it is spelled Pettigrew in all of the other sources I have seen.

Memorial Day

According to the wikipedia entry for Memorial Day, the observances began at the end of the Civil War, but the name wasn't used until 1882:

These observances eventually coalesced around Decoration Day, honoring the Union dead, and the several Confederate Memorial Days. [...]

The alternative name of "Memorial Day" was first used in 1882, but did not become more common until after World War II, and was not declared the official name by Federal law until 1967.

The earliest instance of "Decoration Day" in the NYT archives seems to be from May 19, 1869; but a search for "Memorial Day" turns up this touching article from June 7, 1868, under the headline "An Incident of Memorial Day":

The Layfayette (Ind.) Courier, in its account of the decoration of soldiers' graves in that city on the 30th ult., says a wreath of flowers, accompanied by a note from a little girl about ten years of age was exhibited. The note was addressed to Col. Leaming, Chairman of the Committee of Arrangements, and was as follows:

Col. LEAMING: Will you please put this wreath upon some rebel soldier's grave? My dear papa is buried at Andersonville, and perhaps some little girl will be kind enough to put a few flowers upon his grave.        JENNIE VERNON

The reading of the note created a profound impression, the wreath was deposited upon the grave of an unknown rebel soldier -- the only one remaining in the cemetery.

Posted by Mark Liberman at 02:38 PM

Wait till next year

Darn. I missed Towel Day, which was last Friday, May 25.

A towel ... is about the most massively useful thing an interstellar hitch hiker can have. Partly it has great practical value - you can wrap it around you for warmth as you bound across the cold moons of Jaglan Beta; you can lie on it on the brilliant marble-sanded beaches of Santraginus V, inhaling the heady sea vapours; you can sleep under it beneath the stars which shine so redly on the desert world of Kakrafoon; use it to sail a mini raft down the slow heavy river Moth; wet it for use in hand-to-hand-combat; wrap it round your head to ward off noxious fumes or to avoid the gaze of the Ravenous Bugblatter Beast of Traal (a mind-bogglingly stupid animal, it assumes that if you can't see it, it can't see you - daft as a bush, but very ravenous); you can wave your towel in emergencies as a distress signal, and of course dry yourself off with it if it still seems to be clean enough.

On the good side, this gives me almost a whole year to think of linguistic applications.

For syllepsologists only

This morning's mail included a note from Daniel Hyde, with a lovely example of WTF coordination from the English-language feed of China Radio International ("'Mao Zedong Actress' Astonishes Chengdu Citizens", 3/28/2007):

Chen Yan startled pedestrians when she impersonated Chairman Mao on Chunxi Road, one of the most crowded commercial streets in Chengdu. Most onlookers thought the costumed Chen Yan looked considerably like the leader.

Chen Yan said at first she only wanted to imitate the acclaimed Chinese actor Tang Guoqiang. But one of her friends, who works as a dresser for a local performance group, suggested she imitate Mao Zedong since she looks more like him.

She said although her facial features are fairly similar to Mao Zedong's, she still needs to don "special makeup," which takes at least four hours and costs 2,000 yuan each time.

The pair of leather shoes worn by Chen Yan also attracted attention. They are tailor-made and heightened by 30 centimeters in the heels, making both Chen Yan look taller and it more difficult for her to walk. [emphasis added]

[Update -- John Cowan writes:

This sort of thing is to be expected from sinophones, because Chinese doesn't have a fixed rule for ergative or accusative coordination. Here's a bit from my Cthulhu-based tutorial on ergativity:

Chinese, on the other hand, is neither ergative nor accusative. Sentences that literally translate to "Cthulhu dropped the watermelon and burst" and "Cthulhu dropped the watermelon and was embarrassed" are equally valid, and Chinese speakers interpret them according to common sense. That is, "...the watermelon burst" (ergative) and "...Cthulhu was embarrassed" (accusative). Only the latter one reads properly in the English translation because English isn't ergative.

(I'm following Randy LaPolla here, who has done the heavy lifting on Chinese.)

I would not have guessed that "Cthulhu-based tutorial on ergativity" was a word sequence likely to be instantiated. The world is full of wonders.]

[Update -- Elizabeth Zwicky writes (under the Subject line "Never mind the coordination error -- what about the SHOES?"):

Umm, "heightened by 30 centimeters"? No doubt it's difficult for her to walk. Particularly if it's just the heels. 30 centimeters is close to a foot! Those aren't shoes, they're very small stilts. Or another translation error.

Indeed. "Small stilts" is what I assumed. Per aspera ad astra.]

Posted by Mark Liberman at 10:06 AM

New frontiers in temporal logic

According to Dave Anderson in the NYT ("Sometimes it's over even when it ain't", 5/28/2007), Joe Torre put the Yankees' performance so far this season (won 21, lost 27; 12 1/2 games behind Boston) into historical perspective by observing:

We’ve been bad early in other years, but not this late.

If you think about it, you'll see that this is perfectly logical, though in a different way from Yogi Berra's equally logical comment on afternoon shadows in Yankee Stadium's left field: "It gets late early out there."

Posted by Mark Liberman at 08:38 AM

In Memoriam: Karl van Duyn Teeter

One of my teachers, Karl Teeter, Professor Emeritus of Linguistics at Harvard University, passed away on April 20 at his home in Cambridge, Massachusetts at the age of 78. A native of Lexington, Massachusetts, he received his Ph.D. from the University of California at Berkeley and spent the rest of his career at Harvard, as a Junior Fellow from 1959 to 1962, and then as a member of the Linguistics faculty until his retirement in 1989.

His education was in some ways unusual. He dropped out of college and joined the US Army, which sent him to Japan as part of the Occupation force. There he fell in love with Japanese, but as he lived on a military base had only limited exposure to it. Officers were entitled to live off-base with their families, but as an enlisted man (a Supply Sergeant) he at first thought that there was no way to arrange to live off-base. Then he discovered that although the Army did not bring the wives of enlisted men over, enlisted men were nonetheless entitled to live off-base if their family was present. He arranged for his wife Anita to join him at his own expense, then requested off-base housing. He and his wife ended up living in a Japanese house, which both from the point of view of language acquisition and in other respects he found a great improvement on the barracks.

On leaving the Army he returned to college and graduated with a degree in Oriental Languages before entering graduate school in Linguistics. He is known primarily for his work on Algic languages, but he knew Japanese well and though he published little on it, retained an interest in Japanese linguistics, especially dialectology, throughout his life.

Like most Berkeley students of the time, Karl focussed on the native languages of the Americas. His dissertation, supervised by Mary Haas, was a description of Wiyot, a language of Northern California that would soon be extinct. Wiyot was not completely unknown - in addition to a few minor works Gladys Reichard had published a grammar in 1925 - but Karl's work added immeasurably to our knowledge of Wiyot.

The speaker he worked with, Della Prince, who passed away in 1962, is usually listed as the last speaker and the only one still alive when Karl began his work. Actually, Karl told me, there were two remaining speakers. In addition to Mrs. Prince there was an old man who could speak Wiyot. Karl tried to meet him, but he was unwilling. His son, who could not speak Wiyot, wanted his father to work with Karl in order to preserve the language, but the old man had experienced so much discrimination during his working life that, now that he was retired and did not need to deal with white people, he refused to have any contact with them.

Karl's new material on Wiyot was of interest not only for its own sake but for the light that it shed on the long-standing Ritwan controversy, perhaps the paradigm case of the establishment of remote linguistic affiliation. This controversy concerned the 1913 proposal by Edward Sapir that Wiyot and Yurok, another language of Northern California, were related to the Algonquian languages, together forming a larger language family known as "Algic". Such a relationship was unexpected since the Algonquian languages are concentrated in the Northeastern United States and Eastern Canada, with the closest Algonquian language a good 1000 km from California. The controversy is known as the "Ritwan" controversy because of the proposal that Wiyot and Yurok together form a group dubbed "Ritwan".

The evidence that Sapir put forward was weak and the proposal was opposed by Truman Michelson, the leading Algonquianist of the time. After a brief exchange the debate subsided, and for many years the question was considered unresolved. It was finally resolved in the late 1950s and early 1960s due to the new data provided by Karl's work on Wiyot and field work on Yurok by R. H. Robins and Mary Haas, together with new analysis and argumentation. The first public step was the publication of Mary Haas' 1958 paper "Algonkian-Ritwan: The End of a Controversy", which for the first time put forward extensive regular phonological correspondences between Yurok, Wiyot, and Algonquian, including many proposed by Karl in unpublished work. What clinched the case was the grammatical evidence discovered by Karl and by Ives Goddard. (The details may be found in my paper On the End of the Ritwan Controversy.)

Karl was also one of the relatively few linguists who successfully bridged the transition between the Bloomfieldian tradition of American Structuralism and generative grammar. He became an advocate of generative grammar, but understood its predecessor well and retained some sympathy for it. When as an undergraduate I read the papers of Bernard Bloch on Japanese and wondered at some of the seemingly very odd things that Bloch said, Karl was very helpful in explaining why Bloch felt compelled by the combination of the facts of Japanese and his theoretical assumptions to draw the conclusions that he did. I always thought that Karl would have made an ideal author for a history of American structuralism, but other topics, especially Wiyot, occupied his time. You've got to like the title of one of the few things he wrote in this area, his 1964 paper "Descriptive linguistics in America: Triviality vs. irrelevance".

Karl was a very nice man who always tried to be tactful. One of the odder tasks with which he helped me as a student was in composing a footnote that demonstrated that I had read a paper that appeared to be relevant to my own work but was unable to comment on it because, as best as I could tell, it was incoherent and unintelligible. It isn't easy to say that nicely.

Karl continued his work on Algic with fieldwork on Malecite-Passamaquoddy, an Algonquian language spoken in Maine and New Brunswick. A volume of the Maliseet texts that he recorded, Tales from Maliseet Country: The Maliseet Texts of Karl V. Teeter, translated and edited by Philip LeSourd, was published earlier this year by the University of Nebraska Press. He also continued to work on Wiyot. His two-volume Wiyot Handbook was published in 1993. The Wiyot lexicon on which he worked for many years remains unpublished.

Lol Vincit Omnia

I'm way behind in posting readers' contributions of linguistic cat macros -- sorry to all -- but right now, you need to go read "I can hath cheezburger?" over at Geoffrey Chaucer Hath a Blog. Here's how he describes his decision to create a set of "Lolpilgrimes" (his Miller is over on the right):

Al of my transportation of sundrie materials and makynge of accomptes hath left me but litel tyme for writing. Ywis, it hath left me but litel tyme for food, sleep or breathinge. And yet in this derke tyme of sorwe and tene, ich haue foond much deliit in the merveillous japeries of the internet. No thyng hath plesed me moore, or moore esed myn wery brayne than thes joili and gentil peyntures ycleped “Cat Macroes” or “LOL Cattes .” Thes wondirful peintures aren depicciouns of animals, many of them of gret weight and girth, the which proclayme humorous messages in sum queynte dialect of Englysshe (peraventure from the North?). Many of thes cattes (and squirreles) do desiren to haue a “cheezburger,” or sum tyme thei are in yower sum thinge doinge sum thinge to yt.

Posted by Mark Liberman at 05:38 PM

Before nary an overnegation could be uttered

Stanford Daily columnist Alex Coley on 5/21/07:

But otherwise, the most entertainment I had on Friday was watching the makings of a fight shape up outside of Exotic Erotic, and, before nary a punch was thrown, listening to a group of four or five security forces as they quickly mobilized into action: "What? A fight? I'm there!" and "Right behind you Jimmy!"

Coley is reporting on two spans of time.  During the first, no punches were thrown (though, clearly, words were exchanged).  During the second, immediately following, the campus police mobilized into action.  Coley could have written

(1a) ... before a (single) punch was thrown ... a group of four or five security forces ... quickly mobilized into action ...

(1b) ... before any punches were thrown ... a group of four or five security forces ... quickly mobilized into action ...

or something like

(2) ... nary a punch had been thrown ... when a group of four or five security forces ... quickly mobilized into action ...

In (1) the negative proposition 'no punches were thrown' is conveyed implicitly via the subordinator before; this is easy to see from the negative polarity elements single in (1a) and any in (1b).  In (2) the negative is explicit in the modifier nary a 'not a single'.  Coley's original has both the implicit negative before and the explicit negative nary a, producing the overnegated "before nary a punch was thrown" 'before not a single punch was thrown'.  Probably Coley threw in nary a for its emphatic effect; the impulse towards emphasis often leads people into overnegation.  And the rather literary nary a stands out so much that the reader could easily fail to appreciate the implicit negation in before (after all, most occurrences of subordinating before have no negative import); it took me a moment, in fact, to appreciate that the original sentence was somehow odd.

(Hat tip to Pat Callier.)

Kids these days

Even in ZippyWorld, things are going to hell in a handbasket.  It's those kids, with their video games and their filthy talk:

One word, one meaning

Pimp my ride, but not that way:

Ah, if only each word (like pimp) had only one meaning and each meaning (like 'snow') were expressed by only one word!  It's the dream of the Ideal Language.  Things are different in the real world, and that's a good thing.

(Hat tip to Paul Falvo.)

The station house writing seminar

Bizarro: when cops get tough on style:

The apotheosis of bad science at the BBC

Yesterday, Ben Goldacre at Bad Science slagged the BBC again ("Wi-Fi Wants to Kill Your Children", 5/26/2007), for a Panorama piece based on the views of someone who quite literally sells tinfoil hats on his website. You couldn't make this stuff up. As one of Ben's commenters put it (edited for spelling and punctuation):

It's the perfect bad science story. It absolutely has it all. A made-up scare, non-scientist scientists, crap experiments, misrepresented results, smears of dissenters, silly scientology-like ‘technology’ with dials that go up to ‘red’, and to cap it all, its very own Panorama programme to give it some tabloid legs.

I'd like to point out that the mutant frog story that Ray Girvan uncovered a few years ago ("The alleged three-headed frog") enacted a similar drama, though on a smaller stage. As I observed at the time ("More junk science from the BBC", 3/10/2004):

[C]onfronted with three frogs mating, found by "children in a nursery", the Beeb's expert biologist Dilger said "I have never seen anything like this", and "it could be an early warning of environmental problems." The first statement was no doubt true, but he might have continued "but of course I don't know anything about frogs", instead of taking an allegedly expert poke at environmental problems. Though there are no doubt many environmental problems afflicting amphibians, this seems to be a pretty clear example of the tendentious, politically-driven "reporting" [typical of the BBC in other areas].

And despite Ray's extensive evidence that the BBC's biology reporter had been taken in by a silly non-story ("multiple amplexus, typical frog and toad mating behaviour"), the article is still up on their website today, with no correction or retraction ever issued. In fact, the Beeb's text-media website apparently never does corrections or retractions, at least openly, although they occasionally indulge in a Kremlin-style silent removal of part or all of a particularly egregious piece of nonsense.

The theme of "bad science at the BBC" is one that we've been following here for a while, though we only comment on language-related stories. (Well, there have been occasional side trips into frog sex and transmutation of wood chips, but only because of resonances with stories that are on our beat, like telepathic parrots and cow dialects.) You may be asking yourself, why do those Language Loggers care?

Speaking for myself, I can tell you that the basic answer is the one that summarizes most blogging motivation: "it's fun". The BBC's unique combination of upper-crust pomposity and tabloid-like credulity is irresistible as a source of shared amusement.

But poking fun at the Beeb also has some redeeming social value, I think. It's bad when a powerful institution tolerates ignorance and carelessness, and it's worse when it gives sensationalism and ideology precedence over truth. This applies to the British Broadcasting Corporation just as it does to the commercial media or to the American government.

I predict that this will not be our last post on the subject.

[Update -- Aidan Kehoe writes:

Your pomposity meters need recalibration. The modern BBC is not the organisation of Alvar Liddell; there remain a few holdouts from then, but Patrick Moore and David Attenborough (whose science I’m sure you won’t be faulting any time soon) are the exceptions, not the rule.

Well, I do recognize that there have been some changes in British social structures in general, and the BBC's relationship to them in particular. But what I mainly had in mind was not the class affinities of broadcasters' accents (though what I hear on the BBC World Service, as carried in the U.S. by NPR, doesn't sound especially proletarian to me). I was thinking of the aura of noblesse oblige that seems to permeate the whole enterprise. Here are a few quotes from the organization's self-presentation:

The BBC exists to enrich people's lives with great programmes and services that inform, educate and entertain. Its vision is to be the most creative, trusted organisation in the world.

It provides a wide range of distinctive programmes and services for everyone, free of commercial interests and political bias.

The BBC is not permitted to carry advertising or sponsorship on its public services. This keeps them independent of commercial interests and ensures that they can be run instead to serve the general public interest.

If the BBC sold airtime either wholly or partially, advertisers and other commercial pressures would dictate its programme and schedule priorities.

For us to stay relevant to our audiences we need to continue to make innovative and creative programmes. But we also want to provide our audiences with opportunities to learn and to explore what it means to be an active citizen.

Our role goes much further than simply making programmes that people find enjoyable or stimulating. We think it is important to respond to the things that people care about, to react to the things that people are talking about. Our aim is to connect communities and bring people together. We want to help people have a better understanding of the world around them and encourage them to participate in that world.

These random but apparently typical selections express "noble sentiments", in several senses of the word noble. In particular, the organization appears to see itself as inhabiting an empyrean realm of social, cultural and moral superiority, above the stresses of commercial and political give-and-take, providing uplift to the mass of citizens of which the BBC's writers, editors and governors are apparently not a part.

These are not at all bad goals to have, but they are expressed in ways that merit the adjective "pompous", in my opinion. And the whole package seems quite similar to the traditional attitudes of the more enlightened members of the aristocracy towards the common folk.

Juxtaposed with the spectacular examples of careless tendentiousness that are under discussion here, these noble sentiments seem even more pompous, in the sense of being "characterized by an exaggerated display of self-importance or dignity", since the squalid reality emphasizes the extent to which the noble sentiments are exaggerated if not entirely deluded or insincere. ]

[Update #2 -- David Eddyshaw writes:

Keep on bashing the BBC!

In my own field, medicine, their reporting is equally flaky. It causes real people real grief, particularly by irresponsible scares and, perhaps worse, irresponsible raising of false hopes. I often have to specifically warn patients not to place much faith in BBC medical reports. Interestingly most of them accept this very readily; sad if you consider what the BBC evidently imagines to be its own high standards. Were they once better, or is this one of those Golden Age delusions?

The BBC deserves a kicking. More power to your elbow - er, ankle.

What, you mean that breast enlargement chewing-gum doesn't really work? I'm shocked.]

Posted by Mark Liberman at 08:37 AM

In the weeks and months

At President Bush's May 24 press conference, his opening statement included the following passage:

This summer is going to be a critical time for the new strategy. The last of five reinforcement brigades we are sending to Iraq is scheduled to arrive in Baghdad by mid-June. As these reinforcements carry out their missions the enemies of a free Iraq, including al Qaeda and illegal militias, will continue to bomb and murder in an attempt to stop us. We're going to expect heavy fighting in the weeks and months. We can expect more American and Iraqi casualties. We must provide our troops with the funds and resources they need to prevail.

His prediction has been featured in broadcast news sound bites over the past couple of days, and every time I hear it, I notice the apparent failure to complete the thought. "In the weeks and months ahead"? "In the weeks and months to come"? Presumably something like that was in the original text.

The disfluency before "weeks" (a broken-off production of "month"?) makes it clear that something went wrong in the president's performance:

Another phonetic signpost of trouble was the size of the silent pause preceding that sentence: 2.25 seconds, vs. about 1.5 to 1.7 seconds for W's within-paragraph pauses up to that point.

In reporting on the news conference, some stories supplied a suitable time modifier in parentheses or outside the quotation marks:

“We’re going to expect heavy fighting in the weeks and months” to come, Bush told a White House news conference.
"This summer is going to be a critical time for the new strategy," said Bush. "We're going to expect heavy fighting in the (coming) weeks and months."

Others relied on a semantically complete paraphrase, or gave up and just used the quote intact, relying on their readers to understand the meaning in context.

In fact, not much context is required. I suspect that if you ask practiced readers of journalistic, political and commercial discourse in English to complete the phrase

in the weeks and months __

most of them would supply "ahead" or "to come" as their first guess.

That's certainly what an algorithm based on contextual frequency would do. Google finds 310,000 pages containing {"in the weeks and months"}, and almost two thirds of them continue either as "in the weeks and months ahead" (140K) or "in the weeks and months to come" (65K). If we add "in the weeks and months following" (27.3K), "in the weeks and months after" (27.2K), and "in the weeks and months that followed" (14.1K), we get 273.6K, or 88%. (Yes, I know that assuming superposition of Google counts is naive, but the additivity of counts is probably good enough, these days, to support this particular argument.)
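As a rough check on that arithmetic, here is a minimal Python sketch. It uses only the approximate page counts quoted above; the variable and function names are illustrative assumptions, and the counts are simply added together, with the same naive additivity the paragraph admits to.

```python
# Approximate Google page counts quoted above, in thousands of pages.
TOTAL_K = 310.0  # pages containing "in the weeks and months"

# Counts for each continuation of "in the weeks and months __".
continuations_k = {
    "ahead": 140.0,
    "to come": 65.0,
    "following": 27.3,
    "after": 27.2,
    "that followed": 14.1,
}


def share(count_k, total_k=TOTAL_K):
    """Return count_k as a percentage of total_k."""
    return 100.0 * count_k / total_k


# The two most common continuations, then all five future-oriented ones.
top_two_k = continuations_k["ahead"] + continuations_k["to come"]
all_five_k = sum(continuations_k.values())

print(f"ahead + to come: {top_two_k:.1f}K ({share(top_two_k):.0f}%)")
print(f"all five:        {all_five_k:.1f}K ({share(all_five_k):.0f}%)")
```

The two shares come out at about 66% and 88%, matching "almost two thirds" and the 88% figure in the text.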

To use one of the president's signature words, I find it interesting that "in the weeks and months __" is so radically biased towards the future. In comparison, Google has only 2 hits for "in the weeks and months earlier", 7 for "in the weeks and months previous", 8 for "in the weeks and months that preceded". The only even reasonably common past-oriented continuation that I can come up with is "in the weeks and months before", with a mere 966.

But I wonder what larger patterns this is part of. Is it just a fact about the particular English conjunction "weeks and months", or perhaps the prepositional phrase "in the weeks and months"? Or could there be a more general bias towards following events into the future rather than tracking them back into the past? It's obviously time for a Breakfast Experiment.

The president's example can be generalized in many directions. Given the available tool of textual search on the web, the easiest dimensions to check are the ones defined by simple string substitutions. So I'll pour another cup of coffee and give it a try.

If we look at a range of time-units from seconds to centuries, and limit ourselves to the common past- vs. future-oriented continuations "before" and "after", we get this:

[Table of Google counts not preserved. Columns: "in the __ before", "in the __ after", before/after ratio, before percentage.]

The ratio of total before counts to total after counts is 1.44 (2.59M to 1.8M), and the overall percentage of before in the before+after total is 59%.

So it's clear that there's no general bias in favor of looking towards the future -- in this particular set of string-substitution contexts, the past is winning.

But if we do the same thing with conjunctions of adjacent pairs of time units in order of increasing size, like "in the seconds and minutes before" or "in the hours and days after", we get this very different pattern:

[Table of Google counts not preserved. Columns: "in the __ before", "in the __ after", before/after ratio, before percentage.]

Now the ratio of total before counts to total after counts is 0.15 (13.4K to 88.8K), and the overall percentage of before in the before+after total is 13%. When we look at all the conjunctions of time units in ascending order of size -- not just "weeks and months" -- the future is winning by a landslide!

Here's the same data, presented graphically. When the phrasal head is a single time unit, the past generally wins:

But when the phrasal head is a conjunction of time units in ascending order, the future kicks the past's ass:

What's going on here?

The consistency of the pattern across time-units makes it clear that it's not a fact about any particular lexical item, or about any particular collocation of lexical items. Is the effect due only to the conjunction of (adjacent?) time units, or does it matter that time units are ordered from smaller to larger? Let's try it the other way around:

[Table of Google counts not preserved. Columns: "in the __ before", "in the __ after", before/after ratio, before percentage.]

Changing the order of the time units, so that the larger one comes first, flips the effect towards the past. Now the ratio of total before counts to total after counts is 10 to 1 (4,902 to 489), and the overall percentage of before in the before+after total is 91%. Graphically:

OK, I think it's clear what's going on.

It's natural to think of the time-scale narrowing down -- zeroing in -- as our perspective gets closer in time to the event under discussion. And it's natural to think of (say) "hours and days" as hours followed (temporally) by days, while "days and hours" is days followed temporally by hours.

From those two assumptions, it follows that a conjunction of time units in order of increasing size (like "hours and days") will more naturally be used to describe time after an event; while a conjunction of units in decreasing-size order (like "days and hours") will more naturally be used for time before an event.

And that's what happens!
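The three sets of totals quoted above can be rechecked with a few lines of Python. This is only a sketch: the condition labels and the helper name `before_after_stats` are illustrative assumptions, and the inputs are the rounded totals given in the text, not the per-unit counts.

```python
# Totals quoted in the text: (before count, after count) for each condition.
totals = {
    "single unit": (2_590_000, 1_800_000),   # e.g. "in the weeks before"
    "ascending pair": (13_400, 88_800),      # e.g. "in the hours and days after"
    "descending pair": (4_902, 489),         # e.g. "in the days and hours before"
}


def before_after_stats(before, after):
    """Return (before/after ratio, before as a percentage of before+after)."""
    ratio = before / after
    percentage = 100.0 * before / (before + after)
    return ratio, percentage


for label, (before, after) in totals.items():
    ratio, percentage = before_after_stats(before, after)
    print(f"{label:16s} ratio = {ratio:5.2f}   before = {percentage:.0f}%")
```

On these inputs the ratios are about 1.44, 0.15 and 10.02, and the before percentages about 59%, 13% and 91%, in line with the figures reported for the single-unit, ascending and descending conditions.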

If this theory is right, then the same thing ought to happen in French or Chinese or Turkish -- as long as the assumptions continue to hold, about zeroing in on events and listing time periods in chronological order.

[Update -- Anatol Stefanowitsch writes:


It also works in German (see attached .csv file).

Turned into html tables, the file that Anatol attached looks like this:

Table 1 -- smaller units first:

[Table rows not preserved. Columns: "in den _ (vor|bevor|davor|vorher)", "in den _ (nach|nachdem|danach|nachher)", vor/nach ratio, vor percentage.]

Table 2 -- larger units first:

[Table rows not preserved. Columns: "in den _ (vor|bevor|davor|vorher)", "in den _ (nach|nachdem|danach|nachher)", vor/nach ratio, vor percentage.]

I venture to suggest that the kind of research represented by this collective Breakfast Experiment deserves a name. "Yes", I hear you say, "how about 'Computational Linguists With Way Too Much Time On Their Hands'?" No, actually, what I had in mind was something more like "Google Cognitive Linguistics".]

Posted by Mark Liberman at 07:03 AM

Carl's Junior's Beef with Jack-in-the-Box

Carl's Jr. is suing Jack in the Box over an ad for Jack in the Box's sirloin burger that makes fun of Carl's Jr.'s Angus burgers. According to Alana Samuels in the LA Times ("Carl's Jr. has a beef with Jack in the Box advertising", May 26), the claim is that the ad is deceptive.

The joke is that Angus is not a cut of beef like "sirloin" but a breed of cattle, as opposed, for example, to the Texas Longhorn or the several dozen others you can find on the very enjoyable Breeds of Livestock site at Oklahoma State, and that Angus [æŋgəs] sounds somewhat like anus [enəs]. I don't think that Carl's Junior is on strong ground. The ad is pretty obviously intended to be humorous, which should clue in even the uninformed that they should not take it at face value. Nonetheless, it raises an interesting issue, namely what sort of linguistic and cultural knowledge can reasonably be presupposed by advertisers, politicians, and others whose speech is regulated or subject to criticism for deceptiveness? Supposing that the ad were not so obviously humorous and that an immigrant with a limited knowledge of English and beef saw this advertisement and was put off by the idea that Carl's Junior's Angus Burgers were made from meat from the anal region, should Jack in the Box be held responsible for deceptive advertising?

The incident is reminiscent of how P. T. Barnum reportedly increased the rate at which people moved through his museum by placing a sign halfway through with the text "To the Egress". The less informed would not realize that "egress" is a synonym for "exit" and would take it to lead to another exhibit. A similar case is the apparently apocryphal speech in which George Smathers is said to have smeared his opponent Claude Pepper by accusing him of such things as having a sister who was a thespian (a fancy word for "actress" that sounds somewhat like "lesbian").

Posted by Bill Poser at 07:18 PM


It's Memorial Day weekend in the U.S., and commemorating is on our minds (NOAD2, first definition for commemorate: "recall and show respect for (someone or something) in a ceremony").  Commemorating looks back, but on 5/22 in the NYT we have commemorating in advance:

Images from the covers of all seven Harry Potter adventures will appear on a set of postage stamps to be issued by the Royal Mail in Britain on July 17, commemorating the July 21 publication of the final volume of the J. K. Rowling stories about the boy wizard.

Yes, the stamps are being issued four days before the event they "commemorate".

(Hat tip to the blogger Empty Pockets.)

I can get partway from NOAD2's first definition to this use.  Here's the third definition:

mark (a significant event): the City of Boston commemorated the 400th anniversary of the discovery of America

And a couple of further instances of this use from the web:

The major museums of Europe, commemorate the 80th anniversary of the Armistice of 1918 (link) [announcement of a series of exhibitions of the art of the First World War]

This image is being released to commemorate the 14th anniversary of Hubble's launch on April 24, 1990 and its deployment from the space shuttle Discovery on ... (link)

What's being commemorated in these cases is, strictly speaking, the events of the past, but these events are linked to the present through an anniversary.  The crucial point is that announcements of these commemorative occasions can be made in advance of them.  Now we're one step away from using commemorate for occasions that mark, recognize, or celebrate events without regard to whether the celebratory occasions follow or precede the celebrated events -- one step away from bleaching out the temporal-sequence component of commemorate.

But why extend commemorate into new territory?  Why not use one of the available alternatives? 

The usual answer to such a question is that the writer didn't find any of the alternatives entirely satisfactory, and also found something attractive in the innovative use.  In the commemorate case, some of the alternatives are bland: observe, mark, recognize.  Honor might seem too deferential or solemn for the release of Harry Potter stamps, while celebrate might seem too festive and unserious for a Royal Mail action.  Commemorate hits the tone just about right, though at the cost of twisting the semantics some.

Actually, other reports of the Harry Potter stamps suggest that commemorate might be appropriate here.  You can google up a CBC report that says:

Britain's Royal Mail will issue 11 commemorative stamps to honour the extraordinary success of the Harry Potter books by J.K. Rowling just before the final volume of the series goes on sale.

In this version, what's being recognized is not just the last book in the series, but the book series as a whole; commemorative is not really inappropriate here.  (But in fact, though the quote above is what Google reports in its page view, the actual page lacks the adjective commemorative.)

Still, commemorate 'recognize, observe, celebrate' seems to be reasonably frequent.  Here's another commemoration of an event to come:

Next week, the saga of Darth Vader continues with the release of Star Wars: Dark Lord... To commemorate this event, Del Rey Books has dispatched Darth Vader and his forces to a number of bookstores ...

and a couple of "commemorations" of regularly scheduled events:

People's Law School set for next week ... The School of Law historically participates in a variety of outreach activities to commemorate Law Day, which is May 1. (link)

MSU to help commemorate World Food Day next week.
Local speakers, presentations by area service agencies and a simple meal will be part of Mississippi State's observance of 2001 World Food Day. (link)

Semantic change marches on.  Meanwhile, we can commemorate Memorial Day on Monday.

Tolkien on walh

In response to my post on "Dialect variation in the terminal flourishes of Flemish chaffinches", John Cowan sent a passage from J.R.R. Tolkien on the Germanic root of words such as Welsh, Walloon, Vlach and walnut, to correct the impression that it might have meant "foreigner" in general rather than "speaker of Celtic or Latin" in particular.

This comes from Tolkien's essay "English and Welsh", originally a lecture given at Oxford in 1955:

It seems clear that the word walh, wealh which the English brought with them was a common Germanic name for a man of what we should call Celtic speech. But in all the recorded Germanic languages in which it appears it was also applied to the speakers of Latin. That may be due, as is usually assumed, to the fact that Latin eventually occupied most of the areas of Celtic speech within the knowledge of Germanic peoples. But it is, I think, also in part a linguistic judgement, reflecting that very similarity in style of Latin and Gallo-Brittonic that I have already mentioned. It did not occur to anyone to call a Goth a walh even if he was long settled in Italy or in Gaul. Though 'foreigner' is often given as the first gloss on wealh in Anglo-Saxon dictionaries, this is misleading. The word was not applied to foreigners of Germanic speech, nor to those of alien tongues, Lapps, Finns, Esthonians, Lithuanians, Slavs, or Huns, with whom the Germanic-speaking peoples came into contact in early times. (But borrowed in Old Slavonic in the form vlach it was applied to the Roumanians.) It was, therefore, basically a word of linguistic import; and in itself implied in its users more linguistic curiosity and discrimination than the simple stupidity of the Greek barbaros.

Its special association by the English with the Britons was a product of their invasion of Britain. It contained a linguistic judgement, but it did not discriminate between the speakers of Latin and the speakers of British. But with the perishing of the spoken Latin of the island, and the concentration of English interests in Britain, walh and its derivatives became synonymous with Brett and brittisc, and in the event replaced them.

[Update -- Ben Zimmer writes:

It's not surprising that Tolkien would have in-depth knowledge of the word walh and its reflexes in English like Walloon and walnut. Tolkien worked on the staff of the OED in 1919-20, researching entries in the range waggle to warlock. See Peter Gilliver's piece in the OED's June 2002 newsletter, "J. R. R. Tolkien and the OED."]


Posted by Mark Liberman at 08:01 AM

The doors of infant perception

A lovely study from Janet Werker's lab at UBC was published in Science yesterday. The paper is Whitney Weikum, Athena Vouloumanos, Jordi Navarra, Salvador Soto-Faraco, Núria Sebastián-Gallés and Janet Werker, "Visual Language Discrimination in Infancy", Science, 316(5828) 1159 May 25, 2007. The abstract:

This study shows that 4- and 6-month-old infants can discriminate languages (English from French) just from viewing silently presented articulations. By the age of 8 months, only bilingual (French-English) infants succeed at this task. These findings reveal a surprisingly early preparedness for visual language discrimination and highlight infants' selectivity for retaining only necessary perceptual sensitivities.

You can see pictures, a demo movie, and examples of the stimuli here.

How do you test a four-month-old's perceptual abilities?

Discrimination was tested by using silent video clips of three bilingual French-English speakers reciting sentences in each language. Every trial contained a video clip of a different sentence by one speaker in one language ... The infants (n = 36) were presented with video clips from one of the languages until their looking time declined to a 60% habituation criterion. Test trials using the same speakers but different sentences from the other language were shown to examine whether the infants' looking time had increased, indicating that they had noticed the language change. The test trials where the language was switched were compared with a control condition (n = 36) for which the test trials were always different sentences but in the same language as the habituation trials.

The results:

Fig. 1. Mean looking time in seconds to silent talking faces. The y axis represents infant looking time; the x axis represents the trials that the infant was shown (final habituation trials and test trials). Error bars represent the standard error of the mean. (A) Experimental (language switch) and control (language same) conditions for monolingual infants at 4, 6, and 8 months. (B) Experimental conditions for monolingual [replotted from (A)] and bilingual infants at 6 and 8 months.

The authors observe that

The finding that infants can visually discriminate their native language from an unfamiliar language at 4 and 6 months but not at 8 months parallels declines in performance seen in other perceptual domains. Indeed, across the first year of life, infants' performance declines on the discrimination of nonnative consonant and vowel contrasts (6, 7), nonnative musical rhythms (8), cross-species individual faces (9), and cross-species face and voice matching (10). Thus, it appears that specific experience is necessary for maintaining sensitivity to some initial perceptual discriminations.

The references from that last paragraph:

(6) J. F. Werker, R. C. Tees, "Cross-Language Speech Perception -- Evidence for Perceptual Reorganization During the 1st year of Life", Infant Behav. Dev. 7, 49 (1984)
(7) P. K. Kuhl et al., "Infants show a facilitation effect for native language phonetic perception between 6 and 12 months", Dev. Sci. 9, F13 (2006)
(8) E. E. Hannon, S. E. Trehub, "Tuning in to musical rhythms: Infants learn more readily than adults", Proc. Natl. Acad. Sci. U.S.A. 102, 12639 (2005)
(9) O. Pascalis et al., "Plasticity of face processing in infancy", Proc. Natl. Acad. Sci. U.S.A. 102, 5297 (2005)
(10) D. J. Lewkowicz, A. A. Ghazanfar, "The decline of cross-species intersensory perception in human infants", Proc. Natl. Acad. Sci. U.S.A. 103, 6771 (2006)

I expect that a preprint of yesterday's Science paper will appear soon in Janet's well-maintained publications list. Meanwhile, there's a good journalistic account in Tu Thanh Ha, "French and English look different to babies", Globe and Mail, 5/25/2007.

To understand this work fully, you need to see it in the context of what we might call the Blakean tendency in modern developmental studies, according to which infant learning is associated with a selective loss of perceptual abilities. As Blake famously wrote in The Marriage of Heaven and Hell:

If the doors of perception were cleansed, every thing would appear to man as it is: infinite.
For man has closed himself up, till he sees all things thro' the narrow chinks of his cavern.

There is a debate (mostly implicit) about whether this "closing up" is a functionally-essential part of learning (because you can only focus on what matters by ignoring what doesn't), or an unavoidable but secondary effect (because making some distinctions more salient tends to put others into the background), or an inessential and avoidable effect of the infant's perceptual environment (because perceptual abilities that are not exercised will atrophy). This context may help you to understand the discussion at the end of Tu Thanh Ha's article:

It is normal, researchers say, for the young child to prune out unused linguistic skills, a pruning process that helps limit information overload.

"As babies grow up, it wouldn't really be advantageous to continue to hear sound differences in languages they don't use," said the study's research supervisor, UBC psychology professor Janet Werker, a long-time researcher on how babies perceive speech.

At the same time, Prof. Vouloumanos said the recent findings also made her believe that there is no harm in getting children to learn new languages at the earliest age.

There is no reason that a healthy child should not be exposed to multiple languages that are spoken in their natural environment, she said, emphasizing that she was stating a personal opinion.

"Much research has shown that young infants can do many things with language, and, in many cases, better than adults can."

She and Prof. Werker noted that, outside of North America, the majority of young children grow up in multilingual environments.

"The brain is definitely set up to acquire more than one language," Prof. Werker said.

For some additional perspective on the topic of visual language discrimination, see Salvador Soto-Faraco, Jordi Navarra, Whitney M. Weikum, Athena Vouloumanos, Núria Sebastián-Gallés and Janet F. Werker, "Discriminating Languages by Speech-reading", Perception and Psychophysics, 2006. The abstract:

The goal of this study was to explore the ability to discriminate languages using the visual correlates of speech (i.e., speech-reading). Participants were presented with silent video-clips of an actor pronouncing two sentences (in Catalan and/or Spanish), and asked to judge whether the sentences were in the same or in different languages. Our results established that Spanish-Catalan bilingual speakers could discriminate running speech from their two languages (Spanish and Catalan) based on visual cues alone (Experiment 1). However, we found that this ability was critically restricted by linguistic experience, as Italian and English speakers, who were unfamiliar with the test languages, could not successfully discriminate the stimuli (Experiment 2). A test of Spanish monolingual speakers revealed that knowledge of only one of the two test languages was sufficient to achieve the discrimination, although at a lower level of accuracy than that seen in bilingual speakers (Experiment 3). Finally, we evaluated the ability to identify the language by speech-reading particularly distinctive words (Experiment 4). The results obtained are in accord with recent proposals arguing that the visual speech signal is rich in informational content, above and beyond what traditional accounts based solely on visemic confusion matrices would predict.

[Randy Alexander wrote:

What you took as a lovely study in your post, I took as a sign of impending doom.

I design methods for children to learn English as a foreign language.

As I read this news yesterday, I saw in my mind's eye crowds of mothers pushing their way into my office begging me to teach English to their four month old babies.

I'm praying that this doesn't make it to the Chinese media.

It's gonna be like the Mozart effect all over again. :-(

What, is business so good that the prospect of opening up a whole new market feels like impending doom? ]

Posted by Mark Liberman at 07:56 AM

Sad news: Carlota Smith

Carlota Smith, a linguist at UT Austin well-known for her 40 years of work on syntax, semantics, acquisition, and discourse structure, died yesterday. An obituary from Richard Meier, chair of the UT linguistics department, follows, then some final recollections of my own.


Professor Carlota S. Smith of the Department of Linguistics at The University of Texas at Austin died Thursday, May 24 at the age of 73 after a long battle with cancer.  She was the Dallas TACA Centennial Professor in the Humanities and taught at The University of Texas at Austin for 38 years.

Carlota Smith received her bachelor's degree from Radcliffe College in 1955.  In the late 1950s, she became a research assistant and then a doctoral student in the Department of Linguistics at the University of Pennsylvania.  During this time she worked with Zellig Harris, who directed the doctoral dissertation of Noam Chomsky and who would also later direct her own doctoral dissertation.  In 1961, Prof. Smith was a graduate student at the Massachusetts Institute of Technology in Cambridge, MA, where she was one of the very first woman students to work with Chomsky. Prof. Smith's first publication (`A Class of Complex Modifiers in English', 1961) dates from this period.  It appeared in the journal Language.

After receiving her M.A. (1964) and Ph.D. (1967) at the University of Pennsylvania, Prof. Smith joined the faculty of The University of Texas at Austin in 1969, where she was a faculty member in the Department of Linguistics until her death.  She served as the chair of the department from 1981-1985.  In 1991, she was named the Dallas TACA Centennial Professor in the Humanities.

Prof. Smith's early research examined the syntax of English.  In 1969, she published, along with Elizabeth Shipley and Lila Gleitman, a very influential paper on how children acquire English as a first language; in ensuing years she would publish several more papers on child language development.  Starting in the mid-1970s, she embarked on what was perhaps her most important line of research.  In many papers and in a very important book (The Parameter of Aspect, published in 1991 by Kluwer), she analyzed the ways in which languages encode time and how they encode the way events happen over time. Prof. Smith's work on temporal aspect has been notable because of its empirical foundation in her careful analyses of a number of quite different languages, including English, French, Russian, Mandarin, and Navajo.  Through her many years of research on Navajo,  she became a member of the Navajo Language Academy, a group that seeks to further the study of Navajo, to keep Navajo from becoming endangered, and to provide training in linguistic research to members of the Navajo Nation.

In 2003, Cambridge University Press published Prof. Smith's second book, Modes of Discourse.  This book analyzes the grammatical properties that distinguish different genres of discourse (e.g., narratives vs. reports vs. descriptions).  In this book and in earlier papers (for example, a 1985 paper on the French author Gustave Flaubert), she sought to bring the analytic tools of linguistics to the humanistic study of literature.

Carlota Smith was an active member of the Department of Linguistics until the very last.  This semester she taught a graduate seminar on time in language.  She was meeting with students and faculty in her office just three days before her death.  Throughout the semester she was thinking about how to ensure the future of the department in which she had taught for virtually her entire career. At The University of Texas at Austin, her absence will be felt for many years to come.
Prof. Smith is survived by her husband, John Robertson, who is a professor in UT's Law School.  She is also survived by her children Alison and Joel, and by her grandchildren Sylvia and Ari.
A memorial service will be held on Saturday, June 2 at 4:00 at the Alumni Center on the UT Austin campus.

 - Richard Meier, Professor and Chair, Department of Linguistics, UT Austin. May 25, 2007.


I came to know Carlota well only in her last year, through which she was constantly living under the pressure of chemotherapy and the likelihood of imminent death. But she never flagged in her commitment to her students and colleagues, to the department, to the school, and to her research. As Richard makes clear, she was active to the end. She was still eager to learn, sitting in on at least one class as well as teaching her own. Even in the last few months she was still advising and teaching, she co-chaired a search committee, and she organized a workshop on Linguistics and Literature (and presented a paper there).

A couple of weeks ago, she had successfully roped me in to teaching a course in UT's Cognitive Science program, of which she was director, and on Monday we had an appointment. She came along to my office. I was shocked at the plainly visible deterioration in her health, but she went straight to business. No, she'd just had a coffee, so let's not go out for one. Let's make plans for the Cognitive Science program, and what did I think about this and that, and so on. We cut the meeting a little short - she was tiring rapidly, though she sent me a brief follow-up email that evening so that I wouldn't be discouraged.

Carlota achieved more than most of us can ever hope to, but she still had much she wanted to do. She put every breath she had towards doing it.

Posted by David Beaver at 11:30 PM

More passive tense

As Geoff Pullum has just noted, on NPR's Morning Edition of 5/11/07, Steve Inskeep, interviewing U.S. Army Gen. Dan McNeill, pursued a question about current U.S. policy in Afghanistan:

Understanding that you're constrained from criticizing an ally, let's put it in the passive tense:

Ok, a mistake.  But WHICH mistake is it: passive tense for passive voice (a mistake in grammatical terminology, as Geoff thinks), or passive tense for past tense (a mistake in word retrieval)?

Inskeep goes on:

There was a cease-fire agreement in southern Afghanistan with some members of the Taliban at one time; is that something that you would pursue if the opportunity came up?

The accent on the past tense form was (and the absence of anything that looks like a passive) suggests that Inskeep was aiming for past tense and pulled out passive instead of past -- an easy mistake to make, since the words are phonologically similar and also (both being grammatical terms) semantically similar.  I am myself given to saying verb for vowel, and vice versa; I'm not confused about the concepts, just retrieving the wrong v-initial technical term of linguistics.

A few other examples of doubly motivated retrieval errors:

1.  suppletion analysis for syncretism analysis, in a Language Log posting by me (twice); I have now corrected this.

2.  ... a sort of postmodern quality to these mushrooms [marshmallows] (Stanford Humanities Center fellow, in lunch-table conversation, 1/20/06)

3.  I presume "go yard" is intended to be elliptical for "go the whole distance of the ballyard", or words to that extent [effect].  (Ben Zimmer on ADS-L, 10/15/05; he corrected himself in a later posting)

4.  ... who actually agrees with the sentiment quoted, and b'leeves that the same advice goes for communication with/at mr kingsix of the idiomatic [idiosyncratic] punctuation  (Chris Ambidge, posting on soc.motss, 9/21/05)

5.  air traffickers for air travelers, in a press conference by GWB, 4/3/07.

Other retrieval errors are mostly phonological in motivation (pollution for pollination, of flowering plants by bees) or mostly semantic ("this particular noun [consonant] cluster").  Some are mostly phonological in motivation, but also have probable triggers in the context: my "a preposition [presentation] of brutal masculinity", about gay porn, but not long after a discussion of stranded prepositions; a friend's "a biological orgasm [organism] that reproduces by...", an error undoubtedly facilitated by planning for the word reproduces.

Such errors suggest that the "mental lexicon" is organized according to both phonological and semantic properties (as well as syntactic properties: retrieval errors almost always preserve syntactic category) and that retrieval is sensitive to the context in which language is produced, in that errors can be facilitated by what's "in the mind" of the speaker or writer.

But wait!  There's more.  This is not our first brush with passive tense here at Language Log Plaza.  If you google on {"passive tense"}, you'll get a distressingly large number of hits -- in the tens of thousands -- almost none of them (if any) for word retrieval errors; instead, there are a few for lists of the tense forms of the passive voice in one language or another (more on this below), but most of the hits are, alas, for errors in grammatical terminology (passive tense for passive voice).  And as the top-ranked hit you'll probably get Geoff Pullum's earlier plaint about an occurrence of this error in The Economist, where they really should know better.  Soon you'll get to Mark Liberman's response to someone asking for advice on "how to avoid passive tense".

Particularly distressing is the fact that so many of the errors in grammatical terminology you google up are, omigod, on pages giving advice to writers or students.  A few examples:

Our editors find that one of the greatest weaknesses of admissions essays is their frequent use of the passive tense. For this mini-lesson you will learn ... (link)

The passive tense is still used in some forms of academic writing. It is best to become familiar with the type of writing style that is most commonly used within a particular subject area. (link)  [a particularly wonderful find, with two passive clauses and one impersonal-subject clause]

Help your students identify sentences written in active or passive tense with this entertaining deck. Students use the 28 pairs of illustrated cards to ... (link)

Avoid passive tense when possible because it is boring. ... A simple way to check your paper for passive tense is to use the FIND command ... (link)

At this point you suspect that an awful lot of people are using tense to mean something like 'verb form'.  For the English passive even this isn't quite right, since the passive isn't a verb form but a syntactic construction (most commonly composed of a form of the verb BE and a VP with its head verb in the past participial form).  But let that pass.  There are bigger things to worry about. 

Look at English-Zone's "Active and passive tenses chart".  English-Zone (which describes itself as "the BEST English-Learner's site on the 'Net!") talks about active and passive VOICE, so there's one problem avoided.  The chart begins with the "simple present" and the "simple past", for which the "forms" of the passive given are, respectively:

am/is/are + past participle
was/were + past participle

This is already fairly silly, since it treats these forms as fresh things to memorize.  But really all the learner needs to know is (a) that (more or less in English-Zone's terms) the passive is composed of head verb BE + past participle, and (b) that the present forms of BE are am/is/are, the past forms was/were.  Only the first of these is new information for the learner.

The chart then continues in this vein, giving the "present and past continuous (progressive)", "present and past perfect", and "future" forms of the passive.  These forms are entirely systematic, being constructed on the basis of fact (a) and the principles governing the corresponding forms of the active.  English-Zone has managed to turn a pretty simple system into a massive pile of unrelated formulas.
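To make the point concrete, here is a minimal Python sketch (my illustration, not anything English-Zone provides) that derives the two simple-tense rows of the chart from facts (a) and (b) alone. The names BE_FORMS and simple_passive are hypothetical, and the slash-separated strings just mirror the chart's own notation.

```python
# Fact (b): the finite forms of BE, keyed by tense. The person/number
# variants are kept together, slash-separated, as in the chart.
BE_FORMS = {"present": "am/is/are", "past": "was/were"}


def simple_passive(tense: str, participle: str) -> str:
    """Fact (a): a passive is a form of BE plus the past participle."""
    return f"{BE_FORMS[tense]} {participle}"


if __name__ == "__main__":
    for tense in BE_FORMS:
        # Use a placeholder participle to reproduce the chart's formulas.
        print(f"simple {tense}: {simple_passive(tense, '+ past participle')}")
```

Nothing in the output is new information beyond the one-line rule in simple_passive; the rest is a lookup the learner already has.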

Put that aside, and note that this is supposed to be a chart of TENSE FORMS, where the tenses include: simple present, simple past, present progressive, past progressive, present perfect, past perfect, and future.  They're all "tenses", presumably because their associated meanings have some temporal component.  This is not some idiosyncrasy on English-Zone's part; such a use of tense is all over the pedagogical literature for English and dozens (if not hundreds) of other languages, and it's not unknown among language professionals:

Watchers of the History Channel have noticed its general taboo against use of the past and perfect tenses.  (Lexicographer Jonathan Lighter on ADS-L, 5/16/07)

Some treatments of English go on from the list above to the future progressive, future perfect, present perfect progressive, past perfect progressive, and future perfect progressive (in the passive: will have been being given!).

Linguists factor the grammatical categories into tense (for English, present, past, sometimes future, but see below), having to do, in many of its occurrences, with the location of situations in time, and aspect (for English, unmarked vs. perfect and progressive, with the possibility for these latter two to co-occur), having to do, in many of its occurrences, with the internal organization of situations over time.  There are then twelve possible combinations (three tenses by four aspect options): the seven in English-Zone's chart plus the five added in the preceding paragraph.
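The arithmetic can be checked mechanically. The following Python sketch takes the three-tense count at face value (future included, for the sake of the illustration) and builds the third-person-singular passive of GIVE for every tense/aspect pair. The table FORMS, the labels in TENSES and ASPECTS, and the function passive_give are all hypothetical names of my own; the only grammatical assumptions are the standard ones that perfect HAVE and passive BE select a past participle, progressive BE selects a present participle, and will selects a base form.

```python
from itertools import product

TENSES = ["present", "past", "future"]
ASPECTS = ["unmarked", "progressive", "perfect", "perfect progressive"]

# Just the forms needed for this illustration (third person singular).
FORMS = {
    "have": {"base": "have", "present": "has", "past": "had", "en": "had", "ing": "having"},
    "be": {"base": "be", "present": "is", "past": "was", "en": "been", "ing": "being"},
    "give": {"base": "give", "present": "gives", "past": "gave", "en": "given", "ing": "giving"},
}


def passive_give(tense: str, aspect: str) -> str:
    """Build the passive of GIVE for one tense/aspect combination."""
    # Each auxiliary is paired with the form it imposes on the next word.
    chain = []
    if "perfect" in aspect:
        chain.append(("have", "en"))   # perfect HAVE + past participle
    if "progressive" in aspect:
        chain.append(("be", "ing"))    # progressive BE + present participle
    chain.append(("be", "en"))         # passive BE + past participle
    chain.append(("give", None))       # head verb ends the chain

    words = []
    if tense == "future":
        words.append("will")           # modal will + base form
        required = "base"
    else:
        required = tense               # first word carries the tense
    for lexeme, imposes in chain:
        words.append(FORMS[lexeme][required])
        required = imposes
    return " ".join(words)


if __name__ == "__main__":
    for tense, aspect in product(TENSES, ASPECTS):
        print(f"{tense:8} {aspect:20} {passive_give(tense, aspect)}")
```

The loop prints twelve lines, and every one of them falls out of the same four selection rules, which is the sense in which the system is entirely systematic.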

Given this extended sense of tense in many quarters, it's probably no surprise that some people have extended it to cover voice as well as tense and aspect. 

There are two different extensions of careful technical vocabulary here: tense is extended to cover all sorts of verbal categories (often realized by morphology on the verb); and form is extended to cover multi-word combinations -- periphrastic expressions -- as well as single words.  These extensions are in principle independent of one another. 

Extended sense of tense.  I am sorry to report that you can find references to infinitive, conditional, subjunctive, negative, causative, permissive, inceptive, plural, imperative -- Mark Liberman complained here about this one three years ago -- and interrogative tenses in one language or another.  No doubt there are many more verbal categories that have been labeled "tenses", but I gave up my search in sorrow after finding these.

What we have here is a terminological morass.  The way out is to distinguish different types of categories; these are customarily given labels that suggest something about the semantics associated with each type: tense, aspect, voice, mood, mode/modality, polarity, finiteness, evidential status, and the like. 

These types can be relevant for some languages and not for others.  For any particular language, within each type we distinguish various forms belonging to  that type, and those too are customarily given labels that suggest something about the associated semantics: present vs. past tense, for example. 

All of this labeling is problematic: the "same" type in different languages will cover rather different territory, and that's also true of the "same" form in different languages; and within a single language, almost all forms are multifunctional (associated with a variety of meanings), so that choosing a label for any particular form is a somewhat arbitrary process.  For these reasons, for some time I've been advocating assigning arbitrary labels whenever we need to be absolutely clear about how a language works; see my posting on the "subjunctive" in English for some development of this proposal.

But for informal discussion of English (or any other language), traditional labels will usually do.  Still, they should be used accurately.  "Passive tense", "infinitive tense", and the like are not accurate uses.

Extended sense of form.  There are two motives for extending the term form to multi-word combinations: semantic parallels between periphrastic and inflectional realization in one language (for instance, between periphrastic future will see and inflectional present see(s) and past saw, and between periphrastic to see and inflectional seeing); and periphrastic realization in one language corresponding to roughly equivalent inflectional realization in another (for instance, the English passive construction vs. the inflectional passive of many other languages, among them Latin).

Both of these moves are unwise, because they take us down a slippery slope.  If we treat the expression of future time via the modal will ("I will be your assistant") as a tense, then why not say the same for the futurate quasi-modal be going to ("I'm going to be your assistant"), the future-plan modal be to ("I am to be your assistant"), and the inceptive future quasi-modal be about to ("I am about to be your assistant")?  And the future-in-past modal would ("I would soon see why the idea was problematic").  And we pick up at least two more past tenses, expressed via the quasi-modal used to ("I used to be your assistant") and the modal would ("When I was a child, I would always help set the table").

Somewhat more subtly, if we're picking out tenses on the basis of meaning, why don't we say that "I leave at noon tomorrow" and "I am leaving at noon tomorrow" illustrate two more future tenses (rather than saying that they are futurate uses of the present tense)?  And why don't we say that English has several more tenses -- for instance, a gnomic tense, for (putatively) universal truths ("Ice melts at 32 degrees F."), a narrative tense, as in "A panda walks into a bar,...", and still others?  The usual way people talk about these phenomena is as "uses of the present tense", and that's basically right.  But how do we avoid calling these things different "tenses" simply because they have different kinds of temporal reference?

In fact, calling the progressives and perfects of English "tenses" follows from this identification of "tense X" with "meaning Y": unmarked aspect in "I see you" (a state description) describes something happening at the moment, but so does progressive aspect in "I am jumping over an anthill" (an event description); unmarked aspect in "I will go to Hawaii" describes a future event, but so does perfect aspect in "By the time you arrive, I will have gone to Hawaii" (a past in the future).

The multiplication of verbal categories will continue if we analyze English on the basis of the inflectional categories of other languages, where voice, mood (declarative, imperative, interrogative, subjunctive, conditional, etc.), mode/modality (expressing necessity, obligation, permission, possibility, likelihood, intention, and much more), polarity (positive vs. negative), finiteness, evidential status, and various other meaning ranges can be expressed via inflectional morphology.  Going down this road gives English at least one "verb form" for each of its modal verbs, plus more for the constructions with have to, want to, ('ve) got to, (had) better, etc.; two passive "forms" (for the be and get passives); a whole pile of moods; and much more.

This is craziness.  The way out that I favor is to split the description of the data into two parts: an inventory of the inflectional morphology of the language and an inventory of its syntactic constructions.  The forms in the inflectional inventory are "bits of stuff" that can be used in the constructions ("grammatical words" are another kind of stuff, and there are still other kinds).   Each construction is a set of conditions on the composition of expressions -- think of the conditions, taken together, as a recipe for putting a construction together -- associated with a meaning for the construction as a whole. 

Such an approach gets us down to two verb forms for English that can be referred to, informally, as tenses; call them Form:R and Form:T.  They can be used in constructions with present-time meaning and past-time meaning, respectively -- this is, in some sense, their customary use -- but they can also be used in constructions with other meanings, and the temporal-location meanings can be expressed by other means. 

This isn't the place to develop an analysis of English along these lines -- I've probably strained your patience already -- but it should be enough to show that we don't have to slide down that slippery slope.

Beyond dispute

From a story in the Wall Street Journal last week on controversies over Texas's new "Bible curriculum":

Mr. Adkins says the classes focus on historical fact. "We're not talking about miracles," he says. "We're talking about Old Testament sites located through modern digs.... Do you dispute the Bible influenced the founding of the U.S.?"

That use of dispute brought me up short. I can talk about disputing a pass, a parking ticket, or somebody else's version of events, or at the limit, about disputing whether something happened, but not about disputing that something happened (though it's obviously okay to use that-S complements with negative passives like "It isn't disputed that he met with Rove"). Nor can I use that syntax with related verbs like question.

Of course I'm reminded every day how subcategorizationally hidebound I am. But it turns out that the meaning of the verb has been shifting along with its syntax, with dispute coming to mean "deny" -- actually, very like what's happening with refute, a topic that Mark and I were posting on a couple of years ago.

It isn't easy to tell, of course -- in most situations, there isn't a lot of distance between disputing a claim and denying it. But Googling around turns up some examples where "deny" would seem to be the intended meaning. Sometimes it's because the person referred to by the subject NP ought to be in a position to deny the claim on the basis of personal experience without having to offer an argument, as dispute generally entails:

Before the Arbitrator, the grievant noted that the event at issue occurred five years earlier, and he disputed that he used profanity towards a co-worker.

He disputed that he was intending to "disrupt" the test, saying that he was there to "bear witness for millions of people worldwide" who oppose Bush's missile defense plans.

He disputed that he received any gratification from the sexual assaults he pled guilty to, and also attempted to explain in more detail his culpability for two of his previous convictions.

A lot of these occur in legal contexts, which suggests that for these writers, dispute implies making a formal declaration. But the "deny" meaning also appears in relatively informal discourse where the traditional meaning of "dispute" would be inappropriate -- for example, when the proposition at issue is being conceded for rhetorical purposes:

I don't mean to dispute that your ancestors were kind and charitable, nor that they believed their reward would be in heaven and that they they'd have connected the two.

I don't mean to dispute that there are so very many more important things for people to debate, but this is something that really matters to a large contingent of people...

In fact there are analogous sentences where dispute appears with a simple NP object and pretty clearly has a "deny" interpretation:

You can't dispute your hatred for UPS. . .

I can't dispute that feeling! I've already outgrown my 24" Dell after just three months.

I do dispute disliking the character of Anakin "because he is too emo."

And for inclusion in the "oh, well in that case" file:

The grievor admitted the intent to threaten to beat the customer up. However, he disputed saying, "I'll fucking kill you." Rather he responded to the receptionist's verbal abuse with: "Step outside you fruit loop, I'll kick your ass."

And there are quite a number of hits for phrases like "dispute the Holocaust" and "dispute the moon landing" (as opposed, say, to "dispute the claims about the Holocaust" and the like, which are consistent with the meaning of the verb as "take issue with"). In other instances, dispute occurs with an inanimate subject (often "fact" or "facts") and seems to mean something like "disprove" or "call into question":

However, teachers need to be taught about the facts that dispute evolution and should be required to teach these facts to students.

Keeping in mind these facts, which dispute the legality and legitimacy of the Hague Tribunal and indicate the common understanding regarding the option of extraditing one's own citizens, we are free to conclude that there is not a single legal basis for the FRY's duty to meet such demands from the Hague.

Due to the short election period, for appeals to be considered there must be further documentation or additional facts which dispute the information used to determine the original sanction.

I haven't checked very thoroughly to see how recent this phenomenon actually is (Ben Zimmer, call your office), though there are cites for the relevant syntax going back to the 1980's at least:

. . . sources close to Manley say he disputes that he tested positive for cocaine. Washington Post, 11/20/89

It stands to reason, though, that this has been going on for a while, even if it has only become salient with the advent of the Web, as public space suddenly fills with the writing of people whose public expression was previously confined to their refrigerator doors. I mean, a verb doesn't unravel in a single afternoon.

Posted by Geoff Nunberg at 01:09 PM

Let's put it in the passive tense

A startling grammar terminology goof came up on NPR's Morning Edition a couple of weeks ago, on May 11. (I hope the two-week delay in posting this didn't make NPR think this one would slip by without getting a mention here on Language Log. I heard it live early in the morning, and was jolted out of semi-slumber, but it was a teaching day, and I was busy (teaching students in my course on The Structure of English what a passive clause is). Thanks to Jonathan Lundell for reminding me to blog it by sending me the NPR story link — the quote is about four minutes in.) Morning Edition host Steve Inskeep was interviewing U.S. Army General Dan McNeill, the NATO commander in Afghanistan. General McNeill was sounding reluctant to speak about another general's effort to establish a ceasefire with the Taliban (he thought that if he spoke about it in personal terms he might be construed as criticizing another NATO officer); so Inskeep put it this way:

Let's put it in the passive tense: there was a ceasefire agreement in Southern Afghanistan with some members of the Taliban at one time. Is that something you would pursue if the opportunity came up?

No passive clause at all there, in any sense whatsoever. The first clause ("there was a ceasefire...") is an active-voice existential clause; the interrogative that follows is also in the active voice.

My mind is filled with only partly rhetorical questions, none of them in the passive tense. (1) What on earth do people imagine the passive construction is? (A tentative answer, of course, is that they mostly think a passive clause is one that is vague about agency, nothing more and nothing less. Which is of course untrue in both directions: you don't have to be vague about agency in a passive clause, and you don't need a passive clause to be vague about agency.)

(2) How could people ever think passives have something to do with tense? (Remember, it has happened before.) The passive construction is so obviously independent of tense: I am irritated by such things is a present tense passive while I was irritated by such things is a preterite tense passive — the tense contrast is orthogonal to the voice contrast.

More broadly, (3) why do people in public fields like journalism attempt to use grammatical terminology — even in contexts where it is not strictly necessary — when they do not control it well enough to tell a passive clause from an existential clause? And (4) how could grammatical education in this country have fallen to such a low point that you can refer to an active existential clause as being in the passive tense when speaking to millions of educated people and almost no one who is not a Language Log reader will even notice, let alone choke on their morning coffee?

Posted by Geoffrey K. Pullum at 12:53 PM

Dialect variation in the terminal flourishes of Flemish chaffinches

I've gotten a great deal of mail in response to my post about the Belgian finch-tweeting contests known as vinkensport ("Watch out for those Wallonian finches", 5/22/2007), and I've also done some additional research of my own -- so far only in the form of a literature survey, as my vinkensport CDs have not yet arrived. One note of particular interest came from Ivan Lietaert:

Finches sing a phrase (song) and they repeat it again and again. The so called Flemish finch ends each phrase with "suskewiet", which is an onomatopeia. During competitions, people count the number of 'suskewiets' they hear, and the highest count is the winner. The French or Walloon finches do not end their song like that, and therefore cannot be used in competitions. The word 'Walloon' here does not really refer to the French speaking community in Belgium, but refers to its older, etymological meaning being: speaking a foreign/different language (here: singing a different song). (This older meaning is also present in the term Wales (GB) and Wallis (Switserland). So the word French is out of context here.

By the way, none of this a joke. It is actually a very serious matter for finch lovers.

In support of Ivan's suggestion for the possible interpretation of Walloon as "foreign", the OED's entry gives the etymology:

a. F. Wallon (fem. Wallonne), n. and a.:—med.L. Wallōn-em, f. Teut. *walah, walh, foreigner (OE. wealh): see WELSH a. The name represents the appellation given by the Teut. Flemings and Franks to their Romanic-speaking neighbours.

Similarly for vlach we get:

Bulg. vlakh or Serb. vlah, = OSlav. vlakhŭ Romanian, Italian, Czech vlach Italian, Pol. włoch Italian, wołoch Walachian, ORuss. volokh Walachian, Italian; these terms are Slavonic adoptions of the Germanic walh (OHG. walh, walah, MHG. walch; OE. wealh) foreigner, applied especially to Celts and Latins.

And for walnut:

OE. walhhnutu str. fem. = WFris. walnút (NFris. walnödd from Da.), MDu. walnote (Kilian walnot), Du. walnoot, MLG. wallnot, -nut, LG. (Bremisch. Wörterb. wallnutt) walnut, G. walnuss (earlier wallnuss), ON. valhnot str. fem. (Norw. valnot, Sw. valnöt, Da. valnød). The first element is OTeut. *walχo-z (OE. wealh, OHG. walah) ‘Welshman’, i.e. Celtic or Roman foreigner.

The AHD entry for Wales agrees:

Although Celtic-speaking peoples were living in Britain before the arrival of the invaders from Friesland and Jutland whose languages would eventually develop into English, it was the Celts and not the invaders who came to be called “strangers” in English. Our words for the descendants of one of the Celtish peoples, Welsh, and for their homeland, Wales, come from the Old English word wealh, meaning “foreigner, stranger, Celt.” Its plural wealas is the direct ancestor of Wales, literally “foreigners.” The Old English adjective derived from wealh, wælisc or welisc, is the source of our Welsh. The Germanic form for the root from which wealh descended was *walh–, “foreign.” We also have attested once in Old English the compound walhhnutu in a document from around 1050; its next recording appears in 1358 as walnottes. This eventually became walnut in Modern English, which is thus literally the “foreign nut.” The nut was “foreign” because it was native to Roman Gaul and Italy.

I'm not sure whether today's vinkeniers interpret "Walloon" in its etymological sense of "foreign" -- they seem sometimes to use "French" as an equivalent, at least if the New York Times report can be trusted, which makes this seem less likely. But Ivan's note may still point to explanations for several things that puzzled me about the original story.

I especially wondered how the chaffinch (Fringilla coelebs) could wind up singing such a short and stereotyped song in these competitions, given descriptions of that species' song such as this one (from Albertine Leitão, Tom J. M. van Dooren, Katharina Riebel, "Temporal variation in chaffinch Fringilla coelebs song: interrelations between the trill and flourish", Journal of Avian Biology 35 (3), 199–203, 2004):

Each male chaffinch has a small repertoire of one to six distinct song types, but mostly two to three (Slater 1981). Although chaffinch song types differ substantially in phonology, all have a trill (two to five phrases of repeated syllables) segment followed by a terminal flourish (a shorter sequence of mostly non-repeated elements ...).

The length (about 2.5 seconds) and structure (15-20 elements organized as 1-5 repetitions of each of 5 or more types) of typical chaffinch songs seem too complex to be rendered by a two-syllable onomatopoeia [suskwit] -- here's Figure 3 from the Leitão et al. study:

And an audio clip of chaffinch song (recorded in Cologne) is here, so you can listen for yourself.

Ivan indicates that a Flemish finch in competition is supposed to "end each phrase with 'suskewiet'" -- perhaps this is what the birdsong researchers call the "terminal flourish"? -- so that by counting suskewiets, you can count song repetitions. This allows the rest of the song to be variable, and also accounts for the emphasis on the importance of a stable terminal pattern. (I should have gotten this from the original NYT article, which referred to suskewiet as "the conventional transliterations of the final chirp in the bird's call".)

Leitão et al. continue:

Within a population, many different song types exist, and variation within one population can be as large as between populations (Slater et al. 1984).

This makes it unclear how there could be stable "Flemish" and "Walloon" finch songs, at least in the sense of a difference characterizable so simply as "susk-e-wiat" vs. "susk-e-wiet"; and my curiosity was further aroused by reading in a study of song variation among chaffinches in New Zealand (where the species was introduced by British colonists quite recently) that

...we estimated that approximately 10 syllables per generation enter each population as a result of immigration [and] [a]pproximately 20-30 new variants of syllables arise [by mutation] in a population each generation.

(Alejandro Lynch, Geoffrey M. Plunkett, Allan J. Baker, Peter F. Jenkins, "A Model of Cultural Evolution of Chaffinch Song Derived with the Meme Concept", The American Naturalist, 133(5) 634-653, 1989.)

Again, if only the terminal flourish is controlled -- and perhaps only some aspects of it -- this makes it easier to believe in a stable aspect of regional variation. But the re-interpretation of "Flemish" and "Walloon" as "our kind" and "the foreign kind" also helps: perhaps the "suskewiet" flourish is not actually an invariant characteristic of wild finches in Flanders, but rather is a crucial characteristic of finches that can be used in Flemish vinkensport competitions. This idea is confirmed by a quote in the NYT article that started this off (Dan Bilefsky, "One-Ounce Belgian Idols Vie for Most Tweets Per Hour", NYT, 5/21/2007):

“In Belgium, even the birds sing in different languages,” said Romain Furniere, 70, a veteran vinkenier, who recalled that when he was a young man in western Flanders, he and his friends would capture finches in the wild but release those that sang “in French.”

In other words, an aspect of natural variation in the songs of finches, in Flanders as elsewhere, has (I conjecture) been re-interpreted in terms of the locally salient concept of ethno-linguistic allegiance.

Another relevant factor is that vinkeniers reinforce their birds' "Flemish" dialect by training them with recordings or by exposure to "master teachers":

Vinkensport also has exasperated animal-rights activists like Jan Rodts, director of the Flemish Bird Protection Society, who accuses vinkeniers of “brainwashing” their finches into performing by forcing them to listen to recordings of susk-e-wiets played over and over.

In 2002, the society won a case it filed at the Belgian Constitutional Court to prevent the government from relaxing a 1979 European Union law banning the capture of wild finches. Belgian vinkeniers breed 10,000 finches for competition each year.
But Mr. Rodts says clandestine hunts remain widespread in western Flanders, because finchers believe that wild birds sing better than those bred in captivity.

“These cages are like miniprisons, and the birds are not happy,” he said.

Mr. Santens retorts that one of his birds lived 24 years, while finches typically live 3 years in the wild.

He trains his finches in his backyard, where he places the caged birds for several months in a heated aviary alongside Briek, a “master teacher” whose perfectly tuned susk-e-wiets, he boasts, have few equals in Flanders.

This process will obviously produce a kind of homogeneity not found in the wild, especially given that the training is combined with a process of selecting those birds who respond to it in the desired way.

[Another reader wondered whether there is any connection between the onomatopoeic finch-call susk-e-wiet and Suske en Wiske, a Belgian comic-book franchise whose name is based on the Flemish nicknames for Francis and Louise. This seems unlikely, but I invite correction.]

[Update -- Alain van Hout writes:

Let me start by mentioning how much I enjoy reading Language Log. I'm a Ph.D. student in Behavioural Biology at the University of Antwerp, with a large interest (but little expertise) in linguistics.

I especially wondered how the chaffinch ( Fringilla coelebs) could wind up singing such a short and stereotyped song in these competitions

I think the main reason for this is in fact that because chaffinch song has such stereotypical structure, it is relatively easy to determine the quality (sensu winners vs. losers) of a chaffinch's song. Something that has not been mentioned, I think, is that vinkeniers only receive credits for fully formed song bouts. The main goal of the contest is therefore for the chaffinch to have a high song rate, combined with well structured, stereotypical song.

perhaps the "suskewiet" flourish is not actually an invariant characteristic of wild finches in Flanders, but rather is a crucial characteristic of finches that can be used in Flemish vinkensport competitions.

This process will obviously produce a kind of homogeneity not found in the wild, especially given that the training is combined with a process of selecting those birds who respond to it in the desired way.

Vinkeniers do try to improve the 'quality' of the finches' song by exposing them to chaffinch song during the right developmental period (usually referred to as the 'sensitive period', for obvious reasons). However, most wild chaffinches in Flanders nevertheless do produce the 'suskewiet' at the end of a large portion of their song bouts. Conversely, I do not remember ever having heard any Wallonian chaffinches use it. Although I am not aware of any research into the biological cause for this difference, one possibility is that a female preference exists for those terminating elements. This mechanism could be sufficient to make Flemish male chaffinches less successful in Wallonia and vice versa, and create two somewhat isolated populations. This is nothing but guesswork however.

a difference characterizable so simply as "susk-e-wiat" vs. "susk-e-wiet"

A small note on this: I have always heard the difference referred to as "suske-e-wiet" vs. "sisk-yew", which isn't that much more of a difference, but possibly sufficient for the female chaffinches to successfully spot the difference.

I've learned from talks about songbirds of other species that a female will prefer a male who sings a wider variety of songs, but also, other things equal, will also prefer males who sing songs similar to the songs her father sang. I'm not sure whether these preferences have been demonstrated in the case of the chaffinch, though. ]

Posted by Mark Liberman at 06:08 AM

Quite who talks like this? Not me!

According to "Unwelcome Guests", Guardian Unlimited, 5/25/2007,

Quite who Fatah al-Islam are, or where they came from, is a matter of dispute.

That quite took me by surprise, though I'm not sure why. "Just who..." would have been fine, or "Exactly who..."

And I have no problems with "quite who" in complement clauses, like these from the current crop on Google News (though I think I tend to use "exactly" for the same function, myself):

“I wasn’t sure quite who to call,” she said.
... if we never know quite who this man is, we still can't stop thinking about him.
We wonder quite who he could have been talking about.

Likewise, "quite who" following a form of "to be" seems OK to me (as long as quite's polarity requirements are met -- see below):

No one is quite who they seem to be in "The Last Time," a twisty drama set against the backdrop of the high-pressure advertising business.
All of the supporting cast excel, although the most impressive are Sebastian Koch as a Nazi who sees the war's end coming and tries to be kind to the Dutch people, and Thom Hoffman as a Resistance member and a doctor who is not quite who he appears to be.

But "quite who" in subject position seems weird to me. Could it be a British thing?

In the Language Log archives, we have three examples of {"just who"}, and two of {"exactly who"}, but none of {"quite who"} in any context.

In the NYT archives since 1981, there are 11 examples of "quite who", of which only one is in subject position (Paul Griffiths, "Finding Monteverdi in the Now", 6/7/1998):

''ARIANNA,'' which will be given its first American performance today by the Opera Theater of St. Louis, is not exactly a new work and not exactly an old one. Quite who wrote it is also in doubt.

(I gather from various internet clues that Paul Griffiths is either from the UK, or has spent some time there.)

In contrast, there are 1,103 uses of "just who" in the NYT archive since 1981 -- 100 times more than "quite who" -- and many of them are in subject position, like these:

Just who is in charge there?
But just who most benefits has never been determined by scientific research, Dr. Henry said.
Just who uses mag crews is in dispute.

Again, for me, quite doesn't work in such examples: "Quite who is in charge there?" No, I don't think so.

The NYT since-1981 archive has 829 examples of "exactly who", including subject-position examples like

Exactly who are we?
Exactly who removed Alatriste not only from the famed painting but from a play written about the Breda siege?
Golden tells us that the admissions process, at least at the 100 top colleges and universities, is not a meritocracy — and exactly who thought it was? — but a marketplace.

Ditto for these -- "Quite who are we?" No way.

If we search the Guardian's current crop of articles, we find 55 examples of "quite who", and many of them are in subject position (in a matrix clause or in a sentential subject):

Quite who to believe in this battle of the brickwork is something of a dilemma.
Quite who would recommend Albert Luque to anyone is as unknown as Casper Bellyodyssey (and you've never heard of him, have you?), but the flaky Spanish winger is contemplating a move away from Newcastle.
Quite who is doomed, and why, has varied with the development of his thought.
Though quite who else might buy such a shade of pink is uncertain.
Quite who Vieira would play with in Madrid remains to be seen.
Quite who would look good in an argyll-print jacket with giant fur sleeves will remain a mystery...
Quite who is saving money here?

Some of these don't seem so bad to me -- especially the subjects of sentential subjects -- but I think I'm beginning to suffer the fate of any open-minded person who ponders a few dozen examples of a foreign usage.

The same Guardian archive finds 450 examples of "just who" -- that's about 8 times more, but you'll recall that the NYT had 100 times more in the corresponding comparison. So the hypothesis that "quite who" is a Britishism in general, and especially in subject position, seems to be confirmed.
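The arithmetic behind that comparison can be checked with a minimal sketch. The counts are the ones quoted above; the dictionary layout and the helper name `ratio` are just illustrative choices. Note that these are totals for each string, not counts restricted to subject position, so the sketch speaks only to the general Britishism claim.

```python
# Counts quoted in the post: NYT archive since 1981, and the Guardian's
# current crop of articles. These are totals, not subject-position counts.
counts = {
    "NYT": {"just who": 1103, "quite who": 11},
    "Guardian": {"just who": 450, "quite who": 55},
}

def ratio(source):
    """How many times more often "just who" occurs than "quite who"."""
    c = counts[source]
    return c["just who"] / c["quite who"]

nyt_ratio = ratio("NYT")
guardian_ratio = ratio("Guardian")

# How much more common "quite who" is in the Guardian than in the NYT,
# measured relative to "just who" in each source.
relative = nyt_ratio / guardian_ratio

print(f"NYT: just who / quite who = {nyt_ratio:.1f}")
print(f"Guardian: just who / quite who = {guardian_ratio:.1f}")
print(f"Relative difference = {relative:.1f}")
```

On these numbers, "quite who" is roughly 12 times more frequent in the Guardian than in the NYT, once each is scaled by the count of "just who" in the same source.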

What I can't figure out is why Americans should object to "quite who" in subject position but not elsewhere. It seems to have something to do with polarity -- thus my judgments are:

I don't know exactly who is responsible.
I know exactly who is responsible.

I don't know quite who is responsible.
*I know quite who is responsible.

But I believe that speakers of British English have the same pattern -- at least, the Guardian has

"I|you|we|they don't know quite who|what|where|why|how" 6
"I|you|we|they know quite who|what|where|why|how" 0
"I|you|we|they don't know exactly who|what|where|why|how" 24
"I|you|we|they know exactly who|what|where|why|how" 85

Do British speakers have different rules about the scope of polarity-licensing operators? Or is (this sense of) quite not really a polarity item for our British cousins, despite the evidence in the table above? Perhaps some well-informed and sociolinguistically-inclined syntactician or semanticist will enlighten this befuddled phonetician.

[Update -- Lynne Murphy has a lovely follow-up over at Separated by Common Language (quite wh-), which features comparative British and American counts of total and sentence-initial quite, exactly and just, for how, why, what and who, along with some speculations about possible connections to trans-Atlantic differences in the meaning and use of other kinds of quite.]

[Update -- Chris Brew writes:

Your Guardian example

  Quite who Fatah al-Islam are, or where they came from, is a matter of dispute.

works fine for me. I am a little discomfited by the number agreement, because I am in the process of internalizing the "<Organization> is " versus "<Organization> are" distinction that my British upbringing led me to fall on the opposite side of from most Americans. But "Quite who" is fine in subject position. As are "Quite what", "Quite how" and "Quite why".

It may be somehow relevant that "quite" in British is more often in the sense which means somewhat (e.g. "Their argument was quite reasonable, but flawed") than in the sense (more common here, I think) of "Quite perfect/excellent" which more or less implies having gone all the way along that dimension.

If I wrote sentences starting with things like "Quite who do you suppose would accept this paper?" and so on, I'd definitely be being obnoxious. So I try not to.

Well, "Exactly who do you suppose would accept this paper?" is not especially tactful, either...]

Posted by Mark Liberman at 08:55 PM

Journalists' quotations: unsafe in any mood

Following up on Barbara Partee's note about "Baseball conditionals", I've gotten a lot of mail full of great examples and insightful analyses, and I'll summarize it all for you as soon as I have some time. What worries me, though, is that most of the examples come from quotations in the newspapers, and experience suggests that quotations in the newspapers are almost as fictional as the dialogue that I cited from Elmore Leonard's novels.

A little searching in Google News confirmed my suspicion in this case: here's David Ginsburg's version from the AP, compared to Allan Ryan's version from the Toronto Star:

David Ginsburg, AP: "He could have been a little rusty early on, and then the inning he gave up four runs I think he kind of lost his composure a little bit," Orioles manager Sam Perlozzo said. "He just did a little damage control in that situation, we're OK."

Allan Ryan, Toronto Star: "He could've been a little rusty early on because it's, like, his seventh day," said Baltimore manager Sam Perlozzo. "He gave up four runs and I thought he lost his composure a little bit and wasn't able to do damage control for us."

"Wait", you're probably saying to yourself, "Perlozzo probably used similar but not identical phrasing in two different interviews. Surely journalists aren't so cavalier with direct quotations, even in sports stories."

But if you said that, you'd be wrong. As I understand it, quotes like Perlozzo's usually come from group post-game interviews, staged in a press-conference-like format. And the press reports of direct quotes from these interviews are remarkably divergent, from one another and from a careful transcription of the recorded interview (which is sometimes available on line). For an example with some discussion, see "What did Rasheed say", 6/23/2005; "Ipsissima vox Rasheedi", 6/24/2005; "'Quotations' with a word error rate of 40-60% and more", 7/30/2005.
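For readers who want to see what a "word error rate" measures, here is a minimal sketch: the word-level edit distance (substitutions, insertions, deletions) between two versions, divided by the number of words in the reference. The function names `words` and `word_error_rate` and the tokenization rule are my own illustrative choices, not the procedure used in the posts cited above. A real measurement would use a careful transcript of the recording as the reference; since none is quoted here, the sketch arbitrarily treats the AP version of Perlozzo's remarks as the reference, so the number it prints measures only how far the two reports diverge from each other.

```python
import re

def words(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)

# The two published versions quoted above, with the attributions removed.
ap_version = (
    "He could have been a little rusty early on, and then the inning he "
    "gave up four runs I think he kind of lost his composure a little bit. "
    "He just did a little damage control in that situation, we're OK."
)
star_version = (
    "He could've been a little rusty early on because it's, like, his "
    "seventh day. He gave up four runs and I thought he lost his composure "
    "a little bit and wasn't able to do damage control for us."
)

print(f"{word_error_rate(ap_version, star_version):.0%}")
```

Because the rate is normalized by the reference length, it can exceed 100% when the hypothesis is much longer than the reference.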

Does this affect the linguistic point that Barbara and all my other correspondents are interested in? Well, sort of. Making a quick scan of today's baseball news, I found a counterfactual conditional with the first clause in past indicative (Phils beat Marlins in wild finish; closer Myers hurt, Sports Network, 5/24/2007):

Then, on Myers' second pitch to Miguel Olivo, the right-hander uncorked a wild pitch and immediately grabbed his right shoulder before leaving the game with a strained right shoulder.

"Something wasn't right when I threw the first pitch to Olivo," Myers said. "Then the second (pitch), I couldn't stop it. I didn't even know (where my control) was going. I don't think (my shoulder) popped or tore because if it did I wouldn't have any strength in it right now, but I have strength in it."

But instead of "if it did", the version quoted by Todd Zolecki in the Philadelphia Inquirer ("Phils Win a Disaster", 5/24/2007) has "if it would have":

After the Phillies had blown a four-run lead in strikingly improbable fashion, Myers uncorked a wild pitch to Miguel Olivo and immediately grabbed his upper right arm. He left the game.

"I felt it after the first pitch to Olivo," Myers said, seated in a chair in the visitors' clubhouse with his shoulder wrapped in ice. "I didn't even know where the second one was going. I don't think [the shoulder] popped or tore. If it would have, I wouldn't have any strength in it right now. I have strength in it. I'll just find out how it feels tomorrow. A couple days, maybe? I don't know."

That doesn't detract at all from the linguistic interest of the examples and their analysis. But it means that the examples represent one dialect and register -- the players and managers, speaking in interview style -- filtered through the perception, memory, and note-taking abilities of speakers of different dialects using a different register -- the journalists, writing in newspaper style. And maybe, in some cases, a copy editor has intervened as well.

Whatever the source and the chain of transmission that brings them to us, these examples definitely fall into consistent patterns. In this morning's email from Barbara Partee:

Oh no, look what they have followed up with today! This has not only an indicative counterfactual, but a haplologous "if". And the third sentence in the quote shows that this pitcher does have "would" in his grammar and can use it when he wants to (David Ginsburg, AP Recap of May 23 Baltimore-Toronto game):

"McGowan (0-2) gave up three runs and eight hits in six innings. The right-hander pitched well enough to win, but his poor toss during the rundown proved costly.

"I'm not sure if I make a good throw I get him," McGowan said. "I tried to hurry it up, dropped my elbow and it just sailed on me. It'd have to be a perfect throw."

(That was the Toronto pitcher made the bad throw and Baltimore won this one!)

As far as I can tell from a quick scan of Google News, no one else reported this particular quotation from McGowan. However, a Google search for {"if I make a good throw"} turns up 1,470 examples like these:

"I knew that we should have at least gotten two," Wilson said. "A play like that, I don't think you ever expect to get a triple play. You never know. If I make a good throw, there's probably a good chance that Ty makes that play and you end up with three outs right there. But I threw low, and we only ended up with one. It really cost us."

"I jumped for the ball, I should have made the play, I have to take the heat and I'm man enough to take the heat," Perez said. "If I make a good throw, the game is tied. I didn't make a throw and that's it."

"The throw to second, that was the whole game right there," admitted Rivera. "I didn't have a good grip on the ball. If I make a good throw, he's out by far. It just got away from me."

“That whole inning was tough,” Lidle said. “Anderson looked like he was out at first on the replay, and if I make a good throw to second it’s an easy double play. It was just a frustrating inning.”

Interestingly, leaving out the "if" only turns up one paratactic example:

"I make a good throw, we should be playing right now," Guillen said. "I make a bad throw. That's it."

And Jose Guillen, from the Dominican Republic, is probably not a native speaker of English.

Does this mean that baseball players and coaches don't use the if-less form much, in this particular context? Or does it mean that sportswriters and editors usually add "if" to paratactic conditionals in such cases? I'm not sure.

[Update -- Language Hat writes:

You write "That doesn't detract at all from the linguistic interest of the examples and their analysis." But I think it does. It's not necessarily the case that sportswriters are simply rewriting utterances in their own dialect and register; they may well be rewriting them according to their ideas of how baseball people "should" talk. This phenomenon is inescapable when reading pre-civil-rights writing by white people trying to represent black speech, and to the extent that it occurs it invalidates any linguistic interest aside from a peripheral curiosity about imaginary dialects.

He's basically right, as usual. But in my opinion, it's a bit more complicated. The syntactic patterns in journalistic quotations can still have several sorts of interest: they may suggest hypotheses to test in conversational transcripts from more reliable sources; and when we share the patterns in question -- in this case, most varieties of paratactic conditionals are well within my own norms for informal conversation -- our reactions to the journalistic examples can lead us to see new things.

At least in the sports context, the cases where I've checked journalists' quotes against a recording show mainly (a) extensive ellipsis and reordering, (b) random substitutions and paraphrases apparently caused by memory errors or careless note-taking, and (c) partial editing towards standard-English norms. Unlike the case of "imaginary dialect creation", errors of these kinds will leave some interesting cases of paratactic (and other non-standard) conditionals more or less intact, or at least will not create too many imaginary examples. ]

[Update #2 -- another reader writes:

When I worked as a managing editor at a small paper with well-known sports coverage, it was routine for the sports reporters to clean up the grammar of athletes and coaches they interviewed, particularly coaches. (As long as the grammar-clean didn't change the meaning of the quote.) They got better repeat access, particularly with the coaches, when they did so. Even athletes who TALKED like no-necked neanderthals didn't like to be REPORTED as talking like no-necked neanderthals. My understanding is that this is reasonably common in the sports reporting world. In fact, one way sports reporters took minor vengeance on figures they didn't like was to report their bad grammar exactly as spoken.

On a related note, re: Jose Guillen:
Multi-lingual reporting is a tricky issue with a lot of ethical implications; international reporters and city beat reporters on immigrant-heavy beats could talk really interestingly about that, since there's a lot of debate about when you should say, "'He shoved me,' she said, in Spanish." vs. "'He shoved me,' she said." with no indication of language used. It's a particularly hot topic in cities where immigration is a tense issue and choosing to identify the language spoken when reporting a crime story could even induce jury pool bias! This is not the kind of beat I ever covered, so my knowledge of the topic is fairly general, but I did listen to colleagues debating it extensively.

My understanding is that different papers have different official policies about how much "cleaning up" of what type is possible and/or mandatory, but that the rules are not very uniformly enforced, even at places like the New York Times.

One crucial point here is that things like evaluation of someone's accent, and even of their general perceived social status, may have a lot to do with how much "cleaning up" a writer feels is appropriate. And in some cases, there is even a sort of "dirtying down", which can take the form of "eye dialect" mis-spellings of standard pronunciations like "sez" or "an'". I say this as someone whose informal conversation could very easily be presented as that of a "no-necked neanderthal". Of course, I'd probably be amused or even pleased if someone decided to go that way in writing about me.]

[Update #3 -- Jim Lewis wrote:

There's another reason, I think, why reporters regularly clean up quotes, and that's simply that people rarely speak in complete sentences, and often not in coherent ones. If you look at a word for word transcription of normal conversation (as I would imagine you have), it's full of sentence fragments, stops and restarts, ums and uhs, interruptions, switching tenses midstream, and so on -- all sorts of noise.

Jim gave more details about the incoherence of spontaneous speech rendered as text, but editing out filled pauses and false starts is not what we're talking about here. In one of the posts that I cited at the start of this article ("Ipsissima vox Rasheedi", 6/24/2005), I gave a specific example of how to clean up such disfluencies honestly -- which none of the reporters whose stories I quoted had bothered to do. I don't mean that they intended to communicate something that is false, in those cases, just that they didn't exert the minimal effort required to create accurate (if appropriately edited) direct quotations. Of course, sometimes the effect of such "approximate" quotations is to communicate a falsehood, and perhaps that's the intent as well, I don't know -- for an example, see "'Approximate' quotations can undermine readers' trust in the Times", 8/27/2005.

Chris Phipps writes:

Reading your post about Journalists' quotations on Language Log reminded me of a dissertation one of my colleagues wrote for the U. Buffalo linguistics department: Krainz, Stacy - Professionalism vs. Audience Appeal in News Analysis Discussions: A Journalist's Dilemma. (9/02)

One of the sub-topics Stacy covered was the differences between "direct" quotes and what she called "constructed dialogue" or "the rendering of what another speaker has said from a first person (direct speech) perspective" which she found prevalent in political roundtable discussion (which I suspect, is close in style to sports reporting-speak).

I think her basic point was that reporters try to create a sense of "interpersonal appeal" so they unknowingly modify the quotes of others.

If you're interested in following up, here's Stacy's GURT paper on the topic:

Prof. Krainz's paper focuses on the methods that broadcast journalists use to generate and maintain audience interest in round-table discussions. One of those methods is what she calls "constructed dialogues" -- but these are more like puppet-theater personifications of the positions of others, as in an example she quotes of a reporter re-imagining Ken Starr's request to a judge as "So he goes to the judge in the Paula Jones case, and he says to her 'Look, stop those depositions. Shut em down. My case is more important than that case. ...'" Her discussion is interesting, but I don't think it's relevant to the examples under discussion, where the reporters are not supposed to be acting the part of someone in the news, but rather claim to be supplying a direct quote. Of course, you could argue that this is the whole problem -- the journalists involved actually do think that their job is to make the whole story up. But I don't believe this -- they're not dishonest, they're just lazy. Or rather, their culture doesn't give them any incentive to take the trouble to be accurate, so they take the easy route of passing half-remembered paraphrases off as direct quotations.]

Posted by Mark Liberman at 08:19 AM

Punctuation, now with heightened indifference!

Proposals to supplement the arsenal of English punctuation have historically been about as successful as proposals for epicene pronouns — which is to say, not successful at all, despite the enthusiasm of the proposers. Perhaps the best-known of the failed punctuation experiments is the interrobang, devised by an ad executive in 1962 to indicate a mixture of query and surprise by fusing a question mark with an exclamation point. The latest quixotic concoction, from Mark Healy of the Toronto-based firm Torque Market Intelligence, is the pomma point, an indicator of "mild excitement" that looks a bit like an exclamation point that can't be bothered to stand up straight. You can watch a slide show where Healy makes the case for filling this niche in our punctuation ecology:

No punctuation mark currently exists in the English language which connotes a feeling of mild joy, vague happiness, or heightened indifference. ... A new punctuation mark is required for this new age where we are defined by our lack of true highs and lows.

Thanks to Nancy Friedman and Martha Barnette for the pointer to the pomma point. A commenter on Friedman's blog Away With Words was reminded of a recent poem by Paul Violi that calls for a new punctuation mark to accommodate the surprise of defeated expectations:

Appeal to the Grammarians
by Paul Violi

We, the naturally hopeful,
Need a simple sign
For the myriad ways we're capsized.
We who love precise language
Need a finer way to convey
Disappointment and perplexity.
For speechlessness and all its inflections,
For up-ended expectations,
For every time we're ambushed
By trivial or stupefying irony,
For pure incredulity, we need
The inverted exclamation point.
For the dropped smile, the limp handshake,
For whoever has just unwrapped a dumb gift
Or taken the first sip of a flat beer,
Or felt love or pond ice
Give way underfoot, we deserve it.
We need it for the air pocket, the scratch shot,
The child whose ball doesn't bounce back,
The flat tire at journey's outset,
The odyssey that ends up in Weehawken.
But mainly because I need it—here and now
As I sit outside the Caffe Reggio
Staring at my espresso and cannoli
After this middle-aged couple
Came strolling by and he suddenly
Veered and sneezed all over my table
And she said to him, "See, that's why
I don't like to eat outside."

Copyright © 2007 by Paul Violi.

(As a New Jersey boy, I find it mildly irritating that Weehawken should serve as the terminus for the failed odyssey Violi imagines. It's possible that he is making a historical allusion to Alexander Hamilton's notorious Weehawken disappointment, but more likely he chose that particular place name because it's in New Jersey and it sounds funny — much as Bugs Bunny was once disappointed to learn that the penguin he was escorting to the South Pole actually hailed from Hoboken.)

The inverted exclamation point that Violi would like "the grammarians" (who?) to repurpose has already been pressed into various typographic uses, most famously in Spanish to introduce exclamatory sentences. A few years ago Josh Greenman in Slate proposed that the inverted exclamation point (or a subscripted version thereof) should be used in English as a "sarcasm point." "My fellow Americans," Greenman declaimed, "we need to embrace a new punctuation mark — one that embraces the irony and edge of contemporary conversation and clarifies rather than condenses or confuses." As it happens, the inverted exclamation point is already used to convey sarcasm in another writing system. The Ethiopic script has a mark known as Temherte Slaq, used for sarcastic purposes in editorial cartoons and the like. The Unicode representation of the punctuation mark is evidently under debate among Ethiopian scholars (see: "A Roadmap to the Extension of the Ethiopic Writing System Standard Under Unicode and ISO-10646," PDF, HTML cache).

In a similar vein, French and Dutch writers have toyed with irony marks, though none have caught the public fancy. Not that literary experimentation is necessarily a dead end: the impulse for emoticons can be found germinating in Ambrose Bierce's snigger point and Vladimir Nabokov's supine round bracket. And emoticons, though not punctuation marks in the traditional sense, do represent the most successful extension of English typography in modern times, much to the chagrin of the Lynne Trusses of the world. (In Eats, Shoots & Leaves, Truss calls emoticons "a paltry substitute for expressing oneself properly ... designed by people who evidently thought the punctuation marks on the standard keyboard cried out for an ornamental function.") The pomma point may never catch on, but millions of online communicators would easily recognize another symbol of heightened indifference (if not mild excitement)... :-|

News from the further reaches of Ellipsilandia

Our Young Eric has mischievously tossed us a playful instance of Verb Phrase Ellipsis (VPE) in his latest posting:

All completely unnecessary, if you ask me (though, of course, nobody did ___ or is ___).

Both did and is are missing their complement VPs (indicated by the underscores above), which we can supply as base-form ask me (as the complement of supportive DO) and present participial asking me (in the progressive construction), respectively.  The antecedent VP is the present-tense ask me (in "if you ask me" above).  Base-form omitted VP with a present-tense antecedent is generally unproblematic, but present-participial omitted VP with a present-tense antecedent is on the iffy side; the VPE here calls attention to itself:

All completely unnecessary, if you ask me (though, of course, ...
    [ok]  nobody did ___).
    [?]  nobody is ___).

Call the latter pairing the Bakovic Configuration:

Ellipsis: present participle
Antecedent: present tense

We've commented on a somewhat similar case on Language Log before, in John McWhorter's also playful:

[?] One could write a whole paper on it (and, as it happens, one is ___!).

Call this the McWhorter Configuration:

Ellipsis: present participle
Antecedent: base form

(McWhorter is also playing with two different uses of one, as I pointed out in that earlier posting.)

Here's a recent instance of the McWhorter Configuration, from a letter to The Advocate (4/20/07, p. 8) from Robert Barzan of Modesto, Calif., about the development of gay life there:

[?] If this can happen in Modesto, it can happen anywhere, and it is ___.

This one seems to have been written entirely seriously.  It does have an additional point of interest, the quantificational semantics of anywhere.  Mechanically filling in the omitted VP gives us:

... it can happen anywhere, and it is happening anywhere.

This won't fly at all.  The problem is that any-expressions come in (at least) two flavors, and neither fits in this context.  There's a kind of existential reading for anywhere (and other any-expressions), but it's limited to "negative polarity" contexts: under negation ("It's not happening anywhere"), in yes-no questions ("Is it happening anywhere?"), and in conditionals ("If it's happening anywhere, it's happening in Modesto"), in particular.  Then there's a kind of universal reading for anywhere (and other any-expressions), involving "free choice any" ("It can happen anywhere" 'Pick any place X; it can happen there -- that is, in X'), but it too is limited in its contexts; the facts are complicated, but for anywhere in a VP, the preferred contexts have certain modals, as in "It can/could/would/might happen anywhere".  "It is happening anywhere" fits neither pattern.

To get something like the intended reading, we need to supply an explicit universal, something like:

... it can happen anywhere, and it is happening (almost) everywhere.

How to get this result, and in a general way, is a non-trivial task for semanticists.  Fortunately, I don't pretend to be a real semanticist, so I can pass on to an even more startling, and entertaining, piece of data, the title of Stephen Colbert's new book, which Lee Beck tells me (5/20/07) is, with the relevant parts marked as before:

I Am America (And So Can You ___)

This would appear to be just an omitted base-form VP (be America) with a present-tense antecedent (am America), a configuration that I described above as "generally unproblematic".  But Colbert's title, though (eventually) interpretable, is massively problematic.  To put off some complexities, I'll shift to a slightly different version, with additive adverbial too rather than so:

I Am America (And You Can ___ Too)

Compare this tortured ellipsis with the much better:

I Love America (And You Can ___ Too)

The problem is with omitted base-form VPs with head verb BE.  And it's general.  Colbert's original has BE + predicative NP, but the same problem arises with predicative AdjP, predicative PP, present participle VP in the progressive, and past participle VP in the passive:

AdjP:  I Am American (And You Can ___ Too)
PP: I Am At My Peak (And You Can ___ Too)
VP in progressive: I Am Wearing a Hat (And You Can ___ Too)
VP in passive: I Am Being Praised (And You Can ___ Too)

All of these are much improved if the be is preserved, that is, if only the complement of be is omitted in VPE:

NP: I Am America and You Can Be ___ Too
AdjP: I Am American and You Can Be ___ Too

(Remember from earlier discussions that, although the construction is called "VPE", the omitted material is not necessarily a VP.  Labels are not definitions.)

In any case, there's a constraint on VPE against omitting certain VPs headed by BE; call it the VPE BE Constraint.  The details aren't crucial to an analysis of Colbert's title, so I'll pass on to the original, with so instead of too.

The thing with additive adverbial so is that (a) it has to be initial in its (elliptical) clause (while too is final) and (b) it requires the "inverted" auxiliary + subject order (while too has the default subject + auxiliary order):

I Can Be America (And ...
     So Can You ___    *Too Can You ___
    *So You Can ___    *Too You Can ___
    *You Can ___ So      You Can ___ Too )

and, significantly, for many speakers (c) it requires ellipsis of the complement of the auxiliary (while too does not):

I Can Be America (And ...
     So Can You ___               You Can ___ Too)

I Can Be America (And ...
    *So Can You Be ___          You Can Be ___ Too)

I Can Be America (And ...
    *So Can You Be America   You Can Be America Too)

Call this last condition the SO Ellipsis Requirement.

Now, when we move away from these fully parallel examples to ones with a present-tense verb in the antecedent, we're confronted with a conflict between the two conditions: the VPE BE Constraint says not to omit, in this context, the be of you can be America, but the SO Ellipsis Requirement says you must omit it, in general.  Colbert omitted the be, violating the first condition; preserving the be (and violating the second condition) gets you:

[??] I Am America (And So Can You Be ___)

I find this marginally more acceptable than Colbert's choice, but less entertaining (because his presents a little puzzle for you to solve).
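
The conflict can be made mechanical with a rough sketch in Python. Everything named here (Candidate, violates_vpe_be, violates_so_ellipsis, violations) is invented for illustration, and the VPE BE check is a deliberately crude stand-in: it assumes that omitting be is bad exactly when the antecedent supplies no base-form be, which covers the examples above but glosses over the details set aside earlier.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    text: str
    uses_so: bool            # elliptical clause uses additive "so" (vs. "too")
    antecedent_has_be: bool  # antecedent has base-form "be" ("can be America")
    be_omitted: bool         # the "be" of the elliptical clause is left out


def violates_vpe_be(c: Candidate) -> bool:
    # Assumed simplification of the VPE BE Constraint: omitting "be" is a
    # violation when the antecedent offers no base-form "be" to match it.
    return c.be_omitted and not c.antecedent_has_be


def violates_so_ellipsis(c: Candidate) -> bool:
    # SO Ellipsis Requirement: with "so", the whole complement of the
    # auxiliary must be omitted, so a retained "be" is a violation.
    return c.uses_so and not c.be_omitted


def violations(c: Candidate) -> List[str]:
    found = []
    if violates_vpe_be(c):
        found.append("VPE BE Constraint")
    if violates_so_ellipsis(c):
        found.append("SO Ellipsis Requirement")
    return found


colbert = Candidate("I Am America (And So Can You ___)", True, False, True)
with_be = Candidate("I Am America (And So Can You Be ___)", True, False, False)
parallel = Candidate("I Can Be America (And So Can You ___)", True, True, True)
too_be = Candidate("I Am America (And You Can Be ___ Too)", False, False, False)

for c in (colbert, with_be, parallel, too_be):
    print(c.text, "->", violations(c) or "no violation")
```

On this toy encoding, each of the two so-versions with a present-tense antecedent trips exactly one condition, while the fully parallel version and the too-version with be preserved trip neither. The sketch records only which condition is violated; it says nothing about how bad each violation sounds.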

On the other hand, maybe Colbert was just treating BE as if it were a transitive verb -- "What do you do to America?  I be America!" -- in which case the elliptical clause (So Can You) is impeccable, but the preceding main clause (I Am America) is peculiar.  Colbert being Colbert, maybe we shouldn't try to decide between these possibilities.

zwicky at-sign csli period stanford period edu

Baseball conditionals

I had two breakfast meetings this morning, which cut into my usual blogging time, so I apologize for neglecting an inbox full of fascinating messages about Belgian finch dialects, jihad and war etymologies, linguistic lolcats, redundant prepositions, and other topics as well. In the few minutes before my next appointment, though, I do have time to pass along a truly wonderful quotation. Under the subject line "Baseball counterfactuals", Barbara Partee wrote:

Here's an example of the now-common way of expressing counterfactual conditionals among baseball players and managers that's so extreme I had to read it twice before catching on that it was a counterfactual; the preceding sentence and the play-by-play show that it must be, and it's grammatically consistent with other slightly less extreme examples I encounter almost daily.

(Structure: Clause1, clause 2. -- both plain indicatives. Interpretation: if clause 1 had been the case, clause 2 would have been the case.) Oh, I think what makes this one a little unusual is that the first clause is in the past tense. I think usually they're both present tense.

"He could have been a little rusty early on, and then the inning he gave up four runs I think he kind of lost his composure a little bit," Orioles manager Sam Perlozzo said. "He just did a little damage control in that situation, we're OK." (AP Recap of Toronto 6, Baltimore 4, game of May 22, David Ginsburg, AP Sports Writer)

Perlozzo means that if the pitcher (Daniel Cabrera) had kept his composure and done a little damage control in the fifth inning, his team would have been OK. That's Barbara's analysis, and she's obviously right.

I think of this general class of constructions as "Elmore Leonard conditionals". Here are a couple of examples that I cited in an earlier post ("Parataxis in Pirahã", 5/19/2006):

"What're you having, conch? You ever see it they take it out of the shell? You wouldn't eat it."

"Listen," Renda said, "we get to a phone we're out of the country before morning."

The first one is counterfactual, and has the additional feature of an implicit when-clause: "If you had ever seen [a conch] when they take it out of the shell, you wouldn't eat it." The punctuation may imply a rising intonation on the clause ending in "out of the shell", and may also suggest a missing initial "did".

The second example is a more ordinary present-tense case, equivalent to "if we get to a phone, we'll be out of the country before morning".

I don't think there's any special connection between the national pastime and paratactic conditionals of this kind, whether counterfactual or otherwise. But baseball is an activity where post-game counterfactual reasoning is half the fun, and it's carried out by the same kind of (vernacular American) people that Elmore Leonard writes dialogue for.

Exercise for the reader -- analyze the functionally similar (but structurally different) example at the end of the same article:

"We just needed one fly ball or a base hit with the bases loaded, and we've got a different ballgame," Perlozzo said.

Politics and Lexical Semantics

A few minutes ago on Jeopardy a contestant missed the question "one of the two land-locked countries in the Himalayas" by responding with the name of the largest and most famous: Tibet. The answers they wanted were Nepal and Bhutan. Tibet is as landlocked as Nepal and Bhutan (unlike Pakistan and India, which have seacoasts), so the Jeopardy people must have excluded it on the grounds that it is not a country. As a proponent of the liberation of Tibet, and more generally of national self-determination, I was offended by Jeopardy's acceptance of Chinese occupation of Tibet.

Is this purely a matter of politics? Is the question of whether Tibet is a country purely a matter of whether one accepts China's political position? One could argue that, whatever the rights and wrongs of the matter may be, Tibet is not a country because the Tibetan government no longer has de facto control of its territory. But if a country is only a country if it has control of its territory, how is it that Somalia has continued to be considered a country even though it has had no effective government since 1991?

Perhaps the criterion is that a former country remains a country so long as no other country controls it? This would distinguish Somalia from Tibet, but it fails to capture the way in which we regard other countries as having continued to exist during periods of occupation and outright annexation. For example, Japan annexed Korea in 1910 and remained in control until 1945. Although foreign control ceased, Korea has yet to be reunified. Nonetheless, it is quite common to refer to Korea as a single country which has at times been over-run by other countries and at times, including the present, divided into two or more parts.

Although the answers considered correct by Jeopardy can be defended on narrow legal grounds, the usual use of "country" is different, based on the existence of a political and cultural entity whose status may change over time. Since Jeopardy generally accepts all reasonable answers and the question did not specify a particular legal theory, Jeopardy should have considered Tibet a valid answer. Their failure to do so seems very likely to be a political choice.

Watch out for those Wallonian finches

Language Log Labs is working up an acoustic analysis -- or will be, as soon as I can find some audio clips -- but meanwhile, there's no reason for you not to enjoy Dan Bilefsky's recent article on a popular sport in Flanders, finch tweeting ("One-Ounce Belgian Idols Vie for Most Tweets Per Hour", NYT, 5/21/2007):

The timekeeper waves a large red flag. Spectators wait in hushed anticipation. The nearly 50 featherweight rivals — including Rambo and Duracel — are surrounded by nervous trainers.

But the event is not a boxing or a wrestling match. The one-ounce contestants, with gray caps and blue beaks, will be judged on how many “susk-e-wiets” they can tweet in an hour from inside a wooden box.

This is vinkensport, or finching, the 400-year-old Flemish competition in which winning finches are feted like feathered opera divas, and one false note, like a “susk-e-wiat” instead of a “susk-e-wiet,” can lead to disqualification or, worse, disgrace. [...]

“As with any star athlete, like Lance Armstrong, what separates a champion bird from a loser is natural talent,” said Filip Santens, a leading vinkenier, who prepares his five-time champion chaffinch for matches by pumping heavy-metal Guns N’ Roses music into his cage and feeding him high-protein birdseed.

You'll want to read the whole thing, but my favorite part was this:

This being Belgium — divided between Dutch-speaking Flanders in the North and French-speaking Wallonia in the south — the sport also has been infected by linguistic divisions. Flemish finchers insist that only Flemish chaffinches chirp the susk-e-wiet and that Wallonian finches, found a few miles away, sing in a dialect closer to French. If a bird fails to sing the Flemish susk-e-wiet, it can be disqualified. (Hard-core vinkeniers insist they can discern the difference.)

“In Belgium, even the birds sing in different languages,” said Romain Furniere, 70, a veteran vinkenier, who recalled that when he was a young man in western Flanders, he and his friends would capture finches in the wild but release those that sang “in French.”

For more background, see Bill Poser's recent post on "The Belgian Language".

[If you have a pointer to some recordings of Flemish vs. Wallonian finches, please let me know. I've already found the web site of the Koninklijke Nationale Federatie Algemene Vinkeniersbond A.Vi.Bo. v.z.w., and I do know about Peter Marler, "Variation in the song of the Chaffinch, Fringilla coelebs", Ibis 94:458-472 (1952), etc. It's the specific Flemish vs. Wallonian dialects that I'm looking for.]

[By the way, I apologize for not making it completely clear which aspects of this post are jokes. Geographical variations (i.e. "dialects") in the songs of some birds, including finches, have been well documented for more than 50 years. However, in the cases that I know about, the "dialect regions" involved are fairly small, like a few square kilometers or so; and I suspect that it's unlikely that finch dialect regions correspond in any systematic way with human linguistic boundaries in Belgium. Therefore, my guess would be that the Flemish finch enthusiasts are projecting their own linguistic issues onto some locally-varying birdsong features. I'm hoping to sort out the facts, through some combination of reading the scientific literature and looking at available recordings; meanwhile, I thought that readers might enjoy the article.

Let me also point out that Wallonian is apparently not really a word, although I picked it up from its use in the cited NYT article -- the correct adjectival form for Wallonia should be Walloon. It's not only George W. Bush who has problems with over-regularization of toponymic adjectives...]

[Alex Baumans writes:

As far as I've understood the sport (and I am not, in any way, an expert) the Flemish-Walloon difference has to do with the judging. Only a correct suskewiet is counted as a 'hit' (slag). It is, by the way, a fascinating sight. You have a row of birdcages with in front of them agricultural looking gentlemen in anoraks with a long stick on which they chalk up the hits. As far as excitement is concerned, sheep trialling beats it hands down.

What is a correct 'song' is judged differently in Wallonia and in Flanders (no doubt influenced by local song patterns). As it is said on the Avibo website:

In tegenstelling tot Vlaanderen waar alleen Vlaamse of inheemse zang getekend wordt, gelden in Wallonië meerdere zangen. ("Contrary to Flanders, where only Flemish or local song is counted, in Wallonia several songs are valid")

So, as far as competition finches are concerned, yes there is a Flemish and Walloon way of singing.

I'm a bit confused by the Avibo quote, which could be interpreted to mean that in Walloon competition, anything goes (well, at least, "several songs are valid"), whereas in Flanders, the rules are stricter. And does "Vlaamse of inheemse zang" imply that Flemish song and local song are two different options? Or are they just two different ways of describing the same thing?]

The global sausage on terror

An article in this month's Arts & Sciences magazine quotes Jamal Elias, my colleague here at Penn, as follows:

"Jihad doesn't really have a root meaning that has anything to do with war," Elias explains.

Professor Elias is referring to the fact that jihad is based on a root J-H-D whose core meaning is "to strive" or "to exert effort". But the funny thing is, war doesn't really have a root meaning that has anything to do with war, either. As the American Heritage Dictionary explains,

The chaos of war is reflected in the semantic history of the word war. War can be traced back to the Indo-European root *wers–, “to confuse, mix up.” In the Germanic family of the Indo-European languages, this root gave rise to several words having to do with confusion or mixture of various kinds. One was the noun *werza–, “confusion,” which in a later form *werra– was borrowed into Old French, probably from Frankish, a largely unrecorded Germanic language that contributed about 200 words to the vocabulary of Old French. From the Germanic stem came both the form werre in Old North French, the form borrowed into English in the 12th century, and guerre (the source of guerrilla) in the rest of the Old French-speaking area. Both forms meant “war.” Meanwhile another form derived from the same Indo-European root had developed into a word denoting a more benign kind of mixture, Old High German wurst, meaning “sausage.” Modern German Wurst was borrowed into English in the 19th century, first by itself (recorded in 1855) and then as part of the word liverwurst (1869), the liver being a translation of German Leber in Leberwurst.

Thus the use of war in English to refer to certain kinds of armed conflict is at least five centuries more recent than the use of jihad in Arabic for the same purpose. And English has some even more recent residues of the metaphorical shift from mixture to mayhem: to mix it up or to tangle with someone; a mêlée.

According to the OED, the word war has also been used since 1200 in a figurative sense, to describe "conflict between opposing forces or principles", as in

c1374 CHAUCER Troylus v. 234 Who kan conforten now youre hertes werre?

No one has written about the two kinds of war more vividly than that 18th-century salafist, William Blake -- though he didn't call them the "greater" and "lesser" forms of war, but rather "mental" vs. "corporeal" war. Here's the preface to his poem Milton:

The Stolen and Perverted Writings of Homer & Ovid: of Plato & Cicero. which all Men ought to contemn: are set up by artifice against the Sublime of the Bible. but when the New Age is at leisure to Pronounce: all will be set right: & those Grand Works of the more ancient & consciously & professedly Inspired Men, will hold their proper rank, & the Daughters of Memory shall become the Daughters of Inspiration. Shakspeare & Milton were both curbd by the general malady & infection from the silly Greek & Latin slaves of the Sword.

Rouze up O Young Men of the New Age! set your foreheads against the ignorant Hirelings! For we have Hirelings in the Camp, the Court & the University: who would if they could, for ever depress Mental & prolong Corporeal War. Painters! on you I call! Sculptors! Architects! Suffer not the fash[i]onable Fools to depress your powers by the prices they pretend to give for contemptible works or the expensive advertizing boasts that they make of such works; believe Christ & his Apostles that there is a Class of Men whose whole delight is in Destroying. We do not want either Greek or Roman Models if we are but just & true to our own Imaginations, those Worlds of Eternity in which we shall live for ever; in Jesus our Lord.

This introduces the famous hymn:

And did those feet in ancient time.
Walk upon Englands mountains green:
And was the holy Lamb of God,
On Englands pleasant pastures seen!

And did the Countenance Divine,
Shine forth upon our clouded hills?
And was Jerusalem builded here,
Among these dark Satanic Mills?

Bring me my Bow of burning gold:
Bring me my Arrows of desire:
Bring me my Spear: O clouds unfold!
Bring me my Chariot of fire!

I will not cease from Mental Fight,
Nor shall my Sword sleep in my hand:
Till we have built Jerusalem,
In Englands green & pleasant Land.

Would to God that all the Lords people were Prophets.

Of course, due to his neglect of corporeal war, Blake's followers are not known to have captured and held any English departments, much less any countries.

[Update -- Chad Nilep writes:

Of Jamal Elias's claim, "Jihad doesn't really have a root meaning that has anything to do with war," you suggest, "Professor Elias is referring to the fact that jihad is based on a root J-H-D whose core meaning is 'to strive' or 'to exert effort'."

The word "root" of course has many uses. Quoting an AHD entry on *wers- suggests the 'etymological source' meaning of root. I don't think, though, that Dr. Elias necessarily intended 'root' in the sense of 'historical source' or even 'semitic triliteral'. By "root meaning" he may have intended "basic meaning".

As you know, jihad means "to strive" and is often used by Muslims in the sense of striving to be a good Muslim. For some people, such as the Mujahidin and various so-called terrorist groups, this striving is military and/or violent. For many, though, jihad implies simply striving to be a good person or striving to avoid sin.

English "wurst" and "war" share a historical connection, but it is not obvious to contemporary speakers. The synchronic polysemy of "jihad" on the other hand, is probably obvious to Arabic speakers. (More to Dr. Elias's point, though, it is not apparent to consumers of English-language media.)

Well, the (often positively-associated) metaphorical uses of words like war, battle, fight, attack, etc. are also quite accessible to contemporary English speakers, so the "greater jihad"/"lesser jihad" idea is an easy one to understand, even if there are only about 30,000 Google hits for "greater jihad" compared to 13 million for plain "jihad". Anyhow, my point was only that the historical (or even contemporarily associated) meanings of a term are less important than how it is intended on particular occasions of use.]

[John Cowan observed:

Frye and Davies may not have *captured and held* the UToronto department, but they were certainly pretty important there for a long time. We can throw in McLuhan, too.]


[Jay Cummings:

President Bush was widely criticized for using the word "crusade" prior to the formulation of "the war on terror". Crusade has long been used for non-violent struggles, usually with a moral underpinning. I think in modern usage it is a very good translation for most uses of "jihad". It can be construed as a literal holy war or as a moral struggle. Jihad, however, can mean an individual internal struggle, whereas crusade ordinarily would not.

Likewise, Arabic media should have translated the word as used by Bush, and most uses of crusade, as jihad. Perhaps Bush's writers should have foreseen the provocative choice not to, and the ensuing uproar. But that particular bushism never bothered me much.]


Correction was made to grammatical errors in body of notice.

An e-mail announcement was sent earlier this morning to everyone here at my home institution, signed by our Chancellor, announcing the retirement reception of a distinguished member of our administration. About a half hour later, virtually the same message was re-sent with the title of this post as an explanation. I was curious about what the "grammatical errors" might be, so I read the two messages a little more closely. There indeed was one grammatical error, duly corrected, but there were a number of other corrections of a completely different sort.

First, the sentence with the grammatical error. (Here and throughout, the original message text is in red, the corrected message text is in blue, and the specific corrected bits are highlighted in boldface in both.)

He has developed and implemented the faculty mentor program and has worked with Career Services to ensure that our programs meet the needs of all students and encourages them to apply to graduate and professional schools.

He has developed and implemented the Faculty Mentor program and has worked with Career Services to ensure that our programs meet the needs of all students and encourage them to apply to graduate and professional schools.

The grammatical error is one of subject-verb agreement: our programs ... encourage, not our programs ... encourages. An understandable error given the complex form of this sentence, and one that is surely not worth flooding everybody's inboxes with a new version of the exact same message. But the good folks in the Chancellor's office made a number of other "corrections" to the original message while they were at it.

You can see one of those in the first part of the sentence above: faculty mentor program in the original is capitalized to Faculty Mentor program in the corrected version. This isn't a question of grammar, at least not in the technical sense used by linguists; it's a question of orthographic convention: names, titles, and other such things tend to be capitalized. (Please ignore the fact that it's supposed to be Faculty Mentor Program, with all three words capitalized.)

Another correction has me wondering:

The reception will be held from 3 to 5

The reception will be held from 3 to 5 p.m.

I'm guessing that the original error was due to the Cupertino Effect, but I haven't been able to replicate it (try as I might) with the spelling-and-grammar checker of my word processor (Microsoft® Word X for Mac® Service Release 1) -- if anyone out there is successful, or has an alternative hypothesis, please follow the comment link below.

Another correction is extremely curious:

Dr. Watson's 11-year service as the founding Provost of Third College, now Thurgood Marshall College

Dr. Watson's 11-year service as Provost of Third College, now Thurgood Marshall College

According to this recent UCSD press release, Dr. Watson was indeed the founding Provost of the relevant college (he served as Provost from 1970 to 1981, and the college was founded in 1970). Why the "correction"?

The remaining corrections involved changing some perfectly good commas (in a list of Dr. Watson's many accomplishments) into semi-colons, removing a space from the name of our campus web portal, and adding a hyphen after the area code in the phone number that folks can call with questions about the retirement reception. All completely unnecessary, if you ask me (though, of course, nobody did or is).

U.N. declares 2008 International Year of Languages

On Friday the UN declared 2008 the International Year of Languages; here's the official press release. I got it via a post from Wayne Leman on the ILAT list; he had learned of it on the Eurolang website.

There's a noticeable difference in focus between the UN press release and the Eurolang article. The press release is all about multilingualism, eliminating disparities in the use of the UN's six official languages, and the need to provide UN services (including peacekeeping) in local languages, while the Eurolang article is more about protecting and conserving endangered languages.

Here's the first chunk of the press release:

The General Assembly this afternoon, recognizing that genuine multilingualism promotes unity in diversity and international understanding, proclaimed 2008 the International Year of Languages. ...The Assembly, also recognizing that the United Nations pursues multilingualism as a means of promoting, protecting and preserving diversity of languages and cultures globally, emphasized the paramount importance of the equality of the Organization's six official languages (Arabic, Chinese, English, French, Russian and Spanish).... Further, the Assembly emphasized the importance of making appropriate use of all the official languages in all the activities of the Department of Public Information, with the aim of eliminating the disparity between the use of English and the use of the five other official languages.

The press release focusses on multilingualism as a means of facilitating communication between peoples -- "genuine multilingualism promotes [unity in diversity] and [international understanding]".

The Eurolang article, on the other hand, quoting the floor speeches touching on protecting and preserving minority and endangered languages, reports that

The Assembly called upon States and the Secretariat to work towards the conservation and defence of the world's languages.

I felt there was a kind of disconnect between these two descriptions of the proceedings. This disconnect is reflected in the Eurolang article, which, apparently conflating the 'unity through multilingualism' part and the 'protect diversity' part, states that the UN "will aim to promote unity through linguistic diversity," which surely is a bit of an odd thing to say. Linguistic diversity is a Good Thing, but it's hardly in principle unifying!

Qua linguist, I can get 100% behind both these goals, both (a) promoting multilingualism in national/majority languages particularly with a view toward fostering international understanding, and (b) protecting and revitalizing endangered minority languages. These goals are certainly not mutually exclusive, but they're really not the same.

The UN seems to think they are, though. This is implied in the press release clause about how "the United Nations pursues multilingualism as a means of promoting, protecting and preserving diversity of languages and cultures globally". This seems to me to be like promoting the use of rice, potatoes, wheat and corn as a means of promoting and preserving diversity of staple crops globally.

When I went to look at the actual text of the resolution, the sense of disconnect persisted. The bulk of the resolution -- OPs 2 through 22 -- has to do with translation services and intra-UN concerns about the use of the six official languages. The last bit -- OPs 23-25 -- has to do with promoting linguistic diversity and preserving endangered languages.

From those last few paras, though, I did learn that February 21 is "International Mother Language Day", which I didn't know (though it hasn't escaped the collective eagle eye of us Loggers, so I ought to've) and that on March 18th of this year, UNESCO's "Convention on the Protection and Promotion of the Diversity of Cultural Expressions" came into force. All good.

International Mother Language Day is in my calendar now. We'll officially celebrate it from here on out with a brunch at the Plaza featuring finger foods from around the world and a keynote address on the grammatical properties and current usage context of a featured endangered language from an expert speaker or fieldworker. Anyone who can convincingly greet the bartender in an endangered language gets a free drink.

Here's another sign of the current Cosmic Chicken Convergence. According to Paul Kane's Capitol Briefing blog ("McCain, Cornyn Engage in Heated Exchange"):

During a meeting Thursday on immigration legislation, McCain and Sen. John Cornyn (R-Texas) got into a shouting match when Cornyn started voicing concerns about the number of judicial appeals that illegal immigrants could receive, according to multiple sources -- both Democrats and Republicans -- who heard firsthand accounts of the exchange from lawmakers who were in the room.

At a bipartisan gathering in an ornate meeting room just off the Senate floor, McCain complained that Cornyn was raising petty objections to a compromise plan being worked out between Senate Republicans and Democrats and the White House. He used a curse word associated with chickens and accused Cornyn of raising the issue just to torpedo a deal.

The New York Times may "take shit from the president", but the Washington Post is apparently not yet willing to accept chickenshit from a mere senator. Amazingly, Andrew Sullivan was (at least temporarily) baffled by this bit of bowdlerization.

The American Heritage Dictionary has an entry that covers Senator McCain's usage:

NOUN: Vulgar Slang Contemptibly petty, insignificant nonsense.
ADJECTIVE: 1. Contemptibly unimportant; petty. 2. Cowardly; afraid.

But the august OED does not: it recognizes only

chicken-shit (coarse slang, orig. U.S.), a coward; also used as a general term of abuse; also attrib. or as adj.

with citations back to 1947:

1947 C. WILLINGHAM End as Man xvi. 192 You're both acting like *chicken-shits. We win a batch of money—you're afraid to take it.
1948 N. MAILER Naked & Dead I. i. 7 ‘What's the matter?’ he asked. ‘You going chickenshit?’
1968 Southerly XXVIII. 281 ‘You're just a pile of compromising chickenshit,’ Gillian says in a whisper.
1969 C. HIMES Blind Man with Pistol xix. 203 She's a slut, just a chickenshit whore.
1970 It 12-25 Feb. 17/1 American groups are not so chickenshit about getting into underground work.

1988 ‘DR. DRE’ et al. Fuck Tha Police (song) in L. A. Stanley Rap: the Lyrics (1992) 238 The jury has found you guilty of being a red-neck, white-bread chicken-shit motherfucker.

The gloss "also used as a general term of abuse" is intended to cover uses like Senator McCain's, but it's way too general for the specific meaning that the AHD tags as "contemptibly petty, insignificant nonsense".

And actually, I'm not sure that the AHD has it entirely right. When I look back over my own extensive experience of chickenshit regulations, chickenshit objections, and the people who make use of them -- all encountered in earlier jobs or with people no longer among my circle of acquaintances, I hasten to add -- I agree that chickenshit is always contemptible, and frequently petty, but it is by no means always insignificant. More important, the gloss "petty, insignificant nonsense" leaves something out, something related to Harry Frankfurt's pathbreaking analysis of bullshit:

What bullshit essentially misrepresents is neither the state of affairs to which it refers nor the beliefs of the speaker concerning that state of affairs. Those are what lies misrepresent, by virtue of being false. Since bullshit need not be false, it differs from lies in its misrepresentational intent. The bullshitter may not deceive us, or even intend to do so, either about the facts or about what he takes the facts to be. What he does necessarily attempt to deceive us about is his enterprise. His only indispensably distinctive characteristic is that in a certain way he misrepresents what he is up to.

This is the crux of the distinction between him and the liar. Both he and the liar represent themselves falsely as endeavoring to communicate the truth. The success of each depends upon deceiving us about that. But the fact about himself that the liar hides is that he is attempting to lead us away from a correct apprehension of reality; we are not to know that he wants us to believe something he supposes to be false. The fact about himself that the bullshitter hides, on the other hand, is that the truth-values of his statements are of no central interest to him; what we are not to understand is that his intention is neither to report the truth nor to conceal it. This does not mean that his speech is anarchically impulsive, but that the motive guiding and controlling it is unconcerned with how the things about which he speaks truly are.

Similarly, it seems to me that the essence of chickenshit -- or at least a critical factor in chickenshit -- is an analogous implicit misrepresentation of motives. Bruce Schneier put his finger on one example in his essay "Why Smart Cops do Dumb Things":

Since 9/11, we've spent hundreds of billions of dollars defending ourselves from terrorist attacks. Stories about the ineffectiveness of many of these security measures are common, but less so are discussions of why they are so ineffective. In short: Much of our country's counterterrorism security spending is not designed to protect us from the terrorists, but instead to protect our public officials from criticism when another attack occurs.

He calls this "Cover Your Ass security". Whether or not he's right about the particular examples he cites, the phenomenon clearly exists. And everyone has encountered many other cases where elaborate regulations and precautions seem not to be connected to any rational calculation of risk and cost, but instead to have some other motivation. It could be CYA, but it could also be bureaucratic aggrandizement and turf protection -- if the people in the regulatory biz just sit around without creating, modifying and enforcing regulations, jobs might be cut. Ditto with the people in the business of keeping records. There are good reasons for regulations and for record-keeping, but in addition, some of the people involved in the process have another motive to create and enforce non-functional complexity.

Maybe the commonest hidden motivation is simply money -- insurers have an interest in imposing chickenshit procedures on those they insure, and similarly with government agencies and their clients. And chickenshit regulations or procedures can also mask plain old conflict between individuals or groups.

The key point about chickenshit regulations and procedures is not that they are unproductive or even counter-productive, and the key thing about chickenshit objections is not that they are invalid. The essential point is that their authors just don't care, one way or the other. Their motivations are orthogonal to considerations of effectiveness and validity.

In the case of Thursday's senatorial dispute, Kane's description makes it clear that Senator M felt that Senator C's objections to detailed aspects of the immigration bill were hiding a different set of motives: philosophical objections to immigration reform, or political calculation about the effects of a bipartisan deal on the subject. It would be childish and counterproductive to lose your temper at a fellow law-maker for paying sincere attention to the details of a bill, but it's normal to get angry at someone who hides general and fundamental disagreement behind a blizzard of specific objections. And you have little to lose, in that circumstance, by indulging in a little plain talk.

(More Language Log posts on the bovine variety of metaphorical excrement: "Bullshit", 2/17/2005; "Bovine excrement on NPR", 3/11/2005; "Labov's test", 8/17/2005; "Bullshit: invented by T.S. Eliot in 1910?"; "Maybe it was John Dryden", 8/20/2005; "British Science: West Point takes the lead", 8/21/2005; "The great tradition", 9/2/2005; "The ultimate nightmare becomes an everyday reality", 1/10/2007)

[Update -- Stephen Ambrose discusses the WWII roots of chickenshit here. He quotes a definition from Paul Fussell's Wartime, which sums it up nicely:

Chickenshit refers to behavior that makes military life worse than it need be: petty harassment of the weak by the strong; open scrimmage for power and authority and prestige... insistence on the letter rather than the spirit of ordinances. Chickenshit is so called -- instead of horse -- or bull -- or elephant shit -- because it is small-minded and ignoble and takes the trivial seriously. Chickenshit can be recognized instantly because it never has anything to do with winning the war.]


[Update -- Bruce Rusk writes:

I have to disagree with your knocking of the AHD for the "Contemptibly unimportant; petty." definition. I've heard and read the word in this sense, and this is the way I'd be most likely to use it myself (perhaps there's some regional variation?).

Here are some quickly googled examples:

Inflation made your fortunes *worth chickenshit* and you end up becoming the town drunk.

we get paid chickenshit - compared to the amount of hours..

“You flunked. You are the proud possessors of nothing at all. Whenever Lawndale High School gets around to giving you a diploma, it will be worth chickenshit given your miniscule levels of intellectual achievement. And Lane…!”

the $17 million stock offered by Wetherell would be worth chickenshit few months later...Maybe BG can eat it dumbhead...

Also, Google Books yields ten results for "chickenshit job"; here's a prime one:

I want my dad to be proud of me, but I am not going to suck up to every Tom, Dick, and Harry who offers me a dead-end, *chickenshit job* just so dad can boast ... (If You Walked In My Shoes - Page 180)

So while this sense may be much less common than the others, it's hard to read these examples in any other way (and its usage seems to be nominal rather than adjectival, most clearly in the collocation "worth chickenshit").

I doubt that there is much regional variation on this point -- all the cited examples seem idiomatic to me. There's clearly a range of uses for which "contemptibly petty" is about right. But then there are other cases where the "nonsense" part comes into play, as a lack of fit between ostensible purpose and actual motive.]

[Update #2 -- Andy Hollandbeck writes:

The Washington Post's avoidance of curse words in the case of McCain vs. Cornyn is another example of how NOT using the word itself can lead, or mislead, readers to misunderstand what actually happened. As I read the excerpt from Paul Kane's blog, when I got to the bit about McCain using "a curse word associated with chickens," I assumed that McCain had called him a cock. Or maybe even a pecker.

The circumlocution can leave one wondering whether McCain was attacking Cornyn's acts or launching a personal verbal assault directly at Cornyn -- which may have repercussions among voters trying to decide what type of president McCain would be.

And Frederick Dicky writes:

The Fussell quote you give concerning the origins of chickenshit is exactly correct in my opinion. I served in the US Army in the 68-70 timeframe and chickenshit was alive and well in exactly the sense given in the quote. I probably heard the term used more times in the Army than I have since I left the Army in 1970. For me, the term is primarily a military term. McCain served in the Navy at roughly the same time I served in the Army. I assume the term was also common in the Navy. When I read your quote concerning McCain's use of the term, my first thought was that McCain, the former Naval aviator, not the senator, was speaking.

Yes, my experience in the Army around 1969 was the same, and my reaction to McCain's usage was similar.]

[Update #3: Jay Cummings writes:

I think Bruce Rusk's examples are heavily influenced by the expression "chickenfeed" meaning something of little worth. To me, chickenfeed would be a noun and chickenshit an adjective, ordinarily. Hence, "chickenshit job" makes sense, but the other examples, saying something is "worth chicken___" should use chickenfeed.

Chickenfeed, chickenshit, it's all the same in the end.

The resonance with chickenfeed is very relevant to some of Bruce Rusk's examples. But the (originally military?) use of chickenshit, described by Fussell, can certainly be a noun as well as an adjective.]

[Update #4 -- Mark Paris writes:

I don't have anything to add regarding the language use of chickenshit, but I would like to defend the existence of chickenshit itself, at least in some cases. The most obvious case to me is in some government procurement regulations. Critics have cited long federal acquisition regulations specifying, for example, how creme-filled cookies must be made for bidding on federal contracts. That and many other examples certainly seem idiotic, until you consider what contractors might try to pass off as creme-filled cookies in the absence of minutely specified ingredients. Given the alternative, maybe chickenshit regulations are not such a bad idea in some cases.

If the specifications genuinely protect America's consumers of federally-procured creme-filled cookies, then they aren't chickenshit, at least by my lights.]

Humpty Dumpty linguistics from Robert Fisk

I was surprised to learn from a column in The Independent called "Blair's lies and linguistic manipulations" that Robert Fisk once studied linguistics. Fisk reports:

By great good fortune, I studied linguistics at Lancaster University. Indeed, I read the books of Noam Chomsky, many years before he became a good friend of mine; to be honest, when I read his work, I thought Chomsky was dead.

Eventually Fisk did learn that Chomsky was alive, and met him, and was delighted to find that they shared politics. And in his column dated May 19 he appears to be attempting to connect Chomsky's linguistics to the political views they share. Unfortunately, it rapidly becomes clear that Fisk is totally clueless about the linguistics he claims to have studied at the University of Lancaster while studying for his B.A. degree in English and Classics. It struck me as a little suspicious right at the start that a serious student of linguistics in the mid to late 1960s could possibly have thought that Chomsky (still under 40) was other than highly active, since he was publishing constantly at a staggering rate. And looking at the use Fisk makes of his dimly recollected linguistics classes reveals that he simply was not paying attention — and as usual, did absolutely no fact-checking before he started hammering out angry prose.

The subject of the latest piece of prose hammering is, once again, his hatred and contempt for U.K. prime minister Tony Blair. (New readers should note that Fisk always refers to Blair as "Lord Blair of Kut al-Amara", this being some private joke of his about the proper name to be associated with the peerage that presumably lies in Blair's future; the siege of Kut al-Amara in 1915-1916 has been described as "the most abject capitulation in Britain's military history", so it would not be an honorable name to attach to one's title.) This week's news is that Blair is stepping down, and Fisk says:

Lord Blair is going from us. His self-serving memoirs will, of course, remind us of his God-like view of himself (and, heaven spare me, we share the same publishers) but I doubt if Chomsky's "foregrounded elements" will save him. A "foregrounded element" was something unusual, a phrase placed in such a way that it warned us of a lie to come.

This is not a good account of what a "foregrounded element" is (or was; the odd tense choice seems to be an allusion to the fact that back during his undergraduate days at Lancaster Fisk was taught this definition of foregrounded elements). Those acquainted with linguistics will be surprised to see the word associated with Chomsky's name: foregrounding is a matter of discourse and pragmatics, not of Chomsky's primary subfield of linguistics, syntax. I for one cannot recollect him using the term "foregrounded element" ever, not even a single time. Chomsky is famously uninterested in pragmatics, stylistics, or the analysis of discourse. I recall being told by someone who heard him lecture at the University of Colorado at Boulder that he dismissed the analysis of videotaped conversational discourse as a waste of time, not fit to be included in the language sciences. He has often mentioned in interviews that he does not connect his linguistics to his politics — he does not think there is any intellectually serious way in which linguistics can contribute to the analysis of political propaganda, for example.

"Foregrounded element" is a familiar term for those who study discourse rather than individual sentences, style rather than grammar, and information structure rather than syntactic structure. The primary use is in describing what happens in cleft sentences (see The Cambridge Grammar, pages 1414-1416, in the chapter headed "Information packaging"). Compare the most neutral way of saying that Oswald assassinated Kennedy, as in [1a], with the two cleft sentence versions in [1b] and [1c], in which the foregrounded element is underlined:

[1] a. Oswald assassinated Kennedy.
  b. It was _Oswald_ that assassinated Kennedy.
  c. It was _Kennedy_ that Oswald assassinated.

While [1a] would be suitable for use in a context where nothing was presupposed (for example, as the answer to "What happened next?"), [1b] would be suitable for addressing someone who knew about Kennedy's death but had the wrong idea about who the assassin was, and [1c] would be suitable for addressing someone who knew about Lee Harvey Oswald's crime but had forgotten which president had been his victim. As The Cambridge Grammar points out (page 1416), the part of the sentence following "that" gives a presupposed part of the meaning in the form of what logicians call an open sentence (a sentence with a variable substituted for one of its noun phrases), and the noun phrase preceding "that" provides the most important (hence foregrounded) contribution, the correct value for the variable. Example [1b] presupposes "x assassinated Kennedy", and contributes the crucial information that x = Oswald.

It is possible that Fisk may have confused foregrounding with fronting. Fronting is a syntactic matter: positioning a constituent of a clause at the beginning rather than where its grammatical function might have suggested it would go. The underlined elements in these sentences are fronted:

[2] a. _Her_, he tends to ignore.
  b. _Who_ do you love?
  c. _To my son James_ I leave the balance of my estate.

Sometimes fronting a phrase syntactically is a strategy for foregrounding it in terms of information structure, but the two should not be equated. In [2a], for instance, I am not presupposing that he tends to ignore some person x and asserting that x = her; almost the opposite is going on. It is presupposed that there is some female person x, and the new contribution to the discourse — left till last to make it salient — is that he tends to ignore x. Foregrounding her (as in It is her that he tends to ignore) has an entirely different pragmatic effect.

Emphatic stress can (in speech) also accomplish foregrounding: to say Oswald assassinated KENNEDY, with heavy stress on that last word, has an effect somewhat similar to using the cleft version It was KENNEDY that Oswald assassinated. So we have to take it into consideration that Fisk might have meant to refer to emphasis.

Fisk's own characterization of what he meant by a foregrounded element, "something unusual, a phrase placed in such a way that it warned us of a lie to come", is somewhere between the weak and the hopeless; and his suggestion that Chomsky has something to do with the definition of the term is just incorrect. But the worst thing is that every single one of the examples he gives makes it clear that he doesn't know what he's talking about.

His first example is bafflingly irrelevant: it is George Tenet's gloss on what he was referring to as a "slam dunk". He just slings this in as a warm-up. You can read it for yourself in context, but Fisk doesn't even try to connect it up to foregrounding, and nor will I.

The first claim Fisk makes that does relate to foregrounding concerns a Beirut newspaper which

quoted our dear Prime Minister as saying that he was very angry that a review committee had prevented him from deporting two Algerians home because their government represented a "different political system". The "foregrounded" element, of course, is the word "different". This is the word that contains the lie.

I note that he has suddenly departed from his own definition: the foregrounded element was supposed to have "warned us of a lie to come", but now it "contains the lie". That is very different. But never mind. The main thing is that different is not foregrounded in Fisk's sense or in any sense at all. It is an adjective functioning as an attributive modifier, and it is in exactly the expected position, not placed in some special way. It is not foregrounded or fronted or highlighted or emphasized or anything else that I can imagine he might have intended. What he wants to say is that he is furious that Blair should just say "different political system" when what Algeria actually has is (according to Fisk) a political system that "allows it to torture to death its prisoners." He raves for a paragraph or two about how disgusting the treatment of prisoners is, and gives horrible details about how prisoners are tortured or raped to death; and all that may well be just as he says it is. But in that case different is an understatement or a piece of evasiveness rather than a lie, and — more relevant here — foregrounding has nothing to do with what he is talking about.

Further examples of "foregrounding" now come thick and fast, and now, bafflingly, foregrounded elements are called "Chomsky foregrounded elements":

Putting the country first didn't mean "doing the right thing according to conventional wisdom" (Chomsky foregrounded element: conventional) or the "prevailing consensus" (Chomsky foregrounded element: prevailing). It meant "what you genuinely believe to be right" (Chomsky foregrounded element: genuinely). Lord Blair of Kut al-Amara wanted to stand "shoulder to shoulder" with Britain's oldest ally, which he assumed to be the United States. (It is actually Portugal, but no matter.) "I did so out of belief," he told us. Foregrounded element: belief.

Am I alone in being repulsed by this? "Politics may be the art of the possible (foregrounded element: may) but, at least in life, give the impossible a go." What does this mean? Is Blair adopting sainthood as a means to an end?

Conventional: adjective in the ordinary position for serving as attributive modifier, not in any sense foregrounded.

Prevailing: adjective in the ordinary position for serving as attributive modifier, not in any sense foregrounded.

Genuinely: adverb in the ordinary position for modifying a verb phrase, not in any sense foregrounded.

Belief: noun functioning as head of a noun phrase in the ordinary position for the object of a preposition (following of), not in any sense foregrounded. (It is the last word in the quoted sentence, incidentally, which makes it baffling how it could be an element that "warned us of a lie to come", which is what Fisk's definition says.)

May: modal auxiliary verb functioning as head of a verb phrase in the ordinary position to be predicate of a tensed clause, following its subject, the noun phrase politics; may is not in any sense foregrounded.

Not a single one of the words he singles out as "foregrounded elements" is foregrounded in any sense that someone who does syntax or semantics or pragmatics would recognize. You could do an empirical experiment on this: type out the sentences involved and take them to some linguist who has not seen this post and ask them to underline one word in each to indicate the foregrounded element. You don't even need to do the typing, I'll do it. Here they are; just copy and paste:

  1. He was very angry that a review committee had prevented him from deporting two Algerians home because their government represented a different political system.

  2. Putting the country first doesn't mean doing the right thing according to conventional wisdom.

  3. Putting the country first doesn't mean doing the right thing according to the prevailing consensus.

  4. It means doing what you genuinely believe to be right.

  5. I did so out of belief.

  6. Politics may be the art of the possible, but, at least in life, give the impossible a go.

The results of this forced choice test will be, I predict, roughly the same as if you chose nouns, verbs, adjectives, and adverbs using a pigeon or a monkey or a random number generator. Fisk may know a lot about the Arab world and the terrible things that have been going on there, and his hatred of Tony Blair may be rooted in real political arguments, but in this column he is just frothing. It is clear that he did not pay attention to his linguistics instructors at the University of Lancaster back in the 1960s, and he didn't look up any of the old notes he took as an undergraduate. He wrote his piece using "foregrounded element" to mean just what he wanted it to mean: something like "word which made me feel when I read it that I was looking at a disingenuous claim", or something along those lines. His grasp of syntax and information structure is nonexistent, and his semantics is that of Lewis Carroll's Humpty Dumpty. The linguists in Lancaster's eminently reputable Department of Linguistics and English Language must be a bit embarrassed that he mentioned them.

Update: The date of Fisk's degree from Lancaster appears to have been 1967; a piece from Selves and Others, July 2nd, 2005, which Mark Liberman located, says: "I belong to that generation of undergraduates who cut their teeth on linguistics. Lancaster University in its second year of existence", says Fisk; "Class of '67, if I'm not mistaken." In that piece, Fisk says this of Noam Chomsky:

Less famous then than now, he it was who introduced me to the "foregrounded element". "Foregrounded" is when someone places words in such an order that a new meaning is attached to them or deliberately leaves out a word that we might expect. The big bad man emphasises the meanness of the man. But the bad big man makes us think of size. "Big" has been "foregrounded". Real linguists won't like the above definition but journalists, I fear, sometimes have to distort in order to make plain.

Foregrounded, as understood in modern linguistics, means nothing like what he says here (which in any case is not the same as what he says in the Independent piece quoted above); and Chomsky (I am prepared to offer a modest bet) never said anything like this about foregrounding in the whole of his life.

Much more likely (as Mark points out to me) that Fisk got the term from a source presenting ideas something like those of (for example) Gerald Bruns. In his 1974 book Modern Poetry and the Idea of Language (too late to be the right source, but suggestive of what might have gone before), we find the following passage, quoted here, which has the right kind of muddle to be the sort of thing that could have inspired Fisk, and even a link (thoroughly inappropriate, in my view) to the early work of Chomsky:

The implication here, of course, is that in poetry the aesthetic experience is finally an experience of language itself. It is this idea of poetry which was taken up and developed on a systematic basis by the Prague Structuralists, who extended the traditional theory of linguistic functions or purposes (referential, conative, emotive) so as to include those utterances in which language is used intransitively. In place of Shklovsky’s ambiguous distinction between prose utterances and poetic speech (that is, between "tortured" and "easy" discourse), the Prague Structuralists, particularly Bohuslav Havránek and Jan Mukařovský, formulated a distinction between those utterances in which language is "automatized" according to the economy of everyday speech, and those in which language is "foregrounded." Foregrounding, according to Havránek, is "the use of the devices of the language in such a way that this use itself attracts attention and is perceived as uncommon, as deprived of automatization, as deautomatized, such as a live poetic metaphor (As opposed to a lexicalized one, which is automatized)." Thus, for example, Noam Chomsky’s happy line, "Colorless green ideas sleep furiously," is a foregrounded utterance. Although, as it happens, the sentence is perfectly grammatical—Chomsky composed it to show that meaning is not a necessary effect of "grammaticalness"—it is afflicted or, like many lines of poetry, blessed with a dissonance between lexicon and syntax that renders it impervious to whatever effort we may make to impose an interpretation upon it. The structure of words by which Chomsky’s utterance is constituted occupies, that is to say, that "foreground" of the utterance that is ordinarily the special domain of meaning.

Nearly all the ingredients are there: Fisk cites an example illustrating the salience an attributive adjective gains when it is positioned out of the usual order; Bruns connects foregrounding in the literary sense of "deautomatization" (that is, "use of the devices of the language in such a way that this use itself attracts attention" — utterly different from any of Fisk's efforts at defining the word) to a Chomsky example sentence (used by Chomsky to point out that grammaticalness and meaningfulness are distinct, not to illustrate foregrounding in any sense), and it has a couple of attributive adjectives in it... Mix all this together carelessly in a bowl and add a third of a bottle of whisky, and it should be possible to get something like Fisk's chaotic state of understanding.

But I am just speculating, of course; whether Fisk ever saw any source containing stuff like the above ideas of Bruns is not known (though it would appear that Bruns has been writing on similar topics since the 1960s). And I'm not sure I would trust a man who can't definitely remember the date of his own B.A. ("Class of '67, if I'm not mistaken") to give us a self-report.

Hearing the sentence in your head

From Atul Gawande's op-ed piece "Let's Talk About Sex" in the NYT, 5/19/07, p. A25:

Reducing unintended pregnancy is the key -- half of pregnancies are unintended, and 4 in 10 of them end in abortion.

The first reading I got was that 4 in 10 pregnancies end in abortion.  But whoa, that can't be right; surely the abortion rate isn't that high.  Gawande must have intended to say that 4 of 10 UNINTENDED pregnancies (2 in 10 of all pregnancies) end in abortion.
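To make the two readings concrete, here is the arithmetic as a small Python sketch. The variable names are my own, and the figures are only the round numbers from Gawande's sentence, not anything more precise.

```python
from fractions import Fraction

# "half of pregnancies are unintended"
unintended_share = Fraction(1, 2)
# "4 in 10 of them end in abortion"
four_in_ten = Fraction(4, 10)

# Reading 1 (unaccented "them" = all pregnancies): 4 in 10 of ALL pregnancies.
misreading = four_in_ten

# Reading 2 (accented "them" = unintended pregnancies): 4 in 10 of the unintended half.
intended_reading = unintended_share * four_in_ten

print("misreading:", misreading)              # share of all pregnancies
print("intended reading:", intended_reading)  # share of all pregnancies
```

On the second reading the product is one fifth, which is the "2 in 10 of all pregnancies" figure given above.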

I read the sentence in my head with unaccented them, the usual prosody for anaphoric pronouns; with that accenting, the referent of them is pregnancies, pregnancies being the nearest available antecedent NP, an NP moreover in a phrase (of pregnancies) parallel to the phrase (of them) that them is in.  Gawande presumably heard it in HIS head with accented them, the accent here signaling that the usual referent-finding procedures don't apply.  (The accented HIS in my last sentence illustrates a different use of accent: to point up a contrast, his head vs. my head.)

Gawande could have made things clear -- by putting them in small caps or italics, to indicate accent, or by choosing those or these, which here would convey the introduction of a new discourse referent, one the reader has to identify from the context: unintended pregnancies.  Why didn't he make things clear?

Because he heard the sentence in his head and didn't realize that he'd have to mark the accent for the reader (or choose a different anaphor).

One of the hardest tasks in writing is taking the viewpoint of your audience, reading your own stuff the way your readers are likely to; putting the sentences in your head down on paper isn't enough.  Sometimes even excellent, practiced writers get it wrong.

[Added 5/22/07.  First, a clarification: Gawande was reporting on abortions in the U.S. specifically, not around the world.  The 20% (rather than 40%) figure is consistent with a report, passed on to me by Alexa Mater, in the latest Economist: "In the United States, Australia, Canada, Britain and most of the rest of western Europe, around 15-25% of pregnancies are terminated." Even better, Ray Girvan has found what was probably the source for Gawande's statistics, in a Guttmacher Institute report: "Nearly half of pregnancies among American women are unintended, and four in 10 of these are terminated by abortion" (note these rather than them). Finally, Andy Hollandbeck points out that the anaphora error might be an editor's, rather than Gawande's.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:45 PM

Dolphin dialect science?

Inspired by Geoff's post, this morning's Breakfast Experiment™ at Language Log Labs deals with dolphin dialects. West Philadelphia is a bit short of suitable experimental subjects, so we've had to work with simulated cetaceans. However, there are some significant results, reported below.

Since we need accurate parameters for the simulation, our crack web search team went looking for the scientific research behind the media hype. Consistent with our well-known glass-half-full outlook, I'm happy to say that we found some. In contrast to the cow dialect story, which was spun up out of nothing by a PR firm promoting regional cheese, there is some real science here.

The Shannon Dolphin and Wildlife Foundation really does exist, and there really is a marine biologist named Simon Berrow, the author of works such as "Discarding Practice and Marine Mammal By-Catch in the Celtic Sea Herring Industry" (Proceedings of the Royal Irish Academy, 98B (1), 1998) and "EU Habitats Directive and Tourism Development Progammes in the Shannon estuary, Ireland". And there really is an MSc student named Ronan Hickey, who has been funded by the Sea Watch Foundation to do a Comparative study of bottlenose dolphin whistles in the southern Cardigan Bay SAC and in the Shannon Estuary, and who reported some of the results as a poster presentation "Comparison of Whistle Characteristics of Bottlenose Dolphins (Tursiops Truncates) in Cardigan Bay (Wales) and Shannon Estuary (Ireland) Populations" at the 20th Annual Conference of the European Cetacean Society, April 2-7, 2006, in Gdynia. He won the student paper award -- congratulations, Ronan!

Of course, the fact that there's some actual science behind the media reports doesn't mean that the media reports are informative, responsible or even minimally truthful. The BBC News version is merely exaggerated ("Bay dolphins have Welsh dialect"), but the Times, among others, went completely gaga (Helen Nugent, "Meet the bottlenose boyo of Cardigan Bay – they all speak Welsh at his school", 5/18/2007):

First it was cows who moo with a Somerset drawl. Then it was birds with regional accents. Now scientists say that dolphins living off the coast of Wales have developed a Welsh dialect.

This is a preposterous way to present Ronan Hickey's MSc research, which is less tendentiously described in two abstracts, one from the Cetacean Society conference and one from the web of the Sea Watch Foundation. Both are given in full at the end of this post, with links. On the most optimistic interpretation of this research, the media response has been foolish and exaggerated, as usual. Unfortunately, while there is certainly some serious research going on, a less generous reading leaves it far from clear that there's actually any dolphin dialect discovery to exaggerate.

The abstracts don't give us enough information to evaluate the "dialect" claim, or even to understand exactly what it is, but there are some things in the Sea Watch Foundation abstract that raise significant questions about whether the propensities of Irish and Welsh dolphins to vocalize actually differ at all. First,

Cardigan Bay whistles were collected actively, onboard survey vessels using a deployed hydrophone, while Shannon Estuary whistles were collected passively via a fixed hydrophone.

The difference in collection methodology is obviously a problem. Maybe some of the vocalizations collected from the Cardigan Bay survey vessels meant things like "Damn noisy boats!" or "Hey, let's surf the wake!"; maybe the fixed Shannon Estuary hydrophone was located near a popular dolphin pick-up area ("Any hot babes out there?") or feeding spot ("Wow, herring!").

But an equally serious question has to do with sampling statistics:

Whistles were compared using a series of quantitative parameters and sorted into categories using contour shape. Overall 1882 whistles were analyses [sic] throughout the course of this study. The Vast majority were collected in the Shannon Estuary. A total 32 different whistle categories were described, of which 21 were observed in both populations, 8 were exclusive to the Shannon Estuary and 1 was exclusive to Cardigan Bay

Can we reject the hypothesis that the cited differences were simply a statistical accident? Not from the information published so far.

We aren't told what "Vast majority" means, but let's say that 1,600 whistles were collected in Ireland and 282 were collected in Wales. Let's grant that the 32 categories are well defined and were accurately and unambiguously identified, and assume that the relative frequencies of the different categories have the expected "Zipf's Law" type of distribution, where the relative frequency of the kth most common of N categories is given by

$$f(k; s, N) = \frac{1/k^{s}}{\sum_{n=1}^{N} 1/n^{s}}$$

Then if we set the exponent s=1.6 (this paper suggests a range of 1.6 to 2.4) and select 1,600 items at (Zipfian) random from a set of 32 categories, we find that our sample is missing one or more of the 32 categories about 20% of the time. If we select 282 times at random from exactly the same distribution, our sample will be missing 8 or more of the 32 categories about 53% of the time. (If someone asks me, I'll post the code that I used to do these simulations, so you can check my work or try the effect of different assumptions.)
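
For readers who want to try this at home, here is a minimal sketch of how such a simulation might be set up. It is not the code behind the figures quoted above; it simply assumes that category k is drawn with probability proportional to 1/k^s, and it estimates by Monte Carlo how often a sample of a given size lacks at least m of the categories. The function names, the number of trials and the random seed are all illustrative choices.

```python
import random


def zipf_probs(n_categories, s):
    """Relative frequency of the k-th most common category, k = 1..n_categories."""
    weights = [1.0 / k ** s for k in range(1, n_categories + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def prob_missing_at_least(m, sample_size, n_categories=32, s=1.6,
                          trials=2000, seed=0):
    """Monte Carlo estimate of P(a sample lacks at least m of the categories)."""
    rng = random.Random(seed)          # fixed seed, so runs are repeatable
    probs = zipf_probs(n_categories, s)
    categories = list(range(n_categories))
    hits = 0
    for _ in range(trials):
        # Draw sample_size whistles with replacement and note which
        # categories turned up at least once.
        seen = set(rng.choices(categories, weights=probs, k=sample_size))
        if n_categories - len(seen) >= m:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Guessed sample sizes from the text: 1,600 (Shannon) and 282 (Cardigan Bay).
    print("missing >= 1 of 32, n = 1600:", prob_missing_at_least(1, 1600))
    print("missing >= 8 of 32, n = 282: ", prob_missing_at_least(8, 282))
```

Changing `s`, the sample sizes or the number of categories shows how sensitive the conclusion is to the guesses made here.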

Now, Ronan Hickey's actual counts and distributions might be different from my guesses, in ways that would lead to a different conclusion. But from what we've learned so far, there's no reason to be convinced that the underlying repertoires of Welsh and Irish dolphin whistles are actually different at all.

[...] The differences observed in the whistles characteristics between the two populations could be representative of behavioural, environmental, or morphological differences between the Cardigan Bay and Shannon Estuary populations.

Indeed. But the differences could also be representative of sampling error, from what we know so far.

Further Research is required to expand upon the results of this study before the variance in whistle characteristics of Cardigan Bay and the Shannon Estuary populations can be fully understood.

Further research is always good. But a full report of the research done so far would tell us a lot. When a piece of work like this gets into the popular press, the details behind it -- to whatever extent they exist -- should also be made available. In this case, putting Ronan Hickey's MSc thesis on the web would be a good start. (If it's out there and I just couldn't find it, or if there's some other source of published information on this research, please let me know.)


As promised, here are the full abstracts:

R.H. Hickey, "Comparison of Whistle Characteristics of Bottlenose Dolphins (Tursiops Truncates) in Cardigan Bay (Wales) and Shannon Estuary (Ireland) Populations", European Cetacean Society 20th Annual Conference, April 2-7, 2006, Gdynia.

Comparisons of whistle characteristics between geographically isolated populations of delphinid species have reviled variance between locations. The waters of Britain and Ireland are home to three known resident populations of bottlenose dolphins Tursiops truncates: Cardigan Bay (Wales), the Shannon Estuary (Ireland) and the Moray Firth (Scotland). This study compared the whistle repertoires and characteristics of two of these populations; Shannon Estuary and Cardigan Bay. Whistles were compared using a series of quantitative parameters and sorted into categories using contour shape. A total of 32 different whistle categories were described of which 21 were observed in both populations 8 were exclusive to the Shannon Estuary and 1 was exclusive to Cardigan Bay. The average duration of whistles from the Shannon Estuary population was found to be longer than whistles from Cardigan Bay. The average starting, ending, maximum, minimum, and mean frequency of whistles from Cardigan Bay was significantly higher than Shannon Estuary whistles. There was no statistical difference in the whistle rate between the populations. The differences observed in the whistles characteristics between the two populations could be representative of behavioural, environmental, or morphological differences between the Cardigan Bay and Shannon Estuary populations. 66% of the whistles described in this study were common to both populations. This similarity of whistle repertoire between the populations could be the result of a recent divergence time between the populations or possible transition of individuals between the locations. To further understand the whistle characteristics of bottlenose dolphins in Britain and Ireland, it would be necessary to include whistles from the Moray Firth population in Scotland.

Seawatch Foundation web site: Ronan Hickey, University of Wales, Bangor: Comparative study of bottlenose dolphin whistles in the southern Cardigan Bay SAC and in the Shannon Estuary.

In previous studies, comparisons of whistle characteristics between geographically isolated populations of delphinid species have revealed variation between locations. The waters of Britain and Ireland are home to three known resident populations of bottlenose dolphins Tursiops truncatus : Cardigan Bay (Wales), the Shannon Estuary (Ireland) and the Moray Firth (Scotland). This study compared the rate, repertoires and characteristics of whistles of two of these populations: Shannon Estuary and Cardigan Bay. Comparisons between years, groups and different group sizes were also carried out within the Shannon Estuary population. Cardigan Bay whistles were collected actively, onboard survey vessels using a deployed hydrophone, while Shannon Estuary whistles were collected passively via a fixed hydrophone. Whistles were compared using a series of quantitative parameters and sorted into categories using contour shape. Overall 1882 whistles were analyses throughout the course of this study. The Vast majority were collected in the Shannon Estuary. A total 32 different whistle categories were described, of which 21 were observed in both populations, 8 were exclusive to the Shannon Estuary and 1 was exclusive to Cardigan Bay. The average duration of whistles from the Shannon Estuary population was found to be longer than whistles from Cardigan Bay. The average starting, ending, maximum, minimum, and mean frequency of whistles from Cardigan Bay was significantly higher than Shannon Estuary whistles. There was no statistical difference in the whistle rate between the populations. Variations in whistle parameters and frequency of occurrence of whistle categories were also observed in comparisons within the Shannon Estuary population. Whistle rates increased with increasing group size. On a side note, dolphins in the Shannon Estuary were observed to have cyclic behaviour, which was influenced by tidal times. Dolphins were most commonly encountered during the mid ebb tide. 
The differences observed in the whistles characteristics between the two populations could be representative of behavioural, environmental, or morphological differences between the Cardigan Bay and Shannon Estuary populations. Further Research is required to expand upon the results of this study before the variance in whistle characteristics of Cardigan Bay and the Shannon Estuary populations can be fully understood.

The SDWF has a web site with a page that describes the "dialect" research, which lists two papers:

Hickey, R. (2005) Comparison of whistle repertoire and characteristics between Cardigan Bay and the Shannon estuary populations of Bottlenose dolphins (Tursiops truncatus) with implications for passive and active survey techniques. School of Biological Sciences, University of Wales, Bangor
Berrow, S.D., O’Brien, J. & Holmes, B. (2006) Whistle production by bottlenose dolphins Tursiops truncatus in the Shannon estuary. Irish Naturalists' Journal 28(5), 208-213.

The first of these is apparently Hickey's MSc report, which hasn't been published and isn't available on the web, as far as I can tell. The second one is an archival publication, but is not available in digital form -- and also doesn't include the "dialect" comparison.

Some other research suggesting that the idea of "dolphin dialects" is not an unreasonable one:

Rendell, L. and H. Whitehead (2001). Culture in whales and dolphins. Behavioral and Brain Sciences 24(2): 309-24; Discussion 324-82.

Abstract: Studies of animal culture have not normally included a consideration of cetaceans. However, with several long-term field studies now maturing, this situation should change. Animal culture is generally studied by either investigating transmission mechanisms experimentally, or observing patterns of behavioural variation in wild populations that cannot be explained by either genetic or environmental factors. Taking this second, ethnographic, approach, there is good evidence for cultural transmission in several cetacean species. However, only the bottlenose dolphin (Tursiops) has been shown experimentally to possess sophisticated social learning abilities, including vocal and motor imitation; other species have not been studied. There is observational evidence for imitation and teaching in killer whales. For cetaceans and other large, wide-ranging animals, excessive reliance on experimental data for evidence of culture is not productive; we favour the ethnographic approach. The complex and stable vocal and behavioural cultures of sympatric groups of killer whales (Orcinus orca) appear to have no parallel outside humans, and represent an independent evolution of cultural faculties. The wide movements of cetaceans, the greater variability of the marine environment over large temporal scales relative to that on land, and the stable matrilineal social groups of some species are potentially important factors in the evolution of cetacean culture. There have been suggestions of gene-culture coevolution in cetaceans, and culture may be implicated in some unusual behavioural and life-history traits of whales and dolphins. We hope to stimulate discussion and research on culture in these animals.

Rendell, L.E. and H. Whitehead (2003). Vocal clans in sperm whales (Physeter macrocephalus). Proceedings of the Royal Society of London. Series B. Biological Sciences 270(1512): 225-31. ISSN: 0962-8452.

Abstract: Cultural transmission may be a significant source of variation in the behaviour of whales and dolphins, especially as regards their vocal signals. We studied variation in the vocal output of 'codas' by sperm whale social groups. Codas are patterns of clicks used by female sperm whales in social circumstances. The coda repertoires of all known social units (n = 18, each consisting of about 11 females and immatures with long-term relationships) and 61 out of 64 groups (about two social units moving together for periods of days) that were recorded in the South Pacific and Caribbean between 1985 and 2000 can be reliably allocated into six acoustic 'clans', five in the Pacific and one in the Caribbean. Clans have ranges that span thousands of kilometres, are sympatric, contain many thousands of whales and most probably result from cultural transmission of vocal patterns. Units seem to form groups preferentially with other units of their own clan. We suggest that this is a rare example of sympatric cultural variation on an oceanic scale. Culture may thus be a more important determinant of sperm whale population structure than genes or geography, a finding that has major implications for our understanding of the species' behavioural and population biology.

Posted by Mark Liberman at 09:55 AM

And now dolphin dialects

In our ceaseless quest to keep you informed about interspecies dialogue, here at the Language Log Stupid Animal Communication Stories desk we have noted a report on this site (it appears to be a news blog in Croatia; thanks to Bruce Webster for the tip): "Scientists who study dolphins in the Shannon river estuary [in Ireland] believe that these animals can develop a dialect of their own."

Yes, a dialect. The evidence for this appears to be simply that there are certain noises made by the dolphins of the Shannon river estuary that are not made by other North Atlantic dolphins. That's it, really. But never mind if the evidence is a bit weak: just serve it up anyway, the general public will believe anything about language.

They will soak up stories about telepathic parrots; regional cow accents from the west of England; a 104-year-old macaw still spouting a WW2 repertoire of anti-Nazi obscenities (totally false); dialect-based species differentiation in Scottish crossbills; parrot parents giving names to their chicks; 40-year-old tales of chicken language; research on whale song pattern regularities more regular than the writing of the scientific prose written about it; anything, anything. The public will drink it up.

Until you start telling them that split infinitives are not ungrammatical and never were; then for some reason people who will believe any fantasy about animals suddenly go all skeptical and won't believe a single thing.

Pretty strange creatures, the general public. And do you know, they have parasites living amongst them, called "science journalists", who write up and feed them these stories? They're an extraordinary and fascinating species. I wonder if they have a language.

Posted by Geoffrey K. Pullum at 03:29 PM

Re-doubled prepositions

Following up on a series of previous posts -- "A note of dignity or austerity" (5/3/2007), "Back to the future, redundant preposition department" (5/4/2007), "A phenomenon in which I'm starting to believe in" (5/14/2007), "Could preposition doubling be headed our way?" (5/15/2007) -- several readers have contributed interesting examples and observations.

Peter Howard reported an example from the BBC News website ("Tarrant jokes about curry arrest"), which originally read:

Speaking on Monday, Mr Tarrant's spokesman said he had jokingly dropped cutlery onto the table of a couple with whom he had been chatting to.

On Friday, May 18, this sentence was revised to remove the final "to".

Evan Bradley sent in another example with different prepositions fore and aft, from a post on a college sports web forum:

In addition, people like Davies who is a bit undersized could possibly be used a hybrid LB/S for which Peters is being groomed as.

Nuria Yáñez-Bouza contributed a historical sketch of linguistic scholarship on the double-preposition construction in relative clauses, "A Note on Double Prepositions". This is drawn from the material in her forthcoming dissertation, to which I'm looking forward to.

Simon Musgrave sent in an extraordinary example from "Second Report of the Ad Hoc Group" in L'affaire Wolfowitz (p. 45-46), where there is an initial that as well as the pied and stranded prepositions:

As President, he bore principal responsibility for safeguarding the institution and establishing the ethical standard that to which the staff would be expected to adhere to.'

Geoff Pullum's comment: "This is the neatest quote that of which I have had the pleasure of being apprised of in quite a long time! I love it."

Leaving relative clauses behind us, "Neddie Seagoon" points out that the search pattern {"of mostly of"} turns up quite a few examples:

This league consists of mostly of players who have about 10+ years of Soccer experience,
This page started out as a collection of pictures of mostly of abandoned lines in the Bruce Peninsula.
One particular type of mudrock consist of mostly of white colored smectite clays and colloidal silica.

The pattern /P ADV P/, where the first preposition seems redundant, is also frequent with other prepositions and adverbs:

For example, the plant cell wall is composed of largely of carbohydrate polymers, cellulose, hemicellulose and lignin.
The terrain consists of largely of rolling piedmont hills.
Growth in non-revolving debt, made up of largely of auto and education loans, has been slowing...

...the rear-facing elevation consisting of partly of glazing and partly glass blocks.

The hot dogs have been distributed to mostly to grocery stores in east Georgia and west South Carolina...
A movie that will appeal to mostly to children and grandparents, it is the gentle tale of an average man with an impossible dream.
That the film makes such impact is due to partly to the men in the supporting roles but mostly to Dench and Blanchett.

..we travel with mostly with repeat travelers and word-of-mouth referrals.
They worked with mostly with local singers and rappers to pay the rent on their lofty Melrose Place address.
I think we can all agree that NN4 is not a problem when used with mostly with HTML presentation layouts.

I just lose sight of sometimes of what I must do and focus on obsessing about what she is doing or not doing.

Credit cards came along in mostly in the 1970s.
The procedure has been used in mostly in college classes...
The Second Vermont Battery Light Artillery, "Chase's Battery," served also in mostly in the Department of the Gulf of Mexico.

We are striving to accomplish this goal through mainly through the spoken and written word...
The second is about memory, written through partly through flashbacks.
Homeopathy or even some conventional treatments may work through partly through the placebo effect.

As in the case of phrases like "world in which we live in", my first impulse is to see these as simple mistakes: errors of inattention in writing, or the results of careless editing. But if something is "an easy mistake to make" it's probably also a case of "performance variation from which language change could evolve". And it can be hard to tell where in this process we are.

This question has come up here before, for example with respect to "un-X-ed" and overnegation.

Posted by Mark Liberman at 07:09 AM

May 18, 2007

Accidental dropped keyboard command issuance probability

My wireless Macintosh keyboard, which was talking to my laptop, fell off my desk at home, and as it went down and bumped against the drawer knobs and I grabbed at it, several keys were accidentally pressed (and a Shift key actually came off). In the active window on the laptop at the time was a live SSH connection to the Linux machine on my desk up at the UCSC campus, so everything that was accidentally typed was interpreted as a sentence in the language of the tcsh Unix shell on my Linux box two miles away. And as fate would have it, the keys that were hit as the keyboard clattered to the floor spelled out a fully grammatical sentence of that language, meaning that an actual executable command was issued to the operating system of my desktop machine, and was executed. What are the chances of that?

Well, it would take some tedious but elementary work in elementary combinatorics and Unix command file listing to figure it out exactly. Unix/Linux commands are spelled mostly in lower case letters with occasional upper-case letters and digits (I'll ignore one or two other legal characters like the underbar and the @-sign), so first we need the probability of all the keystrokes being contained among the 62 characters {0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z}. That is already low, but I won't bother to work it out; it would be different for different keyboard design features like function keys and numerical keypads and so on.

After that, given a combination of n characters in the right range for some positive integer n, the probability of their spelling a command name would be the number of available n-letter commands divided by $62^n$. For example, there happen to be 55 accessible commands (that is, commands in the directories on my path) that are spelled with 2 letters, so if just two random characters from the correct range were typed, the probability of their forming a legal command would be $55/62^2$, roughly 0.0143, considerably less than a 2% chance.
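
The arithmetic is easy to check, and the count of n-letter commands is easy to gather on any particular machine. The sketch below is only an illustration: `chance_of_command` just divides a command count by the number of n-character strings over the 62-character range, and `count_commands_by_length` is a hypothetical helper that tallies alphanumeric executable names found in the directories on the search path. The figure of 55 two-letter commands comes from the text; what the helper reports will differ from system to system.

```python
import os

ALPHABET_SIZE = 62  # ten digits plus 26 upper-case and 26 lower-case letters


def chance_of_command(num_commands, length, alphabet_size=ALPHABET_SIZE):
    """Chance that `length` random characters from the range spell a command name."""
    return num_commands / alphabet_size ** length


def count_commands_by_length(path=None):
    """Tally distinct alphanumeric executable names on the search path by length."""
    if path is None:
        path = os.environ.get("PATH", "")
    names = set()
    for directory in path.split(os.pathsep):
        if not os.path.isdir(directory):
            continue
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # unreadable directory: skip it
        for name in entries:
            full = os.path.join(directory, name)
            if (name.isascii() and name.isalnum()
                    and os.path.isfile(full) and os.access(full, os.X_OK)):
                names.add(name)
    counts = {}
    for name in names:
        counts[len(name)] = counts.get(len(name), 0) + 1
    return counts


if __name__ == "__main__":
    # The figure from the text: 55 two-letter commands.
    print(round(chance_of_command(55, 2), 4))
    # The same calculation for whatever is on this machine's path.
    for length, count in sorted(count_commands_by_length().items()):
        print(length, count, chance_of_command(count, length))
```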

The chances of accidentally hitting commands with longer names get lower and lower, of course, because there are fewer and fewer command names as length goes up (Unix loves short, cryptic command names), and at the same time you keep getting a larger and larger divisor (6225

But then we have to take account of the fact that some commands require additional words. The command ls means something on its own ("list the contents of the current working directory"), but rm ("remove") requires at least one filename (which can be any arbitrary string of letters and/or digits and/or certain other printable characters), and cp ("copy") requires at least two filenames. If you get a name of a command that needs one extra word on the command line, you have to get a space after the command name and then a sequence of characters; and if you happen to get a name of a command that needs two extra words on the command line, you have to get a space after the command name and then two sequences of characters separated by a space.

You do the math; it can be your Breakfast Experiment™. As it happened, what my keyboard actually told my Linux system to do this morning was this:

bg OP+

(plus a Return on the end, which caused the actual execution attempt). I was lucky. This is not a very dangerous command. It means "Take the stopped job whose identifying number is OP+ and restart it running in the background." And it turned out to be semantically incoherent: OP+ is not a valid job number, so the result was just an error message saying that there was no such job.

It could have been an issue, though. The following string is of exactly the same length (seven keystrokes including the space and the final invisible Return):

rm ~/*

But that one means "Remove all plain files in the current user's home directory." And a Unix system will do just that if you (or your dropped keyboard) should happen to tell it to. It won't ask "Are you sure?" or "Delete all files?"; it will just swiftly and silently destroy the record of their former existence. And there is a finite, though very small, probability that it might happen simply by accident. There but for the grace of God and the low probability of random strings turning out to be grammatical in most kinds of language...

You may recall that I remarked on Language Log in another context that in English nearly all strings of words are ungrammatical, in the sense that the probability of a random string of English words being grammatical in Standard English heads down toward zero in the limit as string length goes up toward infinity. That's only a conjecture, but I think it's true. (By the way, it doesn't have to be true: Chris Barker devised, just for fun, a programming language called jot in which all programs are expressible and in which every string is grammatical; see this fascinating page. But in fact jot only uses the characters 0 and 1, so there is hardly any chance of a jot program being typed and run by accident even if he drops his keyboard while running a jot interpreter in the active window. Given about a 3% chance of the first keystroke being a 0 or a 1, the probability that both of the first two will be binary digits is only 9/10,000 = 0.0009, and for the first three the probability is only .000027, and we head downward toward zero land pretty rapidly.)
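
The jot arithmetic is just repeated multiplication by the per-keystroke chance. A small sketch, taking the rough 3% figure from the text as given and assuming each keystroke is independent (the function name is illustrative only), uses exact fractions so that no rounding creeps in:

```python
from fractions import Fraction


def all_binary_probability(n, p_binary=Fraction(3, 100)):
    """Chance that each of the first n keystrokes is a 0 or a 1,
    assuming independent keystrokes with chance p_binary apiece."""
    return p_binary ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        p = all_binary_probability(n)
        print(n, p, float(p))
```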

In his appeal for linguist macros, Mark Liberman writes:

And anyone with access to a supply of kitten-photos, and a linguistics course or two under their belt, ought to be able to come up with uncomputably many linguistics-oriented substitution-instances of "I'm in ur X, Y-ing ur Zs"; or "Invisible X", or "I can has Y?"; or maybe "Z: No Want".

The lol-ologists are way ahead of Mark on this one. To the right is one of the classics of the genre, created by Evil Mad Scientist Laboratories last November. The inimitable Mr. Verb picked up on this one right away.

For more contributions to the "I'm in ur X, Y-ing ur Zs" snowclone frame, see this BoingBoing post. There used to be a history of the Internet meme available from Encyclopedia Dramatica (tracing it all back to the gamer's exclamation, I am in your base killing your d00ds), but their server seems to be down. You can still check out the Google cache of the page, at least for now. [Update, 5/20: Encyclopedia Dramatica is back up, though they warn that their "server is falling apart" and make an appeal for donations.]

[Update: Here's a variation on the dictionary-kitty theme, also via BoingBoing:

And yet more lol-linguistic meta-commentary, from Meme Cats...]

Linguist macros?

I'm concerned about our field's lack of involvement in the dominant intellectual currents of our time. How are we going to get the public to see the value of linguistics if we don't engage the issues that concern our youth? I mean, there are lolbrarians, lawyer macros, physics cats, Shakespearean lolcats, and straight-up lolShakespeare. Where are the linguists?

It's not like we don't have good raw material in the form of aw+-inspiring linguists:


And anyone with access to a supply of kitten-photos, and a linguistics course or two under their belt, ought to be able to come up with uncomputably many linguistics-oriented substitution-instances of "I'm in ur X, Y-ing ur Zs"; or "Invisible X", or "I can has Y?"; or maybe "Z: No Want".

Think of the depth this would add to discussions on Linguist List or ADS-L.

[Update -- Amy Forsyth draws our attention to yesterday's Bunny at Frozen Reality:


[Update #2 -- Amy "mhwombat" de Buitléir writes:

As an enthusiastic student of the Irish language, I know that minority languages risk being consigned to the periphery of history unless young people take up the torch. I fear that our youth may abandon a language that has no accommodation for lolcats. We cannot afford to be complacent. To support this noble cause, I hereby present what I believe to be the first ever cat lolling "as Gaeilge", featuring my own beloved Porpentine. The caption roughly translates to "I likes alpaca flavr".

As far as I know, An Coiste Téarmaíochta, the organisation responsible for coining new terms in Irish, has not recommended a translation for "lolcat". To address this shortcoming, a group of us are debating the issue on an online forum. The Irish word for "cat" is "cat" (plural "cait"), and the standard net abbreviation for "lol" is "agoa" (ag gáire ós ard = laughing out loud), so "catagoa" (plural "catagoacha") seems an obvious choice. However, "lolcat" (plural "lolcait") also has support.

Finally, research is needed into the nature of kitty pidgin in Irish. Do cats make "TSF errors", confusing the substantive verb with the copula? Do they lenite when they should eclipse, and vice versa? And what about dialects? My own cat, who has a Donegal accent, says "wee-ooh" for "meow". Perhaps Munster cats say "mee-ukh". Surely any linguist would find this a rewarding area of study.

Anyone who wishes to know more about catagoacha is welcome to read and participate in our discussion here.

Le meas,
Amy "mhwombat" de Buitléir

And under the OT heading "non-lol cat", John Cowan writes:

This morning it was "I'm in ur dry-clean-only clothes, pissing on ur shirts and pants." So off to the laundromat with me to wash them, then hang them up to dry, and only on Monday or so take them to the dry cleaner's.

(That's "off topic" rather than "optimality theory", which this event apparently was not, in his view.)]

[Update #3 -- Laurel MacKenzie writes:

A little embarrassed to admit this, but as soon as my friend Jacqueline Palmer (a fellow Berkeley undergrad linguistics alum) and I discovered I Can Has Cheezburger last month, we knew we had to extend it to the linguistics domain. Here are some captions to get you started, though how you could pair some of these up with cat pictures is beyond me...











Very nice. I should have known that It would be Out There: they're in our culture, replicating our memes. ]

[Update #4 -- Deiniol Jones sent in this one:

And Karolina Owczarzak sent in the first of these -- the second one is entered on behalf of Pinker and Jackendoff:


Posted by Mark Liberman at 07:41 AM

Getting the science of communication backwards

A key technique for "finding facts in a world of disinformation", according to unSpun, the recent book by Brooks Jackson and Kathleen Hall Jamieson, is "RULE #4: Check Primary Sources" (p. 160). And as they point out, the internet makes it easy to do this, especially if you have a university affiliation that gives you access to journal subscriptions online.

I warn you, though, experience with the process may make it hard for you to follow their excellent "FINAL RULE: Be Skeptical, but Not Cynical" (p. 175).

Here's a case in point. A few days ago, I took a look at the media's (uncharacteristically minimal) reaction to some research by Camelia Suleiman and Daniel O'Connell on alleged gender differences in interviews with Bill and Hillary Clinton ("Women and men again, you know?", 5/13/2007). And one of the most interesting things that turned up in a Google News search for {Suleiman Clinton} was Marilia Duffles & Jeffrey Lord, "Why Hillary Talks Like Bill", The American Spectator, 5/7/2007.

Duffles and Lord present a perspective that has its roots among progressive-minded feminists, but has recently become popular on the American right. And in support of this viewpoint, they cite some recent research on the neuroendocrinology of communication that turns out to -- well, you'll see.

Here's their theme:

Women are biologically predisposed to behave in a certain way with the aid of hormones like estrogen, which has dictated their softer physical characteristics throughout our prehistoric lineage. Hope as anyone may for uniform conduct, the incontestable fact is that our stone-age biology still holds sway against the cultural tides of modern society, manifesting as behavior we still anticipate: women as care-givers, provider of bosom, as the more empathetic, the more socially connected of the two genders. And this also happens to have biochemical proof.

When women chit-chat, their oxytocin level -- a feel-good hormone that elicits feelings of trust, bonding and love -- rises, according to a recent study by Dr. Shelley Taylor, psychology professor at UCLA. This means they experience pleasure and feel connected with others. And it lines up with Harvard psychologist Carol Gilligan's work on the difference in moral thinking between the sexes that concludes men generally rely more on a universal set of rules that determine obligations and rights (justice-based reasoning), whereas women are oriented towards care-based reasoning that focuses attention on the needs of others. [emphasis added]

For more on this bio-political trend, see "David Brooks, Neuroendocrinologist" (9/17/2006). Our focus here is not the broader ideological movement, it's the relationship between specific assertions and the scientific sources that they cite. So we're zeroing in on the sentence in bold, about how oxytocin levels rise when women chit-chat, reflecting the experience of pleasure and a feeling of being connected with others; and we're going to check the "recent study by Dr. Shelley Taylor". That would be Shelley E. Taylor, "Tend and Befriend: Biobehavioral Bases of Affiliation Under Stress", Current Directions in Psychological Science 15 (6), 273–277 (2006):

... we examined the relation of plasma oxytocin levels to reports of relationship distress in adult women (Taylor et al., 2006). We found that women who were experiencing gaps in their social relationships had elevated levels of oxytocin. Specifically, women with high levels of oxytocin were more likely to report reduced contact with their mothers, their best friends, their pets, and social groups to which they belonged. In addition, those with significant others were more likely to report that their partners were not supportive, did not understand the way they felt about things, and did not care for them. Poor quality of the marital relationship and infrequent display of affection by the partner were also associated with higher levels of plasma oxytocin. Thus, oxytocin appears to signal relationship distress, at least in women. [emphasis added]


Additional details, tables and graphs can be found in Taylor, S.E., Gonzaga, G., Klein, L.C., Hu, P., Greendale, G.A., & Seeman, T.E., "Relation of oxytocin to psychological and biological stress responses in older women", Psychosomatic Medicine, 68, 238–245 (2006).

Now as R. Bowen explains in Pathophysiology of the Endocrine System, the physiological functions of oxytocin are many and complex:

In years past, oxytocin had the reputation of being an "uncomplicated" hormone, with only a few well-defined activities related to birth and lactation. As has been the case with so many hormones, further research has demonstrated many subtle but profound influences of this little peptide. For example, administration of oxytocin to species ranging from mice to humans has revealed a number of effects on social behavior.

Well-documented effects of oxytocin in humans include promotion of cervical dilation and uterine contraction during childbirth, and the "letdown reflex" in lactating mothers. Injecting oxytocin into the cerebro-spinal fluid causes erections in male rats, and vaginocervical stimulation releases oxytocin within the spinal cord in female rats. Oxytocin has been implicated in pair-bonding in monogamous prairie voles, maternal behavior in ewes, protective inhibition of fetal brain activity during childbirth, and so on.

As far as I've been able to determine, though, no one has ever measured oxytocin levels during female chit-chat. So where did Duffles and Lord come up with their howler about how oxytocin rises when women interact with one another, and "means they experience pleasure and feel connected with others"?

One possibility is that they got it from a 2003 popular book by Shelley E. Taylor, The Tending Instinct, which says things like this:

...oxytocin ... appears to be both a cause and a consequence of social support. In the face of at least some threats, oxytocin is rapidly released. From what we know about its effects, oxytocin may be one of the biological factors that propels people and animals into one another's company in stressful situations: as an "affiliative" hormone, oxytocin leads people to seek contact with others. It is released in the company of others, so it may also be implicated in the emotional experiences of bonding that are seen in the aftermath of tragedy.

I haven't been able to find any human studies to support the claim that oxytocin "is released in the company of others" -- as far as I know, this generalization comes from what happens when you put a bunch of rats together in a cage, in circumstances that are at least as likely to cause the rodent version of "relationship distress" as "trust, bonding and love". In any case, Duffles and Lord may have been led astray by Taylor's theory of oxytocin as an "affiliative hormone", which mediates what she calls the "tend and befriend" response to stress, as opposed to the better-known "fight or flight" response. This theory has been deservedly influential, though it's by no means uncontroversial.

However, you'd have to read Taylor very carelessly in order to characterize oxytocin the way that Duffles and Lord do. I suspect that they were influenced by another source -- Louann Brizendine's book The Female Brain. From pp. 36-37:

And why do girls go to the bathroom to talk? Why do they spend so much time on the phone with the door closed? [...]

There is a biological reason for this behavior. Connecting through talking activates the pleasure centers in a girl's brain. Sharing secrets that have romantic and sexual implications activates those centers even more. We're not talking about a small amount of pleasure. This is huge. It's a major dopamine and oxytocin rush, which is the biggest, fattest neurological reward you can get outside of an orgasm. Dopamine is a neurochemical that stimulates the motivation and pleasure circuits in the brain. Estrogen at puberty increases dopamine and oxytocin production in girls. Oxytocin is a neurohormone that triggers and is triggered by intimacy. When estrogen is on the rise, a teen girl's brain is pushed to make even more oxytocin -- and to get even more reinforcement for social bonding. At midcycle, during peak estrogen production, the girl's dopamine and oxytocin level is likely at its highest, too. Not only her verbal output is at its maximum but her urge for intimacy is also peaking. Intimacy releases more oxytocin, which reinforces the desire to connect, and connecting then brings a sense of pleasure and well-being.

Now again, as far as I can determine, no one has ever measured oxytocin in teen girls on the telephone, or reported any other direct evidence of the effects of conversational interaction on oxytocin levels in men or women of any age. (If I'm wrong, please let me know.) So there's no direct evidence to show that this picture of female neuroendocrinology is false. On the other hand, there's no evidence that it's true, either -- it's a sort of neuroendocrinological science fantasy. And the evidence from Taylor 2006, that oxytocin levels in adult women rise with "relationship distress" and general social disconnection, certainly seems to point in entirely the opposite direction.

I don't want to disagree with Jackson and Jamieson's FINAL RULE: it's true, you should be skeptical, not cynical. But I'd suggest a codicil: if you read something about the science of communication in the popular press or in a popular book, these days, it's probably incomplete and misleading, and it may well be 100% backwards.

It would be easy to sneer at the American Spectator, which is not exactly famous for concern about whether stuff is actually, you know, like true or anything. But the fact is, the rest of the media, from the New York Times to ABC's 20/20 to the BBC, has not been any better in covering the neuroscience of sex differences in general, and sex differences in communication in particular.

There are several effects that combine here to create a perfect storm of misinformation. One is the modern tendency to treat science as a source of morally instructive tales, not to be taken literally. Another is the generally abysmal level of scientific understanding among journalists and editors. And then there's the special ignorance of language and communication among our society's intellectuals in general, who may have forgotten the physics and biology that they took in college, but who never learned any linguistics to start with.

As I said, it can be hard not to let skepticism turn into cynicism. My own method, as I've explained elsewhere, is to join H.L. Mencken in viewing the mass media as a "daily panorama ... of private and communal folly inordinately gross and preposterous, so perfectly brought up to the highest conceivable amperage, so steadily enriched with an almost fabulous daring and originality, that only the man who was born with a petrified diaphragm can fail to laugh himself to sleep every night, and to awake every morning with all the eager, unflagging expectation of a Sunday-school superintendent touring the Paris peep-shows".

Posted by Mark Liberman at 06:50 AM

The Doonesbury web site is reporting that Paul Wolfowitz, referring to several senior staff members at the World Bank, said: "If they f*** with me or Shaha, I have enough on them to f*** them too." Tough talk. But one really should save one's tough talk for cases in which one knows one can win. When I posted "Stupid prophylactic public statement blather" on April 23, I was of course (did you ever doubt it?) thinking more of Mr Wolfowitz than anyone else, and I knew with certainty that he would be resigning soon (though heaven knows there are plenty of other public figures who will need to get their staff to write some stupid prophylactic public statement blather for them pretty soon as well).

Applications to be the next president of the World Bank are now being sought; you could follow James Wolfensohn (did I say Wolfenden when I first typed this?) and Paul Wolfowitz in this job. But first, you must be well in with the United States executive branch, because it is a tradition at the bank that the President of the USA chooses the Bank's president; and second, because of another little-known linguistic regulation at the bank, your name will have to be Wolf, Wolfe, Wolfberg, Wolfcraft, Wolfensberger, Wolfensohn, Wolfenstein, Wolfer, Wolferen, Wolferman, Wolfgang, Wolfinger, Wolfman, Wolfmark, Wolford, Wolfowitz, Wolfram, Wolfsberger, Wolfsburg, Wolfschmidt, Wolfson, Wolftrap, or Wolfy. Good luck.

Posted by Geoffrey K. Pullum at 12:56 AM

More interesting

Jake Tolbert writes:

You know, it's interesting. I just read your post about 'it's interesting' ("Interesting times", 5/12/2007). No more than two days ago, I was practically driven mad listening to NPR as I drove home. The interviewer would ask a question that started with, "you know, it's interesting that such and such. What do you think?"

And the response was "It's interesting. Blah blah blah."

I wish I could remember what the interview was about, or when it was, and I'd try to find a clip of it. If it comes to me, I'll pass it along.

A search for {"you know it's interesting"} gets 143 hits on Google, and {"interesting question"} gets 687. And none of them seem to be in clips from George W. Bush.

Joseph Ruby writes:

As a lawyer I come across this usage in the papers of opposing counsel from time to time. It is sarcastic and implies that the person advocating the interesting position is a hypocrite and a liar. That is how Garry Trudeau's Bush is using it. But it appears from your quotations that the real Bush doesn't use it this way.

By "this usage", I think, Joseph means phrases like "I find it interesting that X", as in the phrase that Trudeau attributes to President-Bush-the-cartoon: "I find it interesting that Congress wants to abandon our troops by defunding them."

As Joseph says, I was unable to find any instances of this kind in quotations from President-Bush-the-real-person. He uses interesting at a relatively high rate, but it seems to be generally for the same reason that NPR interviewers and interviewees do: to set a positive tone, to highlight pieces of discourse, and perhaps sometimes to fill time while composing an answer.

An imputation of hypocrisy and/or dishonesty does seem to come up in some of the results of a search for {"I find it interesting that"}. In general, though, there's an added element in such cases, namely some sort of meta-comment on someone else's discourse. The structure is something like "I find it interesting that X asserts Y", or "I find it interesting that X doesn't mention Z", or "I find it interesting that the one who says Y is X". The effect is to evoke questions about X's motivations, and thus to cast doubt on X's arguments.

The top-ranked hit, this weblog item from 2/7/2003, evokes several layers of irony in the present context:

I find it interesting that most webloggers I read aren't commenting on Colin Powell's remarks before the U.N. the other day. I wonder if webloggers are scared of admitting to the world "well, he sure does have a good point?"

And the second item, from a Windows partisan on a forum, is

I find it interesting that the guys posting ... ... negative comments about Vista here are 1. Linux for Me and 2. Arm A. Geddon who's tagline is "gnu/ choice to the neX(11)t generation.".

A couple of items down the list, we find:

I find it interesting that people who say I'm wrong about wages being based on supply and demand ... ...resort [to] personal attacks against me, instead of giving a different explanation of how wages are determined.

In the last two examples, the "I find it interesting that the X WHO SAY Y" part forms the header of the post.

Interesting has a completely different impact in those examples from its effect in this NPR interview response:

I think that's a really, really interesting question. And the honest answer is that at the moment we just really don't know.

This is a polite and engaged answer, which gives a very different impression from a similar answer that doesn't mention the interest of the question, say:

I don't know, and neither does anyone else.

On the other hand, it would have been on the edge of rudeness for the speaker to have expressed interest by starting an answer with something like "I find it interesting that you asked that question."

Posted by Mark Liberman at 07:25 AM

Oh no he didn't!

Last night on PBS's Newshour with Jim Lehrer and this morning on NPR's Morning Edition, the late Jerry Falwell's highly inflammatory comments made on the heels of September 11, 2001 were replayed. Here's the quote that's being bandied about our public airwaves, from Falwell's appearance on Pat Robertson's 700 Club on September 13, 2001.

I really believe that the pagans, and the abortionists, and the feminists, and the gays and lesbians who are actively trying to make that an alternative lifestyle, the ACLU, People for the American Way--all of them who have tried to secularize America--I point the finger in their face and say, 'You helped this happen.' [audio]

And immediately after each clip:

Jeffrey Brown on PBS: Falwell later apologized.

Barbara Bradley Hagerty on NPR, speaking to Morning Edition host Renee Montagne: Now Renee, he later apologized for that comment.

I've been searching (not very systematically) for audio of Falwell's alleged apology. No success yet, but I have gathered what I consider to be sufficient textual evidence for my hunch that this was yet another case of a non-apology.

(Respect for the recently deceased be damned. Falwell said the words above just two days after the terrible attacks that killed thousands of innocent people and shocked people from all walks of life across the nation and the world. Where was Falwell's respect?)

CNN has this story in their archives.

[I]n a phone call to CNN, Falwell said that only the hijackers and terrorists were responsible for the deadly attacks.

[ This is followed by quotes from Falwell that, if anything, repeat what he was supposedly apologizing for. That's followed by quotes from Pat Robertson agreeing with Falwell's position, followed by National Gay and Lesbian Task Force Executive Director Lorri L. Jean's expression of hope (which CNN calls a "demand") for an apology from Falwell. ]

Falwell told CNN: "I would never blame any human being except the terrorists, and if I left that impression with gays or lesbians or anyone else, I apologize."

Sure, the words "I apologize" are there, but let's examine this a little more closely. If it's true that he "would never blame any human being except the terrorists" for the attacks of September 11, then how are we to interpret the pointing-of-the-finger that he clearly articulated on September 13? Even putting this incongruency aside, there are several obvious ways in which this is not an apology in the relevant sense. For one, Falwell might only be apologizing for leaving an "impression" ("if" one was even left!), but not for the comments themselves. Or, Falwell might not really believe that the people his comments laid blame on are "human beings". Or, Falwell might believe that the people his comments laid blame on are also members of the group "the terrorists".

Or, somewhat more complicatedly, we might ask what the phrase "that impression" refers to. Of course we're supposed to think that it refers to "the impression that Falwell would blame a human being other than the terrorists". However, it could also refer to "the impression that Falwell would never blame any human being except the terrorists", which corresponds directly with what Falwell actually says in the immediately preceding clause (a prototypical way to resolve the referent of a phrase like "that X"). Under this construal, Falwell would be apologizing for leaving the impression that he would never blame anyone but the terrorists, which is certainly not the impression that his comments left, so he's really not apologizing for anything!

Two other synopses of Falwell's various insincere attempts at an apology can be found here and here. The first link reports the following first two attempts at an apology, and the second link reports the second two. Not one of these is an honest and sincere apology, for the reasons I explain below each.

  1. I sincerely regret that comments I made during a long theological discussion on a Christian television program yesterday were taken out of their context and reported, and that my thoughts -- reduced to sound bites -- have detracted from the spirit of this day of mourning.

    Here Falwell "sincerely regret[s]" what others have done with his comments (taking them out of context, reducing them to sound bites), not having made the comments himself.

  2. In the midst of the shock and mourning of a dark week for America, I made a statement that I should not have made and which I sincerely regret [...] I want to apologize to every American, including those I named.

    Here Falwell "sincerely regret[s]" having made the comments. But wanting to apologize is not the same as actually apologizing. Note that Falwell could have ended this last sentence with "... but I won't" and it wouldn't be infelicitous.

  3. This is not what I believe and I therefore repudiate it and ask God's forgiveness and yours.

    Here Falwell "repudiate[s]" his comments because they're "not what [he] believe[s]". (Then why'd he say them? And what does he believe?) He also asks for "forgiveness" from God and everyone else, as if he's merely a victim of his own inability to express what he actually believes.

  4. I misspoke [...] I apologize for my September 13 comments because they were a complete misstatement of what I believe and what I've preached for nearly 50 years [...] Namely, I do not believe that any mortal knows when God is judging or not judging someone or a nation. In my listing of groups and persons who might have assisted in the secularization of America, I unforgivably left off the list a sleeping church, Jerry Falwell, etc. ... It was a pure misstatement, unintentional, and I apologize for it uncategorically.

    This apparently "uncategorical" apology is preceded by the highly dubious claim that Falwell "misspoke" and that it was "unintentional". So, Falwell's really just apologizing for not having control over his ability to speak correctly (or something).

Finally, note that the second link also reports the following quote from Falwell:

[M]ost of the heat I've taken has not been because of the statement. It's from people who are upset that I apologized. Thousands of people of faith in America unfortunately agreed with the first statement. ... They were incensed that I apologized.

Rest assured, ye "[t]housands of people of faith in America": Falwell never really apologized. Nor do I believe he should have had to or tried. It's abundantly clear that Falwell said exactly what he believed, and what the majority of those that he represented believe -- that's what made Falwell an effective leader, and why his comments (not to mention his death) made the headlines. One of the guests on the Newshour last night made precisely this point.

Tony Campolo, Eastern University: Not only did he say what he meant, but I think that was his genius. When he made statements, which a lot of people thought were harsh, he was really articulating what huge numbers of Americans really feel and think. And that's what made him such a lightning rod.

The other guest, on the other hand, left open the possibility that Falwell may have just been really good at working the media.

Tony Perkins, Family Research Council: He would make a statement, usually in jest, knowing that a reporter somewhere would pick it up and run with it. Obviously, taken out of context, it looks horrible, but he got a headline. And he knew how to get his message across. Sometimes that was a double-edged sword, but most of the times [sic] he was very effective at doing it.

At least in the case at hand, I don't see how any of what Falwell said could be considered "in jest", "taken out of context", or simply an effort to "[get] a headline". Falwell said what he said because it was an emotional time and he really, truly believed it. Posthumously pretending that Falwell apologized for his comments may make him look a little better to those of us who think that what he said was wrong, but this comes at the expense of how he looks to those who think that what he said was right. He said what he said, and he didn't apologize for it, no matter how that makes any of us feel.

Syntactic protection against generification

Victor Steinbok wrote to me recently about this not-very-serious post of mine to say I should take a more serious look at INTA's policies. Victor knows something about this topic, and he says that INTA

could not care less if someone uses a trademarked term as a verb or a noun or any other part of speech. From their perspective, when a company routinely uses a trademark as a verb in their own advertising or correspondence, they open themselves up to the cancellation of the mark for becoming generic (and they even have a verb for that--how does "generification" strike you?). The goal of every trademark holder is to make his trademark ubiquitous but to prevent it from becoming generic.

So he's saying that INTA's unfollowable advice is not aimed at us; far from offering minatory prescriptions concerning our usage while corporations merrily flout the rules, he's saying it is merely legal advice, and the corporations are its only target. When they ignore the advice (as they do in the Zappos ads where they turn Zappos into a verb, apparently meaning "shop on the Internet for shoes or apparel"), they do so at their own peril. He continues:

So, irrespectively of what linguists think about the advice, it does hold legal water. Xerox nearly lost its trademark because of the common use of "xerox" to mean "photocopy" (or, as you might recall, "make a xerographic copy"). Google is actually at the same risk now, although the verb "to google" generally still means "to search using the Google search engine" and not a generic search on the web. When Google is overtaken as the dominant search engine and if the verb is still in use, they may well lose the trademark, adding to their misery. DuPont barely rescued Teflon from the generic heap by starting to send out letters to newspapers and others who used Teflon in the generic sense of "non-stick coating". Note that Teflon is still a trademark even though the patent is long expired and duPont no longer has a monopoly.

Sanka is already a generic term in nearly every common language but English (decaf instant coffee--sometimes even just any instant coffee).

How does a trademark become generic? The strongest evidence comes from the holder's own use of the term in the generic sense in their own correspondence. If they put it in advertising, it becomes even worse. It has nothing to do with what part of speech it actually becomes in the process--it's just that it is much easier to see a verb use of a trademark as generic than either an adjective or a noun.

In your 2004 post, you correctly noted the pervasiveness of noun use--just the opposite of the INTA recommendation. That's because in advertising, trademarks-as-nouns signify the point I tried to make earlier--they are becoming ubiquitous. "I'd rather have a Heineken" sounds very different from "I'd rather have a Heineken beer." But other than literally quoting advertising slogans, the mark holder should refrain from using the mark in the same manner in internal correspondence. That's all.

Some of the rules covered in the trademarks course I just completed last month are just bizarre and counterintuitive, but this one actually makes sense--to a trademark lawyer.

So the way Victor sees it, we are actually dealing with recommendations against company-internal uses of syntactic constructions whose widespread use outside the company might result in generification — not rules of grammar for the rest of us. And all bets are off in advertising slogans: I could have had a V8 is OK because it's in an official slogan for V8 fresh wholesome vegetable juice drinks.

This message was brought to you by Language Log linguistic information and entertainment services, where utterances like "I spent an hour language-logging over breakfast" are firmly discouraged.

Posted by Geoffrey K. Pullum at 05:17 PM

The Belgian Language

Among the announcements of new releases of software at Freshmeat is one for the WiKID Strong Authentication System, which is reported to have interfaces in English, French, Belgian, Indonesian, German, Spanish, Japanese, Dutch, Polish, Russian, Swedish, Turkish and Chinese. What the heck is Belgian?

As the son of a Belgian, I know that the two major linguistic groups speak French and Dutch (formerly called vlaams "Flemish", but now officially called nederlands "Dutch") and hate each other to the extent that the joke is that the only real "Belgians" are the royal family and the Jews, everyone else identifying as a Fleming or a Walloon (Walloon being the dying Romance language now largely replaced by French). Some prefer the terms "Lemmings" and "Baboons". The Royal Family and the Jews are bilingual and avoid association with either linguistic faction. There is also a small part of Belgium in which the official language is German. There is no language that is referred to as "Belgian", unless one means the language of the people known to Julius Caesar as the Belgae, after whom Belgium is named.

Unfortunately, we do not know what language the Belgae spoke. It was almost certainly either Celtic or Germanic, but we don't know which. Their territory was well within the Celtic-speaking part of Europe, and the name has a nice etymology in Celtic, but it is still possible that they were actually speakers of a Germanic language, since we know that some Germanic tribes fell within the Celtic cultural sphere and the name of the tribe, as in many other cases, may have been the name given them by some other people, not their own name for themselves.

The only true modern Belgians not known to me to be bilingual are the horses, which come from the Brabant region, around Brussels, right on the linguistic border. As far as I know, they don't have a language, but the BBC may disagree.

Posted by Bill Poser at 03:59 PM

At least I ran out of staples...

A new node in the meme-flow diagram?

Posted by Mark Liberman at 05:42 AM

The Linguistics (Have Faith)

Hey y'all, in the Plaza today we have The Linguistics, a SoCal Hip Hop group whose latest, best, and, indeed, debutest CD The Writes of Passage includes a track all you LanguageLog homeys will pick right out as a veiled yet searing never-out-of-character satire on the mores of modern generative syntax... science or religion? Have Faith is the name of the track. Check it out.

Mark has raised a possibility of which I am afraid of: that preposition doubling (simultaneous fronting and stranding) will take off and become standard, and we will all have to use it. The foregoing sentence is still not grammatical for me, and I would correct it if I were copy-editing or reading student work. But look:

A thing of which I am afraid of is the maintenance effort to sort out the user input e.g. putting the created pages into the correct categories.

I looked for (and found) a "new" kind of mathematical symmetry, using Pythagoras principles, and the irrational 12th root of 2 (of which Pythagoreans were "afraid of") to approximate the set of ratios that can produce resonance.

Further, pagan religion itself of which Israel was afraid of was practically converted to the cult of the Greek gods.

And I found these simply by choosing a random preposition P and a random adjective A that takes P as the head of its complement and searching for P which . . . A P. Specifically, I used the Google pattern "of which * afraid of". When I picked another random such phrase, choosing P = to and A = accustomed, and searched on the Google pattern "to which * accustomed to", I immediately hit this:

A bit-part role is something to which Traore grew accustomed to during his time at Liverpool earlier in his career, with a seven-year stay on Merseyside ...
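For readers who would rather check a text collection of their own than rely on web hit counts, the search pattern described above can be approximated with a regular expression. What follows is only a rough sketch under stated assumptions: the function name doubled_preposition_hits and the max_gap limit are invented for illustration, the * wildcard is treated as a short run of whitespace-separated tokens, and nothing here reproduces how Google actually interprets such a query.

```python
import re

def doubled_preposition_hits(text, prep, adj, max_gap=8):
    """Return substrings of the form '<prep> which ... <adj> <prep>'.

    The '*' of the search pattern is approximated by 1..max_gap
    whitespace-separated tokens; max_gap is an arbitrary assumption.
    """
    pattern = re.compile(
        rf"\b{re.escape(prep)}\s+which\b"               # fronted preposition + 'which'
        rf"(?:\s+\S+?){{1,{max_gap}}}?"                 # wildcard material, as few tokens as possible
        rf"\s+{re.escape(adj)}\s+{re.escape(prep)}\b",  # adjective + stranded copy of the preposition
        re.IGNORECASE,
    )
    return [m.group(0) for m in pattern.finditer(text)]

if __name__ == "__main__":
    sample = (
        "A bit-part role is something to which Traore grew accustomed to "
        "during his time at Liverpool earlier in his career."
    )
    for hit in doubled_preposition_hits(sample, "to", "accustomed"):
        print(hit)
```

A pattern like this will also pick up false positives (for instance, cases where the second preposition belongs to a following phrase), so any hits still need to be read by a human before being counted as instances of the construction.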

So watch out; the fronted-plus-stranded construction could be coming to your town. Don't blame us linguists; we don't direct linguistic change, we merely observe it, describe it, and if necessary try to get used to it.

Oh, and one other thing: we try to learn from it. For true syntax aficionados, let me just make one technical point: movement theories do not predict this. If you move the PP you get to which Traore grew accustomed; if you move just the NP complement you get which Traore grew accustomed to; and under the copy theory of movement with visible copies you would get *to which Traore grew accustomed to which; but there is no natural way to get to which Traore grew accustomed to. As you may know if you are a practitioner of theoretical syntax, I am skeptical about movement theories, and have been since 1979. This construction provides one more plank for my skeptical platform, doesn't it? I will refrain from saying "Nyaaah nyaaah."

Posted by Geoffrey K. Pullum at 04:30 PM

Les quatre cents mots: Querla à l'envers

Walter Laqueur has a new book coming out, The Last Days of Europe: Epitaph for an Old Continent; and an essay based on it, "So Much for the New European Century", appears as the cover story in section B of The Chronicle of Higher Education (known as The Chronicle Review) for May 11, 2007.

The premise of this essay is a familiar one for those who have read Bat Ye'or's Eurabia, or Melanie Phillips's Londonistan, or Oriana Fallaci's The Force of Reason, or have listened to the echoes of these women's work through the intellectual world, most recently in Christopher Hitchens's essay for the latest issue of Vanity Fair, "Londonistan Calling": Europe is being transformed by Muslim immigration and by its own cultural and political exhaustion.

Laqueur's conclusion:

Europe as we once knew it is bound to change, probably out of recognition, for a number of reasons, partly demographic and cultural, but also political and social. Even if Europe should unite and solve the various domestic crises facing it, its predominant place in the world and predominant role in world affairs is a thing of the past.

The argument is a familiar one, but I was shocked to see it featured on the front page of The Chronicle Review.

I'll leave it to those who know more about demography, politics and Europe to evaluate the premise and the conclusions of this argument. What got my attention was a strange bit of linguistic misinformation that Laqueur drops in along the way.

Today, if our friend really wanted to see the future, a short walk or bus ride would do in order to get a preview of the shape of things to come. An excellent starting point would be Neukölln or Cottbusser Tor in Berlin, or Saint-Denis or Evry in the Paris banlieues. In some ways, moving about European cities has become much easier. There are fewer language difficulties; the argot of the outlying areas of major cities populated by immigrants, the banlieues (verlan), we are told by Le Monde, consists of 400 words.

Overall, the business about "fewer language difficulties" is just a joke, for obvious reasons. Immigrants in Europe generally learn the languages of their host countries, and to the extent that they don't, they add more languages to the European mix, not fewer. But the claim about the impoverished lexicon of "verlan" is not a joke. Rather, it's somewhere between false and meaningless.

The first thing to say about verlan is that it's really not an argot or slang, as such, but rather a language game, something like Pig Latin or Ubbi Dubbi, a way of transforming ordinary French words by re-arranging their sounds. The basic technique is to put things backwards: verlan itself is verlan for l'envers, "reverse". Thus fou becomes ouf, pourri becomes ripou, vérité becomes tévéri, and so on. But as Marc Plénat explains ("Une approche prosodique de la morphologie du verlan", Lingua 95 (1-3), March 1995, 97-129),

Ce retournement, cependant, n'a rien de mécanique. Les initiés insistent souvent sur le fait qu'un même mot peut être codé de plusieurs façons et que l'acceptabilité d'une forme se juge 'à l'oreille'...

This reversal, however, is not at all mechanical. The initiates often stress the fact that the same word can be encoded in several ways, and that the acceptability of a form is judged 'by ear'...
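To make the basic move concrete, here is a minimal sketch in Python. It rests on assumptions of mine, not on Plénat's analysis: the word arrives already split into syllables by hand, and the only operation is moving the last syllable to the front, which happens to fit vérité and branché. The function name verlanize is hypothetical. Monosyllables like fou, and the respellings seen in ripou or verlan itself, are exactly the non-mechanical cases it leaves out.

```python
def verlanize(syllables):
    """Move the final syllable to the front and join the result.

    Assumption: `syllables` is a hand-segmented list of strings. This
    sketch does no automatic syllabification and no spelling adjustment.
    """
    if len(syllables) < 2:
        # Monosyllables (fou -> ouf) need sound-level reversal, not handled here.
        raise ValueError("this sketch only handles words of two or more syllables")
    return syllables[-1] + "".join(syllables[:-1])


for word in (["vé", "ri", "té"], ["bran", "ché"]):
    print("".join(word), "->", verlanize(word))
```

Run as is, this prints tévéri for vérité and chébran for branché. Everything that speakers judge 'by ear' would have to be layered on top, which is why a fixed word list is the wrong way to think about verlan.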

Some particular verlanized words have become common replacements for their originals. In some cases, there's a shift in meaning, so that beur, which is verlan for arabe, is used to refer to second- or third-generation North African immigrants. But most common verlanized words are (as I understand it) just slangy and thus informal, like meuf for femme ("woman"), or keum for mec ("guy").

Given all this, to refer to verlan as an "argot of the outlying areas of major cities populated by immigrants", which "consists of 400 words", is a preposterous misunderstanding. Verlan is an open-ended process that can be applied to any French word. The fraction of words in the speech of banlieue residents that is verlanized is in any case small. And I'm prepared to wager a year's salary that the overall vocabulary of "the outlying areas of major cities populated by immigrants", whether in France or elsewhere in Europe, is many times greater than 400 words.

The Le Monde article that Laqueur cites appears to be Frédéric Potet, "Vivre avec 400 mots", 3/19/2005. It quotes Alain Bentolila in support of the "400 words" number:

Pas simple de chercher du travail, d’ouvrir un compte en banque ou de s’inscrire à la Sécurité sociale quand on ne possède que "350 à 400 mots, alors que nous en utilisons, nous, 2 500", estime ainsi le linguiste Alain Bentolila, pour qui cette langue est d’une "pauvreté" absolue.

Not so easy to look for work, to open a bank account or to sign up for social security when you have no more than "350 to 400 words, while we use 2,500", estimates the linguist Alain Bentolila, for whom this language is radically "impoverished".

Bentolila was the author of the "Rapport de Mission sur L'enseignement de la Grammaire" that we discussed last winter (see "Back to Bentolila", 12/27/2006; "Cultural specificity and universal values", 12/22/2006; "French report: It's lucky Copernicus had grammar", 12/18/2006). Either Potet is radically misquoting Bentolila -- always a possibility in mainstream journalism -- or Bentolila is talking through his hat, as we can see by comparing Bentolila in a 2002 interview in L'Express: "today, a certain number of citizens are less capable than others of expressing their thoughts accurately: 10% of children entering elementary school have the use of less than 500 words, instead of 1,200 on average for the others".

Can it be that between 2002 and 2005, we went from a characteristic of 10% of children entering first grade (some of whom may not have spoken French at home), to a characteristic of the language of the entire population of the banlieues? It seems more likely that all of these numbers are exaggerations or fabrications, created by Bentolila to impress his interviewers, or by his interviewers to stir up their readers.

One reason to be suspicious of these numbers is the fact that they are so different in the two interviews. But another reason is that none of them really makes sense as an estimate of anyone's vocabulary size. A credible estimate of the passive vocabulary of average American high-school graduates -- the number of words they know, on plausible definitions of "word" and "know" -- is about 40,000 (See M. Graves, "Vocabulary Learning and Instruction", Review of Research in Education, 13 49-89, 1986; W.E. Nagy & R.C. Anderson, "How many words are there in printed school English?" Reading Research Quarterly 19, 304-330, 1984.). As for the number of words that someone actively uses in speaking or writing, that depends on how long you track them (see "An apology to our readers", 12/28/2006). I'm sure that Prof. Bentolila uses far more than 1,200 distinct words, or 2,500 either, and that you wouldn't need to transcribe more than a couple of hours of his lectures to prove it. In general, less well educated people display fewer distinct words per unit time than an intellectual like Prof. Bentolila does, but again, we would not have to transcribe very many hours of the conversation of typical banlieue residents in order to see more than 2,500 (or for that matter 10,000) distinct words.
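For anyone who wants to try this on an actual transcript, here is a rough Python sketch of the measurement I have in mind: the number of distinct word forms seen so far, as a function of how many running words have been tracked. The function name vocabulary_growth and the crude tokenizer (lowercased runs of letters, with internal apostrophes allowed) are assumptions of mine; a serious count would have to settle what counts as a "word", for instance by lemmatizing.

```python
import re


def vocabulary_growth(text, step):
    """Return (tokens_seen, distinct_words_seen) pairs, sampled every `step` tokens.

    Assumption: a "word" is a lowercased run of letters, optionally with
    internal apostrophes. Inflected forms are counted separately.
    """
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    seen = set()
    curve = []
    for i, token in enumerate(tokens, start=1):
        seen.add(token)
        # Record a point at each sampling step, and once more at the very end.
        if i % step == 0 or i == len(tokens):
            curve.append((i, len(seen)))
    return curve


sample = "the cat saw the dog and the dog saw the cat"
print(vocabulary_growth(sample, step=4))
```

The count of distinct words can only rise as more speech is tracked, so any single figure like "400 words" or "2,500 words" is meaningless unless it comes with the amount of speech it was measured over.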

If you know any references to actual vocabulary measurements in various sectors of the French population, please let me know. For now, I'll just observe that this kind of limited-vocabulary complaint is a staple of "kids today" hand-wringing (see "Vicky Pollard's revenge", 1/2/2007), and that among dozens of journalistic plaints on this topic, I've never seen any credible empirical support for the idea that the population's word stock is declining, whether in France or in Britain or anywhere else.

The ironic thing is that Potet leads his 2005 article with an example of vocabulary impoverishment -- caused by imposition of standard French! A young woman in a remediation program in Grenoble gets something wrong in class, and exclaims "Je suis trop une Celte!" ("I'm such a Celt!"), meaning "I'm such an imbecile!". Potet asks her how and why this word "has been diverted from its [proper] meaning?" She doesn't know, but

L’adolescente sait seulement qu’elle ne prononce plus beaucoup cette expression, en tout cas plus en classe. Elle veut "réussir dans la vie et avoir un métier" et espère reprendre bientôt une scolarité normale, commencer une formation, faire des stages. "Pour cela, il faut que j’apprenne à bien parler", reconnaît-elle.

The young woman only knows that she won't use this expression any more, at least not in class. She wants "to succeed in life and to have a career" and hopes soon to resume normal schooling, begin an education, do internships. "For that, I have to learn to speak properly", she realises.

But in any case, it's hard to believe that someone who knows French well enough to read Le Monde could possibly misunderstand the meaning of verlan as drastically as Laqueur does -- it's like thinking that hiphop is the name of a district in New York, or that jazz is an instrument. So I suspect that Laqueur got the "400 words" from a tertiary source, that is, some English-language article quoting Potet quoting Bentolila.

More generally, the fact that Laqueur didn't even manage to get his slang terminology right doesn't inspire much confidence that he has some better evidence of vocabulary impoverishment in his footnotes.

While we're here, let's look into verlan a bit further. It's been studied under that name at least since the mid-1980s (e.g. C. Bachmann and L. Basier, "Le verlan: Argot d'école ou langue des Keums?". Mots 8, 1984; N.J. Lefkowitz, "Talking backwards and looking forwards. The French language game Verlan" Ph.D. Dissertation, University of Washington, 1987).

It surely has roots in a much older tradition of language games in France and elsewhere. See "Noi Lai and contrepets", 1/8/2005, for some more literary examples. Natalie Lefkowitz ("Verlan: Talking Backwards in French", The French Review, 63(2), 1989) cites a 1985 TF1 interview with then-president François Mitterrand, in which a bit of verlan came up (chébran, from branché, "plugged in"):

Q: Vous savez ce que c'est, le 'chébran'?
A: Vous savez, quand j'étais enfant, on renversait l'ordre des mots; ce n'est pas très nouveau ça! Ça veut dire 'branché', bien entendu. Je ne veux pas faire le malin, je ne suis pas très informé, mais c'est déjà un peu dépassé. Vous auriez dû dire 'câblé'.

Q: You know what it is, "chébran"?
A: You know, when I was a kid, we reversed the order of words; it's not exactly new! It means 'branché', of course. I don't want to be a smart-ass, I'm not that well informed, but it's already a bit out of date. You should have said 'câblé'.

By the mid 1980s, as Lefkowitz explains, verlan already pervaded "both Standard French and French society":

From the president of the Republic, to the enormously popular French singer Renaud and his album Laisse Béton, to the filmmaker Claude Zidi and his film Les Ripoux, to the novelist San Antonio, to the large movement of second generation North African Arabs in France commonly known as Les Beurs, to large advertising campaigns, to the Parisian intellectuals whose children I taught [at the prestigious Lycée Henri IV], the widespread influence of Verlan and its penetration of both Standard French and French society is evident on all levels of the popular media.

There may well be good reasons to be concerned about vocabulary development, and especially standard-language instruction, among residents of the banlieues in France, or of the inner cities in the U.S. (For example, see the discussion of Martha J. Farah, et al., "Childhood poverty: Specific associations with neurocognitive development", Brain Research 1110(1) 166-174, September 2006, at the end of this post.)

But it doesn't advance the discussion to proliferate ignorant repetitions of misleading quotations of exaggerated or fabricated statistics about vocabulary size.

[Joseph Ruby writes to point out some similarities between verlan and the back slang of 19th-century London. There are some differences as well -- back slang seems to have been much more strongly influenced by spelling, reversing letter-strings rather than phoneme-strings, and also to have reversed polysyllables letter-by-letter rather than syllable-by-syllable. ]

Posted by Mark Liberman at 08:40 AM

May 14, 2007

Chicken: the PowerPoint Presentation

(The backstory is here. Or perhaps here?)

[As several readers have pointed out to me, there is also an archival publication: Doug Zongker, "Chicken Chicken Chicken: Chicken Chicken", Annals of Improbable Research, 12(5) September-October 2006, 16-21.]

[And there's also the Great Wikipedia Chicken Vandalism Saga...]

Posted by Mark Liberman at 09:06 PM

The unfab four

A little while ago, after a discussion of the evil passive voice (as characterized by Sherry Roberts in her little handbook on business writing), I set Language Log readers a take-home question:

Take-home.  Section 9 of the Roberts booklet begins:

Watch out for these four commonly misused words.

Some words in the English language take a constant beating in business correspondence. Be one of those writers who use them properly and pleasantly surprise your readers. Your conscientiousness may sell your next idea or product.

So, what do you think these four words are, and what's the problem with them?

The envelope, please!  And the losers are:

That vs. which. Which often follows a comma and introduces a phrase that provides additional information not essential to the meaning of the sentence. That introduces a phrase that is essential to the meaning of the sentence.

The report, which is twenty pages long, is mandatory reading. (Which introduces additional, but unnecessary, information.)

The report that is twenty pages long is mandatory reading. (That points out a characteristic of the report and distinguishes it from a ten-page report.)

Hopefully. This doesn't mean I hope. Hopefully, I'll finish the report by noon. Do you mean you'll finish the report in a hopeful frame of mind by noon? Or do you mean you hope you'll finish the report by noon? Say what you mean: I hope to finish the report by noon.

Very. Avoid this lukewarm, unspecific adverb. I'm very happy that you elected me chairman of the Society for People with Super Sensitive Feet. Is very happy happier than just happy?  [Note rhetorical question, conveying that very happy is not in fact happier than happy.]  Why not overjoyed or: I'm tickled to be the new chairman of the Society for People with Super Sensitive Feet.

How disheartening: Fowler's Rule (which counts as two misused words), speaker-oriented sentence adverbial hopefully, and the intensifier very.

We've written a lot about Fowler's Rule here on Language Log and I don't see any point in rehashing the topic, though I will note that Fowler (who was not the originator of the principle, but did serve as the major vector of its spread in the 20th century) merely said that it might be better if the functions of restrictive and non-restrictive relativization were cleanly split between that and which, respectively -- while admitting that writers in English generally did not do this.

Speaker-oriented (or "stance") adverbial hopefully has been taking abuse pretty steadily for 30 or more years (see MWDEU).  Linguists are mostly just baffled by this disparagement; see the discussion in the American Heritage Book of English Usage, where it's noted that "hopefully seems to have taken on a life of its own as a shibboleth."  But the word fits right into long-standing patterns of the language  -- cf. frankly in "Frankly, this soup stinks" and surprisingly in "Surprisingly, this soup is delicious" -- and it provides a way of expressing the speaker's attitude towards a proposition which is both (a) brief and (b) subordinate: "I hope that S", "I have a hope that S", "It is to be hoped that S", and the like are wordier, and have the hoping expressed in a main clause (as the apparent main assertion), while what writers want is to assert the proposition provisionally, adding a modifier expressing their attitude towards it.  So speaker-oriented hopefully is a GOOD thing, and it's no surprise that it's spread so fast.

As for very, I intend to post on this eventually -- I've had a piece in draft for some time -- because it's a venerable proscription (going back to Strunk (1918) and before), and one that has its puzzling aspects.  For the moment, the crucial thing is that for many people who use very, very happy is indeed happier than happy, while replacements like extremely happy and overjoyed are often too far up on the happiness scale.  (Tickled is just an elaborate joke, and somewhat out of place in a manual of business writing.)

[Added 5/15/07: On ADS-L, Doug Harris pointed out how effective repetition of the very can be, citing a story from the Oneonta (NY) Daily Star, in which a local farmer who'd been struck by lightning was said to be "very, very, very sore".  I then reported having found ca. 69,800 Google webhits for {"be very very afraid"}, some of the form "Be afraid. Be very(,) very afraid." and others with just the second part.  Jesse Sheidlower then supplied the source of the formula: "Be afraid. Be very afraid." (from the 1986 horror film "The Fly") -- an effective use of very as an intensifier, it seems to me.  Bill Mullins suggests that the use of "Be afraid. Be very afraid." by Wednesday in "Addams Family Values" (1993) was probably a bigger vector for its spread.]

Some general remarks...  Note that Roberts simply asserts Fowler's Rule and similarly just asserts what the meaning of hopefully is; these are just inarguable FACTS about English, deriving presumably from some higher authority.  And she just asserts that very is lukewarm (a judgment of taste) and unspecific (a judgment of meaning), brooking no objection from those whose judgments are not the ones she reports.

We are, indeed, in shibboleth territory.  Roberts is merely repeating three very fashionable proscriptions on grammar, style, and usage.  Neither reason nor actual practice have anything to do with it.

Propagating such shibboleths has a variety of unfortunate consequences.  Some people become "blinded by the rules", as I've put it: they can't help noticing the proscribed items and may find that these items slow down their comprehension.  In extreme cases, these sadly afflicted folks suffer from a willful failure to understand -- reporting that they can't understand things like "Hopefully, it's not going to rain today" (because "it" cannot be hopeful) and maintaining that people who say "I didn't do nothing" are saying that they did something -- and so exhibit stunningly uncooperative behavior: in ordinary language use, we're trying to gauge others' intentions from their words, not to enforce what we believe to be language norms.  

For a final flourish, let's return to the evil passive voice.  After Jon Lighter quoted from Roberts's handbook on ADS-L, Larry Horn (5/2/07) came up with this even wilder critique of the passive, from the 5/7/07 New Yorker, p. 87:

Constructing passive sentences is a way of concealing your own testicles, lest someone cut them off.  (psychoanalyst Ernesto Morales, played by Ian Holm in the new film "The Treatment")

Horn commented wryly, "I'm not sure whether according to this theory women can construct passive sentences with impunity."

[Added 5/15/07: Language Hat writes to say that the Morales quote is not in the Daniel Menaker book on which the movie is based, though in the book Morales does have something of a preoccupation with genitals and castration.  Further commentary: "The screenwriter is Daniel Saul Housman, and this seems to be his first movie, so we can't investigate his corpus for other evidence of Strunkism; we can only speculate about what he may have been exposed to at Columbia while getting his MFA."]

zwicky at-sign csli period stanford period edu

In the tradition of Ali G in the land of colorless green ideas -- Sacha Baron Cohen deliberately misinterpreting bilingual as 'bisexual' in an "interview" with Noam Chomsky -- here's the South African comic strip Madam & Eve, going on to trilingual:

Language Log reader Vardibidian, who sent me this link, adds:

... perhaps it's worth mentioning that Helen Zille's acceptance speech [as leader of the Democratic Alliance party] was in English, Afrikaans and Xhosa.

I want to encourage each and every person in this room to take up a new language, one of our eleven official languages that you do not yet speak. I am working on my Xhosa. Perhaps some of us wish to learn Afrikaans, or Sotho, Tswana, Venda or Zulu. It does not matter whether you have a knack for languages. Simply learning a new tongue opens up a new world and builds bridges. We must do that for each other.

zwicky at-sign csli period stanford period edu

A phenomenon in which I'm starting to believe in

The fact is, I didn't really believe in the redundant prepositions ("A note of dignity or austerity", 5/3/2007). We're talking about examples like "some issues to which this newspaper often propagates on", or "the table to which a column belong to". Deep down, I figured that these were textual hypercorrections, stuck in to add "a note of dignity or austerity", or perhaps artifacts of the editing process, where someone adds a preposition at one end of a relative clause and forgets to remove the preposition that was already there at the other end.

Readers reminded me about Paul McCartney's "world in which we live in"; but I'd always heard that as "world in which we're livin'". Other readers sent me a sprinkling of examples from the web; but I reckoned that those might be hypercorrections or editing errors.

It was harder to come to terms with the long list of historical examples, from David Denison and Nuria Yáñez-Bouza ("Back to the future, redundant preposition department", 5/4/2007), showing that this pattern has been around for a millennium. Their citations convinced me, intellectually, that there's something deep in the grammatical DNA of English that engenders such examples. But I still half-way believed that it's all been due to people getting confused by the process of writing and re-writing.

The thing is, there are a lot of examples on the web, including quite a few that are simple enough that the "getting confused" theory seems less likely:

Kevin Lawhon is a strong believer in supporting the community in which he lives in.
He likes the fact that the streets in which he lives in are so peaceful ...
... felt boots or shoes of birch bark or wooden clogs depending on the area in which he lives in.
... I see Louise as a product of the neurotic, ruthless environment in which she lives in.
But Bissell said Belichick created the situation in which he finds himself in ...
... betting on Florida State to beat Notre Dame, a game in which he played in and FSU eventually lost.
... a whiny teenage girl who can't make sense of the the "crrrrrrrrraaaaaaaaaazy" world in which she lives in.
Bloggers can give information about the city in which they live in or about which they blog.
The proportion of people in each class is dependent upon the place in which they live in.
Motorola and Neotel are committed to building the communities in which they operate in ...
The following individuals have been required by OCGA 42-1-12 to register with the Sheriffs Office in which they reside in ...
she wanted only to understand the world in which she lived in and to stimulate our thinking and acting in the present.
... pubs were described by the street name in which they were in and later still (around 1850) a street name and number was used.

And on Saturday, in a story on Martin Ramirez broadcast by NPR's Weekend All Things Considered, in a quote from Victor Zamudio-Taylor, "a scholar and curator based in Mexico City", I heard:

Much like Frida Kahlo, his art is a form of survival and of therapy. It's a form of making sense of the world in which he lives in.

This is spoken, so the textual re-editing story doesn't apply. He was speaking clearly, and there's no background music, so an ambiguity like "we're livin'" and "we live in" isn't available. There's no evidence of a speech error. You can listen for yourself, but it sounds to me as if this is absolutely what he wanted to say, the way he wanted to say it:

And the ATC producers didn't consider editing this out, which I suppose they would have done if they heard it as a significant mis-speaking.

So I'm coming around, just as I did for "such the". It's a form of making sense of the world in which I live in.

T(w)angy eggcorns from Globe readers

Last month Jan Freeman of the Boston Globe issued an appeal for readers to send in their favorite eggcorns, "those verbal misunderstandings that produce erroneous yet logical new terms," as Freeman describes them. In today's column she reports on the finest of her correspondents' eggcornological jewels (or are they pus jewels?).

Many gleanings will already be familiar to devotees of Language Log and the Eggcorn Database, such as butt naked, nip in the butt, flush out (the details), and heart-rendering. Others are newly observed, such as the reader who sent in "a twangy taste." As Freeman notes, tang (usually referring to sharp flavor) and twang (usually referring to a sharp sound, such as a plucked string or disfavored accent) have been merging in English speakers' minds at least since 1611. The OED has a citation from that year appearing in Randle Cotgrave's Dictionarie of the French and English Tongues, where deboire is defined as "an after taste, ill smacke, or twang, which an vnsauorie thing leaues behind in the mouth." And adjectival twangy has had its special pungency since 1887, when a Saturday Review writer used the phrase "worse...than any other cheese, being, as a rule, either tasteless or else twangy." Moving in the other direction, tang has been used to describe unusual or unpleasant speech as far back as 1669 — in his Elements of Speech William Holder wrote of "a pretty affectation in the Allemain, which gives their Speech a different Tang from ours." (I'm reminded of the similar conflation of guttural with gutter, as examined here.)

If "twangy taste" represents a common eggcornic confusion going back four centuries, other contributions to Freeman's column are so idiosyncratic that they don't even register as pings on the usage radar. The fatigued wife of an editor rather dramatically told him she was "not long for this world," but he heard it as "not long for the swirl" — a collocation that produces exactly zero Googlehits. Still, as Freeman writes, this eggcorn substitution "should resonate with anyone exhausted by the swirl of a busy schedule." More evidence that eggcorns are the offspring of our endless linguistic creativity, not simply, in the words of one of Freeman's harrumphing readers, "created by people who never read, of whom we have far too many."

I'm puzzled. The press is usually all over science stories about sex differences in language use (see "Bible Science Stories", 12/1/2006). And Hillary Clinton is a front-runner for the Democratic party's 2008 presidential nomination. So you'd think that a story about a "scientific" study showing that Hillary "talks like a girl" would get the same kind of play as John Edwards's haircuts.

Therefore I'm surprised that the Springer empire's 4/25/2007 press release, "The Power of Speaking Ladylike", has gotten relatively little uptake. It starts like this:

Does gender make a difference in the way politicians speak and are spoken to? This is the question posed in a new study by Dr. Camelia Suleiman and Daniel O’Connell from Florida International University published this week in the Journal of Psycholinguistic Research. [...]

The researchers found that Hillary and Bill Clinton did largely conform to their gender roles in the interviews, their language reflecting the historic power relation between men and women.

Hot stuff, right? But so far, the mass media have largely failed to rise to the bait. This is just as well, because the study (as we'll see) is deeply flawed. Given the media's general credulity in such matters, I doubt that the (lack of) reaction is due to scientific perspicacity; so perhaps it's just taking a few weeks for the story to percolate into the press. We'll see.

The research report behind this is Camelia Suleiman and Daniel O'Connell, "Gender Differences in the Media Interviews of Bill and Hillary Clinton", Journal of Psycholinguistic Research, published online April 20, 2007 (it hasn't appeared in the print journal yet). Here's the abstract:

Does gender make a difference in the way politicians speak and are spoken to in public? This paper examines perspective in three television interviews and two radio interviews with Bill Clinton in June 2004 and in three television interviews and two radio interviews with Hillary Clinton in June 2003 with the same interviewers. Our perspectival approach assumes that each utterance has a dialogically constructed point of view. Earlier research has shown that markers of conceptual orality and literacy as well as referencing (name and pronoun use for self and other reference) do reflect perspective. This paper asks whether perspective is gendered. Our data analysis demonstrates that some markers of perspective show gender differences while others do not. Those that do include the number of syllables spoken by each interlocutor, referencing, the use of the intensifier so, the use of the hedge you know, the use of non-standard pronunciations, turn transitions, and lastly the use of laughter.

I'm surprised to find this paper in a refereed psycholinguistic journal. The analysis is interesting, but its data has no logical connection whatever to gender differences. There are exactly two subjects, and it's true that one of them is female while the other is male. But in addition, one of them is from a suburb of Chicago while the other is from rural Arkansas. So perhaps this is really a study about "Regional Differences in the Media Interviews of Bill and Hillary Clinton"? And the two subjects differ in many other ways as well -- the article could with equal plausibility have been presented as telling us about "Social Class Differences" or "The Effect of Early Family Life". Or just "Individual Differences".

The authors' argument is based on citing the relationship between their findings and the earlier literature on "language and gender". The idea is to show that Hillary and Bill conform to previously-claimed patterns of gendered language use. There are two problems with this method. The first one is that the authors' data is equivocal: the features that they measure in the Clintons' interviews partly agree with the previously-claimed gender differences, and partly disagree. This is exactly what we'd expect if the differences had no particular connection to sex or gender at all. The second problem is that many of the background "facts" about sex/gender differences cited in this paper were originally asserted without any empirical evidence -- and some of them have turned out not to be true. More on this later.

The day after the Springer press release came out, Dennis Baron did his best to blog the story along ("Hillary Clinton: Runs like a man, talks like a girl", 4/26/2007):

Hillary Clinton talks like a girl. That’s the conclusion of a pair of psycholinguistic researchers who analyzed radio and television interviews with Sen. Hillary Clinton and former president Bill Clinton recorded in 2003 and 2004, just after each had published a memoir.

Over the last few days, there has been a bit of marginal mass-media pickup, for example Robin Lloyd, "Clinton-speak reflects political gender: Hillary's use of language is more 'ladylike' than Bill's, researchers say", MSNBC, 5/10/2007:

Some pundits say former President Bill Clinton and Sen. Hillary Clinton operate like a unified political animal. But a study of their TV and radio interviews reveals that gender separates the speech of the power couple, such that he "talks like a man" and she is "ladylike."

[Update -- for completeness, some that I missed earlier: Jim Ritter, "Hillary talks like a 'lady' -- STUDY: Unlike husband, she uses 'non-powerful' words", Chicago Sun-Times, 4/25/2007; "Scientists: Hillary Clinton's a Lady ... When It Comes to Her Speech", Fox News, 4/25/2007; Marilia Duffles & Jeffrey Lord, "Why Hillary Talks Like Bill", The American Spectator, 5/7/2007 (which deserves examination in its own right).]

And Wesley Pruden, "When Hillary speaks, a lady emerges", The Washington Times, 5/11/2007:

When they examined several hours of radio and television interviews of Bill and Hillary, they discovered that Bill inevitably "talks like a man" and Hillary is careful, perhaps subconsciously, to sound "ladylike." [...]

"Even though Hillary Clinton is a politician herself," the researchers found, "she still follows to some extent the historic designation of women's language as the language of the non-powerful."

For example, Hillary is nearly three times more likely to sprinkle her conversation with the linguistic cringe "you know" than Bill is, lapsing into the schoolgirl hedge that diminishes the power of language. Women, the professors say, are more likely to "hedge" than men.

Let's pick up this business about what Pruden calls "the linguistic cringe 'you know'". Suleiman and O'Connell took this from Robin Lakoff's classic 1975 work, "Language and Woman's Place":

Lakoff (1975/1975/2004), however, while acknowledging the stylistic difference, points to the power relation that exists between men and women. Thus, women’s talk is powerless talk. She identifies powerless talk as (a) having a stock of phrases that belong to the domestic domain, (b) a stock of adjectives like divine, (c) rising intonation in statements, (d) use of hedges such as you know, and of intensifiers such as so, (f) use of hypercorrect grammar, and (g) use of polite and indirect statements. Lakoff (1975/2004, pp. 78–81) calls this style ladylike talk.

So use of "you know" is one of the features that S&C measured, with the results as shown in their Table 2 (with the relevant part outlined in blue by me):

(Note that the measure is syllables/you know, so that lower numbers mean higher relative frequency of use of "you know".)

Here's the first part of their discussion:

As Table 2 indicates, Hillary Clinton uses you know much more often than Bill Clinton does (11 < 316 syl/you know). [...] This is in accord with Lakoff’s (1975/2004, p. 79) finding that women use more hedges than men.
[11 < 316 is a typo in the original article -- it obviously should be 111 < 316 ]

The trouble is, Lakoff's "finding ... that women use more hedges than men" was simply an assertion, based on her impressions and not on any counts of the use of hedges by any specific women or men on any specific occasions. But inspired by her ideas, many researchers since 1975 have investigated the matter in detail. I discussed the history of research on one kind of hedge, "tag questions", in a post a few years ago ("Gender and tags", 5/9/2004). You can learn more of the details from that post and the references it cites, but the basic result is that Lakoff was empirically wrong about sex differences in the rate of use of this particular kind of hedge -- for the kind of tag questions that genuinely express uncertainty, men appear to use them more often than women do.

What about the effects of sex on the frequency of use of "you know"? There may well be some published research on this topic, but I don't know of any. (If you do, please tell me.) So it's time for a Breakfast Experiment.

The LDC Online index of transcribed telephone conversations allows searching by both the sex of the speaker and the sex of the interlocutor. The conversations in question come from several published corpora, including Switchboard, Fisher I and Fisher II. In the current state of the index, it covers 28,274 conversational sides (i.e. 14,137 two-sided conversations). There were 12,589 male speakers and 15,685 female speakers. The speakers range widely in age, educational level, region, ethnic and socio-economic background (though I have not tried to control specifically for such factors in this experiment).

The first result is that males use "you know" about 10-11% more frequently, on average, than females do. If I search for {"you know" & sex:male} I get 173,321 instances in 11,753 conversational sides, or 14.7 per conversation. Searching for {"you know" & sex:female} I get 198,086 instances in 14,849 conversational sides, or 13.3 per conversation. That's if we count only the conversational sides where "you know" was used -- if we count all conversations, we get 173,321 in 12,589, or 13.8 per conversation for the guys, and 198,086 in 15,685, or 12.6 per conversation for the ladies.
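For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the per-side rates from the counts quoted above. The dictionary layout and the helper name `per_side` are illustrative choices, not anything taken from LDC Online; only the counts themselves come from the text.

```python
# Counts as quoted in the paragraph above (search hits and numbers of
# conversational sides). The structure and names here are hypothetical.
counts = {
    "male": {"hits": 173_321, "sides_with_hits": 11_753, "all_sides": 12_589},
    "female": {"hits": 198_086, "sides_with_hits": 14_849, "all_sides": 15_685},
}


def per_side(hits, sides):
    """Average number of "you know" instances per conversational side."""
    return hits / sides


rates = {
    sex: {
        # Only sides in which "you know" occurred at least once.
        "using_sides": per_side(c["hits"], c["sides_with_hits"]),
        # All sides, whether or not "you know" occurred.
        "all_sides": per_side(c["hits"], c["all_sides"]),
    }
    for sex, c in counts.items()
}

# Relative male excess under each way of counting.
excess_using = rates["male"]["using_sides"] / rates["female"]["using_sides"] - 1
excess_all = rates["male"]["all_sides"] / rates["female"]["all_sides"] - 1

for sex, r in rates.items():
    print(f"{sex:6s} using sides: {r['using_sides']:.1f}  all sides: {r['all_sides']:.1f}")
print(f"male excess: {excess_using:.1%} (using sides), {excess_all:.1%} (all sides)")
```

Either denominator gives a male excess on the order of ten percent, which is the point of the paragraph; the exact figure depends on which sides are counted.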

Either way, I think we can reject the idea that "you know" is strongly associated with female speakers in the U.S. these days. Far from being a "schoolgirl hedge that diminishes the power of language", you know might instead be viewed as a guy thing, a marker of male social solidarity. (Though let's not get carried away -- the sex differences are not very big.)

It's interesting that (in the interviews that Suleiman & O'Connell surveyed) Hillary used "you know" three times as frequently as Bill -- but there's no reason to think that this is because "gender make(s) a difference in the way politicians speak and are spoken to".

Suleiman and O'Connell suggest that we also need to look at the other party in the conversation -- they observe, for example, that "Hillary Clinton's mean syllables/you know indices in these interviews are notably different with men TV (M=72.5), woman TV (M=113) and Radio (M=150.5) interviewers". I need to point out again that we're talking about unreasonably small numbers to generalize from -- two male TV interviewers and one female TV interviewer, in this case.

So here is the 2X2 table from searching the same LDC-Online conversational speech corpora, expressed in terms of the average number of instances of "you know" per conversational side. (The numbers in parentheses divide the counts by the total number of conversational sides of the relevant type, whether or not any instances of "you know" occurred there.)

                    Interlocutor Female    Interlocutor Male
Speaker Female      13.6 (13.0)            12.7 (11.8)
Speaker Male        13.4 (12.9)            15.2 (14.3)

In addition to the result noted earlier (males tend to use "you know" more than females), this suggests that, overall, participants in same-sex conversations tend to use "you know" more than participants in mixed-sex conversations. (I'm not taking the time to do statistical tests this morning, but with 28,274 subjects, these differences are likely to be statistically significant.)

The tendency for you know to be used more frequently in same-sex conversations is the opposite direction from the effect noted by Suleiman & O'Connell, which was that Hillary Clinton used "you know" more often in two TV interviews with male interviewers than in one TV interview with a female interviewer. Does this mean that Senator Clinton is somehow acting against dialogic stereotype? Well, my first guess would be that it doesn't mean anything at all about dialogic gender, but relates to the individual personalities or subject-matter involved in the interviews. Or perhaps it's due to what Senator Clinton had for breakfast.

Since I've expressed my results in terms of numbers of "you know" per conversational side, you may be wondering whether the excess male use of "you know" means that guys use it more often per unit time or per syllable or per word, or whether it's just that guys talk more.

In an earlier post ("Gabby guys: the effect size", 9/23/2006), using a subset of the same data, I found that in single-sex conversations, men used about 3.2% more words, on average, than women; while in mixed-sex conversations, men used about 6% more words, on average, than women. The male excess in "you know" per-conversation rates in single-sex conversations was 15.2 vs. 13.6, for an excess rate of about 12%. In mixed-sex conversations, it's 13.4 vs. 12.7, or about 6%. For comparison with S&C's syllables/you know numbers, the male/male conversations in Fisher I involved about 930 words per side, so that 15.2 you know's per side translates to about 61 words per you know.
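The comparison in the paragraph above can be sketched in a few lines of Python. The per-side rates are copied from the 2X2 table and the word-count excesses from the "Gabby guys" figures quoted above; the per-word adjustment is a rough illustration only, since the word counts came from a subset of the data, and the names used here are hypothetical.

```python
# Average "you know" per conversational side, keyed by (speaker, interlocutor),
# copied from the 2X2 table in the post.
per_side = {
    ("female", "female"): 13.6,
    ("female", "male"): 12.7,
    ("male", "female"): 13.4,
    ("male", "male"): 15.2,
}

# Male excess in per-side rates: single-sex and mixed-sex conversations.
single_sex_excess = per_side[("male", "male")] / per_side[("female", "female")] - 1
mixed_sex_excess = per_side[("male", "female")] / per_side[("female", "male")] - 1

# Male excess in words per side, as quoted from the earlier post
# (computed there on a subset of the same data).
WORD_EXCESS_SINGLE = 0.032
WORD_EXCESS_MIXED = 0.06


def per_word_excess(rate_excess, word_excess):
    """Rough male excess in "you know" per word, after allowing for men
    producing more words per side. An approximation, not a reported result."""
    return (1 + rate_excess) / (1 + word_excess) - 1


single_per_word = per_word_excess(single_sex_excess, WORD_EXCESS_SINGLE)
mixed_per_word = per_word_excess(mixed_sex_excess, WORD_EXCESS_MIXED)

# Words per "you know" for male/male conversations in Fisher I,
# using the figure of about 930 words per side given in the text.
WORDS_PER_SIDE_MALE_MALE = 930
words_per_you_know = WORDS_PER_SIDE_MALE_MALE / per_side[("male", "male")]

print(f"single-sex excess per side: {single_sex_excess:.1%}")
print(f"mixed-sex excess per side:  {mixed_sex_excess:.1%}")
print(f"single-sex excess per word (rough): {single_per_word:.1%}")
print(f"mixed-sex excess per word (rough):  {mixed_per_word:.1%}")
print(f"words per 'you know' (male/male): {words_per_you_know:.0f}")
```

Under those assumptions the single-sex gap shrinks but does not vanish once talkativeness is allowed for, and the mixed-sex gap comes out close to zero. Treat this as a back-of-the-envelope check rather than a finding.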

I don't have the time to count syllables just now -- I've finished my second cup of coffee, and now I've got a Mother's Day picnic brunch to prepare -- but a quick estimate would be that both male and female speakers in this conversational corpus use "you know" somewhat more often, in terms of syllables/you know, than the rate that S&C observed in Hillary Clinton's interviews.

You can take the interpretation from there. As usual, when we look at the facts instead of relying on stereotypical impressions, the overall sex differences in speech and language (leaving aside biologically-determined features like pitch) are pretty small. And as is all too often true, the empirical differences are in the opposite direction from some supposedly scientific "findings".

Posted by Mark Liberman at 07:10 AM

May 12, 2007

Posted by Geoffrey K. Pullum at 05:55 PM

Things have lost all meaning for Zippy, but he hasn't lost the ability to play with morphology:

In my last survey of playful morphology, I looked at ostentatious -ity (Zippy uses an expanded version -osity in randomosity above), innovative -ness (of several types, including the "Colbert suffix" -iness in truthiness and faminess), playful -licious, and the profusion of -dar nouns.  Above, Zippy adds an instance of -itude where the default nominalizer -ness would be expected and a new -ology noun (which is morphologically well-formed if the base is taken to be a noun: 'study of the absurd').  Zippy's into suffixes.

My survey posting has links to earlier ones on Language Log.  Now, some developments since then on the -dar and -iness fronts.

-dar.  I noted earlier that there seems to be no end of -dar words, and pleaded with people not to send me more.  That request stands, but here's one of special interest from Grant Barrett's Double-Tongued site (hat tip to Ben Zimmer):

You develop what is called "playdar"--a bit like gaydar. Swingers can spot each other in public. A couple once picked me up in a regular bar. (link)

What makes playdar interesting is that it rhymes with the original model radar and the intermediate model -- the probable vector for the spread of -dar words -- gaydar.  And it has a meaning in the sexual domain.  It fits so well.

-iness.  A while back I added referenciness to truthiness and faminess, and suggested that there might be a place for justiciness.  A reader suggested a possible application for justiciness:

Reflecting on the notions of truthiness and referenciness, I was reminded of a recent incident at my university in which a student plagiarized approximately 80% of his paper from an online source -- or rather, since faculty members at my university are not permitted to make such judgements, I should say that his paper bore a close resemblance to the online source, in a number of passages an exact resemblance for several paragraphs at a time. The Associate Dean who reviewed the case, however, found that there was no intent to deceive; it was merely 'sloppiness'. The student's penalty (if such it may be called) was simply to resubmit the work. I thought this might qualify as an instance of justiciness. It's certainly an instance of silliness.

(My correspondent noted that the student missed his extended deadline, so that "a certain justice prevailed after all".)

There is, at the moment, only one webhit for justiciness that isn't by me, and that one refers to me.  This is by the blogger Shinga, examining claims by Patrick Holford about the connection between food allergies/intolerance and IgG (Immunoglobulin G) antibody levels.  Citing me and quoting Ben Goldacre, Shinga suggests that Holford might be practicing referenciness, adding:

To borrow further from Zwicky, there would be a certain justiciness is seeing this nonsense exposed for what it is.

I'm not at all sure that justiciness would be the right word choice here.  In the plagiarism example, the dean's decision wasn't justice but something that merely had the appearance of justice; that puts it in the neighborhood of truthiness, faminess, and referenciness.  My understanding of the Holford case is that exposing "this nonsense for what it is" would actually be justice.  But maybe I'm misunderstanding Shinga's intent.

Or maybe she wasn't using the Colbert suffix -- with its connotation of falseness, inauthenticity, or masquerade -- at all, but rather the positive -iness reported on here in Mark Liberman's discussion of hostiness, the magnetic quality that all good TV hosts have.

Yet another -iness turned up this week on the American Dialect Society mailing list, where David Bowie wrote on the 8th:

I subscribe to several of the Woody's Office Watch family of email newsletters, and the past few "EMAIL Essentials" ones have dealt with spam filtering. Near the beginning of the latest one is the following:

You and I can glance at a message and know right-away if it's spam or not. Computers are nowhere near as smart and probably never will be, all a spam filter can do is analyse a message and work out the likelihood that it is spam. It's not a simple Yes/No but a sliding scale of (with apologies to Stephen Colbert) 'spaminess'.

I would have probably spelled it "spamminess" myself, but it's interesting to see "-iness" being used actively to mean something like "something like this noun, but not exactly like it" with an overt nod to the Colbert Report.

This is just neutral -iness, denoting approximation along some cline, without the disparaging connotation of the Colbert suffix or the positive, approving connotation of hostiness.  As Larry Horn quickly pointed out on ADS-L, there is some tradition in linguistics and psychology for using -iness in contexts where what is usually treated as a categorical binary distinction -- either X or not -- is instead treated as a matter of degree, as in Haj Ross's 1973 paper "Nouniness" or in work on prototype semantics, where notions like "birdiness" are bandied about.  Larry noted that George Lakoff (in a 1972 CLS paper "Hedges") treated truth itself as a matter of degree.  As Larry puts it:

 ... "a chicken is a bird" or "a penguin is a bird" was considered to be less...well, truthy than "a robin is a bird".

But this isn't Colbert's truthiness (which disparages propositions because they fail to be true despite being put forward as if they were), and I wouldn't myself use truthy in describing Lakoff's ideas; I'd say that Lakoff maintained that "a penguin is a bird" is simply less true than "a robin is a bird".

(A digression.  It turns out that the word birdiness has some currency in a completely different context, namely the world of hunting dogs, where a dog is said to be birdie/birdy -- both spellings are out there -- if it's interested in birds.  See, for example, this site of "Questions and Answers On Birdiness and Scenting".)

In any case, plain approximative -iness has been around for some time and has its uses -- I think spamminess is a good coining -- but Colbert shouldn't be getting credit for it.

Then there's the title of this posting, where -iness isn't approximative at all.  Zippy's suffixiness is a matter of being fascinated with suffixes (like birdy dogs with birds), playing with them, using a lot of them.  Zippy is a suffixy guy.

zwicky at-sign csli period stanford period edu

A recent Rhymes with Orange cartoon (by Hilary Price, Stanford '91) takes us back to Yoda and his eccentric syntax:

It's been two years since Language Log looked at Yodic (Yodian? Yoda-ish? Yodish? Yoda-speak? Yodese? -- all are attested) syntax.  Here's a list of our previous forays into Yoda World:

GP, 5/18/05: Yoda's syntax the Tribune analyzes; supply more details I will!

From anthropomorphism to zugzwang with Zippy

Not your usual alphabet:

All attested words (in some domain or another).  Bill Griffith is scrupulous about such things.  If Zippy talks about a giant pink bunny on a hilltop outside Artesina, Italy, you can bet there is such a thing and it looks like Griffith's drawing.  If Zippy stands outside the Golden Mermaid apartments in Santa Monica, California, you can bet there is such a place and it looks like Griffith's drawing.

Calling him names

Names might not actually hurt you, but they can damage your reputation:

This was the beginning of a long-running thread in Zits, in which Hector tries, pathetically, to establish an image as a bad boy.

L337 Katz0rz

On the LSL ("Lol-kitteh as a second language") front, I've been remiss in not drawing your attention to "A special in-depth analysis by David McRaney - L337 Katz0rz", ICHC, 9/8/2007. Mr. McRaney presents a serious historical analysis with many illustrations and even a meme-flow diagram:

It is no criticism of this magnificent piece of scholarship to suggest that there is still room for an analysis of the morpho-syntactic, phonological and orthographic patterns involved. And on the sociolinguistic front, the issue of contact with baby talk remains largely unexplored.

Let's totally afforest the bodyguard!

Last month, Renato Cruz posted a picture of this sign from a Shenzhen park ("Florestar o guarda-costas", 4/7/2007). A few days ago, Mark Frauenfelder picked it up on Boing Boing ("Funny mangled English sign in China", 5/9/2007).

In pinyin, the sign reads NI3 WO3 GONG4 ZUO4 LYU4 HUA4 WEI4 SHI4, which somehow got translated as "You and I altogether do afforest the bodyguard!"

Someone sent the link to Victor Mair, and he forwarded his analysis to us. In this case, the culprit was not bad dictionary entries, but rather bad syntactic analysis, associated with inappropriate construal of the words.

Here's what happened to it, syllable by syllable, then word by word:

ZUO4: to do, to be
WEI4: guard, protect
SHI4: non-commissioned officer

The person(s) who did the English translation on the sign misunderstood the Chinese this way:

you (and)
i.e. afforest [as a full verb]
(the) bodyguard [taking it as direct object of preceding verb]

This is an incorrect grammatical analysis of the sentence, which should be as follows:

you (and)
[taking it as an adjective modifying the following noun]
[taking it as head of predicate nominal]

An idiomatic translation of the sign would read something like this:

"Together let's be afforestational protectors."

Less literally:

"Let's work together to protect the afforestation (project)."

Perhaps more people would understand reforestation rather than the less common afforestation, or even more simply, "tree planting". But as it is, the sign certainly got noticed!

Use of an inadequate machine translation program might be responsible for the mis-analysis, but Victor notes

Native speakers of Chinese very often do misconstrue sentences in their own language. As you can see, the grammar is less than explicit.

[Update: Aoshuang (Tim) Xu writes:

As a native Chinese speaker who happened to read your post on Languagelog, I have a different opinion on your translation of "绿化" (or LYU4HUA4 as you put it).

The Chinese "绿化" literally means "to make green", and is used almost always in the sense of "to make green by planting trees or grass". Although the "afforest/reforest" sense can occasionally be the emphasis in some governmental slogans, its "plant grass" sense is always there. As both the photo and your source indicated, the sign is in a park of the city of Shenzhen. Although I have not been to the city of Shenzhen per se, my experience with other Chinese cities, as well as the photo itself, makes me believe the sign is at the side of a lawn rather than a woodland. So what the sign really says is "Let's protect the lawn together."

I have not find a good English word for the full sense of "绿化". Inflectional verbifying of "green" to "greenize" make sense to me, but I haven't seen any of such use, and I am no authority to justify its idiomaticness.

I believe that Victor was relying on the context provided by the sign, which graphically prohibits disturbing or removing a planted sapling, suggesting that the "green" in this case was trees rather than grass. But perhaps the picture is meant to be interpreted more abstractly, and the sign does just mean "keep off the grass". ]

Posted by Mark Liberman at 10:25 AM

Interesting times

Back on Tuesday, Doonesbury satirized President Bush's verbal style ("Advice to the president: omit needless 'in other words'", 5/8/2007), and yesterday, the strip took another shot at the same target:

In addition to overuse of "in other words", both strips highlighted another verbal tic, one which is lexically and syntactically more variable:

Tuesday:   I find it curious that they would offer comfort to our enemies instead of to our warriors. In other words, offering comfort to our enemies instead of to our warriors is something that I find curious.
Friday: You know what I find interesting? I find it interesting that Congress wants to abandon our troops by defunding them. In other words, Congress wants to abandon our troops without funds, and I find that interesting.

Slate's Complete Bushisms does not seem to have picked up on the idea that President Bush overuses "find X interesting/curious". The word curious doesn't occur at all. The word interesting occurs in five citations, but in four of them, its usage seems normal and reasonable to me, e.g.

"I would have to ask the questioner. I haven't had a chance to ask the questioners the question they've been questioning. On the other hand, I firmly believe she'll be a fine secretary of labor. And I've got confidence in Linda Chavez. She is a—she'll bring an interesting perspective to the Labor Department."—Austin, Texas, Jan. 8, 2001

There's one example where interesting is repeated twice, in a way that is slightly reminiscent of Doonesbury's caricature:

"That's George Washington, the first president, of course. The interesting thing about him is that I read three—three or four books about him last year. Isn't that interesting?"—Showing German newspaper reporter Kai Diekmann the Oval Office, Washington, D.C., May 5, 2006

In the long 4/23/2007 Tipp City speech, where I found 17 repetitions of "in other words", the word curious does not occur. But in that same speech, President Bush uses the word interesting 18 times in about 12,500 words, for a rate of about 1,441 per million.

This does seem to be a high enough rate to notice. Turning to the same reference corpora that we used to calibrate his use of "in other words", interesting occurs in a collection of 2.6 billion words of journalistic text at a rate of 28 per million words, and in 26 million words of English-language conversations, interesting occurs at a rate of 475 per million words. Thus in the Tipp City speech and Q&A session, President Bush used interesting about three times more often than is typical of English conversation.

That's a much smaller difference than the ones we noted for "in other words", which he used 67 times more often in that same speech than the rate we found in the conversational corpus. But still, three times the background rate is enough to suggest that Doonesbury has targeted a real feature of his verbal style.
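The rate comparison above is simple enough to check directly. This is a minimal sketch using the rounded figures given in the text (18 uses in "about 12,500" words); the function name `per_million` is an illustrative choice, and because the word count is rounded, the result differs slightly from the 1,441 per million quoted above.

```python
def per_million(count, total_words):
    """Occurrences per million words."""
    return count / total_words * 1_000_000


# Figures quoted in the post. The speech length is given only as
# "about 12,500 words", so this rate is approximate.
speech_rate = per_million(18, 12_500)

# Background rates of "interesting" per million words, as quoted.
JOURNALISTIC_RATE = 28
CONVERSATIONAL_RATE = 475

ratio_vs_conversation = speech_rate / CONVERSATIONAL_RATE
ratio_vs_journalism = speech_rate / JOURNALISTIC_RATE

print(f"speech rate: {speech_rate:.0f} per million words")
print(f"vs. conversation: {ratio_vs_conversation:.1f}x")
print(f"vs. journalism:   {ratio_vs_journalism:.1f}x")
```

The comparison with conversational speech is the fairer baseline for a speech and Q&A session, which is why the text settles on "about three times" rather than the much larger ratio against journalistic prose.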

And as in the case of "in other words", there's some evidence that overuse of interesting is associated with a semantic shift, or at least a partial one. Most of the president's uses of interesting in the Tipp City transcript seem unremarkable to me, other than in their frequency, but there are a few cases where interesting seems out of place:

So I have a decision point to make, last fall. And the decision point was whether or not to either scale back or increase our presence in Iraq. And that was a difficult decision. It's difficult any time, as I told you, you put a soldier in harm's way. I understand the consequence of committing people into war. The interesting thing is I'm the Commander-in-Chief of an incredibly amazing group of men and women who also understand that consequence, and yet are willing to volunteer.

I found it difficult to put my finger on exactly why this use seems odd. The dictionary definition of interesting is "arousing or holding the attention", and the president might have said "The thing that holds my attention is I'm the commander-in-chief...", without raising the same reaction. But here are two speculations about why the use of interesting seems out of place in this context.

First, when we say "the interesting thing is X", we usually imply that X is something that is not already part of the knowledge we share with our interlocutors. Thus we might try to comfort someone by reminding them, "the important thing is that you have your health", but it would be somewhat odd to say "the interesting thing is that you have your health". And there's also an implication that X is intellectually rather than emotionally absorbing -- we might explain a colleague's behavior by informing someone "the key thing is that her mother just died", but it would seem strangely callous to say "the interesting thing is that her mother just died".

Now, it's not common to knock President Bush for being too cerebral, but he often uses interesting in contexts like this, where its lack of emotional resonance is disconcerting:

See, that's the interesting thing that people have got to know. There's threats to your freedom.

Scanning the president's speeches, I found an interesting piece of evidence that the president recognizes this problem ("President's Remarks in 'Focus on Health Care' Event", 9/13/2004):

And then September the 11th came and it hurt us. I'm going to talk a little bit later on what it meant, in terms of working to secure the homeland. There's some interesting -- not "interesting," really important lessons from that day.

For the most part, though, the problem with Bush's interestings is not misuse but simple overuse ("Remarks by the President at McConnell for Senate and National Republican Senatorial Committee Dinner"):

You know, it's interesting, I asked Mitch about what we could do here. I went to New Albany, across the line there, to go to a school -- and I want to share some thoughts about public education in a minute -- but I said, what can we do that would be interesting? And he said one thing -- he said, I want you to talk to McConnell scholars at the University of Louisville. Isn't that interesting?

Posted by Mark Liberman at 06:49 AM

May 11, 2007


Steven Poole is a British journalist, and the author of Unspeak: How Words Become Weapons, How Weapons Become a Message, and How That Message Becomes Reality. The publisher sent me a review copy when it first came out last year, but I never wrote anything about it. My problems with the book began at the start of the second paragraph:

What do the phrases 'pro-choice', 'tax relief', and 'Friends of the Earth' have in common? They are all names that contain political arguments.

This sent me back to the publisher's blurb on the front flap of the dust-jacket:

What do the phrases "pro-life," "intelligent design," and the "war on terror" have in common? Each of them is a name for something that smuggles in a highly charged political opinion.

The textual substitution is simple, but the diagnosis is complicated. I wondered whether the three right-wing misdemeanors in the jacket blurb ("pro-life", "intelligent design", "war on terror") were in Poole's original text, which was then edited to create a balance of two left against one right ("pro-choice" and "Friends of the Earth" vs. "tax relief"); or whether the Grove Press publicist, perceiving that the book's natural audience was on the left, made the substitution in the other direction, in order to improve the blurb's market appeal.

Whatever its source, this double-image text stuck in my mind as I read Unspeak last year. Overall, the book was clearly aimed at a left-wing audience, picking up on recent unhappiness with the right's rhetorical success. In other words, a recap of Lakoff on framing (see "It's about ideas, not words", 7/23/2004; "Frames and messages", 9/4/2004). However, Poole decided not to follow Lakoff's line, which I might caricature as "my misleadingly evocative phrases can beat up your misleadingly evocative phrases". Instead, he gives apparently even-handed advice about how to recognize and resist misleadingly evocative phrases, and why journalists in particular should avoid and even oppose them.

Obviously, Poole couldn't present this advice with a straight face if all the examples of misleading language were drawn from only one end of the political spectrum. Still, I got the distinct impression that his main motivation was anger at perceived sins of right-wing rhetoric -- just as Milton gives Satan all the best lines, so Poole gives the right all the worst ones -- and that the examples of misleading rhetoric from left of center were stuck in pro forma.

I considered blogging about this, leading with the curious name-substitution documented above. But I decided to let it go. Poole has made no secret of his political allegiances, and, I thought, he might genuinely be trying to be even-handed, rather than simply dressing up a political argument in apolitical clothing. After all, he opens with the Confucian parable about rectification of names:

A long time ago in China, a philosopher was asked the first thing he would do if he became ruler. The philosopher thought for a while, and then said, well, if something had to be put first, I would rectify the names for things. His companion was baffled: what did this have to do with good government? The philosopher lamented his companions foolishness, and explained. When the names for things are incorrect, speech does not sound reasonable; when speech does not sound reasonable, things are not done properly; when things are not done properly, punishments do not fit the crimes; and when punishments do not fit the crimes, people don't know what to do.

And he closes by explicitly differentiating himself from George Lakoff (p. 237):

Having witnessed the virtuoso use of Unspeak by the Bush administration, some liberals in the US desperately want to catch up in the rhetorical arms race. Studying the work on how different terms 'frame' arguments by such linguists as George Lakoff, the Democrats hit on a counter-strategy: to burnish and sharpen their own language until it became as steely and weaponized as that of the opposition. [...]

Linguist Ranko Bugarski argues, by contrast: 'What is needed in replacement of "Warspeak" is not an equally crude and militant "Peacespeak", but judicious use of normal language, allowing for fine-grained selection and discrimination, for urbanity and finesse.' What counts as 'normal language', of course, is already subject to ideological disagreement. But the sentiment is admirable, even if it describes an unlikely ideal. Politicians will go on trying their luck with all the rhetorical strategies in their pockets. But we should at the very least expect, and demand, that our newspapers, radio and television refuse to replicate and spread the Unspeak virus. As BBC World presenter Kirsty Lang explains: 'It's much easier to take the language that's given to you, and the government knows that full well.' ... The citizen's plan of action is simple. When the media do this, talk back: write and tell them. Possibly the growth of Unspeak cannot be reversed. But that doesn't mean we have to go on swallowing it.

OK, I thought, that sounds fair.

Still, I couldn't shake the impression that Poole's commitment to Confucian "reasonable" speech was insincere -- a position adopted as a rhetorical gambit, because it provides a convenient platform from which to attack the rhetorical sins of his political enemies. On the other hand, I worried that this might be unfair to him. And I'm no fan of the view that there's no reality behind rhetoric (see "Spinning Fish: Mullahs defend Herodotus", 5/8/2007).

So I couldn't come to grips with the book -- given its topic, the difference between its implicit message and its explicit content created a disorienting sort of implicitly self-referential hall-of-mirrors regression -- and I never reviewed it.

But I was reminded of this again, a couple of weeks ago, when I read a post about astronomy on Poole's Unspeak blog ("Super-Earth: Like the Earth, only super", 4/26/2007). In context -- that is, on a blog plugging a book about "judicious use of normal language" in political discourse -- it's a remarkable document:

In galactic news: scientists peering through a massive Chilean telescope have found “the most Earth-like planet outside our Solar System to date, a world which could have water running on its surface”. It’s the first exoplanet ever detected that might support life-as-we-know-it. They call it . . . super-Earth!

It’s inspiring to imagine a planet stuffed with super-aliens, like Superman, as long as they don’t embark on any kind of interstellar “migration” to come here and steal all our jobs.

But then I began to worry: surely a “super-Earth” is exactly like the real Earth, only super? In which case, as a super-parallel-world, it must already boast figures such as a super-“Melanie Phillips” and a super-Cheney, frothing demagogic evilists at least twelve feet tall. Cosmic terror-flash! But here the language is, thankfully, deceptive. Reckoned to have a radius 1.5 times that of Earth but a mass five times greater, “super-Earth” will have much stronger gravity, which ought to mean, ceteris paribus, that its inhabitants are in general smaller. So with any luck, to us, super-“Melanie” would look like a tiny frothing dwarf. As with so much astronomical news, that puts things in heartening perspective.

The idea here seems to be that "super-Earth", like "tax relief", is "a name that contains [an]... argument". The astronomers' language, he tells us, is "deceptive". But in this case, all the "highly charged political opinion" is being smuggled into the discussion by Poole himself.

Like Poole, Melanie Phillips is a British journalist. Like Poole, she is no stranger to political controversy. Like Poole, she was neither mentioned nor implied in any of the stories about the newly-discovered planet orbiting Gliese 581 in the constellation Libra. Unlike Poole, she supported the Iraq war. She's most strongly associated with the argument made in her book Londonistan, which argues that "the collapse of traditional British identity and accommodation of a particularly virulent form of multiculturalism" have created a serious problem -- and this seems to be the issue that has taken her, along with some other British intellectuals like Christopher Hitchens, to a place where one of her former colleagues on the Guardian pairs her with Dick Cheney as "frothing demagogic evilists" on super-Earth, and rejoices that she is not 12 feet tall but rather "a tiny frothing dwarf".

This is "urbanity and finesse"?

No. Whether or not you agree with Melanie Phillips' views, this is "a kind of invasive procedure [that] wants to bypass critical thinking and implant a foreign body of opinion directly in the soft tissue of the brain". That's quoted from the Epilogue of Poole's book -- it's his definition of Unspeak, summing up what he thinks he's taught us over the previous 237 pages.

Of course, if you agree with Poole's politics, or if you like to see short women with strong opinions put in their place, you'll find his riff on super-Melanie funny. It's certainly a clever example of the kind of political rhetoric that aims to dismiss opponents with ridicule. You can find vast supplies of lesser-quality examples on sites like Daily Kos or Free Republic, depending on your political preferences. Or you could read more of the Unspeak blog.

So it turns out that my instincts were right. Unspeak illustrates La Rochefoucauld's maxim that "L'hypocrisie est un hommage que le vice rend à la vertu" ("Hypocrisy is a tribute that vice pays to virtue"). Poole's goal was not the promotion of reasonable language in political discourse, but rather the unilateral rhetorical disarmament of his political enemies.

All the same, I refuse to accept Stanley Fish's view that there is nothing beyond rhetoric, and that "reality ... will emerge when one of the competing accounts ... proves so persuasive that reality is identified with its descriptions". Poole's concerns were valid, even if he himself doesn't believe or practice what he preaches.

Posted by Mark Liberman at 07:28 AM

Family Circus Filology

Today's language cartoons come via the Comics Curmudgeon, who has featured a coupla linguistically-significant Family Circus cartoons in recent rants.

First, a simple Eggcorn Genesis moment:

Next, something a bit more sophisticated:

To fully appreciate this one, you really need to read part of what Josh (the Curmudgeon) has to say about it in his original rant:

I wasn't aware that there was some Papally proscribed prayer posture, with more knees denoting more Christian sincerity. I'm also not sure how Dolly can tell Jeffy's only doing half an Ave Maria if he's still in the midst of it -- is he only doing every other word or something?

Josh's remark illustrates an important point about the temporal properties of events that can be described as "saying half the Hail Mary", namely, that they unfold towards a specific endpoint along a timecourse specified by the extent properties of the element denoted by the object NP in direct object position. That is, such events keep on going until half a Hail Mary is said and then they stop. The size of half a Hail Mary determines the extent of the event. This necessary stopping is a crucial property of such events -- called "telic events" -- and distinguishes them from another kind of event, ones with no predetermined endpoint, like running or singing -- "atelic" events.

Now, the English progressive (be + V-ing) has the property of focussing on the midpoint of some event -- the event is ongoing when you use the progressive to describe it. So the normal interpretation of "Jeff is saying half a Hail Mary" is that Jeff is in the middle of saying half a Hail Mary -- he hasn't reached the endpoint of the saying-half-a-Hail-Mary event yet.

Josh's point is that Dolly can't possibly tell whether Jeffy is only saying half a Hail Mary or saying a full Hail Mary. Any midpoint of a saying-half-a-Hail-Mary event is ALSO a midpoint of a saying-a-Hail-Mary event, so if Dolly's in the middle of watching Jeffy executing such an event, she can't know if he's going to stop halfway through or not, so she can have no evidence that he's only saying half.

There are only a couple of ways her report can make sense as a true statement about an ongoing event. One is the way that Josh mentions in his comment -- if he's saying half the Hail Mary by uttering every other word. Then, after hearing just a few words, Dolly could extrapolate the pattern to the end of the prayer, conclude that Jeffy isn't going to say the whole thing, and make her report.

The other way is if she's witnessing an iteration of half-a-Hail-Mary-saying events -- Jeffy has been repeatedly saying half-Hail-Marys. This represents a so-called 'coercion' effect of the progressive+telic verb combination -- rather than one event halfway through, the event is reimagined as consisting of multiple iterated events. "He knocked at the door" could be true with just a single "knock!", but "He was knocking at the door" has to involve multiple knocks -- coercion to an iteration interpretation, since knocking has no internal event duration that the progressive could focus on. Saying a Hail Mary does, though, so there's both the 'normal' and iterated interpretations of the progressive available. And as a description of iterated half-Hail-Mary-saying events, Dolly's report makes sense.

I imagine this is why it's the Hail Mary and not some other bedtime prayer that is mentioned in the caption -- in my media-based and sketchy impression of Catholicism, Hail Marys are a prayer that is often said repeatedly, yes? Poor little Jeffy. Now he'll have to go back and say all the other half-a-Hail-Marys. Halfs-a-Hail-Mary? Certainly not halves-a-Hail-Mary. Hmm!


Posted by Heidi Harley at 03:50 AM

May 10, 2007

Drop and give me 50 (conjugations)

According to Pauline Jelinek, "Pentagon creating civilian Language Corps to help in times of war, emergencies", AP wire, 5/9/2007:

The Pentagon is setting up a civilian Language Corps, a cadre of some 1,000 foreign-language speakers who can help the government in times of war and national emergencies.

In a three-year pilot program, the Defense Department will recruit volunteers and do testing to see if such a program would work. If successful, a permanent corps could be developed, said Robert Slater, who heads the Pentagon personnel office's security education program.

"The federal government can't possibly identify, hire and warehouse professionals with skills in 150 languages," Slater said Wednesday. "So it's invaluable to be able to respond in emergencies, whether international or national."

Prof. Dennis Baron wrote about an earlier stage of this process, back in January -- "Pentagon declares foreign language a weapon. O.K., maggots, drop and give me 50 conjugations".

"Drop and give me 50 conjugations" is clever, I like that -- but Prof. Baron's picture of this problem's history is bizarrely off the mark:

...the U.S. is way behind in terms of the foreign language arms race.

That’s because government policy for the past century and a half has been to ensure that real Americans speak only English, whether voluntarily or by force.

For some of the history of how wrong this is, with respect to the WWII era, see these Language Log posts: "A tale of two societies" (3/1/2007); "Linguistics in 1940" (3/11/2007); "The Intensive Language Program" (3/20/2007); "The Chinese Episode" (3/21/2007); "The Burmese story" (4/8/2007).

As for the situation today, the Defense Language Institute is a large and extremely effective organization. And the Education Department's alphabet soup of foreign-language programs (Fulbright-Hays, FLAS, LRC, NRC, LEAS, SEAS, etc.) may not work as well as they should, but this is not because of a government policy that they should fail, but rather, according to a recent review by the National Academies, because of inadequate implementation by educational institutions. And perhaps in some cases, because of outright diversion of resources from language instruction into areas of more interest to some academics, such as literature.

With respect to attitudes towards Americans' knowledge of foreign languages, the main difference between 1941 and 2007 seems to be that in 1941, academics and other intellectuals were somewhat ahead of the government in planning and acting to provide for national linguistic needs, while in 2007, they're way behind, with many if not most of them uninterested in cooperating, and some actively obstructing.

I don't know how Prof. Baron feels about this, but my reading of our colleagues' attitudes is that many of them are indifferent to the problem of foreign language instruction, and therefore happy to see the federal funds allocated for this purpose diverted to the study of literature and other areas that interest them more. A smaller fraction is actively opposed to anything that would improve the pool of foreign-language talent available to the government. To blame the results on a "government policy ... to ensure that real Americans speak only English" is bizarre.

How could Prof. Baron carry on at such length in a way that seems to be so strongly at variance with the elementary and easily-discovered facts of the case? Well, of course, he's focusing on the question of whether immigrants should be encouraged to learn English, and the whole set of issues that this brings up: accommodation to non-English speakers, bilingual education, and so on. He's written about this before, more than once, as we have here; and on this point, I generally agree with him.

But this is not the same as the problem of ensuring that enough citizens are proficient both in English and in a wide range of other languages, especially with respect to what he jokily calls the "foreign language arms race", that is, the availability of foreign-language speakers for the diplomatic service, the military, and the intelligence services. And it's discouraging that Prof. Baron, an eminent intellectual who clearly knows a lot about many things, is apparently ignorant of the history of this other question, and uninterested in it except as a stick to beat the nativists with.

[OK, I'm being just a tad unfair to Prof. Baron, as this 2001 NYT Op-Ed ("America Doesn't Know What the World Is Saying", 10/27/2001) demonstrates. But even in that piece, he writes that

The federal government might give financial help to colleges trying to improve their programs in Arabic and other strategically important languages. Congress could offer subsidies to students at accredited four-year colleges who choose to study these languages.

as if that's not exactly what the various Title VI centers and other programs have been trying to do. Is it possible that neither he nor the editors of the NYT knew that?]

Posted by Mark Liberman at 08:31 AM

May 09, 2007

Who starts these rumors?

How do these rumors get started? You'd think Mark Liberman, for chrissakes, would be a little more careful. Sure, I did do a reading at the Capitola Book Café while riding a unicycle (or rather simply balancing on it — let's not exaggerate). And at a reading in Seattle I did briefly juggle three live Alaskan king crabs (three, not five as widely reported — it sort of looks like five when I speed up). And during my reading at the MIT Bookstore about a year ago I think I did toss two or three live Maine lobsters around while playing "America the Beautiful" on the ocarina. But I have never done the live marine crustacean stuff and the unicycle thing at the same time. Believe me, working with live parrots is dangerous enough. You really don't want to have a frightened lobster come down on your lap while you're trying to keep your balance on a unicycle and read from a Language Log post about adjective frequency at the same time. Not unless you have top-quality medical insurance that will pay for reattachment surgery.

And what's this about performing "rants"? Is this me we're talking about? When have I ever engaged in ranting? Just open your eyes and take a look at my oeuvre. I can be critical, yes. But hey, did I fall asleep and miss Congress passing a law that said we have to switch our critical faculties off while we blog? Huh? Some things deserve a little mild criticism. There is certain poetry that I believe thinking people have a right to object to. There is usage advice written by shameless, pontificating, ignorant, hypocritical, incompetent, authoritarian old weasels that I think should be forthrightly described as such. And there are things that are simply stupid. But I really think that calling me a ranter because I have the courage to point these things out is a case of blaming the victim. It's like, so unfair.

And everyone is so sensitive about criticism these days. When I suggested — in a perfectly reasonable, measured critical discussion concerning the speech of people like young Bakovic — that perhaps some of the prenominal attributive modifier strings in on-line discussions of certain popular music genres might resemble the language use of chimpanzees, I actually got a letter of complaint. From Jane Goodall.

Anyway (where was I? did I have a theme here? oh, yes), Mark is in all sorts of ways exaggerating about the event in San Francisco this Saturday night. From the way he tells it you might think that this will be mostly an ordinary San Francisco evening of women's fiction and wild erotica and outrageous humor, and thus that some of you who like a little linguistics in your lives might feel ill served. Not so. There will be lots of linguistics. Although it is true that I will be the only professional linguist on the bill, we are in fact planning to trade off specialisms a bit. I will be reading some gay Asian erotica that I've written, while Jaime Cortez will be presenting a new distributed morphology analysis of Tagalog verbal infixing. You may be aware that Liz Maverick's next sensual linguistic action-romance novel is about two women who discover the secret to decoding the Voynich manuscript (after which they have sex); but Liz will actually be doing some phonological theory on Saturday night. Stephanie Paul is normally extremely funny, but not when she's giving a lecture on the philosophical shortcomings of the case for linguistic nativism, she isn't; and that is what she promises for this weekend.

So really, it's mainly linguistics. And what with the fact that Geoff Nunberg has promised to stop by, and my profoundly cool son Calvin will be there, and there will be drinks... it's going to be such a party.

[Added later: It was, too. It was a fantastic evening. I had a lot of fun, and my co-performers were extraordinary. Thanks to all the Language Log fans who attended and came over to chat.]

Posted by Geoffrey K. Pullum at 06:18 PM

Pullum with Drinks, 5/12/2007

If you can get to San Francisco this weekend, you won't want to miss Geoff Pullum at Writers With Drinks. According to the notice sent out by Charlie Anders, the founder and ringmaster of the event,

It's been scientifically proven that most people have around FIVE (± 3.2225) life-changing experiences in the course of their lives. It can be a drag knowing you've only had one or two of these experiences so far, and wondering how soon you'll have to rearrange your dog-hair lamp collection after the next one. Which is why next Saturday's Writers With Drinks will provide you with ALL of the life-changing experiences you will have for the rest of your life. Yes, this will be a scream-eating, knuckle-popping ride. But at the end of it, you can be sure your life will never change again. Stability is joy!

When: Saturday, May 12, 2007, 7:30 to 9:30 PM, doors open at 7:00 PM
Featuring: Jami Attenberg, Jaime Cortez, Geoffrey Pullum, Liz Maverick & Stephanie Paul
Where: The Make Out Room, 3225 22nd. St. between Mission and Valencia, San Francisco
How much: $3 to $5 sliding scale, all proceeds go to other magazine.

Rumor has it that Geoff will perform some rants about The Number of Words for X in Y -- featuring the Moken, the Chulym, the Eskimos, and the Arabs -- while riding a unicycle and juggling a selection of live seafood.

What about Saturday's other writers? Well, according to Charlie's descriptions, Geoff will be adding the note of light and racy linguistics that a savvy organizer of tavern-based readings is always looking for, to balance an evening of material that otherwise might be too solemn and cerebral in character:

Jami Attenberg is the author of Instant Love and the forthcoming novel The Kept Man. Her writing has appeared in Salon, Nerve, Pindeldyboz, Spork, Print, Nylon, Radar, the SF Chronicle, Time Out NY and others.

Jaime Cortez edited the anthology Virgins, Guerillas & Locas and edited Corpus magazine and the zine A La Brava. He's been published in Best Gay Asian Erotica, 2SexE, Queer PAPI Porn and Besame Mucho. He co-founded the comedy group Latin Hustle. He illustrated the graphic novel Sexile/Sexilo.

Stephanie Paul has performed at The Comedy Store in LA, the Comedy Club in San Francisco and Improv Hollywood. She appeared on TV in both Hercules and Xena: Warrior Princess, and in the movies Crazy Love, Film School Confidential and The Frequency of Claire.

Liz Maverick's action-romance novels include What A Girl Wants, The Shadow Runners, Crimson Rogue and Crimson City. Her newest book, Wired, launches the Shomi line for Dorchester Publishing. Her story "Kiss or Kill" appeared in Secrets 8: The Best In Women's Sensual Fiction.

This will also be your opportunity to get Geoff to autograph Far from the Madding Gerund. If you leave your copy behind, in your rush to get to the reading on time, there will probably be spares on hand.

Posted by Mark Liberman at 10:05 AM

May 08, 2007

Advice to the president: omit needless "in other words"

Overuse of the phrase "in other words" is one of the verbal tics that Garry Trudeau caricatures in President Bush:

Is this fair?

Slate's Complete Bushisms includes seven examples of this phrase, among 466 quoted passages comprising about 12,000 words, for a rate of 583 per million words. By comparison, in 2,559,992,056 words of English news text indexed at the LDC's online site, there are 21,576 examples of the string "in other words", for a rate of about 8.4 per million words; and in 26,151,602 words of English conversational transcripts indexed at the same site, there are 535 examples of "in other words", for a rate of about 20.4 per million words.

So the rate at which this phrase is used in the Complete Bushisms is about 69 times greater than the rate found in 2.6 billion words of journalistic text; and about 29 times greater than in 26 million words of conversational transcript.

Of course, the Complete Bushisms is surely a biased sample. (And I'm not a fan of this particular genre of personality politics, for reasons explained here.) But in a long (87-minute) extemporized oration from a couple of weeks ago ("President Bush Discusses the Global War on Terror in Tipp City, Ohio", 4/23/2007), the president used "in other words" 17 times in about 12,500 words, for an astonishing rate of about 1,360 per million -- 162 times the rate in news text, and 67 times the rate in conversational transcripts.
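For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the counts quoted above; the function and variable names are illustrative. One assumption to flag: the multiples are computed against the baseline rates as quoted in the text (8.4 and 20.4 per million), which seems to be how the figures above were obtained. Recomputing the baselines from the raw counts gives roughly 8.43 and 20.46, which would nudge the multiples down slightly.

```python
PER_MILLION = 1_000_000

def rate_per_million(hits, words):
    """Occurrences per million words."""
    return hits / words * PER_MILLION

# Baseline rates recomputed from the raw counts quoted in the post.
news_rate = rate_per_million(21_576, 2_559_992_056)
conversation_rate = rate_per_million(535, 26_151_602)

# Baseline rates as quoted in the post (assumed to be the ones
# used for the multiples given in the text).
NEWS_RATE_QUOTED = 8.4
CONVERSATION_RATE_QUOTED = 20.4

# The two Bush samples; word counts are the approximate ones from the post.
bushisms_rate = rate_per_million(7, 12_000)
tipp_city_rate = rate_per_million(17, 12_500)

for label, rate in [("Complete Bushisms", bushisms_rate),
                    ("Tipp City speech", tipp_city_rate)]:
    print(f"{label}: {rate:.0f} per million, "
          f"{rate / NEWS_RATE_QUOTED:.0f}x news text, "
          f"{rate / CONVERSATION_RATE_QUOTED:.0f}x conversation")

print(f"Recomputed baselines: news {news_rate:.2f}, "
      f"conversation {conversation_rate:.2f} per million")
```

Since both Bush word counts are approximate ("about 12,000" and "about 12,500"), the multiples are best read as orders of magnitude rather than precise figures.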

So it's apparently fair to say that President Bush really does overuse this expression -- and he uses it at a rate that is much higher than needed for us to notice it as characteristic of him.

What about the other aspect of Trudeau's caricature, namely the use of "in other words" to connect two sentences that say exactly the same thing, in essentially the same words, with the substantive morphemes merely re-arranged:

I find it curious that they would offer comfort to our enemies instead of to our warriors. In other words, offering comfort to our enemies instead of to our warriors is something that I find curious.

There are a couple of examples somewhat like this in the Complete Bushisms, where the two parts do involve some different words, but are so close in meaning that the connective "in other words" seems odd to me:

"He was a state sponsor of terror. In other words, the government had declared, you are a state sponsor of terror."—On Saddam Hussein, Manhattan, Kan., Jan. 23, 2006

"The march to war affected the people's confidence. It's hard to make investment. See, if you're a small business owner or a large business owner and you're thinking about investing, you've got to be optimistic when you invest. Except when you're marching to war, it's not a very optimistic thought, is it? In other words, it's the opposite of optimistic when you're thinking you're going to war." —Springfield, Mo., Feb. 9, 2004 (Thanks to Garry Trudeau.)

And I found one example of this type in the recent Tipp City speech:

... And I believe that we ought to change the tax code so an employee of a corporation is treated equally as somebody who is self-employed. In other words, the tax treatment ought to be the same ...

But these examples are very much in the minority.

I've listed all the examples of "in other words" from the Tipp City speech below. Reading over them, it seems to me that President Bush uses "in other words" so often that much of the meaning has been bleached out of it for him. It's become a relatively empty sort of sentence qualifier, like "so" or "you know", and may be deployed in cases where leaving it out entirely would improve things:

A lesson learned was that, at least in my opinion, that in order to protect us, we must aggressively pursue the enemy and defeat them elsewhere so we don't have to face them here. In other words, if what happens overseas matters to the United States, therefore, the best way to protect us is to deal with threats overseas. In other words, we just can't let a threat idle; we can't hope that a threat doesn't come home to hurt us.

So what's apparently going on is that in extemporaneous public speaking, President Bush, like most of us, sometimes restates things in consecutive sentences. But unlike most of us, he uses "in other words" every 30 sentences or so, more or less as a pause filler, and sometimes he uses this phrase between re-statements that are not really different enough for it to be appropriate.
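Anyone who wants to try the same count on another transcript could start from something like the following Python sketch. It is not the method behind the LDC figures: the function name `phrase_rate` is hypothetical, and the tokenization (runs of letters, digits and apostrophes) is an assumption that will differ from whatever word-counting the LDC index uses. The sample text is part of the Tipp City passage quoted above.

```python
import re

def phrase_rate(text, phrase="in other words"):
    """Return (hits, word_count, hits per million words) for phrase in text.

    Matching is case-insensitive and respects word boundaries, so
    "within other words" does not count as a hit.
    """
    pattern = r"\b" + re.escape(phrase) + r"\b"
    hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
    # Crude tokenization: runs of letters, digits and apostrophes.
    word_count = len(re.findall(r"[A-Za-z0-9']+", text))
    rate = hits / word_count * 1_000_000 if word_count else 0.0
    return hits, word_count, rate

sample = (
    "In other words, if what happens overseas matters to the United "
    "States, therefore, the best way to protect us is to deal with "
    "threats overseas. In other words, we just can't let a threat idle; "
    "we can't hope that a threat doesn't come home to hurt us."
)

hits, word_count, rate = phrase_rate(sample)
print(f"{hits} hits in {word_count} words ({rate:.0f} per million)")
```

A passage chosen because it contains the phrase will of course show an enormous rate; the per-million figure only means something over a whole speech or corpus.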


Here are the rest of the Tipp City "in other words" cites:

The enemy succeeded in causing there to be sectarian strife. In other words, the government wasn't ready to provide security.

There's a -- by the way, every new phase of history has its own unique features to it. For example, you've got a kid in the battlefield and he's emailing home every day. Or, four-hour [sic] news cycles. There's a lot of -- asymmetrical warfare, or $50 weapons are sometimes used to defeat expensive vehicles. In other words, these are different times.

Clearly, there's different points of view, and that's fine. That's the greatness about our society. In my discussions with the leaders, I said, you have the authority to pass the funding legislation. That's your authority, not mine. I submitted what the Pentagon thinks it needs. In other words, the process works where I ask the Pentagon, how much do you need? What do you need to do the job? And they submitted their request, and then we, on behalf of the Pentagon, sent it up to Congress. And they have the authority to pass the -- pass the bill any way they see fit.

The difference, of course, is that this time around the enemy wouldn't just be content to stay in the Middle East, they'd follow us here.
It's interesting, I met with some congressman today, and one person challenged that. He said, I don't necessarily agree with that. In other words, I have told people that this is a unique war where an enemy will follow us home, because I believe that. But if you give al Qaeda a safe haven and enough time to plan and plot, I believe the risk is they will come and get us.

I believe it's in the interest of the United States to have a comprehensive immigration plan that meets certain objectives: one, helps us better secure our border; two, recognizes that people are doing work here that Americans are not doing; three, that recognizes that we are a nation of immigrants, and we ought to uphold that tradition in a way that honors the rule of law; four, that it's in the interest of the country that people who are here be assimilated in a way that -- with our traditions and history. In other words, those who eventually become citizens be assimilated. In other words, one of the great things about America is we've been able to assimilate people from different backgrounds and different countries. I suspect some of your relatives might be the kind of people I'm talking about.

It means some barriers, whether they be vehicle barriers, or fencing, different roads to make our enforcement folks be able to travel easier on the border; UAVs -- unmanned aerial vehicles -- infrared detection devices. In other words, this border is becoming modernized.

Now there is double fencing in this area, with a wide area in between that our Border Patrol are able to travel on. In other words, we're beginning to get a modernization program that's pretty effective.

It seems like to me that it's in our national interest to let people come on a temporary basis to do jobs Americans are not doing, on a temporary, verifiable basis, with a tamper-proof card, to let people come and do jobs Americans aren't doing, and let them go home after that so that they don't have to sneak across the border.
In other words, if there's a way for people to come in an orderly way, they won't have to try to get in the bottom of the 18-wheeler and pay a person thousands of dollars to smuggle them into the United States of America. There are a lot of employers who are worried about losing labor here in the United States. They don't know whether they're legal or illegal, by the way, because not only is there a smuggling operation, there's a document forging operation. In other words, the law that we have in place has created an entire underground system of smugglers, inn keepers, and document forgers.

And what's the most important thing we can do for this volunteer army is to provide certainty for our families.
In other words, you sign -- you volunteer to be in the military and you're deployed; we want to make sure there's certainty so that families can prepare.

Just so you know, I am concerned that a soldier getting out of -- or a Marine getting out of uniform and stays in the defense -- is transferred seamlessly from the Defense health system to the Veterans health system. In other words, one of my concerns is that there is a gap. And we owe it to these families, and these soldiers and Marines to make sure that that service is seamless.

I believe we have proven that the best way to balance the budget -- and I know many of you are concerned about a balanced budget -- is to grow the economy through low taxes, which means enhanced revenues, and be wise about spending your money. In other words, pro-growth economic policies have proven to work. And it turns out that when the economy grows, taxes increase. And therefore, the corollary is to make sure we don't over-spend.

The tax code discriminates against an individual on health care decisions. And I believe that we ought to change the tax code so an employee of a corporation is treated equally as somebody who is self-employed. In other words, the tax treatment ought to be the same, all aimed at encouraging individual decision-making in the marketplace.

And so we modernized Medicare with the prescription drug benefit, but we also did something unique when it came to government programs. We gave seniors choices. In other words, we created more of a marketplace.

[Update -- Paul B. sent in a link to a blog post from 2/8/2005, in which he offers a less charitable analysis of the president's fondness for "in other words".]

[Update #2 -- there's an interesting post about this on the Economist's Democracy in America blog, under the heading "Breaking it down for you":

I heard someone say of someone else recently, "he's the kind of guy who breaks it down for you." It was an amiable, not-too-harsh putdown. It's not "he's arrogant and condescending", quite. It's that he really styles himself a straight shooter, a man's man, one of those few people who can see through the cant and the crap, and he's going to do you a favour and break it down for you.

Bush breaks it down for you. That's why you see it so often in the "life expectancy" and "asymmetrical warfare" kind of examples—I'm sounding like Washington! I better get real and break it down again. I think he's so eager to break it down for you that he's come to rely on "in other words", and that's why he sometimes uses it as mere pause-filler.

Bush's "in other words" is born of a decent human instinct and also a canny political tactic: sound like the people, not like the officials. But he's come to define himself so much with it that its use has become nigh-absurd.]


Posted by Mark Liberman at 05:33 PM

Spinning Fish: Mullahs defend Herodotus

In Stanley Fish's most recent NYT essay ("The All-Spin Zone", 5/6/2007), he unspins unSpun, Brooks Jackson and Kathleen Hall Jamieson's book about how to cope with political rhetoric:

The book’s subtitle tells it all: “Finding Facts in a World of Disinformation.” [...] The idea is that while “we humans aren’t wired to think very rationally” and are prone to “letting language do our thinking for us,” we can nevertheless become “more aware of how and when language is steering us toward a conclusion.” In this way, Brooks and Jamieson promise, we can learn “how to avoid the psychological pitfalls that lead us to ignore facts or believe bad information.”

It all sounds so – well – rational: There’s a world of fact out there waiting to be accurately perceived, but the distorting power of words, abetted by the psychological disorders of passion and bias, tends to obscure it and lead us astray. And the remedy? Watch your words and watch your mental processes, paying particular attention to your “existing beliefs” lest they “reject evidence that challenges them.” In short, Jackson and Jamieson recommend, “practice active open-mindedness.”

I haven't read the Jackson and Jamieson book, but their prophylactic section headings sound right to me: “Check Primary Sources,” “Know What Counts,” “Know Who’s Talking,” “Cross-check Everything That Matters,” “Be Skeptical, But Not Cynical.” But Fish, a subtle though unregenerate post-modernist, is having none of it.

For him, reality is rhetoric:

Language (or discourse), rather than either reflecting or distorting reality, produces it, at least in the arena of public debate. [...] Clarity is not a condition of unbiased vision; it is a rhetorical achievement.

Insight is bias:

“Active open-mindedness” – standing to one side of our beliefs and assumptions in the service of unbiased observation – is another name for having no mind at all. Open-mindedness, far from being a virtue, is a condition which, if it could be achieved, would result in a mind that was spectacularly empty. An open mind is an empty mind.

And truth is power:

When Jackson and Jamieson declare that Rove’s “upbeat picture” of the economy is divorced from reality, they think of reality – in this case the reality of economic conditions – as ready to reveal itself so long as we adhere to the appropriate evidentiary procedures (like “cross-check everything”). But the reality of the economic situation will emerge when one of the competing accounts (Rove’s or Jackson’s and Jamieson’s) proves so persuasive that reality is identified with its descriptions.

This is all served artfully on a bed of pomo neo-Whorfianism:

Forms of language – pieces of vocabulary, proverbial aphorisms, slogans, revered examples of wisdom, metaphors, analogies, precedents and a whole lot more – furnish our consciousness; they are what we think with, and we can’t think without them (in two senses of “without”).

Then again, maybe the way that I just spun his review is entirely unfair. Go read it and decide for yourself.

Me, I'm wondering about a different professor, Nureddin Zarrinkelk. According to one cryptic sentence in this morning's New York Times (Nazila Fathi, "Beating by Guards Fails to Stop Voting, Iran Students Say", NYT, 5/8/2007):

A prominent art professor, Nureddin Zarrinkelk, was expelled from Tehran University last week after he commented about the beauty of a woman’s hair at one of his classes. [emphasis added here and throughout]

A search on Google News turned up a different description of his crime ("Cartoon Celebrity Banned from University", 5/7/2007):

Nureddin Zarrinkelk, a celebrated Iranian animated cartoonist, has been expelled from the fine arts department of the university of Tehran where he taught, reports said Monday. He is accused of having made fun of a female student who was wearing the full Islamic chador covering calling her a "primitive" according to pro-government news agency Rajanews. Zarrinkelk, 70, an icon of Iranian animated cinema, will not be allowed to teach anymore as part of new punitive measures in a moralization campaign which kicked off late last month and provides for the arrest of women who do not abide by strict Islamic dress codes.

In fact, it turned up more than one ("Prominent Iranian Professor Sacked for 'Insulting Veil'", Payvand, 5/7/2007):

A prominent Iranian professor of fine art has been sacked from his university for allegedly offending a fully veiled woman during class.

Nureddin Zarrinkelk, known as the father of Iranian animation, is accused of having "insulted" the female student at Tehran University by showing some of her hair to other students.

Iranian Science Minister Mohammad Mehdi Zahedi has said that Zarrinkelk insulted the Islamic veil and has been banned from teaching in any university.

And yet another ("Professor sacked for insult", Gulf Times, 5/8/2007):

A prominent Iranian professor has been sacked for offending a fully veiled woman during class amid mounting protests at a prestigious Iranian university over insults to Islam, the press reported yesterday.

The reports said that Nureddin Zarrinkelk, a professor of fine art known as the father of Iranian animation, “insulted” the female student at Tehran University by questioning why she wore the full Islamic chador.

The incident then sparked more protests at another Tehran university - Amir Kabir - that has already been the scene of demonstrations related to the publication of caricatures deemed offensive to Islam.

“The father of Iranian animation was expelled from university and cannot teach in any university because of his insult towards the hijab of a university student,” the reformist Etemad daily reported.

It quoted Science Minister Mohamed-Mehdi Zahedi as saying that Zarrinkelk was expelled from Tehran University for “insulting the Islamic hijab” and has been banned from teaching in any university.

According to the Etemad reports, the incident was sparked during a classroom discussion over an image of a bald angel drawn by a student when the professor asked the woman if she wore the full veil because she herself was bald.

Following Jackson and Jamieson's method, I conclude that Prof. Zarrinkelk probably said something to a chador-wearing female student that she (or others in the class) took as an insult to her choice of clothing; but at this point, I don't really know what happened, in the narrow sense of who said and did what in that Tehran classroom. I have only a tiny bit more knowledge about what happened in larger political and social terms, mostly derived from previous experience rather than from these various journalistic accounts.

If I were running a newspaper, though, I'd look back at the last time Prof. Zarrinkelk was in the news, just a couple of weeks ago ("Iranian films can respond to '300': ASIFA president", Mehr News Agency, 4/25/2007), and run the story of his firing under the headline Mullahs defend Herodotus.

Iranian president of the Association Internationale du Film d'Animation (ASIFA) said that national films and animations can respond to the Warner Brother's historically inaccurate film '300'.

Speaking in a review session of the film, at Tehran's Iranian Artists Forum last Saturday, Nureddin Zarrinkelk called on Iranian cultural officials to equip filmmakers so that they can show the world an accurate representation of the historical events which have been misrepresented in the film '300'.

The writings of Herodotus, which are the main source for such anti-Iran films, are full of unsubstantial claims, he noted.

Perhaps this account of the forces behind his firing -- or alternatively, his account of the battle of Thermopylae -- would "prove so persuasive that reality is identified with its descriptions". Stranger things have happened.

[There are interesting discussions of Fish's essay by Kenny Easwaran at Thoughts Arguments and Rants ("Fish on Spin"), and by Natalia at My Tongue Broke Out in Unknown Strains ("Artful pomo").]

Posted by Mark Liberman at 07:52 AM

India begins monumental language documentation project

Later this month, linguists from across India will convene to begin work on a 10-year, US$100M project to survey 400+ Indian languages. The New Linguistic Survey of India will involve 44 academic institutions and some 10,000 linguists and language experts, making it the largest national language documentation effort to date. The project will describe each language and speech variety, compiling lexicons, grammar sketches, audiovisual documentation, and language maps, and will disseminate these materials over the web.

On a recent visit to the Indian Summer School on Natural Language Processing, I called in at the Central Institute of Indian Languages, Mysore, the coordinating site for the project, and met the deputy director Professor Rajesh Sachdeva. He is busy with logistics for the six-week summer school starting later this month, bringing 400 linguists to Mysore to take stock of the current state of knowledge about Indian languages and to provide advanced training in linguistic survey and analysis.

In a recent interview, Professor Udaya Narayana Singh, director of the institute, described plans to "develop a Linguistic Data Consortium for Indian Languages (LDCIL) on the lines of the Linguistic Data Consortium at the University of Pennsylvania — a hugely successful consortium of 100 companies, universities and government agencies that aids research in linguistic technologies."

Posted by Steven Bird at 02:02 AM

May 07, 2007

Gresham's Law meets the Law of Group Polarization

Over at the web forum Free Republic, yesterday at 4:53:04 PM PDT, "blam" posted without comment a press release (from the American Roentgen Ray Society, via Science Daily) about how "Voice Recognition Systems Seem To Make More Errors With Women's Dictation".

Within five minutes, "USFRIENDINVICTORIA" posted a link to a Language Log post ("Sex and Speaking Rate", 8/7/2006). The cited LL post debunked Louann Brizendine's bizarre claim that women speak twice as fast as men, by showing that none of her cited references said anything about the matter, and presenting data from a large study where the measured difference was a 2% average speaking-rate advantage for men over women.

When I saw the Free Republic citation, following a link from our referrer log, I was pleased -- until I noticed that the LL link was cited in support of Brizendine's claim, not in opposition to it. Here is USFRIENDINVICTORIA's contribution in its entirety:

This might be a reason:

“Girls speak faster on average — 250 words per minute versus 125 for typical males.”

OK, I figured, USFRIENDINVICTORIA did a quick web search and didn't read past the Google snip, or something like that. Somebody in the thread will set the record straight pretty soon.

Fat chance.

Instead, over the next three hours or so, there's a little explosion of sexist jocularity:

Careful guys. This is a landmine waiting to explode! ;O)

Discrimination! We need to lower male voice recognition rates to that of female!

i seem to have the same problem...

Oops! I sense a NOW/ACLU moment here...

chauvinistic computers? lol

Woman say four times as many words a day than men...and during an argument, thst rate soars to ten times more.
Maybe that's why I can't stand to watch a talk show with only women.

Just set "WhineyVoiceLevel" to at least 80 (0-100 scale) and NagOvertones=True...

They don’t listen either...

With gusts up to 375 words per minute...

They may say four to ten times more words, but they certainly don't communicate four to ten times as much information. That's why I can't tolerate women's talk shows.
It's like going into the hen house after all the chickens have laid an egg. A rooster makes a lot of noise, but only once a day.

I bought some noise cancelling headphones, but was dusappointed to find that I still hear my wife just fine when using them.

Perhaps the computer needs a how to interprint “nagging” algorithm
could be the fact there is no “reading your mind” algorithm? (I didn’t MEAN to say that you should have known that!)

In general, I'm impressed by the quality of thought and research on the web. But in some contexts, it seems that an intellectual analog of Gresham's Law applies.

Actually, it's worse than that. It's not only that bad ideas drive good ideas out of circulation, but also that certain kinds of bad ideas reinforce themselves, becoming stronger in the people who believe them to start with, and taking root in the people who don't.

This is a particularly noxious form of the Law of Group Polarization, which says that "members of a deliberating group predictably move towards a more extreme point in the direction indicated by the members' predeliberation tendencies" (Cass R. Sunstein, "The Law of Group Polarization", Journal of Political Philosophy 10(2), 175-195, 2002; working papers version here).

As Sunstein explains, "[G]roups consisting of individuals with extremist tendencies are more likely to shift, and likely to shift more (a point that bears on the wellsprings of violence and terrorism); the same is true for groups with some kind of salient shared identity (like Republicans, Democrats, and lawyers, but unlike jurors and experimental subjects). When like-minded people are participating in 'iterated polarization games' -- when they meet regularly, without sustained exposure to competing views -- extreme movements are all the more likely."

In cases like the Freeper thread that I cited, there seems to me to be an additional factor. In addition to the basic group-polarization dynamic, there's a sort of Gresham's Law effect, whereby people with a taste for the rational evaluation of evidence are likely to withdraw from a forum whose participants are so obviously uninterested in the facts of the matter. As a result, as the group opinion becomes more extreme, the standards of evidence get worse and worse, until we get to the point illustrated in that Freeper thread: a freely-available web link is cited to "prove" the opposite of what it plainly says, and 30-odd participants chime in enthusiastically, over a period of several hours, without even noticing.

It's striking how common this sort of reaction is, right across the political, social and geographical spectrum, to "scientific" evocations of social stereotypes. I've noted this in several previous posts about reactions to Louann Brizendine's book, e.g.

Censorship at the Daily Mail (11/29/2006)
Contagious misinformation (12/1/2006)
Femail again (12/2/2006)
Bible Science stories (12/2/2006)
Fabricated but true? (12/3/2006)
The spread of bogus numbers in the meme pool (12/16/2006)
"Gender myths: letting science mislead" (9/30/2006)

A depressing hypothesis:

  • The Law of Group Polarization applies especially strongly to discussion of social (and individual) stereotypes
  • The role of "science" in such discussions is mainly not to supply prestige, much less evidence, but rather to license the expression of prejudices that are otherwise out of bounds (i.e. "politically incorrect")
  • This applies to "pack journalism" as much as to web forums and bar-room conversations
  • Anyone interested in the facts will tend to withdraw from such discussions

[Note that better performance of ASR algorithms on male vs. female voices is a well-established trend, often observed in the literature. The reason is generally assumed to be that higher-pitched voices have more widely spaced harmonics, with the result that short-time amplitude spectra (the basis of essentially all ASR systems) give a less reliable indication of the shape of the vocal tract and therefore of the identity of the vowels and consonants being produced. As far as I know, human speech perception is not affected by this difference, at least not to anything like the same degree (though a more extreme version of the same effect is part of the reason that operatic sopranos tend to be hard to understand).

This has nothing to do with the rest of the discussion above; it's just to register the point that there was nothing wrong with the original press release -- it simply triggered an episode of the type that I've described.]
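
For readers who want to see the harmonic-spacing point in numbers, here is a minimal sketch. The two pitch values (110 Hz and 220 Hz) and the 4 kHz band are illustrative assumptions of mine, not figures from the press release or from any ASR system. The harmonics of a voiced sound fall at integer multiples of the pitch, so doubling the pitch halves the number of points at which the spectrum samples the vocal-tract envelope.

```python
def harmonics_below(f0_hz, max_hz=4000.0):
    """Return the harmonic frequencies (integer multiples of f0_hz) up to max_hz."""
    count = int(max_hz // f0_hz)
    return [f0_hz * k for k in range(1, count + 1)]


# Illustrative (assumed) pitch values, not measurements from any study.
for label, f0 in [("lower-pitched voice", 110.0), ("higher-pitched voice", 220.0)]:
    h = harmonics_below(f0)
    print(f"{label}: f0 = {f0:.0f} Hz, {len(h)} harmonics below 4 kHz, "
          f"spaced {f0:.0f} Hz apart")
```

With these assumed values, the higher-pitched voice gives the analysis half as many spectral samples of the same envelope, which is the sense in which the short-time spectrum becomes a less reliable guide to vocal-tract shape.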

Posted by Mark Liberman at 07:05 AM

Diaper-headed Whores; Or, How Not to Translate "Nappy-headed hos"

[This is a guest post by Reinhold Aman, editor of Maledicta: The International Journal of Verbal Aggression]

On April 4, 2007, radio host Don Imus, on his nationally syndicated program "Imus in the Morning," referred to the Rutgers University women's basketball team as "nappy-headed hos."

Imus was speaking with executive producer Bernard McGuirk when the NCAA Championship game between Rutgers and Tennessee came up. McGuirk compared the game to "the jigaboos versus the wannabes," and Imus commented, "That's some rough girls from Rutgers. Man, they got tattoos...." "Some hardcore hos," replied McGuirk, to which Imus added, "That's some nappy-headed hos there, I'm going to tell you that."

These three words -- nappy-headed hos -- set off the media frenzy that sometimes erupts when a well-known person says something racist, sexist or homophobic in public. And they also resulted in a host of false, misleading or ridiculous translations around the world.

For the past 40 years, two of my major interests have been maledicta (insults, curses, slurs, blasphemies, obscenities, vulgarities, and other "bad words") in all languages and their (mis)translations into other tongues. Don Imus's "nappy-headed hos" has demonstrated once again how incompetent many translators are in matters of maledicta and how non-English-speaking readers are misled by such poor translations. The Italian saying traduttori, traditori ("translators are traitors") comes to mind. Almost all translators mistranslated nappy-headed or hos or both. Below are samples from 16 languages to prove my assertion that foreign readers were severely misled by the wrong translations and that Don Imus was depicted as having been far nastier than he actually was.


Some French, Italian and Swedish translators made the understandable (but inexcusable) mistake of looking up nappy only in British dictionaries, instead of also consulting American ones; after all, their source, Don Imus, was quoting Black American English.

My four U.K. dictionaries (Chambers, Collins, Concise Oxford, Longman) define nappy only as "(baby's) napkin," American English "diaper," without any reference to hair, except for Collins which also lists "having a nap; downy; fuzzy" among its seven definitions of that adjective. For whatever reason, those translators were not puzzled by their strange translation "diaper-headed" or by the bizarre image of black women having diaper-shaped heads or wearing diapers on them. Perhaps those translators thought that nappy-headed was a synonym of "rag-headed" or "towel-headed," common pejoratives applied to Arabs because of their customary headdress.

The Germanic languages' equivalents of "nappy-headed," kraushaarig, kroeshaar, krøllhåret and krullhårig are neutral terms for such hair (all meaning only "curly-haired"), unlike the English nappy, frizzy and kinky, which have (slightly) pejorative connotations.

The Spanish, Catalan and Romanian translations of nappy as "dirty, disheveled, unkempt, filthy, nasty, soiled, unclean, unwashed" are very wrong, because nappy characterizes only the shape of hairs (curly, kinky, frizzy), not their state of cleanliness or the social class and level of education of curly-haired people, as a major Romanian newspaper implied.


Black English ho, derived from the pronunciation of "whore" (analogous to poor > po, for/four > fo, door > do), has a wide range of meanings and should not be simply translated as "whore." Translating it as "whore" is just as foolish and wrong as translating the positive black phrase "She my bitch" into German as Sie ist meine Hündin. The correct German equivalent is Sie ist meine Alte ("my old lady"). And some merry gentleman endearingly referred to as a "gay old dog" certainly isn't a homosexueller alter Hund.

According to American blacks interviewed, depending on context and who uses it, ho can mean anything from "an affectionate term of endearment" to vile misogynistic and racist abuse. While ho is used by American blacks of certain ages and income levels, especially by rap and hip-hop musicians, many blacks dislike this word intensely -- especially when a white person (like Imus) uses it.

Black English ho has become only mildly pejorative for some (mainly male) users, reaching something like the status of "bitch," or perhaps even "chick" or "broad." However, it remains quite offensive to nearly all women, African-American or otherwise, and is deeply offensive in its implications, especially in the sort of context that led to a recent lawsuit against the NYPD. Still, it is no longer necessarily just a dialect pronunciation of "whore."

Similar negative/positive terms exist in many other languages, such as Yiddish mamzer, French crotte and German Hund, where the user, target and context indicate whether the word is an insult or a compliment.

In all Germanic languages, hos was mistranslated as "whores": hoeren, horer, horor, Huren. Most other translators did the same: donotes, prostitutas, prostitute, putinhas, puttane; curve, kurvák, kurwy; drolje; orospular. They had no idea what ho really means, and their readers must have been shocked by Imus's purported nastiness of calling those athletes "black whores," "prostitutes" or "sluts." He absolutely did not; his intent was to refer to them in a jocular sexist way, with roughly the force that white Americans associate with a term like "broads," which is far less offensive than "whores". In this, Imus followed the lead of his producer, Bernard McGuirk, who was the real culprit. It was McGuirk who first called the basketball players "some hardcore hos," with which Imus then agreed: "That's some nappy-headed hos there...."

(Of course, that sort of thing is exactly what Imus hired McGuirk to do, as "On the Media" documented in 2001 -- see the transcript here. Another typical example of McGuirk in action: during Imus's Nov. 2, 2005 broadcast, shortly after President Bush nominated Judge Samuel Alito to the U.S. Supreme Court, McGuirk called Alito a "meat-ball sucking wop." That ethnic slur, like many others on the program over the years, didn't make the international news.)

In the Romance languages examined, the Spanish mujerzuelas and Portuguese mulherzinhas are fairly good translations, because they use only deprecatory words for "woman" (mujer, mulher plus mild pejorative suffixes). All others (Catalan, French, Italian, Romanian) using the equivalents of "whores" and "prostitutes" are dead wrong.

Not being competent in Slavic languages, I checked only Croatian and Polish. It's curious that of all languages checked, only the two Polish papers didn't print the complete word but euphemized kurwy ("whores") as k.... and k...y. Yet Kurwa! (used like Fuck! or Shit!) is not an extremely vulgar word and is one of the most popular Polish exclamations.


1. Dutch:
hoeren met kroeshaar (whores with curly hair). de Volkskrant (The Netherlands).

2. Flemish:
hoeren met kroeshaar. De Standaard (Belgium).

3. German (Germany, Austria, Switzerland):
(a) kräuselhaarige Huren (curly-haired whores). Der Spiegel (Germany). Other German publications used the synonymous kraushaarige.

(b) kraushaarige Huren (curly-haired whores). ORF = the official Austrian radio and TV network.

(c) kraushaarige Huren. Basler Zeitung (Switzerland).

4. Norwegian:
krøllhårete horer (curly-haired whores). Dagbladet (Norway).

5. Swedish:
(a) krullhåriga horor (curly-haired whores). Eskilstuna-Kuriren (Sweden).

(b) blöjhövdade horor (diaper-headed whores). STV = Swedish TV.

6. Bavarian:
This is a language without a standard orthography, but I would translate "nappy-headed hos" into my Central-Bavarian mother tongue Niederbayerisch as wuggalhårade Waiwa (Germanized: wuckerlhaarige Weiber), "curly-haired broads."


1. Spanish (Spain, Mexico, Arizona):
(a) mujerzuelas de cabello sucio y espeso (uncouth/loose women with dirty and thick hair). El Diario Marca (Spain).

(b) mujerzuelas de cabello negro y espeso (uncouth/loose women with black and thick hair). ¡Ehui! (Mexico).

(c) prostitutas de pelo desgreñado (prostitutes with disheveled/unkempt hair). La Voz (Phoenix, Arizona).

2. Catalan:
donotes de cabells bruts i espessos (whores/sluts with dirty/filthy/nasty/soiled/unclean and thick hair). El Periódico (Barcelona, Spain).

3. Portuguese (Brazil and Portugal):
(a) prostitutas de cabelo ruim (prostitutes with bad/horrendous hair). Último Segundo (Brazil).

(b) putinhas de cabelo pixaim (whores with curly hair). Globoesporte (Rio de Janeiro).

(c) mulherzinhas de cabelos negros e espessos (loose women with black and thick hair). Correio da Manhã (Portugal).

4. French (France and Canada):
(a) putes avec une tête de couche (whores with a diaper head, diaper-headed whores). Le Monde (France).

(b) putes aux cheveux crépus (whores with frizzy hair, frizzy-haired whores). Le Figaro (France).

(c) putes à tête crépue (whores with a frizzy head, frizzy-headed whores). Radio-Canada (Montréal).

5. Italian:
(a) prostitute con i pannolini in testa (whores with diapers on their head, diaper-headed whores). Internazionale.

(b) puttane pettinate alla negra (whores [with hair] combed in the Negress manner). La Stampa.

(c) puttanelle nere spettinate (black tousled/disheveled whorelets/sluts). LSDI Libertà di stampa.

(d) zoccole ricciolute (curly-haired whores). Novamag.

6. Romanian:
curve negre nespălate (dirty/filthy/unwashed/low-class/uneducated black whores). Jurnalul Național. According to Mr. Bratescu, in this context, the meaning is definitely "low person, uneducated."


1. Croatian:
kovrčave drolje (curly-haired sluts/whores). Javno (Croatia).

2. Polish:
(a) przyprawione k.... (??? whores). (Poland). I was unable to find an appropriate translation of the adjective przyprawiony in Polish dictionaries and therefore contacted professors and professional translators in Poland, but none has been able to provide a definitive answer. The primary meanings "flavored," "seasoned" and "spiced-up" (all referring to food) can't refer to hair or head, neither can the older slangy "drunk"; "fixed, attached" may refer to ribbons and the like used to embellish hair but is unlikely. The suggestion by Anna G. that "spiced-up" hints at illegal substances those "hos" might be using is perhaps the closest we can get to that puzzling Polish translation, even though it is unrelated to curly hair.

(b) te "trawniki" to k...y (these "lawns" are whores). SuperExpress (Poland). Mariusz Max Kolonko, an American correspondent for that tabloid and Polish Television TV4, translated nappy-headed hos as "these 'lawns' are whores." He explains that trawniki ("lawns") is slang for "black women," because of their short-trimmed hair. In an e-mail received from Mr. Kolonko, he states that trawniki is "a crass insult against African Americans."

Confusingly, that article about African-American women with short-cropped curly hair is illustrated with a photo of four black women with long hair, some of them sporting even long straight ("processed") hair. This photo bears the caption, "The expression trawnik (literally "lawn," but here meaning "crop-haired person") in the mouth of a white person is a crime. Only black people can call themselves that."


1. Chinese (Hong Kong and USA):
(a) man tou juanfa de jinü (curly-headed prostitutes). Ta Kung Pao (Hong Kong).

(b) juanfa dangfu (curly-haired sluts). Epoch Times, New York.

2. Hungarian:
(a) bongyorhajú kurvák (curly-haired whores). Index. The adjective bongyorhajú reportedly is an "obscure dialect word" referring to tight little curls of human and dog hair; it is completely neutral and does not imply hair color.

(b) kócos, dróthajú ribancok (disheveled/unkempt/rumpled, wire-haired sluts/whores). Népszabadság.

3. Turkish:
kıvırcık kafalı orospular (curly-headed whores). Medyatava (Turkey).


I wish to thank the following for verifying some of my translations or for adding more examples: Adrian and Marta B. (Hungarian), Anna G. (Polish), Artur J. (adding the Polish Super Express "lawn" mistranslation and misinformation), John Swindle (adding two Chinese examples), Alan Crozier (Hungarian, Romanian), Peter Wells (adding Le Monde), Aniko Szabo (Hungarian), Cristi Bratescu (Romanian), and dozens of Polish academics and translators for trying to figure out just what that puzzling przyprawione could mean.


[The above is a guest post by Reinhold Aman]

[Update -- Paul Bickert writes to draw our attention to Gene Weingarten's "Chatological Humor" discussion at the Washington Post, which quotes and comments on a relevant joke:

I am in receipt of an interesting correspondence from Peter Sagal, host of NPR's terrific improv comedy show, "Wait, Wait ... Don't Tell Me!"

On his show a week ago, he and his guests were talking about the death of crooner Don Ho, and it was noted that Ho had dozens of grandchildren and great grandchildren. On the air, with a sudden inspiration, Sagal said "You know, when all those babies were in diapers, that means dozens of nappy-bottomed Hos."

The audience roared. Some groans could be heard. Afterwards, NPR fielded some complaints.

Peter asked me if I thought he had gone over the top.

Nope. That was a great joke, edgy but harmless. It was not about race. It was a joke on the entertainer's name, and on the Imus furor.

People need to get a life and stop looking for reasons to get offended.]


Posted by Mark Liberman at 12:29 AM

May 06, 2007

Memo to American linguists

The second and final round of the French election is today. In past weeks, we've talked about the candidates' nicknames, and about a political cartoon's use of those names in a phonologically-defined phrasal template, among other trivial things. Today I want to draw your attention to something more important: a professor of linguistics at l’Université de Provence, Jean Véronis, has established himself as a mainstream political commentator in France.

One piece of evidence that he has arrived: this interview with Jean in 20 minutes discussing last Thursday's debate, «Ségolène Royal a voulu montrer qu’elle avait la stature d’un chef d’opposition» ("Ségolène Royal wanted to show that she has the stature of an opposition leader").

Jean has reached this position via several books ("Combat Pour l'Elysée: Paroles de Prétendants"; "Les Politiques mis au Net"; "François Bayrou: Confidences"), the development of specialized search engines for current press and political discourse, and many blog entries.

While there are some parallels to the role of George Lakoff in the U.S., the differences are also striking. George has made his impact mainly by offering conceptual advice to one political group, the left wing of the Democratic party; in contrast, Jean has provided analysis rather than advice, largely avoiding partisan comment or commitment. George's analyses have generally been framed as insightful observations illustrated with examples, in the manner of traditional humanistic discourse on grammar; in contrast, the foundation of Jean's work is the statistical analysis of text corpora, in the style of modern computational linguistics.

The French journalistic and intellectual worlds are significantly smaller than their American counterparts; but there are other differences that may have made Jean's accomplishment more difficult than it would have been in this country. In any case, his success highlights the fact that there is an unfilled niche in the intellectual ecology of American political discourse.

Posted by Mark Liberman at 08:55 AM

Growing ice cream in the Russian winter

After the build-up about U.S.-Iran discussions at the recent conference in Sharm el-Sheikh, journalists were left to interpret some scant and ambiguous shards of interaction (Lee Keath, "Suspicions remain after Iraq conference", AP (via Houston Chronicle), 5/4/2007):

Baghdad also did not achieve another goal — progress in easing tensions between the United States and Iran, whose disputes Iraqis say are fueling the chaos in their country. Despite urging from the Iraqis, Secretary of State Condoleezza Rice and Iranian Foreign Minister Manouchehr Mottaki did not hold talks — only exchanged wary pleasantries over lunch.

Here's the Independent's version, which tells us what the "wary pleasantries" were (Anne Penketh, "Lady in red brings abrupt end to US-Iran gala dinner date", 5/5/2007):

The scene had been set for the Rice-Mottaki dinner encounter after a cryptic lunchtime exchange on Thursday.
Mr Mottaki walked into the dining room greeting colleagues in Arabic. Ms Rice responded in English with "Hello", adding, "Your English is better than my Arabic."

The Egyptian Foreign Minister, Ahmed Aboul Gheit, butted in, saying to Mr Mottaki: "We want to warm the atmosphere some." Mr Mottaki replied: "In Russia, they eat ice cream in winter because it's warmer than the weather." "That's true," said Ms Rice.

Here's the CBS News version, which guesses at the meaning of Mr. Mottaki's gnomic remark ("Iranian Official Boycotts Diplomat Dinner", CBS News, 5/4/2007):

Going into the summit, the Iraqi government had hoped for a breakthrough meeting between Rice and Mottaki. Instead, their only direct contact was the wary exchange of pleasantries over lunch Thursday, punctuated by a wry, somewhat mysterious comment by Mottaki.

The Iranian entered the lunch, greeting the gathered diplomats with the Arabic phrase, "As-salama aleikum," a Muslim greeting often used by Iran's Farsi speakers meaning "Peace be upon you," according to an Iraqi official who was present.

Rice replied to him in English, "Hello," then added: "Your English is better than my Arabic," according to the Iraqi official, who spoke on condition of anonymity because the lunch was private.

Aboul Gheit then piped in, telling Mottaki, "We want to warm the atmosphere some."

Mottaki smiled and replied in English with a saying: "In Russia, they eat ice cream in winter because it's warmer than the weather" - more or less meaning, "You take whatever atmosphere-warming you can get."

"That's true," Rice replied, according to the Iraqi official.

Foreign Policy's Passport blog translates Mr. Mottaki's quasi-proverb similarly, with more background ("Tehran can't get its story straight", 5/4/2007):

Iran's diplomats could be overwhelmed and making mistakes. More likely, they are getting conflicting orders. Consider the international conference on Iraq in Sharm el-Sheikh, Egypt. Iran and the United States had both hinted earlier this week that they might be willing to participate in direct talks on the margins of the conference. Talks were apparently the subject of "heated debate" in Tehran, but the hardline view—that "conditions are not ripe at the present time for talks"—looks to be the last word. U.S. Secretary of State Condoleezza Rice and Iranian Foreign Minister Manouchehr Mottaki merely exchanged pleasantries over lunch, and last night Mottaki skipped a dinner where he was to be seated opposite Rice. During their only encounter earlier in the day, the Iranian foreign minister had explained his coolness toward Condi with this cryptic message:

"In Russia, they eat ice cream in winter because it's warmer than the weather."

In other words, "Take what you can get." The hardliners may have the upper hand for now, but Iran's confusion suggests that a U.S. strategy of engagement can at least spark a healthy debate in Tehran.

The view from the Russian food industry is slightly different (Angela Drujinina, "Russian ice cream survives the big freeze", CEE Foodindustry, 6/2/2006):

Iceberry, one of the leaders on the Russian ice cream market, temporarily shut down its sales outlets. "We cut off all of our chain sales outlets from electric power, in order to save the energy for the city."

Yet, the group said ice cream sales have held up well during the cold weather. "People continue to buy ice cream, and meanwhile we fulfill our sales plan during winter time."

The group explained that "during winter the human body requires a bigger quantity of calories, more sweets, and the ice cream was and remains to be the most popular Russian delicacy."

And the view from the customer's side seems to be even more divergent, at least if we believe Greg McNafferson, "What Russians Suck". Frozen desserts are easier to make when the outside temperature is freezing:

Russians will tell you that their ice-cream is better then any ice-cream in the world, that they have eaten it since the 10th century, and will proudly tell you: we eat it in the winter. [...]

In winter ice-cream supplies are mostly home-produced, when people start to grow ice-cream themselves. Keep in mind that Russian ice-cream is very different from the western style and is closest to sorbet, or water ice. Russians call their ice-cream “sosulka”, which literally means “sucker”; and in English, a “sosulka” is an icicle. Icicles that grow on the roofs of houses are collected every three hours and sold by 'babushkas' near the metro stations.

Of course, in shops, one can find the standard commercially produced cones, shaped and polished, but I prefer to buy a couple of the fresh, home-made small cones, grown just an hour ago. Each of these ice-creams is unique in its shape and taste.

Home production is not that easy, because icicles grow at different speeds. You need to break them off the roof early, otherwise it will be too thick to fit in the mouth, and you will have to start over. You can frequently see Russians taking icicles from the roofs with a long stick and a bag on the top. They use the same tool to gather apples in the summer.

There are a variety of the ice-creams here for all tastes. When eating an icicle, a Russian may cover it with all kinds of dressings, dip it into honey, or eat it with salt and spices. However, many people still prefer old style plain ice-cream, which, they tell you, was sucked by their great-great-great-grandfathers.

This is mostly satire, I believe, not to say a complete fabrication from beginning to end (except for the part about Russians liking ice cream), but that shouldn't matter to the diplomats. Mr. Mottaki was not really talking about eating ice cream in Russia, whatever the season. And if Secretary Rice (who is a Russia specialist by academic background) had responded "Actually, the Russians eat ice cream in winter because it grows naturally in cold weather", she wouldn't have been talking about Russia or about ice cream either.

Since we're talking about communication among diplomats, it would hardly be appropriate to quote the verses of Mas'ud-i Sa'd-i Salmān (1047-1121), cited and translated in A.J. Arberry's Classical Persian Literature:

Since I have seen with the eye of certitude
that this world is an abode of desolation
that all the generous men of goodly presence
hide now their faces in the curtain of shame,
that heaven, like an inequitable mate,
is set upon sly tricks and wearisomeness,
my heart is bruised and broken, like a grain
crushed by the mill-stone of the emerald sky.
Thanks be to God, my temper, that was sick,
has risen at last from the pillow of ambition
and, in the drugstore of good penitence,
sought the sweet antidote of sincerity.

[Chris Conner writes:

Since you just posted, it's possible that no one has brought it to your attention yet that you have accused yourself of behaving inappropriately, when you wrote "it would hardly be appropriate to quote the verses of Mas'ud-i Sa'd-i Salmān."

Although in this case, I think undernegation isn't the real problem. Looks like a simple re-editing error to me.

I often make errors of typing, editing, concept or fact. But in this case, what I wrote was actually what I meant, though the behavioral norms that I had in mind were not my own, but rather those of diplomats, who have rarely been accused of sincerity.

To put it more straightforwardly, I really wonder about exchanges like the one between Mottaki and Rice -- what is really being communicated, and why? Whatever the answer is in this case, it's surely not the one that CBS News and Foreign Policy provided.]

[Some other readers have suggested that Secretary Rice's reference to "Arabic" was wrong, given that Mr. Mottaki's national language is Persian. But his greeting, transcribed in most of the journalistic accounts as "As-salama aleikum", is a bit of Arabic that has been adopted for use by Muslims world-wide. As an English-language document Ethics of Islam published in Turkey puts it (p. 108),

When two Muslims meet each other, it is sunnat for one of them to say “Salâmun alaikum” and it is obligatory (fard) for the other one to reply “Wa alaikum salâm”. It is not permissible (jâiz) to greet each other with other phrases that are used by disbelievers or by hand, body or other mimicry.

I wonder what various diplomatic services prescribe as the appropriate response for non-Muslim diplomats to use? The same treatise forbids using the normal Muslim greetings for non-Muslims:

It is not permissible to say “May Allâhu ta’âlâ give you a long life” to any disbeliever or to a non-Muslim citizen of an Islamic state. It is permissible to make such a prayer with the following intentions, e.g., in order for him to become a Muslim or in order for him to pay his taxes so that Muslims will become more powerful. A person who greets a disbeliever, (by saying ‘salâmun ’alaikum’ and) with reverence, becomes a disbeliever. Saying any word which would come to mean a reverence to a disbeliever causes disbelief.

Mr. Mottaki comes from a very different religious and cultural tradition, and he was addressing a mixed group of Muslims and non-Muslims. But perhaps he would have been offended if Secretary Rice had given the traditional Arabic response, since I gather that in some traditions at least, the use of these formulaic Arabic salutations by non-Muslims is also forbidden. An example is cited in this newspaper report from Bangladesh ("Declaring Ahmadiyyas non-Muslim in Pakistan has serious repurcussion on civil liberty", The Daily Star, 1/18/2005):

The famous Pakistani human rights lawyer and UN Special Rapporteur on Freedom of Religion and Belief, Asma Jehangir, during her recent visit to Dhaka, was interviewed by Inam Ahmed and Ashfaq Wares Khan of The Daily Star on the state of the religious minorities, specially Ahmadiyyas vis-a-vis human rights. [...]

DS: How did Pakistan deal with the repression of Ahmadiyyas?
AJ: In Pakistan, the issue was used by religious parties to use the emotion of the people to enrage them and build new constituencies. It became the foothold for the religious parties to gain entry into parliament and government institutions.

DS: How did it unfold?
AJ: During the rule of President Zia-ul Haq, the military dictator, in 1984 Ordinance 20 was passed, for which the penal code was amended so any Ahmadiyyas who pretend to be a Muslim would be punished. For example, we had a number of what came to be known as "Assalamalaikum Cases" where Ahmadiyyas would be arrested for greeting another Pakistani by saying Assalamalaikum. The arrests ran into hundreds, if not thousands.

In contrast, the Wikipedia article states that "As-Salāmu `Alaykum (السلام عليكم) is an Arabic language greeting used in both Muslim and Christian cultures"; and the form of the greeting is certainly pre-Islamic. Still, in the circumstances, Secretary Rice's reported response ("Hello. Your English is better than my Arabic.") was, let's say, a diplomatic one.]

Posted by Mark Liberman at 07:43 AM

May 05, 2007

Annals of collocation

Any number of people have remarked on the tendency of representatives of the U.S. government (GWB especially) and supporters of the government's current policies to refer to timetables for leaving Iraq as artificial timetables or arbitrary timetables, collocations that are presumably to be understood as involving appositive rather than intersective modification.  That is, those who use these expressions are conveying that they believe that such timetables can be characterized in general as "artificial" or "arbitrary", and they are reminding us, again and again, of this claim.

Elsewhere on the collocation front, I've been noticing how often vibrant democracy occurs in print.  I got ca. 111,000 raw Google webhits on the expression this morning, referring to countries that are claimed (depending on who you read) to have, to not have, or to be working towards a form of government that is not only a democracy, but a vibrant one.  This is intersective modification.

Just in the first 50 hits, I found 16 different countries referred to:

India, Indonesia, Iraq, Israel, Japan, Latvia, Mexico, Mongolia, Pakistan, Saudi Arabia, South Africa, Sri Lanka, Turkey, Uganda, Ukraine, U.S.

(plus Africa as a whole and the UAW).

Now, about the contribution of vibrant.  It clearly means something beyond just the minimal trappings of democracy (some voting): broad access to the vote, honest elections, perhaps the promotion of social equality and the protection of individual rights.  Vibrant would not have been my choice of adjective to convey this, though it is vivid.  I'd guess it comes from a single source that's been quoted again and again.  But I don't have the resources to search through media databases for the original, though I suspect that some helpful colleague will turn it up soon.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:48 PM

The Eternal General of the United States?

No, it only seems that way, recently. And the White House transcript says "attorney general" ("President Bush Celebrates Cinco de Mayo, Discusses Immigration", 5/4/2007):

I'm honored to be here with the Attorney General of the United States, mi amigo, Alberto Gonzales.

But in the recording, President Bush's pronunciation clearly evokes a different word:

This is clearly a slip of the tongue.

In pronouncing Attorney, the president anticipated the final syllable of the following word, General. So what he actually said, in a very broad phonetic transcription, was something like [ði.ɪˈtʰɚ.nəl ˈdʒɛ.nɚ.rəl] when he meant to say [ði.ɪˈtʰɚ.niˈdʒɛ.nɚ.rəl], with the syllable nucleus [əl] substituted for [i]. (In fact, the pronunciation of "General" was substantially more reduced, but that's another story.) The rest of the word Attorney, including the relatively high and front rendition of the initial vowel, was pronounced just as President Bush (and many others) would normally pronounce it in the context "the ___ General".

That's the mechanism of the error -- a simple anticipation that substituted the rhyme of the last syllable of General for the corresponding part of the last syllable of Attorney. This VC substitution (that's Vowel+Consonant, not Venture Capitalist) is one of the rarer types of speech errors, purely from the formal point of view, as this bar graph taken from a review article on speech errors (via my Ling001 lecture notes) indicates:

Such errors certainly occur, though -- for example, "milestorms and norms" for "milestones and norms".

But what's rarer still is that this appears to be a classical "Freudian slip", that is, a case where we can tell a plausible story about how the speaker's unconscious fears or desires have subverted the speech production mechanism in order to allow themselves to be expressed. Freud wrote ("The Psychopathology of Everyday Life", 1901, chap. 5) that

I do not doubt the laws whereby the sounds produce changes upon one another; but they alone do not appear to me sufficiently forcible to mar the correct execution of speech. In those cases which I have studied and investigated more closely they merely represent the preformed mechanism, which is conveniently utilized by a more remote psychic motive. The latter does not, however, form a part of the sphere of influence of these sound relations. In a large number of substitutions caused by mistakes in talking there is an entire absence of such phonetic laws. [...] [And] I still cherish the expectation that even the apparently simple of speech-blunder will be traced to a disturbance caused by a half-repressed idea ...

In fact, in the large corpora of speech errors that have been collected, the cases where there is any obvious or even plausible account in terms of "half-repressed ideas" are very rare, and an account in terms of simple speech production glitches, without reference to unconscious fears and desires, has been preferred by modern psychologists. Certainly, we can create obviously non-Freudian speech errors by simple experimental manipulations, as anyone knows who has ever tried to say "rubber baby buggy bumpers" 5 times fast.

Still, every once in a while, there's a case that makes you wonder whether ol' Sigmund wasn't on to something after all.

[I'm not in general a supporter of the Bushisms industry. The point of this post is not that President Bush is especially prone to spoonerisms. I haven't seen any counts -- as far as I know, no one has ever made any -- but I suspect that the rate of such errors in his public speaking is within the range exhibited by other politicians, and probably not all that different from my own rate in public speaking.]

Posted by Mark Liberman at 02:51 PM

The future of research?

I spent a couple of days last month at a workshop in Phoenix AZ that was jointly sponsored by (the U.S.) NSF and (the U.K.) JISC. The topic was (digital) "repositories". Tidying up my laptop recently, I stumbled on my notes from the meeting, and was reminded of how much I was impressed by the instruction to the breakout groups that was drafted by Steve Griffin and the other organizers:

With the objective of providing a more creative environment for scholarship, assume the following goal:

By 2015, all publicly-funded research products and primary resources will be readily available, accessible, and usable via common infrastructure and tools through space, time, and across disciplines, stages of research, and modes of human expression.

* Identify the intermediate tasks, resources, and enabling conditions
* Sketch a roadmap with major tracks and milestones to achieve the goal

Posted by Mark Liberman at 07:40 AM


It's not just cats anymore:

I'm not a member of that culture, but I hope they won't mind if an outsider suggests that they need more baby animals. Also, more lessons in the lolcat idiom. But whatever its deficiencies, their work reminds me that the field of lolguistics is even more sadly underdeveloped.

[Hat tip: K.G. Scheider]

Posted by Mark Liberman at 07:33 AM

May 04, 2007

Ready to copy?

The shared xerocopying machine that I use at my university has a card reader beside it so that authorized use can be charged to the right account when you put in your approved plastic xerocopying card. On the front of the card reader is a small LED display screen. It says READY when the system is not ready. If the system is ready, it does not say READY.

When I point out these things about linguistically counterintuitive terms or signs in everyday life, people often write to me to explain what the meaning is. They totally miss my point. I know how to find out the meaning. (Trial and error will do. While the little screen says READY, copying will not work.) What I don't know is why people persist in designing linguistic displays so as to make for possible misunderstandings when they didn't have to.

This is what's going on. When you first approach, if the machine is not ready to copy because no card has been inserted into the card reader, the display on the latter reads READY because the card reader (the machine users are not really concerned with, and tend to regard as simply part of the copying machine) is ready to have a card inserted into it. The copying machine, however, is not ready. When you insert a card, the word READY goes away. So if someone goes away and leaves their card in the machine, the display does not say READY. That means the system is ready to copy (that is, the copier is; the card reader actually is not), so if you are ready you can steal from them by charging your copies to their account.

The slip by the engineers here is that they did not see how the whole assembly -- copier plus card reader -- would be seen by users as one machine. They are not experienced as separate, in the way that a ticket machine in a subway station is seen as different from a subway train. They are seen as one thing, like a washing machine and its control panel.

I should add that sometimes, if it takes half a minute or so to position several parts of a complex original or to decide on the orientation, paper size, and settings, the card reader will spit your card out half way. The little screen then reads REMOVE. When you are done with your copying and pick up your copies to go, and you ought to remove your card, the little screen does not say REMOVE; instead it displays an obscure account number, of no interest to the typical user. So people often go away and leave their card behind, which means the next person can steal from them.

The problem there is that the engineers fail to distinguish the timed-out condition (when you've taken too long to choose your settings and position your original) from the finished condition. What's needed is a display saying "Timed out; please re-insert your card if you're not done yet" in the first case, and "Please press END and remove your card if you're done" whenever a copy run finishes.
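To make the complaint concrete, here is a minimal sketch of what a user-facing display could look like. It is not the vendor's firmware: the state names are hypothetical, and the wording for the first two states is my own assumption. Only the last two messages are the ones proposed above.

```python
from enum import Enum, auto


class ReaderState(Enum):
    """Hypothetical states of the combined copier + card reader, as the user experiences it."""
    NO_CARD = auto()        # no card inserted; copying will not work
    CARD_ACCEPTED = auto()  # card inserted; copying will work
    TIMED_OUT = auto()      # card pushed half way out after a pause
    RUN_FINISHED = auto()   # a copy run has just completed


# Messages describe what the *user* can do next, not the reader's internal state.
USER_MESSAGES = {
    ReaderState.NO_CARD: "Insert your card to copy",  # assumed wording
    ReaderState.CARD_ACCEPTED: "Ready to copy",       # assumed wording
    ReaderState.TIMED_OUT: "Timed out; please re-insert your card if you're not done yet",
    ReaderState.RUN_FINISHED: "Please press END and remove your card if you're done",
}


def user_message(state: ReaderState) -> str:
    """Return the text the little screen should show in the given state."""
    return USER_MESSAGES[state]
```

The point of the design is that the timed-out and finished conditions get different messages, and that the word "ready" appears only when the whole assembly can actually copy.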

It would have been easy to devise words or signs to display that would not be counter-intuitive, but the engineers who designed this setup managed to make a machine that appears to say it's ready when it isn't and says "remove" to mean "push back in".

Don't write and tell me I should understand that the little screen is dealing with the internal state of the card reader. I know that. But I (like any other user) don't give a flying fuck about the internal state of the card reader's little finite-state machine. I'm the human here. It should communicate in a way that suits me. It's not as if I'm here to serve it, and learn its little language. Don't you see that?

Whose side are you on, the humans or the machines?

Posted by Geoffrey K. Pullum at 04:56 PM

Back to the future, redundant preposition department

Yesterday I posted about an Australian student whose political manifesto began with a flourish of distinctly non-standard prepositions: "I am one of those Liberals with which this publication has a somewhat unhealthy obsession towards. This article would like to explore some issues to which this newspaper often propagates on." ("A note of dignity or austerity", 5/3/2007).

The problem here is not that "towards" and "on" are placed at the ends of their clauses, but that additional prepositions are also provided with the clause-initial relative pronouns, "with which" and "to which". (Well, there are other stylistic problems in these sentences as well -- it's more usual to have an obsession with or about something than an obsession towards it; and you can propagate (i.e. spread) a religion, but it's idiosyncratic and distracting to write about propagating on some issues, as if propagate meant "preach" -- but this is not a critical-writing seminar.)

I agreed with Renae O'Hanlon's diagnosis, which was that the author had introduced "with which" and "to which" because they seem "proper and sophisticated, and give him an air of authority". And then I jocularly suggested that the redundant prepositions, if adopted by generations to come, might eventually lead to the re-introduction of case marking on relative pronouns in English, through forms like "twitch" and "fwitch". The idea behind this joke was that inflections sometimes develop from re-analysis of phonologically-incorporated function words (see e.g. Arnold Zwicky and Geoffrey Pullum, "Cliticization vs. Inflection: English n't", Language 59 (3) 1983), and that redundant prepositions might be the leading edge of case-marking agreement.

I should have known that there's rarely anything new under the sun, at least when it comes to English morpho-syntax. It doesn't matter whether you view an "innovation" as a disgusting display of terminal degeneracy, or as an amusing example of linguistic creativity -- the chances are, Chaucer or his contemporaries did it too. This morning, I got a note from David Denison, reminding me of this principle.

David's note:

My DPhil dissertation, written back in the Lower Pleistocene, was about the history of phrasal and prepositional verbs in English (and Norse, re phrasal verbs), and at that time I collected quite a few examples from Old, Middle and early Modern English like the ones you cited. I doubt it's a change in progress, therefore, at least as far as the FORM is concerned. The speculation about _with which_ and the like being a new status marker is fun, though, so a new FUNCTION for the form is possible. Exaptation, maybe?

Anyway, a little Middle English evidence:

& sei me hwer þu wunest meast; of hwet cun þu art ikumen of (Seinte Marherete 38.1)
and tell me where you dwell most, of what race you are come of [of = 'from']

Till all & syndry to quham þe knawlage of þir present lettris sall to cum (quot. 1428, OED s.v. whom pron. 7a)
Also they found there namys of ech lady, and of what bloode they were com off (Malory, Works 593.24) -
[NB. Malory does it a lot, and there are more examples in Middle English Dictionary s.v. _of_ adv. 7.]

With different particle:

þuruh hwat muhte sonre ful luue of aquikien (Ancrene Riwle (Nero) 25.16)
through what might sooner foul love from awaken

Ne nis na þing hwerþurh monnes muchele madschipe wreððeð him wið mare þen þet schafte of mon (Seinte Katerine 234)
NEG not-is no thing where-through man's great madness angers him with more than that creation of man

Unto which place every thyng, . . moveth for to come to (Chaucer, Hous of Fame 733)

And Nuria Yáñez-Bouza, one of my present PhD students working specifically on prep. stranding but from a very different point of view (she was in touch with you the other day), could certainly give you exx of double preps from early and late Modern English.

[Nuria Yáñez-Bouza wrote:

Here are some examples from early Modern English.

With the same preposition:

Behinde the Lunges, towarde the Spondels, passeth Mire or Isofagus, of whom it is spoken of in the Anatomie of the necke (Helsinki Corpus, science, Thomas Vicary 1548, s2,p62,chVIII)

With different preposition:

Furthermore, Sir, if it please you to understand of the great unkindnes that my grandam hath showed unto me now latly, as the bringer herof can more planly shew you by muth, to whom I besech you to take credence on. (Corpus of Early English Correspondence, Germayn Poley, 1503, Letter CXLIV, p.179)

And one which I find particularly peculiar (and interesting to me, I've done a bit of work on that) is the combination of a stranded preposition with the where+preposition compound (e.g. whereat, whereby, wherefrom), instead of the pied-piped preposition:

But if ye will sel it, send word to your son what ye will doe, for I know nothing els wherwith to help you with. (Helsinki Corpus, private letters, Isabel Plumpton, 1500-70, s2,L.162,p.199)

Nuria notes that she found no examples in her late Modern English corpus, which is "a selection of genres from ARCHER (1700-1900)", but she supplies these late Modern English gems from Jespersen (1909-49:III.10.5.1-3):

a pamphlet of which he came into possession of in London (Sharp, Browning 120)

a young Irishman, with whom I was once intimate, and had spent long nights walking and talking with (Stev.A.141)

She flags as especially idiosyncratic an example from Defoe, "where he strands the same preposition (to) and a different one (upon)":

I had nobody to whom I could in confidence commit the secrecy of my circumstances to, and could depend upon for their secrecy (Defoe)

I've added boldface to her examples to highlight the double prepositions.

Meanwhile, Patrick McCormick found a couple of additional examples from contemporary English:

The reason why they should be used is that not all databases provide sufficient information in the JDBC result set object to determine the table to which a column belongs to.

The group to which arthropleura belongs to dates all the way back to 420 mya and could very well have been the first arthropods to live on land more than in ...

The trouble with examples like these is that they may come from editing errors, where the author (for example) writes "the table that a column belongs to", and then on a subsequent editing pass decides not to strand the preposition, and so changes "that" to "to which", but forgets to remove the other copy of "to".

But then again, maybe not.

Andrew Fader notes:

A big example of the double preposition is the phrase, "world in which we live in," which has a ton of Google hits, probably because of the Paul McCartney song "Live And Let Die" which features it. Some people actually think it's "world in which we're livin'," but it doesn't much sound like that.

Philip Spaelti sent in the same observation. I naively always thought it was "world in which we're livin'", myself, and never realized until now that the other construal was out there. But I should have, because (as Ben Zimmer just pointed out to me), Tim Warner took an extended look at this one on Mother Tongue Annoyances recently ("Worst Lyrics Ever? In Defense of Sir Paul", 3/31/2007), and Languagehat offered an earlier opinion ("Koaga and Wordtheque", 9/30/2005) that is less charitable to the "world in which we're livin'" interpretation.

I haven't listened to him performing the song recently. But in an r-less dialect, it's going to be just about impossible to distinguish unstressed "we" from "we're", especially before [l], where the inevitable formant transitions from [i] to [l] will pass through the same region of the vowel space as the schwa that would be the expected reflex of /r/ in that context; and "live in" is guaranteed to be homophonous with "livin'" in pretty much all versions of English that have "g-dropping".]

Posted by Mark Liberman at 07:09 AM

May 03, 2007

A note of dignity or austerity

Renae O'Hanlon wrote:

I read and enjoyed your recent Language Log Post "Hot Dryden-on-Jonson action".

I thought these lines from an Australian political newsletter might be of interest to you ("A future Liberal leader writes: Multiculturalism is civilisation's single greatest threat", Crikey, 4/27/2007).

The author writes: "I am one of those Liberals with which this publication has a somewhat unhealthy obsession towards. This article would like to explore some issues to which this newspaper often propagates on ...."

I found this to be a very interesting example of prescriptivism. I suppose to this author, constructions such as "with which" and "to which" are proper and sophisticated, and give him an air of authority (and perhaps an air of legitimacy, since Australian Liberals [capital 'L'] are generally speaking represented by the economically advantaged and well educated). However, he clearly fails to notice that once the pied piping option is chosen, there is no longer any need for preposition stranding (also, interestingly, he has selected different prepositions). It seems that he is simply inserting the "with which" and the "to which" as status markers, and for no grammatical purpose at all!

Renae's sociolinguistic analysis is convincing, and reminds me of James Thurber's discussion of "whom" in his Ladies' and Gentlemen's Guide to Modern English Usage. However, like Thurber's examples ("Whom are you, anyways?"), these redundant prepositions are so extreme that I wondered whether they might be a bit of inspired satire rather than a previously undocumented form of hypercorrection.

The cited article in Crikey is introduced by this note from the editors:

The editors of Monash University's student newspaper, Lot's Wife, are often maligned for being soft lefties. So in their last edition, they asked for people from all political persuasions to write in with their views. Con Helas, president of the Monash Liberal Club and electorate officer for Andrew Robb, took up the call.

But in Crikey's section of "Comments, corrections, clarifications, and c*ckups" for 4/30/2007, Patricia O'Donnell writes:

The words "will apparently run in the next edition of Lot's Wife" suggest that you do know that this is an undergraduate attempt at satire. I do hope you know that, even more than I hope that it is.

If you're having some difficulties with Crikey's "trial subscription" system, as I am, then you'll want to read the version of the whole Crikey article reprinted in this weblog entry, written by a Lot's Wife staffer. A post ("Culture Shock", 4/28/2007) on the weblog Awake and Alert suggests that Con Helas really exists. And a post at Andrew Landeryou's The Other Cheek ("Axed: Big Mouth Prominent Liberal Student Hack Shot At Dawn By Angry Andrew Robb", 4/29/2007) appears to confirm his existence, and also to confirm that he really wrote the article attributed to him.

So let's accept for the moment that at least one real person really writes things like "one of those Liberals with which this publication has a somewhat unhealthy obsession towards", and "some issues to which this newspaper often propagates on". And let's also accept Renae's idea that the Preposition+which sequences are inserted because, as James Thurber wrote about the use of whom, "a note of dignity or austerity is desired". Or, as Bishop Lowth put it in 1762, "the placing of the preposition before the relative, is more graceful, as well as more perspicuous; and agrees much better with the solemn and elevated style". Mr. Helas' innovation is to see the pre-which preposition as a stylistic extra, rather than an alternative placement.

Let me stipulate, at this point, that Mr. Helas is a very bad writer, and that he sometimes strays over the line from mere stylistic infelicity into overt violation of the grammatical norms of standard English, and that his redundant prepositions are definitely on the far side of that boundary. As a teacher, I recognize that he needs help.

But as a linguist, I'm excited. In the (I admit unlikely) event that this stylistically motivated, grammatically redundant marking of wh-words spreads, maybe English will develop case inflections all over again! English grammars of the 22nd century may cite the dative case of the relative pronoun, "twitch", and the benefactive, "fwitch". And the locative, "nwhich".

I'm just saying, is all.

What I'm wondering about now is how to search the web for other examples of this possible change-in-progress. Unfortunately, breakfast time is over, and I have to get to work. If you come up with a good search strategy, let me know.
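One crude possibility, offered as a sketch rather than a tested search strategy: run a pattern over text you have already collected, looking for a preposition directly before "which" and a second preposition stranded at the end of the same clause. The preposition list, the pattern, and the function name below are all illustrative assumptions, and the pattern treats any punctuation mark as a clause boundary, so it will miss some cases and over-flag others.

```python
import re

# Hypothetical, deliberately short preposition list; extend as needed.
PREPS = ["about", "at", "for", "from", "in", "of", "on",
         "to", "towards", "toward", "with"]
PREP_ALT = "|".join(PREPS)

# A fronted preposition + "which", then a punctuation-free stretch,
# then a second preposition sitting right before punctuation or end of text.
PATTERN = re.compile(
    rf'\b(?P<fronted>{PREP_ALT})\s+which\b'
    rf'(?P<clause>[^.,;:!?]*?)'
    rf'\b(?P<stranded>{PREP_ALT})\b(?=\s*(?:[.,;:!?"]|$))',
    re.IGNORECASE,
)

def find_doubled_prepositions(text):
    """Return (fronted, stranded) preposition pairs for each candidate hit."""
    return [(m.group("fronted").lower(), m.group("stranded").lower())
            for m in PATTERN.finditer(text)]

if __name__ == "__main__":
    sample = ("one of those Liberals with which this publication has "
              "a somewhat unhealthy obsession towards")
    print(find_doubled_prepositions(sample))
```

Run over Mr. Helas' two phrases, this flags both; an ordinary "the house in which I live" is left alone. Anything it turns up would still need to be read by a human, since a clause-final preposition can be perfectly legitimate.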

[Rick Sprague writes:

If English pronoun case inflections are to be revived and elaborated, please God, let it stop there, and not extend to nounis and adjectivis. I have had Latin's declensionen enough to last me through minem lifetimem, thank youm.

Well, participation in linguistic change is generally optional, so I think Rick is safe, whatever happens.]

Posted by Mark Liberman at 08:35 AM

May 02, 2007


I missed this one, but a reader sent in a link, a couple of weeks late ("Public radio seeks 'hostiness' in 'Idol'-type search", AP, 4/13/2007):

“We’re looking for people with hostiness,” said executive director Jake Shapiro. Hostiness, he explained, is “that elusive, magnetic X-factor quality” all good hosts have.

There is a bit of more-or-less unrelated prior use, though some is blasphemously silly ("Schism in the Jon Stewart Religion", 9/6/2006):

Stewart is known as "His Hostiness" and Colbert as "His Truthiness."

and the rest is just plain silly, e.g.

How awesome my Saturday was because of your hostiness!
I look forward to your gracious hostiness.
I think Pointyjess and Mr Misery will attest to my hostiness qualifications.
Hooray for serverness! And, incidentally, this also means that I will eventually start doing e-mail hostiness!
I have my own place of hostiness that isn't photobucket.
etc., etc.

Analytically, we see that the prior usage is a convergence of cutesy -ness and Colbertian -iness.

(Looking over our truthiness archives, by the way, I see that Ben Zimmer was on top of the story within a few days of its start. So I have to apologize to our readers for being late to the hostiness party -- as always, the Language Log Marketing Department will refund your subscription fees in case of less than total satisfaction.)

Posted by Mark Liberman at 06:02 PM

Forgive me, awful poet

In this era of bardolatry, it was shocking to learn that John Dryden felt himself to be qualitatively superior to Shakespeare, Jonson and Fletcher: "Those who call theirs the Golden Age of Poetry, have only this reason for it, that they were then content with acorns, before they knew the use of bread ..." A bit more reading uncovered the fact that Dryden also believed himself to be so much superior in talent and taste to John Milton, his older contemporary, that he felt compelled to re-write Paradise Lost in dramatic form.

Not all of Dryden's contemporaries agreed with his assessment of his own abilities and accomplishments. One prominent dissenter was Andrew Marvell, who, according to this page, "worked with Milton (and Dryden) in the Office of the Secretary for Foreign Tongues in Cromwell's government". Paradise Lost was first published in 1667, and the second edition, published in 1674, began with a poem by Marvell, "On Mr. Milton's Paradise Lost", which included these lines:

18 Jealous I was that some less skilful hand
19 (Such as disquiet alwayes what is well,
20 And by ill imitating would excell)
21 Might hence presume the whole Creations day
22 To change in Scenes, and show it in a Play.

The "less skilful hand" was apparently a reference to Dryden, and the play in which Dryden "presume[d] the whole Creations day / To change in Scenes" was his opera The state of innocence, and fall of man, which was written (and performed) in 1674 and published in 1677.

The published version of Dryden's dramatization of Paradise Lost opens (after a long, servile "Epistle Dedicatory" from Dryden "To Her Royal Highness, the Dutchess") with a dedicatory poem, addressed "To Mr. DRYDEN, on his Poem of Paradice", by some 17th-century groupie identified only as "Nat. Lee". Lee argues at length that Dryden is as much an improvement on Milton as Dryden felt himself to be on Shakespeare. It begins:

Forgive me, awful Poet, if a Muse,
Whom artless Nature did for plainness chuse,
In loose attire presents her humble thought,
Of this best POEM, that you ever wrought.
This fairest labor of your teeming brain
I wou'd embrace, but not with flatt'ry stain;
Something I wou'd to your vast Virtue raise,
But scorn to dawb it with a fulsome praise;
That wou'd but blot the Work I wou'd commend,
And shew a Court-Admirer, not a Friend.
To the dead Bard, your fame a little owes,
For Milton did the Wealthy Mine disclose,
And rudely cast what you cou'd well dispose:
He roughly drew, on an old fashion'd ground,
A Chaos, for no perfect World was found,
Till through the heap, your mighty Genius shin'd;
His was the Golden Ore which you refin'd.
He first beheld the beauteous rustic Maid,
And to a place of strength the prize convey'd;
You took her thence: to Court this Virgin brought
Drest her with gemms, new weav'd her hard spun thought
And softest language, sweetest manners taught.
Till from a Comet she a star did rise,
Not to affright, but please our wondring eyes.

Of course, Lee means awful in the sense of "inspiring awe", which Dryden no longer does, if he ever really did. And as a poet, he's not exactly awful in the modern sense of "extremely bad or unpleasant" either. (And see "Terrific is even creepier than uncanny", 10/20/2006, for a discussion of the semantic shift involved.) But if Shakespeare was a bowl of acorns, Dryden is a sack of pebbles. If Milton was a miner of precious-metal ore, then Dryden in comparison is a gravel pit.

Today, not even one phrase by Dryden makes it into Wikiquote, and in the lists of quotes attributed to him in other places, I can't find any that seem familiar to me.

Poor John Dryden: his legacy is not a well-loved poem, or a set of phrases embedded in everyday language, but a freakish grammatical superstition.

[Arnold Zwicky writes:

Dryden seems to have fit his time superbly, but then that time passed. Probably most people know his poetry now through the wonderful settings by Handel -- Alexander's Feast and Ode for St. Cecilia's Day, in particular. (Some of his translation of Ovid found its way into Acis and Galatea, which is not particularly well-known, though it's a favorite of mine.)]


[Alan Wechsler observes that "Nat. Lee" is probably Nathaniel Lee. After reading his Wikipedia entry, I can only say, wow, I wish I had the movie rights.]

Posted by Mark Liberman at 11:49 AM

Ineffable apes

Look at me, baby! News from the animal communication front: Chimps and bonobos have arbitrariness of the sign, at least sort of, at least with respect to some gestures.

From the New Scientist article:

By observing captive groups of bonobos and separate groups of captive chimps, Pollick and de Waal identified 31 gestures and 18 facial or vocal signals made by the apes, and recorded the context in which they were used. It turns out that the facial and vocal signals were practically the same in both species, but the same gesture was used in different contexts both between and within species.

For example, the vocal signal "bared-teeth scream" signals fear in chimps and bonobos, but the signal "reach out up" -- where an animal stretches out an arm, palm upwards -- has different meanings. It may translate as begging for food or as begging for support from a friend, says de Waal. "The open hand gesture is also used after fights between two individuals to beg for approach and contact during a reconciliation. So the gesture is versatile, but the meaning depends on context."

In other words, chimps and bonobos seem to have gestural homophones--one symbol with two or more meanings.1 The authors, Amy S. Pollick and Frans B.M. de Waal, of the Yerkes National Primate Research Center at Emory University, find this suggestive in thinking about the evolution of language: Perhaps the earliest symbolic communications were gestures, and the symbolic use of vocal signals came later? In other words, perhaps this supports the gestural hypothesis of human language origins?

A colleague of theirs at the Yerkes Primate Research Center agrees:

The openness of the hand-gesture system among chimps and bonobos "is consistent with the idea that the early hominid communications system was gesture based and that vocal communication came later," said William Hopkins, a Yerkes researcher not involved in the study. "The speech system is a very recent adaptation in hominids."

But are the gestures really symbolic? Maybe they're just used as an enhancement of the vocal/facial signal (like the gestures accompanying speech among users of spoken languages). Are they really independent communicative units on their own (like the symbols of signed languages)?

In fact, it seems to me that the 'flexibility' of (at least some) gestures is consistent with the notion that they are general attention-getting devices. On this interpretation, they show up in multiple contexts precisely because they are not symbolic. A gesture might be saying, 'Pay attention to my vocal/facial signal!' rather than, say, denoting 'Help!' in one context and 'Can I have some of that?' in another. Flexibility of context is the opposite of symbolic communication, in a way.

Consider Fig. 3 from the article, where the reliability of the correlation between particular contexts and particular gestures and vocal/facial signals is reported. The gestures that show the least reliability are hand/arm motions in various directions -- 'reach out side', for example. These gestures happen to be the ones which intuitively (speaking as a communicative primate myself) have the least symbolic content.

If these gestures are actually signal amplifiers, rather than symbols, then we wouldn't expect them to show up by themselves, in the absence of the 'real' signals coming through in another modality (another communicative medium, like sound). They would always appear in combination. This seems like the $64,000 question to me, if use of symbols is what is at issue. (Marc Hauser, of {Hauser, Chomsky, Fitch}, is quoted in the NY Times article as wondering the same thing.)

Indeed, in the article itself, the authors introduce the research with remarks on the pervasiveness of multimodality in the communications systems of many different species, and its amplifying effect:

Gestures [in chimps and bonobos] are rarely produced in the absence of other communicative signals, such as facial expressions and vocalizations. Multimodal communication has been appreciated in humans for several decades (43) and is becoming increasingly important in the study of animal communication (44). Multimodal signaling occurs across taxa, from snapping shrimps to spiders and birds, and in all contexts, although those related to courtship and mating are best documented (A.S.P., unpublished work). This communication strategy can have a variety of functions, including amplification and modulation of signal meaning. Combined with the graded facial/vocal signals typical of the apes (46), gestural flexibility has the advantage over the more stereotyped signaling by monkeys that it permits greater communicative complexity.

So, you get multimodal signaling across all taxa; makes its existence in chimps and humans seem less special. But you don't get it in monkeys, which makes it seem more special, within our lineage, anyway. Looking for clues about whether the gestural signals were always in a multimodal context, it turns out there is a difference between bonobos and chimps in this regard:

No significant difference was found in the proportion of signals that was gestural versus facial/vocal, but chimpanzees did combine these two signal classes relatively more often than did bonobos.

So there are at least some non-multimodal gestures in the data, enough to run stats on.

Interestingly, although chimps used more multimodal signals, multimodal signals were significantly more effective than gesture alone in eliciting a response in bonobos, but, remarkably, not in chimps. Among chimps, the other chimps responded about 67-68 percent of the time to any gestural signal, whether it was embedded in a multimodal context or not. Among bonobos, on the other hand, a gesture alone elicited a response at about that same rate, 67 percent-ish, but if a bonobo gestured and made some facial or vocal signal along with the gesture, they got a response a whopping 83 percent of the time! And yet the bonobos were less likely to produce multimodal signals than the chimps! What's up with that?

The interest of this inverse correlation does not escape the authors, who write in their discussion,

That this contrast between multimodal and single modality utterances held for bonobos only is interesting given that multimodal combinations are less common in bonobos. Could the relative scarcity of multimodal signaling in bonobos relate to a more deliberate combination of gestures with other forms of communication, perhaps in an attempt to add critical information to the message instead of merely amplifying it?

The whole question of amplification is complicated by the methodology. The authors wanted to have good criteria for which signals (gestural or otherwise) to count as communicative, so they only counted gestures made at the start of a 'social interaction'... which one individual approached another and attempted to engage the recipient with a communication signal. The two individuals may have been in proximity before, but without observable interaction. Signals were not included in the analysis, therefore, if they occurred in the middle or toward the end of an ongoing interaction.

So all of the communicative contexts examined were attention-getting contexts; this may muddy the waters for sorting out the precursor-to-symbolic-communication hypothesis from the signal-amplification hypothesis, since amplification is presumably useful in attention-getting.

Even if the gestures are more attention-getters than symbols, though, it's definitely interesting that apes, but not monkeys, use them. Silly monkeys -- it couldn't be more obvious that you should wave your hands to get attention! Working to get someone's attention, though, presupposes that you know they have an attention to get -- presupposes a theory of mind. And that is likely a sine qua non for language.

Maybe the reason that bonobos use multimodal signals more selectively than chimps do is not that they are trying to 'add critical information' to the signal. Perhaps it's because they have a better theory of mind than chimps do, and hence they've got a better grasp of how to deploy the whole 'Look at me!' thing.

Update: Here are Mr Verb's thoughts on the study. Also, Mark Liberman adds:

The chimp gesture meaning-variation may not be that different from the well-known vervet monkey variation in alarm-call meaning. They have three alarm calls, colloquially known to human researchers as the "eagle alarm call", the "leopard alarm call", and the "snake alarm call". But really the "eagle alarm call" is a "predator in the air" alarm, and baby vervets need to learn which large birds in each area are actually dangerous; e.g. not storks, not egrets, but yes to kites or various species of eagles, depending on the birds that live in the vicinity. Ditto for the "leopard alarm", which in the West Indies means "feral dog".

So like the open-hand gesture, the alarm calls have a basic core that is constant, along with referential details that can vary quite a lot, not only with context, but also as a result of learning from the local history of usage.

You can make the local differences seem to go away by re-naming the calls: "avian predator alarm", "mammalian predator alarm", etc. But that wouldn't recognize the fact that the different local populations of vervets have actually learned different associations for the calls, in terms of when to make them and in terms of when NOT to make them.

Like the basic hand gesture, the calls themselves are invariant across space, and presumably are genetically programmed. It's the details of their interpretation in context that are locally "arbitrary".

Um, comments?

1 Anyone want to coin a term for this? Maybe there already is one. Homogests? Ick.

Posted by Heidi Harley at 03:02 AM

Northwest Journal of Linguistics

Those interested in the native languages of northwestern North America may want to check out the Northwest Journal of Linguistics, a new electronic journal that some of us have started.

Posted by Bill Poser at 01:54 AM

The Language Lumberjacks

Mark Liberman wondered about "Language Lumberjacks", but they turned out to be us, the Language Loggers.  Or, as Monty Python had it:

I'm a lumberjack and I'm OK
I sleep all night and I work all day.

He's a lumberjack and he's OK
He sleeps all night and he works all day.

I cut down trees, I eat my lunch
I go to the lavat'ry.
On Wednesdays I go shopping and have buttered scones for tea.

He cuts down trees, he eats his lunch
He goes to the lavat'ry.
On Wednesdays he goes shopping and has buttered scones for tea.

He's a lumberjack and he's OK
He sleeps all night and he works all day.

I cut down trees, I skip and jump
I like to press wild flowers.
I put on women's clothing and hang around in bars.

He cuts down trees, he skips and jumps
He likes to press wild flowers.
He puts on women's clothing and hangs around in bars?!

He's a lumberjack and he's OK
He sleeps all night and he works all day.

I cut down trees, I wear high heels
Suspenders and a bra.
I wish I'd been a girlie, just like my dear Papa!!

I cut down trees, I wear high heels?!
Suspenders...and a bra?!...

Just the Lumberjack:
I wish I'd been a girlie, just like my dear Papa!!

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:45 AM

May 01, 2007

Language Log 1, BBC 0

In reference to a recent article on accents at Slate, I got several notes from readers, of which this one from Miles Townes was typical:

Slate just posted an interesting "Explainer" about accents, but included the "study" that claims cows have accents. I have read your debunking of that claim in the past, so this makes me suspect the rest of the information in the Explainer. I wondered if this merited the attention of the Language Lumberjacks.

But by the time I got over to Slate to look, the cows were gone, except for this little note:

Correction, May 1, 2007: This article originally stated that a study showed cows have "regional accents." That "study" was a now-debunked PR hoax.

Hey, how about a link, guys?

We don't make a dent in the mighty BBC science misinformation machine very often, so this small victory is sweet.

But who are these Language Lumberjacks, I wondered? A quick web search informed me that this came from an old Heidi Harley post that I somehow missed. Now that Heidi is a Language Logger, I have to remember to issue her the monogrammed axe and climbing spurs.

[Update -- Joshua Riley writes:

I think I must take credit for making Slate correct the reference to the ridiculous cow story. As a linguist and regular reader of Language Log, I immediately knew the rubbish that the piece was referring to, and was just as quick to find an appropriate piece of yours to illustrate their folly with.

For what it's worth, the author seemed genuinely apologetic and the correction was made very promptly after I received it. You should be pleased that Slate is now apparently taking your Language Log postings as authoritative evidence more reliable than the BBC science section (which they would know if they were LL readers, of course).

Of course, if I had known the error would be rectified so quickly and so easily, I would have tried to explain that Foreign Accent Syndrome really has nothing to do with foreign accents . . . but you have to pick your battles, I guess.]


Posted by Mark Liberman at 08:31 PM

Mair on Washington Post on illiteracy in China

[This is a guest post by Victor Mair]

Below is a report on illiteracy in China from the Washington Post, with interspersed comments by me.

I hasten to point out that the numbers I cite in my comments are impressionistic, based on my personal experiences in various parts of China since 1981. In particular, I have been fortunate to work closely with the Wenzi Gaige Weiyuan Hui (Script Reform Committee), reorganized in 1986 as the Guojia Yuyan Wenzi Gongzuo Weiyuan Hui (State Language Commission). No one knows what these numbers really are, although it would be simple enough to determine them experimentally; I believe that the lack of information is probably because the Chinese government is concerned that the results would be embarrassing.

Perhaps the rough estimates that I give will stimulate someone into doing the (simple) research that would settle the question by providing more accurate figures based on verifiable surveys. For purposes of effective language planning and policy, it is essential to have reliable data.

Maureen Fan, "Illiteracy Jumps in China, Despite 50-Year Campaign to Eradicate It", Washington Post Foreign Service, Friday 4/27/2007 (A19)

LIUPU, China -- Last year, finally, everyone in Liupu village was able to read and write 1,500 Chinese characters, a census showed. Village leaders threw a big dinner to celebrate, presenting commemorative teacups to the last two adults to make the grade.

But ask Zhao Huapu, the earnest principal of Liupu Shezu Girls School, how many people here can actually read and write, and he gives an embarrassed smile. Nearly 30 percent of Liupu's adults are illiterate.

Here we run smack dab up against the stark difference between official government propaganda and statistics on the one hand, and harsh reality on the other hand. The encouraging thing is that -- under current circumstances -- we are starting to get a chance to hear some of the latter.

"That's just reality. . . . A lot of them can't read and write," said Zhao, who acknowledged that the census is based on a test that fails to measure adult literacy accurately.

Illiteracy is increasing in China, despite a 50-year-old campaign to stamp it out and a declaration by the government in 2000 that it had been nearly eradicated. The reasons are complex, from the cost of a rural education to the growing appeal of migrant work that draws Chinese away from classrooms and toward far-off cities.

Well, I would say that the single most important reason is the fact that the script is so cumbersome and hard to master, a fact that is almost always completely overlooked by officials, analysts, and journalists.

In many cases, as in this farming hamlet in China's southern Guizhou province, villagers whose education ended in elementary school have simply forgotten basic skills.

From 2000 to 2005, the number of illiterate Chinese adults jumped by 33 percent, from 87 million to 116 million, the state-run China Daily reported this month. The newspaper noted that even before the increase, China's illiterate population had accounted for 11.3 percent of the world's total.

"The situation is worrying," Gao Xuegui, director of the Education Ministry's illiteracy eradication office, told China Daily, blaming the increase on changing attitudes toward knowledge in a market economy. "Illiteracy is not only a matter of education but also has a great social impact."

Gao's remarks echoed concerns voiced by literacy researchers and served as a reminder of the challenges facing China's mostly rural population.

This country is proud of its traditional focus on education, as well as more recent efforts to raise standards, such as passage of a law that says every child has the right to nine years of schooling. Yet in many rural areas, such schooling remains unavailable or prohibitively expensive.

"Has the right" my eye! Because of the costs of tuition and books, as well as other factors, access to high-quality education in "socialist / communist" China is even more dependent on family income and connections than in capitalist countries.

Ditto for medical care.

In 2000, officials announced that the illiteracy rate in Tibet, the worst in China, had dropped to roughly 42 percent from 95 percent about 50 years earlier.

Despite the appalling poverty and daunting physical conditions faced by the Tibetan people, it is easier for them to become literate in their own language than for Chinese in theirs, simply because they are blessed with an alphabetic script.

From 2001 to 2005, China educated nearly 10 million adults who couldn't read and write, the Education Ministry said in September. Authorities have also boasted of higher enrollment figures in primary and middle schools.

Experts, however, contend that official reports are sometimes unreliable. Local officials are pressured to inflate enrollment figures, and students who are enrolled often don't bother to show up, they say.

There are also questions about how literacy statistics are gathered. In Liupu, for example, Zhao and other local leaders go door-to-door each September, asking the village's roughly 300 families how many people are in each household and what type of education they have. Those who can show they have graduated from primary school are not counted as illiterate, regardless of whether they can actually read or write.

Literacy in China is defined according to an exam taken in fourth grade. Even if villagers pass that exam, they frequently do not pursue further education. Having no reason to read and write, many forget the skills. This is especially true of ethnic minorities, rural women and young dropouts, according to researchers.

"It's undeniable that there's a relapse, but what the number is, is hard to tell," said Guo Hongxia, a scholar at the China National Institute for Educational Research.

Hu Xingdou, a sociologist and professor of economics and China issues at Beijing Institute of Technology, suggested that the problem is related to the perceived benefits of education.

Nobody mentions the monumentally difficult script!

"Farmers don't see a bright future from receiving more education," he said. "Many believe it won't help them much in making money. They also can't afford to send their children to university, and a university degree no longer guarantees a job after graduation."

Farmers are expected to learn at least 1,500 characters,

That's a bad joke!

according to state education regulations. Urban residents should master 2,000.


Teachers in Beijing often tell students they need to know 3,000 characters to read a newspaper.

I'd be very surprised if 10-15% of the population can write that many characters. Perhaps 20-25% can recognize 3,000 characters, but I'm not even very confident about that.

College graduates are tested on 7,000 characters or more.

A pipedream!!! I doubt whether even a hundredth of one percent of the Chinese population can write 7,000 characters; probably no more than 2-3% could recognize that many.

In Liupu, located at the end of a three-mile-long, potholed dirt road, many of those who can't read and write are older, homebound women. Members of the Shezu ethnic minority, they speak their own dialect

LANGUAGE, please!

and have had little formal education.

Still, with the help of Zhao, the school principal, they are trying to make gains. On a recent Saturday night, Zhao used a flashlight to climb a rocky path of steps, past an old woman beating water out of pickled vegetables, up to a spartan wooden house.

Inside, two young volunteer teachers from his school and an older village party cadre sat in a circle, tutoring eight illiterate women, their faces lit by a soundless television.

"Actually, not many people have the patience to read this all the way through," said Zhao Tongxiu, a 24-year-old teacher, pointing at a page in a book. "Did your teacher teach you all this?"

"She taught us all of this, but I just cannot remember it," said his pupil, Wang Chengyi, who thought she was about 30 years old. "My child taught me how to write a little at home, but still I don't write well. I just can't memorize it. I'm already this old, what use is it studying?"

Many women here don't have time for class, the teachers said. They are tired after working 12-hour days in the fields and returning in the evening to feed their families.

Wu Wanqin, 44, the village cadre, earned 2 1/2 cents from the government as an incentive to get the women together Saturday. She would have to walk several of them home by flashlight afterward, and some lived half an hour away.

The women of Liupu use a simple, practical textbook published by the Beijing Cultural Development Center for Rural Women, which began testing it in parts of China a few years ago. But often, adults learning how to read are taught words that don't closely relate to their lives, according to Guo, the national education researcher. By June, Guo said, officials will urge that the approach used in Liupu be adopted countrywide.

Researchers say that illiteracy is not confined to older generations, an assertion borne out in Liupu.

Zhao Xianghua, 15, said half of her friends can't read. She boards during the week at a county school that charges $50 a year in tuition, but she has friends who don't have the same luxury.

"Several are already out working," she said, "and when they come back to visit and we hang out, I can feel the distance between us."

The main test of literacy in China will be officials' ability to follow up with students and cement any gains, said Hu, the professor, who complained that adults are often taught only how to pass a test.

That is indeed true, but Professor Hu hasn't put his finger on the real reasons why it is so hard for Chinese to maintain literacy. With alphabetic literacy, one can forget how to spell a word properly but still get one's idea across by misspelling it. If one forgets a crucial character, like the TI4 of DA3 PEN1TI4 打喷嚏 ("sneeze"), which very few Chinese know how to write, you're stuck.

"It's like planting trees to make a forest," Hu said. "Many people plant trees, but few take care of them, and finally the trees die before becoming a forest."

The people of China -- now, as they have been for the past three millennia -- are constantly challenged by an enormously complicated script suitable only for an elite consisting of a tiny proportion of the population.

The *only* reason people are starting to recognize (once again -- after the propaganda of the last 50 years) the huge dimensions of the illiteracy problem in China now is that there is slightly more, partially honest reporting managing to slip out. The real situation across the length and breadth of China is much, much worse than even the most critical recent reports reveal.

If only China would adopt a policy of true digraphia (PINYIN plus HANZI) and actively promote it, the problems of illiteracy would vanish within a decade or two.


[Above is a guest post by Victor Mair.]

Posted by Mark Liberman at 08:03 PM

Evil passive voice

From Sherry Roberts's 11 Ways to Improve Your Writing and Your Business, in the 7th way, "Be Active":

A sentence written in the active voice is the straight-shooting sheriff who faces the gunslinger proudly and fearlessly. It is honest, straightforward; you know where you stand.

Active: The committee will review all applications in early April.

A sentence written in passive voice is the shifty desperado who tries to win the gunfight by shooting the sheriff in the back, stealing his horse, and sneaking out of town.

Passive: In early April, all applications will be reviewed by the committee.

Wow.  A new record in creative bad-mouthing of the passive voice, that shifty desperado.

(Hat tip to Jon Lighter on the American Dialect Society mailing list, 4/29/07.)

Before the Wild West metaphors, the booklet tells the reader:

If you were one of those people who yawned when your eighth grade English teacher began her lecture on active and passive voice, wake up. What you don't know about active and passive voice may be putting your readers to sleep or making them suspicious of you and your ideas or product.

And after them, it concludes:

Passive writing is popular in business because it helps the writer avoid responsibility and remain anonymous. Customers are suspicious of writing that evades responsibility. Employees and managers distrust ideas that appear more vague than strong.

Once again, a critic of the passive just asserts stuff, dogma off the shelf, and doesn't even relate these assertions to the examples.  Where in the passive example is the evasion of responsibility, where the anonymity of the writer, where the weakness, where the vagueness?  The active and the passive versions supply exactly the same information (though with different syntax), and the writer plays no role in either version.

Entertainingly, the advice has two passive VPs (serving as postnominal modifiers) in it: "written in the active voice", "written in the passive voice".  This is a good choice, since active voice versions would be wordy -- they have to be cast as full relative clauses ("a sentence that X write(s) in the active/passive voice") -- and, with an indefinite subject X ("you", "someone", "a writer") supplied, would be no less vague about who does the writing than the passive version.

Other postmodifying passive VPs appear in the very first sentence of the booklet:

11 Ways to Improve Your Writing and Your Business is a booklet written for and distributed to participants in Sherry Roberts' business writing seminar.

An active (but wordier) version of this would be something like:

11 Ways to Improve Your Writing and Your Business is a booklet that X wrote for and distributes to participants in Sherry Roberts' business writing seminar.

But what do we supply for X?  The booklet is on a site for The Roberts Group ("editorial & design services"), so maybe it was the work of several hands, and X should be "The Roberts Group (staff)".  The "Who We Are" page mentions Sherry and her husband Tony (but no one else), so there's a staff of at least two.  Or if the author is in fact Roberts herself, then that first sentence could go:

11 Ways to Improve Your Writing and Your Business is a booklet that Sherry Roberts wrote for and distributes to participants in her business writing seminar.

This not merely identifies the writer and distributor of the booklet, but also pushes this information into a position of prominence; the sentence is now about Roberts.  (A version with "Sherry and Tony Roberts" as X would have a similar problem.)  Here we have a case where both information structure and modesty would be served better by a passive postmodifier: really, why do we need to know exactly who wrote the booklet?

There's another postmodifier passive on p. 6:

Newspapers learned long ago that they have only seconds to grab the reader's attention and keep it; a story composed of several short paragraphs appears more accessible than one that resembles a scientific paper.

From here on, I'll leave it to the reader to turn the passives into actives and assess whether the results are improvements.

There are, of course, clausal passives, exactly the sort of thing the booklet tells us to avoid.  At least five (or seven, depending on how you count) in the nine pages of text, all of them "agentless", with the active-voice subject SUPPRESSED (and so "avoiding responsibility"); recall that the shifty-desperado example, in contrast, was an "agentive" passive, with the active-voice subject appearing as the object of the preposition by.  Here's the crop:

Clear, effective business writing is more important than ever. Thanks to the facsimile machine, our skill (or lack of skill) with words is beamed around the world in black and white. (p. 1)

In another study, the U.S. Navy determined it could save $27 million to $57 million a year if officers wrote memos in a plain style. Navy personnel spent more time reading poorly written memos than those written in a plain style. Similar savings could be realized in the private sector if corporations stressed good writing in the workplace. (p. 2)

Your one-line synopsis is a grain of sand; it will help you begin. Large projects can be built from it, but the grain of sand itself is neither overwhelming nor intimidating. (p. 3)

Everything you write in business, from sales letters to budget plans, is intended to elicit a response. (p. 8)

Benefit: Buy our widget with its three new attachments and, finally, relax on a vacation. Our widget works while you enjoy yourself. There's no need to worry; our widget will make sure your cat is fed, your plants are watered, and the temperature of your home is maintained at a constant, fuel-saving level. (p. 9, in a rewritten example stressing benefits rather than features)

Do as I say, not as I do.

A repugnance for the passive has been part of the dogma of writing teachers for a long long time, and it seems to have seeped thoroughly into the consciousness of their students.  Dennis Preston observed on ADS-L, not long after Lighter produced the link to the Sherry Roberts site, that Niedzielski and Preston in Folk Linguistics (2000) has an extensive discussion with folk respondents of the passive, which (in Preston's words) they "judged evil or at least shifty".

I go on at such length about this little booklet because it's intended as practical advice for the business writer.  Having developed some interest in the question of what students should be taught as skills for survival and success in the business and professional worlds (what Sally Thomason referred to as "a place for prescriptivism in linguists' lives"), I've begun looking at more and more of the literature intended to be on-the-ground advice: do this if you don't want to displease your boss, do this if you want your writing to produce results, and so on.  Well, on the grammar/usage/style front, the advice is almost all about things NOT to do.  And it's pretty dispiriting stuff.

I'll have more to say on the topic in a while.  Until the next installment, here's an exercise for you to do at home.  Don't write me about it; just save your answer until my posting comes around, or check it against the Roberts booklet.

Take-home.  Section 9 of the Roberts booklet begins:

Watch out for these four commonly misused words.

Some words in the English language take a constant beating in business correspondence. Be one of those writers who use them properly and pleasantly surprise your readers. Your conscientiousness may sell your next idea or product.

So, what do you think these four words are, and what's the problem with them?

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 04:47 PM

Context, context, context

Every so often I point out that sentences that are problematic in isolation (because they seem ungrammatical, confusingly ambiguous, or subject to an absurd interpretation) are just fine when they're viewed in context.  Supplying some linguistic context, or information about the situation in which the sentence was spoken or written, or facts about the world, society, or culture can make things clear.

Here's a simple example from my recent reading -- p. 251 of Atul Gawande's Better: A Surgeon's Notes on Performance (2007):

This is a forty-six-year-old former mortician who hated the funeral business with a right inguinal hernia.

Viewed in isolation, this is a disaster, the sort of sentence that's likely to end up in a "Sic!" column.  That final PP "with a right inguinal hernia" is likely to be parsed as a postmodifier of "the funeral business", or possibly as a VP adverbial of means or manner, modifying "hated the funeral business".  Either interpretation is absurd.

But now look at it in context:

... consider, at an appropriate point, taking a moment with your patient.  Make yourself ask an unscripted question...

... many respond--because they're polite, or friendly, or perhaps in need of human contact.  When this happens, try seeing if you can keep the conversation going for more than two sentences.  Listen.  Make note of what you learn.  This is not a forty-six-year-old male with a right inguinal hernia.  This is a forty-six-year-old former mortician who hated the funeral business with a right inguinal hernia.

What makes the final sentence work (with the PP understood as a postmodifier of "a forty-six-year-old former mortician who hated the funeral business") is the contrast set up in the preceding context:

This is not a forty-six-year-old X with a right inguinal hernia.

This is a forty-six-year-old Y with a right inguinal hernia.

That is, not JUST an X, but in fact an X who is also a Y.  The structural parallelism between the two sentences guides the reader to carry over the interpretation of the PP "with a right inguinal hernia" as a postmodifier of X in the first sentence to an interpretation of this expression as a postmodifier of Y in the second.  If you read the passage out loud, you'll probably set off the PP in the second sentence prosodically, using prosody to indicate that the PP is not attached "low" (as a modifier of "the funeral business" or "hated the funeral business").

The title of this posting is a favorite saying of my friend Ellen Evans.  It's scarcely original with her, as you can see by googling on it.  Googling will, in fact, yield "context, context, context" as an explicit instance of the X3 snowclone:

To paraphrase the axiom which states: the 3 most important words in real estate are: location, location, location; the 3 most important words in the late 20th Century are: context, context, context.  (Jeff Gates, keynote address delivered to the National Conference of Arts Administrators, Anchorage, AK, October, 1996)

But with Ellen Evans you get more: a CafePress shop, Ellen de Sui Generis, with merchandise featuring characteristic Evansian sayings: Piffle; Er, no; FSVO (an acronym for "for some value of", and pronounced like "fizzvo"); and of course Context, context, context.  There you will find two items with CCC on them: a classic thong for $7.99 and a mug for $10.99.  The mugs make excellent presents for your friends in semantics/pragmatics and sociolinguistics (or computer science or postmodern criticism or ...).  (Disclosure: I have no connection, financial or otherwise, with the shop.  I'm merely a Friend of Ellen, and of Lars Ingebrigtsen, who set the shop up.)

A further riff on Ellen Evans:  Ellen used to work in the movie business, and every so often casually mentions having met or worked with various famous people (fsvo famous).  As a result, the newsgroup soc.motss has for some time used the verb ellen to mean 'make the acquaintance of [someone famous]', usually in the perfect, as in "Bernard Malamud?  I've ellened him."  Verb on!

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:34 PM

Hot Dryden-on-Jonson action

The modern neurosis about clause-final prepositions is still potent, despite being disavowed by even the crankiest usage authorities of the past century. I got a question about it just last week, from a student who wrote in for help in dealing with a friend who "insists that it is always improper to end a clause with a preposition". As MWCDEU explains,

...recent commentators -- at least since Fowler 1926 -- are unanimous in their rejection of the notion that ending a sentence with a preposition is an error or an offense against propriety. Fowler terms the idea a "cherished superstition." [...] So if everyone who is in the know agrees, there's no problem, right?


Thank you for your reply to my questions but I find it extremely difficult to trust an opinion on grammar prepared by someone who ends a sentence with a preposition.

This is part of a letter received by one of our editors who had answered some questions for the writers. Members of the never-end-a-sentence-with-a-preposition school are still with us and are not reluctant to make themselves known...

How did this curious cult of incorrection get started?

MWCDEU tracks it back to 1672 and John Dryden's "Defense of the Epilogue":

The italic line is from Jonson's Catiline (1611); the comment on it is Dryden's:

The bodies that those souls were frighted from.
The Preposition in the end of the sentence; a common fault with him, and which I have but lately observ'd in my own writings.

In writing about this a few years ago ( "An Internet Pilgrim's Guide to Stranded Prepositions", 4/11/2004), I observed:

It's a shame that Jonson had been dead for 35 years at the time, since he would otherwise have challenged Dryden to a duel, and saved subsequent generations a lot of grief.

This suggestion was an entirely serious one. An online biographical sketch tells us that around 1590,

Jonson enlisted with the English supporters of the Protestant Hollanders who were defending their religious and political liberties against Catholicism and Spanish rule. The fiery young poet proved to be as formidable with the sword as he was with the pen. In one particular act of bravado, he advanced before the English volunteers, challenged a Spaniard to single combat, slew him, and then--in classic Homeric tradition--stripped the corpse of its armor.

And in 1598,

Jonson fell into a quarrel with the actor Gabriel Spencer and, in a duel, killed the man, though his blade was ten inches shorter than Spencer's. He was imprisoned and very nearly put to death. At the last moment, he was granted a reprieve and released, but his property was confiscated, and he was branded on the thumb. His release was celebrated by the performance of his new play Every Man Out of His Humour.

I located the context of Jonson's use of phrase-final from, in a passage describing the slaughter in Rome at the end of the civil war in 82 B.C.:

                    The rugged Charon fainted,
And ask'd a nauy, rather then a boate,
To ferry ouer the sad world that came:
The mawes, and dens of beasts could not receiue

The bodies, that those soules were frighted from;
And e'en the graues were fild with men, yet liuing,
Whose flight, and feare had mix'd them, with the dead.

This seems as unobjectionable as tens of thousands of similar examples from great writers over the years. MWCDEU quotes 25 or so, from John Bunyan to E.L. Doctorow, via Shakespeare, Jane Austen, Byron, Lewis Carroll, James Joyce, James Thurber and Robert Frost.

It's not clear to me that anyone has ever even tried to tell a coherent story -- much less make a cogent argument -- about the alleged flaws of sentence-final prepositions. In 1762, Bishop Lowth could do no better than this feeble assertion, subverted (on purpose?) by its own first sentence:

This is an idiom, which our language is strongly inclined to: it prevails in common conversation, and suits very well with the familiar style in writing: but the placing of the preposition before the relative, is more graceful, as well as more perspicuous; and agrees much better with the solemn and elevated style.

Still, I've held out hope that Dryden might have had something coherent to say about this stylistic preference in his original 1672 discussion. I've been able to maintain this hope, against the odds, because I've never taken the trouble to locate a copy of "Defense of the Epilogue".

But last night, Geoff Pullum forwarded to me a note from Dan Mack, under the Subject heading "Dryden on Jonson action", informing us that Dryden's complete works are now available on line from Google Books, and supplying a link to the crucial passage.

What a spectacular disappointment!

Dryden is making an argument that he is a better poet and playwright than Jonson, Fletcher and Shakespeare were. The form of this argument is "that the language, wit, and conversation of our age, are improved and refined above the last; and then it will not be difficult to infer, that our plays have received some part of those advantages."

He frames the argument like this:

... these absurdities, which those poets committed, may more properly be called the age's fault than theirs. For, besides the want of education and learning, (which was their particular unhappiness,) they wanted the benefit of converse [...] Their audiences knew no better; and therefore were satisfied with what they brought. Those who call theirs the Golden Age of Poetry, have only this reason for it, that they were then content with acorns, before they knew the use of bread ...

In order to advance this view of Ben Jonson, he writes that

I cast my eyes but by chance on Catiline; and in the last three or four pages, found enough to conclude that Jonson writ not correctly.

Dryden then briefly cites a dozen passages from Catiline where he claims to have found errors. The "frighted from" line is one of them; and what he has to say about it is nothing more than what MWCDEU quoted, as given above.

At the end of his laundry list of Jonson's alleged mistakes, Dryden writes:

And what correctness, after this, can be expected from Shakspeare or from Fletcher, who wanted that learning and care which Jonson had?

I'm not generally in favor of settling intellectual arguments with swordfights, but in this case, I might make an exception.

[The footnote (numbered 6) was apparently supplied by Edmond Malone, the editor of this 1800 edition of The Critical and Miscellaneous Prose Works of John Dryden, and reads "He [i.e. Dryden] accordingly, on a revision, corrected this inaccuracy in every sentence of his Essay on Dramatick Poesy, in which it occurred."]

[Nuria Yáñez-Bouza, who is about to submit her PhD thesis on the history of preposition stranding in the prescriptive (grammatical) tradition, at the University of Manchester (in the UK), wrote:

Although Dryden is well-known for his critique of end-placed prepositions, he was not the first writer to have addressed the 'correction' of end-placed prepositions in the 17thc.

Dryden's explicit critique on Jonson's usage dates from 1672; there is also some evidence of implicit (or latent) criticism in (at least) one early work of his.

Edmond Malone's remark on Dryden's corrections of the Essay of Dramatick Poesie (1668/1684) is not completely accurate; a few stranded prepositions escaped Dryden's thorough revision. Besides, Dryden corrected/revised stranded prepositions in other works too.

And there are many grammarians (and rhetoricians!) in the eighteenth century who criticised preposition stranding apart from Robert Lowth, the most famous one.

I'll look forward with interest to reading her thesis. Meanwhile, a quick web search discloses that she has already published "Prescriptivism and preposition stranding in eighteenth-century prose", Historical Sociolinguistics and Sociohistorical Linguistics, 7, 2007.]

Posted by Mark Liberman at 06:13 AM