June 30, 2006

The Pellicano file

I've posted a couple times now on the need for law enforcement to make use of the latest (well, within the last half century anyway) recording technology and use it to record their interrogations with suspects (see here and here). In that way they could eliminate any doubts about the conversation leading up to the confessions they obtain. Let me be as clear as possible that I'm really FOR putting the bad guys in prison. But I argue that this should be done fairly and that the police should provide ALL the evidence found in their interrogations, not just what is in the tail ends of them, like when the suspect breaks down and confesses. Other fields, like linguistics, require us to provide all the information on which we base our findings. Some police departments appear to be lagging in this important requirement.

What I'm suggesting is that the police do this during investigations, but the rest of us should not go around surreptitiously taping the conversations of our friends and enemies. There are laws against this, to say nothing of the ethical problems involved. For example, how many of you readers have never heard of Anthony Pellicano? Peering through our new special CIA-developed reverse optical scanner devices that make it possible for us to see you with the equipment recently installed in our Language Log computers (just kidding of course), as you read this, I can't see any hands going up. Of course you've heard of him. You read the newspapers. He's the Hollywood "private eye of the stars" who has recently been indicted in Los Angeles on 110 counts of racketeering, conspiracy, wiretapping, witness tampering, identity theft and destruction of evidence. If you want to catch up on Pellicano's recent activity, Wikipedia has a nice summary of it here. He's accused of using his private eye skills in electronics and tape recording to covertly tape the private conversations of people like Sylvester Stallone, Garry Shandling and Kevin Nealon, among many others. It's a huge investigation that goes beyond Pellicano to his clients and even to some of the lawyers representing them. It's a classic example of the way that tape recording should NOT be used.

I met Anthony Pellicano in the 1980s, when we were both helping the defense lawyer in John Z. DeLorean's narcotics case, summarized here. That was back in the days when Pellicano's career as a private eye to Hollywood stars was just beginning to blossom. The FBI had made 64 undercover tapes of DeLorean talking with various people, trying either to get a bank to give his nearly bankrupt auto company a loan or to buy a big chunk of DeLorean Motor Company stock (I describe my role in this case in chapter 4 of my book, Language Crimes, Blackwell, 1993).

Pellicano had produced transcripts of those audio and video taped conversations. He was a pretty good private investigator but his transcripts were often way off the mark. He just didn't have the linguistic skills to do this. When we first met in attorney Howard Weitzman's office in Los Angeles, Pellicano made it clear that he wanted nothing to do with a linguist who might disagree with him, so, to put it mildly, our acquaintance was very short-lived. DeLorean used my transcripts rather than Pellicano's and was eventually acquitted of all the charges. Linguistics won that dispute.

But my point here is this. Let law enforcement officers tape record their interrogations from beginning to end, not just the part they want the jury to hear. Maybe even force them to do this. Then make sure that the transcripts of those interactions are accurate. The rest of us should obey the laws about taping other people without their knowledge and consent. Let Pellicano's legal problems teach all of us, even the clients and lawyers who allegedly benefitted from his services, and be our guide.

Posted by Roger Shuy at 12:13 PM

June 29, 2006

Diagnosing soup label syntax

This evening, Thomas Norman contacted the Language Log Help Line with a problem:

My question concerns a pressing grammatical issue: the Campbell's Chunky soup label.  Every can of soup proudly proclaims itself the "soup that eats like a meal."  The meaning of the slogan is obvious, but the soup itself is certainly not eating anything.  While discussing the matter at dinner, my father and I also came up with the expression "she takes a good picture."  The intended meaning is clear--she looks good when you take a picture of her--but the structure of the sentence seems to imply that she is the one taking the picture.  Is there a name (other than "idiom") and explanation for constructions like these, or are they just grammatical curiosities?  (Or perhaps more accurately, ungrammatical curiosities.)

Though I'm a phonetician, specializing in the sounds of language, we linguists are sworn to provide aid and comfort in the event of any language-related emergency. Also, I seem to be the only one on duty today. So I'll give it a shot.

Thomas' question is clearly a matter of grammatical voice. As the Wikipedia explains,

In grammar, the voice of a verb describes the relationship between the action (or state) that the verb expresses and the participants identified by its arguments (subject, object, etc.). When the subject is the agent or actor of the verb, the verb is said to be in the active voice. When the subject is the patient, target or undergoer of the action, it is said to be in the passive voice.

Now, Thomas understandably expects the passive form to be something like "the soup that is eaten like a meal", and he's worried about what has happened to the is and the -en. Well, in the grammatical tradition of the classical languages (or what I remember about it from high school), the Campbell's Soup slogan would be called not "active", not "passive", but "middle voice". And supporting this memory (Mr. Mansur would be proud), a message contributed by Carl Conrad to the B-Greek mailing list ("Kemmer's Middle-voice categories", 10/18/2001) lists 19 "middle-voice categories (to which [he has] added notes on some Greek and Latin verbs falling into these categories) compiled from Suzanne Kemmer's The Middle Voice, Amsterdam/Philadelphia: J. Benjamins Publishing Co., 1993", with category #18 in this list given as:

18. Facilitative: inherent characteristic of patient allows action to take place: "soup eats like a meal."

Bingo! However, there seems to be some variation here in more recent terminology. SIL LinguaLinks says that

Middle voice is a voice that indicates that the subject is the actor and acts

* upon himself or herself reflexively, or
* for his or her own benefit.

(which would cover some of Kemmer's middle-voice categories, but not #18). LinguaLinks offers an alternative category that fits better:

Mediopassive voice is a passive voice in which the

* verb has stative meaning, and
* actor is not expressed.

where "stative" is a category of verbal aspect, such that a "stative" verb, as Wikipedia explains,

is one which asserts that one of its arguments has a particular property (possibly in relation to its other arguments). Statives differ from other aspectual classes of verbs in that they are static; they have no duration and no distinguished endpoint.

The Wikipedia article on the mediopassive voice gives the English examples

The book reads well. The trousers wash easily. Ripe oranges peel well.

We've clearly identified the syndrome and assigned a diagnosis to the symptoms: this pattern is known as middle voice, Kemmer category #18, or in some jurisdictions, mediopassive voice.

As so often in traditional forms of learning such as medicine and grammar, there is a bit of hocus-pocus about this. We owe more of an explanation than simply the assignment of a terminological category, however comforting that may be. I recall taking an infant with an uncomfortable and persistent rash to the doctor, one dark December afternoon, and being told that it was hibernal eczema. I felt at the time that I wanted more than a translation of the season and the symptom into Latin and Greek, respectively; but I suppose that what I was getting for the insurance company's money was the reassurance that this rash was a familiar condition, and that nothing much could or should be done about it.

So, Thomas, you can relax. The condition is nothing to be alarmed about: it happens in all the best languages. Indeed, as the Wikipedia entry for mediopassive tells us,

Proto-Indo-European ... had two voices, active and mediopassive, where the middle-voice element in the mediopassive voice was dominant. Ancient Greek also had a mediopassive voice in the present, imperfect, future, perfect, and pluperfect tenses, but in the aorist and future tenses the mediopassive voice was replaced by two voices, one middle and one passive.

Eat your soup in peace; and if you come back during normal clinic hours, one of our syntactic specialists will tell you about the latest research on the hows and whys of the many voices of grammar.

[And why voice is not tense.]

[Update: some additional insight is provided by CGEL (the Physicians' Desk Reference of syntax), which discusses examples like this one on p. 307 under the scare-quoted heading 'Middle' intransitives:

In this case the transitive use is primary and the intransitive is interpreted as having an unexpressed causer. Cross-linguistically, the primary use of the general term middle is for a term in a system of voice -- it applies to a voice that is in some sense intermediate between active and passive. The term is certainly not applicable to English in this sense: there are just two categories in the syntactic system of voice in English, active and passive. She doesn't frighten easily is active in form, but it has some semantic affinity with the passive, and it is in this semantic sense that it can be thought of as intermediate between ordinary actives and passives: we put scare quotes around the term to signal that it is being used in an extended sense and is not to be interpreted as denoting a formal category in the voice system.

Intransitives like She doesn't frighten easily characteristically have the following properties:

[40] i A causer (normally human) is implied but can't be expressed in a by phrase.
       ii The clause is concerned with whether and how (especially how readily) the subject-referent undergoes the process expressed in the verb.
       iii The clause is negative, or is headed by a modal auxiliary (especially will), or contains an adjunct of manner (such as well or easily).
       iv The clause expresses a general state, not a particular event.

The implication of a causer ([40i]) is what makes such clauses semantically similar to passives: compare She isn't easily frightened. But the causer cannot be expressed: *She doesn't frighten easily by noises in the dark. Property [ii] shows that these intransitive actives are by no means identical in meaning to passives. Compare The shirt irons well, which says something about the quality of the shirt, with The shirt was ironed well, which tells of the skill of the ironer. Properties [iii] and [iv] exclude such examples as *She frightens and *There was a sudden noise outside and she frightened immediately.]


Posted by Mark Liberman at 11:18 PM

June 28, 2006

There's no battle, Morgan!

I just got back from doing Talk Of The Nation (ToTN) with Neal Conan on NPR from WBUR's studio in Boston. It was about... words. Everybody sees language as just words, words, words. A human language, as most people see it, is simply a Big Bag O'Words (BBoW). Neal Conan believes a language is a BBoW just like everyone else does. And he had Grant Barrett and Martha Barnette there to back him up.

Add a word to a human language, the host and other guests on ToTN seemed to think, and the language is enriched (a caller mentioned hearing a student say "OMG" instead of "oh, my god", and everyone other than me thought that was fascinating; I said it was simply like Colonel Potter on the TV series M*A*S*H calling World War 2 "WW2" except that Potter's abbreviation was over twice as long as the original). Lose a word, and the language is diminished (and losing two would surely seem like carelessness). Change a word, however slightly, and the language is not just altered, but positively degraded.

I do not believe a language is a BBoW at all. To me, it is a structural system, the particular words deployed in the structure being an independent (and much more rapidly varying) matter. But I was swimming against the tide here. Grant and Martha are both in love with lexicography rather than grammar: finding new words, gathering fresh slang, ferreting out rare nouns; and Neal was right in tune with that. The dictionary is where the action is, they all agreed.

What happened with a caller from San Francisco called Morgan was instructive, I thought. Morgan found it just intolerably irritating that people were shortening until to till and shortening through to thru (those were the only two examples she gave).

I leaped in, probably breaking several radio rules (sorry to interrupt, Grant; I think you were going to make essentially the same point), and I explained (audio available here so you can check what I actually said): "Until" is actually an early 15th-century embellishment of "till", made by adding "on" before it (as in "Keep right on till the end of the road"). Eventually the two merged into one word, and the two synonyms lived alongside each other happily ever after. The word "till" is older, and has always been correct. It has never been a contraction or shortening. (It is virtually identical to the analogous form in Swedish and various other Germanic languages.)

And through is of course not changed at all by the American practice of shortening its spelling to thru in informal documents. It couldn't possibly lead to any misunderstanding.

Yet, although all were agreed that too many people think English is rapidly going to hell, Grant did end up telling Morgan (by way of praising her for caring about language, I think): "Keep fighting!". He said, "If you sit back and accept these things without battling for them, then that's the mistake. The mistake is not to say 'This is wrong'."

I demurred. In a confusing scuffle where four people talked at once, I tried to say no, don't keep fighting, Morgan! There is no battle! No degradation! English is going to be OK!

"I've seen degradation," I think I remember telling them, "and this isn't it!"

I thought the others all sounded a bit skeptical. They approved of her battle against those contractions and erosions, those little changes that wear languages thin and loosen things and make things drop off our language like buttons off an old coat. Morgan was basically instructed by a 75% majority (Neal Conan sided with the other two) that these things do matter, and she should carry on fighting for the English language. Grant explained to her that these changes do happen in the slowly winding linguistic river, but it's OK as long as there is no flooding over into unintelligibility.

But I thought the relevant point was that Morgan had not suggested any case where misinterpretation was even a remote possibility.

Morgan did show some signs of lightening up, though. [By the way, is though a contraction of although? No. It just so happens that although is another early 15th-century embellishment. Though is older.] I certainly hope she heard my message of salvation. I hope she will not spend the rest of her life trying to protect English from imaginary threats and mistakes.

Posted by Geoffrey K. Pullum at 02:06 PM

A hospital discovers conversation

Maybe it's the after-effect of my just spending a week in the hospital but it seems to me that medical practitioners may be gradually starting to talk with their patients more. Two years ago, when I had an even longer stay in the same hospital, the doctors and nurses seemed to be far more interested in conversing only among themselves and when they spoke with patients, they used the well-known, in-group medical jargon that tends to isolate them from the rest of us non-medical types. But this time they actually spoke understandable prose to me most of the time, cheerfully explaining what was going on with my body. They also spent what seemed to me to be a great deal of time talking about lots of other things, small talk I guess. It would appear, based only on my admittedly small sample of participant observation, that the medical field may be discovering actual CONVERSATION.

Why was my hospitalization so different this time? Maybe it was because I had different attendants. Or maybe this time my floor had a bunch of easy cases for personnel to deal with. But I suspect that this new relaxed and informal experience had something to do with the fact that my hospital now has a new breed of physicians called "hospitalists." This is a new word in my vocabulary but when I checked Google, I found 817,000 references to it, showing how slow I am to catch on I guess. Hospitalists are physicians who've decided that they want to lead a real life, just like other people, replacing their old grind of seeing patients all day in the office, being called at all hours, and visiting their patients in the hospital (definition here). My own internist quit his practice a year ago to become a hospitalist in a nearby city. My smallish hospital now has six of these, each working 12 hour shifts with the rest of the week free to lead their personal lives. They are never on call because other hospitalists take over when they're off-duty. They say that they now can lead a real life and, in my opinion, this makes them more fully human, capable even of engaging in CONVERSATIONS with patients.

The hospitalist who stopped by my room twice a day spent far more time with me than any other doctor ever has. She took time to explain my procedures in clear, non-technical English, discussed how she got into the medical field, showed me photos of my intestines that were taken during my procedure, took care of my medication needs, sympathized with my boring diet of jello and chicken broth, talked about hiking with her husband in the mountains, and joked about the irony of giving me new blood while at the same time drawing some of it out during my innumerable blood test monitorings. We even discussed a little politics. When the day of my release came, she saw to it that I didn't have to wait around all day to get release papers signed. I was out of there by 10 a.m. Two years ago it took until 3 p.m. to get this simple task accomplished.

What I noticed most, however, was the increased amount of CONVERSATION that took place during my stay. Nurses, phlebotomists, aides, all talked as much as I wanted them to. They seemed to have the time. And they were even interesting. The old tensions caused by waking up doctors in the middle of the night for changes in pain medication didn't seem to be present any more. The hospitalist took over. She seemed to be around all the time. The overall feeling of the place was far more relaxed, something that can be very encouraging to patients. At least it was for me.

Maybe one thing that medical practice has needed all along was a redistribution of authority. The "doctor as God" idea definitely needed to change. Sharing responsibility for patient care with another doctor may be one answer to the stressful life that caused my own excellent internist to quit his practice. All I know is that this hospital is not the impersonal, stressful, and frenzied place that it was two years ago. Personnel now seem to have the time and inclination actually to carry on extended CONVERSATIONS with their patients. And when this happens, there doesn't seem to be the need for using shorthand jargon with patients. Most of all though, it was just nice to be talked with once in a while.

Posted by Roger Shuy at 12:07 PM

Blinded by neuroscience

There's a great piece in Seed Magazine by Paul Bloom, "Seduced by the flickering lights of the brain" (6/27/2006). Money quote:

In a recent study, Deena Skolnick, a graduate student at Yale, asked her subjects to judge different explanations of a psychological phenomenon. Some of these explanations were crafted to be awful. And people were good at noticing that they were awful—unless Skolnick inserted a few sentences of neuroscience. These were entirely irrelevant, basically stating that the phenomenon occurred in a certain part of the brain. But they did the trick: For both the novices and the experts (cognitive neuroscientists in the Yale psychology department), the presence of a bit of apparently-hard science turned bad explanations into satisfactory ones.

It's amazing how many people (including many who should know better) see functional brain imaging as "showing us directly what the brain is doing", rather than as providing yet another dependent measure, not fundamentally different in its "directness" from gaze tracking, reaction time measurements and so on.

[Update 8/10/2007: Deena Skolnick's paper came out in June 2007 in the Journal of Cognitive Neuroscience. Links and discussion are here.]

Some relevant Language Log discussion of dubious uses of neuroscience:

"Are men emotional children?" (6/24/2006)
"Maurice Saatchi, cognitive neuroscientist" (6/23/2006)
"David Brooks, cognitive neuroscientist" (6/12/2006)
"How much do those red and blue jellybeans predict about linguistic ability?" (4/17/2006)
"The brave new world of computational neurolinguistics" (12/27/2005)
"Rorschach Science" (8/12/2005)
"'The Japanese are Japanese because they speak Japanese'" (4/6/2005)

And equal time to more positively-evaluated research reports:

"Juliet was wrong" (5/19/2005)
"News about brain structure in Williams Syndrome" (4/25/2005)
"Structures of words vs. structures of numbers" (3/2/2005)
"Disgust for accents: pre-adaptation or figure of speech?" (8/12/2004)
"Autism as lack of neurological coordination" (7/31/2004)
"Mind-reading experiments at the University of York" (4/13/2004)
"Excitement at the Guardian about language and speech" (2/6/2004)
"Bletchley Park in the lateral interparietal cortex" (1/9/2004)

Posted by Mark Liberman at 10:56 AM

Matrimonial cryptography

This real-life puzzle was sent in by Daniel:

Sarah and I are getting married in September, and want to engrave an encrypted message in our rings. The catches:

(1) Each ring should make sense on its own. I.e., neither ring may look like "XJFIWLGOSIBNQ".
(2) Only about 20 characters can fit on each ring.
(3) The message must require both rings to be decoded; further, every *character* of the message must require both rings to be decoded.

[Update: further explanation can be found here.]

Daniel worked out that an ideal method would be for each ring to serve as a "one time pad" for the other. If the message on his ring is D and the message on her ring is S, then the desired result can be defined as M = D+S, where D and S are combined element-wise (via modular arithmetic, XOR or some other invertible function) to make the joint matrimonial message M.

However, he also recognizes that finding D, S and M such that all are readable (and appropriate) strings is a daunting task.

My best approximation is to suggest that he choose D, S and M as mathematically arbitrary (but personally meaningful) strings; then define X so that D+S+X = M, i.e. X = M - (D+S). Then engrave his ring with D and half of X, and Sarah's ring with S and the other half of X.

This partially violates condition (1), since 1/3 of the characters on each ring appear to be gibberish. However, it does have the property that M is a simple function of both inscriptions, and hard to decrypt otherwise. (And yes, I know that to achieve Shannon security, you'd have to introduce a genuinely random keystream into the process; but as far as I can see, that would require half the characters on each ring to be unreadable gibberish, whereas the 1/3 gibberish in my solution is already too much. Dan is looking for a metaphor to defeat the odds of life and love, not a code that will baffle the NSA. What? Life and love are tougher opponents? Well, perfect encryption of wedding rings probably still isn't either a necessary or a sufficient condition for victory, even metaphorically...)
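For the concretely minded, here is a rough Python sketch of the scheme. It rests on assumptions of mine rather than anything Daniel specified: the alphabet is the 26 uppercase letters (A=0 through Z=25), "+" is element-wise addition mod 26, and D, S and M all have the same length. The sample strings and the function names are made up for illustration.

```python
import string

# Assumption: plain 26-letter alphabet, A=0 ... Z=25, no spaces or punctuation.
ALPHABET = string.ascii_uppercase


def to_nums(text):
    """Map each letter to its position in the alphabet."""
    return [ALPHABET.index(ch) for ch in text]


def to_text(nums):
    """Map numbers back to letters, wrapping around mod 26."""
    return "".join(ALPHABET[n % 26] for n in nums)


def add(a, b):
    """Element-wise sum (mod 26) of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return to_text(x + y for x, y in zip(to_nums(a), to_nums(b)))


def sub(a, b):
    """Element-wise difference (mod 26) of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return to_text(x - y for x, y in zip(to_nums(a), to_nums(b)))


def make_rings(d, s, m):
    """Compute the correction X = M - (D+S) and split it across the rings."""
    x = sub(m, add(d, s))
    half = len(x) // 2
    his_ring = (d, x[:half])   # readable D plus the first half of X
    her_ring = (s, x[half:])   # readable S plus the second half of X
    return his_ring, her_ring


def decode(his_ring, her_ring):
    """Recover M = D + S + X; every character needs both rings."""
    d, x_first = his_ring
    s, x_second = her_ring
    return add(add(d, s), x_first + x_second)


# Hypothetical inscriptions, chosen only to have equal length.
his, hers = make_rings("HUSBAND", "BELOVED", "FOREVER")
print(his, hers, decode(his, hers))
```

With strings of length n, each ring carries n readable characters plus about n/2 characters of X, which is where the 1/3 gibberish figure comes from; a 20-character ring leaves room for roughly 13 readable letters.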

If you can suggest a better (but still practical) approach to finding inscriptions for Dan and Sarah's rings, let me know. You should be able to get a wedding invitation out of it.

Posted by Mark Liberman at 07:14 AM

Chinese takeout and Watergate: Discuss

If you see a bird that's black, then you say you saw a "black BIRD." But if you see the particular kind of bird called a blackbird, then you pronounce it "BLACK bird." To a linguist, "black BIRD" is an adjective followed by a noun, while BLACKBIRD is called a compound. And to this linguist, it took, of all things, THE MARY TYLER MOORE SHOW to make compounds seem fun.

In English, compounds are different from an adjective modifying a noun in terms of where the stress goes. Compounds place the stress on the first word instead of the second one. And compounds are different in terms of meaning: an adjective and a noun is a noun described, like a bird that is, look at that, black! A compound is "one thing," like that particular bird called a BLACKBIRD, as opposed to crows, ravens, etc. That one thing is often more specific than what the words technically mean.

So ICED CREAM, pronounced "iced CREAM," would refer to cream that had been iced, and it's no surprise that this is what Mr. Burns on THE SIMPSONS calls it in one episode, mired as always somewhere around 1903 and thus encountering it for the first time. But we say "ICE cream," because to us it's really one thing, represented in kids' minds, for example, as "eyescreem."

And then, I imagine one could render cream in an iced fashion in any number of ways, but ICE CREAM means that one particular way that Mr. Burns found so special.

We always hear about language changing, and one fun thing to watch for is new compounds. For example, in my down time, I spend way too much time watching TV shows on DVD, and in one episode of THE MARY TYLER MOORE SHOW from 1973, the characters order Chinese food. However, even as late as the Watergate era, they call it "Chinese FOOD," with the melody of, say, "two plus four," instead of the way we say it now, "ChiNESE food," with the melody of "my OWN car."

This was because CHINESE FOOD wasn't yet a compound for all American English speakers back then. It was still a little exotic; people didn't usually have woks at home, and we were still a more steak-and-potatoes country (the MARY TYLER MOORE characters still casually order steaks and martinis for LUNCH!). For us, it's so familiar that of course, it has become a compound: "ChiNESE food," like "BLACKbird." And just like BLACKBIRD and ICE CREAM, CHINESE FOOD has a more specific meaning than it would literally. It doesn't really mean "food the way they make it in China," but brings to mind Chinese food as prepared in America, for Americans, and often ordered as takeout. The word summons a vision of a little white box with a metal handle, not what people are eating in Beijing.

I thought about that the other night when I saw even earlier TV characters talking about another ethnic food. On an episode of THE HONEYMOONERS in 1956, Alice talks about making a "pizza pie," as people still said then (although it was already shortening to PIZZA, as she says it a few minutes later). She pronounces it "PIZZA pie" as we would, but almost surely, fifty years before, it was pronounced "pizza PIE," like we would say "nectarine PIE." (Notice, though, that with a more common pie, APPLE PIE, we often say "APPLE pie" -- it's a compound, and refers to a particular type of apple pie involving nutmeg and cross-hatched crust and carrying associations of Americana and such.) I'll bet one could hear some vaudevillean on a scratchy old record or cylinder say "a pizza PIE."

A couple weeks ago I caught a compound a-borning. Someone was telling me that they had been away from work for a while because of ... and they said something that I didn't quite catch because I had never heard it said this way before. It went "..then I was out for two months with peetstreh so I had to..." For the next ten seconds I internally replayed the sentence again and again, trying to use context to recover what he could have meant. Finally I got it -- REPEAT STRESS!

In the nineties we became familiar with the term REPEATED STRESS SYNDROME, pronounced "repeated STRESS syndrome." But now, it's such an established term that it is no longer the adjective REPEATED and the noun STRESS, but a compound. And that means that the stress has to go on the REPEATED, and so with a little trimming, "rePEAT stress." And so English marches on.

And by the way, I feel moved to note that I do not restrict my televiewing to the likes of THE HONEYMOONERS and MARY TYLER MOORE. I've also been enjoying THE WIRE, which people tell me is a serious show. And I've been listening for new compounds all the while.

Posted by John McWhorter at 01:52 AM

Been catachresis so long it looks like an idiom to me

Yesterday ("It's not hyperbole, but what is it?" 6/27/2006), Catherine Burriss and I asked what to call the use of associative exaggeration in sentences like "Movies, theater, parties, travel—those are just a few of the English nouns that parents of young children quickly forget how to pronounce", or "Been walking so long, forgot how to ride". This question was inspired by Geoff Pullum's dissection of a Father's Day article by Daniel Gilbert ("Words fathers forget how to pronounce", 6/18/2006, and "For the millionth time, it's not hyperbole", 6/19/2006). I agreed with Geoff that hyperbole is too specific, and concluded that metalepsis is too general. In response, readers sent in several sorts of insights.

Alexandra C. Horowitz wrote:

Is "catachresis" closer? My old reliable Quinn text, "Figures of Speech", claims it can be used to describe "a substitution of a noun or verb that...jars our sensibilities," for instance. He gives Cervantes "the very pink of courtesy" and cummings' "the voice of your eyes is deeper than all roses" as examples.

And Karl Hagen made the same suggestion, and explained:

In the sense in which it is usually used today, [catachresis] is close to metalepsis, but additionally contains the notion of the ridiculously impossible--"an extreme, far-fetched, or mixed metaphor; strained or deliberately paradoxical figure of speech" (see http://www.nt.armstrong.edu/term2.htm)

Metalepsis and catachresis, of course, are not necessarily mutually exclusive. A figure can be both.

And it's also worth noting that the strong tendency that people seem to have to insist that this is hyperbole is itself an instance of catachresis, as Quintilian originally defined the term. Quintilian had in mind the use of the closest available term, when something more exact was lacking. This would seem to be exactly the process that people are using in calling it hyperbole:

"The more necessary, therefore, is κατάχρησις (catachresis), which we properly call abusio, and which adapts, to whatever has no proper term, the term which is nearest, as,

    Equum divinâ Palladis arte

    A horse they build by Pallas' art divine;"

--Quintilian, 8.6.34

With respect to the original example about parents' forgetting how to pronounce words like movies, Maryellen MacDonald wrote to remind us of the expected malfunction of speech production processes in neglected areas of the mental lexicon:

[T]he mapping from meaning to pronunciation is probably the most fragile part of the language production process, and the not uncommon failure to produce the pronunciation while still knowing the meaning is called by psycholinguists the Tip of the Tongue (TOT) state, as you undoubtedly know. It's extremely well documented that the likelihood of being in a TOT state for a given word is strongly dependent on the word's frequency--the less often one has said, heard, read, and probably thought about a word, the more likely one will have TOT state when trying to produce it.

and to suggest that

the key argument is about usages of "forget" to mean "not be able to access at this moment".  ... [M]emory researchers also have this usage too, in that they use "forgetting" to mean any decay in representation, which need not mean total loss.  So on this view, it's not hyperbole, it's just forgetting, but apparently a different sense of forgetting than Geoff has.

In fact, the OED memorializes the sense of forget that Geoff (mock-) forgot:

3. To cease or omit to think of, let slip out of the mind, leave out of sight, take no note of.
c. To drop the practice of (a duty, virtue, etc.); to lose the use of (one's senses). to forget to do = to forget how to do (something).

with citations from a few psycholinguists over the years:

c1385 CHAUCER L.G.W. 1752 Lucrece, Desire That in his herte brent as any fire So wodely that hys witte was foryeten.
1590 SHAKES. Com. Err. III. ii. 1 And may it be that you haue quite forgot A husbands office?
1592 —— Ven. & Ad. 1061 Her joints forget to bow.
1670 MILTON Hist. Eng. II. 36 The terrour of such new and resolute opposition made them forget thir wonted valour.

There's also this:

4. In stronger sense: To neglect wilfully, take no thought of, disregard, overlook, slight.

a1703 BURKITT On N.T. Jas. ii. 5 Men wallow in wealth, and forget God.

I had remembered (but edited out of my earlier post) some examples of this sort, including a passage from chapter XIX of The Posthumous Papers of the Pickwick Club:

This constant succession of glasses, produced considerable effect upon Mr. Pickwick; his countenance beamed with the most sunny smiles, laughter played around his lips, and good-humoured merriment twinkled in his eye. Yielding by degrees to the influence of the exciting liquid—rendered more so by the heat, Mr. Pickwick expressed a strong desire to recollect a song which he had heard in his infancy, and the attempt proving abortive, sought to stimulate his memory with more glasses of punch, which appeared to have quite a contrary effect; for, from forgetting the words of the song, he began to forget how to articulate any words at all; and finally, after rising to his legs to address the company in an eloquent speech, he fell into the barrow, and fast asleep, simultaneously.

and especially this famous example from Psalm 137:

5 If I forget thee, O Jerusalem,
            let my right hand forget her cunning.

where forgetting and remembering are as much about attention as about memory, since remembering is urged even on God:

6 If I do not remember thee,
            let my tongue cleave to the roof of my mouth;
            if I prefer not Jerusalem above my chief joy.
7 Remember, O LORD, the children of Edom
            in the day of Jerusalem;
            who said, Rase it, rase it,
            even to the foundation thereof.

I remembered the beginning of this beautiful psalm, which was featured in the lyrics of the Melodians' 1972 reggae hit. I'd forgotten the grim and eerily topical ending:

1 By the rivers of Babylon,
            there we sat down, yea, we wept,
            when we remembered Zion.
2 We hanged our harps upon the willows in the midst thereof.
3 For there they that carried us away captive required of us a song;
            and they that wasted us required of us mirth, saying,
            Sing us one of the songs of Zion.
4 How shall we sing the LORD's song in a strange land?
5 If I forget thee, O Jerusalem,
            let my right hand forget her cunning.
6 If I do not remember thee,
            let my tongue cleave to the roof of my mouth;
            if I prefer not Jerusalem above my chief joy.
7 Remember, O LORD, the children of Edom
            in the day of Jerusalem;
            who said, Rase it, rase it,
            even to the foundation thereof.
8 O daughter of Babylon, who art to be destroyed;
            happy shall he be, that rewardeth thee as thou hast served us.
9 Happy shall he be, that taketh and dasheth
            thy little ones against the stones.

Posted by Mark Liberman at 12:11 AM

June 27, 2006

Jeopardy Report

We're watching Jeopardy, and they're doing pretty well on the language front. In the language category, the contestants got all five questions right, and the Jeopardy folks didn't make any real mistakes either. My only quarrel with the questions is a quibble. They described Cantonese as having nine tones, in contrast to the four of Mandarin. That is sort of true, but we usually count only six tones for Cantonese because three of the phonetic tones are predictable variants that occur in syllables closed by a stop.

However, Alex Trebek needs to work on his Japanese pronunciation. One clue in the Japanese-American relations category contained the word heiwa 平和 "peace". Alex pronounced it [hajwa], as if it had the same first syllable as "highway". Actually, it's [he:wa], with the first syllable more like "hay". It's the same first syllable as in heisei 平成, the reign name of the present emperor.

Posted by Bill Poser at 10:46 PM

Language documentation

Back in April, Mark drew our attention to a story on NPR's All Things Considered about Naomi Nagy's language documentation ("field methods") course at the University of New Hampshire. Here at UCSD we regularly offer a two-quarter sequence of field methods, and for the past two years the instructors and students in the course have been working with a native speaker of Moro, a language of the Sudan that in 1982 was estimated to be spoken by a mere 30,000 people. By comparison, the Kenyan language Kisii/Gusii studied by Naomi Nagy and her students at UNH was estimated to be spoken by over a million and a half people in 1994.

I offer this comparison not because I think either language is worth studying more than the other; both are underdocumented and I'm glad that this state of affairs is being remedied in both cases. I figured, however, that the work on Moro in our department was also worth profiling in the news, and so in May I arranged for our local Associate Director of Communications, Inga Kiderra, to meet with the instructors of the field methods course, Sharon Rose and Farrell Ackerman, to discuss this possibility. The story was picked up by the San Diego Union-Tribune and appeared today -- read it here. As these stories go, I think it's fairly good. Farrell summarized it nicely:

The nice thing is that the flavor of the article puts the university and department (with its emphasis on Margaret Langdon) in a good light with a balance between research and community involvement and this should be good for everyone.

Careful readers may note that the title of the story, "A way with (rare) words", is a play on the title of a locally-produced radio program, A Way With Words. Here are links to Language Log posts written about this painful little show (or just about its most prominent co-host, Richard Lederer):

Engrish explained (3/11/2006)
Collateral damage (2/20/2006)
Further evidence of declining university standards? (1/25/2006)
Special linguistic providence (10/25/2005)
The care less train has left the station (6/20/2005)
Even more etymological arguments (6/20/2005)
Locating the sarcasm bump? (5/29/2005)
So which is it? (12/14/2004)
Eggcorn and malaprop among the flowers (8/19/2004)
(Auto)biography of a blog thread (7/16/2004)
Speaking sarcastically? (7/16/2004)
If wine and stew can always, why can't toast? (7/10/2004)
Spiteful things (7/9/2004)
Still on the hook (7/9/2004)
Caring less with stress (7/8/2004)
Lederer should care less (7/8/2004)
Don't Dangle Your Participles in Public (7/7/2004)
Everything you always wanted in a language, and less (6/29/2004)
English: international and simple (6/29/2004)
The Culture of Polarization, Linguistics Style (4/2/2004)
Does "Locklear" really rhyme with "cochlear"? (12/30/2003)
Mispronunciation -- or prejudice? (12/30/2003)

(See also this post over on phonoloblog (11/6/2005).)

[ Comments? ]

Posted by Eric Bakovic at 01:28 PM

The Language Log book, now at a discount of -319%

Far from the Madding Gerund, the book of selections from Language Log, has sold faster than expected. Our editor informs us that a caravan laden with new stock has been temporarily stalled by a late snow in the Karakorum passes, but Amazon.com (who shipped at a discount within 24 hours when they had copies in their warehouses) is sticking with a cheery "Availability: Usually ships within 2 to 3 weeks", while also offering a link to "limelightbookshop", who advertise a new copy for $48.99. Barnes & Noble (who also shipped within 24 hours before they ran out) says mournfully that "A new copy is not available from Barnes & Noble.com at this time", while its link to "Used Copies Available from our Authorized Sellers" reveals that Mildred Ibale, of Sherman, Texas, is offering a copy in "Like New" condition for $92.77.

Some reader who knows the book trade may be able to tell me what's going on here. A premium of more than 300% for a book published less than two months ago? It must be a misprint. Meanwhile, in order to protect our readers from unwarranted arbitrage (or to give them the opportunity to engage in a bit of their own profiteering, if they prefer), I'll point out that the publisher has copies in stock, and will take orders online or by phone, at the list price of $22. No need to panic: when they run out, they'll print more. Really.

[Update -- Geoff Pullum inferred the next value in the series:

The function that yields the pricing of our book across the country at the moment is (X*2^n)+5, where X = 22 and n > 0. It matches your data almost perfectly. Look for booksellers offering the book at $181 next, when n = 3.
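The arithmetic behind that prediction can be checked with a minimal sketch. The function name `predicted_price` is made up for illustration, and pairing n = 1 with the $48.99 listing and n = 2 with the $92.77 listing is an assumption based on the two prices quoted above:

```python
def predicted_price(n, list_price=22):
    """Tongue-in-cheek pricing formula from the post: (X * 2^n) + 5."""
    return list_price * 2 ** n + 5


# Observed reseller prices quoted in the post, keyed by the n assumed to fit them.
observed = {1: 48.99, 2: 92.77}

for n in (1, 2, 3):
    price = predicted_price(n)
    if n in observed:
        print(f"n={n}: predicted ${price}, observed ${observed[n]:.2f}")
    else:
        print(f"n={n}: predicted ${price}, no listing seen yet")
```

With X = 22 the formula gives $49 and $93 for n = 1 and n = 2, each within about a quarter of the quoted prices, and $181 for n = 3.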

And Reinhold Aman explained the forces behind the algorithm:

I just read your "The Language Log book, now at a discount of -319%."

Welcome to the seedy world of greedy used-book peddlers.

For more examples of such outrageous prices, see:


I'm still puzzled. I would have neither any moral nor any practical objection to the operation of supply and demand in such cases -- but it seems that the supply is basically fine. The book is far from being out of print, copies are available from the publisher at list price, Amazon and Barnes & Noble will be restocked shortly... Perhaps there are automated pricing algorithms that scan the bookstore sites and kick in whenever an out-of-stock situation arises? Or do Amazon and B&N sell lists of books that customers ask for but they are (temporarily) unable to ship?

Margaret Marks wrote:

I suppose there was more demand for the book in the USA than the distributors expected!

At amazon.de you can get it for 20 euros and it is sent out within 24 hours. (There is a separate tab for 'English books').

And Amazon.co.uk has 24-hour shipment for £11.94. But Jeremy Hawker in Norway chose to order from across the pond:

You may be interested to know that Amazon.com shipped your book Far From The Madding Gerund to me the same day I ordered it, on June 20.

They charged me $14.30, plus $8 for international shipping and "handling", or profit, presumably. Total, in other words, about $22.

I love your blog.

Well, Amazon in the U.S. was selling to everyone at $14.30 a copy, with immediate shipping, until they ran out of stock. I'm sure they'll go back to that practice as soon as the camel caravan with their new copies negotiates those passes in central Asia. Or was it the dockers' strike in Mogadishu? I forget. ]

Posted by Mark Liberman at 09:24 AM

It's not hyperbole, but what is it?

Back on Father's Day, Daniel Gilbert attributed a form of aphasia to parents: "Movies, theater, parties, travel—those are just a few of the English nouns that parents of young children quickly forget how to pronounce". Geoff Pullum took Gilbert to task for "taking a straightforward claim about the world that is arguably true and turning it, for absolutely no reason that I can detect, into a claim about language that is wildly and demonstrably false". And then, as Geoff told us the next day, "a million people [wrote] ... to explain ... very gently and patronizingly ... that he didn't mean it literally; it was hyperbole..." So Geoff explained, not especially gently, why Gilbert's turn of phrase "is not construable as hyperbole", and disappeared into the New Hampshire forests, leaving the rest of us here at Language Log Plaza to deal with a tidal wave of additional correspondence from readers.

I've forwarded to Santa Cruz the many helpful genealogical, psychological and medical hypotheses concerning Geoff himself. The sacks of messages that can be paraphrased as "yes it is too" have been turned over to an intern (for internment, of course). The others went on my to-blog list (a RIRO queue, patent pending). One of the most interesting of these came from Catherine Burriss, who asked

Does the impossibility of Daniel Gilbert's expression classify it as another figure of speech, in which the effect of the saying comes not from an exaggeration of a truth, but rather from its absolute impossibility?  If such a figure of speech has been classified, the poetry scholars and rhetoricians would know better than I would. In a way, it is an associative exaggeration, an impossible, but still associated, consequence of the underlying claim; the parents have become so unfamiliar with theater that they have lost all knowledge of it, even that which is impossible to lose. 

I don't think that it's exactly "absolute impossibility" that's at issue. I like Catherine's term "associative exaggeration". More precisely, examples of this kind (and there are many of them) seem to be scalar similes or metaphors that evoke an absurdly exaggerated generalization of things associated with the situation under consideration. This could be considered a specifically scalar kind of metalepsis, "[r]eference to something by means of another thing that is remotely related to it, either through a farfetched causal relationship, or through an implied intermediate substitution of terms. Often used for comic effect through its preposterous exaggeration."

Think about Thomas Pynchon's deathless blurb for Richard Fariña's Been down so long it looks like up to me: "This book comes on like the Hallelujah Chorus done by 200 kazoo players with perfect pitch." As Geoff explained about forgetting how to pronounce movies, this is not hyperbole in the sense of "A figure of speech in which exaggeration is used for emphasis or effect, as in I could sleep for a year or This book weighs a ton". There is no Handel in Fariña's book, not even one kazoo, and the only chorus is associated with an absent-minded malapropism on p. 312:

A phonograph needle was dropped into place by the swimmer who'd been hit on the head by one of Gnossos' silver dollars, and a percussive chorus of marching mummers rendered [sic] the smoky air.

No, Pynchon is not exaggerating the quantity of things that are already literally present in the book. He's inviting us to imagine an absurd hybrid apotheosis of worship, recreation, celebration, artistry and bombast -- 1966 in a nutshell.

Or consider these typically bluesy lines from Eve Merriam's The Company Agent (1956):

Got me a blister barndoor wide.
Been walking so long, forgot how to ride.

The "barndoor wide" business is standard hyperbole -- it's a big blister, etc. But the idea that lack of experience is going to make you forget how to ride is even more "wildly and demonstrably false" than the idea that parenthood makes you forget how to pronounce movie. You could be in a coma in an ambulance and not "forget how to ride". Merriam's metaphor makes its point by inviting us into a fantasy of hyper-empiricist epistemology, in which lack of experience eventually leads to the decay not only of the ability to act, but even the ability to be acted on. That's some serious lack of experience.

And you can use a similar illogical logic to quantify lack of interest rather than lack of experience, by inviting us into a world where the withdrawal of attention spreads to basic associated knowledge and evaporates it. Thus Catherine Fox's recent interview with John Portman (Atlanta Journal-Constitution, 6/18/2006) ends like this:

Q: Any thoughts of retirement?
A: I'm going to keep on keeping on. I can't even spell retirement. I love what I do. It's not work to me.

This is not the only example of "can't even spell retirement" on the web, and expressions like "doesn't know the word quit" are not exactly unknown either.

However, I think that Aristotle missed the boat on this one, and as far as I know, his successors haven't caught it either. Hyperbole is too specific, and metalepsis is too general: is there a term for this? Not that Geoff will like it any better if it has a name.

Posted by Mark Liberman at 07:20 AM

June 26, 2006

Honor be to Mudjekeewis!

In my in-box, every morning,
Greetings from a slew of spammers,
Each, to fool the filters, using
In the header and the body,
Random lines from "Hiawatha":
"And the fierce Kabibonokka"
(Get your clearitol and cum pills);
"Beat the shining Big-Sea-Water"
(Make your wife or girlfriend speechless)
"Sat the ancient Mudjekeewis"
(Safe Prescription Medication);
"And Nokomis fell affrighted"
(Over half a million clients);
"For the maid with yellow tresses"
(Free Fed-Ex on every order);
Till I have the sense of hearing
The entire fucking cosmos
Droning, unenjambed, insistent,
In tetrameter trochaics,
Lulling me to drowsy numbness...

"Make a bed for me to lie on."
(May already be a winner).

Posted by Geoff Nunberg at 03:05 PM

Too good to be true

In yesterday's Opus, we learn that creative mishearing can be dangerous, at least in the funny papers:

Ben Zimmer points out that such jokes have been floating around the internets at least since 1990.

Posted by Mark Liberman at 07:22 AM

June 25, 2006

Dzongkha Linux

The Dzongkha Linux logo

Last November I commented on Microsoft's refusal to use the recognized name of Dzongkha, the national language of Bhutan, for fear of irritating China. According to this recent report, little has come of the US$523,000 paid to the UK-based Orient Foundation to provide support for Dzongkha in Microsoft Windows, originally expected in early 2003. As a result, the Government of Bhutan switched its attention to Linux. The Department of Information Technology recently announced the release of a version of Debian Linux localized for Dzongkha:

While the promise of integrating the Dzongkha Unicode system, developed since 1998 at a cost of US$ 523,000, in Microsoft Vista may be out of the window locals have come up with a much cheaper but more advanced software for Dzongkha computing.

The work was funded by a grant from the Canadian government's International Development Research Centre administered through the Centre for Research in Urdu Language Processing of Pakistan's National University of Computer and Emerging Sciences, at a total cost of US$50,000. The lead from the Debian project was Christian Perrier, who posted this report.

The Government of Bhutan has put out a brochure about Dzongkha Linux. You can download the localization files from the Sourceforge project site. The logo of Tux the penguin in monk's robes is so cute it alone should motivate you to run Dzongkha Linux.

Posted by Bill Poser at 06:45 PM

More rhetorical abuse of the Eskimo lexicon

I look forward to the day when "Linguists tell us that Eskimos have N words for the different states of snow" joins "Some of my best friends are Jewish" in the list of sentences that people have learned to suppress their impulse to write. In fact, I thought that in the lexicographic industries, this day had already arrived. But in the latest issue of the Vocabula Review (motto: "A society is generally as lax as its language.®"), David Isaacson begins an article on "Drunk Words" with a classic example:

Linguists tell us that Eskimos have dozens of words for the different states of snow. Alcoholics, their victims, and others with more than a passing interest in the state of drunkenness have hundreds of words describing what happens when people have too much to drink.

I'll suppress the impulse to observe that a publication is generally as lax as its rhetoric.

In fact, I'll refrain from any further comment, except to point readers to the Wikipedia's article on Eskimo words for snow, and to list a small selection from (the dozens of) earlier Language Log posts on rhetorical abuse of the Eskimo lexicon:

"112 words for misunderstanding meaning" (2/5/2006)
"Snowclone blindness" (11/19/2005)
"Etymology as argument" (6/18/2005)
"No word for sex" (3/12/2005)
"Can Geoff Pullum rest on his laurels?" (8/13/2004)
"The Eskimos, Arabs, Somalis, Carrier .. and English" (3/4/2004)
"Sasha Aikhenvald on Inuit snow words: a clarification" (1/30/2004)
"A short sharp slap for Dennis Overbye" (1/8/2004)

Well, I'll also provide a link to Laura Martin's seminal work, "Eskimo Words for Snow: A Case Study in the Genesis and Decay of an Anthropological Example", American Anthropologist, 1986, pp. 418-423.

I don't want to seem ungrateful: right under Isaacson's essay, on the VR web site, is a "Book Excerpt" from Far from the Madding Gerund (it includes "Phineas Gage gets an iron bar right through the PP", originally from Language Log of 11/21/2003, plus "Forensic Syntax for Spam Detection" from 9/22/2004, "The SAT fails a grammar test" from 1/31/2005, and "Without Washington's Support . . . Who?" from 3/1/2005). But fair is fair.

Posted by Mark Liberman at 12:52 PM

Obscenity as commodity

Coincidentally enough, two opinion pieces — one British and one American — were published in Sunday's papers, both tackling the issue of public obscenity. And even more coincidentally, pundits on opposite sides of the Atlantic arrived at roughly the same conclusion: obscene words should be largely kept out of public discourse, but not because they're inherently vile or worthless. On the contrary, both columns argue that taboo words are precious commodities that lose their value when overused.

The British piece appeared as an editorial (or "leader") in the Guardian's Sunday Observer, inspired by two recent cases of public obscenity on Britain's airwaves. In the first incident, BBC host Jonathan Ross asked Conservative Party leader David Cameron whether he had ever "wanked over" Margaret Thatcher. (A helpful explanation of the context can be found here.) The second incident involved the use of the word fuck on the UK version of the reality show "Big Brother." The editorialist complains that such language should not be so common on television:

The problem comes with overuse. Words are a commodity, cheapened when supply runs unchecked. For an expletive to have dramatic effect, it must come in the context of otherwise sober discourse. If every broadcast is peppered with expletives, our language is impoverished.

The American piece is by Washington Post columnist Joel Achenbach. To paraphrase a recent Associated Press poll question, Achenbach is thinking specifically about the F-word. He considers the FCC's conflicting rulings on the use of fuck and consults with F-word chronicler Jesse Sheidlower, but his conclusions are ultimately quite similar to the Guardian's (albeit delivered in a more tongue-in-cheek tone).

The F-word remains taboo. But just barely. We may be entering an era in which this fabled vulgarity is on its way to becoming just another word — its transgressive energy steadily sapped by overuse.
From hip-hop artists to bloggers to the vice president of the United States, everyone's dropping the F-bomb. Young people in particular may not grasp how special this word has been in the past. They may not realize how, like an old sourdough starter, the word has been lovingly preserved over the centuries and passed from generation to generation. For the good of human communication we must come together, as a people, to protect this word, and ensure that, years from now, it remains obscene.

Unlike his British counterpart, Achenbach is unable to use the word fuck in his column, even citationally, resorting to the usual avoidance strategies of "the F-bomb," "the F-word," and eff. But his admiration for the word in all its unexpurgated glory shines through:

The reason it must be suppressed in polite society is not because it's a bad word, but because, in certain circumstances, it is a very good word. It is a solidly built word of just four letters, bracketed by rock-hard consonants. It is not a mushy word, but one with sharp edges. Consider how clunky the term "the F-word" is. The authentic article, by contrast, explodes into space from a gate formed by the upper incisors and the lower lip. Then it slams to a dramatic glottal cough.

I'm afraid I don't quite see how the labiodental fricative /f/ is a "rock-hard consonant" — seems pretty soft to me. Also, I'm not sure what a "dramatic glottal cough" is exactly — perhaps Achenbach, true to his name, pronounces the final consonant in fuck with a German-style velar fricative as in ach and Bach, or maybe he uses something even more guttural than that. More likely this folk-phonetic description is simply a way to invest fuck with a "natural" explosiveness, as evidence that it is a special word for special occasions.

While we're handing out obscenity-related thesis ideas, here's another one. If swear words are highly valued commodities, then how are those commodities fetishized, as Marx would put it, through avoidance and suppression? What are the limits on the circulation of taboo lexical items if they are to retain their special transgressive value? And is that transgressive value lessened when opinion-makers laud a supposedly "bad" word as actually "very good," thus transforming its prestige from covert to overt?

Censorship is one surefire way to keep those valued lexical commodities out of circulation. And a Christian activist group in North Carolina is doing its part to heighten the totemic power of "bad language" by limiting access to it. The group, known as Called2Action, has banded together with local parents to challenge the Wake County school district's use of five books — including Cassell's Dictionary of Slang, edited by British slangologist extraordinaire Jonathon Green. (Cassell's is an invaluable resource for the study of all manner of vulgarisms, as I noted in my previous post, "Twonk!") The Raleigh News & Observer and the local ABC affiliate have the story of the parents' objections.

If nothing else, the complaint serves as excellent publicity for the newly published second edition of Cassell's. News has even traveled back to the UK, as the Guardian recently ran a story on the Wake County kerfuffle. The Guardian actually gave Cassell's an added boost by saying that the school district has already "banned" the dictionary, even though use of Cassell's has thus far only been challenged. There's no word yet on whether Wake County officials will really ban Cassell's, but if they do, it would be a badge of honor for the dictionary's creators, not to mention a boon to their sales. In this day and age when collegiate dictionaries routinely cover all the usual four-letter profanities, a banned dictionary must have some pretty juicy entries, right? Clearly, obscene language still carries the potential to shock, even in serious-minded works of lexicography.

Posted by Benjamin Zimmer at 06:14 AM

June 24, 2006

Are men emotional children?

I hate this. The world is full of brilliant, promising advances in science and technology, and here I am again, debunking silly overgeneralizations and misinterpretations and even downright charlatanry. Look, my next couple of posts about science or engineering will be full of praise for excellent and insightful research, I promise. But I'm stuck with this one. After taking David Brooks to task for misuse of cognitive neuroscience, I thought I ought to track down the source of the problem. And once I did, I realized that Brooks is not the culprit. That doesn't mean that what he wrote was true. But he took his ideas hook, line and sinker from a recent book by Leonard Sax, M.D., Ph.D.: "Why Gender Matters: What Parents and Teachers Need to Know about the Emerging Science of Sex Differences".

This book has influenced many others besides David Brooks -- it was featured in a Time Magazine cover story in 2005, and Stanley Kurtz praised it in the National Review, and Sax is a leader in the movement for single-sex education -- so I thought I should get a copy and read it. It's an interesting book, and full of ideas worth thinking about. But judging from my experience with the particular factual claim that Brooks took from this book, you'd be wise to keep a very big sack of grains of salt within easy reach when you read it.

Here's what Brooks wrote:

...the part of the brain where men experience negative emotion, the amygdala, is not well connected to the part of the brain where verbal processing happens, whereas the part of the brain where women experience negative emotion, the cerebral cortex, is well connected.

As I explained, this claim about functional localization is false, and can be shown to be so in a few minutes' reading of the scientific literature. So how did Brooks go so far wrong? On p. 29 of "Why Gender Matters", Sax writes:

Girls and boys behave differently because their brains are wired differently.

Deborah Yurgelen-Todd and her associates at Harvard have used sophisticated MRI imaging to examine how emotion is processed in the brains of children from the ages of seven through seventeen. In young children, these researchers found that negative emotional activity in response to unpleasant or disturbing visual images seems to be localized in phylogenetically primitive areas deep in the brain, specifically in the amygdala. (A phylogenetically primitive area of the brain is one that hasn't changed much in the course of evolution: it looks pretty much the same in humans as it does in mice.) That may be one reason why it doesn't make much sense to ask a seven-year-old to tell you why she is feeling sad or distressed. The part of the brain that does the talking, up in the cerebral cortex, has few direct connections to the part of the brain where the emotion is occurring, down in the amygdala.

In adolescence, a larger fraction of the brain activity associated with negative emotion moves up to the cerebral cortex. That's the same division of the brain associated with our higher cognitive functions -- reflection, reasoning, language, and the like. So, the seventeen-year-old is able to explain why she is feeling sad in great detail and without much difficulty (if she wants to).

But that change occurs only in girls. In boys the locus of brain activity associated with negative emotion remains stuck in the amygdala.42 In boys there is no change associated with maturation. Asking a seventeen-year-old boy to talk about why he's feeling glum may be about as productive as asking a six-year-old boy the same question. [emphasis added]

That superscript 42 is not the answer to life, the universe and everything. It's just an endnote, and it resolves to a reference to a particular scientific paper, namely Killgore, William D. S.; Oki, Mika; Yurgelun-Todd, Deborah A. "Sex-specific developmental changes in amygdala responses to affective faces." Neuroreport. 12(2):427-433, February 12, 2001. A footnote or endnote like that, as I'm sure you know, is how authors flag the authority by which they make non-obvious statements. And the claim that adult males are emotional children is certainly non-obvious -- whatever current sexual stereotypes may say.

So I tracked down the Killgore et al. paper and read it. I'm going to share the results with you, because the disproportion between the reported facts and Sax's interpretation is spectacular. (I've also taken the liberty of making a .pdf of the paper available behind the link involved -- if writers like Sax and pundits like Brooks are going to make public-policy recommendations on the basis of a piece of U.S.-government funded research, then the U.S. public should be able to read the research reports.)

Let's start with the "materials and methods". Killgore, Oki and Yurgelun-Todd used functional magnetic resonance imaging (fMRI) to measure changes in blood flow in certain parts of the brain, between periods when subjects were looking at certain pictures and periods when they were looking at a small white circle.

Visual stimuli consisted of six fearful faces selected from the stimulus set of Ekman and Friesen.

The screen was visible via a mirror mounted to the head coil. Each 150 s scanning sequence consisted of five alternating 30 s stimulus/rest periods. ... During baseline and rest periods, subjects were asked to visually fixate on a small white circle located in the center of the screen.

Here's the first of Sax's overinterpretations. The stimuli were "six fearful faces": Sax talks about "unpleasant or disturbing visual images", and "the locus of brain activity associated with negative emotion", and "feeling glum", and so on. But looking at the faces of other people expressing fear, and being yourself depressed, are very different classes of emotions; and there's a larger set of equally diverse negative feelings like anger, disgust, envy, bitterness, grief and so on -- none of which were involved in this experiment.

(There's also a more general issue with experimental design here. There's apparently nothing about the design that guarantees that we're not looking at the effects of seeing faces (or complex visual stimuli) of any sort, since the comparison is between "fearful faces" and "small white circle(s)". As far as I can tell, the experiment was entirely passive -- the subjects were not asked to perform any analysis, or remember anything, or to attend to these rather boring displays in any particular way at all. And Sax himself claims elsewhere in his book that males and females perceive faces and even general visual stimuli in very different ways.)

The subjects in the experiment were

19 healthy children and adolescent volunteers (13 right- and six left-handed by self-report), ranging in age from 9 to 17 years ... The sample included nine males and 10 females ...

Here's the second of Sax's overinterpretations. This is a very small sample, especially because the experimenters want to draw conclusions not only about the effects of sex but also about the effects of age. Worse, the small samples of males and females cover the range of ages rather differently. In the data plots (see below), we can determine that the boys were 11 to 15 -- specifically 1 at 11, 2 at 12, 4 at 13, 1 at 14 and 1 at 15. In other words, six of the nine boys were 12 or 13 years old. ALL the evidence about maturation effects depends on the other three subjects -- and we'll see below that the amount of individual variation is so large that we'd want 10 or 20 subjects at each age before concluding much -- and in any case, this tiny sample of boys only covers the span from 11 to 15 years old.

But Sax concludes that "[a]sking a seventeen-year-old boy to talk about why he's feeling glum may be about as productive as asking a six-year-old boy the same question". Note that the sample of girls was also very small: there were 10 girls, distributed as 1 at 9, 3 at 12, 2 at 15, 2 at 16, and 2 at 17. This spans a larger range of ages (9-17 instead of 11-15), but the number of subjects at each age is still tiny -- and as we'll see, that makes it impossible to draw any reliable general conclusions about the effects of maturation, because of the (very large) individual differences among subjects of the same sex and age.
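To make the thinness of the age coverage concrete, here is a minimal Python sketch that tallies the subjects per age from the counts just given. The variable and function names are mine, and the ages are the ones I read off the plots, not data released by the authors.

```python
from collections import Counter

# Ages as read off the data plots and listed in the text above.
boys = [11] + [12] * 2 + [13] * 4 + [14] + [15]
girls = [9] + [12] * 3 + [15] * 2 + [16] * 2 + [17] * 2

def summarize(label, ages):
    """Print a crude text histogram of subjects per age and return the counts."""
    counts = Counter(ages)
    print(f"{label}: n={len(ages)}, ages {min(ages)}-{max(ages)}")
    for age in range(min(ages), max(ages) + 1):
        print(f"  age {age:2d}: {'*' * counts[age]}")
    return counts

boy_counts = summarize("boys", boys)
girl_counts = summarize("girls", girls)

# The fullest age cell in either group, and the ages with no girls at all.
largest_cell = max(max(boy_counts.values()), max(girl_counts.values()))
empty_girl_ages = [a for a in range(9, 18) if girl_counts[a] == 0]
print("largest age cell:", largest_cell)
print("ages with no girls:", empty_girl_ages)
```

The fullest cell holds four subjects, against the 10 or 20 per age that I suggested above we'd want, and the girls' range has gaps at 10, 11, 13 and 14.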

What effects of these (limited) stimuli on this (tiny) set of subjects were measured?

Regions of interest (ROIs) for each amygdala were selected with reference to an anatomic atlas. Each ROI was comprised of four pixels, each pixel 3 × 3 mm, sampled from one axial slice... The amygdala ROI's were placed in medial aspects of the amygdala on an axial slice that included the subcallosal area (Brodmann's area 25) and the inferior regions of the middle and superior temporal gyrus. Two ROI's were placed in the dorsolateral prefrontal cortex (Brodmann's areas 46 and 9), localized anterior to the cingulate cortex at the approximate level of the genu of the corpus collosum.

Here's Sax's third overinterpretation. Sax talks about "the locus of brain activity associated with negative emotion" and "the same division of the brain associated with our higher cognitive functions" -- but the experiment didn't look for such loci in general, it looked only in two very small (four-voxel) "regions of interest", namely a particular small piece of the (paired left and right) amygdalas and a particular small piece of the dorso-lateral prefrontal cortex. The brain was imaged in an array of 12x64x128 = 98,304 voxels, of which only 4+4+4+4 = 16 were examined at all. These 16 little bits of brain were selected before the experiment began, on the basis of the researchers' expectations about what parts of the brain were relevant; the rest of the brain was ignored. That's normal procedure in some kinds of fMRI experiments, but it's important not to interpret the results as if activity in the whole brain had been evaluated.
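For a sense of scale, the voxel arithmetic can be checked in a few lines of Python. This is only a sketch of the numbers quoted above; the names are mine.

```python
import math

# Imaging array dimensions as quoted above.
dims = (12, 64, 128)
total_voxels = math.prod(dims)

# Four regions of interest (left and right amygdala, left and right DLPFC),
# each made up of four voxels.
roi_voxels = 4 * 4

fraction = roi_voxels / total_voxels
print(f"{roi_voxels} of {total_voxels} voxels examined ({fraction:.4%})")
```

That works out to one voxel in every 6,144.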

The experimenters then averaged the "signal measured in all pixels in each ROI for each time point during the task activation period", which was the sum of "five alternating 30 s stimulus/rest periods", i.e. the sum of the signal over the periods of time when the subject was looking at (the same three) fearful faces over and over again, divided by the sum of the signal over the periods of time when the subject was looking at a small white circle:

The MR signal was then normalized to each subject's baseline average, derived from the mean of the first seven images, and converted into a metric representing the percent change in MR signal from baseline.
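As I read that description, the normalization amounts to something like the following Python sketch. The function name and the time series are invented for illustration; the only detail taken from the paper is that the baseline is the mean of the first seven images.

```python
from statistics import mean

def percent_change(signal, n_baseline=7):
    """Express each time point as percent change from the baseline mean.

    The baseline is the mean of the first n_baseline images, following the
    paper's description; everything else here is an assumption.
    """
    baseline = mean(signal[:n_baseline])
    return [100.0 * (s - baseline) / baseline for s in signal]

# Made-up ROI-averaged signal in arbitrary scanner units (NOT real data).
roi_signal = [1000, 1002, 998, 1001, 999, 1000, 1000, 1010, 1012, 1008]
pct = percent_change(roi_signal)
print([round(p, 2) for p in pct])
```

With a baseline near 1000, a shift of 10 units is a 1 percent change, which gives a feel for how small the plotted effects are relative to the raw signal.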

So let's look at the results. Here's the percentage comparison to baseline for the left and right amygdalas in the boys:

Each dot plots the percentage difference, for one boy, in blood flow (in four amygdala voxels) between watching "fearful faces" and watching "small white circles". The left-hand plot is for the left amygdala, and the right-hand plot is for the right amygdala. The horizontal axis is age, and the vertical axis is percent difference. Look at the range of variation for the four boys at age 13, and the small number of subjects and small range of ages, and you won't be surprised that the trends were not found to be statistically significant (though as you can see, the fitted trend line is going down with age for the left amygdala).

And here are the same graphs for the girls:

Again, there's quite a bit of individual variation -- look at the three 12-year-olds. This time, the authors claim that the correlation between age and signal intensity in the left amygdala is statistically significant. Whether it's meaningful is another matter: a lot depends on that one 9-year-old girl, whose point has a lot of leverage. And in interpreting the difference between the boys and the girls, the fact that the girls have a much larger age range makes it a lot easier for their data to turn out to have a statistically significant trend (whether genuinely or by accident).
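The leverage of a point in a simple linear regression depends only on the predictor values, so it can be computed from the girls' ages alone, without the signal data. Here is a minimal Python sketch using the standard formula h_i = 1/n + (x_i - mean)^2 / Sxx; the ages are the ones listed earlier, and the names are mine.

```python
# Girls' ages as listed earlier in this post.
girls_ages = [9, 12, 12, 12, 15, 15, 16, 16, 17, 17]

def leverages(xs):
    """Leverage of each point in a simple linear regression on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1 / n + (x - xbar) ** 2 / sxx for x in xs]

h = leverages(girls_ages)
average_leverage = 2 / len(girls_ages)  # leverages always sum to 2 here

print(f"leverage of the 9-year-old: {h[0]:.2f}")
print(f"average leverage:           {average_leverage:.2f}")
```

The 9-year-old's leverage comes out at about 0.50, two and a half times the average of 0.20. A common rule of thumb treats anything above twice the average as worth a closer look.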

You can look at the rest of the details for yourself, but I can't resist putting up one more example -- the alleged interaction of sex and age in predicting the difference between dorsolateral prefrontal cortex and amygdala signals:

Again, (c) and (d) are the left and right sides for the boys, while (e) and (f) are the left and right sides for the girls. The authors tell us that

For males considered as a group, there was no significant correlation between age and DLPFC–amygdala difference scores for the (c) left (r = -0.43, ns) or the (d) right (r = 0.40, ns). Females, in contrast, showed a significant correlation between the DLPFC-amygdala difference score on the (e) left (r = 0.73, p = 0.02), but not the (f) right (r = 0.08, ns).

They conclude that

The difference in the observed trajectories between the males and females was significant and suggests that adolescent maturation may involve sexually dimorphic development of prefrontal cortex-amygdala circuits involved in affective processing.

That conclusion could be true -- but would anyone like to place a small wager on how often I can get a random number generator to produce results that look more or less like these, for a simple model of the distribution of signal levels in which there are no sex effects at all? How about a model where the only sex difference is the age of onset of puberty? Or one in which the sex/age effect is on willingness to pay attention in boring experiments? Actually, make that a big wager.
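Here is a minimal sketch of the simplest version of that bet, in Python. It assumes the signal is pure Gaussian noise, unrelated to age or sex, keeps the subjects' ages as listed above, and asks how often at least one of the four age correlations (left and right, boys and girls) clears the nominal 5% threshold. The critical values of |r| are hard-coded from standard tables rather than computed, and the whole setup is my assumption, not the authors' analysis.

```python
import random

boys_ages = [11, 12, 12, 13, 13, 13, 13, 14, 15]
girls_ages = [9, 12, 12, 12, 15, 15, 16, 16, 17, 17]

# Two-tailed 5% critical values of |r| from standard tables (df = n - 2).
CRITICAL_R = {9: 0.666, 10: 0.632}

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def any_significant(rng):
    """One fake experiment: four noise-only correlations with age."""
    for ages in (boys_ages, boys_ages, girls_ages, girls_ages):
        noise = [rng.gauss(0.0, 1.0) for _ in ages]
        if abs(pearson_r(ages, noise)) > CRITICAL_R[len(ages)]:
            return True
    return False

rng = random.Random(2006)
trials = 5000
hits = sum(any_significant(rng) for _ in range(trials))
hit_rate = hits / trials
print(f"at least one 'significant' trend in {hit_rate:.1%} of null experiments")
```

With four independent tests at the 5% level, the rate should land near 1 - 0.95^4, a bit under one in five. This only counts nominally significant trends; it does not model puberty onset or attention, which would need further assumptions.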

In fact, Killgore, Oki and Yurgelun-Todd pull their punch (after delivering it, and probably because some reviewer made them do it, but still):

Given that our results are preliminary and were obtained with a relatively small sample, conclusions based on these findings must be viewed as tentative until replicated with larger groups of subjects. Future studies would benefit from the inclusion a comparison group of adults so that the trajectory of amygdala response may be examined beyond the adolescent years. Secondly, functional imaging studies have consistently shown that the amygdala rapidly habituates to affective stimuli, resulting in reduced BOLD signal in studies that employ a blocked stimulus presentation paradigm [3,29]. As our study included a blocked presentation, we may have minimized our ability to detect amygdala activation, and future studies may benefit from the use of event related designs. Another potential limitation was that the ROIs used in the present study were limited to four pixels selected from a single coronal slice for each region. It is therefore possible that some regions that are critical for emotional regulation and processing were not adequately sampled.

And that underlines Sax's fourth and largest overinterpretation. He takes a very small study, with very limited stimuli, whose results are messy at best and completely equivocal at worst, but certainly show at least as much individual variation as variation by sex. And he presents this study as if it showed, unequivocally and categorically, that

Girls and boys behave differently because their brains are wired differently.

And more specifically, he tells this clear, coherent, categorical -- and completely bogus -- story:

In young children, ... negative emotional activity in response to unpleasant or disturbing visual images seems to be localized in phylogenetically primitive areas deep in the brain, specifically in the amygdala. ...

In adolescence, a larger fraction of the brain activity associated with negative emotion moves up to the cerebral cortex. That's the same division of the brain associated with our higher cognitive functions -- reflection, reasoning, language, and the like. ...

But that change occurs only in girls. In boys the locus of brain activity associated with negative emotion remains stuck in the amygdala. In boys there is no change associated with maturation.

Now, there are probably group differences by sex and age in emotional processing. And Sax might be right to argue that single-sex education is a good idea. But in presenting this narrative of males as emotional children, Sax is not telling us about the established conclusions of scientific research, despite his display of powerful authority-symbols ("her associates at Harvard", "sophisticated MRI imaging"). He's projecting his own prejudices onto a small and limited experiment with equivocal results, which disagree in part with other experiments (like the one I surveyed in my earlier post).

Leonard Sax should be ashamed of himself for trying to use such spectacularly overinterpreted science to advance his social agenda. Professors like me should be ashamed for not educating more of our public intellectuals to be able to evaluate such advocacy in a sensible and responsible way -- I'm sorry to say that Sax is a graduate of the University of Pennsylvania, where I teach.

And journalists have a special responsibility here, which they almost entirely fail to live up to. It's specifically shameful that Time Magazine couldn't assign a reporter willing and able to check Sax's science. And David Brooks, who disarmingly describes himself as a "scientific imbecile", should be ashamed for not taking on an intern who can read and understand the scientific research that he wants to use to support his conclusions about public policy. But then again, Brooks is a political commentator, and so his goal is presumably to argue for the conclusions that he prefers, not to seek the truth. All the more reason for the rest of us to do the intellectual due diligence that he avoids.

[Update -- several readers have written to suggest that I was too kind to Killgore et al. Barbara Z. pointed out that

One factor which you did not mention from the original research is the fact that about one third of the children tested were left-handed by self-report. There may be differences in brain function between right-handed and left-handed children and adults which would only be confounded here by the small sample of both sexes.

Indeed, 6 out of 19 subjects were left-handed -- that's more than you'd expect by chance, I think. The authors don't tell us how the left-handers were distributed by sex, but in any case, inclusion of so many left-handers (maybe of any left-handers at all) seems inappropriate in a study with such small N, where conclusions about functional lateralization are being drawn.

And another reader, who wishes to remain anonymous for the moment, wrote that

[L]ooking at their figures, I'll be damned if all their results aren't driven by that one nine year old girl; her leverage has to be very large. I'd be very surprised if standard regression diagnostics didn't throw up a huge warning flag here. But they don't give any details on their statistical procedure, and they don't make their data available.

I'm also curious about the robustness of their statistical analysis; and with merely 19 subjects, it would have been pretty easy for them to present the data as a table of numbers. In my opinion, it's a black eye for the scientific profession that journals like NeuroReport don't routinely require authors to publish the numbers needed to check their analyses. ]

[Note: more discussion and links on this topic are here.]

Posted by Mark Liberman at 06:19 AM

June 23, 2006

Linguistics in General Science Journals

An issue that has come up here on Language Log on several occasions is the inadequate refereeing of papers in some areas of linguistics, especially historical linguistics, by general science journals such as Nature, the Proceedings of the National Academy of Sciences, and Science. The problem is that the editors of such journals falsely believe themselves and the referees that they choose competent to evaluate papers in areas of linguistics remote from those that overlap areas traditionally covered by these journals, such as neurolinguistics and psycholinguistics. As a result, highly controversial if not outright crank work is treated as if it were solid. When linguists criticize their editorial practices, the editors of such journals tend to respond huffily that they know what they are doing and have selected competent referees: the linguists who complain are just old fuddy-duddies, irritated at being left in the wake by the new wave.

In theory the names of the referees are confidential, so it isn't possible directly to debate their qualifications. In practice, we do sometimes know who the referees were, and when we do, it turns out as we suspect that they are either linguists competent in some other area of linguistics who know little about the relevant area or people not competent in linguistics at all. Even in these cases, however, a public debate is not feasible.

There is, however, another route, which I am going to take here. I will present an example of a paper that unquestionably should not have been published in the form in which it was published. To be precise, I will argue the following two propositions:

  1. Any competent editor would have recognized that the paper should be refereed by a person competent in historical linguistics;
  2. Important aspects of the paper depend on claims that no one competent in historical linguistics would let pass.

The paper that I will discuss is "On the origin of internal structure of word forms" by Peter F. MacNeilage and Barbara L. Davis, which appeared in Science on 21 April 2000, volume 288, pages 527-531. Let me immediately make clear that this isn't meant as an attack on either Peter MacNeilage, whom I know, or Barbara Davis, whom I don't know. The problem with this paper arises from the fact that it brings together material from two quite different areas of linguistics about which no single person can be expected to be knowledgeable. One of the important functions of the editorial process is providing expert review of areas in which the authors are not experts. It is the editors of Science who failed more than MacNeilage and Davis.

Here is the abstract published with the paper:

This study shows that a corpus of proto-word forms shares four sequential sound patterns with words of modern languages and the first words of infants. Three of the patterns involve intrasyllabic consonant-vowel (CV) co-occurrence: labial (lip) consonants with central vowels, coronal (tongue front) consonants with front vowels, and dorsal (tongue back) consonants with back vowels. The fourth pattern is an intersyllabic preference for initiating words with a labial consonant-vowel-coronal consonant sequence (LC). The CV effects may be primarily biomechanically motivated. The LC effect may be self-organizational, with multivariate causality. The findings support the hypothesis that these four patterns were basic to the origin of words.

The paper deals with the relationship among three sets of forms:

  • the words of modern languages as spoken by adults
  • the first words of modern infants
  • the reconstructed forms of words of the original human language

As the claims about proto-language are important and lie outside the areas of specialization of both authors, the editors should have known that obtaining and paying heed to at least one referee competent in historical linguistics was essential.

Let us turn now to the second point, that there are severe problems with the "corpus of proto-word forms" used by MacNeilage and Davis. The corpus in question consists of the 27 "reconstructions" proposed by Bengtson and Ruhlen (1994) in a paper entitled "Global Etymologies". This paper is well known to be badly flawed.

The first problem with B&R's "reconstructions" is that it remains to be established that all human languages are descended from a common ancestor. B&R purport to demonstrate this using the notoriously flawed technique dubbed "multilateral comparison" or "mass comparison" by its proponents. This technique, more accurately called "superficial lexical comparison", consists in presenting sets of words from various languages that allegedly resemble each other sufficiently in sound and meaning that the resemblance cannot reasonably be attributed to chance and declaring "Behold!". In point of fact, the probability of finding similarities of the sort adduced as evidence is quite high. Moreover, the technique is unable to distinguish similarities due to common descent from those due to language contact. The technique is known to be unsound both on theoretical grounds and on the basis of experience: although those ignorant of the history of historical linguistics often claim that it is an innovation introduced to overcome the limitations of the comparative method, in fact it is the older technique, displaced by the comparative method as linguistics developed modern, scientific methods. (For some examples of false conclusions produced by "mass comparison", see Poser and Campbell (1992).)
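To see why chance resemblances are so easy to find, consider a deliberately crude Python sketch. Suppose that, under loose criteria for similarity of sound and meaning, any one language has some small probability of containing a word that "matches" a proposed form, and that languages are searched independently. Both the probabilities and the numbers of languages below are hypothetical values chosen for illustration, not estimates from any study, and the independence assumption is a simplification.

```python
def p_at_least_one(p_single, n_languages):
    """Chance of at least one accidental match among n independent languages."""
    return 1.0 - (1.0 - p_single) ** n_languages

# Hypothetical per-language match probabilities and search sizes.
for p_single in (0.001, 0.01, 0.05):
    for n_languages in (10, 100, 1000):
        chance = p_at_least_one(p_single, n_languages)
        print(f"p={p_single:<5} languages={n_languages:<4} -> {chance:.3f}")
```

Even a one-in-a-hundred chance per language turns into better-than-even odds of a match once a hundred languages are searched, and the comparisons at issue range over far more languages than that.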

The second problem with B&R's "reconstructions" is that they are not. The term "reconstruction" has a specific, generally accepted meaning in historical linguistics. It does not mean just any guess as to what the ancestral form of a word might have been. Rather, it refers to the result of applying a fairly well defined procedure. That procedure involves discovering regular sound correspondences among a set of languages, determining which sound correspondences are in complementary distribution and therefore reflect developments from a single proto-phoneme, and using both the interaction of the sound changes and our knowledge of the directionality of sound change to determine what the most likely phonetic properties of the reconstructed phonemes were and what sound changes led to the observed sound correspondences. Laying this out in detail would require a lengthy post in itself, but it is discussed in any good textbook of historical linguistics, such as Campbell (2004), Crowley (1998), Hock and Joseph (1996), Rankin (2003), or Trask (1996). For something online, the Wikipedia article on the Comparative Method is pretty good.

The point is that real reconstructions are supported by very specific kinds of evidence and that they represent a network of falsifiable hypotheses. Bengtson and Ruhlen's "reconstructions" are not the result of such a procedure but are mere guesswork. They have quite literally looked at words in a number of languages and said: "well, these might have come from something like X" and proclaimed X to be a reconstruction. Such "reconstructions" cannot even be wrong. If I were to claim that the proto-Indo-European ancestor of English bear, Latin fero, Sanskrit bharāmi, Greek φέρω, Armenian berem, etc. was *pero rather than the accepted *bhero, I would be proven wrong by the fact that the sound correspondences would not work: Proto-Indo-European */p/ does not yield Latin /f/, Sanskrit /bh/, Greek φ, etc. But if someone disagrees with Bengtson and Ruhlen as to the proto-World reconstruction of, say, *tik "finger" and claims that it was really *dik, or *tug, or *tink, there is no way to settle the issue, because each hypothesis is as good as any other. There is no network of interlocking, falsifiable hypotheses, just a bunch of independent guesses.

Indeed, it is worth noting that B&R do not present any systematic argument for their "reconstructions", nor have they, nor anyone else, ever offered a justification for this "technique". Proponents of superficial lexical comparison have offered justifications for their approach, unconvincing though they be, but neither I nor any of the other linguists with whom I have checked is aware of any attempt at justifying "reconstructions" produced by methods other than the comparative method.

The third problem with B&R's "reconstructions" is that the data on which they are based are badly flawed. Two major critiques have been published. One, Salmons (1992), looked at the evidence underlying a single putative proto-form *tik. It found grave errors in much of the data. The other, Picard (1998), is amenable to compact summary. Picard reviewed all of the data from Algonquian languages cited by B&R. He found five types of error:

Incorrect Language
The form cited is not from the language it is said to be from. For example, the form woxos "shin" is identified as Blackfoot but is in fact Arapaho.

Incorrect Gloss
The meaning given for the form is incorrect. For example, Natick mukketchouks is glossed as "boy", but is actually "son, man child".

Incorrect Transcription
The pronunciation given is incorrect. For example, the Shawnee word for "girl" is given as kwan-iswa but is actually kwaaniswa.

Incorrect Segmentation
The word is broken into morphemes incorrectly or without justification. For example, Blackfoot nóoma "my husband" is given by B&R as no-ma, incorrectly glossed as "husband". The hyphen indicates that B&R think that the word consists of two morphemes. This allows them to compare this word with other words containing ma, which they claim go back to a Proto-World form mano. This Blackfoot word actually consists of the prefix /n/ "my" and the noun stem /-óoma/ "husband". /óo/ is part of the noun stem. /óoma/ looks less like /mano/ than /ma/ does; the effect of B&R's incorrect segmentation was to make the Blackfoot form look more similar to the other forms cited than it really is. Such errors of segmentation do not have a random effect - they almost always are of such a nature as to make the forms compared look more similar than they really are.

Ancestral Disparity
In some cases, a word that in its modern, attested form resembles the other members of the equation is known to derive from an ancestral form that looks very little like the other members. For example, Arapaho /woxos/, which does resemble the putative Proto-World /bu(n)ka/ and words from other languages with which it is compared, is derived from Proto-Algonquian /meθkwaθkana/, which looks nothing like /bu(n)ka/.

Of B&R's 27 "global etymologies", nine involve forms from Algonquian languages. Picard found an average of two errors per form, with at least one error in every form. Picard's findings are summarized in the following table, based on the table in his paper.

Error Type / Form             1  2  3  4  5  6  7  8  9
Incorrect Language (Group)    X  -  -  X  -  -  -  -  X
Incorrect Gloss               -  -  -  -  X  X  X  X  -
Incorrect Transcription       -  X  -  X  -  -  -  -  -
Incorrect Segmentation        -  X  -  -  -  X  -  X  X
Ancestral Disparity           X  -  X  X  -  -  X  X  -

Some errors do not necessarily render the form useless. Assignments to the wrong language may not matter if it belongs to the same family. Incorrect glosses may not matter if the true meaning of the form is close enough. Incorrect transcriptions do not always change relevant factors. On the other hand, incorrect segmentation usually means that the real form does not fit the equation. Similarly, when the ancestral form is unlike the form cited, this means that the form does not fit the equation. Errors in these two categories therefore generally invalidate the comparison.

Picard found either an incorrect segmentation or ancestral disparity in eight of the nine Algonquian forms cited. In sum, all of the Algonquian forms cited by B&R are erroneous; eight out of nine forms are flawed so seriously as to invalidate their inclusion in the equation. (Only seven of the nine equations are actually invalidated because in one case B&R compared both an actual Arapaho form (misidentified as Blackfoot) and a reconstructed Proto-Algonquian form. Although the Arapaho form is descended from a Proto-Algonquian form that looks nothing like the other members of the equation, the Proto-Algonquian form that they cite is sufficiently similar in form and meaning to the forms from other languages that it might validly be compared with them.)
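The tallies quoted above can be checked directly against the table. The following Python sketch transcribes the X marks as I read them from the table (one string per error type, one character per form) and recomputes the per-form error counts and the number of forms with an invalidating error; the names are mine.

```python
# One string per error type; character i is "X" if form i+1 has that error.
table = {
    "Incorrect Language (Group)": "X  X    X",
    "Incorrect Gloss":            "    XXXX ",
    "Incorrect Transcription":    " X X     ",
    "Incorrect Segmentation":     " X   X XX",
    "Ancestral Disparity":        "X XX  XX ",
}
n_forms = 9

# Number of distinct error types recorded for each form.
errors_per_form = [
    sum(row[i] == "X" for row in table.values()) for i in range(n_forms)
]

# Forms with either of the two error types that generally invalidate a comparison.
invalidated = [
    i + 1
    for i in range(n_forms)
    if table["Incorrect Segmentation"][i] == "X"
    or table["Ancestral Disparity"][i] == "X"
]

print("errors per form:", errors_per_form)
print("mean errors per form:", sum(errors_per_form) / n_forms)
print("forms with an invalidating error:", invalidated)
```

On this reading, every form has at least one error, the mean is exactly two, and form 5 is the only one without a segmentation error or ancestral disparity.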

The fourth problem has to do with the ambiguity of the term Proto-World. In one sense, this means the hypothetical first language of human beings. In the other sense, it means the reconstructed ancestor of the attested human languages. If all languages that had ever existed were attested, their reconstructed ancestor would be an approximation to the first language of human beings at the point at which it first diversified into two or more speech varieties. Unfortunately, we have no idea how long a period elapsed between the origin of language and the point at which the first diversification occurred, much less of what changes may have occurred, nor do we know what branches of the family tree may have become extinct without attestation. It is perfectly possible that thousands of years elapsed between the origin of language and the first diversification. Moreover, if major branches are unattested, the common ancestor of all known languages may postdate the original human language by tens of thousands of years. The result is that even if we could reconstruct the ancestor of all attested languages, it would not be safe to equate it with the original language.

In sum, there are four problems with MacNeilage and Davis' use of B&R's "reconstructions":

  • It has yet to be demonstrated that all spoken languages are descended from a single ancestor;
  • These so-called "reconstructions" are not in fact reconstructions. They have no scientific basis whatsoever;
  • There are severe flaws in the data on which the "reconstructions" are based;
  • Even if we assume that all spoken languages are related and that B&R's "reconstructions" are valid reconstructions of forms from the ancestor of all attested spoken languages, the relationship of this proto-language to the first language spoken by human beings is unknown.

These problems are well known to historical linguists and were well known in 1999 when MacNeilage and Davis' paper was under review by Science. They are so severe as to completely invalidate MacNeilage and Davis' reliance on Bengtson and Ruhlen's "global etymologies". These "reconstructions" are not merely controversial; they are nonsense. Relying on them is like relying on Pons and Fleischmann's work on cold fusion or Erich von Däniken's work in archaeology. I submit, then, that we can conclude with confidence that in evaluating this paper the editors of Science did not rely on the advice of anyone competent in historical linguistics.

Bengtson, John D. and Merritt Ruhlen (1994)
"Global etymologies," in Merritt Ruhlen (ed.) On the Origin of Languages. Stanford: Stanford University Press. pp. 277-336.
Campbell, Lyle (2004)
Historical Linguistics: An Introduction. Cambridge: The MIT Press. 2nd edition.
Crowley, Terry (1998)
An Introduction to Historical Linguistics. Oxford: Oxford University Press.
Hock, Hans H. and Brian D. Joseph (1996)
Language History, Language Change, and Language Relationship: An Introduction to Historical and Comparative Linguistics. Berlin: de Gruyter. ISBN 311014784X.
Picard, Marc (1998)
"The Case against Global Etymologies: Evidence from Algonquian," International Journal of American Linguistics 64.2.141-147.
Poser, William J. and Lyle Campbell (1992)
"Indo-European Practice and Historical Methodology," Proceedings of the 18th Annual Meeting of the Berkeley Linguistics Society. pp. 214-236.
Rankin, Robert L. (2003)
"The comparative method," in Brian D. Joseph and Richard D. Janda (eds.) The Handbook of Historical Linguistics. Oxford: Blackwell. pp. 183-212. ISBN 1405127473.
Salmons, Joseph P. (1992)
"A look at the data for a global etymology: *tik 'finger'," in Garry W. Davis and Gregory K. Iverson (eds.) Explanation in Historical Linguistics. Amsterdam: John Benjamins. pp. 207-228.
Trask, R. L. (1996)
Historical Linguistics. New York: Oxford University Press.

Note: the dates of the papers as published are misleading. All of these papers were circulated long in advance of publication, so the fact that Salmons' critique of B&R's paper was published two years prior to the appearance of the paper of which it is a critique is not anomalous. Indeed, B&R complained quite publicly about the rejection of their paper by Language, in response to which a public discussion took place.

Posted by Bill Poser at 08:10 PM

Everybody's going meta

"Beetle Bailey" isn't the only comic strip featuring meta-commentary on cursing characters these days. Here's a "Mother Goose & Grimm" strip that ran on May 6th:

And as usual, those playful postmodernists at Mad Magazine are way ahead of everyone: the April 1980 issue had a feature entitled "Comic Strip Cursing Symbols to Match a Given Situation." Required reading for that thesis on cartoon cussing.

Posted by Benjamin Zimmer at 05:05 PM

More @!%!**#~@#!! Wisdom from Beetle Bailey

The Beetle Bailey cartoon that Mark Liberman just posted reminded me of an earlier Beetle Bailey offering on the symbols of cartoon cursing. I can't find the actual cartoon on the web, but I know it was from 1990, because I wrote about it some years back in a little article on language change for the Linguistic Society of America website (here):

In a 1990 Beetle Bailey cartoon, Sarge chews Beetle out with a string of symbols ending in #!!, and Beetle laughs, "#?? Nobody says # anymore!" Sarge, deflated, sighs, "Gee, I always thought # was all-time classic cussing." Sarge is embarrassed because with a very few exceptions -- notably the genuinely classic four-letter English words, at least one of which has a pedigree that includes a Latin obscenity written on the walls of ancient Pompeii -- using last year's slang spells social disaster.

So Beetle Bailey is an old hand at meta-commentary on obscenities. And maybe what the world needs now is an enthusiastic student to write a thesis on the orthography of cartoon cussing; we could even pass the hat around Language Log Plaza and offer a Language Log Fellowship for the purpose.

Posted by Sally Thomason at 10:58 AM

Maurice Saatchi, cognitive neuroscientist

All the hippest conservatives these days are getting into cognitive neuroscience. A couple of weeks ago, David Brooks was telling us about the male amygdala and its need for special reading material. Yesterday, I learned from Maurice Saatchi that

[S]ocial scientists divide the world between digital natives and digital immigrants. Anyone over 25 is a digital immigrant. He or she has had to learn the digital language. The digital native learnt it like you learnt your mother tongue, effortlessly as you grew up. The digital immigrant struggles and forever has a thick, debilitating accent.

The latest affliction, according to neuroscience - and this was the death knell - is that the digital native's brain is physically different as a result of the digital input it received growing up. It has rewired itself. It responds faster. It sifts out. It recalls less.

That death knell was tolling for advertising, as announced in a speech ("The strange death of modern advertising") that Lord Saatchi gave at the Cannes Lions International Advertising Festival. Unfortunately, I wasn't in Cannes to hear it -- from what Francesca Newland tells us, this sounds like a very enlightening conference:

Chief executives passed out on the pavement in front of the Gutter Bar, Traktor party revellers being sprayed with Champagne to keep them cool, DDB's entire London creative department squashed on to an inflatable banana while racing around the Med.

Cannes seems to be the last remaining outlet for the outrageous behaviour that has helped shape the ad business over the decades. For one week in 52 Cannes enables adland to pay tribute to its eccentric roots.

Alas, my own exposure to generous stretches of Lord Saatchi's speech was on the BBC Newshour yesterday morning around 9:20. He was beaming from my kitchen radio as I worked on my small contribution to an NIH grant application, checked my email and took a couple of notes for this post. Anyhow, Maurice is impressed by the rewired brains of digital natives:

This, apparently, is what makes it possible for a modern teenager, in the 30 seconds of a normal television commercial, to take a telephone call, send a text, receive a photograph, play a game, download a music track, read a magazine and watch commercials at x6 speed. They call it "CPA": continuous partial attention.

And the logical terminus of this need for speed has been revealed to him: an effective message must be reduced to a single word.

Can you precisely describe, in one word, the particular value, the characteristic, the emotion, you are trying to make your own?

If it runs to a sentence, you have a problem. A paragraph? Sell your shares.

Geoff Pullum was way out in front on this one: "Snugglebunny is mine" (4/18/2004). Except that Geoff wanted to claim "the verb snuggle and all derivatives thereof (e.g. snugglebunny); the adjective parsimonious; the preposition of; and the nouns crump, ether, parsley, helicopter, oligarchy, and rhodium". According to Lord Saatchi, this is lexical polytheism and it will never do:

In the beginning was the Word . . . and the Word was God.
Two words is not God. It is two gods, and two gods are one too many.
Each brand can only own one word. Each word can only be owned by one brand. Take great care before you pick your word. It is going to be the god of your brand.

Curiously, Lord Saatchi brands his "new business model for marketing, appropriate for the digital age", using three words:

In this model, companies compete for global ownership of one word in the public mind.

This is "one word equity".

On the other hand, some digital native working for him has figured out that you can (must!) leave out all the spaces to claim your internet domain: onewordequity.com. I guess this makes it three but also one, sort of like... No, let's turn away from this lite-weight blasphemy -- and the onewordequity site, one of the worst-designed user experiences I've ever encountered on the web -- and get back to that neuroscience.

The idea that our brains are re-wired by our media experiences goes back at least to Marshall McLuhan, whose contemporary apostle, Camille Paglia, told us a couple of years ago that "[i]nterest in and patience with long, complex books and poems have alarmingly diminished not only among college students but college faculty in the U.S.", because "the new generation, raised on TV and the personal computer but deprived of a solid primary education, has become unmoored from the mother ship of culture", and lacks "the most basic introduction to structure and chronology". Paglia has proposed that we should show these kids slides of great paintings, and get them to read great poems, and why not? There isn't any substantive evidence for her views of the malady, its cause, or the efficacy of the cure; but more art and poetry in education is surely all to the good, as long as no one takes the justificatory verbiage too seriously.

Saatchi is embracing the alleged trend towards attentional fragmentation, not fighting it. But I bet that his pop neuroscience is just as bogus as McLuhan's was. It's hard to tell, though, because he doesn't really tell us what it is: he appeals to the authority of science without getting particular enough for us to check who his authorities are, and whether and how their work supports his ideas. (The only real clue that he gives us is the phrase "continuous partial attention", which seems to have been coined by Linda Stone, who is a software executive, not a neuroscientist.)

It's not fair to ask for footnotes in an inspirational speech on the beach (and on the BBC, which continues to maintain its reputation for credulity in the face of pseudoscience). The trouble is, Saatchi has already stretched his gospel of monotheistic lexicography to a three-word slogan and a thousand-word speech. So I doubt that we can expect an explanatory essay like those we've gotten from David Brooks and Camille Paglia. I'm sure that Lord Saatchi's speech helped the ad executives at Cannes get back in touch with their inner eccentrics. It remains to be seen whether his concept of onewordequity will work out better than the 2005 Tory election campaign did.

[ Some Language Log posts on related topics:

"Balm in Gilead" (4/16/2004)
"O tempora, o mores" (4/16/2004)
"Generational changes: decline or progress?" (4/20/2004)
"Mais ou sont les flamewars d'antan?" (4/21/2004)
"Attention deficit" (4/22/2004)
"In principle, yes" (4/24/2004)
"A field guide to grammar" (8/6/2005)
"Quit email, get smarter?" (4/23/2005)
"A tale of two media" (4/30/2005)
"Never mind" (5/03/2005)
"News flash: the effect of politics, athletics and sex on IQ" (5/03/2005)
"An apology" (9/25/2005)

And Ann Althouse has a great liveblogging of a Camille Paglia book signing: "Try to survive a tornado with a post-structuralist." (4/27/2005). ]

Posted by Mark Liberman at 06:41 AM

June 22, 2006


We at Language Log are charmed by the sound sample for the interjection ahem found alongside Answers.com's definition "Used to attract attention or to express doubt or warning." Can someone at Answers.com help me out? Is that the right intonation for attention, doubt or warning?

Answers.com (prudishly?) omits most sound samples for profanities. But they have a cute one for fucking, which they define only with "Used as an intensive." I'd prefer intensifier, at least if their sound sample was anything like this use of the word. It isn't. Still, it would make a great one-liner in a fruity off-Broadway bedroom farce:

Lady Eldridge: My! What are Archbishop Priestly and the Countess doing?
Ambassador Biggley: Fucking.

Posted by David Beaver at 09:23 PM

Beetle Bailey goes positively meta

In tune with our recent posts about substitutions for taboo language, a few days ago Beetle Bailey looked outside the frame and produced a strip that is actually somewhat funny:

There's also a serious anthropological point here, one that I never thought of before. Many cultures (and some of our own subcultures) proscribe praise just as much as blasphemy, scatology or obscenity. For example, as the Wikipedia explains,

Ashkenazi Jews in Europe and the Americas routinely exclaim Keyn aynhoreh! (also spelled Kein ayin hara!), meaning "No evil eye!" in Yiddish, to ward off a jinx after something or someone has been rashly praised or good news has been spoken aloud.

Some milder residues of such attitudes remain in expressions like "knock on wood". But I've never seen anyone using typographical bleeping to disguise the written expression of praise or good news, as has been done with blasphemy, scatology and obscenity in English texts at least since the end of the 17th century.

[hat tip to anonymous eric]

Posted by Mark Liberman at 08:59 PM

An editorial conflict of interest at Slate?

Yesterday, Ann Althouse offered Slate well-deserved congratulations on its 10th anniversary ("The usual Slate plus a spate", 6/21/2006), while registering an equally well-deserved complaint about Jacob Weisberg's little cottage industry in Bushisms:

Oh, Slate is exasperating at times. Jacob Weisberg keeps cranking out his Bushisms, maybe just to keep Slate critics from noticing other problems. They're so damned distracting. Look, he's got a new one up there now:

"I tell people, let's don't fear the future, let's shape it."

What's even supposed to be wrong with that? The phrase "let's don't" is standard English. Is there something off about thinking people fear the future? Is the idea of shaping the future too arrogant and unrealistic? Come on, Weisberg, that's no "Is our children learning?"

The day before, Eugene Volokh made the same point about the same BotD:

Here's today's Bushism of the Day, "the president's accidental wit and wisdom," from Slate:

"So we'll bring our ideas, they'll bring theirs, let's clarify the differences, let's don't say bad things about our opponents."

Whoops, sorry, wrong President -- that's actually from President Clinton. The Bushism of the Day today is really this:

"Let's don't just talk about it. Let's actually do it, by passing the legislation."

Rats! Screwed up again -- that's actually from Vice President Gore. Here, and this time I'm serious, is today's actual Bushism of the Day:

"I tell people, let's don't fear the future, let's shape it." -- Omaha, Neb., June 7, 2006

As best I can tell, the only supposed flub -- the only supposed humor -- here is "let's don't." (Without that, the phrase isn't terribly rich in content, but neither are "the only thing we have to fear is fear itself," or a wide range of other perfectly normal exhortations from political leaders.)

Yet it's a flub only in the sense that departure from the standard Northeastern/West Coast elite spoken English is a flub. If you search for "let's don't," you'll find it used routinely in spoken English, chiefly (as best I can tell from my searches) by people from flyover country.

Eugene Volokh cites The Columbia Guide to Standard American English, where Ken Wilson says that "There are three negative idioms: Let’s not stay, Don’t let’s stay, and Let’s don’t stay. All are Standard, although Let’s don’t is more typically American than Don’t let’s, which is more typically British."

In this case, I think Ken is missing a nuance. All three idioms might be Standard, but "let's not" is a lot commoner:

[Table of web search counts for "let's not", "let's don't" and "don't let's" on Google, Yahoo and MSN]

"Let's not" is 118 times commoner than "let's don't" on Google, 95 times commoner on Yahoo, 98 times commoner on MSN. In news-oriented indices, the ratios tend to be larger:

[Table of news search counts for "let's not", "let's don't" and "don't let's" in Google News, Yahoo News and MSN News]

That's a ratio of 304 in Google News, 169 in Yahoo News, and 41,438 in MSN News (where something strange is going on in this case). 3 of the 10 "let's don't" examples from Google News are Bush quotes; the other 7 are from NC, LA, TX, AL (2), LA(2) -- that's not just fly-over country, it's more specifically the American south.

My impression is that "let's don't" is not only somewhat more informal than "let's not" -- which will already be tagged as informal by some because of the contraction -- but also is in commoner use in the south. (There are four examples of "let's don't" in a corpus of telephone conversations that I searched, three used by southerners and one by someone from a "midland" state, a region that includes places like Oklahoma and Missouri. The 67 instances of "let's not" are from all over.)
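The "times commoner" figures in this post are plain quotients of hit counts. Here is a minimal sketch in Python that uses only the two pairs of counts stated outright in the text: the telephone-corpus figures above and the New York Times figures discussed just below. The dictionary keys and the function name are labels made up for this illustration, not anything returned by a search engine.

```python
# Counts quoted in the post; the source names are just illustrative labels.
counts = {
    "telephone corpus": {"let's not": 67, "let's don't": 4},
    "New York Times since 1981": {"let's not": 3241, "let's don't": 62},
}


def ratio(source_counts):
    """How many times commoner "let's not" is than "let's don't"."""
    return source_counts["let's not"] / source_counts["let's don't"]


# One ratio per source, computed the same way for each.
ratios = {source: ratio(c) for source, c in counts.items()}

for source, r in ratios.items():
    print(f"{source}: {r:.1f}")
```

The same division applies to any of the search-engine comparisons, given the raw hit counts.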

The New York Times search (since 1981) turns up 3,241 examples of "let's not" versus 62 examples of "let's don't", for a ratio of (only) 52. This evidence confirms that both idioms are informal: a spot check of the first 20 hits for "let's not" reveals that they are all from letters, Op-Ed pieces and quotations in news articles -- except for one use in a book review:

"Neff, let us assume, wants permanent insurance against Keyes's subtle inquisition into the ostensible claims of his sexual life." Oh, come on, let's not assume it.

And the NYT search is also consistent with the view that "let's don't" is associated with southern speech. Among the first 10 examples, 5 are quotes from American southerners, or in one case from an African-American from a northern state:

Senator Jeff Sessions, Republican of Alabama, added, ''Let's don't play games with their lives.''
''Let's don't say one word to anybody,'' Mulkey-Robertson [Baylor basketball coach] said.
Walter and Betsy Cronkite sat at a corner table with Andy Rooney and his daughter Emily. ''We've been coming here 23 years,'' Mr. Cronkite said before his wife added tartly: ''Let's don't lie about our age. It's 43 years.''
At the 1999 Gridiron Dinner, Mr. Clinton won over the room when he looked back at the impeachment ordeal and said: ''Let's don't kid each other. This was an awful year. It was a year I wouldn't wish on my worst enemy.'' Pause. ''I take that back.''
''It was a major accomplishment in the final analysis,'' he said. ''I'm happy to share credit with the City Council. But let's don't get carried away with it.'' [David N. Dinkins]

3 are in quotes from sources whose sociolinguistic origin I don't know:

The lyrics and music, written by Mr. Mills, are a pastiche of influences - Stephen Sondheim, Rodgers and Hammerstein, Gilbert and Sullivan - without ever really having a voice of their own. Still, there are a few songs - "Class," "Letters to Boys," "If Only" and "Let's Don't" - that are nicely wrought, making one wish that Mr. Mills and Ms. Reichel had further honed their work.
''These are bad folks, and let's don't forget that,'' Brig. Gen. John W. Rosa Jr. of the Air Force, the new deputy director of operations for the Joint Chiefs of Staff, told reporters in Washington.
'You didn't cross Oli lightly. He let you know just what he thought. At 13, he'd say, 'I don't agree with you, Mom, but let's don't argue about it.' ''

and two are quotes from northerners as reported (or invented) by people (I believe to be) of southern background:

''I remember Howard at the time was very good at sizing up people,'' Mr. Hudspeth said. ''He'd cut to the chase, every time. He'd say, 'Let's don't bother with that guy, he's too contentious, we'll never convince him.' Instead, we worked on some other guy. Howard was always a few steps ahead.'' [Thomas Hudspeth quoting Howard Dean]
Huber pauses, as if considering what he's going to say next, then: ''You just killed somebody, Win. Let's don't talk politics.'' [from a serial novel by Patricia Cornwell]

All of this underlines the points that Ann Althouse and Eugene Volokh made:

  • "let's not" and "let's don't" are both informal but widely-used idioms;
  • "let's don't" is less common and has southern-states associations;
  • Jacob Weisberg is engaging in cynical manipulation of regional and class prejudice in order to enrich himself.

Well, they didn't actually say that last part -- I'll take responsibility for it myself.

For several years, I've joined others in complaining about the preposterous over-reaching of the Bushisms industry (see below for some links). The individual cases are just like any disagreement over usage: we argue over what linguistic norms really are, what they should be, and why. But there's a broader pattern here, and it's not just that many people dislike President George W. Bush and are happy to find a linguistic focus for their feelings. That's the demand side of the industry, and it's obvious. However, there's something to say about the supply side as well: the Bushisms industry apparently accounts for a significant portion of Jacob Weisberg's income, and he's the editor of Slate, who gets to decide which "Bushisms" to print and how often to print them.

Amazon.com lists more than two dozen Bushisms products, including at least five book-length collections, various special editions ("The Deluxe Election Edition", etc.), yearly quote-a-day calendars, wall posters, refrigerator magnets, and even a DVD. Stacks of Bushism-objects for sale are prominently displayed in most bookstores that I visit. This is not a flash in the publishing pan -- it's been going on for almost six years. Maybe someone who knows the publishing industry better than I do can estimate what Weisberg's royalty payments from this enterprise are like, but I'm pretty confident that they're in the same range as what he makes at his day job as editor of Slate. (I'm assuming that the royalties go to Weisberg as author, and not to Slate as the magazine where the Bushisms were originally published -- the copyright pages read "Copyright 200X by Jacob Weisberg").

There's nothing wrong with this in general -- Americans have been excoriating their leaders since the republic was founded, and a good thing, too. We can disagree about the principles and the details involved, and that's also as it should be. And when someone writes books that others want to buy, (s)he makes money, and that's likewise fitting and proper.

But isn't there something wrong when a magazine editor, whose job is making judgments about what is and is not worthy of publication, makes much of his income from re-publication of collections of a feature whose instances are so often so spectacularly superfluous? Does anyone think that Jacob Weisberg would consider very many of these "Bushisms" worth the space in his (excellent) magazine and the attention of his readers (who include me), if he wasn't making money from George W. Bushisms, Still More George W. Bushisms, ..., George W. Bushisms V, Bushisms 2006 Day to Day Calendar, etc.; and if he didn't foresee the need to fill the pages of George W. Bushisms VI, the Bushisms 2008 Day to Day Calendar, and on and on; and if he didn't have a personal financial motivation for keeping the Bushisms brand and the Bushisms product line in the public eye?

As journalistic conflicts of interest go, I guess this is a venial one. It's not like the DNC is slipping envelopes of cash to Weisberg to reward him for making fun of the president. (Instead, Simon & Schuster is sending him quarterly royalty checks to reward him for making fun of the president.) But you no longer need to wonder why in the world a series of fluent and sensible statements by W, which would never be noticed if anyone else produced them, are routinely displayed on Slate as Bushisms. Just follow the money.

Some posts on related topics:

"'Too much of a coincidence to be a coincidence'" (10/10/2003)
"You say Nevada, I say Nevahda" (1/3/2004)
"Non-Bushism of the day" (1/27/2004)
"Sauce for the gander" (4/22/2004)
"Weisbergism of the week" (4/27/2004)
"A CNN-ism" (6/18/2004)
"Two paradigms of eloquence" (7/26/2004)
"Gibson Scores a 'Bushism', with an assist to Kerry" (10/9/2004)
"Beware linguistic and political stereotypes" (10/12/2004)
"Ceci n'est pas un Bushism" (10/15/2004)
"Wilgoren invents a trend" (10/25/2004)
"Fasten = Grecian?" (5/18/2005)
"Quotes from journalistic sources: unsafe at any speed" (7/9/2005)
"And they're just as ignorant as it used to do" (19/19/2005)
"Don't read it as something more than it's not" (10/29/2005)
"Has George W. Bush become more disfluent?" (11/17/2005)
"Trends in presidential disfluency" (11/26/2005)
"Trembling to be wrong" (12/20/2005)
"People that would do ourselves harm" (1/13/2006)
"Chinian, not Chinese?" (1/26/2006)

Posted by Mark Liberman at 08:53 AM

Now you hear it...

NPR and The New York Times have recently noted the appearance of a high-pitched 'adult-proof' cellphone ring tone. The idea is that kids would employ this ring tone to indicate that a text-message has arrived, in situations like high-school classes where text-messaging is frowned upon. Since the ability to hear high-frequency sounds declines with age, the teachers can't hear the ring tones. Or so the story goes.

Of course, being 43, an age where one is just becoming a tad defensive about any signs of such age-related deficits, I lost no time in downloading the NPR ring tone mp3 and you will all be happy to know that (for now) I can hear it just fine.

The ring tone at the NPR site is a relatively pure tone at 15 kHz (that's 15 kilohertz, which is pretty darn high). The ability to hear of course depends on how loud the sound is; this ring tone was at 85 dB. Sure enough, the audiology literature (Sakamoto et al. 1998) suggests that your average 40-49 year old can hear 15 kHz just fine. The threshold of audibility for people of my generation (or anyhow my decade) for a 15 kHz signal is, on average, 90 dB, plus or minus about 10 dB.

Now I gather that the ring tone was an offshoot of an earlier product, the Mosquito, a loud pulsed high-frequency sound designed to keep teenagers from loitering around convenience stores. NPR reported that sound as being at 17 kHz. 17 kHz is REALLY high. Sakamoto et al. show that the average 10-19 year old can hear 17 kHz at 80 dB. I can hear 17 kHz, barely, at 90 dB or more. Sakamoto et al. suggest that the average person in my decade can only hear 17 kHz if it is as loud as 120 dB or so, which suggests that playing the drums can't be as bad for one's hearing as my parents thought.
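If you'd rather test yourself with a tone you made than with a downloaded file, a pure sine tone is easy to synthesize. What follows is only a minimal sketch using the Python standard library; the function name and its defaults (one second, 44.1 kHz sampling, half-scale amplitude) are assumptions chosen for illustration, and it makes no claim about how the NPR file or the Mosquito signal was produced. Note also that the loudness in dB at your ear depends on your playback hardware and volume setting, not on anything in the file, and that small speakers may reproduce 15-17 kHz poorly.

```python
import io
import math
import struct
import wave


def tone_wav_bytes(freq_hz, seconds=1.0, rate=44100, amplitude=0.5):
    """Return a mono 16-bit WAV file (as bytes) holding a pure sine tone."""
    n_samples = int(seconds * rate)
    # Each sample is one point on the sine wave, scaled to the 16-bit range.
    frames = b"".join(
        struct.pack(
            "<h",
            int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate)),
        )
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes = 16 bits per sample
        w.setframerate(rate)   # must exceed twice the tone frequency
        w.writeframes(frames)
    return buf.getvalue()


# The two frequencies mentioned in the post.
tone_15k = tone_wav_bytes(15000)
tone_17k = tone_wav_bytes(17000)
```

Writing `tone_15k` to a file opened in binary mode gives a WAV file that most players will accept. The 44.1 kHz sampling rate matters: a tone can only be represented if the rate is more than twice its frequency, which holds for both 15 kHz and 17 kHz here.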

Anyhow, this particular issue with technology has not yet arisen in my classroom, although like most college professors these days I do find myself challenged to keep the undergraduates paying attention instead of reading their email. One more thing to keep us on our toes.

Posted by Dan Jurafsky at 03:25 AM

Extraterrestrial Soccer Report

From the Language Log Sportsdesk

The world may be about to end. But before I get to that, here's some advice. If you're a World Cup goalie, don't worry about fizzing balls. They always miss. Don't ask me why. Maybe it's the fizz.

Here's some of the latest World Cup action:

Paraguay 2 - 0 Trinidad and Tobago
Trinidad started the second half much livelier, with Carlos Edwards fizzing a fine cross right across the goalmouth. It begged to be fired home but was not.
Sydney Morning Herald June 22, 2006

Holland 0 - 0 Argentina
Riquelme sent a shot fizzing just wide.
Guardian June 21, 2006

England 2 - 2 Sweden
Twice in the first half, Lampard made perfectly-timed runs at the heart of the Swedish defence. The first ended with a glanced header wide and the second culminated in a fizzing 20-yard strike that whistled narrowly over the crossbar.
The Northern Echo June 21, 2006

Italy 1 - 1 USA
Two minutes later Dempsey came closest, with his 30-yard strike fizzing inches wide of Italian goalkeeper Gianluigi Buffon's right post.
Belfast Telegraph June 18, 2006

Argentina 6 - 0 Serbia and Montenegro
There was no let up for the Serbs after the break as Argentina continued to strut their stuff. Crespo almost added his name to the score sheet when he sent in a fizzing shot which Jevric was unable to hold.
World Soccer June 16, 2006

Brazil 2 - 0 Australia
Ronaldo's first good goal chance came on 28 minutes when he saw the whites of Mark Schwarzer's eyes but as he pulled the trigger Australian defender Craig Moore got across to make a timely block. Just before half-time he sent another shot fizzing past the post.
Unison.ie Sports Desk June 18, 2006

In a Google News search, I found exactly 12 fizzing goal attempts (if you include that Trinidad "cross"). They all missed. The only on-target fizzing goal I found in current Google News was hypothetical, in a report in The Times about goalkeepers who've been fretting about the new lighter World Cup football:

Robinson primed to deal with curve ball
Robinson's complaint has been echoed by Jens Lehmann, the Arsenal and Germany goalkeeper. Beckham or Roberto Carlos curling a fizzing shot into the top corner is one thing, but Robinson claims that even England's yeomen have suddenly been transformed into free-kick specialists.
Times Online June 21, 2006

Robinson and Lehmann needn't worry about those fizzing shots. As soon as they hear fizz, they know they should just leave the ball well alone. Touching it will almost certainly lead to a corner, whereas if they casually put on the shades they have in their back pockets and light up a cigarette, they're guaranteed (or their Language Log subscriptions will be returned in full) a goal kick after the wildly spinning ball passes harmlessly inches past the post. It's a bit like those Doodlebugs the Germans sent over London: if you heard the sound of one, you knew you were safe. Oops. Don't mention the war.

Mind you, in a search not restricted to current news, I did find evidence that fizzing balls can get into the goal, especially in hockey. In fact, 2 out of 2 fizzing hockey goal attempts I found were on target:

Columbians 3 - 0 Beavers
Guernsey Hockey Cup Probably the best goal of the day was the last goal of the day. Cairns beat two defenders at the top of the D before fizzing a reverse stick drive into the corner of the goal.
BBC Guernsey Website March 23, 2006

Luton Town 3 - 1 Harpenden
The talented Luton side were not to be overawed however and quickly hit back with a goal from a short corner, the fizzing flick into the top corner leaving keeper Ben Brind with little chance.
Harpenden Hockey Club Feb 11, 2006

Hmm. This is either an effect of the stick, maybe even with the fizzing describing the stick action rather than ball motion, or else lower quality journos who don't get the whole fizzing thing. You know, like all this fizzing maybe has something to do with language? Ahem.

Now, the latest extraterrestrial science, from today's CNN: Earth surrounded by giant fizzy bubbles. Apparently, "the space above you is fizzing" and bubbles of superhot gas are being observed popping all around the Earth. The implications are clear. An intergalactic superspecies is playing target practice. And here is the bad news - remember you read it first on Language Log - we are the target!!! It's an intergalactic version of soccer, obviously: who would believe in aliens playing hockey? But the good news is that these guys, though slimy, green, multi-eyed, in possession of ultra-long-distance magnetic superheated plasma technologies of which we can only dream, and determined, lizard brains overwhelmed by the furious grip of football fever, that if they can't win the World Cup there shall be no World, can't shoot to save their sorry asses. Too flashy by half. As long as they keep fizzing the ball, we have absolutely nothing to worry about.

Posted by David Beaver at 03:10 AM

Time after time after time...

The Oxford English Corpus, a lexicographical research project on 21st-century English, has generated a surprising amount of copy for news organizations lately. First it was announced that the corpus had surpassed 1 billion words, though unfortunately the accompanying Associated Press article ran under the risible headline, "English Language Hits 1 Billion Words." Then it was revealed that the OEC is chock full (chalk full?) of pernicious eggcorns. Now the AP brings the latest news: the most frequently occurring noun in the OEC is time. The wire story (which was considered newsworthy enough to be reproduced by CNN, the LA Times, the Washington Post, the Guardian, and a host of other media outlets) begins as follows:

For those who think the world is obsessed with "time," an Oxford dictionary added support to the theory Thursday in announcing that the word is the most often used noun in the English language.

The AP writer apparently followed the lead of the press release out of Oxford, which reads:

It's official: we're a nation ruled by time

We like to be punctual, we expect our trains to run to schedule, and many of us spend our working day watching the clock. Now the new revised eleventh edition of the Concise Oxford English Dictionary can officially confirm that we are indeed ruled by time. Drawing on evidence from the Oxford English Corpus, the word time comes top in the list of commonest nouns in the English language, with year (3rd), day (5th) and week (17th) not far behind.

So is the ranking of time as the top noun really indicative of anything about the way we live in the 21st century? Well, first of all, who are "we"? The AP writer refers to "the world," apparently under the belief that the English language is now completely universal, while the Oxford press release writer more modestly refers to the "nation" (i.e., the United Kingdom), pitching the story to a domestic audience. The truth, of course, lies somewhere in between, as the sources used for the OEC are neither restricted to the UK nor open to non-English texts. The coverage is intended as a representative snapshot of the world's Anglophones, with US and British English dominating (together accounting for 80 percent of all texts) and the remainder devoted to a variety of world Englishes (Australian, South African, Canadian, Caribbean, Indian, Singaporean, etc.).

Restricting the discussion to the Anglosphere then, do the OEC findings imply that we are "obsessed with" or "ruled by" time? I'm not convinced. This seems like a kind of pop-Whorfianism not too far removed from the old "Eskimo words for snow" meme. But at least frequency data from a broad corpus of texts should be a little more telling than a simple count of words in the lexicon, right? Depends on what you want the data to be telling you. For instance, though time is the top-ranked noun in the OEC's list of most common words, overall it only ranks #55. The top spots are dominated by boring words like the, to, of, and, a, and in, which are hardly conducive to lively PR copy. There are also several verbs that rank ahead of time on the list: be (#2), have (#9), do (#19), say (#28), get (#47), go (#49), and make (#52). Does this mean we English speakers are obsessed with being, having, doing, saying, getting, going, and making?

Nouns are easier to get a hold on, even abstract ones like time, so it's not surprising that other parts of speech got neglected by the copywriters. (Imagine the headline: "We definitely love definite articles! The triumphs over a!")  But even just considering nouns, I fail to find anything particularly newsworthy about time being ranked the most frequent. Taking a look at other extensive English corpora collected in the past, we can see that the dominance of time is nothing new. The British National Corpus, compiled in the early 1990s, had time as the top noun, according to the frequency lists published in a book based on BNC data (Word Frequencies in Written and Spoken English). And we can go all the way back to the Brown Corpus of Standard American English, a million-word corpus derived from texts printed in 1961. Time was the top noun back then too (according to this list), coming in at #66 overall.

So has time at least jumped in the standings, if it's at #55 in 21st-century texts? Probably not. The OEC uses slightly different coding standards from the Brown Corpus, choosing to merge items that differ by number or tense. So, for instance, the Brown Corpus separately ranks is (#8), was (#9), be (#17), are (#24), and were (#34), while the OEC lumps them all together under be, bringing that item all the way up to second place. Similarly, say, says, and said are all merged together under say, which stands at #28 on the OEC list; the same goes for have, has, and had. Once all of those mergings are accounted for, time ends up ranking just about where it was back in 1961.
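
For readers who want to see how merging inflected forms can shift a word's rank without any change in how often it is used, here is a minimal Python sketch. The counts are invented for illustration and come from neither the Brown Corpus nor the OEC, and the lemma table and function names are hypothetical; this is not how either corpus is actually processed.

```python
from collections import Counter

# Hypothetical lemma table: inflected form -> headword.
LEMMAS = {
    "is": "be", "was": "be", "are": "be", "were": "be",
    "says": "say", "said": "say",
    "has": "have", "had": "have",
}

def rank_words(counts):
    """Return {word: rank}, where rank 1 is the most frequent word."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: i + 1 for i, (word, _) in enumerate(ordered)}

def merge_by_lemma(counts, lemmas=LEMMAS):
    """Add each form's count to its headword's total."""
    merged = Counter()
    for word, n in counts.items():
        merged[lemmas.get(word, word)] += n
    return merged

# Toy counts, invented for illustration only.
toy_counts = Counter({
    "the": 200, "is": 30, "was": 28, "be": 20, "are": 15, "were": 10,
    "said": 12, "say": 6, "says": 5, "time": 11,
})

unmerged_rank = rank_words(toy_counts)["time"]
merged_rank = rank_words(merge_by_lemma(toy_counts))["time"]
print(unmerged_rank, merged_rank)
```

In this toy list, time has the same count both times, yet it climbs several places once the forms of be and say are collapsed into single entries. That is the kind of bookkeeping difference that has to be accounted for before comparing a rank from one corpus with a rank from another.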

This is not to say that the OEC is just a retread of old corpora. From what I've seen, it has already yielded fascinating new findings on a range of research topics. For instance, the OEC website gives some glimpses into the collocational data for the corpus, which could be immensely valuable for the study of snowclones and other phrasal patterns. Here's a sampling:

The idea of one's 'inner child', popularized in psychotherapy in the 1980s, has spawned an array of humorous variations. These illustrate the way that language is routinely exploited and extended, not as part of a literary endeavour but simply as part of normal creativity in language use. In the Oxford English Corpus the most common of these are (in order):

  • inner geek
  • inner nerd
  • inner diva
  • inner dweeb
  • inner slut
  • inner cynic
  • inner hippie
  • inner brat

Or how about this insight into the productivity of the suffix -fest?

The most common uses of -fest are: slugfest, lovefest, gabfest, crapfest, talkfest, gorefest, snoozefest, hatefest, bitchfest, snorefest, geekfest, gabfest, bloodfest, blogfest, songfest, shitfest, screamfest, filmfest, yawnfest, funfest, sobfest, plugfest, mudfest, fragfest, and suckfest.

Surely the rise of formations such as inner slut and suckfest would make for far more interesting reading than a story about the frequency of the word time? I await the next round of reporting on the corpus, which I hope my inner cynic will not consider a snoozefest.

[One final caveat about word frequency lists. If you see a list of "the most common nouns of English" — say, on Wikipedia — and that list finds room for such words as colony, continent, and slave, but not for way, thing, or life (all among the OEC's top ten nouns), be very, very skeptical. That list is evidently derived from one attributed to Jerry Jones on esl.about.com, which, to its credit, actually does show way, thing, and life in its top 250 overall. But Jones' corpus-gathering techniques are still highly suspect, since his list contains such oddities as hot at #30 (#776 in the Brown Corpus). The Jones list also has unusually high rankings for word, write, sentence, and spell, which suggests that the corpus leans heavily on ESL texts and the like.]

[Update: The Wikipedians are on the case, as that page of "most common nouns in English" has disappeared — the link now redirects you to a page of "most common words in English" based on OEC data. The old list of nouns to which I referred is still visible here.]

Posted by Benjamin Zimmer at 01:17 AM

June 21, 2006

Vincius vincit

[A guest post by Stefano Taschini.]

As a side note to Geoff Nunberg's "Da Bomb", I'd like to point out that Italian surnames were formalized with the Council of Trento, which ended in 1563. For somebody born after that, as is the case for the other people mentioned in the post, there is no doubt that the "Da X" is the family name, and referring to them as Da Pos, Dall'Oca and so on is perfectly natural, whatever the etymology.

Until the end of WWII, people with noble titles were usually referred to by their estate. Camillo Benso Conte di Cavour (last Prime Minister of the Kingdom of Sardinia and first PM of the Kingdom of Italy) is usually referred to as Cavour, without the "di" preposition. Even with the Republic, the same scheme is followed for compound family names in which the second part is reminiscent of an estate: Luca Cordero di Montezemolo, chairman of Fiat and Ferrari, is usually referred to as Montezemolo, again without preposition.

People born before the Council of Trento did not necessarily have a family name. Earlier on, only prominent families had one, but by the low Middle Ages they were already rather common. Still, people born before the Council are primarily referred to by their first name even when they have a family name: Raffaello (Santi), Tiziano (Vecelli), Michelangelo (Buonarroti), consistently with the usage at the time. As already stated many times on the Language Log, Leonardo didn't have a family name and his birthplace, Vinci, was used to identify him.

Surnames were often used in Latinized form (Santium, Vecellium). Lionardo (with "i") da Vinci was Latinized as Leonardus Vincius, and Vincius alone appears in a few Latin epigrams. One epigram is found in Vasari's "Lives of the most excellent painters, sculptors and architects":

Pinxit Virgilius Neptunum, pinxit Homerus
dum maris undisoni per vada flectit equos.
Mente quidem vates illum conspexit uterque
Vincius ast oculis, iureque vincit eos.

"Rightly, Da Vinci beats Virgil and Homer," according to the epigram's author.

Posted by Mark Liberman at 01:14 PM

Da Bomb

I haven't actually read The da Vinci Code -- well, no, lose the "actually" -- but I'm amply persuaded by Geoff P's catalogue of Dan Brown's stylistic vices, not to mention the complaints of any number of critics, that the book is chock-a-block with literary and linguistic howlers. On one count, though, Brown has gotten a bad rap. In enumerating the book's lapses, as Mark noted a while ago, critic after critic has remarked that the title itself is in error -- referring to the artist as "da Vinci," they say, is like referring to Jesus as "of Nazareth" or to William as "of Orange" or to Lawrence as "of Arabia," or whatever. Mark traced that comparison through its various avatars in pieces by Adam Gopnik, Jay Nordlinger, Charles Moore, and various bloggers, most of them giving due credit for the quip to other sources, though Mark Steyn simply stuck it in his lapel uncredited, as if it were a flower he'd plucked from his own garden.

But whatever Brown's other derelictions, the comparison of da Vinci to "of Nazareth" and the like isn't apt. As solecisms go, referring to Leonardo as "da Vinci" is a pretty venial error in Italian, and to call it an error at all in English is simply misplaced pedantry.

True, da Vinci means "from Vinci," and Italians generally refer to the artist simply as Leonardo, or a bit more pompously, as "il Leonardo." Still, even literate Italians use da Vinci by itself on occasion. On the same day a few weeks ago that I was reading Mark's post (from Rome, as it happens), I ran across Beppe Severgnini's column in Corriere della Sera, which said:

Mi sembra quindi che Da Vinci abbia voluto esprimere nel modo più forte possibile il concetto nella divinità di Gesù. "So it seems to me that Da Vinci wanted to express the concept of Jesus' divinity in the strongest possible way."

And a recent story about The da Vinci Code from the Italian news service ANSA, for example, contains the sentence:

Per esempio racconta che da Vinci, nella sua celeberrima "Ultima Cena", essendo a conoscenza della "vera" storia di Cristo, avrebbe dipinto, a fianco di Gesù, Maddalena più o meno camuffata. "For example he tells how da Vinci, in his celebrated 'Last Supper,' aware of the 'true' story of Christ, painted Mary Magdalene next to Jesus more or less in disguise."

A Nexis search turns up other examples from the Italian press. And there's scarcely a major Italian city that doesn't have an Albergo da Vinci or a Ristorante da Vinci. So if the usage is a solecism, it's hardly on the order of referring to Jesus as "of Nazareth," which no English-speaker, literate or no, would ever think of doing.

If many Italians are tempted to refer to Leonardo simply as da Vinci, in fact, it's because the rule here is by no means as obvious or cut-and-dried as Brown's critics make it out to be. There are plenty of people who are routinely referred to as "da X" in Italian, either because the connection of X with a place name is obscured, or because the da-phrase is felt to have become a family name. Italians are comfortable using the "da X" form by itself when talking about the writer Valerio da Pos or the painters Angelo Dall'Oca Bianca and Antonello da Saliba, for example.

But you can hardly expect English-speakers to consult an inner Italian gazetteer when deciding whether it's acceptable to use "da X" alone in referring to a person. And in fact the English writer who chooses to refer to Leonardo simply as da Vinci can point to some pretty authoritative precedents. In volume I of his classic Renaissance in Italy, the Victorian critic and poet John Addington Symonds wrote:

The April freshness of Giotto, the piety of Fra Angelico, the virginal purity of the young Raphael, the sweet gravity of John Bellini, the philosophic depth of Da Vinci, the sublime elevation of Michael Angelo, the suavity of Fra Bartolommeo, the delicacy of the Della Robbia, the restrained fervor of Rosellini, the rapture of the Sienese and the reverence of the Umbrian masters, Francia's pathos, Mantegna's dignity, and Luini's divine simplicity, were qualities which belonged not only to these artists but also to the people of Italy from whom they sprang.

The 1876 catalogue of the Corcoran Gallery of Art spoke of "that galaxy of genius whereof Ghiberti, Da Vinci, Angelo, and Raphael were fixed stars." "In such a man as da Vinci," wrote the redoubtable Monroe Beardsley in his 1966 Aesthetics from Classical Greece to the Present. And there are other references to Leonardo as da Vinci in the Columbia Encyclopedia and a 1985 article in The New York Times by Jan Morris, who wrote:

Great minds have been fostered entirely by staying close to home. Moses never got further than the Promised Land. Da Vinci and Beethoven never left Europe.

So the objections to Brown's title and the analogies to "of Nazareth" and the like are a kind of bungled pedantry -- they manage to betray not just a limited knowledge of Italian, but an unfamiliarity with the English-language critical tradition. Or at least I assume that Nordlinger and Steyn wouldn't have dismissed the da Vinci usage with such airy condescension if they'd encountered it in Symonds or Beardsley. Brown's critics should have left bad enough alone.

[Ben Zimmer drew my attention to a post at Tensor that provided some other defenses of the "da Vinci" usage.]

Posted by Geoff Nunberg at 03:04 AM

June 20, 2006

A stricter prescriptivism

Here's a piece of mail we recently received at Language Log Plaza, from a correspondent who shall remain nameless so as not to inflame the ire of his already ireful boss:

I'm a first year reporter at a small town daily newspaper. I recently filed a story regarding a town's burning ordinance. In the story, I wrote that the town's burning ordinance was stricter than the state law. My editor read the story, and he was quick to point out that stricter isn't a word.

It is a word. Yet, my editor does not think so. At some point in his education, I assume, he was told that more strict and most strict are preferred to stricter and strictest.

Any idea from where this rule derives? Did Strunk and White advise more strict? Is this standard usage? Would the strictest of the strict prescriptivists tell me I am wrong and my editor is correct?

After conducting a quick poll of fellow Language Loggers, I can report a clear consensus: not only is the editor mistaken about the unacceptability of stricter, he's also flat-out nuts.

Okay, perhaps he's not nuts, but editorial power has clearly gone to his head, leading him down the road of stylistic tyranny. What possible objection could the editor have to stricter, to the extent that he would deem it "not a word"? It's a simple enough matter to find examples of its use by Shakespeare, Hawthorne, Austen, Hardy, Dickens, and any number of other notable English writers (see the list at the end of the post). And a moment on Google News or Yahoo News will find many thousands of hits for stricter in text recently produced by journalistic institutions near and far.

As readers of the New York Times now know, we here at Language Log enjoy "coming down hard on rules that ignore linguistic facts," as Michael Erard put it. And an arbitrary ruling against the word stricter is just about as ignorant of linguistic facts as it gets. But we also seek to understand the basis for even the most capricious fiat about language. Perceptions about "proper" usage, no matter how misguided, can still tell us a great deal about how we seek to structure our linguistic consciousness.

In the case of stricter, it's helpful to return to Geoff Pullum's Feb. 2004 post about another contentious comparative, wronger. As Geoff writes, we usually have no problem inflecting monosyllabic adjectives with the comparative suffix -er or the superlative suffix -est. Nonetheless, there are certain monosyllabic adjectives that never take a comparative or superlative inflection (like, loathe, worth), and others that rarely do (cross, ill, real, fake, wrong). But we could consider a third class of monosyllables that are deemed improper by some and not by others. It's usually adjectives of two syllables that elicit these grey-area judgments (e.g., often, common, pleasant), but there could very well be some monosyllabic adjectives that also fit the bill.

And it turns out that the dictatorial editor isn't the only person who has a problem with stricter. When a question came up a few years ago on the Usenet newsgroup alt.english.usage over whether to use stricter or more strict in a particular sentence, Carter Jefferson wrote:

I can't think of a rule that would preclude the use of either. I suspect I used "more strict" to give the sentence better rhythm. Also, "strict" is one of those words that for some reason sounds complete in itself (to me). Using the comparative seems to me to weaken it. I have absolutely no rationale for this.
Also, some words simply don't sound right in the comparative. Nobody says "beautifuler" except babies or, possibly, someone else just beginning to learn the language. "Stricter" isn't as bad as that to me, but I just don't like the sound.

At least Mr. Jefferson acknowledges having "absolutely no rationale" for preferring more strict beyond a vague dislike for the sound of stricter. But beautifuler is not a good point of comparison, since three-syllable adjectives hardly ever take -er or -est. (Alice voiced the rare exception: "Curiouser and curiouser!") In another post on alt.english.usage, Alan Jones declared a preference for stricter but said he had noticed a rise in the usage of more strict in British English:

The use of the suffixes -er and -est is certainly standard in BrE for "strict", as for most one- and two- syllable adjectives. The "more strict" form has become commoner [sic] in my lifetime (some 65 years of reading and writing English, almost 40 of them teaching it) but seems to me generally awkward and ugly.

From preliminary database investigation, I can discern no rise in more strict at the expense of stricter in recent decades. Stricter continues to far outpace more strict, though as with other monosyllabic adjectives more strict is often used for emphasis — or when strict appears in conjunction with another adjective, as in "I am more strict and formal than you." (That's a line from Hardy's Jude the Obscure; further literary examples of more strict can be found below.)

Still, stricter seems to elicit a low-level sense of linguistic anxiety for some speakers. Occasionally this is evident from hedging tactics in casual online texts:

Now, he's in CORPORATE, hehe, so he has alot more responsibilities and a much stricter/more strict?(whatever) schedule to keep. (hess is jess)

I started with light exercise and minor diet changes and worked my way into tougher routines and stricter (more strict??) eating habits. (alt.support.diet)

Why should the comparative inflection for strict, as opposed to any other one-syllable adjective, sound "wrong" to certain speakers of English, so much so that an editor would go so far as to redact it? My working theory is that the final consonant cluster of strict creates a problem. The cluster /-kt/ is relatively unusual for short adjectives — except for participial adjectives like packed or locked, which, of course, can't take -er/-est. The only other "bare" (monomorphemic) adjectives ending in /-kt/ that I can think of all have at least two syllables, such as exact, intact, correct, erect, and abject. And those are all in the grey area for inflectability (for instance, exacter sounds okay to me, but not exactest).

So it seems that strict lacks similar sounding forms that could provide an analogical incentive for accepting the comparative/superlative inflections. In fact, past-participial forms like packed and locked might serve as an analogical disincentive, since we know that they're always uninflectable. (Well, almost always — you can find the occasional exception like damnedest/darnedest, though the comparative damneder/darneder is quite rare indeed.)

For this theory to have any plausibility, we should expect similar problems with other monosyllabic adjectives that have final consonant clusters associated with uninflectable past participles. One such adjective that fits the bill is vast, and sure enough vaster can create similar reactions of uncertainty or even prescriptivist backlash:

The majority of users are naive, and therefore do not want a "much vaster functionality"
(grammar check, much more vast? Not sure, it just sounds funny)... (alt.destroy.microsoft)

> and there is a reality of our souls that is far vaster [...]
Grammar Alert: that should be "far more vast." (alt.tv.mst3k)

>> > Are you saying that the common knowledge of Westerners is vaster than
>> > that of the Japanese?
>> I think it's 'more vast than that of...'
> Not sure.  Is "vaster" incorrect?  My spell checker didn't catch it.
It probably looks right to a spell-checker, but vaster seems an odd usage. (soc.culture.japan)

As your resources are certainly vaster (more vast?) than mine... (Take Our Word For It)

So the final consonant cluster /-st/, like /-kt/, might make vast problematic for some speakers who associate it with uninflectable forms like passed. But wait... what about fast? No one ever complains about faster or fastest. In the case of fast, sheer frequency of use outweighs any associations that the final consonant cluster /-st/ might have. We've heard faster and fastest from very early on in our linguistic development, whereas vaster and vastest occur far less frequently in everyday usage.

Another potential case is fond, with its final cluster of /-nd/ perhaps reminiscent of past participles ending in -(n)ned. Fond is not as common as fast, but the comparative and superlative inflections are familiar enough from set expressions like "Absence makes the heart grow fonder," "Fondest memories/wishes/regards," etc. And yet even fonder can occasionally lead to some hedging:

This way people would be fonder (more fond?) of an reunion, if there's going to be one. (alt.tv.friends)

The sporadic uncertainties over stricter, vaster, and fonder, as with the more obvious unacceptability of wronger, may defy any simple rule-based explanation. Historically speaking, comparative and superlative inflections are an odd vestige in English, since other types of adjectival inflection (for gender and number) disappeared from the language long ago. Hence we make our acceptability judgments using probabilistic rules of thumb (one syllable: yes; two syllables: maybe; three or more syllables: no), but in general we learn what adjectives we can inflect on a lexical, or word-by-word, basis. In the face of such inexactitude, it's no wonder that the prescriptively minded among us react by laying down the law, even when that law is stricter (more strict?) than reality would dictate.
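
The interplay between the syllable-count rule of thumb and word-by-word knowledge can be sketched in a few lines of Python. This is only a toy under loud assumptions: the syllable counter is a crude vowel-run estimate of my own, the function names are hypothetical, and the verdict labels ("yes", "maybe", "rarely", "no") are shorthand for the judgments described above, not a claim about how speakers actually compute them.

```python
import re

# Word-by-word knowledge overrides the syllable rule. The adjectives are
# the ones named in this post; the verdict labels are illustrative.
LEXICAL_VERDICTS = {
    "like": "no", "loathe": "no", "worth": "no",
    "cross": "rarely", "ill": "rarely", "real": "rarely",
    "fake": "rarely", "wrong": "rarely",
}

def count_syllables(word):
    """Crude estimate: count runs of vowel letters, minus a final silent e."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if n > 1 and word.endswith("e") and not word.endswith("le"):
        n -= 1
    return max(n, 1)

def takes_er(adjective):
    """Guess whether an adjective accepts -er/-est."""
    adjective = adjective.lower()
    if adjective in LEXICAL_VERDICTS:   # lexical knowledge comes first
        return LEXICAL_VERDICTS[adjective]
    n = count_syllables(adjective)
    if n == 1:
        return "yes"
    if n == 2:
        return "maybe"
    return "no"

for adj in ["strict", "vast", "common", "beautiful", "wrong"]:
    print(adj, takes_er(adj))
```

The sketch says "yes" to strict and vast without hesitation, which is exactly where it parts company with the hedging speakers quoted above: their unease would have to be entered by hand in the lexical table, one word at a time. The vowel-run counter is also easy to fool (it treats curious as two syllables), so it should not be mistaken for a real syllabifier.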

As promised, here are ten examples of stricter from notable literary works:

Take no stricter render of me than my all. (William Shakespeare, Cymbeline)

This friend was the gamekeeper, a fellow of a loose kind of disposition, and who was thought not to entertain much stricter notions concerning the difference of meum and tuum than the young gentleman himself. (Henry Fielding, Tom Jones)

We are not wont to show an idle courtesy to that sex, which requireth the stricter discipline. (Nathaniel Hawthorne, Twice-Told Tales)

Mrs. Norris...consoled herself for the loss of her husband by considering that she could do very well without him; and for her reduction of income by the evident necessity of stricter economy. (Jane Austen, Mansfield Park)

Madame Poincon, who was stricter in some things even than you are, used to wear ornaments. (George Eliot, Middlemarch)

Such an arrangement being in stricter conformity with the absolute wording of old Hiram's will. (Anthony Trollope, The Warden)

"I must be a little stricter than that," he said. (Thomas Hardy, The Mayor of Casterbridge)

Having given this explanation, Mrs Squeers put her head into the closet and instituted a stricter search after the spoon, in which Mr Squeers assisted. (Charles Dickens, Nicholas Nickleby)

Is an illicit affair like a gambling debt — demands stricter honor than the legitimate debt of matrimony, because it's not legally enforced? (Sinclair Lewis, Main Street)

The stricter her honesty, the greater the fraud she would be asked to suffer at their hands. (Ayn Rand, Atlas Shrugged)

And for balance, ten examples of more strict:

Yes, truly; I speak not as desiring more, but rather wishing a more strict restraint upon the sisterhood, the votarists of Saint Clare. (William Shakespeare, Measure for Measure)

And yet I know not, on a more strict examination into the matter, why we should be more surprised to see Greatness of Mind discover itself in one Degree, or Rank of Life, than in another. (Henry Fielding, Amelia)

"They claim," said the clergyman, "to represent the more strict and severe Presbyterians." (Sir Walter Scott, Waverley)

And, after all, the idea may have been no dream, but rather a poet's reminiscence of a period when man's affinity with nature was more strict, and his fellowship with every living thing more intimate and dear. (Nathaniel Hawthorne, The Marble Faun)

If we would make more strict inquiry concerning its origin, we find ourselves rapidly approaching the inner boundaries of thought. (Ralph Waldo Emerson, Lecture on the Times)

No widow, since the seclusion of widows was first ordained, has been more strict in maintaining the restraints of widowhood, as enjoined. (Anthony Trollope, The Prime Minister)

I am more strict and formal than you, if it comes to that. (Thomas Hardy, Jude the Obscure)

He, who gambled away tens of thousands at one roll of the dice and laughed at it, became more strict and more petty in his business, occasionally dreaming at night about money! (Hermann Hesse, Siddhartha)

Mr. Clutter may have been more strict about some things — religion, and so on — but he never tried to make you feel he was right and you were wrong. (Truman Capote, In Cold Blood)

He was just a touch more strict with us than ever. (Robert Heinlein, Starship Troopers)

[John Cowan writes:

Do you really have strong intuitions about these adjectives never taking comparative and superlative inflections? I don't have access to the OED, but NID3 shows worth (adj) as entirely obsolete, loath(e) only in frames like "COP loath(e) to INF", and the examples for "like" seem rather archaic to me -- some would be "alike", others "likely" in more modern expressions.

Indeed, it is precisely because these words survive as adjectives only in very circumscribed contexts — "of like mind," "loath(e) to (do something)," "for all he is worth" — that the comparative and superlative inflections no longer work for them. The chapter on inflections in CGEL should provide further enlightenment on this point.

And Mark Etherton writes:

In your recent post on Language Log you say that "Alice voiced the rare exception to the rule that three syllable adjectives hardly ever take -er or -est". The relevant sentence in Alice's Adventures in Wonderland begins:

'Curiouser and curiouser!' cried Alice (she was so much surprised, that for the moment she quite forgot how to speak good English);
It seems to me that for Carroll - and I suspect for most native speakers - 'curious' is not really an exception to the rule: the only time it takes -er is when there is an at least implicit reference to Alice.

This is quite true. I should have said that the curiouser exception only holds for Alice and those who want to talk like her.]

Posted by Benjamin Zimmer at 09:39 PM

Speak we proper English?

"What's this with people using IMPACT as a verb?" "But it has to be BILLY AND I WENT TO THE STORE because I is a subject." "Kids are saying 'I'M ALL, YOU CAN'T COME WITH ME.' What's up with that?" That grand old idea: there is some logically "correct" English that for some reason most people can't quite pull off, like they don't floss enough. But one thing people miss is what an arbitrary bastard patchwork of a mess English actually is, and not just because (ho-hum) it has a bunch of French and Latin words in it.

That part, after all, is easy. Everybody knows and accepts that language changes in terms of the fact that words come and go. That seems natural, since life changes. And no problem that the words come from other languages sometimes; that's natural because people mix. Plus it's hard to mind having taken on so many words from romantic French and noble old Latin.

People get itchy with the idea that GRAMMAR changes. But the very birth of Modern English, the language we are now taught is "proper," entailed the complete thrashing of an earlier English. If English had developed according to its own devices, then today we English speakers would find German pretty easy to pick up, just as Dutch speakers do. Instead, we speak a deeply odd singleton of a tongue.

For example, when's the last time you learned a language other than English that uses DO to form questions and make sentences negative? Think about it. "DO you like fish?" "I DO not like fish. " As languages go, this is really, really weird. Of course you might not think so if you were one of the 7000 people who speak, say, Nanai way, way, way down East in Siberia. But otherwise, some of the very few languages on the planet where people use DO this way are, as it happens, Welsh, Breton, and Cornish. These are Celtic languages spoken right alongside English for 1500 years-plus.

For centuries after (Old) English speakers came to Britain, the original inhabitants commonly spoke both a Celtic language and English. Increasing numbers of linguists are arguing that these people gave English a goodly chunk of not words, but grammar, the way the language is put together from the ground up. Traditionally, linguists have reconstructed ways that things like DO could have arisen just all by themselves. But these accounts often leave as many questions as answers, one being why NO other Germanic language developed something similar.

Given that Celtic languages were right there alongside English all the time, if English and Celtic share features of grammar that are rather unusual worldwide, then obviously Attention Must Be Paid to Celtic. Another Celtic inheritance is almost certainly the progressive -ING one: in any other Germanic language, where I say I AM WRITING they would just say I WRITE. Again, Celtic languages have always done it as English now does.

I'M KNITTING THIS; DO YOU LIKE IT? would have sounded hopeless in Old English: about as hopeless as I KNIT THIS; LIKE YOU IT? does to us. However, that is how Old English rendered the sentence, just like any card-carrying Germanic language but English still does today. The only reason that I'M KNITTING THIS; DO YOU LIKE IT? is now "proper" English is because English took a walk on the wild side with Welsh and Cornish. Surely there were people back in the day thinking that I'M KNITTING THIS; DO YOU LIKE IT? was "bad" English.

Clearly, the whole notion of "good" and "bad" made no sense then. Well, what about now? Back in the day no one used DO to form questions. We do now. Back in the day no one used IMPACT as a verb. They do now. I doubt anyone would say that it's only okay for our grammar to change in ways that make it more like other languages, and so we can't say that DO is okay because it's Celtic but IMPACT is wrong because we're doing it ourselves.

Sure, we need a standard language. Every new thing that people start saying cannot be immediately ushered into the house style of the New York Times. But even if it were, it would not spell the death of civilization. More to the point, the novelties in casual speech are just different, not wrong. After all, the grammar I am writing in seems "proper" enough, doesn't it? And yet to the authors of Beowulf, that last sentence would sound like "good" English mangled by a Celt!

Posted by John McWhorter at 03:29 PM

In China It's ******* vs. Netizens

That's the title of Nicholas Kristof's op-ed piece in the New York Times today, about getting politically sensitive material past the 30,000 or so Internet censors in China.  Kristof's point is that the enormous amount of blogging being done -- he estimates that 120 million Chinese are on the net -- will overwhelm the censors most of the time.  Meanwhile, he blogs mischievously, and the worst that's happened so far is that avoidance characters sometimes get inserted in his Chinese text.

After two provocative postings that were untouched by the censors, Kristof tries again:

    Desperate, I mentioned Falun Gong, the religious group that is the Chinese government's greatest enemy: "In Taiwan, the Chinese people have religious freedom.  So in the Chinese mainland, why can't we discuss Falun Gong?"  That instantly appeared on both my blogs as well, although on one the characters for "Falun" were replaced by asterisks (functioning as pasties, leaving it obvious what was covered up).

    Finally, I wrote the most inflammatory comment I could think of, describing how on June 4, 1989, I saw the Chinese Army fire on Tiananmen Square protestors.  The two characters for June 4 were replaced by asterisks, but the description of the massacre remained intact.

The Chinese net censors seem to have borrowed the Western scheme of taboo avoidance via conventional characters replacing orthographic units within the taboo expressions -- letters for languages with alphabetic writing systems, characters for Chinese.  In fact, the censors have taken up the most common avoidance character used in English, the asterisk.

This is a very odd form of CENSORSHIP.  (We'd expect the censors to be suppressing entire postings containing taboo expressions, or closing down sites completely -- and these tactics are also used in China.)  The asterisks of "f**k" are a kind of ADVERTISEMENT of the taboo material; they ostentatiously avoid the physical form of this material while unambiguously communicating it, "leaving it obvious what was covered up", as Kristof puts it.  Using asterisks this way in Chinese scarcely stifles dangerous political discussion.  In fact, if the asterisk (a decidedly non-Chinese character) is being employed extensively in net censorship -- perhaps inserted automatically by spellchecker-like software -- it provides an easy way for readers to find politically racy material: just search for asterisks!

zwicky at-sign csli period stanford period edu

[A sampling of other Language Log posts on taboo words and techniques for avoiding them:

Maybe better make that "freakingly brilliant" (1/25/2004)
The FCC and the S-word (1/25/2004)
Teaching the difference between right and wrong (1/25/2004)
The S-word and the F-word (6/12/2004)
Three vs. four asterisks at Boondocks (9/22/2004)
All's fair in love and redacted (12/21/2004)
Unredacted discussion (12/28/2004)
Twat v. Browning (1/19/2005)
Adios, FCC? (6/23/2005)
Disparaging trademarks and the lexicography of tools (7/16/2005)
Adverbial license (7/17/2005)
You taught me language, and my profit on't / is, I know how to curse (7/17/2005)
Curses! (7/20/2005)
No fuckin' winking at the Times (8/17/2005)
Call me... unpronounceable (9/6/2005)
Football's F-word (11/29/2005)
Standing up to linguistic terrorism (2/3/2006)
The N-word in the news again (3/17/2006)
Twonk! (3/30/2006)
"Thinking specifically about the F-word" (4/2/2006)
Everything is too appropriate these days (4/5/2006)
A brief history of "spaz" (4/13/2006)
Delete expletives (4/29/2006)
"What up, Nick--?" (5/31/2006)
Words that can't be printed in the NYT (6/5/2006)
Goram motherfrakker! (6/7/2006)
Wh�tever (6/7/2006)
The history of typographical bleeping (6/10/2006)
The earliest typographically-bleeped F-word? (6/15/2006)]

Posted by Arnold Zwicky at 10:47 AM

The gray lady goes up against fark.com

If you linked here from Michael Erard's article in today's Science Times ("Analyzing eggcorns and snowclones, and challenging Strunk and White", 6/20/2006), let me offer a quick guided tour.

You're probably starting with the blog-standard index page, which has the beginnings of the past two weeks' posts, in reverse chronological order, showing enough of each one to let you decide whether you want to read the rest of it. In the right-hand column, you'll find a list of some of our readers' favorites, as measured by our server logs and other analytics. Scroll down below the list of favorites, and you'll find the archives back to July of 2003, followed by links to home pages of contributors, and a "blogroll" in alphabetical order of other language-related weblogs.

Since the blogroll is pretty long, you could try Trevor's Langwich Sandwich for a survey of language-related blog posts, compiled automatically from RSS feeds (Trevor's own blog is here). Or you could start with a sample, say Diacritiques, Double-Tongued Word Wrester, Jabal al-Lughat, Language Hat, No-sword, Pinyin News, Technologies du Langage, Tenser said the Tensor, and Transblawg. The links you find there will quickly lead you into a network of other interesting people and places.

A couple of other recent articles in the popular press about Language Log: Nathan Bierma in the Chicago Tribune ("Two potentially bad ideas turn out to be winners", 5/17/2006), Linda Seebach in the Rocky Mountain News ("Brown's body of work lies a-smouldering on the web", 6/3/2006), Jan Freeman in the Boston Globe ("What do linguists do?", 6/18/2006).

It's 6:30 a.m. here in Philadelphia, and our site metering tells me that Erard's article has sent about 220 of you here since it was posted on the NYT web site, or about 10% of our visitors over that period of time. We're grateful for the traffic, but I'll be interested to see how the NYT's readership measures against a real media giant such as fark.com. We got farked on June 14, when fark's editors linked to a scholarly exploration of the history of certain typographical practices. You can see the spike in traffic on this graph, amounting to about an extra 10,000 page views. How does the gray lady stack up against fark? Keep clicking to find out.

[Note: we don't routinely enable comments, due to our experience with spam and trolls, but if you email me, I'll be happy to post a comment on your behalf.]

[Update 6/21/2006: as this graph indicates

fark.com wins on visits, but the New York Times wins on page views. Being farked got us an extra 11,500 visitors on the day it happened (about 18,000 vs. the normal mid-week peak of around 6,500), while the NYT added only about 4,000 (to about 10,500). But fark only added about one page-view per visitor, whereas the New York Times resulted in about 25,000 extra page views -- each extra NYT visitor apparently read six or so pages on the site.]
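For anyone who wants to check the arithmetic behind that comparison, here is a minimal sketch. It uses only the approximate figures quoted in the update and takes the normal mid-week peak to be 6,500, which is the value implied by 18,000 minus 11,500. The function and variable names are made up for illustration.

```python
# Approximate figures from the update; the baseline is the normal mid-week peak.
NORMAL_PEAK_VISITS = 6_500

def extra_visits(total_visits, baseline=NORMAL_PEAK_VISITS):
    """Visits above the normal mid-week peak."""
    return total_visits - baseline

def pages_per_extra_visitor(extra_page_views, extra_visitors):
    """Average number of extra page views per extra visitor."""
    return extra_page_views / extra_visitors

fark_extra = extra_visits(18_000)   # the day we got farked
nyt_extra = extra_visits(10_500)    # the day of the NYT article
nyt_depth = pages_per_extra_visitor(25_000, nyt_extra)
```

With these inputs `fark_extra` is 11,500, `nyt_extra` is 4,000, and `nyt_depth` is 6.25, which matches the "six or so pages" per extra NYT visitor.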

Posted by Mark Liberman at 06:35 AM

June 19, 2006

For the millionth time, it's not hyperbole

About a million people have written to me to explain to me very gently and patronizingly (the. way. you. talk. to. a. very. young. child) that the answer to my puzzle about Daniel Gilbert's strange remark about forgetting word pronunciations is that he didn't mean it literally; it was hyperbole, my correspondents tell me: exaggeration for rhetorical or humorous effect.

Let me make it perfectly clear to you morons all the kind people who have written to me: I agree he didn't really mean what he said. He was intending to be light and funny. I realize that, and I didn't need you turkeys to tell me thank you sincerely for your constructive advice and assistance. But as I have now said in roughly a million emails, it seems to me that you are all wrong. His remark is not construable as hyperbole. Here's why.

Hyperbole takes a claim and exaggerates it, so that if the hyperbolic version were true, the original claim would be true a fortiori. I do know about humorous uses of hyperbole. I believe I used it in the first line of this post; wouldn't you say so? Take that as an example. My underlying claim is that lots of people wrote to me. If my exaggeration in calling it millions were really true, the underlying claim would be all the truer.

Or take the remark that Jan Freeman quoted yesterday from Mark Liberman about why overheard cellphone conversations are annoying: he says that when you are listening to only one end, "you can't help yourself from trying to fill in the blanks. And after a few seconds of this, your paracingulate medial prefrontal cortex is throbbing like a stubbed toe." His underlying claim is that the part of your brain that tries to figure out what the other person is saying gets a little bit of a workout and it causes discomfort. But it doesn't really hurt with the sudden agony of a stubbed toe. If it did, however, his claim about it causing discomfort would be all the truer.

As I have tried to explain, patiently but fruitlessly, to most of the hyperbolically enumerated million dimwits people who wrote in, Gilbert's figure of speech is very different. Take his underlying (broadly true) claim that parents don't get to go to the theater much after the kids are born. If he had said that parents forget what going to the theater is like, that would be hyperbole (they don't forget, it just becomes a tiny bit unfamiliar as far as recent experience is concerned). And if the hyperbolic claim were true — if they completely forgot what happens in theaters — then the underlying claim would be all the truer. However, he says instead that parents actually forget how to pronounce words like "theater". If they did, that would not make the underlying claim true. The loss of this snippet of pronunciation information would not mean that they had forgotten their experiences of theater. Nor would it mean they couldn't go: they get in a taxi, take it to Broadway, and point; or they could tell people they wanted to go to the big building downtown with the lights and the curtain and the actors.

The peculiar thing about Gilbert's purported figure of speech is that it is not hyperbole. Not unless you adopt some very weird counterfactual assumptions. The psycholinguist Mark Seidenberg points out to me that you actually can't forget the pronunciation of a word, because it's all bound up with the pronunciation of similar sounding words. Even if you could, the meanings of the words could stay with you (I don't know whether "sisal" is supposed to rhyme with "thistle", "sizzle", or "Faisal", but I know my apartment has sisal matting in the living room). You have to adopt an extraordinarily strange view in order to have it as a presupposition that "cannot pronounce the word movie" is an exaggeration of "has not seen any movies in quite a while".

So don't keep telling me it's hyperbole for humorous effect. It isn't that. And since it doesn't seem especially funny, I was just wondering why anyone would ever do this kind of thing (saying something about words that couldn't be true instead of something about things that could). If it created instant mirth, fine. But not many people seem to think it can be justified solely by merriment creation. In fact most of the people seemed to be satisfied just with understanding its drift, and explaining it to me. What we have here is a failure to communicate. My fault, I suppose. I just haven't managed to explain my feeling of puzzlement sufficiently well.

Posted by Geoffrey K. Pullum at 09:59 PM

The four meanings of an Arabic word

Khaled Ahmad's most recent Word for Word column in the (Lahore, Pakistan) Daily Times ("Is camel beautiful?", June 18, 2006) tells us that

The word for beauty [in Arabic] is “jamaal” which is taken from “jml” the root that means camel. The popular name in Urdu Jameel means beautiful. Why are the Arabs so taken up with the camel? [In Urdu] We call it “oont” which points to the animal’s very ugly lip. [...] Hundreds of Arabic words are derived from the various motions of the animal. [...]

The Arabs got the horse from outside their region; but once they got used to its qualities, they took it to their heart. Two very important Arabic words that we use today are derived from the horse and not from the camel.

The root sas in Arabic points to the horse and the word siyasat for politics is derived from it. The art of training a horse through a saees — another word in Urdu meaning horse-trainer from the same root — is supposed to be akin to statesmanship.

Another word we use for wisdom in Urdu is farasat. This comes from the Arabic root frs and means horse.

This reminded me of an old joke about Arabic lexicography that I couldn't quite remember, so I appealed to Roger Allen, who furnished this version:

Every Arabic word has a basic meaning, a second meaning which is the exact opposite of the first, a third meaning which refers to either a camel or horse, and a fourth meaning that is so obscene that you'll have to look it up for yourself.

Roger expressed some skepticism that there is really a historical relationship between the "camel" and "beauty" words:

Yes, the words "jamaal" and "jamiil" ("beauty" and "beautiful" respectively) and the word "jamal" meaning "camel" are all formed from the same root structure, made up of the three consonants J - M - L. It is impossible to know (until we have done much more rigorously analytical and historical work) what mono- or bi-consonantal combinations preceded the formation of this tri-consonantal cluster J - M - L, with its two entirely separate and otherwise unlinked sets of meanings. In other words there is no valid reason for linking the concepts of "beauty" and "camel" except for the fact that, in Arabic, they are both derived from the same tri-consonantal root structure.

John McWhorter has warned us against taking patterns of polysemy as philosophical or sociological essays ("Mohawk philosophy lessons", 11/18/2003). I suppose that stories about Arabic camel-love and beauty may help adult learners to remember words, just as similarly fanciful stories about Chinese characters may help some people to memorize them.

Khaled Ahmad's column contrasts the Arabs' (allegedly excessive) affection for camels with the relative indifference of the Jews:

Their fellow Semites, the Jews, call the animal gamal but do not derive the word for beauty from it. The special relationship with camel is found among the Arabs, much less among the Jews. English word camel came from Latin, camelus.

He also attributes a key feature of Egyptian Arabic pronunciation to Jewish influence:

The original Greek khamelos has been borrowed from ancient Phoenician. In Egypt it is the Jewish pronunciation of the letter jeem that is followed. So if you are named Jamaal it will be pronounced Gamal as in Gemal Abdul Nasser.

I gather that this version of Egyptian linguistic history will be a surprise to historical linguists as well as to Egyptians. Tim Buckwalter suggested that this aspect of Egyptian Arabic phonology probably came straight from Coptic, along with some syntactic features not found in any other Arabic colloquial (discussed in this Wikipedia entry).

It's interesting to see that the hunger for nuggets of etymological, historical and cross-cultural information about language is international, and it's odd that there are not more sources of such information that are popular but also accurate.

I also need to remind everyone that it's the Somalis who are most deeply into camels.

[Update -- E. Phoevos Panagiotidis emails:

Only today did I stumble upon your Language Log, which I have been perusing with great interest for a while, at the expense of my doing valuable (?) admin work for the Department...

I would only wish to contribute a factual remark at this point. In your recent post on 'the four meanings of an Arabic word', you quote Khaled Ahmad about the Greek word for camel being 'khamelos'. That would actually be 'kamelos' instead.

Indeed. Here's the LSJ entry, and a citation in Aristophanes, which is interesting for another reason -- it's the earliest reference I've ever seen to a "bat out of hell":

Near by the land of the Sciapodes there is a marsh, from the borders whereof the unwashed Socrates evokes the souls of men. Pisander came one day to see his soul, which he had left there when still alive. He offered a little victim, a camel, slit his throat and, following the example of Odysseus, stepped one pace backwards. Then that bat of a Chaerephon came up from hell to drink the camel's blood.

Who knew? ]

[Update -- Alex Gretlein writes:

Thank you for taking on Khaled Ahmad, whose Word for Word column is consistently full of the headache-inducing incorrect assumptions you have identified. In the piece you mentioned, we might also add that "oont" उंट (camel) is derived from the Sanskrit "ushtra", related to the Persian shatr, and has - as far as I know - nothing to do with the camel's "hont" (lip), beautiful or otherwise. Also, the preferred pronunciation for his horsemanly word meaning insight is "firasat" (Firuz al-Lughat), although it is commonly (or as Platts would have it, "vulgarly") pronounced as Khaled Ahmad has it.

There is a species of middle-aged man - most of them English professors or journalists - common in India and Pakistan. They might have an interest in Urdu (though they use English almost exclusively, at least in their intellectual and professional lives), they acquire a shallow knowledge of Arabic or Persian, and some pop etymology, and then they run with it, never bothering to check with reference works or people who might actually know.

Of course, there are specimens of closely related species to be found in Europe and North America, and for all I know in East Asia and in other places as well. There's a positive side: such people reflect a widespread interest in linguistic history and linguistic analysis. And if the stuff in the popular press that responds to this interest is mostly produced by people who are not as well informed as they should be, maybe that's the fault of the people who know better. ]

Posted by Mark Liberman at 05:11 PM

June 18, 2006

My pop adjective

In response to the recent post about Doonesbury's generalizing the idiom "my bad" to "my brave", Mark McConville sent in a reference to a 2004 song by REM, "Leaving New York", which includes the passage

You might have laughed if I told you
You might have hidden a frown
You might have succeeded in changing me
I might have been turned around
It's easier to leave than to be left behind
Leaving was never my proud
Leaving New York, never easy
I saw the light fading out

The "leaving was never my proud" line seems to be genuinely part of the lyric, not a mondegreen: listen for yourself (the full version is here).

If this is the same construction, it's a more interesting variant, since it's integrated into a sentence (as a predicate nominal) rather than being used as an isolated fragment, as it is in the usual "my bad" = "my mistake" pattern, and also in the Doonesbury "my brave" example.

But it's not clear that it's the same thing. In "my bad" (and "my brave"), I've always assumed that it's what I did that's bad (or brave), not me -- though I suppose that some transitivity of responsibility applies. However, in "leaving was never my proud", it's got to be me that's (not) proud, not leaving that's proud (or not).

All the same, I wonder what else is out there in the way of possessive_pronoun+adjective noun phrases.

[Pekka Karjalainen writes:

That line with "my proud" is also printed in the lyrics sheet that comes with the CD.

I discovered that my REM collection was dusty. Horror!]


[Update: the "never my proud" line was discussed in a Rolling Stone interview, reproduced here:

Q)There's a recurring line in the new single "Leaving New York" where you sing, "Leaving was never my proud." What does that mean, exactly?

JMS)It's ungrammatical, and I had a discussion with [bassist] Mike Mills about it, but the feeling was that the line said what I wanted it to say, so I stuck with it.]


[Update -- Adam Cooper sent in another REM lyric, Fretless, which he observes "[is] not an example of the "my proud" construction but ... interesting nevertheless" because "in the last verse is the line 'Don't threaten me with angry'":

Reach for each other before you leave
Reach peace with an E-A-C
Don't threaten me with a gentle tease
Don't threaten me with angry
Please, please, please
Don't try to tell me what I am

It suggests that using adjectives for nouns -- especially when they rhyme better -- might be a standard lyrical trope for Michael Stipe.]

Posted by Mark Liberman at 10:40 PM

A life without "outwith"

I learn new English nouns and verbs and adjectives all the time, but it's not very often that I learn a new English preposition. A story today in The Scotsman by Elizabeth Carr-Ellis, "Polls predict Catalans will say yes to radical plan for devolution", taught me one, via this sentence:

However, opinion polls have claimed that more than 50 per cent of Spaniards outwith Catalonia are against the statute, with many right-wing politicians claiming it will begin the end of Spain as a country and leave Madrid with no-one to govern and no money to govern with. [emphasis added]

"Outwith Catalonia"? When I first read this, I wondered if it was a typo. But searching Google News this evening for {outwith} produces 97 results, for example:

Only two teams – Leiria of Portugal and Silkeborg of Denmark – of the 32 all-time winners have historically come from outwith the big four leagues of France, England, Spain and Italy.
I stopped saying yes to literary festivals outwith my native Scotland, in one of those fits of self-subversion – “That’ll show them”.
Edinburgh City Council health and social care director Peter Gabbitas admitted gaps in services existed, but could not say how many autistic people had to travel outwith the city to access care.
Outwith the pharma and healthcare sectors, China and India have been hogging much of the economic headlines of late, though Russia continues to offer attractive opportunities for UK firms.
In a rare move to make their views public, the Sheriffs' Association, which represents 90% of those who sit at Scotland's courts, claims the executive's proposals are unfounded and outwith the devolved powers of ministers.
CAB staff emphasised that the problems that arose were usually with the employment agencies outwith the region and not the companies and farmers employing them.
Catriona Mackie from VisitScotland said: “This was fantastic for Ayrshire and Arran as events have a huge role in attracting visitors from outwith the area to discover our beautiful scenery."

It's clear from the contexts that outwith means something like "outside of"; and from the sources, it's evidently a feature of Scots English. The OED says that it's "Now chiefly Sc.", and glosses its first sense as "1. Outside. a. In a position or place outside of; ... b. To a position or place outside of; ... "

Merriam-Webster's Unabridged glosses it as

1. chiefly Scotland : outside of : out of
2. chiefly Scotland : EXCEPT

and Encarta says

Scotland outside: outside or beyond

The word is not in the American Heritage Dictionary, which makes me feel very slightly better about not having noticed it until now. And according to LION, Robert Burns never used it; in fact it occurs in only 15 places in the history of English-language poetry, most of them from the 14th and 15th centuries, in contexts like this example from John Barbour's The Buik of Alexander (c. 1390):

7892 We be on hors all halely,
7893 Armit with speiris and with blasounis,
7894 Ane lytill outwith the pauilliouns,
7895 The standart dressed vp of Inde,

But by a curious coincidence, one of the four examples of outwith in 20th-century poetry comes from a sonnet by Hugh MacDiarmid about Catalonia, titled Der Wunderrabbiner von Barcelona. The source is given as "hitherto uncollected poems contributed to books and periodicals (1920-1976)". It's dedicated "To Else Lasker-Schüler", who wrote a book of the same title published in Berlin in 1921, so perhaps it dates from the 1920s.

1 Outwith the walls of Barcelona dwelt
2 A wonder-Rabbi humbly and alone
3 Whose eyes with radiance unearthly shone,
4 Whose holiness through all the land was felt.
5 The hearts of all he met a glance would melt.
6 The Jews adored him as their saintly own
7 And Christians, swift to throw the hostile stone,
8 Towards him at all times deferentially dealt.

9 There came a pogrom of the Jews at last
10 And naked corpses in the streets were cast.
11 The Rabbi deep in meditation came
12 Oblivious of the blood in which he trod.
13 Unseen the murderers stood beneath the flame
14 Of eyes that shone remote as th'eyes of God.


[Steve from Language Hat writes:

Nice to see my favorite Scots poet quoted on the Log! The Complete Poems has the Wunderrabbiner poem as 1923.]


Posted by Mark Liberman at 09:53 PM

Feeling hitterish with Diz and the Babe

During the broadcast of today's game between the New York Yankees and the Washington Nationals on the YES network, announcer Michael Kay had this to say about Alfonso Soriano (once a Yankee, now a National):

"You know, one of the words we hear lately now is hitterish. And Soriano always looks hitterish. Always looks like something is about to happen."

Last year Dick Kaegel of MLB.com also noted the rise of hitterish in a column about Kansas City Royals right fielder Matt Stairs:

"I feel good. I know every time I step in the box I have a chance to get a hit," he said. "I feel hitterish."
"That's the word of the year," he declared. "I've never heard it before."
"I've heard it before," teammate Tony Graffanino interjected.
Stairs cast a doubtful look his way.
Anyway, "hitterish" was used by Mike Sweeney during Spring Training and, as they say, the word is spreading. Somebody call Mr. Webster.

Turns out the Recency Illusion is just as pervasive in baseball as it is in other fields of human endeavor. As the New Dickson Baseball Dictionary points out, hitterish was often used by Dizzy Dean (1910-1974), the pitcher-turned-broadcaster who was renowned for his eccentric use of language. And it goes back even further than that, to the Babe Ruth era if not earlier.

Though I haven't found a directly attested use of hitterish by Babe Ruth, it has often been attributed to him, as in this reminiscence by Ford Frick in Robert Creamer's 1974 book Babe: The Legend Comes to Life:

Sometimes before a game he'd say, "I feel hitterish today. I'm due to hit one."

The hitterish line has been repeated in various Ruthian representations, from the young adult novel Babe & Me to the 1992 biopic The Babe, starring John Goodman in the title role. But it wasn't just Ruth who was using the word. The earliest example I've found on the Proquest newspaper database is a 1927 quote from Washington Senators first baseman Joe Judge, commenting on his success against Ruth's Yankees (in the season of Murderers' Row, no less):

He always does damage with the willow in the Yankee stadium and, on the way back from the park after hostilities had been declared off, remarked that he was in "a particularly hitterish mood today."
(Washington Post, Apr. 28, 1927, p. 15)

When Dizzy Dean became a broadcaster for the St. Louis Cardinals and Browns in the 1940s, hitterish joined his stable of much-derided "Dizzy-isms." Dean's linguistic stylings came to national attention on July 21, 1946, when the Associated Press reported that a group of Missouri schoolteachers had complained to the FCC about his "errors of grammar and syntax," claiming that his broadcasts were having "a bad influence on their pupils." The AP said that the teachers' complaint hadn't stopped "the former Arkansas cotton picker" from using "his graphic, if not grammatical expressions." Examples of Dizzy-speak given in the AP article include dialectal past-tense forms ("Slaughter slud safe into second," "Marion throwed Reiser out at first"), malapropisms ("The runners held their respectable bases," "Musial stands confidentially at the plate"), and good old-fashioned overnegation ("Don't fail to miss tomorrow's game").

A week after the AP article appeared in the nation's newspapers, the United Press gave Dean an opportunity to respond to the schoolteachers in his own words, and his piece was published in the New York Times and many other papers. He defended his use of non-standard dialectal forms, saying "I ain't dumb. I know most of the folks listening are from my part of the country — mostly from the Ozarks. They like it. A guy's got to do that sort of thing in this business." He stood by slud ("What do they want me to say — slidded?"), throwed, and also hitterish:

So I say Stan Musial or Chet Laabs is in a hitterish form down at the plate. What's the difference? Nine times out of ten they don't louse me up. They hit. Who cares what they call it?
(New York Times, July 26, 1946, p. 12)

Decades after the usage by Ruth and Dean, hitterish would linger in baseball circles, though its reappearance sometimes baffled sportswriters unfamiliar with the word's pedigree. At least twice it's been taken as a brand-new coinage:

Groping for an explanation for his team's recent batting revival, Angel manager Jim Fregosi coined a new word Saturday night. "The team has become hitterish," he said after the first-place Angels swept a doubleheader from the last-place Seattle Mariners.
(Los Angeles Times, July 9, 1978, p. D1)

"Doug Rader (the A's new batting coach) has everybody looking hitterish," said manager Tony La Russa, making up his language as he goes along.
(San Francisco Chronicle, Apr. 10, 1992, p. F1)

More recently, hitterish has been associated with batting instructor Charley Lau. One of his most famous pupils, George Brett of the Kansas City Royals, is fond of quoting Lau, as in this article about Carlos Beltran from 2000 (when Brett was working with Beltran, then a rising star with the Royals, on his hitting):

"It's like Charley Lau used to tell us, used to tell me: 'You look very hitterish up there. You look hitterish, you look like you're going to hit the ball hard,'" Brett said in camp.
(Baseball Digest, July 2000)

(That article was by Kansas City beat reporter Dick Kaegel, who must have forgotten about Brett's use of hitterish when he wrote about it as a "new" word in the Royals camp in 2005.)

The beauty of hitterish is that it can be revivified again and again, from Ruth to Diz to the current baseball era, and every time it sounds like a fresh innovation. In terms of its semantics and pragmatics, the word with its amorphous suffix -ish is usefully imprecise: how better to describe the vague state of mind of a batter who is performing well than hitterish? Successful hitting in baseball is, after all, far from an exact science. And in the tense struggle between batter and pitcher, "looking hitterish" like Alfonso Soriano (if not "feeling hitterish") may be half the battle.

(For a similarly nebulous neologism from the world of competitive snowboarding, see "Feeling all Olympic-y.")

Posted by Benjamin Zimmer at 08:04 PM

The confession of Adam Gray

Out in the lonely forensic linguistics wing of Language Log Plaza I've been busy recently trying to help with a case that began all the way back in 1993. A house on Chicago's south side was torched with gasoline and Adam Gray, who had just turned 14, was brought in for questioning, mostly because his former girlfriend lived in that house. At the time of the fire, the two were at serious odds with each other, so the police considered Adam a logical suspect. For whatever reasons, the police did not tape-record the interrogation, leaving few traces of what actually was said and done that day. The failure of police departments to make a verifiable record of what they do and say was the subject of a previous Language Log post (see here).

Adam was quickly charged with the crime, interrogated for four hours, tried and convicted of arson leading to death, and has spent the past 13 years serving a life sentence in prison. Recently the Youth Network Council in Chicago has been trying to get the courts to reassess the evidence, especially the damaging confession that Adam made in a nine-minute recap of those four hours in the police department. Confessions are made in language; thus their request for linguistic help. Unfortunately, since no tape-recording was made of any part of the interrogation, there isn't a lot for a linguist to work with. Adam reports that he denied committing the arson many times but the police continued to accuse him of it anyway. You wouldn't learn this from the police reports and testimony, however. They denied using any inappropriate tactics, of course, and the stenographic record of those last nine minutes doesn't reveal much to support the claim that Adam was coerced. But even with the fragmented record left by the police, the following questions remain:

1. As a juvenile suspect, why wasn't Adam allowed to see his mother during the interrogation?
Adam had slept over at his adult brother's house on the night of the fire. At 5 a.m. the police first went to Adam's house and his mother told them where they could find him. They then went to his brother's house and told Adam to come to the station with them for questioning. They told his brother not to hurry because it would be two hours before the questioning actually started. The brother testified that he then got dressed, ate breakfast, called Adam's mother and told her what happened, and arrived at the station before 7 a.m. He asked to see Adam and they took him to the room where he was sitting alone. The brother could see him through the one-way window but he wasn't allowed to talk with him. Adam's mother and sister arrived shortly after. Between seven and noon they asked to see Adam over and over again but were told that they could not do this. Adam reports that he asked to see his mother and brother but was told that they weren't present at the station at any time. Adam claims that the police told him that his mother didn't care what happened to him and that she refused to come to the station for him. The police testified that they didn't tell him this. All that is questionable here could have been avoided by tape-recording the entire process.

2. Was Adam questioned without concern for his well-being and health?
The interrogation actually started at about 8 a.m. and lasted until noon. Adam reports that they gave him two cups of coffee that morning but didn't offer him anything to eat until after they got his confession. Different police gave conflicting stories about this but one of them testified that Adam was given a McDonald's sandwich BEFORE the stenographer was brought in to record his statement, at about 11 a.m. Adam says that he wasn't given anything to eat until AFTER he agreed to give his confession statement.

3. Was Adam coerced or harassed while he was being questioned?
Adam's account of the interrogation was remarkably different from the way the police described it. He said that every time he denied setting the fire the police kept accusing him of it anyway. He reported that he cried frequently as he tried desperately to convince them of his innocence. He claimed that they told him he would get the electric chair unless he confessed. As for tactics, the police are allowed to lie to suspects and one detective admitted that they concocted several lies to get him to admit setting the fatal fire. One story was that Adam's best friend had turned on him, telling the police that Adam had planned to kill his ex-girlfriend (his friend subsequently denied that he ever said this). Another story was that his mother and family didn't care enough about him to come to the station for him. His mother testified that she and other family members were in the waiting area all the time but were never permitted to see Adam. Finally, the interrogator got Adam to place his hand on a copy machine to take its image, claiming that it would show any traces of gasoline. After Adam agreed to do this, they informed him there was gasoline all over his handprint.

Adam reports that he eventually cracked under the pressure, believing that his only way out of this was to dream up a story that the police might believe. So he told them how he took an empty gallon milk jug to a local gas station and purchased three-fourths of a gallon of gas. He then went to the house and poured the gas from the outside stairway of the house, starting at the top floor and ending at the bottom. Then he used his lighter to set it afire. That was enough for the police. They called in the court stenographer and got Adam to recap the story in her presence.

4. What did the confession say?
From letters that Adam wrote within a week of the fire, it is clear that he could write English sentences in a way that is pretty normal for a 14-year-old boy. Most were simple sentences but a few contained embeddings and multiple clauses. This contrasted with his syntax in the stenographically recorded nine minutes of the interrogation, where his sentence length was 3.9 words per response. By far the majority of his answers consisted of either one-word or two-word phrases. One can never know for sure what was on his mind at that time but clamming up like this at least suggests that he had given up and agreed to everything the Assistant State's Attorney asked him.
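
For readers curious how a figure like "words per response" might be arrived at, here is a minimal sketch in Python. It is only an illustration, not a description of how the number above was actually computed: the function name `response_stats` is hypothetical, the sample answers are invented rather than taken from the transcript, and it assumes that a "word" is simply a whitespace-separated token.

```python
def response_stats(responses):
    """Return (mean words per response, share of responses of two words or fewer).

    Assumption: a "word" is any whitespace-separated token.
    """
    counts = [len(r.split()) for r in responses]
    if not counts:
        raise ValueError("need at least one response")
    mean_len = sum(counts) / len(counts)
    short_share = sum(1 for c in counts if c <= 2) / len(counts)
    return mean_len, short_share


# Invented answers for illustration only; NOT taken from the actual record.
sample = ["Yes.", "Yes.", "No.", "A milk jug.", "I poured it on the stairs."]
mean_len, short_share = response_stats(sample)
print(f"{mean_len:.1f} words per response; "
      f"{short_share:.0%} of answers are one or two words")
```

The same two numbers computed over the letters would give the comparison described above, though a real analysis would need a more careful definition of "word" and "response" than this sketch uses.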

This stenographic record became the "confession" that sent Adam Gray to prison for life. It consisted of 78 questions, mostly answered by "yes." Adam didn't narrate or explain. He just responded to questions briefly in a way that looked very much like a well-rehearsed scenario.
Adam signed it at the bottom but, oddly enough, added a sentence in his own handwriting, possibly because the police remembered a loose end for which Adam hadn't accounted. They had quickly scoured the area for an empty milk jug and finally found one that seemed suitable as evidence (no mention was made about whether it had gas traces). But they still had the problem of the missing lighter. So they got Adam to add that he had broken it up and flushed it down a toilet. Case closed as far as the police were concerned.

5. How could these questionable issues have been avoided?
All of the reasonable doubt in this case could have been resolved if the police had tape-recorded the entire interrogation. All we have is a stenographic record of nine of the 240 minutes that Adam was in custody. And there is no way to verify the accuracy of this record, since they didn't tape-record it either. As a result, we can never know for sure what was said leading up to the recapped confession statement or, for that matter, during it. An unimpeachable record (audio or video) of the entire interrogation (not just the recap) could have justified every step in the process. More and more police departments around the country are reaping the benefits of such a procedure. It eliminates challenges of police coercion, provides good training protocols for new detectives, and has been found to be cost-effective. The true offenders are caught and the tactics are observable to all who may be concerned about such things. Perhaps most of all, subsequent trials could avoid the "he said, she said" battle between the prosecution and the defense. It eliminates the reasonable doubt and relieves jurors of the need to make inferences based on second-hand information and the self-reports of the police.

In Adam Gray's case we can't know for sure whether or not the police coerced him to invent a false confession so that he could go free. It might seem unbelievable that anyone would naively create a false confession but it's quite likely that a child could submit to the influence of strong authority figures such as law enforcement officers and state's attorneys. We can't know what was going through the mind of a 14-year-old child who becomes subjected to the frightening experience of a police interrogation. For that matter, we can't even know how frequently adults actually act rationally under the power and pressure of being interrogated at the police station. But a verifiable record would certainly help relieve the onus of some of these questions.

Posted by Roger Shuy at 07:49 PM

Words fathers forget how to pronounce

TIME magazine's salute to Father's Day included publishing an article by Harvard psychologist Daniel Gilbert on whether being a dad will make you happy. His unseasonably curmudgeonly answer is no; and part of the reason people don't realize it is that if time with your children is the only fun you get, you naturally rank it high on the pleasure scale, and in fact people tend to have to give up all their other pleasures in order to cope with the demands of rearing children:

Even if their company were an unremitting pleasure, the fact that they require so much company means that other sources of pleasure will all but disappear. Movies, theater, parties, travel—those are just a few of the English nouns that parents of young children quickly forget how to pronounce.

How's that again? Kids actually damage the record of phonological information in your mental lexicon? Sounds even worse than the worst that parents had previously suspected...

I'm grateful to Jesse Sheidlower for pointing out to me this beautiful case of something I had previously documented (in this case and before that this one): a writer taking a straightforward claim about the world that is arguably true and turning it, for absolutely no reason that I can detect, into a claim about language that is wildly and demonstrably false. I have simply no clue about why people do this strange self-defeating switch into insincere claims about linguistic behavior that could have been sincere claims about the subject they were discussing.

It might well be that movies, theater, parties, and travel are a few of the things that as a parent you should expect to have to give up or at least cut back on for quite a while. It certainly is not true that you will forget the pronunciations. Professor Gilbert knows that. So why did he say it? Was it a metaphor? I don't see it. A joke? No, doesn't seem funny. A brain slip? No, the rest of the article seems sane (for various reasons people think their kids make them happy, he argues, but in general it is actually not true; you should have a kid if you would like to expend a lot of time, worry, money, and trouble raising a kid, but not in order to try and obtain an increase in your average happiness level, he suggests — all seems fairly plausible). A daring lie? A bet? Or a dare? Surely the editors would have caught it and stopped him. No, this is some weird literary device that I do not understand (and it needs a name, by the way).

Luckily, right now I actually live just ten minutes' walk away from the great tower block of William James Hall on Kirkland Street, where the Harvard Department of Psychology is located. I can simply walk up there and find Professor Gilbert and ask him. I know he will see me. I am from Language Log. That opens doors. And when I get my answer, you will read it here, and you will learn at last why sane writers in magazines and newspapers capriciously turn a true statement about things into a ravingly false statement about words, and then, insanely, publish the latter instead of the former.

Update: Dozens and dozens of people are mailing me to explain kindly, as if I were a very stupid child, that Professor Gilbert must have meant it as a humorous overstatement. I know that. I was never in doubt about that. I'm wondering (a) why people go for this peculiar kind of falsity overstatement (talk about words instead of things), and (b) why they imagine this makes things more funny. But never mind. I know you'll all keep mailing me patient explanations anyway. Thank you very much for your interest. Sigh.

Posted by Geoffrey K. Pullum at 04:48 PM

A man and a statue and a codex and a cadaver

In honor of Father's Day, we have a guest post by one of the Church Fathers, Augustine of Hippo.

Now let us take up the `equivoca', in which the perplexity of ambiguity grows like wild flowers into infinity. I shall try to divide them into certain genera. Whether my faculties are sufficient to the attempt, you shall judge. There are first three types of ambiguity which come from equivocation: 1. by art, 2. by use, 3. by both.

I say art for the sake of the names which are imposed upon words in the discipline of words. ...The single utterance which I make, `Tullius' (Cicero), is a name and a dactylic foot and an equivocal. And if someone presses me to define what `Tullius' is, I shall answer with an explanation of any of these notions. For I can say correctly: "Tullius is a name by which a man is signified, a great orator who as a consul suppressed the Catiline Conspiracy." Watch closely now as I define the name. If I could point out that very Tullius, if he were living, with my finger, and if I then had to define him, I would not say: "Tullius is a name which signifies a man"; I would rather say: "That man is Tullius", and then I would add the other things. I can also answer in this way: "Tullius is a dactylic foot consisting of these letters ..." Perhaps one might say: "Tullius is a word by which all those things mentioned above are equivocal and any other similar ones you can make up." ...

Now look at the next type, which, as you remember, comes from usage. We call that usage through which we know words. For who seeks out and collects words for the sake of words? Let someone hear something who knows nothing of the parts of speech nor is interested in meter or any kind of verbal discipline. Nevertheless, he can be disturbed by the ambiguity of equivocation when `Tullius' is said, for by this name the great orator and his picture or statue and the codex in which his letters are contained and whatever is left of his body in the tomb may be signified. For we say in diverse sentences: "Tullius saved the fatherland from ruin" "A golden Tullius stands in the Capitol" "All of Tullius is to be read" "Tullius is buried in this place". For the name is one, but all these are to be explained in different definitions. For this is the type of equivocation in which the ambiguity does not originate from the discipline of words, but from the very things which are signified.

But if it either confounds the hearer or the reader, if it is either from art or usage that it comes, what happened to the third type which was named? Its example will appear more clearly in a sentence: "Many wrote in the dactylic meter, e. g. Tullius." Here it is uncertain as to whether `Tullius' is cited as an example of a dactylic foot or a dactylic poet, of which the first is perceived by art, the second by usage. But in simple words it happens when the teacher pronounces the word to his students, as we have shown above.

These three types differ among themselves by manifest reasons. The first is again divided into two parts. Whatever makes an ambiguity through the art of words can partly be an example and partly not. When I define what a noun is, I can cite it itself as an example. For the `nomen' (noun) which I pronounce is itself a noun, and is so inflected, when we say: `nomen, nominis, nomini', etc. Likewise when I define what a `dactylus' is, it itself can be an example. For when we say `dactylus', we pronounce one long syllable and then two short ones. But when we say what `adverb' means, we cannot cite it as an example. When we say `adverb' this very enunciation is a noun. Thus, according to one way of understanding it is adverb and a noun is a noun, according to another `adverb' is not an adverb, since it is noun. Also `creticus' (a type of foot), when we define it, cannot be given as an example (of itself). When we pronounce it, `creticus' consists of one long syllable followed by two short ones, but what it signifies is a long, a short, and a long. Thus, according to one way of understanding `creticus' is nothing other than a creticus, according to another, it is not a creticus, because it is a dactylus.

The second type, which pertains not to verbal discipline, but to usage, has two forms. Equivoca are either of the same origin or of different origins. I mention those of the same origin which are contained in one name, but not one definition, but derive as it were from one source, e.g. when `Tullius' can be understood as a man and a statue and a codex and a cadaver. For these cannot be contained in one definition, but they have one single source, i.e. the real man himself, whose statue, books, cadaver they are. But when we say `nepos', it signifies from a quite diverse origin, both the son of the son and the spendthrift (Tr.: According to Isidore `nepos' (spendthrift) comes from a kind of scorpion).

[From section X of De Dialectica, as translated by Marchand (the original Latin is here).

Instruction in linguistics was apparently as problematic in fourth-century Rome as it sometimes is today -- according to the Wikipedia entry about Augustine, "He taught in Tagaste and Carthage, but desired to travel to Rome where he believed the best and brightest rhetoricians practiced. However, Augustine grew disappointed with the Roman schools, which he found apathetic. Once the time came for his students to pay their fees they simply fled." It's to avoid this sort of problem that we maintain the famous Language Log guarantee: your subscription fees are cheerfully refunded in case of less than total satisfaction. ]

Posted by Mark Liberman at 10:10 AM

June 17, 2006

My adjective

Back in February of 2005, Caity Taylor noticed the expression "my bad", and argued that "if this is to become widespread, then other adjectives must be able to be used in the same way, for example, 'my good', 'his stupid' etc.". I suggested that "while logically sound, this reasoning seems to be empirically false, since 'my bad' has been common for years in this part of the world, but I've never heard any analogous expressions in general use", where "an 'analogous expression' would be a possessive pronoun followed by an evaluative adjective used as a noun, referring to a specific event or action". Well, apparently Garry Trudeau has been thinking about this problem, because he's figured out how to slip one of these analogical formations into the Doonesbury strip that ran on June 14 (as Will Fitzgerald pointed out to me a few days ago by email):

In case you haven't been following the strip, that one is part of a series that ran from June 12 to today, in which Zonker Harris has a "conversation" with B.D. in the kitchen of their house. B.D. never actually says anything, or even changes his facial expression, although he does indicate by changes of gaze direction that he's paying attention:

What Zonker and B.D. take away from the interaction is typically different:

Anyhow, just as "my bad" means something like "what I did was bad" or perhaps "I'm the one who did something bad", so Zonker's "my brave" means something like "what I did was brave".

As I observed in commenting on Caity's blog post, there's no real syntactic barrier in English to usage like this: we've been using adjectives as the heads of noun phrases, at least in limited ways, since Shakespeare's time and before -- "the good is oft interred with their bones" and so on. And there's no pragmatic barrier in our culture to using a noun phrase like "my mistake" or "my fault" as an acknowledgment and apology. The trick is in combining the two.

According to Ken Arneson's research, reported here by Geoff Pullum, this trick was first performed by the Sudanese-American basketball player Manute Bol, perhaps because adult language learners are more likely to try logical generalizations of features that native speakers use only in more limited ways.

So with Zonker's encouragement, maybe we'll all start congratulating ourselves with "my thoughtful", apologizing with "my stupid", accusing others with "his rude" or complimenting them with "her brilliant", and so on. One small issue is that second-person forms like "your idiotic" are homophonous with the corresponding clausal forms "you're idiotic" etc. But that sort of ambiguity between innovation and conservatism may be an advantage.

[Update -- Tom Raworth has pointed out to me by email that "my bad" and "my brave" are phonetically equivalent to exclamations "Am I bad!" (as in "aren't I bad!") and "am I brave", given the common practice of dropping initial sounds in speaking (e.g. "s wonderful").

This is true, but I don't think it was involved in the origin of the phrase or its original spreading in the U.S.

In (my experience of) the original basketball context, "my bad" was used either to call a foul on yourself (e.g. after a collision) in a game played without a referee, or to signal that you're the one who committed the foul when the referee blows the whistle (in that case you also raise your hand, the purpose being to let the scorer know who to charge the foul to).

This doesn't imply, even ironically, that the user is essentially "bad" or generically at fault -- a typical game involves dozens of fouls, and everyone commits a few. The expression just acknowledges that in the most recent encounter, you were the one (perhaps technically) in the wrong. The commonest equivalent standard expression would be "my foul", in the implied context "that was my foul".

Note that in this case, it's accepted that something specific has happened (there was a significant collision under the basket, or the ref blew his whistle), and the question to be addressed is who (if anyone) is formally at fault.

When I started to hear this expression generalized to everyday life, around 1992, it happened in two kinds of cases: correcting simple misstatements, and taking the blame for something (typically a small thing) going wrong:

A. Now that coal is $50 a pound...
B. $50 a pound? What are you talking about?
A. My bad, I meant $50 a ton...

A. Sally called again today about the XYZ forms, what's going on with that?
B. I thought we sent those out last week.
C. That was my bad, I forgot to put them into the mail until yesterday.

Here the closest equivalent, it seems to me, is a very specific acknowledgment like "(that was) a mistake = what I said was a mistake" or "(that was) my fault = what happened was my fault", and not some generic exclamation (whether sincere or ironic) like "aren't I bad!"

In these extended contexts, as in the basketball game, it's clear that something has gone wrong -- there's a difference of opinion about a factual point, or some problematic (if minor) action or failure to act has been brought up -- and the question on the table is who (if anyone) is at fault in that specific case, not what anyone's overall moral character is like. What was funny about the use of "my bad" in the Doonesbury strip for Feb. 11, 2004 ("So what do you say after you invade another country by mistake? Oops, my bad, sorry about all the dead people?") was the idea of using this inconsequential excuse in a serious context:

Now, it's certainly true that "my bad" is phonetically identical to "Am I bad!" with prosiopesis of the initial vowel. But I can't imagine anyone saying "Am I bad!" in the basketball context, or in the other sorts of scenarios that I described, whereas "(that was) my foul/fault/mistake" works fine. Still, Tom's observation points out that everyone has the chance to apply their own theory when they learn a new expression, and therefore to generalize it in their own way. So maybe we'll start seeing some generalizations of "my bad" along the lines that Tom suggests, e.g. "tee bad" = "(isn')t he bad" or "zee bad" = "(i)s he bad", instead of "his bad". (But I doubt it...) ]

[Update #2: Several readers have suggested a different way to construe "My brave" in the June 14th Doonesbury, as if it were a translation of the French vocative "mon brave", which would make it similar to English vocative expressions of the form "my <adjective>", such as "my dear" or "my sweet". I don't think this is nearly as plausible as the view that the phrase is meant as a generalization of "my bad", but you can decide for yourself.]

Posted by Mark Liberman at 07:23 AM

June 16, 2006

Fun: you've seen the adjective, now read the adverb!

The word fun has taken one more step on its journey from the non-count noun it once was to the regular adjective it is coming to be. Long ago it began on this path by moving on from predicative uses (that was fun) to picking up attributive modifier uses (a fun thing to do). Then it started to get adjective degree modifiers (very fun, so fun), and ultimately it began to inflect for comparative and superlative grade (there's nothing funner than that, comparative; it's the funnest thing ever, superlative). But the next step was to form a regular ly adverb. And I only recently realized that this may have already happened, two months before Language Log was founded. On April 4, 2003, a brief post on a now possibly defunct blog quoted singer Justin Timberlake on the topic of sexual intercourse thus:

"I've been doing this since I was 15. Sex has never been taboo for me," he noted. "I enjoy it, and I praise it, and I celebrate it openly and funly."

However, the quote is falsified. Or at least, very misleadingly cropped. Here is what actually appeared in Marie Claire magazine according to this news source:

"I've been doing this since I was 15. Sex has never been taboo for me," he noted. "I enjoy it, and I praise it, and I celebrate it openly and funly — if that's a word."

If, indeed. That is the question. But the thing is, while Justin Timberlake wondered whether his neologism (the first use?) was a word, the blogger did not. He quoted the word, and used it in the title of his post ("Celebrate it funly"), without apparently having any serious doubts. This looks like the first open acceptance of the word by a native speaker who felt comfortable with it.

So we have a hypothesis about date of origin for this new evidence that fun has fully adjectivized: the adverb funly appears to have been born in the late spring of 2003 when Justin Timberlake creatively (though hesitantly) applied the standard ly-derivation process to the developing adjective fun. And at least one native speaker adopted it with no adverse symptoms.

Now, I realize that there is a high likelihood that some reader, one of Language Log's real gurus of the lexical predating game like Ben Zimmer, will now proceed to show I'm wrong by finding an earlier occurrence. But the Oxford English Dictionary will be no use (I checked). And googling will be tedious, because nearly all occurrences of funly on the web that do not pertain to the continental loose-leaf lettuce of that name are mistypings of funky (note where K and L are on your keyboard, and how clumsy and inexpert your middle and ring fingers look now you come to think of it).

Anyway, if there are prior citations, that will only make it clearer that there is a new adverb for you to use. Use it funly.

Update: The best predating I have yet seen was pointed out by John Baker on the American Dialect Society's list; it's from Google Groups, and is dated December 4, 1996:

There was a third-block long line waiting to get in. I don't know how long we were in that line, but it was not at all un- pleasant surrounded by all those good-looking, funly-dressed, tat- too'd, punctured, and friendly young people.

That seems a genuinely adverbial use, the meaning of funly-dressed being (if I may assume you understand the adjective fun), "in a fun way". If that's a good citation, we're back to 1996.

Posted by Geoffrey K. Pullum at 04:10 PM

A defense of Aymara uniqueness

[Guest post by Russell Lee-Goldman]

You're probably wondering what the case for the defense is over this Aymara story, about these Andean people whose language use is claimed to show that they see the past ahead of them and the future behind them, right? I've repeated a few pieces of anecdotal stuff that came in over the transom about Aymara not being unique in this respect. Might these instant ripostes be missing the point about what is going on? At Language Log no expense or trouble is too great when it comes to informing you, the linguistically alert public; we are prepared to go out and get top experts to disagree with us, and to put the other side. Here is Russell Lee-Goldman of the Department of Linguistics at the University of California, Berkeley, on what's special about Aymara:

I feel a need to address recent controversy regarding the uniqueness of the Aymara conceptualization of time-as-space. I cannot respond to everyone who says that their language of choice also has a "back to the future" metaphor, nor will I attempt to reconstruct all of the linguistic (metaphor-based) arguments involved. However, many of the objections that I have heard (and that I am sure the researchers of Aymara asked themselves) are based on a misconception that if a language has a single word that is polysemous between "front/past" or "back/future", then it automatically makes Aymara non-unique.

There are two well-studied metaphors of time. One is ego-centered, and one is not. The non-ego-centered ones allow things that look like "future is past." For instance,

Christmas follows Thanksgiving.
Before ('in front of') July.
hou4 tian1 [Chinese for 'day after tomorrow' lit. 'behind day']

But these have points in time situated in relation to other points in time (the time-moving-in-a-queue with respect to observer metaphor). Thus later events are behind earlier events. They are not situated with respect to the speaker. They could be moving towards the speaker, or not -- gestural data indicates that some English speakers conceptualize the line as moving from left to right.

But the Aymara system is ego-centered. The past is actually in front of them. They gesture in front of themselves when talking about the past, and behind themselves when talking about the future. I doubt that any Chinese speaker would gesture behind themselves (i.e., using themselves as the temporal anchor) when uttering hou4 tian1, or in front of themselves when uttering qian2 tian1. The same goes for Japanese. The data from South Sulawesi languages seems to be the same (unless olo / boko are unambiguously 'front/back of ego'), but I would need more data to see. The Maori seems more promising as an Aymara-type case. The "push back/ahead" is a well-known case and reflects the moving-time metaphor, not the ego-centered metaphor (actually, "push back" is judged to have different meanings depending on how the speaker is primed, but never mind that).

This is a tricky area, and linguistic claims should of course be backed up by psycholinguistic data (and gestural data is a part of that, I think). But my impression is that Aymara is not as non-unique as some dissenters would like to claim.

— Russell Lee-Goldman

Update: Anthony Jukes thinks he can definitely affirm that olo and boko are "unambiguously front/back of ego"; they actually mean "front/back of body". —GKP

Posted by Geoffrey K. Pullum at 09:14 AM

June 15, 2006

Language Log apology book tour kicks off in Cambridge

Getting out there in front of the public to apologize is an exhausting business. Sitting at home having a glass of wine with Barbara tonight I have a new and unexpected sympathy with people like Pete Rose (except that if my analysis is right he did a lot of appearing in public but tended to cop out at the apology stage), and all those corrupt evangelists and politicians and CEOs who do bad stuff and have to do the humbly-beg-forgiveness thing. It does take it out of you.

I gave good apology, though: the full hour and a half. A talk about Far From the Madding Gerund, a frank apology for the reversed column heads on page 179, correcting people's copies by hand, initialling the correction, writing "Sorry!" in the margin, signing the title page, giving them their nickel refund, and re-explaining lie1, lie2, and lay once again for those who really seemed to need it. The place was packed, the audience was delightful, I really had fun. But it does take a lot of energy. After doing this in city after city across fifteen or twenty countries, I think I'm going to be ready for an incall (or is it outcall?) massage. Anyway, the tour is launched, that's the important thing. Thanks to the MIT Coop for hosting, and thanks to all the Language Log fans who were there today. It was so good to see you in the flesh. By the way, talking of flesh... You're a good-looking bunch, some of you, aren't you? I didn't know Language Log was reaching such an attractive crowd. We should have more social gatherings. Without apology.

Posted by Geoffrey K. Pullum at 10:05 PM

The earliest typographically-bleeped F-word?

In response to my post on the history of typographical bleeping, Mark A. Matienzo sent in a reference to "An Account of the PROCEEDINGS against Capt Edward Rigby, At the Sessions of Goal Delivery, held at Justice-Hall in the Old-Bailey, on Wednesday the Seventh Day of December, 1698, for intending to Commit the Abominable SIN of SODOMY, on the Body of one William Minton", available from Early English Books Online (EEBO). This two-page document contains two examples of the F-word rendered by the initial letter followed by a horizontal line about three ems long, and one example (of what word?) rendered by a similar line with no orthographic clue.

I'd be surprised if this is really the earliest extant instance, but it's the earliest one that I've seen. Here's an image of the relevant section, fetched and cropped by me from the image on line at EEBO:

And here's a transcription of a bit more of the story, provided by Mark Matienzo:

... That about ſix a clock Minton came to the George Tavern, enquired for Number 4. and was ſhewed into the room where Rigby was, and the conſtable and his aſſiſtance were placed into a room adjoyning; Rigby ſeemed much pleaſed upon Mintons coming, and drank to him in a glaſs of Wine and kiſt him, took him by the Hand, put his Tongue into Mintons mouth, and thruſt Mintons hand into his (Rigby) Breeches, saying, He had raiſed his Lust to the higheſt degree, Minton thereupon askt, How can it be, a Woman was only fit for that, Rigby anſwered, Dam’em, they are all Port, I’ll have nothing to do with them. Then Rigby ſitting on Mintons Lap, kiſt him ſeveral times, putting his Tongue into his Mouth, askt him, if he ſhould F----- him, how can that be askt Minton, I’ll ſhow you anſwered Rigby, for it’s no more than was done in our Fore fathers time; and then to incite Minton thereto further ſpake moſt Blaſphemous words, and ſaid, That the French King did it, and the Czar of Muſcovy made Alexander, a Carpenter, a Prince for that purpoſe, and affirmed, He had ſeen the Czar of Muſcovy through a hole at Sea, lye with Prince Alexander. 
Then Rigby kiſt Minton ſeveral times, putting his Tongue in his Mouth, and taking Mintonin his arms, wiſht he might lye with him all night, and that his Luſt was provoked to that degree, he had ------ in his Breeches, but notwithſtanding he could F------ him; Minton thereupon ſaid, ſure you cannot do it here, yes, anſwered Rigby, I can and took Minton to a corner of the Room, and put his hands into Mintons Breeches, deſiring him to pull them down, who anſwered he would not, but he (Rigby) might do what he pleaſed; thereupon Rigby pulled down Mintons Breeches, turn’d away his ſhirt, put his Finger to Mintons Fundament, his hand behind him, and took hold of Rigbys Privy Member, and ſaid to Rigby; I have now diſcovered your baſe inclinations, I will expoſe you to the world, to put a ſtop to theſe Crimes; and thereupon Minton gave a ſtamp with his foot, and cry’d out Weſtminſter; then the Conſtable and his Aſſiſtance came into the Room, and ſeized Rigby ...

There's a piece of slang there that's completely new to me: "they are all Port". Not a good thing to be, apparently -- but why not?

[In the list of citations for the F-word given in the OED, it's written out in full in seven citations between 1503 and 1684, and then typographically bleeped in citations from 1707 to 1800. What happened between 1684 and 1698 to change the habits of printers? The Glorious Revolution? Or is this apparent trend just sampling error? ]

[Update -- with respect to the mysterious "they are all Port" passage, Karl Hagen, Gail Hapke and several others wrote in to suggest that the word is really "Poxt", i.e. afflicted with venereal disease. I believe that they're right -- I should have thought of that!]

[Update -- Jesse Sheidlower emails:

I see that HDAS's early bleeped example is from ca1650, cited from John Wardroper's Love and Drollery (a collection of 17th century pieces), but perhaps the piece is actually later, or there's some other explanation. (I don't have a copy here to see what it says.) In this case the entire word is dashed out, but it perfectly fits the sense, and rhymes with "luck" so I don't think there can be much doubt.

HDAS is the Historical Dictionary of American Slang.

To study the historical trajectory of typographical bleeping, the best data would probably be "G-d" and similar things involved in quoted oaths. It seems plausible to me that the socio-religious currents swirling around the English civil war, the restoration, and Glorious Revolution might have caused changes in typographical practice over the course of the 17th century, but that's just speculation. ]

Posted by Mark Liberman at 07:54 PM

June 14, 2006

How to get money from Geoff Pullum

As Woody Allen explained, all you have to do is show up. But as usual, there's a trick: you have to show up at the right time and place. The time is Thursday, June 15 at 5:00 p.m.; and the place is the MIT Coop, at 3 Cambridge Center in Cambridge, Massachusetts (that's right by the Kendall Square station on the T). Geoff will be there with a sack of money, handing it out to all comers.

Well, to be precise, I have to tell you that it'll be a sack of nickels and pennies. But there's a good story that goes along with the coins.

Back in May of 2004, Geoff posted some usage advice ("Lie or Lay? Some disastrously unhelpful guidance"), including a useful 6x3 table laying out the various forms and functions of lie "tell untruths", lie "be recumbent", and lay "deposit". This was one of the posts that Tom Sumner, our editor at William, James & Co., chose for the paper collection of Language Log reprints that was published last month under the title Far from the Madding Gerund. And Tom did a terrific job with the design and production of the book: as Steve at Language Hat wrote,

... this is a beautifully produced book (my hat is off to the publisher, William, James & Company): handsome, nicely laid out (with URLs and annotations in smaller-type sidebars), well indexed; hell, it even smells good. And it's actually been proofread, which seems to be viewed as an unnecessary expense by most publishers these days; the only thing I've found to raise an eyebrow at so far is the failure to change quotes-within-quotes to single quotes in this (from page 25): "A grammatical, usage or pronunciation mistake made by "correcting" something that's right to begin with. For example, use of the pronoun whom in 'Whom shall I say is calling?'" But that's extremely small potatoes.

However, a couple of weeks ago, a reader pointed out to us that on p. 179 of the book, someone somehow swapped the first two column labels on the crucial lie/lie/lay table (though it was, and is, correct in the weblog version). As a result, the advice is really and truly "disastrously unhelpful": if you allow yourself to be guided by the table as printed in the book, you'll say that Bill Clinton "lay" (rather than "lied") when he said "I did not have sex with that woman", and you'll say that John Dean once accused George W. Bush of having "lain" (rather than "lied") about weapons of mass destruction in Iraq.

Well, the error has been corrected in the second printing (due soon), and Tom Sumner is off suicide watch (though it was a near thing), and in a close vote, the LSA executive committee has decided not to rescind Pullum's 2003 Leonard Bloomfield book award. But the Language Log marketing department has been in an uproar over how to apply our famous quality guarantee to this case: "your subscription price cheerfully refunded in case of less than total satisfaction". The conclusion was that there are 376 pages in the book, so 1/376 of everybody's money should be returned on the assumption that a page with an error like this is worthless even if not dangerous. 1/376 of the book's list price is 5.85 cents -- if you got it at Amazon's current discount price, the corresponding amount is about 3.8 cents.

So if you show up tomorrow (Thursday, June 15) at 5:00 at the MIT Coop, Geoff will give you $.06 if you paid list price, and $.04 if you got the discount. Bring in your copy, and he'll correct the error by hand.
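For anyone who wants to check the marketing department's arithmetic, here is a minimal sketch in Python. It assumes only the figures quoted above (376 pages, 5.85 cents at list price, about 3.8 cents at the discount price). The book prices are back-calculated from those figures rather than taken from any price list, and the function names are purely illustrative.

```python
PAGES = 376  # page count given in the post


def refund_cents(price_dollars: float, pages: int = PAGES) -> float:
    """Refund for one worthless page, in cents: price / pages."""
    return price_dollars * 100 / pages


def implied_price_dollars(refund_in_cents: float, pages: int = PAGES) -> float:
    """Back out the book price implied by a per-page refund."""
    return refund_in_cents * pages / 100


# Prices implied by the two refund figures quoted in the post
list_price = implied_price_dollars(5.85)
discount_price = implied_price_dollars(3.8)

print(f"implied list price:     ${list_price:.2f}")
print(f"implied discount price: ${discount_price:.2f}")

# Rounding each refund to whole cents gives the amounts Geoff will hand out
print(f"list-price refund:     {round(refund_cents(list_price))} cents")
print(f"discount-price refund: {round(refund_cents(discount_price))} cents")
```

Rounded to whole cents, the two refunds come out at 6 and 4, which is why nickels and pennies are all the sack needs to hold.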

OK, seriously, what's really happening is that the MIT Coop is having a reading and book signing for Far from the Madding Gerund:

Place: MIT Coop, at 3 Cambridge Center, Cambridge, MA — that's on Main Street, across from one exit from the Kendall Square (MIT) T station (and as Quantum of Wantum has pointed out, actually right on top of another exit).
Time: 5:00 p.m., Thursday, June 15

Do show up if you're in the area. Geoff is as funny in person as he is on paper. And he really will bring a bag of nickels and pennies. First come, first served.

Posted by Mark Liberman at 05:11 PM

"Still un-X-ed" is not yet unspreading

Picking up last year's conversation about the interpretation of "still unpacked" as "not yet unpacked", a pseudonymous reader, dr pepper, sends these examples showing that a similar usage exists for unwrapped and uncorked:

(link) "Not much luck has befallen the search for the elusive mummy bands which Weigall stated were found around the KV 55 mummy (JEA 8 [1922], 193ff.) He described the bands as being wrapped around the outside of the mummy at right angles to the bandages, but none of the others present during the clearance of the tomb mentioned such bands in their written accounts. Apparently Weigall assumed that the objects he saw were retaining straps or mummy "braces," used to hold the shroud and wrappings covering the mummy in place. (A good example of such "mummy bands" appears on the still-unwrapped mummy of Isiemkheb-D.)"

(link) "In a Talbot's box, still unwrapped, was an expensive pantsuit."

(link) "Among the items are coins, glass bottles still uncorked with the organic material perfectly preserved, intact amphorae and soles of seafarers' shoes, probably tossed away when they were no longer good."

(link) "Zal sends this "news" story, along with the comment,"Bwahahahaha... mebbe it'll have a screw-top for ya," referring, of course, to the still uncorked bottle of wine my bro-in-law gave me last month."

I'll add some examples of "still unsealed" meaning "not yet unsealed":

(link) I've got one AutoCAD licence for sale. It is $7500 incl GST.... The boxes are still unsealed in the original packages.
(link) Barbie's Horse Adventures: Wild Horse Rescue has been a running joke for many (male) gamers ... Joke all you want, but don't give up any copies you have, especially if they are still unsealed. ... a follow-up post by sonarrat uncovered the hot collectors market for Barbie's Horse Adventures. It appears that on Amazon.com, at the time of this writing, the game is fetching $74.99 to $129.99 on the used market.

including one where "few remaining unsealed" means "few remaining sealed":

(link) On the other hand, all three judges spanked the judges of the Southern District of Florida for engaging in secret docketing. The panel ordered dockets and files in the case unsealed.
(Strafer expects the few remaining unsealed files in the Ochoa case to be automatically unsealed once the 11th Circuit clerk issues an official mandate. He said he expects some significant documents to be among those still unsealed, including one detailing the government's sentence reduction scheme.)

But both intuition and web search tell me that "still undressed" never means "not yet undressed".
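For readers who want to hunt for more candidates in a pile of text, here is a small sketch in Python. The pattern and the function name are illustrative assumptions, not a description of how the web searches above were done. It only finds strings of the shape "still un-X-ed" (including irrelevant ones such as "still united"); deciding whether a given hit means "not yet un-X-ed" still takes a human reader.

```python
import re
from typing import List

# "still", then a space or hyphen, then a word that starts with "un"
# and ends with "ed" (e.g. "still unwrapped", "still-unveiled")
STILL_UN = re.compile(r"\bstill[ -](un[a-z]+ed)\b", re.IGNORECASE)


def still_un_words(text: str) -> List[str]:
    """Return the un-X-ed words that directly follow 'still' in text."""
    return [m.group(1).lower() for m in STILL_UN.finditer(text)]


samples = [
    "In a Talbot's box, still unwrapped, was an expensive pantsuit.",
    "glass bottles still uncorked with the organic material perfectly preserved",
    "appears on the still-unwrapped mummy of Isiemkheb-D",
    "The boxes are still unsealed in the original packages.",
]

for sentence in samples:
    print(still_un_words(sentence), "<-", sentence)
```

The sample sentences are taken from the examples quoted earlier in this post.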

Is this problem still unsolved or not yet unsolved? If I'm not yet unpuzzled about this, am I still puzzled or still unpuzzled? Tune in for the next exciting episode...

[Ben Zimmer points out that commenters on Languagehat turned up "still unwrapped" last year along with many other examples ("still unloaded", "still unrolled", "still unearthed"). I missed that, alas, having read the post before most of the comments showed up. It looks like "uncorked" and "unsealed" are new, though. And we need to pay more attention to the cases where it doesn't work ("undressed", "uncovered", "unplugged"), though they are interesting only by comparison to the similar words that allow the still = not yet equivalence.]

[Update -- Arne Meyer emailed:

Saw your link to my post (http://arne360.blogspot.com/2006/04/barbie-horse-adventures-collectible.html) where I used the phrase "still unsealed."

Inadvertently, it made me realize that my sentence was horribly formed and I was probably distracted when I wrote it. My intention had actually been to say "still sealed."

Your post about the usage of those words was quite interesting and I never thought about those words that way. It was definitely an interesting item to read.

I was writing to just give you notice that I will be correcting that phrase in my blog post to accurately reflect what I meant. You might want to alter your link to my post or mention how it's been corrected.

One of the most interesting things about this usage is how widespread it is, even among excellent writers; how hard it is for readers to notice any problem with it; and yet, how often people conclude that it's a mistake when it's pointed out to them, even though there is no hectoring by "language mavens" on the question. ]

[Dr pepper (that's the pseudonymous reader, not the soft drink) emails:

I've found another: (un)veil.

(link) "But now you can let Gawker be your methadone, because after many hours of hard work and NSA-level computer sleuthing (read: an innocuous Google search), we've hacked into the still-unveiled new site."

Having thought it over some more, i'm now thinking that the verbs that get treated this way are ones that make their objects in some way unavailable, and have un forms that perfectly reverse whatever the basic forms did.

That might be right. Intuition tells me that "still uncovered" can't mean "not yet uncovered", but on the web I found this: "We visited the site of Pompei, which was covered with a 7 meter layer of ash after the Vesuvius erupted on August 24 79 BC. Most of the city is still uncovered but parts have been excavated since as early as the 18th century ..."   Still, there does seem to be a lexical aspect of this phenomenon. ]

Posted by Mark Liberman at 09:00 AM

Back to the... but you guessed it

Lots of people are mailing Language Log Plaza about Inga Kiderra's report on the new paper in Cognitive Science reporting that in the Andean language Aymara (Bolivia, Chile, and Peru) metaphors about the future relate to the concept of being behind, and metaphors about the past relate to being in front of you where you can see. And we don't need to tell you what film title all the headline writers are turning to for headlines to put on the top of it. Our reduced summertime staff here at Language Log may have a considered view on this topic soon (we have a couple of interns working on it), but right now that time lies behind us, in the future... Ooh, doesn't that sound weird?

I have just a couple of things to say, both from readers who have contacted Language Log (read on for this update).

People who say that some language is unique for this or that attribute are mostly wrong. (Not always, of course. But mostly, if the claim is at all general.) And according to Inga Kiderra,

New analysis of the language and gesture of South America's indigenous Aymara people indicates a reverse concept of time.

Contrary to what had been thought a cognitive universal among humans — a spatial metaphor for chronology, based partly on our bodies' orientation and locomotion, that places the future ahead of oneself and the past behind — the Amerindian group locates this imaginary abstraction the other way around: with the past ahead and the future behind.

But already Language Log's switchboard has been lighting up with calls to the Asian and Pacific Languages department. Anthony Jukes of London's distinguished School of Oriental and African Studies tells us:

I suppose everyone will be writing in to say that 'their' language works like this too. And so will I. Makassarese and the other South Sulawesi languages also consistently refer to the past as in front of ego — minggu ri olo week PREP front = "last week", minggu ri boko = week PREP back = "next week". And while I haven't really looked into this, I get the impression that this is not that unusual in Austronesian languages. I've been told that Sasak does it the same way, for instance.

And then there's Daniel Rosenblatt, of the Department of Anthropology at Scripps College, who says this about Maori (with an extremely interesting added note about English):

I work in New Zealand, where it is commonly noted that Maori refer to the past as nga wa o mua (time in front) and the future as nga wa a muri (time in back): this is widely thought to correlate with a different attitude towards tradition and the importance of history. Such a different attitude may exist, but deictics hardly seem decisive -- I have always taken the idea with a grain of salt once I realized that the English terms "BEFORE" and "AFTER" have the same implications.

Antony Eagle writes from Oxford with another remarkable insight on English: "one thing I have always been struck by is the locution 'push back', meaning postpone, as in: "Needless to say, we were delayed much more than I had expected, and I had to call Frank yet again and push back our meeting." (http://www.starbuckseverywhere.net/Log_2004_10_24.htm). Looks like back and future aren't clearly separated in English..."

And Quincy Lu writes from the University of Washington in Seattle:

It's likely someone's already informed you of this, but in Chinese, we have something similar: 后天 (hou4 tian1, "behind day") means "day after tomorrow" and 前天 (qian2 tian1, "front day") means "day before yesterday". Works the same with years.

So the notion that Aymara is entirely unique among languages and cultures is almost certainly false, perhaps hugely false. If the newspapers have whipped that up by overstating, then the authors of the paper in Cognitive Science (which I have not yet seen) are not to be blamed. But if they say Aymara is one of a kind, it looks like that's not true. Before making a claim in print about an unprecedented feature in a human language, scholars should try to make sure that the claim can withstand, say, a week on the LINGUIST List and a week on Language Log with no one writing in to contradict.

Posted by Geoffrey K. Pullum at 07:28 AM

Danglers: discourtesy, not ambiguity

A classic dangling adjunct arrived in the morning spam load:

As an Ace Customer, we are asking your permission to send you relevant e-mails about developments in our product range which could benefit you or save you money.

Who's the Ace Customer? One looks to the main clause and the subject is we. But that's not a possible subject, because *We are an Ace Customer would be deviant (semantically anomalous, I think, but certainly not good). The inference that the Ace Customer they are referring to is supposed to be me has to be drawn a few milliseconds later on, after some mental confusion. It's the familiar minor discourtesy of dangling adjuncts. And one interesting thing about this case is that it still causes its extra second of puzzlement despite the fact the matrix clause subject is not a candidate for being the understood subject of the adjunct. The example reminded me that whatever is going on here (and I am still nowhere near understanding it — none of us in the Fellowship of the Predicate Adjunct understand it yet), it is not about the deleterious effects of ambiguity. There is no ambiguity here. It's an unclarity, not an ambiguity. Those are different.

Posted by Geoffrey K. Pullum at 07:13 AM

June 13, 2006

For semanticists only

This post is exclusively for semanticists. If you are not a semanticist, do not try to solve the problem below. Do not even read it.

Medical warning. If after working on the problem below for a while you find beads of blood on your forehead, cease work and consult a physician. Problem void where prohibited. Common side effects include increased heart rate, tunnel vision, loss of peripheral vision, throwing pencil across room, and screaming.

Consider the following sentence, from Jeremy Latham's blog for March 2006:

All someone needs to do now is snap a few pictures.

Your task: using a suitable logical language with a suitable signature (say which one you are assuming), provide a full semantic representation for the sentence, taking care to get the truth conditions right, and in particular, making sure you assign the correct scopes to (i) the universal quantifier "all", (ii) the existential quantifier "some(one)", (iii) the necessitative modal operator "needs", and (iv) the paucal quantifier "a few".

(By the way, there is nothing special about the sort of phrase involved here, and certainly nothing deviant about it. You are not being asked to comment on any error or strangeness. This is straightforward English. A very similar usage occurred in The Economist on June 3 [p. 31, bottom]: "Now all someone needs to do is invent a paper car", where the task would be just about exactly the same.)

All some semanticist needs to do now is to adjudicate between the solutions offered, so that the prize can be awarded. We deal solely in the best here at Language Log; the expert we have engaged for this task is perhaps the world's greatest. Send your answer on a postcard to him, please. The address is:

Professor Laurence Horn
P.O. Box 208366
Department of Linguistics
Yale University
New Haven, CT 06520-8366
Posted by Geoffrey K. Pullum at 09:20 AM

Linguist jokes (4)

Q:  Why do you think the syntactician crossed the road?

A:  That's one of those adjunct extraction cases where you clearly get the reading with the extraction site in the complement clause just as easily as with the one where it's in the matrix.

Posted by Geoffrey K. Pullum at 06:59 AM

June 12, 2006

David Brooks, cognitive neuroscientist

In David Brooks' most recent column ("The Gender Gap at School" 6/11/2006), he observes that "reading rates are falling three times as fast among young men as among young women", and suggests that the problem is that "in most classrooms boys and girls are taught the same books in the same ways". His prescription is "more Hemingway, Tolstoy, Homer and Twain" for the boys. (I guess that the girls can make do with what they get now, though I suspect it is not, alas, mostly Atwood, Austen, Sappho and Alcott.)

I share Brooks' worries about educational gender gaps. I'm 100% in favor of more classic literature. And if there's an educational prejudice in favor of Jane Austen as opposed to Mark Twain, I'm 100% in favor of rectifying the balance. But there are two things about Brooks' column that bother me: he bolsters his argument with apparently misunderstood or made-up results of brain research on sex differences, and (if we fix those mistakes) he applies a black-and-white interpretation to observations of shades of gray.

Brooks starts his column by observing that men and women stereotypically like different kinds of books, relying on studies whose methodology magnifies group differences:

Researchers in Britain asked 400 accomplished women and 500 accomplished men to name their favorite novels. The men preferred novels written by men, often revolving around loneliness and alienation. Camus's "The Stranger," Salinger's "Catcher in the Rye" and Vonnegut's "Slaughterhouse-Five" topped the male list.

The women leaned toward books written by women. The women's books described relationships and are a lot better than the books the men chose. The top six women's books were "Jane Eyre," "Wuthering Heights," "The Handmaid's Tale," "Middlemarch," "Pride and Prejudice" and "Beloved."

Here's the first example of boosting the contrast to turn grays into blacks and whites.

Everyone knows that there are gender differences in reading preferences. I haven't been able to find any of the raw numbers from surveys like the one that Brooks cites, but I can guess what they would be like. We'd find that there are statistically significant group differences, sometimes strong ones, but we'd also find that individuals are complicated and groups are variable.

Speaking for myself, I dislike Camus and Salinger, and can take Vonnegut only in small doses. I like Austen and Atwood, but have to force myself through Brontë (any of them), Eliot and Morrison. I love Twain but find Tolstoy tedious. I'm hardly typical -- but then few people are, in the sense of conforming exactly to group patterns. I know a manly man with a secret habit of reading romance novels, and a womanly woman who loves military science fiction, even though the audiences for those genres are no doubt overwhelmingly gendered.

Adding up group preferences is a fine thing to do, but expressing the results by saying that "men preferred" one thing while "women leaned toward" something else is dangerous. It's like saying that "women preferred Kerry to Bush in 2004", when (according to exit polls) 48% voted for Bush as opposed to 51% for Kerry.

Brooks continues:

There are a couple of reasons why the two lists might diverge so starkly. It could be men are insensitive dolts who don't appreciate subtle human connections and good literature. Or, it could be that the part of the brain where men experience negative emotion, the amygdala, is not well connected to the part of the brain where verbal processing happens, whereas the part of the brain where women experience negative emotion, the cerebral cortex, is well connected. [emphasis added]

Or it could be that Brooks is deeply confused, not just about the meaning of "stark" differences in grouped preferences, but also about the literature on brain localization of emotional processing.

He warned us in his column of 5/25/2006 that

... I, a scientific imbecile, have spent several weeks trying to understand the amygdala and the orbitofrontal cortex.

Though I'm no expert in cognitive neuroscience, I think Brooks might need to go back to the books on this one. As far as I can tell from looking at (some of) the relevant literature, the amygdalas (paired subcortical structures at the inward tips of the medial temporal lobes) and the cerebral cortex are involved in the experience of negative emotion in a rather similar way in both sexes. While both neural and behavioral group differences in emotional processing have been reported, the results are rather like those for book preferences: individually complex and variable within groups. And in any case, the pattern of sex differences is not, as per Brooks, men:amygdala :: women:cortex.

To illustrate the kind of reading Brooks might (have one of his assistants) do, I hit Google Scholar with {amygdala sex difference}, and I'll summarize for you the relevant parts of one of the papers I found on the first page of results.

Turhan Canli et al. ("Sex differences in the neural basis of emotional memories", PNAS vol. 99 no. 16, August 6, 2002) did a study that used functional MRI to study localization of brain activity in 12 men and 12 women "while they rated their experience of emotional arousal in response to neutral and emotionally negative pictures". Three weeks later, memory for the pictures was tested: "highly emotional pictures were remembered best, and remembered better by women than by men".

The study found that "activation in the left but not right amygdala was correlated with emotional arousal ratings for both sexes".

In addition to the left amygdala, men and women had other common (although not necessarily overlapping) areas in which activation correlated significantly with reported emotional experience. These areas included the bilateral superior frontal gyrus [Brodmann area (BA) 6], right middle (BA 46), and bilateral inferior frontal (BA 44, 45) gyri, left-lateralized anterior cingulate (BA 32), right precentral gyrus (BA 4), left thalamus, and left insula. This pattern of correlation loci suggests that both sexes share an extensive network of structures associated with attention, language, and motor control that are associated with emotional arousal. [emphasis added]

Against this general pattern of similarity across the sexes, there were also some (complicated) group differences:

Women but not men exhibited correlations in the postcentral gyrus and hippocampus. Men but not women exhibited a significant correlation in the putamen. Correlations in BA 37 of the fusiform gyrus were lateralized by gender: right-lateralized for men and left-lateralized for women. There was evidence for different patterns of hemispheric asymmetry between the sexes. Women had significantly more clusters in the left than in the right hemisphere (χ² = 5.90, P < 0.05), whereas men showed no hemispheric asymmetry in the number of clusters in either hemisphere (χ² = 0.12, P = not significant).

I found the last point interesting, and perhaps even relevant to Brooks' argument, since many studies have found that language-related phenomena are more lateralized in men than in women. But this is a sex difference in the degree of lateralization (i.e. left-right asymmetry) in activity of parts of the cerebral cortex, not a difference (that Brooks asserted to exist) between the amygdala(s) and the cerebral cortex as a whole.

As usual in such studies, the raw data from the Canli et al. study is not available. If it were -- and especially if a larger number of subjects were available -- we'd be able to see that there is a great deal of variation within the male and the female groups. It's quite possible (I would guess it's likely) that the within-group variation is quite a bit larger, both qualitatively and quantitatively, than the between-group differences.

In the memory test three weeks later, the researchers found that "highly emotional pictures were remembered best, and remembered better by women than by men". Here's the paper's detailed summary of the behavioral data, which most of you will not care to read:

There were sex differences in reported emotional experience and in memory for emotionally provocative pictures. ANOVA was performed with factors of "sex" (male and female), "arousal rating" (ratings 0–3), and "memory accuracy" (0, forgotten; 1, familiar; 2, remembered). Pictures that provoked high emotional arousal were more likely to be remembered than those that yielded little or no sense of arousal when collapsed across sex (arousal rating x memory accuracy, F(6,120) = 7.40, P < 0.0001). There was a significant interaction between sex and emotional arousal (sex x arousal rating, F(3,66) = 3.37, P < 0.025). Women rated significantly more pictures as highly arousing (rated 3) than did men [t(22) = 2.41, P < 0.025]. Women had better memory for emotional pictures than men (Fig. 1D); pictures rated as most highly arousing were recognized significantly more often by women than by men as familiar [t(20) = 2.40, P < 0.05] or remembered [t(20) = 2.38, P < 0.05]. There were no significant sex differences in memory for pictures rated less intense (0–2) or in false-positive rates (12 and 10% for women and men, respectively). Thus, women had superior memory for only the most intensely negative pictures even when subjective ratings of arousal were equated.

Easier to assimilate is the graphical representation of the results:

You can see the interactions easily in the plots: male subjects rated more of the pictures towards the low end of the scale of arousal ratings, while female subjects rated more of the pictures towards the high end; as arousal ratings increase, women showed an increasing advantage over men in percent recognition in the memory test. On the other hand, the differences were not enormous ones: the biggest (group) difference in percentage of arousal ratings is about 10%, and likewise the biggest group difference in picture recognition is about 10%. These differences are meaningful; it's worth exploring and explaining their causes and consequences; but we're talking about shades of gray here, not black and white. And individual differences among men and women, both in classification of emotionally-loaded pictures and in free recall of picture sets, will be much larger than these group differences.
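To make "shades of gray" concrete, here is a rough sketch in Python. It assumes, purely for illustration, that some measure is normally distributed in each of two groups with the same spread, and that the group means differ by d standard deviations. None of these numbers come from Canli et al.; the values of d are hypothetical. Under those assumptions, the chance that a randomly chosen member of the higher-scoring group outscores a randomly chosen member of the other group is Φ(d/√2), where Φ is the standard normal cumulative distribution function.

```python
import math
from statistics import NormalDist


def prob_superiority(d: float) -> float:
    """P(X > Y) for independent X ~ N(d, 1) and Y ~ N(0, 1).

    X - Y is normal with mean d and standard deviation sqrt(2),
    so P(X - Y > 0) = Phi(d / sqrt(2)).
    """
    return NormalDist().cdf(d / math.sqrt(2))


# Hypothetical effect sizes, in units of the within-group standard deviation
for d in (0.0, 0.2, 0.5, 1.0):
    print(f"d = {d:.1f}: P(random A member > random B member) = {prob_superiority(d):.2f}")
```

Even with d = 1, a group difference as large as the within-group standard deviation, the probability is only about three in four. So knowing someone's group tells you much less about the individual than a phrase like "men preferred" suggests.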

There was another interesting sex difference found in this study. When they correlated the memory results with the (three-week earlier) fMRI data, they found (among a lot of other complicated stuff) "correlation clusters in the right amygdala for men (Talairach coordinates, +16, –8, –17) and left amygdala for women (–25, –8, –17)". Here's a picture:

So we've got what looks like a meaningful sex difference (though I'd like to know more about individual differences, and I'd like to see the experiment replicated with different pictures, and so on), but it's not for the experience of negative emotion, it's for the memory of pictures associated with negative emotions. And it's not men:amygdala :: women:cortex; rather, it's men:right-amygdala :: women:left-amygdala.

This reminds me of the old Soviet-era joke:

Question to Radio Yerevan: Is it correct that Grigori Grigorievich Grigoriev won a luxury car at the All-Union Championship in Moscow?

Answer: In principle, yes. But first of all it was not Grigori Grigorievich Grigoriev, but Vassili Vassilievich Vassiliev; second, it was not at the All-Union Championship in Moscow, but at a Collective Farm Sports Festival in Smolensk; third, it was not a car, but a bicycle; and fourth he didn't win it, but rather it was stolen from him.

In fairness to Brooks, I guess that because of the tendency for speech and language to be lateralized in the left cerebral hemisphere in most people (and somewhat more strongly in men than in women), you could imagine that the Canli et al. sex difference for memory of negative emotions (men:right-amygdala :: women:left-amygdala) might indeed have the result Brooks wants:

It could be that women are better at processing emotion through words.

Maybe, though there's a chain of about ten unsupported inferential steps between us and even a shades-of-gray version of that argument. Is there any conclusion whose empirical and logical foundations are adequately established by Brooks' column? In my opinion, there's at least one: we shouldn't accept any public policy recommendations on the basis of Brooks' understanding of cognitive neuroscience.

[I should add that there's another fallacy implicit in Brooks' use of neuroscience. He writes as if demonstrated group differences in brain activity, being "biological", must therefore be innate and essential characteristics of the groups, and not "socially constructed". But how else would socially constructed cognitive differences manifest themselves? In flows of pure spiritual energy, with no effect on neuronal activity, cerebral blood flow, and functional brain imaging techniques?

I ask this as someone who is quite prepared to believe in genetically-influenced cognitive differences. If such differences exist, let's understand what they are and decide what to do on the basis of the facts. But Brooks appears to believe that measured group differences in brain physiology are ipso facto evidence of innate cognitive differences, rather than different life experiences. If that were true, there would be no point in ever trying to teach anyone anything. ]

[Update -- Lameen Souag emails:

I wonder how Brooks would account for similar phenomena elsewhere, such as Qatar, where men's dropout rates are higher than women's even at primary school and more than twice as many women as men attend university, or Algeria, where 20%  more women than men make it to the baccalaureate, or Kuwait, where two-thirds of university students are women.  Learning styles yes - sitting down in one place and paying attention all day is a sore trial for most boys - but there's surely something broader going on here than choice of violence-filled vs. touchy-feely literature, never mind his further inferences about brains.

Qatar: http://lughat.blogspot.com/2006/04/more-from-qatar.html
Algeria: http://jazairana.blogspot.com/2006/06/60-of-bac-candidates-are-women.html
Kuwait: http://gender.pogar.org/countries/gender.asp?cid=8

Whatever is happening to schoolboys in Qatar, Algeria and Kuwait, I doubt that Brooks' explanation will work:

During the 1970's, it was believed that gender is a social construct and that gender differences could be eliminated via consciousness-raising. But it turns out gender is not a social construct. Consciousness-raising doesn't turn boys into sensitively poetic pacifists. It just turns many of them into high school and college dropouts who hate reading.

Seriously, I wonder what the global demographics of school attendance are -- how have things changed over the past few decades, with respect to sex and otherwise? If there is an overall trend for males to drop out or fall behind, to a greater extent than in earlier times, why is it? Could it be that boys are acting the way they always have, but girls are getting increased opportunities, and turn out to be better adapted (whether by nature or by nurture) to taking advantage of them? ]

[Update: more on the source of Brooks' misunderstandings, Leonard Sax.]

Posted by Mark Liberman at 08:47 AM

June 11, 2006

HBES 2006

The Human Behavior and Evolution Society is in town for its annual meeting. Last night was the keynote address by Dan Dennett, "Domesticating the Wild Memes of Folk Religion":

Organized religions are brilliantly designed social systems. Reverse engineering them suggests that some of their features are ancient, and have no authors, while others are the more or less deliberate brainchildren of religion-designers--and these answer to rather different selection pressures. Like features under sexual selection, which are shaped by interactions with the perceptual and cognitive systems of potential mates, some features of organized religions are "intelligently" selected. But still, Orgel's Second Rule applies: Evolution is cleverer than you are.

Rob Kurzban asked me to introduce Dennett, so here's what I wrote for the purpose:

Daniel Dennett is Austin B. Fletcher Professor of Philosophy and co-director of the Center for Cognitive Studies at Tufts University. He's also the author of many books, including The Intentional Stance, Consciousness Explained, Darwin's Dangerous Idea, Kinds of Minds, Freedom Evolves, and most recently and relevantly, Breaking the Spell: Religion as a Natural Phenomenon.

With these books, Prof. Dennett has accomplished something extraordinarily rare in the modern world. Over the past century, our culture has erected a wall separating serious scholarly and scientific writing from popularization. Fewer and fewer works simultaneously have a serious impact on scholars and scientists, and also appeal to a general intellectual audience, in the way that books by Locke or Darwin or William James did.

Breaking the Spell contains things of professional interest to academics in many disciplines: philosophers, psychologists, sociologists, anthropologists, political scientists, economists, biologists.... It even raises some issues of central concern to my own guild of linguistics.

But Breaking the Spell is also having an impact outside of academia. When I checked earlier today, it was #345 on amazon’s list, just in between Carl Hiaasen’s Skinny Dip and Ralph Ellison’s Invisible Man. Yesterday, it was in between The Pearl, by John Steinbeck, and The Sun Also Rises, by Ernest Hemingway. The only other philosophical work near it, in yesterday’s rankings (and it was five places behind) was Burnt Toast: And Other Philosophies of Life, by Teri Hatcher, one of the stars of Desperate Housewives.

So why is Dan Dennett’s serious (if often entertaining) investigation of the natural philosophy of religion running neck-and-neck with Teri Hatcher’s “personal, heartfelt, and often very funny manifesto on life, love, and the lessons we all need to learn -- and unlearn -- on the road to happiness”?

My hypothesis is that three factors explain their success. First, both authors engage topics that are simultaneously timeless and timely – Hatcher deals with “life, love and dying cats”, while Dennett deals with the nature of religion. Second, their writing offers memorable illustrations of their striking and attractive personal style – Dennett’s exploration of our likely reaction to the epidemiological discovery that music causes Alzheimer’s is as unforgettable as Hatcher’s story of having a t-shirt made up to express her disappointment with a date’s virility. And third, they present clear and broadly-applicable theories – Hatcher suggests that “happiness and success are choices that we owe it to ourselves to make”, while Dennett argues that in culture, as in biology, there are many questions, but only one answer (variation and selection).

But we don’t have to rely on my hypotheses when we can examine the phenomenon itself. I feel deeply honored to introduce Daniel Dennett, one of my intellectual heroes, who will speak on the subject Domesticating the Wild Memes of Folk Religion.

Dan was pleased to learn that Ms. Hatcher is a fellow-philosopher, and speculated that perhaps they could write a book together.

Amy Alkon has some interesting posts on HBES 2006 activities here, here, here and here. In this case, the blogosphere seems to be way ahead of the old media: the only thing that I get this morning by searching Google News for {HBES} is this recycled press release. That's curious, given how evocative many HBES presentations are. You can tell just from a random sample of titles, say the posters that Amy Alkon has pictures of here, or a few examples taken from the first few pages of the program: "The Role of 'Outrages' in the Evolved Psychology of Intergroup Conflict"; "Deception as a Strategy in Long-Term and Short-Term Mating"; "I Don't Get It: Further Evidence for the Encryption Theory of Humor"; "How Fatal 'Accidents' Select for Higher General Intelligence"; "They All Look the Same to Me (Unless They're Angry)"; and so on. I'd think that science reporters would be all over this one. Perhaps we'll see some uptake over the next few weeks?

Looking over the HBES 2006 program, it strikes me that at this meeting, linguistics was the one dog that (mostly) didn't bark.

There was a "speech" session on Thursday, which included these paper titles:

"An Evolutionary Explanation for a Deep Voice in the Human Male"; "Maintenance of Vocal Sexual Dimorphism: Adaptive Selection Against Androgyny"; "Male Facial Attractiveness, Perceived Personality, and Child Directed Behaviour"; "Evidence for Universals in Infant-Directed Speech"; "Is Low Voice Pitch a Male Dominance Display?"

As a speech person, I'm pleased to see my subdiscipline represented, but where's the rest of the field? There's a lot of interesting recent stuff on the biological evolution of language, but there was none of it at HBES 2006.

It's easier to explain why work on cultural evolution in language was missing -- HBES seems to be mostly a gene-oriented group. Except for Dennett's talk, the word "meme" only occurs in one paper in the 2006 HBES program.

Still, there's an interesting and curious cultural gap here, even if HBES 2006 is not the best place to look for evidence of it. Darwin took the concept of "descent with modification" from historical linguistics; and the idea of language change as cultural evolution via variation and selection has remained central in modern sociolinguistics, as can be seen in the title of one of the key journals in that subfield, Language Variation and Change; but partisans of memetics and linguistic researchers don't pay much attention to one another, as far as I can tell.

If meme is to become a scientifically useful concept -- and I'm agnostic on this issue -- then surely a key test will be its application to the origin and spread of linguistic innovation. This includes ubiquitous but sporadic word-related phenomena such as neologisms, idioms, collocations, connotations, figures of speech, and lexicalized metaphors. And it also includes the more systematic processes of change in sound systems, word formation and inflection, and sentence structure. I don't mean to ask whether you could use the term "meme" to talk about these things. Some people (though mostly not linguists) do that all the time. What I want to know is whether this way of talking can be turned into a scientifically interesting model, or even a sketched explanation that goes beyond the obvious common-sense observation that people make stuff up all the time, and a few of these inventions catch on and spread, sometimes with changes.

[Email from Cosma Shalizi:

I've just read your post about the HBES meeting, and am radiating deep envy in the direction of Philadelphia.
I have run across references to two books which sound like they're trying to do something serious with memes and linguistics, but unfortunately I have not had the chance to read either of them:

* Andrew Chesterman, Memes of Translation: The Spread of Ideas in Translation Theory (John Benjamins Co., 1997)
* Nikolaus Ritt, Selfish Sounds and Linguistic Evolution: A Darwinian Approach to Language Change (Cambridge U.P., 2004)

There is also a book supposedly applying Dan Sperber's non-memetic (but close) "epidemiology of representations" to linguistic change in Asia:
* N. J. Enfield, Linguistic Epidemiology: Semantics and Grammar of Language Contact in Mainland Southeast Asia (Routledge, 2002)
but again I've not been able to lay hold of it.

Of course there's also Juliette Blevins' Evolutionary Phonology.

Here's a bit of memetic (or at least selectionist) analysis. The three books in this list that deal with cultural evolution of language share a relatively high price ($100 for Ritt, $135 for Enfield, $100 for Blevins), and presumably also a fairly low availability in libraries. Though I buy more $100 books than I like to think about (and I already own the Blevins volume), I can't bring myself to buy every such book that looks like it might be interesting. One can argue that we (I include myself) should adjust our expectations for book prices -- attending a conference can easily cost between one and two thousand dollars for airfare, hotel bills and registrations, equivalent to 10 to 20 high-priced academic books. On the other hand, the production and replication cost of such books is usually only a fraction of their price; and in principle, their content could be made available for free on the web. More important, what is the likely success of a field that locks its ideas up in such high-priced boxes, compared to one that makes itself available for free?

There is considerable evidence that "Open Access" to individual articles increases their impact, as measured by citation indices and so forth. Is there a comparable effect for whole subfields? ]

[Cosma replies:

Re book pricing, I continue to be astonished by things like this:
because it makes me wonder why we still act like this was our situation:

Indeed. ]

Posted by Mark Liberman at 07:36 AM

June 10, 2006

The history of typographical bleeping

A few days ago, I wondered in passing about the origins of typographical bleeping, in which asterisks or hyphens or underscores are substituted for certain letters in order to avoid violating lexical taboos. Greg Hanneman emailed an example from 1869, and this caused me to do a small search that pushed it back to 1680. No doubt some readers will be able to push it back further.

Greg's example:

My 10th-grade English textbook included Bret Hart's "Outcasts of Poker Flat," in which the word "damned" appeared as "d----d."  As I recall, a footnote claimed that "Hart himself omitted the expletive."  If that's correct, then this would date the use of typographical bleeping to at least 1869, which is the date I can find on the Internet for the publication of Hart's story in Overland Monthly.

I do have a copy of "The Outcasts of Poker Flat" in a 1953 edition of "An Anthology of Famous American Stories," edited by Angus Burrell and Bennett Cerf.  The passage in question is on page 331:

The Innocent was holding forth, apparently with equal effect, to Mr. Oakhurst and Mother Shipton, who was actually relaxing into amiability.  "Is this yer a d----d picnic?" said Uncle Billy, with inward scorn, as he surveyed the sylvan group, the glancing fire-light, and the tethered animals in the foreground.

I tried searching LION (LIterature ONline) for "d____d" and similar patterns, and found Richard Ames, "A Satyr Against Man", from Sylvia's Revenge, dated 1688. A few hundred lines into the poem, Ames is complaining about bullies:

314 Bully how great i'th' Mouth the Accent sounds;
315 Bully who nothing breaths but Bl---d and W--nds?

Note that the hyphens seem to match the missing letters in the case of "wounds" but not "blood". Also, although "blood" and "wounds" are here typo-bleeped, as oaths violating the commandment against taking the Lord's name in vain, Ames and his printer find no problem with the same words in other uses, as in these couplets from the same poem:

49 A Spirit of Air and Flame may be withstood,
50 But who can shun a Divel of flesh and blood?

69 We cannot tell---but one at last is found,
70 Whose Charms the Heart of young Philander wound

The same distinction is made for words like "God" and "damn", which are written normally when used as ordinary nouns and verbs:

45 Man, must I than the hated Name rehearse,
46 Lord! how it stains my Ink and spoils my Verse,
47 Man by some angry God in passion hurl'd
48 Down, as a Plague to vex the Female World.

318 More Oaths and Curses not the Damned Vent,
319 Than from the Bullyes Brimstone-Lungs are sent.

but are bleeped when used in oaths:

320 The Divel himself is all amaz'd to see,
321 A wretch more impiously bold then hee;
322 He for one daring Act was sent to Hell,
323 But th'others loud G---d D---me's who can tell?

329 Sr. Fright-all lowers his Top-sail to your hand.
330 Your Pardon Sr. sayes he, I must request,
331 By G--- I thought you'd understood a jest,

Note that in these cases, three hyphens are always used, regardless of the number of missing letters, suggesting that the two hyphens in "W--nds" may have been a typo, but in any case didn't represent a pattern of consistent letter-for-letter substitution. (Perhaps three hyphens is a representation of an original m-dash? That would be odd, given LION's otherwise religious representation of original spelling and other typographical quirks, but perhaps there is an editorial policy against &mdash;?)
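The letter-for-letter question can be checked mechanically. What follows is only a minimal sketch in Python, not the search that was actually run against LION: the function names (bleep_to_regex, matches) are made up for illustration, and the assumption is that a run of hyphens, underscores or asterisks stands either for exactly that many letters (strict reading) or for one or more letters (loose reading).

```python
import re

def bleep_to_regex(bleeped, letter_for_letter=True):
    """Turn a typo-bleeped form such as 'd----d' into a compiled regex.

    Hypothetical helper: a run of '-', '_' or '*' is read either as exactly
    as many letters as there are bleep characters (strict), or as one or
    more letters (loose).
    """
    parts = []
    for run in re.finditer(r"[\-_*]+|[^\-_*]+", bleeped):
        text = run.group()
        if text[0] in "-_*":
            if letter_for_letter:
                parts.append("[a-z]{%d}" % len(text))
            else:
                parts.append("[a-z]+")
        else:
            # Visible letters must match literally (case-insensitively).
            parts.append(re.escape(text.lower()))
    return re.compile("".join(parts))

def matches(bleeped, word, letter_for_letter=True):
    """True if `word` is a possible expansion of the bleeped form."""
    pattern = bleep_to_regex(bleeped, letter_for_letter)
    return pattern.fullmatch(word.lower()) is not None

# Forms quoted in this post, paired with the words they stand for.
for bleeped, word in [("W--nds", "wounds"), ("Bl---d", "blood"),
                      ("d----d", "damned"), ("G---", "god")]:
    print(bleeped, word, matches(bleeped, word), matches(bleeped, word, False))
```

On the strict reading, "W--nds" and "d----d" fit their words while "Bl---d" and "G---" do not; on the loose reading all four do, which is the pattern described above.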

Through the last decade of the 17th century, and into the 18th, typographical bleeping of (religiously) taboo oaths is easy to find. In some cases, it seems as if the substitution (typically of hyphens) is letter-for-letter, e.g. Nicholas Amhurst, "Warning to young married Men" [from Poems on Several Occasions, 1723]:

13 No more as once he charm'd her list'ning Ear,
14 Call'd her no more, my Honey, and my Dear;
15 But daily, from his Work, returning Home,
16 With dreadful Oaths and Curses shook the Room;
17 To ev'ry humble Question he'd reply,
18 You saucy B-tch, G-d d--n you, what care I?

The above is also the earliest example that I've found where a word is typo-bleeped that isn't part of a religiously forbidden oath or curse. [If anyone finds earlier relevant examples, please let me know.]

Another passage by the same author provides a particularly nice example of the distinction between a quoted curse (where "damn" is bleeped) and a described curse (where "damns" is not) -- Nicholas Amhurst, "Upon Parties" from Poems on Several Occasions, 1723:

38 The Tory with his sworn Opinions big,
39 Glows with hot Zeal, and cries G-d d--n the Whig;
40 The Whig, of his Perswasion full as vain,
41 Damns the vile Tory, in as proud a Strain;

This distinction is still fitfully made, by those who typo-bleep words like "damn", but no such distinction exists for the assorted taboo words for sexual acts, bodily wastes and the like: here it's not the speech act that's taboo, but the word itself.

The earliest example of typo-bleeping scatology that I've found is James Robertson's poem "Alexander the Great", from Poems on Several Occasions, 1773 (and what's with that title, anyhow?):

1 As Alexander (all the World subdu'd)
2 Amid a throng of circling courtiers stood,
3 "In Me, he cry'd, Great Ammon's offspring view,
4 "To mighty Jove my origin is due;
5 "Let favour'd monarchs swell young Ammon's train,
6 "My father's viceroy, god-like, here I reign;
7 "Whate'er I will's the will of mighty Jove,
8 "On Earth I rule, as he commands above."
9 He spoke:---Adoring courtiers prostrate lay,
10 When a poor Crow whom chance had brought that way,
11 As high in air he o'er the monarch sped,
12 Croak'd loud disdain---and sh-t upon his head.

[Oops, as I just remembered, I myself cited a scatological typo-bleeping from 1680, in a post from August of 2005. I'll leave in the Robertson poem, worthwhile in itself, but here's the earlier passage:

22 Vile Sot! who clapt with Poetry art sick,
23 And void'st Corruption, like a Shanker'd Prick.
24 Like Ulcers, thy impostum'd Addle Brains,
25 Drop out in Matter, which thy Paper stains:
26 Whence nauseous Rhymes, by filthy Births proceed,
27 As Maggots, in some T---rd, ingendring breed.

The author is John Oldham, and the work's title is "Upon the Author of a Play call'd Sodom". I wonder if it means something that he chose to bleep "turd" but not "prick". ]

[Update -- Michael Greenberg writes:

The oldest example I can think of typographical bleeping goes back to Ancient Egyptian. It comes in two forms, the first of which is (presumably) oldest.

1) Mangling of symbols for protection. In the Pyramid Texts of the 6th Dynasty (around 2200 BCE), certain animal characters would appear mangled, e.g. snakes appear without heads and with knives through them. These texts were spells for use by the deceased in the afterlife; mangling potentially harmful characters would help protect the user.

2) Overwriting of other pharaohs' names. Ramses II was a big practicer of this in the 18th Dynasty (around 1400 BCE), but it went on before.

In both cases, defacing the written word weakens its power; in one case, it is to protect the reader, while in another it is to efface the intent of the original writer.

I agree that the first practice ("mangling for protection") seems similar in spirit to typographical bleeping. The second one (overwriting names) seems further away, at least to me.

Several readers also pointed out the obvious connection to the history of methods for avoiding pronunciation of the tetragrammaton. As far as I know, though, the traditional techniques in this case did not include replacing certain letters with graphemic wildcards. ]

[Update: the earliest bleeped F-word, here.]

Posted by Mark Liberman at 01:06 PM

Reversed née

In the New York Times today (6/10/06), Maureen Dowd's op-ed piece "Bloggers Double Down" refers to a political blogger turned novelist by both of her names:

But others... see him [Markos Moulitsas, who runs the political blog Daily Kos] the way Ana Marie Cox, née Wonkette, described him this week in Time.com:...

Whoa!  That "née" is reversed; the person who's blogged under the name Wonkette was born Ana Marie Cox.

Maureen Dowd is not the first to get it backwards for Cox/Wonkette.  Here's the title of a book review by Phil Kloer in the San Diego Union-Tribune of 1/29/06:

"Ana Marie Cox, nee Wonkette, does surprisingly well in her debut novel"

(Here "nee" is one degree less Frenchy than Dowd's "née".)

It's not just Wonkette.  Here's a reversal (also accentless) for Cutler/Washingtonienne, from "The Week in Wonkette" by Ryan Avent on the blog dcist on 1/4/06:

It began in this month's Capitol File, where Jessica Cutler (nee Washingtonienne, and also a published author) seems to direct a column (on sex, natch) at Cox.

It's not just women.  Here are two reversals (accentless, but still feminine in form if understood as French) for John Hinderaker, a lawyer who blogs on Power Line and uses the alias Hindrocket, from "Support The Troops, When Convenient" by "Reverend Mykeru" (Michael Cortese) on the blog mykeru.com on 4/15/05:

Just the other day, a member of the chickenhawk right, John Hinderaker (nee "Hindrocket", AKA "Assrocket") committed one of the most egregious examples of abandoning the troops when they start becoming real people...

and from "Hindrocket's Hackery" by Nico Pitney on the blog Think Progress of 4/14/05:

Over at Powerline, John Hinderaker (nee "Hindrocket") is outraged.

It's fairly easy to see how things got turned around this way.  We start with the adoption of the French feminine singular participle née 'born' to indicate a married woman's maiden name, as in "Hillary Clinton née Rodham"; the OED Online revision of 9/03 has cites from 1758 through 2000, all except one (from 1878) with the accent -- and that one indicates the Frenchness of the word by italicizing it.

Next, it gets extended ("often humorously and for effect", the OED says) to the meaning 'originally called' and is applied to things and places, as in the OED's first cite in this sense, from a 1958 issue of the International Journal of American Linguistics:

On Tagmemes, née gramemes

and, usually without an accent, to men who have adopted pseudonyms or aliases, as in the OED's 1988 cite from the Los Angeles Times:

He once had a coach, the infamous Johnny Blood (nee McNally)

(though the gender-appropriate French is also attested for the latter purpose; the OED has cites from 1937 on).

Then, it gets further extended to the meaning 'formerly called', with no claim that the earlier name was the original one.  Examples that are unambiguously of this sort are not hard to find; they involve a series of renamings, as in this posting by Doug Snow on the blog The Volokh Conspiracy, 4/4/06:

I think it's interesting that my IP address is resolving from the Milwaukee Central Office of AT&T (nee SBC, nee Ameritech, nee Wisconsin Bell).

or in this one by "Ace" on the blog Ace of Spades HQ, 8/17/05, entitled:

"P Diddy, nee Puff Daddy, Changing His Name Again"

This time the man was simplifying his name to Diddy.  But he was ORIGINALLY Sean Combs, and at this point in his life was merely FORMERLY Puff Daddy, soon to also be formerly P Diddy.  (Also note Frenchness conveyed by italics.)

Once we get to things like "nee Puff Daddy", the way is open for the reader to (mis)understand the nee as merely supplying an alternative, not necessarily former, name -- to understand it as a synonym of a.k.a. or AKA (or however you want to spell it), with which it sometimes co-occurs, as in the Mykeru quote above, or in this wonderful quote from "The Hysteries of Tacitus", by "Retardo Montalban" on the blog Sadly, No!, of 5/11/06, which has pretty much everything going on at once (with some invective thrown in for free):

Josh Trevino (nee' Tacitus, a.k.a. The Marble Douchebag), in the conclusion to one of his patented 'I'll Concentrate On The Mote In Your Eye If You'll Please Ignore The Huge Pole Up My Ass' diatribes...

Here we get a version of née used of a man, to introduce a pseudonym, in combination with a.k.a., and with a spelling that marks it as French -- though ineptly, since the accent is associated with the second e rather than the first.

In any case, such examples are not literally of reversed née; instead they illustrate the extension of the word to mean 'also known as', with no specification as to the temporal order in which the names appeared.  I'm not ready to go there yet with Maureen Dowd, but at least I see how she (and the others) got there.  Are they the wave of the future? 

[Update, later the same day: A pile of readers -- Gene Buckley got in first -- rose up to suggest that Dowd's usage, and some (but possibly not all) of the others I cited, could be just a variety of the 'originally named' reading -- but extended from the view of the person referred to to the view of the reading public, or Us, and in this case We are the blogosphere.  Or, in other words, that the meaning is no longer 'originally/first named', but 'originally/first known as'.  We first came across her as Wonkette, then eventually discovered that she was Ana Marie Cox, so FOR US Wonkette came first.  Plausible as another route to the appearance of reversal, and also fairly distant from the original meaning of née.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:05 PM

Extorting Barry

If you've been following the controversy over Barry Bonds' alleged steroid use, you may have heard of Kimberly Bell, the ballplayer's ex-girlfriend. She provided damaging testimony against Bonds in front of a federal grand jury investigating BALCO (the Bay Area Laboratory Cooperative, which supposedly supplied Bonds and other athletes with performance-enhancing drugs). Bell's leaked testimony was a crucial component in Game of Shadows, the book that blew the lid off the BALCO/Bonds story. Now, as the Feds line up new charges against Bonds, Bell is being told by federal investigators not to cooperate with Major League Baseball's own steroid investigation. Bonds' lawyer, Michael Rains, suggested to the AP that this is being done because Bell lacks credibility:

"Maybe they realize when Kim Bell starts answering questions, it's gonna become clear that she first tried to extort Barry for money, that she changed her stories about various things and has changed it since then and will change it again," Rains said.

This curious phrasing isn't new for Rains. Back in March, when excerpts from Game of Shadows were first published in Sports Illustrated, Rains released a statement to the San Francisco Chronicle, which read in part:

We know and understand that one of the most prominent sources is a woman who previously attempted to extort Barry for money.

This use of extort was new to me — "extort money from Barry" sounds much more natural than "extort Barry for money." And every dictionary I checked implies only "extort (something) from (someone)" as a possibility, not "extort (someone) for (something)." But an online search finds that Rains' usage is not at all idiosyncratic. (Rule #1 of Googlinguistics: Nothing is idiosyncratic.) For instance, here's a quote from boxer Jose Antonio Rivera talking about a potential title fight against Oscar De La Hoya:

"And believe me, if Don King gives me $2 million to fight him, I'm not going to complain and at the last minute try to extort him for $6 million more." (Worcester Telegram, May 8, 2006)

As it turns out, there's been a lot of talk about extortion lately, thanks to a new video game based on the Godfather movies. Here are some excerpts from reviews of the game:

Beyond plot-based missions, players can engage in multiple side jobs. Rob the local bank and rush to your safehouse before the cops catch you, extort businesses for money, or take out Corleone enemies. (USA Today, Mar. 20, 2006)

You can build up your own crime empire by moving from shop to shop and extorting the owners for 'protection' money and running illegal rackets for extra cash. (Kikizo Games, Mar. 24, 2006)

Oftentimes I'd wander into a Barber's Shop and he'd run off scared, thinking I was going to try to extort him for money. (GameSpot, Apr. 1, 2006)

For instance, if you choose to extort bakeries for protection money, you'll notice that every bakery has the same layout and character models inside. (Hartford Advocate, May 11, 2006)

Other examples lack the "for NP" complement but still take an object denoting the person or establishment from whom money is extracted, rather than the money itself:

You can go in to shops and extort them look about for rackets and then take them. (GameSpot, Apr. 16, 2006)

Additionally, there are tons of things to do, from extorting businesses to taking over the city, which means you'll definitely get your money's worth. (Hartford Advocate, May 11, 2006)

Independently of the Corleone missions, you have to extort businesses, take over rackets and generally cause mayhem throughout the city. (Mail & Guardian, May 12, 2006)

It's great fun to pretend to be a mobster, to wield deadly weapons, and to accumulate money, power, and respect by extorting businesses, carrying out contract hits, bribing cops, and fighting the occasional mob war. (Slate, June 1, 2006)

So it seems that the range of possible "frames" for the verb extort has been expanding. The previous standard usage put extort in a verb class that Beth Levin calls the "STEAL verbs," i.e.:

abduct cadge capture confiscate cop emancipate embezzle exorcise extort extract filch flog grab impound kidnap liberate lift nab pilfer pinch pirate plagiarize purloin reclaim recover redeem regain repossess rescue retrieve rustle seize smuggle snatch sneak sponge steal swipe take thieve wangle weasel winkle withdraw wrest

All of these verbs can appear in a frame that we can notate as "NPa V NPb from NPc," where NPa is the taker, NPb is the thing taken, and NPc is the victim of the taking:

The con artist extorted/extracted/snatched/stole/took $1,000 from me.

Note, however, that one common verb of this class, take, has developed another colloquial sense, which allows the frame "NPa V NPc for NPb." Here the object immediately after the verb denotes the victim and the object after for denotes the thing taken, such as money:

The con artist took me for $1,000.

The OED lists this as sense 8c of take ("To swindle, cheat, or deprive of money by extortion. Freq. const. for") and gives these citations that fit the frame:

1930 D. HAMMETT Dain Curse xii. 122 They landed Mrs Rodman... They took her for one of her apartment buildings.
1968 'L. MARSHALL' Blood on Blotter xxvii. 183 'How much did you take him for?' 'Slade? Plenty.'
1970 Washington Post 30 Sept. B12/4 It looks to me like yo're fixin' to git took for the dollar an' thirty cents, Shuffy.
1982 'E. LATHEN' Green grow Dollars xiv. 112 'I told Mary to take them for every penny she could get,' he said stoutly.

A few other colloquial verbs also work in this frame, such as soak. Here are more OED cites (under sense 7f):

1915 WODEHOUSE Something Fresh ii. 37 Especially after poor old Percy had just got soaked for such a pile of money.
1966 'L. LANE' ABZ of Scouse 101 Can I soak yer fer a coupler bob?
1977 Time 21 Nov. 59/2 Then add the investment in sophisticated equipment: a single stainless-steel 1,000-gal. vat can soak the vintner for some $6,000.

And if we consider phrasal verbs, then we can add "shake (someone) down for (money)" and "hit (someone) up for (money)" to the list. (Both expressions are American slang of early 20th-century vintage.) Note that soak, shake down, and hit up don't allow the traditional frame for "STEAL verbs," however:

* The con artist soaked $1,000 from me.
* The con artist shook $1,000 down from me.
* The con artist hit $1,000 up from me.

Those verbs aren't as flexible as take, which can fit both the "NPa V NPb from NPc" frame and the "NPa V NPc for NPb" frame. And now extort has joined take in that select class. Furthermore, we can see that the new sense of extort, like take, can have a verbal complement with just one object, denoting the victim (NPa V NPc):

The con artist took me.
The con artist extorted me.

This variant might be more acceptable to some speakers in the passive voice:

I got taken (by someone).
I was extorted (by someone).

Without the "for NP" complement, take and extort now resemble what Levin classifies as the "CHEAT verbs" (e.g., bilk, cheat, con, defraud, fleece, rob, strip, swindle). The CHEAT verbs, however, can appear in a fuller frame "NPa V NPc (out) of NPb," while take cannot. Extort, gregarious word that it is, seems like it could join this verb class too:

The con artist cheated/conned/fleeced/swindled me out of $1,000.
* The con artist took me out of $1,000.
? The con artist extorted me out of $1,000.
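
The frame judgments above can be collected into a small lookup table. What follows is only an illustrative sketch in Python: the names FRAMES, frames_for, and verbs_allowing are invented for this example, the table covers just a handful of the verbs discussed, and it records only the judgments given in this post (a missing frame means "not shown as acceptable here," not a claim about the language at large).

```python
# Ad hoc labels for the three frames discussed in the post:
#   "from":   NPa V NPb from NPc       (The con artist stole $1,000 from me.)
#   "for":    NPa V NPc for NPb        (The con artist took me for $1,000.)
#   "out_of": NPa V NPc (out) of NPb   (The con artist cheated me out of $1,000.)
FRAMES = {
    "steal": {"from"},
    "extract": {"from"},
    "snatch": {"from"},
    "take": {"from", "for"},
    "extort": {"from", "for"},  # "out_of" is only marked "?" above, so it is omitted
    "soak": {"for"},
    "shake down": {"for"},
    "hit up": {"for"},
    "cheat": {"out_of"},
    "con": {"out_of"},
    "fleece": {"out_of"},
    "swindle": {"out_of"},
}


def frames_for(verb):
    """Return the set of frames recorded for a verb (empty if the verb is not listed)."""
    return FRAMES.get(verb, set())


def verbs_allowing(frame):
    """Return an alphabetical list of the verbs recorded as allowing a frame."""
    return sorted(verb for verb, frames in FRAMES.items() if frame in frames)


if __name__ == "__main__":
    print(verbs_allowing("for"))
    print(frames_for("extort"))
```

Laid out this way, the point of the discussion is easy to see: take and extort are the only verbs in the table that are recorded with both the "from" frame and the "for" frame.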

Interestingly enough, after investigating all of this, I don't think Rains' use of extort sounds so peculiar after all! My tolerance level has been raised — but only in terms of word usage. I'm still not buying Rains' effort to discredit the most damaging witness against his client.

[Nothing new under the sun... A check of the newspaper databases finds examples of the Rainsian construction back to the '70s at least:

New York Times, July 29, 1971, p. 39
He said that bribes of about 5 per cent might be paid for a $50,000 loan. Asked if that was the going rate, he replied: "Not necessarily, some would extort you for 10 per cent or more."

Washington Post, Nov. 15, 1975, p. A16
Three Northern Virginia men convicted last month of extorting a former business associate at gunpoint were sentenced yesterday in U.S. District Court in Alexandria to prison terms of varying lengths.

New York Times, Dec. 18, 1977, p. 55
Many storeowners and restaurateurs have been allegedly extorted for money or free meals over the past few years by members of the Ghost Shadows, White Eagles and Flying Dragons youth gangs.]

[Mark Liberman offers another interpretation:

My first reaction to this usage of "extort" was that it was a lawyer's substitution for "blackmail", where "blackmail (someone) for (something)" seems like a standard usage. At least, it's normal for the direct object to be the person being threatened.
But what's interesting about this use, I think, is the non-specific "for money". I'd expect, say, "blackmail (or take or etc.) for ten thousand dollars". It's a sort of "cognate prepositional phrase", like the (semantically) cognate objects in "cost money" or "owe money".]

Posted by Benjamin Zimmer at 01:29 AM

June 09, 2006

Vocabulary Building at the Seven Lazy P

Rich and I just got back to our Montana summer fieldwork (me)-and-writing (us) base from a couple of days at the Seven Lazy P Deep Canyon Ranch, in a gorgeous never-logged and un"developed" canyon on the Rocky Mountain Front outside Choteau, Montana. From the ranch we drove up into the canyon to a trailhead from which you can wade across the North Fork of the Teton River (a bit challenging this time of year) and hike up into the Bob Marshall Wilderness. All along the 8-mile road to the trailhead we saw mule deer, mostly bucks; and, as mule deer do, they fled in hilarious pogo-stick bounds, often more or less straight up a steep mountainside. They aren't quite as charming as kangaroos hopping -- nothing could be -- but there aren't a lot of kangaroos in Montana, and watching mule deer bound is a lot of fun. At dinner that evening we reported on the unusually large number of mule deer, and on their bounding, and one of the other guests mentioned that this gait is called stodding. Never heard the word. So when I got back to my computer I googled stodding: only 18 hits, and only two of these had to do with this pogo-stick gait.

Next I tried the on-line Oxford English Dictionary (OED), looking for stod, and here I got a surprise: it didn't ask if maybe I meant something else, like maybe stud; it simply sent me directly to the stud entry. A flaw in the program, if they make assumptions about user-misspellings. I went back and tried stodding and got (appropriately, as it turned out) a "no results" report.

Back to Google, and this time I tried "mule deer" pogo stick. That worked: in various entries I found the word stotting for the bounding gait. So my fellow ranch guest had fallen into the Metal, meddle, mettle, medal, etc. trap with his spelling (I did ask him to spell it) stodding. The spelling stotting (which is of course pronounced exactly like stodding, though stot and stod are not pronounced identically) is in the OED: the verb means `rebound, bounce (from, off)...; jump, start, spring'. Numerous sources (see Google) use the verb to refer to the bounding gait of antelope, gazelles, and also mule deer. The sources describe the gait as a stiff-legged bounding with all four feet off the ground at once.

But some sources note that there's also another word for this same gait: pronk. Unlike stot, which (the OED says) is of obscure origin, the etymology of pronk is known: it's from an Afrikaans word meaning `to show off, strut, prance', and ultimately from Dutch pronken `to strut'; and it was first applied to the spectacular bounds of the little South African antelope called a springbok. I have a vague recollection of hearing pronk in an obscene sense, but either I'm remembering wrong or all the Google sources I checked were too prim to mention this one.

Anyway, now I know how to describe the mule deer's bounding with two words that are brand new to me. I think I still prefer the pogo-stick image, though.

Addendum/Correction: Ben Zimmer just told me, as we passed in a corridor in Language Log Plaza, that there is reason in the OED's rhyme, because stod is given in the stud entry as a 14th-century spelling variant of stud.

THREE MORE ADDENDA: First, a correction, thanks to Mark Etherton, a Language Log reader in the U.K.: RP and many (most? all??) speakers of British English would have a genuine phonetic [d] in stodding and a genuine phonetic [t] in stotting. So my comment about identical pronunciations of the middle consonant in the two words is valid only for most (not all) dialects of American English. Careless of me not to mention this in my post: apologies to all international readers of Language Log!

Second, Grant Hutchins reports that a science comprehension section on his ACT exam (vintage 2000 or 2001) "heavily featured" measurements of stotting by some kind of buck. He says that the word struck him as odd at the time, and that he will never forget its meaning.... So clearly it's possible to build this word into one's vocabulary without ever visiting western Montana.

And third, another reader turned up obscene and scatological meanings for spronk, so that's probably why I vaguely remembered such meanings for pronk. But (as the reader discovered) googling spronking (rather than just pronk, which turns up too many proper names) shows that its current usage is mainly the same as pronking or stotting. It's certainly not as common as pronking -- only 93 Google hits, as opposed to 11,300 for pronking -- but it's still puzzling: where did that initial s- come from? Not from Dutch, as far as I can tell. (The OED has an entry for spronk, but both of its two meanings, `a shoot, sprout' and `a spark', are irrelevant, and also the word is obsolete except in certain dialects. So that's no help.)

Posted by Sally Thomason at 02:03 PM

Go and synergize no more

If, as Geoff Pullum reminds us, "people who are clueless about English grammar shouldn't be trying to humiliate others over grammar," then by the same token people who don't know how to use a dictionary shouldn't try to appeal to lexicographical authority to advance an argument. The latest such case (brought to the attention of the American Dialect Society mailing list by Bill Mullins) comes from a blog called NASA Watch by Keith Cowing, a former NASA scientist who now casts a critical eye on the workings of the agency. In a recent entry about linguistic sloppiness at NASA, Cowing zeroed in on one particular lexical item that rubbed him the wrong way:

Experts Agree: Synergizing is not a Word - Yet
Today, in the Constellation presentations at NASA headquarters, I heard Jeff Hanley use the word "synergizing" several times - as a verb. This was a new word to my ears - almost as good as hearing someone at NASA saying that they were going to "action" something. Out of curiosity, I checked Dictionary.com, Merriam-Webster OnLine, The Free Dictionary, and Cambridge Dictionaries Online. None of these resources recognized the word "synergizing".

As Cowing's links indicate, the search term he used for these online references was synergizing, not synergize. If he had thought to search on the uninflected form, he would have had better success with Dictionary.com, which consolidates entries from a number of dictionaries. A proper search finds entries for synergize in Webster's New Millennium Dictionary of English and Merriam-Webster's Medical Dictionary.
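
The search strategy described here, falling back to the uninflected form when the inflected one turns up nothing, can be sketched roughly as follows. This is a toy illustration in Python and not a description of how Dictionary.com or any other dictionary site actually works: the HEADWORDS set, the crude suffix rules, and the function names are all assumptions made up for the example.

```python
# Hypothetical mini-dictionary of headwords (uninflected forms only).
HEADWORDS = {"synergize", "synergy", "action"}


def candidate_base_forms(word):
    """Yield the word itself, then crude guesses at its uninflected form."""
    yield word
    if word.endswith("ing") and len(word) > 5:
        stem = word[:-3]
        yield stem          # "actioning" -> "action"
        yield stem + "e"    # "synergizing" -> "synergize"
    if word.endswith("ed") and len(word) > 4:
        yield word[:-2]     # "actioned" -> "action"
        yield word[:-1]     # "synergized" -> "synergize"
    if word.endswith("s") and len(word) > 3:
        yield word[:-1]     # "synergizes" -> "synergize"


def look_up(word):
    """Return the first candidate form found among the headwords, or None."""
    for form in candidate_base_forms(word.lower()):
        if form in HEADWORDS:
            return form
    return None


if __name__ == "__main__":
    for query in ["synergizing", "actioning", "concepting"]:
        print(query, "->", look_up(query))
```

A search for the literal string synergizing fails against a headword list like this one, while the same search routed through the base-form guesses succeeds, which is all that Cowing's queries were missing.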

(Webster's New Millennium, by the way, has no relation to the Merriam-Webster line of dictionaries. It's an electronic resource recently developed by Lexico, the publishing group behind Dictionary.com. So far, in this new millennium, its main claim to fame is that it fell for a lexicographical copyright trap set by the New Oxford American Dictionary — see this New Yorker piece and this Chicago Tribune followup for details.)

Merriam-Webster's Medical Dictionary provides the older sense of synergize, "to act as synergists; exhibit synergism," or, used transitively, "to increase the activity of (a substance)." Webster's New Millennium gives a more generalized meaning, "to cooperate with another or others, esp. to remedy something." It's true that the other references on Cowing's list — Merriam-Webster Online, The Free Dictionary (which reproduces entries from the American Heritage Dictionary), and the Cambridge Dictionaries Online — are no help with synergize or synergizing. But both the limited sense relating to interacting pharmacological agents and the more extended sense can be found in various unabridged dictionaries. The Oxford English Dictionary gives the definition as "to act as a synergist, co-operate, as a remedy, or an organ, with another," with a handful of citations, including this lyrical passage of literary criticism:

1954 Times Lit. Suppl. 12 Nov. 721/1 The illuminating, synergizing word here, without which the rest is nothing but maundering, is..the word sighs.

Webster's Third New International Dictionary of 1961, still reliable after all these years, gives the "act as synergists" sense later published in Merriam-Webster's Medical Dictionary, also listing the synonyms cooperate and coordinate. Granted, the OED and Webster's Third aren't available online for free, but the wonderful Century Dictionary is, as it's old enough to be in the public domain. Global Language Resources even offers scanned page images, and here's what you can find in the 1909 supplement:

[Image: scanned page from the 1909 Century Dictionary supplement showing the entry for synergize]

So at least in the medical sense, synergize and synergizing have been in the bigger dictionaries for about a hundred years. (And Google Book Search easily churns out attestations back to 1882.) But guess what? As it relates to Cowing's gripe, none of this really matters. Cowing didn't like the way that synergizing was used in NASA management-speak, and he thought he found evidence that it was "not a word" based on a desultory search of online references. So has synergize suddenly "become a word" because we've found it in several dictionaries? The idea is laughable. But such is the power of "The Dictionary" in the popular imagination as an anointer of sacred wordhood.

Lexicographers themselves are quick to dispel this notion of The Dictionary as ultimate authority. Two recent pieces by dictionary honchos make for instructive reading on this point. Jesse Sheidlower, editor-at-large of the Oxford English Dictionary, wrote last month on Slate about copywriter Ray Del Savio's quixotic campaign to get the verb concept into Merriam-Webster's Collegiate Dictionary. Such lobbying is the flip-side to Cowing's disparagement of synergizing as "not a word" because he couldn't find it in a dictionary: from Del Savio's perspective, once concepting is in The Almighty Dictionary, well, then it will really have arrived!

In a similar vein, New Oxford American Dictionary editor Erin McKean stepped in as a guest blogger for Powell's Books not too long ago and posted an entertaining entry titled "What's a Word Gotta Do to Get in This Joint, Anyway?" She writes:

Lots of people (and by "lots" I mean roughly 99% of everyone I've ever spoken to) believe that the dictionary is a Who's Who of words. That it's like Ivy League college admissions. That only the really good words, the ones that have eaten all their spinach and who play the oboe and who get high scores on the SAT, make it into the dictionary. That the words that make it into the dictionary are somehow "realer" than the words that don't.

Well, that's not exactly true. It does take a bit of work to get a word into the dictionary, but inclusion in the dictionary is not an honor. The dictionary words are not more real than the words not in the dictionary. What they are is more USEFUL.

Think of the dictionary as less of a Social Register for words and more like a word general store. I am the manager of the word general store. Do I stock only words in my size? Only in the flavors I like? Only the words I wish people would use? No — I provide a wide selection of words for the use of all my customers. And because my customers are such a wide group (basically, all adult readers and writers) I have to make sure to include the words that will serve their needs.

Because there's limited shelf space in the Word Store, we have to make hard decisions about what to stock. Those decisions are based mainly on usefulness, not on beauty or on any kind of perceived intrinsic merit. We want you to be able to walk in and grab what you need off the shelf (and since we don't have to worry about shoplifting, there's nothing kept behind the counter).

If more people thought of the dictionary as a "word general store" (and had a better idea of how to rummage through the aisles), then maybe we could avoid these constant misperceptions about what's stocked on the shelves, lexically speaking.

(As for the management sense of synergizing that got Cowing's goat in the first place, its popularization is no doubt due to Stephen R. Covey's 1989 best-seller, The Seven Habits of Highly Effective People. Habit #6 is "Synergize" — as Covey advises, "The whole is greater than the sum of its parts.")

[Update #1: And speaking of popular perceptions of how words get included in dictionaries, here's a wonderful image from the New York Times Sunday Magazine of June 3, 1923, accompanying an article entitled, "Getting Into the English Dictionary; Every New Word Must Pass an Inquisition to be Admitted to the Select 500,000." (Thanks to George Thompson for discovering this and Jesse Sheidlower for hosting the image online.)]

[Update #2: I goofed in the original version of this post, misspelling Cowing's name as "Cowling," as Bill Mullins had rendered it on the American Dialect Society mailing list. Cowing responds on his blog:

I just love it when the language police try and make a grammatical point - and misspell the name of the person they cite for having committed the grammatical offense ...

It just goes to show that Hartman's/Skitt's/McKean's Law is alive and well. But I hope it's clear that there was no "grammatical offense" that I was "policing"... Just an appeal for improved appreciation of dictionaries, both practically in terms of actual use and conceptually in terms of understanding the limits of their authority.]

Posted by Benjamin Zimmer at 01:34 AM

June 08, 2006

Zero for three on grammar, minus three, makes -3

On May 23 a user named "Bob" posted on the left-leaning group blog Blue Mass Group a brief attack on the grammar of a post on a rival right-leaning blog, Hub Politics. Said Bob:

Regressive blog Hub Politics offers a revealing post today titled "English Perversion." The piece, which blames the legislature for the poor English of some schoolchildren, begins: "We can thank the overwhelmingly liberal legislature for the poor fluency amongst the non-native English speaking students, after all, it was the Democrats on Beacon Hill who gave the axe to the voter approve English immersion program." (Screen shot here). A grammatically correct sentence, of course, would use the past tense of the word "approve." One might also question the author's use of the preposition "amongst" in this context, and the awkward use of the clause "after all," but let's start with spelling and grammar. With friends like these, do Massachusetts Republicans need enemies?

Bob makes three grammar points here. And guess what his record is for getting stuff right? It's beginning to be a familiar score. He's zero for three. In fact you could say he's actually at minus 3, because while he manages to be wrong on all three of the points he makes, he misses three actual errors in the quoted passage. Did you spot them?

Here is the detailed grammar fisking.

  1. Bob can't tell the past tense (preterite) from the past participle. It is true that approve should have been approved, but that's because it should have been a past participle. (For approve, as for all regular verbs in English, the preterite and the past participle have the same form in both spelling and pronunciation. But if an irregular verb like write had been used instead of approve, the correct form here would have been the past participle written, not the preterite wrote.) So he can't tell a tensed verb from a participle, or a finite clause from a non-finite one.
  2. It is quite unclear what Bob thinks might be wrong about using amongst here. He may be half-recollecting the old story about how between is wrong for more than two items and should be replaced by among or amongst, which is not true, but even if it were, amongst should be fine here. Perhaps Bob is one of those who think that amongst is just a pretentious alternative form of among, which should be avoided as an affectation. Arnold Zwicky tells me there are some people who believe this. But the belief is false: Merriam-Webster's Dictionary of English Usage cites clear evidence against it from the usage of a number of authors. Amongst is slightly less common than among, but both are correct. Bob thinks he has a possible nitpick, but he doesn't.
  3. Regarding the awkward use of the clause "after all", the awkward fact is that "after all" is not a clause. It's a preposition phrase (of idiomatic meaning). Bob doesn't know the difference.
  4. The first of the three errors that could actually be charged in this paragraph but are missed by Bob is that "English speaking" might reasonably be corrected to "English-speaking". The hyphen can be seen in this image of one edition of the book by Churchill that contains this phrase in its title. I would not insist on this intransigently — hyphens in compound attributive modifiers are not consistently used even in printed sources; but if Bob wants to be pedantic, then he would have been well advised to point out that it would have been preferable to follow Churchill and have the hyphen in this modifier.
  5. The second error Bob missed is that a hyphen is even more clearly needed in the compound attributive modifier "voter-approved", which means "approved by the voters". There are only two occurrences in the 1987-1989 Wall Street Journal corpus, but both have the hyphen.
  6. And finally, the comma after "students" marks the start of a clear case of an unacceptable run-on sentence. A semicolon or a period is needed after "students". The run-on sentence is a subtle matter (some apparent run-ons seem fine, like the sequence of them that begins Dickens' A Tale of Two Cities), but this is pretty clearly a case of the bad type.

The bottom line here is that people who are clueless about English grammar shouldn't be trying to humiliate others over grammar for political advantage. Those who feel grammar correction is something they want to discuss in the media should do just a little bit of work to prepare themselves for their calling. Not knowing a preterite from a participle or a clause from a phrase is like not knowing how to add small integers.

And in any case, as the Boston weekly Dig, in the Media Farm section on May 31, points out, "Blue Mass should be pointed to Rule 1 of Media Criticism: avoid ridiculing typos." (Then they immediately break Rule 1 by pointing out two typos from the Blue Mass blog from the previous few days. Sigh.)

The Dig also notes that "If anything, bad grammar actually seems to help Republican political campaigns"; but that may just be a Dig dig at Bushisms once again. (Renewed sigh.)

Posted by Geoffrey K. Pullum at 02:45 PM

Meh-ness to society

In today's Star-Ledger (a daily newspaper from northern New Jersey), television critic Alan Sepinwall responds to readers' comments about the HBO series "The Sopranos." Like many other critics and fans, Sepinwall was bitterly disappointed by Sunday's uneventful season finale. One reader pointed out that Sepinwall's wrap-up of the episode neglected a touching scene where New Jersey mob boss Tony Soprano pays a hospital visit to his New York counterpart, Phil Leotardo, who has suffered a heart attack. Sepinwall answers the reader apologetically:

I was so busy railing against the overall meh-ness of the finale to mention arguably the best scene of the hour.

Grammatically, that sentence is a bit strange (shouldn't he have written "I was too busy ... to mention" or "I was so busy ... I forgot to mention"?), but we can chalk that up to poor editing. More interesting is Sepinwall's use of meh-ness to describe his lukewarm reaction to the episode.

To understand meh-ness, let's first take a look at the root, meh, primarily used as an interjection like blah, evoking dullness or apathy. As with so much in contemporary American pop culture, Fox's long-running animated show "The Simpsons" is crucial to the development of meh. Here is the current entry for meh in Wikipedia's excellent "List of neologisms on The Simpsons":

"Meh" is a commonly used word in the Simpsons universe, and is a sort of grunt of disinterest.

In the episode "Hungry Hungry Homer", Homer asks Bart and Lisa if they want to go to Blockoland:

Bart and Lisa together: Meh.
Homer: But the TV gave me the impression that...
Bart: We said, "Meh!"
Lisa: M-E-H, meh.
The meaning seems to be approximately "I'm not in favour of the idea, but would go along if necessary." It could also be interpreted to mean "Oh well" or "whatever." For example:
Mother: What do you think of the new socks your aunt gave you for Christmas?
Son: Meh.
"Meh" is also often used by The Cheat of Homestar Runner fame.

One notable use in this form was in the episode "Girly Edition." When Marge says to Homer "Oh, for Pete's sake! Why is that monkey wearing a diaper? I thought he was housebroken!", Mojo, the helper-monkey, responds by waving his hand while saying "Meh".

The Simpsonian usage of meh hasn't passed by eagle-eyed lexicographer Grant Barrett, who logged a citation for it on Double-Tongued Word Wrester, his online compendium of slang, jargon, and other fringe vocabulary. (Incidentally, Grant has spun off DTWW into a fine new book, The Official Dictionary of Unofficial English, which joins Far from the Madding Gerund on this year's list of Language Log-approved bathroom reading.) The DTWW citation for meh is from a bit of "Simpsons" dialogue in a May 1996 episode reproduced on the Usenet newsgroup alt.tv.simpsons:

Homer: [holds Lisa's suitcase] Somebody's travelling light.
Lisa: Meh. Maybe you're just getting stronger.
Homer: Well, I have been eating more.

That wasn't the first use of meh on "The Simpsons," though. A March 1995 episode had this exchange between Bart and Marge on a family outing to the Springfield Renaissance Fair (as transcribed on alt.tv.simpsons and The Simpsons Archive):

Bart: [whining] Oh, these renaissance fairs are so boring.
Marge: Oh, really? Did you see the loom? [camera turns to it] I took loom in high school.
[Marge hums, quickly weaves "Hi Bart, I am weaving on a loom"]
Bart: [pause] Meh.

At the time, meh was such a new addition to the vocabulary of "The Simpsons" that the closed-captioning transcribers didn't know how to deal with it. According to a post on alt.tv.simpsons, Bart's interjection was "rendered incorrectly, and humorlessly, in closed-captioning as 'Nah.'"

Nowadays meh has become firmly established in TV fan forums, very often extended to adjectival usage, as in "that episode was meh" or "that was a meh performance." So meh-ness was a logical next step in the description of television fare that leaves viewers unmoved. Meh-ness is a particularly popular neologism on the forums of Television Without Pity (whose motto is "Spare the snark, spoil the networks"). Fittingly, the usage appears most widespread in the forums for "American Idol," a show that inspires an enormous amount of meh-ness vis-à-vis its weekly cavalcade of insipid singing. Here's a selection of TWoP comments on "AI" from the first few months of 2004:

All that to say that I think your impression of the talent pool seems pretty accurate, splitchick, given the overall meh-ness of this season's so-called "talent". (Jan. 30, 2004)

Second place will be whomever is left after the meh-ness cancels each other out. (Feb. 17, 2004)

Kim was very gracious here. Especially considering the meh-ness that is AI3. (Mar. 10, 2004)

And Clay breaks the long-standing tradition/curse of meh-ness and sucking from former Idol constants who reappear on the show. (Mar. 17, 2004)

Camille - Started out well, but then faded into meh-ness. (Apr. 6, 2004)

It's the meh-ness of his singing that pisses me off. (Apr. 7, 2004)

I agree about the meh-ness of John's voice. (Apr. 7, 2004)

Jasmine: You reign as the undisputed queen of this season's meh-ness. (Apr. 15, 2004)

I had a feeling that song wouldn't sound good in his range, and it really didn't. Utter mehness. (Apr 27, 2004)

Bring on Marque, Lisa, Susie, and everyone else who might make this display of infinite mehness a little more interesting. (Apr 28, 2004)

Recently another fan site devoted to "American Idol," known as Foxes On Idol, used a bland performance by Chris Daughtry as an opportunity to make an obvious pun (which I've shamelessly appropriated):

Chris is turning into a meh-ness to society. (May 10, 2006)

So the appearance of meh-ness in the Star-Ledger column is perhaps an indication of the incipient mainstreaming of a term previously limited to subcultural fan discourse on sites like Television Without Pity. It's not surprising that TWoP would be a fertile breeding ground for neologisms, since participants in the forums are constantly exploring innovative wordplay. But such innovations usually remain limited to usage within the forums, where they serve as a badge of in-group identity for the snark-oscenti. For more on creative TWoP-talk, check out the article by Mark Peters in the latest issue of Slayage, a journal devoted to scholarly appreciation of "Buffy the Vampire Slayer." The article is entitled, "Getting a Wiggins and Being a Bitca: How Two Items of Slayer Slang Survive on the Television Without Pity Message Boards" (HTML, PDF). It's part of a special issue, "Beyond Slayer Slang: Pragmatics, Discourse, and Style in Buffy the Vampire Slayer," guest-edited by Michael Adams, author of Slayer Slang and a noted wordanista.

[Update #1: Karen Kay emails to point out that I completely missed the Yiddish roots of the interjection meh, which would of course long predate "The Simpsons." There was some discussion of meh last year on Metafilter, where a Yiddish origin was suggested. One commenter supplied a link to lyrics for a Yiddish song from 1936, "Yidl Mitn Fidl," in which meh appears as the bleat of a goat and rhymes with feh. (Feh for some reason is more recognizably Yiddish to me than meh. Jonathan Lighter observes that feh was favored by the writers of Mad magazine, though he believes that it expresses "slightly greater disapproval" than meh.)

Elsewhere on the Web, a commenter on Artblog.net defines meh as "a Yiddish interjection used to express disdain that borders on apathy." Beyond that I don't find much online discussion of meh's Yiddishness. (For instance, Yiddish goes unmentioned on a silly website devoted to the word called "The Gospels of MEH," which provides only a spurious origin story from 1986.) It's very possible that the "Simpsons" writers took meh from Yiddish, though compare similar nonsense syllables used on the show such as buh, snuh, and zuh. In any case, it seems that whatever Yiddish origins the interjection might have had, they have been lost in post-"Simpsons" usage.]

[Update #2: Eleanor Wroblewski reports:

On the internet, the word "meh" has been steadily mainstreaming for quite some time; I've been using it for years, completely unaware of its use on the Simpsons, or frequenting any websites you mentioned, or indeed being involved in that section of fan culture; "I have fandom friends who do that" might apply here, but I'm pretty sure the people I've picked it up from probably pride themselves on *not* watching the Simpsons.

Just to clarify, it was meh-ness, not meh on its own, that I thought might be about to shift from fan-forum usage to the mainstream. Meh itself has indeed become quite prevalent online, as can be seen from the multitude of definitions offered up by the likes of Urbandictionary, Langmaker, Wiktionary, and Merriam Webster's Open Dictionary.]

Posted by Benjamin Zimmer at 12:28 PM

June 07, 2006

One of those who

From reader Adam Drew this morning, a posting (of 6/1/06) he found on the Defective Yeti website:


I posted this question to a discussion group and it incited a veritable brawl:

Which is grammatically correct: "I have had sex with each and every member of Avenged Sevenfold, one of the bands that [is|are] part of Ozzfest 2006."

No consensus was reached, so we can settle the matter once and for all, right here on this humble little webpage. Fight!

In the seven days since then the responses have piled up alarmingly.  Some people said only the plural was correct, period, and one of them cited Bryan Garner (who is inflexible on this point in his Dictionary of Modern American Usage) in support.  At least one reader was disbelieving that anyone could EVER use the singular here.  Others maintained that only the singular was grammatical.  Several said that both were possible; one of these helpfully and painstakingly provided sentence diagrams for each of the variants.  Pandemonium reigned.  Or maybe even rained.

This is a venerable controversy --  people have complained about the singular variant since at least 1770 -- and this variant goes back even farther, at least to Shakespeare.  (The plural variant is attested in the 10th century.)  What's interesting here is that passionate discussions on points of grammar, usage, and style break out in all sorts of places that have nothing in particular to do with language: child-care forums, techie mailing lists, lgbt newsgroups, for instance.  People CARE.  And the issues that exercise them so much are mostly famous ones in the Ling Biz, like this one, issues that are treated with some care in Merriam-Webster's Dictionary of English Usage -- a volume that scarcely anyone outside of the Ling Biz seems to know about.  Forget Strunk & White.  Go for MWDEU.

If you look in MWDEU under one of those who you will find capsule histories of the variation in usage and of the controversy about it among the various "authorities" on grammar, usage, and style.  The article observes that each variant appears with some frequency, including by respected writers in formal contexts, and that often both appear in the work of a single writer (Joseph Addison in The Spectator, for instance).  MWDEU ultimately takes the position of John S. Kenyon in a 1951 American Speech article, that (in the dictionary's words) "it is simply a matter of which is to be master" -- the singular one or the plural NP (those, the bands, whatever) that follows it.  This is in fact the position of "Umrain Zero", who produced those sentence diagrams.  Both variants are ok, and they might even be conveying slightly different things (with greater discourse salience of the one thing, with singular agreement, or of the reference class from which this thing is drawn, with plural agreement).

The advice literature on grammar, usage, and style is on the whole inimical to alternative expressions with the "same meaning", generally advising that one of the alternatives be banned entirely (as in this case) or else relegated to conversation and informal writing (as in the case of a lot of vs. much, which I've recently written about here).  There are two problems here.  First, again and again it turns out (as Dwight Bolinger used to explain so often) that the alternatives are not truly free variants, but convey different meanings or discourse statuses.  Second, free variation (without semantic or discourse-structuring concomitants) is still a great thing to have around, indicating all sorts of other things: your mood, your persona, your attitude towards the people you're talking or writing to, your social group membership, and so on.  This is true even if we confine ourselves to the standard language.  We can do more things if we have more choices.

So I say: try to steer clear of people who want to impoverish your set of choices.  And don't go around constraining other people's sets of choices, especially if they might be doing something subtle with them.  Your life will be calmer, less stressful, and less contentious.  Go out and get (as Geoff Pullum advised here) a copy of MWDEU, and apply it soothingly when you feel a fit of grammar-wrangling coming on.  (Geoff recommended the concise version, but I usually suggest that people spring for the full volume.  Note: I am in no way associated with the Merriam-Webster company.)

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:59 PM

Mock modesty at the NYT

Two correspondents, Douglas Davidson and Mark Swofford, noticed the following baffling paragraph in a New York Times piece on Nice and its Provençal cuisine, 6/4/06, by Mark Bittman:

But do not, under any circumstances, skip the classic niçois version of gnocchi (its name, even in French, cannot be printed here), made with Swiss chard and served with one of three sauces: gorgonzola, pistou or tomato.

A name that is so obscene it cannot be printed in the Times EVEN IN FRENCH?  Ok, you're asking (as Davidson and Swofford were asking): what name?

I sent out an appeal to the full staff at Language Log Plaza, and within a minute or so quick-draw Ben Zimmer had the answer, obtained via a Google search on <gnocchi niçois>: Provençal "merda de can" 'dog shit'.  We know from past experience (see my recent posting) that the NYT is "shit"-averse as well as "fuck"-averse, but this is ridiculous.  Surely Bittman is just toying with us, in an annoyingly obscure way.  (Maybe he's been reading Language Log and knows that our keen linguist's ears prick up when we come across allusions to things that can't be printed in the paper.  Well, he SHOULD be reading Language Log.  It's part of a complete liberal education.)

I smelled a put-on, but before I could do a Google search for French shit in the Times -- in my defense, Thomas Grano and I were busy searching the New Yorker for "a lot of" and "lots of" at the time, and we were hot on the trail of some nice numbers -- Mark Liberman did the work for me.  Mark reported:

A search on the NYT web site ("since 1981") for "merde" turns up 13 examples, e.g. a piece from 1/20/2003 by Craig Smith under the title "Villefranche-sur-Saone Journal: When Bad-Mouthing Wine is a Punishable Offense", which includes the phrase "vin de merde".

William Safire alone has used "merde" twice in his columns, according to that search.

It's true that "merda" turns up nothing, but surely that's because Provençal is massively less frequent than French. And in fact, a search for "merda" returns the helpful query "Did you intend to search for merde?" -- along with an even more helpful spot ad from eBay, asking "Looking for Merda? Find exactly what you want today."

You heard it here: every so often, the merde hits the fan at the Times.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:45 PM

Goram motherfrakker!

I should have realized that alluding to science fiction in any way was likely to elicit e-mail, and sure enough that's what happened when my latest posting on the ridiculous modesty of the New York Times mentioned invented swearwords in the writing of Larry Niven and on the television shows Farscape and Red Dwarf.  So now, a few items of interest from two other shows, Battlestar Galactica and Firefly.

Along the way, I stumbled across a WikiWikiWeb page on "fake cuss word", with a variety of examples from science fiction, and more.  Smegging amazing place, the web.

By the way, correspondent Nate Dorward twitted me about my use of "sci fi" in that posting, noting that "sci fi" (or "sci-fi" or "scifi") is "apparently offensive to the ears of science fiction fans, who always use 's.f.'"  (or "S.F.", "sf", or "SF", I add).  An inspection of the Terminology section of the Wikipedia article on science fiction reveals that Dorward's formulation is too absolute; a lot of science fiction fans and writers detest "sci-fi", others seem not to mind it, and some make a distinction between the genres of SF and sci-fi.  Meanwhile, we have things like the Sci-Fi Channel in the U.S. and Sci-Fi London, "the UK's only dedicated SF and Fantasy film festival".  It's a usage morass out there in space.

On to Battlestar Galactica.  Yesterday Bruce Webster and Chris Reaves both pointed out the use of "frak" (so spelled in its entry in the BG wiki) in the series as a replacement for "fuck", in all of its uses, including "motherfrakker", modifying "frakkin'", exclamatory "oh, frak!", "frak you!", and the literal "frakked" 'had sexual intercourse with'.  This one is especially nice, since it's phonologically very close to both "fuck" and "frig" (which some people view as a downtoned substitute for "fuck", and others -- John Cowan, among them -- view as significantly worse than "fuck"; "mysterious are the ways of taboo", as Cowan wrote to me last September).  Webster even caught it migrating out of the science fiction world and into this season's opening episode of Scrubs, where Dr. Elliot Reid used it to express frustration.  For all I know, little kids are using "frak" as a coded obscenity -- delicious, because it's just barely coded at all.

(Webster also reports "a long-established subculture of f-word substitutions among Latter-day Saint (Mormon) males in their teens and twenties; the two most common are 'flip' and 'fetch'.")

Soon after the "frak"-sightings came Noah Silbert and Hakim Cassimally, with some wonderful stuff from the Firefly series, which uses "goram" as an all-purpose replacement for "goddamn" (from which it is obviously derived), a series of curses in Chinese, and "rutting" as a synonym for "fucking".  Silbert wanted to see "goram" as derived from "goddamn" via a sound change imagined by Joss Whedon; I thought this was stretching things a bit, but maybe somebody will get the opportunity to ask Whedon some day.

With these reports, I leave the world of cursing and swearing in science fiction.  Here's a small sampling from that world.  If you want more, go to the "fake cuss words" page.  (And if you're a Firefly fan, add "goram" to the page.  It wasn't there the last time I looked.)

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:03 PM

"Redux" in flux

News flash from the Associated Press... The Senate has rejected a constitutional amendment to ban gay marriage. But the more momentous news from Language Log's perspective comes in this sentence from the AP report:

The House plans a redux next month, said Majority Leader John Boehner, R-Ohio.

That's right, redux is now a noun! (And we're not talking about appetite suppressants.) Making a verb out of a noun is one thing, but making a noun out of a postpositive adjective? Now that's innovative. Should we expect this usage to usher in a galore of lexical extraordinaires?

Actually, the AP's usage of redux to mean 'return, revival' is not particularly new, as I discovered from a quick trip to Google Book Search. Here are some predecessors:

Gerard Jones, Honey, I'm Home!: Sitcoms: Selling the American Dream (1992), p. 216
His ugly, prissy, appearance-obsessed sister-in-law, Esther, was a redux of Kingfish's wife, Sapphire.

Robert W. Love, History of the U.S. Navy, 1775-1941: Volume One (1992), p. 324
The redux of the wave then dragged her back to the coral reef at the edge of town.

Stacy Schiff, Saint-Exupery: A Biography (1994), p. 79
The room, a redux of the Bossuet desk, was an abominable mess: sheets of paper, bits of paper, balls of paper covered everything.

Linda Mizejewski, Ziegfeld Girl (1999), p. 196
The film's serious offerings of X-rated sex, especially in conjunction with its deadpan redux of All about Eve backstage jealousy, were the very terms of its campy reception.

David Foster Wallace, Brief Interviews With Hideous Men: Stories (1999), p. 245
For Sissee Nar's title role, opposite the contemporary logos-legend Vanna of the White Hands as the lunar Selene in this somewhat Sapphic redux of a well-known mini-myth, called only for catatonia.

Douglas Brode, The Films of Steven Spielberg (2000), p. 274
Dennis Quaid, as test pilot Tuck Pendelton, braces himself for a miniaturized flight into the human body in Innerspace, a comical redux of Fantastic Voyage.

Joseph A. Califano, Jr., Inside: A Public and Private Life (2004), p. 108
Negotiations between civil rights leaders and local businessmen eventually desegregated Birmingham lunch counters, but I remained fearful of a redux of Oxford as sporadic bombings and outbursts of vandalism continued.

The author of the last citation, Joseph A. Califano, Jr., is evidently an inveterate user of nominal redux. Here he is in a Washington Post column way back in 1981:

Joseph A. Califano Jr., Washington Post, May 20, 1981, p. A8
I thought it was only a matter of time before Carter and Edward Kennedy became a redux of Johnson and Robert Kennedy.

All of the above examples come in the form "(a) redux of (something)" — much rarer is the AP's usage of redux as a standalone noun not followed by "of..." It's possible to find scattered examples of this, however, as in the title of Vern Bullough's contribution to the 1996 book Desire and Discipline: Sex and Sexuality in the Premodern West (Jacqueline Murray and Konrad Eisenbichler, ed.): "Sex in History: A Redux."

Postpositive adjectives like redux constitute a peculiar word class in English. Most such adjectives are borrowings from Romance languages, especially French (à-gogo, extraordinaire, manqué, flambé). Italian, meanwhile, gives us culinary adjectives that stand for the longer expression "alla X" ('in the X style'), as in carbonara, primavera, and puttanesca.

Redux of course comes from Latin, which also supplies postpositive agonistes and redivivus. The first prominent English usage of redux was in the Latin title John Dryden bestowed on his 1660 panegyric to Charles II, Astraea Redux. By the late 19th century redux could be used as a postmodifier for English proper nouns, as in Anthony Trollope's Phineas Redux (1873). And a century later John Updike gave redux a big boost with the second book in his "Rabbit" series, Rabbit Redux (1971). More recently, Francis Ford Coppola revisited his 1979 movie Apocalypse Now with a bigger, longer, uncut version, Apocalypse Now: Redux (2001).

So now that redux is a noun, the next obvious step is to verb it. Oops, too late! There are more than 1,000 Google hits for reduxing, including this 2002 New York Times headline: "It's the Forsytes, Reduxing Again."

[Checking the OED, I see that galore was nominalized quite a long time ago, as in this 1849 cite from George F. Ruxton's Life in the Far West: "Galore of alcohol to ratify the trade." Galore, by the way, derives from the Irish adverbial phrase go leór 'enough' (lit. 'to sufficiency').]

Posted by Benjamin Zimmer at 01:20 PM

Mung gets munged

Following up on my suggestion in a post yesterday of a term mungagnia, an addiction to recursion, possibly sexual, my learned friend Jason Kerwin pointed out to me that the internet generation has developed an extreme specialization of the term mung, a specialization which takes it in a new and disgusting direction. To these youngsters, to mung is to consume the bodily fluids of a corpse, preferably that of an old woman, and typically by direct mouth-on-orifice contact while a buddy jumps on the corpse's stomach. In the interest of good taste, and because it suggests a physically unlikely universe in which time can loop, I will *not* propose the natural recursive extension of the innovation, i.e.

Mung The munging of the corpse of a person that died as a result of mung-induced illness, etcetera, ad infinitum.

I must, however, make sense of the above non-proposal by noting that in the disgusting (but non-recursive) sense there is a corresponding noun mung referring to the piquant smoothie prepared by munging, presumably a mass term. It is the analogous (recursive) noun that you find in the compound mung-induced illness. I never came across the original recursive acronym mung (Mung Until No Good, a development of Mash Until No Good) being used as a nominal, perhaps because the product of munging was typically so much more abstract, e.g. a computer program, and correspondingly less, uh, visceral.

Posted by David Beaver at 12:35 PM


Andrew Sullivan has invented a new genre, and a new word for it: podfisking.

Or rather, I think, he re-invented both the genre and the word. A Google search for {podfisking} this morning returned no documents, but a search for {podfisk} turned up the basis for a prior claim by Jeff Jarvis.

On 3/7/2005, Jarvis posted "More podcast play" at BuzzMachine:

I've played with another podcast response: This brief (five-minute) 'cast takes quotes from Democratic FCC Commissioner Jonathan Adelstein's interview with Brooke Gladstone on On the Media . It was a good interview but I couldn't resist adding my own answers to Adelstein's answers. I'm not so sure this one works; hear what you think.

I wish this is what the opposition party would do to, say, the State of the Union speech. Rather than that cardboard response the Democrats gave to the last SOTU, how much better it would be if they gave Bush an audio fisking: Respond to his stands, point-by-point, back-and-forth in a podfisk!

Whoever invented it, I think it's a terrific idea. However, it's a bad sign that there's apparently been no follow-up over the 15 months since Jarvis made his suggestion.

And given the total lack of interest that greeted my nomination of "fisking" for ADS Word of the Year in 2004, you might be inclined to discount my enthusiasms for new forms of virtual debate, whether in text or in audio form.

In any case, as the Wikipedia entry for fisking explains, Sullivan provided the first example of the fisking process, back in December of 2001, and either invented the term or co-invented it. So perhaps his intervention will be successful this time again: {fisking} now has 858,000 Google hits (even more than {"Language Log"} :-).

Posted by Mark Liberman at 10:00 AM


Doug Wyatt at Sonosphere reports the bumper sticker I � UNICODE. He observes that depending on your OS, browser, font inventory and miscellaneous settings, your computer may be recreating the joke by an alternative route.

Anyone who tried to use Unicode a decade ago might have been tempted to decorate their bumper in a different emotional direction: F��K UNICODE. This is no longer a well motivated point of view, occasional annoying blips aside. But the idea does suggest a connection between the practice of displaying glyphs such as � for character codes that are not available in the current font, and the practice of substituting asterisks (or hyphens or underscores) as typographical bleeps for certain letters, in order to avoid violating lexical taboos. (Does anyone know who invented typographical bleeping, and when? Perhaps it was inspired by the practice of disguising names in forms like S____, which seems to be older.)
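For readers who like to see the parallel in running form, here is a minimal Python sketch of the two practices side by side. It rests on some assumptions of mine: the function names decode_lossy and bleep are made up for illustration, and the decoding half shows only one common route to the � glyph (bytes that cannot be decoded), not the missing-font route, which happens in the rendering software and cannot be shown in a few lines.

```python
# Illustrative sketch only; decode_lossy and bleep are hypothetical names.

REPLACEMENT_CHARACTER = "\ufffd"  # U+FFFD, usually drawn as a question mark in a diamond


def decode_lossy(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for anything undecodable."""
    return raw.decode("utf-8", errors="replace")


def bleep(word: str, mark: str = "*") -> str:
    """Typographical bleeping: keep the first and last letters, mask the rest."""
    if len(word) <= 2:
        return word  # nothing in the middle to mask
    return word[0] + mark * (len(word) - 2) + word[-1]


if __name__ == "__main__":
    # A Latin-1 encoded e-acute is not valid UTF-8, so it is replaced.
    latin1_bytes = "caf\u00e9".encode("latin-1")
    print(decode_lossy(latin1_bytes))
    print(bleep("DUCK"))
    print(bleep("UNICODE", mark="_"))
```

The two functions do the same job from opposite motives: one masks a character because the software cannot produce it, the other because the writer will not.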

As the conventional lexical taboos lose their force, the practice of typographical bleeping can be used ironically, as documented in email by Paul Kay:

Arnold Zwicky cites with approval the admonition of the Guardian Style Guide (via Chris Waigl) regarding the use of asterisks to evoke taboo words without actually printing them: "Finally, never use asterisks, which are just a copout." Fundamentally sound, I think, but like most useful admonitions this one admits of the occasional exception. Recently spotted in Berkeley, where I live, a home-made bumper sticker:


I'm sure this was inspired by David Lodge's 1975 novel, Changing Places, in which the governor of California was named Ronald Duck (remember who was governor of California from 1966 to 1974?) and Berkeley student protestors displayed signs saying "FUCK D**K".

Actually, the passage in Lodge's novel refers not to bumper stickers or to protest signs but to lapel buttons. For those of you who haven't read it, the book starts like this:

High, high above the North Pole, on the first day of 1969, two professors of English Literature approached each other at a combined velocity of 1200 miles per hour. [...] Although they had never met, the two men were known to each other by name. They were, in fact, in process of exchanging posts for the next six months ...

Philip Swallow is on his way from the University of Rummidge to Euphoric State for his half of the exchange, and on pp. 48-49 of the 1979 Penguin edition he's still up in the air:

Meanwhile, Philip Swallow is wondering more desperately than ever when this flight is going to end. Charles Boon has been talking at him for hours, it seems, permitting few interruptions. All about the political situation in Euphoria in general and on the Euphoric State campus in particular. The factions, the issues, the confrontations: Governor Duck, Chancellor Binde, Mayor Holmes, Sheriff O'Keene; the Third World, the Hippies, the Black Panthers, the Faculty Liberals; pot, Black Studies, sexual freedom, ecology, free speech, police violence, ghettoes, fair housing, school busing, Viet Nam; strikes, arson, marches, sit-ins, teach-ins, love-ins, happenings. Philip has long since given up trying to follow the details of Boon's argument, but the general drift seems to be concisely summed up by his lapel buttons:

HAPPINESS IS (just is)

In spite of himself, Philip is amused by some of the slogans. Obviously it is a new literary medium, the lapel button, something between the classical epigram and the imagist lyric.

Back in 2006, Arnold Zwicky pointed out in response to Paul Kay's note that the "fuck D*CK" usage preserves the concept that some words are viewed as too obscene to display in public, and forwarded an email from Nate Dorward, reading in part:

I remember that when the play Shopping and Fucking was in the news a lot, prompting a lot of euphemisms & bleeping in reviews &c, one quick-witted writer at a newspaper referred to it as "Sh---ing and Fucking". Wish I had a cite for that for you.....

Paul Kay commented:

Since Shopping and Fucking was first produced in 1996, the coiner of 'Sh---ing' may also have had access to Lodge's 1975 novel.

And so do you. If you haven't read it, or haven't read it recently, by all means do so.

[Update -- Ben Zimmer points out:

You can also get "I � Unicode" on a T-shirt...

Chris Waigl was asking for one back in November...

It's embarrassing that I didn't notice/remember that! Chris was way ahead of me on this one, as she often is.]

Posted by Mark Liberman at 07:11 AM

A sudden loss of innocence

At the end of the academic year, students in Arroyo, the Stanford undergrad dorm which for the last seven years I've helped look after as a ``Resident Fellow'', are eagerly looking for ways to procrastinate, and one such way is taking a purity test. It enables you to tally your sexual, drug or otherwise off-color adventures in a notches-on-a-stick, trophy-on-the-wall or kill-sticker-on-the-side-of-the-humvee sort of a way. Many students took this at the start of the year too, so retaking the test provides them with a way to gauge the progress they've made during the year. Progress here means loss of innocence.

As for language related issues, well, I learnt several interesting words taking the test, but what I wanted to tell you about was this wonderful example of recursion.... oh, you want to know what the dirty words were first?

Cocrophilia The purity tests define this as `marked interest in excrement; esp. the use of feces or filth for sexual excitement', and all the immediate google hits are for purity test related material. The OED has never heard of the word. Hey, has someone been making stuff up? Am I allowed to do that? [No: a reader points out that the OED does list coprophilia. It's from the Greek kopros, dung. So this is a spelling error that crept into an early purity test, now to be web-propagated for evermore.]

Frotteurism Well, I guess everyone else knew what frotteurism was? It means rubbing yourself on others without consent. At least the current psychiatric diagnostic criteria take it to be non-consensual, and it's only illegal in most developed nations if non-consensual. I'm not sure if simply liking to rub yourself on someone legally and consensually has a name. Apparently the term frottage is now outdated as far as psychiatrists are concerned, though the OED still takes frottage to be the primary term. Perhaps because frottage can mean also to take a rubbing, e.g. of wood, for an artwork. This is apparently socially acceptable even without the wood's consent. Seems a slippery slope to me. Anyhow, back to the sexual sense, I particularly liked one of the OED's source quotes, from Ellis & Abarbanel's 1961 Encyclopedia of Sexual Behavior: `Frotteurs are only truly perverted when they avoid coitus and other sex acts.' So there you are: even if there was that one time in the frathouse and you were totally wasted and, well, yes, looking back, it was clearly a misdemeanour, and no, he'll never talk to you again, and what harm would there have been if that girlfriend of his hadn't told him, and well, it looked like she was passed out, so how could you have known what would happen, and it's pretty damn dubious in your opinion that she watched the whole thing without saying a word until two days after... at least you aren't a pervert.

Klismaphilia Enemas for fun. Like cocrophilia, it isn't in the OED, but it seems to have a fairly established currency on the web.

Mysophilia No, don't look to the OED, but again the web is a rich source:  dependency on soiled and filthy stuff, e.g. sweaty underwear, and used menstrual pads.

Scoptophilia Here the OED does know the word, which it gives as `Sexual stimulation or satisfaction derived principally from looking; voyeurism.' The purity test's definition varies slightly but significantly: a dependency on looking at sexual organs and watching sexual activity openly, not surreptitiously, as in voyeurism.

Urophilia In context, you'd certainly have guessed what urophilia is (urine fetish), but the OED hasn't heard of that one either, although it does know the word urolagnia, which apparently has exactly the intended meaning.

Umm, why did I begin this post? Oh yes, it was question 497. So the thing is, the more Yes-answers you give on the purity test, the lower your score, but question 497 runs as follows:

497. Have you ever done something for the sole purpose of lowering your Purity Test score?
      Yes      No

Please, just for me, take the test, answer yes, and thus make it so! This way Language Log will soon be responsible for the largest drop in human purity since the discovery of klismaphilia. But, you ask, is there a nice, long, partially Greek word for someone who takes an almost sexual pleasure in such a delicious recursion as question 497 invites? Recursarecursarecursa...philia? Mungagnia? Naaah, the word is just geek, I'm afraid. But don't worry if you fit the bill. Geeks are only truly perverted when they avoid coitus and other sex acts.

[Update: see here, though, fittingly, it may be where you came from.]

Posted by David Beaver at 01:39 AM

June 06, 2006

Springtime for snowclones

Sarah Lyall recently wrote in the NYT ("It's Springtime for Soccer, and for Rowdy England Fans", June 2, 2006) about the latest World Cup worry for the British and German authorities. It's not that British fans will "rampage through the streets, destroying things and attacking people", but that they will make fun of the Nazis.

"It's not a joke," Charles Clarke, then the home secretary, warned at a pre-World Cup briefing earlier this spring. "It is not a comic thing to do. It is totally insulting and wrong."

That means, basically, no getting drunk and goose-stepping in a would-be humorous manner. No Nazi salutes. No shouting "Sieg Heil!" at the referees. No impromptu finger-under-the-nose Hitler mustaches.

"Doing mock Nazi salutes or fake impersonations of Hitler — that's actually against the law in Germany," Andrin Cooper, a spokesman for the Football Association, which administers English soccer, said in an interview.

The unnecessary redundancy police have doubtless been firing stern emails off to the Football Association, asking whether real impersonations of Hitler are legal or not, but we don't care about such things here at Language Log. Instead, we're enjoying the article's headline, which contains a classic snowclone: Springtime for Soccer.

This of course is a reference to "Springtime for Hitler", a musical-play-within-a-movie from Mel Brooks' 1968 The Producers (redone in 2005, and now on Broadway as a musical). According to the Wikipedia article:

Springtime for Hitler: A Gay Romp With Adolf and Eva at Berchtesgaden is a musical about Adolf Hitler written by fictional Nazi Franz Liebkind.

The play is chosen by producers Max Bialystock and Leo Bloom in their fraud scheme to raise substantial funding by selling several multiples of a 100% stake, fail the play, and keep all the remaining money for themselves. In order to ensure the play is a total failure, Max picks the worst director he can find, Roger DeBris, a stereotypical homosexual/transvestite caricature, and gives the part of Hitler to an uncontrollable hippie named Lorenzo St. DuBois who calls himself LSD.

A quick web search {"Springtime for"} turns up not only soccer, but also quite a few other substitutions that allude more or less directly to Brooks' fictional title: killing in Afghanistan / Ahmadinejad / realism / Blair / Trotsky / Dictators / Summers / slanderers / Mahmoud and the 12th Imam / Jacko / The BNP / venture capital.

Speaking for myself, I think it's wonderful that Hitler has ended up 60 years later as a figure of fun; but it's more problematic that the British find it so difficult to refrain from making Hitler jokes around the Germans. I noticed this cultural tic in Stewart Lee's odd meditation on the alleged role of the German language in alleged oddities of German humor, where he complained

There is less room for doubt in German because of the language's infinitely extendable compound words. In English we surround a noun with adjectives to try to clarify it. In German, they merely bolt more words on to an existing word. Thus a federal constitutional court, which in English exists as three weak fragments, becomes Bundesverfassungsgericht, a vast impregnable structure that is difficult to penetrate linguistically, like that Nazi castle in Where Eagles Dare.

Would an American comedian have thought that joke was worth trying to make? I don't think so.

[Or maybe I'm wrong. Eric Bakovic points out:

Don't forget Dr. Strangelove's Nazi-salute tic, which is just as funny to Americans as I imagine it is to the British, even today. (And FWIW, I've observed Americans making Nazi jokes around Germans, and convict jokes around Australians, and other such things. It often has a faint whiff of that check-out-what-I-know-about-world-history to it, which I imagine is more absent or less obvious with Brits, but of course I may be wrong about this.)

And surely you also read this other article in the Times? "Surge in Racist Mood Raises Concerns on Eve of World Cup".

It starts out:

As he left the soccer field after a club match in the eastern German city of Halle on March 25, the Nigerian forward Adebowale Ogungbure was spit upon, jeered with racial remarks and mocked with monkey noises. In rebuke, he placed two fingers under his nose to simulate a Hitler mustache and thrust his arm in a Nazi salute.]


Posted by Mark Liberman at 03:41 PM

June 05, 2006

Words that can't be printed in the NYT

The folks at the New York Times continue to come up with inventive ways to avoid printing seriously taboo words in their pages.  Here's John Hodgman, in the "Comics Chronicle" of the Book Review, 6/4/06:

For all the admirable effort to allow comics to tell different types of stories, there is also a creeping sameness to many of these comics: black-and-white, semi- or wholly autobiographical sketches of drifting daily life and its quiet epiphanies. [Cartoonist Gabrielle] Bell herself evinces the form sweetly in her contribution to Mome -- only two of the three words in its title, "Happy" and "Birthday," can be printed here -- in which she contrasts two bizarre birthday parties and ends on a beguiling and luminous panel of the author dancing as if possessed by spirits.

In case Hodgman's wink-wink nudge-nudge allusion went past you, Bell's piece is "Happy Fuckin' Birthday", from Mome #2 (Fall 2005).

In recent weeks, we here at Language Log Plaza have returned to talking about taboo vocabulary and taboo avoidance, in Mark Liberman's "Delete expletives" of 3/29, and Ben Zimmer's "Twonk!" of 3/30 and "Thinking specifically about the F-word..." of 4/2, reviving the topic from our last go-round, in August through October of last year.  The Times has been a reliable source of fun for us, beginning with its struggles with the book titles On Bullshit and Bullshit Nights in Suck City and continuing with a bizarre piece by Michael Brick (mocked by me in an August posting) in which he announced that the paper prefers to avoid "winking" allusions to taboo vocabulary -- "f***", "f-word", or even "f-bomb" for "fuck" -- and recommended "word-bomb" as a substitute in this case, a proposal that, blessedly, seems not to have caught on.

Eventually, we came to two September postings describing some families of avoidance strategies: avoidance characters (global things like "#$*!", sometimes pronounced, or even reported in writing, as "bleep", and local avoidance characters, like asterisks, elliptical periods, hyphens, and underscores, replacing letters, as in "all f***ed up") and the effing avoidance strategy, of saying (or writing) things like "F-ing" or "effing", plus (noted in the same posting) the full ellipsis strategy, as in "Shopping and...", which concludes the title, in speech as well as writing, with the word "and".

In October I noted that the Guardian seemed to have no qualms about printing "fuck", at least in quotations where the word is memorably deployed.  Posting on the topic then languished.

But, backstage, much was going on.  Chris Waigl observed in her blog that the Guardian Style Guide is actually explicit on the matter:

do not describe this as "a good, honest old-fashioned Anglo-Saxon word" because, first, there is no such thing as an Anglo-Saxon word (they spoke Old English) and, more important, it did not appear until the late 13th century
see swearwords

We are more liberal than any other newspaper, using words such as cunt and fuck that most of our competitors would not use.
The editor's guidelines are straightforward:
First, remember the reader, and respect demands that we should not casually use words that are likely to offend.
Second, use such words only when absolutely necessary to the facts of a piece, or to portray a character in an article; there is almost never a case in which we need to use a swearword outside direct quotes.
Third, the stronger the swearword, the harder we ought to think about using it.
Finally, never use asterisks, which are just a copout.

Good on them.

Meanwhile, a discussion was under way on the American Dialect Society mailing list (in a thread titled "Obscenity on British TV"), instigated by a BBC show called "40 Years of F***", celebrating the 40th anniversary of Kenneth Tynan's saying "fuck" on the telly, in this revolting piece of trash talk: "I doubt if there are any rational people to whom the word 'fuck' would be particularly diabolical, revolting or totally forbidden." (Larry Horn pointed out, the way a good semanticist should, that Tynan didn't actually USE the word, he just MENTIONED it.)

Towards the end of this discussion (on 9/24), Chris Waigl noted still another avoidance strategy, the coyly euphemistic paraphrase:

During the first program of the BBC Radio 4 Word4Word mini-series, the presenter, Dermot Murnaghan ran into a bit of a problem when he had to introduce Mark Ravenhill, the author of the play "Shopping and Fucking".  No, he didn't pronounce "fucking" on the air, and I think he could have avoided talking about "retail therapy and horizontal refreshment".

In November, Chris blogged on a skit entitled "[The usage/history of] the word 'fuck'", which has sometimes been (incorrectly) attributed to Monty Python.  The story's still available on her site.

By February, she reported to me in e-mail that she'd come across an occurrence of "fuck" in the Guardian that wasn't in something reported with quotation marks around it, in a review of Ariel Levy's Female Chauvinist Pigs:

As Erica Jong, erstwhile celebrator of the zipless fuck, tells Levy: "Sexual freedom can be a smokescreen for how far we haven't come."

This is, of course, one of those quotes that don't need quotation marks any more, because they've become common currency.

All was not Chris Waigl, however.  Regular correspondent John Cowan turned up in March to note that science fiction writer Larry Niven's Known Space series has a number of vocabulary innovations, including "a set of handy new swearwords" -- some invented ("tanj", complete with a dubious acronymic etymology, from "there ain't no justice"), others shifted from their uses in English as We Know It ("censor", with the full force of "fuck").  I replied, in my quaint lower-case way:

sci fi is a rich source of innovations in swearing: so many stories, movies, and tv shows involve some sort of space rangers, characters who would be expected to swear like, well, troopers, but you can't get away with the appropriate taboo words of actual english, so you have to invent.  niven takes two different routes here -- total invention and "dirtying" existing non-taboo words.  inventions that suggest existing words are popular too, as in Farscape's "frell" (suggesting "frig"/"frick"/"fuck" and "hell") and Red Dwarf's "smeg" ("smegma").

And Mark Liberman noticed, here on the Language Log, that the Economist was following in the footsteps of the Guardian, using "fuck" in quotes.  That was February, and here they go again in June (review of Peter Carey's Theft: A Love Story, 6/3/06):

... he describes colour with zest and joy, "greens he was into like a snouty pig--huge, luscious jars, greens so fucking dark, satanic, black holes that could suck your heart out of your chest".

Here at Language Log Plaza we stand with the Guardian and the Economist (scarcely sleazy rags), and we occasionally point our fingers at the Times and snicker.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:57 PM

Not doubting that the door could not be opened wider

In response to my post quoting from Stephen Levinson's letter to Language, Steve of Language Hat emailed:

Thanks very much for putting up that exchange of letters for all to read; it was gutsy of Levinson to send such a letter, and I'm glad it was received so well. (I don't suppose there's any chance e-Language will be readable by all comers...)

The "request for proposals" on the LSA's web site doesn't address this issue one way or the other. I'd certainly argue in the strongest possible terms for Open Access.

The fun part of Hat's note comes next:

But what caught my attention was the following sentence, which (if it was not distorted in the transition to HTML) provides a piquant illustration of the problem of accumulated negatives that the Log has discussed in the past:

"But having been shown by colleagues quite a few excellent papers that have been rejected by Language, I have not the slightest doubt that the door could not be opened a lot wider while still maintaining the very highest quality."

That's exactly how it reads in the journal, and I agree, this seems like a case of overnegation to me. As far as I can tell, the sentence says the opposite of what Levinson meant to convey. (Of course, the problem may have been introduced in editing or typesetting...)

Posted by Mark Liberman at 05:24 PM

The Old Navy style sheet

Among the "message tees" that Old Navy offers for newborns is one announcing:

I DONT LIKE POTATOS SPINACH AND <drawing of a carrot>

(This is a white cotton tee, with the lettering in a color Old Navy labels "guacamole", though the carrot is, fortunately, orange.)

The accompanying ad copy moves towards standard spelling and punctuation, though the Old Navy style sheet continues to be resolutely comma-averse:

Sweet envelope-neck tees come with playful messages and appliques. Choose from "Cranky But Precious," "Shhhh Baby Sleeping," I Don't Like Potatoes Spinach and Carrots," I Want Lots of Hugs Kisses and a Teddy Bear," or "I Love Candies Chocolate and Cupcakes."

(Interestingly, though clothing for newborns is highly gendered, via choices of color, styling, and topics of drawings and slogans -- you really can't shop for clothing for a generic newborn, only for a girl or for a boy -- in this case the message tees are offered for girls and boys.)

Where to start?  Well, the subtlest point is the use of and rather than or under the scope of a negative.  There's nothing wrong with that (in this sentence); this variant conveys something like 'Here's a list of things I don't like: potatoes, spinach, and carrots' (where or would be odd), with its suggestion that the list is complete.

Then there's the spelling "potatos".  In the world in general, "potatos" is astonishingly common (I just got 913,000 raw Google webhits on it -- vs. 45,200,000 for the correct "potatoes", but still, nearly a million ain't small potatoes), often right in the neighborhood of a correct spelling, as here:

Idaho potatos from Sun Valley Potatoes, supplying high quality Idaho potatos all the way from their field to your table.

Spelling the plurals of nouns ending in o is a well-known minefield, with all sorts of nasty details to trap the unwary speller.

On the Windsor Peak child-care site, where this tee-shirt has recently been under discussion (my thanks to Elizabeth Daingerfield Zwicky, for pointing me to the photo on this site and then to the Old Navy original), the general opinion seems to be that "potatos" and the missing apostrophe in "dont" were intentional, a move on the part of the designer to present the wearer of the shirt as babyishly cute, making learner's errors -- though someone who goes around in clothes size 5T or smaller isn't going to be spelling ANYTHING, or for that matter drawing a recognizable carrot.  I'm not so sure;  "potatos" might just be an error, as with those "Idaho potatos", and the missing apostrophe might be a style sheet thing.

Tee-shirts serve as signs of a kind, like billboards or highway signs.  For some time now, highway signs have been moving towards a clean, modern, punctuation-free (and diacritic-free) style.  Colons and commas mostly disappeared long ago.  That's surely the source of the comma-free style at Old Navy.

Apostrophes in possessives were the next to go.  And now the apostrophe of negated inflection (a.k.a. negative contractions) may be threatened.  My impression is that "don't" and the like are not at all common on road signs -- instead, you get uncontracted DO NOT or other kinds of negatives, as in NO TURNS -- but now I'll be on the lookout for them.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 03:55 PM

Psychology ≅ 10-100 x Linguistics?

Why is the field of psychology (in the United States) roughly 10 to 100 times bigger than the field of linguistics, depending on how you quantify things? Is this a logical consequence of the two fields' relative amounts of intellectual interest and social importance? Or is it largely a historical accident? Back in January of 2005, I gave a talk at Stanford ("A Series of Unfortunate Events: The Past 150 Years of Linguistics", Stanford University, 1/28/2005; blog post, pdf of slides) in which I argued for the "historical accident" theory.

I was reminded of this issue by an exchange of letters in the March 2006 issue of Language, the journal of the Linguistic Society of America.

Stephen Levinson's letter ("Language in the 21st Century", Language, vol. 82, no. 1, 2006) and Brian Joseph's reply ("Language in the 21st century: An assessment and a reply") are both well worth reading in their entirety, and so I've put up an unauthorized .html copy for those of you who are not LSA members. But what caught my attention was a particular numerical comparison in Dr. Levinson's letter.

Current trends are for journals to reflect the vibrancy of their fields, the increasing bodies of accessible data, the growing diversity of professional associations, and the rapidity of scientific developments. Compare, for example, the American Psychology Association [sic] (with 49 journals), or the American Anthropological Association (with 24 journals), publishing thousands of articles a year with online supporting data, and serving their (admittedly larger) memberships with highly ranked outlets for a large portion of their work. In contrast, Language publishes only about 20 articles a year, restricts concurrent multiple submissions by the same authors, has no online supporting data, and spends many of its precious pages on book notices and reviews. [emphasis added]

In my Stanford talk, I offered several alternative methods of quantification, which gave ratios between 1:14 and 1:38. Google hits for {"linguistics department"} were then 60,900, compared to 1,010,000 for {"psychology department"}, yielding a ratio of 1:14. (The numbers are now 307,000 and 5,470,000, for a ratio of 1:18.) Based on (unverified) figures from a publisher's rep, I estimated about 50,000 enrollments per year in undergraduate introductory linguistics courses in the U.S., compared to about 1,500,000 enrollments in undergraduate psychology courses, for a ratio of 1:30. I estimated the Linguistic Society of America's membership at 4,000, and the American Psychological Association membership at 150,000, for a ratio of 1:38.

Dr. Levinson offers a ratio of LSA journals vs. APA journals at 1:49, and a ratio of articles published annually in those journals of 20 to "thousands", which is presumably < 20:2000 = 1:100.

These methods of quantification are all highly imperfect, but I think we can conclude that the collection of disciplines calling themselves "psychology" is one to two orders of magnitude larger than the collection of disciplines calling themselves "linguistics".
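For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes each ratio from the figures quoted above. The labels are invented for illustration, and 2,000 stands in for "thousands" as an assumed lower bound. With the hit counts as given here, the first pair works out closer to 1:17 than to 1:14; every ratio still falls between 10 and 100, so the conclusion is unaffected.

```python
# (linguistics, psychology) counts as quoted in the post; labels are illustrative.
FIGURES = {
    "Google hits, then": (60_900, 1_010_000),
    "Google hits, now": (307_000, 5_470_000),
    "intro course enrollments": (50_000, 1_500_000),
    "society membership": (4_000, 150_000),
    "journals": (1, 49),
    "articles per year": (20, 2_000),  # 2,000 is an assumed lower bound for "thousands"
}


def psych_per_ling(linguistics, psychology):
    """How many units of psychology there are for each unit of linguistics."""
    return psychology / linguistics


RATIOS = {label: psych_per_ling(*pair) for label, pair in FIGURES.items()}

if __name__ == "__main__":
    for label, ratio in RATIOS.items():
        print(f"{label}: 1:{ratio:.1f}")
```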

Of course, one critical difference is that psychology developed as a "big tent" within which many different and often fractious subdisciplines co-exist uneasily: clinical psychology, social psychology, experimental psychology, physiological psychology, and so on. To achieve a comparable level of integration, you'd need to combine the LSA (with its 4,000 members), the ASHA (with its 123,000 members), the speech communication section of the ASA, the ACL, much of the MLA and even the ATA -- and for that matter, some fraction of the APA and AAA, since most psycholinguists are in psychology departments and most anthropological linguists are in anthropology departments. (There are also relevant fragments of several other large disciplines.)

Such disciplinary integration is by no means an unmixed blessing -- tensions within the APA led to the creation of the 15,000-member Association for Psychological Science (originally the American Psychological Society) in 1988. This history of the APA offers an interesting perspective on its alternation between inclusion and exclusion over earlier decades, and describes the 1988 schism this way:

Dissatisfaction from academic and experimental psychologists over what appeared to be the takeover of APA by the applied fields and public advocacy motivated the creation of the Assembly of Scientific and Applied Psychologists. However, the reorganization proposal that had taken years to prepare failed to receive a majority vote of the membership in 1988. These stresses and others within APA led to the establishment of the American Psychological Society as an alternative organization to APA.

There are several smaller overlapping groups as well, including the Cognitive Science Society, but despite these centrifugal tendencies, the fact remains that during the 20th century, psychology as a discipline evolved and maintained a remarkable degree of integration. The history of anthropology in America is similar, but the contrast with the discipline of linguistics is a stark one. Although the systematic -- and even scientific -- study of language began much earlier, linguistics as a discipline crystallized much later, and grew more slowly, and adopted a more exclusionary stance from the beginning.

In my opinion, this is not simply a terminological issue. While it's true that many speech and language professionals call themselves something other than "linguists", that needn't prevent them from being as well trained and analytically sharp as anyone could wish. Nor is this mainly an issue of how the academic pie is divided, though I think that question is worth discussing. Rather, I'm concerned about the effect on the general education of the population at large. Many if not most American professionals -- doctors, lawyers, teachers and so on -- have taken at least one course in psychology. I don't know how much they learn and retain, but at least the discipline of psychology usually gets a shot at teaching them something. In contrast, as this blog has often had reason to note, most Americans -- intellectuals and ordinary citizens alike -- are never taught how to analyze or even describe the sound of a word, the structure of a sentence, the logic of an argument, or the flow of a discourse. When they need to address personal, professional or public-policy issues that deal with language, they're on their own, and it shows.

And increasingly, the same can be said of professionals whose job descriptions involve practical linguistic analysis as a core activity. The fraction of language teachers, reading teachers, writing teachers and the like who ever acquire any significant skills in elementary linguistic analysis is lower than it has been at any time in the history of our civilization. While there are many reasons for this, the disciplinary weakness of the field of linguistics is surely one of them.

Posted by Mark Liberman at 01:41 PM

June 04, 2006

Thou shalt not miss Jan Freeman

A splendid column by Jan Freeman in the Boston Sunday Globe today, about whether Newsweek could muster any excuse at all for their silly headline "Thou Shalt Not Like It" on their negative review of The Da Vinci Code. (Thou shalt not is of course a prohibition. They wanted to express a prediction about your reaction, not forbid a positive one. Surely they knew the difference.) Read her lively discussion of this and a range of similar pseudo-Olde Englishe nonsense examples here.

Posted by Geoffrey K. Pullum at 02:45 PM

Googling without spelling

I'll add one to this morning's character-encoding theme. Yesterday, Patrick Hall at Blogamundo posted an interesting reaction to our recent Name That Tune's Language series, "How do you Google something you can't spell?". He observes that a web search for the song's lyrics based on the correct Romanian orthography fails, while one based on an English-inspired asciified re-spelling succeeds. He points out that similar problems are common in searching for Hindi on the web, and cites a paper that suggests an approach based on n-gram matches.

There's a cluster of loosely related problems here: dealing with variant (and wrong) spellings; dealing with "eye dialect"; dealing with alternative encodings (e.g. GB vs. Big 5 vs. Unicode for Chinese characters, or the multiple proprietary encodings of Hindi newspapers); dealing with languages whose orthographic system is not standardized (e.g. Somali or Elizabethan English); dealing with transliterations that are semi-systematic and/or in multiple systems (e.g. Arabic into English or English into Chinese); and so forth.

In addition to the ideas that Patrick references, let me cite the following without further comment (for now): Soundex, agrep and (especially) TRE; and Google Scholar queries like {fuzzy matching transliteration} and {levenshtein transliteration}.
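To make the Levenshtein idea a little more concrete, here is a minimal Python sketch of edit distance, used to pick the closest entry in a word list for a query typed without the right diacritics. The function names and the tiny vocabulary are hypothetical, chosen only for illustration; this is the textbook dynamic-programming version, not what agrep or TRE actually do internally, and a real search engine would need something far more scalable.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def closest(query, vocabulary):
    """Return the vocabulary entry with the smallest edit distance to the query."""
    return min(vocabulary, key=lambda word: levenshtein(query, word))


if __name__ == "__main__":
    # Hypothetical mini-vocabulary; "komoora" is an asciified respelling of "komóra".
    print(closest("komoora", ["komóra", "kamera", "kometa"]))
```

Plain edit distance treats every substitution as equally costly; a transliteration-aware matcher would presumably want to make "ó" for "o" cheaper than, say, "r" for "t".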

Posted by Mark Liberman at 10:22 AM

ASCII, Mac OS X, and the 128 names of DormAid

A former Harvard dean I met recently at a party told the story of how, a couple of years ago, Harvard agreed to let a private company (started by a Harvard student) under the name DorMaid contract to provide chambermaids to come in and clean and tidy dorm rooms for those students well-heeled enough to afford the fees. There was some discussion in March 2005 of whether this was politically correct, given the class implications (see the critical editorial in the Harvard Crimson, and the contemptuous dissenting view on this blog by a Harvard alum). Harvard's negotiated terms before the company was permitted to operate, I was told, included a change to the company's name: from DorMaid to DormAid. "Maid" carried connotations of low-status unmarried females doing menial work for wealthy sons of the elite. "Aid" did not. (The Harvard Crimson didn't get the new name right: their editorial spelled it Dormaid, which does not match the company web site and, as I understand it, was never the correct name.) As I listened to this story I realized it reminded me of something rather ghastly about the Macintosh OS X operating system. Let me explain it to you. And if you use Mac OS X, you should listen, because not knowing this could be hazardous to your health.

As you can easily verify, in OS X the distinction between the name Harvard didn't like and the name that it did is, to say the least, subtle. Try a few experiments (with great caution). Start the Terminal program, and create a file called DorMaid. Don't put anything valuable in it. (The command touch DorMaid will create an empty file for you to experiment with.) Now try removing the nonexistent file DormAid (you don't have one, but imagine that you thought you did and you wanted to get rid of it). (The command to do this is rm DormAid.) If you know Unix, you should expect to see an error message referring to a nonexistent file. You'd expect the screen to look like this:

% rm DormAid
rm: DormAid: No such file or directory

But you won't see that error message. You'll see nothing but the command you typed and the new prompt. That's the expected behavior when a Unix command worked perfectly. It's what you'd expect if rm really had found a file of that name to remove.

Now use the ls -l DorMaid command to list the details of the file you created, DorMaid, which you did not attempt to remove. You'll find it isn't there:

% ls -l DorMaid
ls: DorMaid: No such file or directory

The file you created has gone AWOL. Where did it go?

Here's what's going on. OS X appears to be working with no distinction between A-Z and a-z, but pretending to recognize the distinctions. It registers your file names with the capitalization you gave them, but then treats them without regard to the case distinction.
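If you would rather not risk real files to find out how a given disk behaves, a small probe can check for you. The following is a minimal Python sketch, not an official test: the function name is made up for illustration, and it assumes you have write permission in the directory being probed. It creates a throwaway file with a mixed-case name and asks whether the same name with the case of every letter flipped also appears to exist.

```python
import os
import tempfile


def is_case_insensitive(directory="."):
    """Return True if names differing only in case refer to the same file here."""
    # The scratch file gets a mixed-case prefix, so its name is sure to
    # contain letters whose case can be flipped. It is deleted on exit.
    with tempfile.NamedTemporaryFile(prefix="DorMaid", dir=directory) as scratch:
        head, name = os.path.split(scratch.name)
        swapped = os.path.join(head, name.swapcase())
        # On a case-insensitive volume the flipped name resolves to the same file.
        return os.path.exists(swapped)


if __name__ == "__main__":
    print(is_case_insensitive(tempfile.gettempdir()))
```

The answer depends on the volume that holds the directory, so it can differ from one mounted disk to another on the same machine.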

So if you create a file called DorMaid, OS X will carefully repeat it back for you as if it were really called that (the ls program, for example, will show it as DorMaid). But it isn't really called that, and it is very dangerous to think otherwise. The truth is that all of the following are the same filename under OS X:

DORMAID DORMAId DORMAiD DORMAid DORMaID DORMaId DORMaiD DORMaid DORmAID DORmAId DORmAiD DORmAid DORmaID DORmaId DORmaiD DORmaid DOrMAID DOrMAId DOrMAiD DOrMAid DOrMaID DOrMaId DOrMaiD DOrMaid DOrmAID DOrmAId DOrmAiD DOrmAid DOrmaID DOrmaId DOrmaiD DOrmaid DoRMAID DoRMAId DoRMAiD DoRMAid DoRMaID DoRMaId DoRMaiD DoRMaid DoRmAID DoRmAId DoRmAiD DoRmAid DoRmaID DoRmaId DoRmaiD DoRmaid DorMAID DorMAId DorMAiD DorMAid DorMaID DorMaId DorMaiD DorMaid DormAID DormAId DormAiD DormAid DormaID DormaId DormaiD Dormaid dORMAID dORMAId dORMAiD dORMAid dORMaID dORMaId dORMaiD dORMaid dORmAID dORmAId dORmAiD dORmAid dORmaID dORmaId dORmaiD dORmaid dOrMAID dOrMAId dOrMAiD dOrMAid dOrMaID dOrMaId dOrMaiD dOrMaid dOrmAID dOrmAId dOrmAiD dOrmAid dOrmaID dOrmaId dOrmaiD dOrmaid doRMAID doRMAId doRMAiD doRMAid doRMaID doRMaId doRMaiD doRMaid doRmAID doRmAId doRmAiD doRmAid doRmaID doRmaId doRmaiD doRmaid dorMAID dorMAId dorMAiD dorMAid dorMaID dorMaId dorMaiD dorMaid dormAID dormAId dormAiD dormAid dormaID dormaId dormaiD dormaid
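The count of 128 is just 2 to the 7th power: seven letters, each independently upper or lower case. A short Python sketch (the function name is invented for illustration) that generates such a list for any name:

```python
from itertools import product


def case_variants(name):
    """All spellings of name that differ only in the case of its letters."""
    # Each character contributes one or two choices: its upper- and
    # lower-case forms, which coincide for digits and punctuation.
    choices = [sorted({ch.upper(), ch.lower()}) for ch in name]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("DorMaid")

if __name__ == "__main__":
    print(len(variants))
    print(" ".join(variants))
```

A ten-letter name would have 1,024 such aliases, so the number of ways to collide grows quickly with the length of the filename.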

This is a whole slew of accidents waiting to happen. It is not just a silly triviality. I nearly lost some important source files because of this appalling misfeature. (I am now having some difficulty reconstructing the exact scenario; what I posted here earlier was not correct. It may have been that I typed ‘cp ch1.tex Ch1.tex’ (trying to create a new file with a capital initial) and failed to notice that there was an error message (copying a file to itself is not a legal operation), and then later decided to remove ch1.tex on the grounds that I didn't need it, and thus almost lost the original completely. All I recall is that I had two near disasters several weeks ago in which I was only saved by lucky accidents of backup copies with different names, and after the second I realized what OS X was doing to me.)

The mv command, which alters the names or locations of files, allows name changes like mv DorMaid DormAid without complaint. The ls command then shows the name change as having taken place. But the rm command then ignores the change. This is very dangerous behavior. I had wondered why Apple was naming successive releases of OS X after dangerous beasts (Jaguar, Panther, Tiger, and so on). But in fact it's rather appropriate. Be warned. It's a jungle out there.

[Update: David Pesetsky writes from MIT with some helpful technical info that modifies what I say above — he claims it's the file system, not the OS itself. I quote:

Actually, from 10.3 on, you can specify that you want your hard drive to have "case-sensitive HFS+" when you set it up or if you repartition it. (There is such an option in "Disk Utility".) It's apparently not a property of OS X per se, but of the HFS+ filesystem that is the default for OS X installations.

I think it all has to do with compatibility with pre-OS X programs for the Mac, since the previous operating system was case-insensitive for real. If you Google for "case-sensitive HFS" or "Case-sensitivity Macintosh" and the like, you'll also see various warnings about problems with some OS X programs if you have turned "case sensitivity" on. But the option of case-sensitivity exists, at least at the time your disk is first formatted -- and apparently OS X can deal with it.

I followed David's advice and looked at this site and this one. On the latter I saw a comment asking why anyone would want case-sensitivity. I think I've answered that!

Don Porges tells me you get the same behavior doing DOS commands on a Windows system. I'm not really surprised. This is behavior that, unusually for Mac OS X, really sucks. I would have expected Windows to faithfully replicate every feature of other operating systems that really sucks, and then add stupid features of its own, plus bugs. And apparently that's right. Perhaps I should add this note before I close: Mac OS X is beautiful. I use it all the time. It's the best corporate operating system I've ever encountered, and I do all my serious work on it. (I keep a Windows system purely to be able to work on files with coauthors who like WordPerfect. No other reason. WordPerfect is the best of the WYSIWYG editors, but isn't available in a Mac version.) Don't read me as saying Mac OS X is no good. It's wonderful. The only better systems are carefully selective and well-configured Linux systems (Debian Linux is my pick). And they are occasionally a little heavy on the system maintenance time that you have to put in. No, I'm grateful for what David Pesetsky has told me, because I love Mac OS X, and only hate the failure to be case-sensitive.]

Posted by Geoffrey K. Pullum at 07:10 AM

ASCII the diacritic assassin

[Guest post by Jarek Weckwerth. Jarek was one of those who answered Sven Godtvisken's "Name that tune ('s language)" query. This morning, he sent along an interesting follow-up, presented here after the jump.

There are a few terms of art in Jarek's post that may not be familiar to everyone, so here's a quick glossary: ASCII is the American Standard Code for Information Interchange, a 40-year-old scheme for assigning 7-bit codes to characters used in representing text. ASCII lacks codes for characters with accents and other diacritics, such as háčeks and ogoneks.  ISO Latin-1 is a slightly less obsolete character encoding, which uses 8 bits rather than seven, and includes many though not all of the characters used in "latin" orthographies, i.e. those based on the roman alphabet. Unicode is a modern, non-obsolete character encoding standard, increasingly (but still incompletely and often inadequately) supported by computer software that deals with text. UTF stands for Unicode Transformation Format; there are several different UTFs, of which the commonest (I believe) is UTF-8, designed by Ken Thompson and Rob Pike on a placemat in a New Jersey diner, one evening in 1992. Rob's description of the process will help you to understand why Unicode needs "transformation formats" and why there is more than one of them. ]

[Jarek's note follows.]

Just (?) three things more:

(1) Voltaj’s web page seems to have the official lyrics. “In chefuri si in fite.” So I think that should solve the Mondegreen problem :) (Notice the lack of diacritics on their page, which is otherwise quite nice in fact.)

(2) Stephen Smith’s comments re gashca etc.: In fact you can see the same in Poland. Those who cannot/do not use Polish diacritics e.g. in email will sometimes use English-like spellings instead, e.g. tesh for też /tɛʃ/ ‘too’. I would say this is done mainly for fun, and in an extremely irregular/idiosyncratic fashion. There are relatively few cases where you might see some real justification in avoiding “graphemic neutralisation” with another word. So, for example, the informal term for a mobile/cell phone, komóra /kɔˈmura/ may end up as komoora, since without the diacritic it’s in competition with komora /kɔˈmɔra/ ‘chamber’.

Another approach is to use “phonetic” spellings that would normally count as “spelling errors”, thus komura.

Interestingly, some owners of mobiles that do not offer Polish diacritics (still a majority) will use a slightly different technique when texting: they’ll use whatever diacritics are available, usually Western European characters with diacritics within ASCII Latin1, thus e.g. mojä for moją ‘my sg.fem.acc./loc.’ to preserve the contrast with moja ‘my sg.fem.nom.’.
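The mechanics behind both habits can be sketched in a few lines of Python. This is only an illustration, and the two function names are hypothetical. The first function drops diacritics by decomposing each letter and discarding the combining marks, which is roughly what happens when text is squeezed down to plain ASCII (letters with no canonical decomposition, such as ł, pass through unchanged). The second asks whether a string can be written in a given encoding at all.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks after canonical decomposition (ą -> a, ó -> o)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fits(text, encoding):
    """True if every character of text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Stripping merges the two forms that the spelling is meant to keep apart.
    print(strip_diacritics("moją"), strip_diacritics("komóra"))
    # The substitute letter ä exists in Latin-1; the real letter ą does not.
    print(fits("mojä", "latin-1"), fits("moją", "latin-1"), fits("moją", "utf-8"))
```

So the texting trick keeps the contrast alive with a character the phone can actually send, at the price of a spelling no dictionary would recognise.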

Of course, this is all redundant in a vast majority of cases; in context, Polish (like Romanian) is almost completely readable without the diacritics, and many people will happily write Polish without them. However, my impression is that these are mainly the “old ASCII guard” (like myself – I’ve switched to UTF especially for this message) rather than “internet-savvy modern” people, as Stephen says. Or perhaps those who realise what conversion problems there are. I never use the Polish characters in my phone because they invariably come out garbled at the other end unless the recipient happens to also use a relatively up-to-date SonyEricsson. Nokia or Samsung won’t do ;)

One interesting case is the demise of ę in some endings, as in piszę ‘I’m writing/I write’ vs. pisze ‘(s)he/it is writing/writes’. The ending lost nasality quite some time ago, with the two forms becoming homophonous for many verbs. Standard spelling still preserves the distinction but I think it won’t last long in the Internet age; people who otherwise do use diacritics omit the ogonek more and more often. ASCII reinforces a natural sound change.

Has there been any research into this?

(3) When re-reading my response to the name-that-song riddle (BTW, thanks for posting!), I noticed I had done some unexpected stereotyping: I seem to think that Slavic/Balkan rock is an identifiable musical style. Where did that come from?! My experience with Russian/Romanian/Serbian etc. rock/pop is rather limited (=close to nonexistent). Two options:

(i) There is indeed an identifiable musical “substratum” in at least some pop/rock produced in those countries; it’s rooted in folk music; this folk music has some shared characteristics; it’s part of the stereotypical images of those countries; and these stereotypical images are sufficiently familiar to me. Possible. My feeling was that e.g. Russia, Poland, Serbia, Romania, Bulgaria would qualify, but the Czech Rep. wouldn’t, and for some reason neither would Greece. (No intuitions about Albania whatsoever.) So, after all, this may be related to experience in some way, however subconscious and superficial it might be.

(ii) It was Balkansprachbund at work. My first impression was based – I think – on the ubiquity of hard-ish sounding [tʃ, ʃ], and [ts], coupled with a rather simple vowel set. This was consistent with the initial Slavic hypothesis, and was amplified by [tatuˈaʒɛ] (whatever the final vowel), because the word sounded so instantly recognisable. (And it finally turned out to actually mean the same as in Polish!) Of course there was the schwa in one of the first words, but I seem to remember that it’s the only common phonetic feature of Romanian, Bulgarian and Albanian that is widely cited, most descriptions of the Bund focussing on morphosyntax, while my intuition was based on Slavic-like consonantal content. Could it be that the poor Romanians, having been surrounded by the Slavic element for so long, have not only borrowed some vocabulary but also some phonotactics? Not that unlikely. I’ll have to read up on this next week; a quick Google search doesn’t help this time.

It was probably both (i) and (ii). Anyway, the intuition was largely correct. Sometimes the apparent usefulness of stereotyping in everyday life scares me.

[Guest post by Jarek Weckwerth]

Posted by Mark Liberman at 05:44 AM

French in Maine: Louis XIV lives?

After decades of stigmatization, French language use is experiencing a revival in the state of Maine, according to the New York Times. For the sizable French-American population in the state, there are "reacquisition classes" for adults wanting to brush up on their linguistic skills and immersion programs to put the young ones in touch with their roots. French also receives a modicum of official recognition from the state legislature, which observes French-American Day every year with "legislative business and the Pledge of Allegiance done in French and 'The Star-Spangled Banner' sung with French and English verses." Uh-oh! A non-English version of the anthem? Don't let Dubya know, or Mainers might find themselves disenfranchised! (He'd probably spare Kennebunkport.)

One passage in the Times article should set off warning bells for regular Language Log readers:

French-American French, derived from people who left France for Canada centuries ago, resembles the French of Louis XIV more than the modern Parisian variety, said Yvon Labbé, director of the French-American Center at the University of Southern Maine.
French-Americans may say "chassis" instead of "fenêtre" for window, "char" instead of "voiture" for car. Mr. Labbé said many French-Americans pronounced "moi" as Molière did: "moé."

Sounds like a Gallicized version of the old "Elizabethan English in Appalachia" canard.

I believe that what Sally Thomason wrote last year about the Appalachian English myth holds true here, mutatis mutandis:

There are said to be features of Shakespeare's English that are preserved in Appalachian English but not in Standard English; but they would be noticeable only because they have vanished from Standard English. The many features of Shakespeare's English that remain in Standard English are not noticeable: they're just ordinary — though they are of course what makes it possible for American high-schoolers to read Shakespeare today. I bet Appalachian English has lost some Shakespearean linguistic traits that Standard English has retained, too. Differential retention of inherited linguistic features is one thing that characterizes divergent dialects of the same language. It's not a surprise, and it's not evidence of super-archaicness in any dialect.

Similarly, just because French as it is spoken in Maine has retained some older features that are no longer present in Standard French, this does not imply that the Maine dialect "resembles the French of Louis XIV more than the modern Parisian variety." Note the slight ambiguity of that sentence: does it mean "...more than the modern Parisian variety does" or "...more than it does the modern Parisian variety"? Either way, it suggests a kinship between American dialects of French and the language of a bygone classical era, as if those Franco-Mainers are living linguistic fossils who could chat amiably with Molière. But as with the Appalachian case, retentions from le français classique specific to the Maine dialect are only noticeable because they have dropped out of Standard French usage, while retentions in the standard dialect pass by unnoticed precisely because they are still standard.

A final point about the Times article: I was surprised to learn that "census figures show Maine has a greater proportion of people speaking French at home than any other state — about 5.3 percent." I would have guessed Louisiana holds that distinction. But a check of 2000 census data, both via the MLA Language Data Center and the U.S. Census Bureau's own website, verifies that Maine is significantly ahead of Louisiana in terms of percentage of the population, though Louisiana holds the lead in terms of the raw number of French speakers. (The Census Bureau specifically includes the Cajun dialect in its definition of French.) Unfortunately, when I check the MLA site against the official numbers from the Census Bureau site, I find a bit of a discrepancy:


                     Census Bureau    MLA
Maine
  Population (5+)                     1,203,442
  French speakers                     63,610
  Percentage         5.28%            5.29%

Louisiana
  Population (5+)    4,153,367        4,152,122
  French speakers    194,314          179,750
  Percentage         4.68%            4.33%

The figures for Maine are close enough to suggest that the MLA is simply using slightly outdated numbers (the Census Bureau posts updated and corrected statistics on its site). But the Louisiana figures are a bit disturbing. How could the MLA site be missing only 1,245 Louisianans over the age of 5 while omitting a whopping 14,564 French speakers in the state? I may need to temper my praise for the MLA's presentation of census data if the site turns out to contain significant errors.
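The Louisiana arithmetic can be checked with a few lines of Python. This is only a sketch: the four figures are the ones quoted above, and the assignment of each pair to the Census Bureau or the MLA is an assumption read off from the gaps mentioned in the text; the names `census`, `mla` and `percentage` are made up for the illustration.

```python
# Louisiana figures as quoted in the post (2000 census, population aged 5+).
# Assumption: the larger figures are the Census Bureau's, the smaller the MLA's.
census = {"population": 4_153_367, "french_speakers": 194_314}
mla = {"population": 4_152_122, "french_speakers": 179_750}


def percentage(figures):
    """Share of French speakers as a percentage, rounded to two decimals."""
    return round(100 * figures["french_speakers"] / figures["population"], 2)


# How many people each category differs by between the two sources.
population_gap = census["population"] - mla["population"]
speaker_gap = census["french_speakers"] - mla["french_speakers"]

print(population_gap, speaker_gap)          # 1245 14564
print(percentage(census), percentage(mla))  # 4.68 4.33
```

The two gaps are far from proportional: the population totals differ by about 0.03 percent, the French-speaker counts by about 7.5 percent, which is why stale totals alone would not account for the Louisiana discrepancy.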

[Update #1: Cory Lubliner writes:

There is another ambiguity in the New York Times article you cited: does "the French of Louis XIV" mean French as spoken in the 17th-18th century, or French as spoken by the Sun King?  This is a significant difference: there has always been a discrepancy between French as spoken by the upper class and le français populaire.  I am not familiar with Maine French, but I have a little familiarity with Cajun French, and I find that it has numerous features in common with older forms of colloquial French, features that have been eradicated by schooling in France.  Examples: je vas in place of je vais; à c'te heure (usually spelled asteur in Louisiana) in place of maintenant; and être aprés (note the acute accent!) + infinitive to denote the progressive form of a verb. Other features are still present in the vernacular of France, such as representing the first person plural of verbs with on rather than nous, but I doubt that Louis XIV ever spoke that way.

And Jim Gordon adds some insight about Maine French:

The bushies will probably be even more alarmed than you suggest: the French spreading into Maine is not only a foreign tongue, it's Canadian (or Quebecois) imperial colonialism. And it's not just French, it's /Joual/, the dialect named for the antique pronunciation of /cheval/, for horse.
Those of us who have had occasion to live in or visit Quebec found that (a) our metropolitan-accented and -constructed Français differs from the Quebecois /joual/ dialect, and (b) the Quebecois whom we meet automatically peg us as foreigners (or Anglo snobs) and switch to English. I commend to you a useful resumé of the life and times of /joual/ at <http://en.wikipedia.org/wiki/Joual>, and the wiki-lexicon at <http://en.wikipedia.org/wiki/Quebec_French_lexicon>.

The Wikipedia entry for "Joual" notes the "moé" pronunciation of moi mentioned in the Times article and provides the following historical explanation:

Although moé and toé are today considered substandard slang pronunciations, these were the pronunciations of Old French used by the kings of France, the aristocracy and the common people in many provinces of France. After the 1789 French Revolution, the standard pronunciation in France changed to that of the bourgeois class in Paris, but Quebec retained many old pronunciations and expressions, having been isolated from the Revolution by the 1760 British Conquest of New France.

And Bill Poser emails to say:

Canadian French (of which Maine French is an offshoot) is derived to a considerable extent from the dialect of Normandy. That accounts for many of its differences with standard French.

See also the more detailed history given in The Canadian Encyclopedia:

During the 17th and 18th centuries, the different varieties of French and dialects spoken by the settlers gradually fused together into a common Canadian French tongue, which retained features that were common to all the varieties of French spoken during the colonial period as well as features that were typical of the varieties of French or dialects spoken in the provinces which exported many immigrants to New France, eg, Normandy/Perche, Poitou, Aunis and Saintonge. ]

[Update #2: Over on Languagehat a commenter takes issue with Jim Gordon's comment about "Joual," claiming:

The french spoken in Maine has nothing to do with either Quebecois or Joual. It's Acadian.

From what I've read, "joual" is not terribly well-defined, but some do identify it with various Franco-American dialects including those spoken in Maine. (The Encarta definition of "joual" specifically mentions Maine.) This article delves deeper into the problematic usage of "joual" as applied to Franco-American French. The writer argues that "French-Canadians who immigrated to New England in the 19th century surely brought along some joual with their patois."

As for the Acadian influence on Maine French, I recommend this page from the University of Maine at Fort Kent. A 1962 study by Genevieve Massignon is cited:

Massignon concluded that the French of the Upper St. John Valley (Maine and New Brunswick) was a mixed, relatively Canadianized speech in comparison to that of surrounding Acadian settlement areas. Massignon found most Maritimes Acadian communities to be 'purely Acadian' in their origins and their present-day speechways. The St. John Valley, by contrast, she found to be a mixed zone (half Acadian, half French-Canadian), where speechways reflected a blend of Acadian and French-Canadian vocabulary and phonetics, and a predominantly French-Canadian morphology.

So the situation with Maine French seems a lot more complicated than simple labels like "Joual" or "Acadian." ]

Posted by Benjamin Zimmer at 01:19 AM

June 03, 2006

Vocalize, don't vocalize, whatever

Standing around the Language Log water cooler yesterday, Ben Zimmer told us about these two reviews of Far From the Madding Gerund. The first is by Linda Seebach, who knows a little something about linguistics, having done some graduate work in the subject; her review of FFtMG is uniformly positive, and even includes a summary of the recent feud between Geoff Pullum and Mark Steyn (as documented by Mark Liberman).

The second is by Michael Quinion, who "writes about International English from a British viewpoint". Quinion concludes that the book is not for everyone; basically, if your eyes glaze over at the sight of grammatical terminology, or if you don't mind reading Language Log for free on the Web, don't bother:

This is a book to dip into, not to be read straight through. Nobody will find every item interesting [...]. If you're flummoxed by such grammatical terms as hierarchical ontology, predicate, nominalisation, count noun, or prepositional phrase, you perhaps ought to give the book a miss. If you're not sure, the miracle of the Web means you can test-read articles by popping over to the Language Log site.

Should you trust Quinion's opinion? Dear Language Log readers, I think not.

Compare Quinion's summary of one FFtMG piece with the relevant portion of the original post:

They discuss why wedding vowels often appears when wedding vows is meant (many North Americans don't fully vocalise final l, so that they say vows and vowels alike).

There are many dialects of English that fully vocalize syllable-final /l/, turning it into a high back off-glide, and for speakers of these dialects, vows and vowels have merged phonologically. They've become homophones.

I don't know how this could be more clear -- even if you have no idea at all what 'vocalize' or 'high back off-glide' mean -- but Quinion gets it exactly backwards.

Note also that, earlier in the review, Quinion states about Language Log: "The theme of the site is grammar and correctness in English." Whether this claim is true depends on the structure of this phrase:

[grammar] and [correctness in English]
[grammar and correctness] in English

If Quinion means the first of these, he's basically right, though rather trivially: the theme of Language Log is grammar, and some posts happen to be about correctness in English. But this can't be what Quinion means, because he would then be talking about the themes of Language Log, not the theme. If Quinion means the second, as I suspect, then he's just wrong: Language Log is decidedly not only about English.


Posted by Eric Bakovic at 03:05 PM

Further thoughts on the riddle of GAN

[Our story thus far... Back in March, in a post titled "Engrish Explained," I reproduced an item from a monstrously mistranslated Chinese menu posted by John Rahoi on Rahoi.com: "Benumbed hot vegetables fries fuck silk." I quoted a commenter on Rahoi's blog identified as "an anonymous professor of China studies" who explained how fuck has been used as the translation-equivalent for GAN 'dry' (干 in simplified script) due to "the recent proliferation of Colloquial English dictionaries in China." More recently, Victor Mair wrote about the GAN confusion in two guest posts for Language Log: "A Less Grand Chinglish" and "GAN: Whodunnit, and how, and why?" Then Languagehat cross-posted Prof. Mair's query wondering "Who's telling the menu-makers and sign-painters to write 'fuck' for GAN1/4?" In the comments to the Languagehat post, the "anonymous professor of China studies" from Rahoi.com stepped forward in the pseudonymous guise of "xiaolongnu," elaborating on the earlier speculation about GAN and Chinese dictionaries of colloquial English. I received permission from xiaolongnu (who wishes to remain pseudonymous) to reprint the informative Languagehat commentary as a guest post here on Language Log.]

I'm not sure my comment on rahoi.com was clear enough on this point, but I do actually have a theory about how this happened. First, you have to know that the official English teaching materials the PRC has been using since 1949 were typically based on material produced in the early 20th century (I'm guessing inter-war) and reflecting a Commonwealth usage of English. So there are a lot of oddities of usage in PRC English that can be traced back to this fact. One of the most frequently encountered in my field of archaeology and art history is "the cream of" used to mean "the best of," as in "The cream of ancient Zhou bronzes from Houma" and other memorable titles. In American English this usage is generally limited to the set phrase "the cream of the crop" and I've explained this to a number of publishers who've all said, basically, "Really? We were told it was proper idiomatic usage."

So until the mid- to late-nineties in the PRC, students of English were being taught out-of-date British(ish) idioms as correct English, which was doing nothing for their communication with a business world that increasingly speaks an Americanized and quite colloquial version of English, and in which the formal language of early 20th-century Commonwealth English is not particularly respected or desired.

The upshot is that there's now considerable awareness of (a) the difference between American English and British English, distinguished as 美語 and 英語 respectively; and (b) the need to speak and understand not just correct, but idiomatic AND colloquial English. There's a tremendous emphasis in educational programs on speaking colloquial English -- think of the guy, I forget his name, who teaches people to overcome their inhibitions about speaking English by YELLING EVERYTHING THEY SAY. And there are scads of dictionaries and vocabulary books and phrasebooks out there that purport to offer the reader the "real idiomatic American English."

Say what you will about "fuck" as a translation for 干, you can't deny that it is deeply embedded in American idiom.

So I think that the compilers of such reference books, reviewing the possible translations of the word 干, must have settled on "fuck" as the most idiomatic of the possible usages. Any China-based readers able to go out to the bookstore and find an example? The same may be true for some machine translation software out there, I have no idea. The translators of the menu were probably concerned with rendering it in idiomatic American English, because if there's one thing they know about the history of English as it has been spoken in the PRC since 1949, it's that it has tended to be too British (sorry Brits, I mean this from the PRC point of view and not my own) and too formal. Too bad about the actual result, eh?

[guest post by xiaolongnu]

Posted by Benjamin Zimmer at 01:08 AM

Naming That Tune: "We keep it incessantly in pleated skirts and scowls"

This morning, I asked about the language of a song sent in by Sven Godtvisken. JS Bangs gave the right answer right away:

The song is in Romanian. Am I disqualified because my wife is Romanian and I'm fluent in the language?

I would never disqualify anyone for knowing something!

JS continued:

Here's a transcription in standard orthography:

Gaşca-i adunată din mii
Nu s-a schimbat
În pliu fu şi în fiţe o ţine ne-ncetat
Tatuaje noi, inele şi cercei
Stând doar pe MTV
Şi nu ne pasă ce zic ei

E o lume nouă
Una nouă nouă nouă
E o lume nouă

Once more in English:

The gang's made up of thousands
It hasn't changed
It was in the crease* and always wears a scowl
New tattoos, rings and earrings
Only watching MTV
And we don't care what they say

It's a new world
A new new new one

* The first part of this line was very hard for me to make out. As I transcribed it ("in pliu fu") it literally means "he/she/it was in the crease". I'm probably either hearing it wrong or am ignorant of some colloquialism using "pliu". The Academia Română DEX doesn't have any entries for "pliu" that are helpful, alas.

Wouldn't that be "in the groove"? This is based on my English-side language model only, of course: I was able to guess that it was Romanian only by hearing some obviously Romance morphemes (mostly in other verses) and working through the possibilities by a process of elimination.

JS wrote back later with more information from his wife, the native speaker:

I wrote earlier today with a sample translation and transcription of the mystery song. My wife, the native, has since gotten home and we came up with a more satisfying (but still not great) transcription of the first three lines:

Gaşca-i adunată
Nimic nu s-a schimbat
În pliuri şi în fiţe o ţinem ne-ncetat

The gang is gathered
Nothing has changed
We keep it incessantly in pleats (i.e. pleated skirts) and scowls

I don't think we're going to do better without an actual lyrics sheet, given the famous difficulty of understanding song lyrics in any language :).

I don't know -- somehow the a priori probability of "we always keep it in the groove", as a rock lyric, seems higher to me than "we keep it incessantly in pleated skirts", which strikes me as likely to be my first encounter with a Romanian Mondegreen. But then, what do I know about Romanian rock sensibilities?

JS wrote again:

This is the last time I'm writing about this, I promise. A little extra googling revealed that the band is "Voltaj", who evidently got some sort of award from MTV in 2005.

Still couldn't find an online lyrics sheet, though, and now I really am giving up :) .

Jarek Weckwerth, unbiased by any actual knowledge of Romanian, relied on the web:

It's Romanian.

Method: Listen + google. Brute force, trial and error, simple tools, 20 minutes :)

My first impression was that it was Slavic/Balkan rock, a useful cue :)))

After first listening through the built-in speakers of my laptop, I thought I heard [dva] and [tatuaZe], which indeed seemed to point to a Slavic language (I'm Polish). But then I wasn't able to make out any other words, which would be quite unlikely if it really was a Slavic language.

I then listened through headphones, and decided to start from [tatuaZe] as [ta] was very clearly a word boundary and the word ("tattoos" e.g. in Polish) seemed to make good sense in a rock song. It started sounding familiar; last year we had a big (ugly disco) hit from Moldova (of all places), pounded ad nauseam all over Europe.

So I had a look at the Wikipedia description of Romanian orthography (simple tools, as I said) to see how [tatuaZe] could be spelled, added the next syllable, and googling "tatuaje noi" produced Romanian pages exclusively. Bingo. Then:

[tatuaZenoi | ineleSitSetSej ... emtiviSinunepas@tSezikjej]

which is compatible with the following (no Romanian diacritics... trial and error, adding one syllable at a time):

tatuaje noi inele si ce cei ... MTV si nu ne pasa ce zic iei ...

where all the substrings google to Romanian sites. (Mind, I don't know any Romanian and had no idea at this point whether this made any sense at all...) But then googling "MTV si nu ne pasa" found just two pages, both containing the lyrics.

So, it's Romanian, the band is called Voltaj, song title 1@999, lyrics available here:


(So you can see that my transcription was far from perfect.)

(And of course the presence of [@] and (especially) [ts] also helped, as well as the relative absence of nasalisation.)

What do you say?

I say bravo; hooray for Jarek, and hooray for Google, and hooray for Wikipedia. Jarek wrote back to say that

Hello again,

In fact I've just noticed that already the third hit for "tatuaje noi" points to the lyrics... Got to get those glasses at last.

The lyric sheet gives the first verse and refrain as:

Gashca-i adunata, nimic nu s-a schimbat
In chefuri si in fite o tinem ne-ncetat
Cu tatuaje noi, inele si cercei
Stam doar pe MTV si nu ne pasa ce zic ei

E o lume 9, 1@999
E o lume 9, yeah-yeah-yeah-yeah
E o lume 9, 1@999
E o lume 9, hei-hei-hei-hei.

A Romanian-English dictionary glosses chefuri as "revelry", so it was neither "in the groove" (at least not literally) nor "in pleated skirts". And it (apparently) isn't a new ("nouă") world, a new new new one; instead, it is (or rather it was) a nine (also "nouă") world, one nine nine nine ("unu @ noua noua noua"), i.e. 1999. (Or maybe 9 for "nouă" and 1@ for "una" is Romanian 733t-speak?) Of course, it probably is (or was) both a nine world and a new world, over there in Bucharest seven years ago. Anyhow, there's nothing like a rock lyric for engendering Mondegreens. Of course there's no guarantee that the lyrics posted on the site "Versuri de melodii romanesti" are correct, either -- but between pleated skirts and revelry, somebody's hearing something other than what was sung.

Other submitted guesses were further afield, e.g.

  • Based solely on general phonology, I guess Saigon Vietnamese.
  • So, my guess is that it's Albanian. I managed to hear and roughly transcribe a word which was later found on many Albanian pages --namely, "tatuazhe". Though I still don't know what it means :)
  • My guess is Euskara (Basque). I think I could identify the words "etxe" and "loa", and that was enough.
    Don't really have any idea who the artist is, nor the song name, but I'm sure there's some Basque out there who'll be sending you that.
    This was fun.

Yes, it was. Thanks, Sven!

[Update -- Stephen Smith wrote:

Regarding your follow-up to the name-that-toon post on Language Log, in the bottom, with the Romanian transcription, you find something odd that appears throughout online Romanian text (I read Romanian with middling amounts of fluency): people replace Romanian letters (such as the s with the comma, ş or ș depending on what encoding you're using) with their equivalent English sound. And it isn't only nonnative Romanian speakers, or, say, second-generation Romanian-Americans who only know how to speak and not how to write the language. Native Romanians, living in Romania, post online using this shorthand. However, they do not usually find any way to represent the vowels (â, ă, î). Romanian can be completely comprehended without the accents, and in fact many things are published without them. The anglicization is totally unnecessary and reveals a strong affinity among Internet-savvy modern Romanians for English, which all Romanian school-aged children are taught.

The "lyric sheet" you used gives the first line as: "Gashca-i adunata, nimic nu s-a schimbat," where "Gashca" is meant to be "Gaşca," where the ş makes the /ʃ/ sound. Ţ makes the /ʦ/ sound and is sometimes represented as "ts" or "tz," although this is not as prevalent as the ş/sh depiction. ]


[JS Bangs writes:

Looking at http://www.voltaj.ro/, it would appear that the song 1@999 did in fact appear on the "Risk Maxim 2" album, released in 1999. I'm sure that the play on words is deliberate. ]


Posted by Mark Liberman at 12:06 AM

June 02, 2006

Coordination gone awry

Susie Fork's page about "California Oakmoths at Elkhorn Slough", on the Elkhorn Slough National Estuarine Research Reserve site, seems to be telling us that some oak trees are insects:

So, although the oak moths can periodically wreak havoc on certain trees, oaks and oak moths have been coevolving for a long time and can be viewed as one of the many conspicuous insects of the Reserve.

[Elkhorn Slough is located near Moss Landing, on the Monterey Bay between Santa Cruz and Monterey.  I've been there, on a tour by boat -- with Geoff Pullum, in fact, back in the years before he became CRO (Chief Rant Officer) in the Language Log organization.  It's a fascinating place.  ("Slough" is pronounced "slew", by the way.)  As for the oak moths, we've been exceptionally afflicted by them this spring at Stanford -- a rain of caterpillars, then masses of cocoons, and now clouds of moths.  Ick.  Susie Fork, however, sounds pretty pro-moth.  Well, the Elkhorn Slough staff seem to value all the organisms they study.  But then they don't have to live with clumps of cocoons disfiguring the pieces in the New Guinea Sculpture Garden, the way we do.]

You can see how she got into this: the sentence is mostly about the oak moths, and "oak moths" is the most recent NP that could serve as the subject of "can be viewed as..."  Nevertheless, we end up with oaks (along with oak moths) as one of the many conspicuous insects of the slough (as well as partners in coevolution with the moths).  Not a good consequence.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 08:20 PM

Reflexive alert

G.M. Filisko writes in the ABA Journal eReport for 5/19/06, about speculations as to why J. Michael Luttig is resigning as a U.S. Circuit Court judge.  The fifth paragraph begins:

Others have speculated Luttig resigned in anger at the Bush administration over its handling of the Jose Padilla case, in disappointment caused by dimming hopes for a Supreme Court seat, or in the expectation that a private sector job will better position himself for the next Supreme Court vacancy.

I've bolded a startling reflexive pronoun.  The advice books don't warn you against such reflexives; they spend a lot of time inveighing against myself (and sometimes yourself) without an overt antecedent (as in "You should give it to Sandy and myself"), but the problem here is not that an antecedent is lacking for "himself" -- "Luttig" is the subject of a clause that begins "Luttig resigned" and takes up the rest of the sentence -- but that "Luttig" is structurally too distant from the reflexive to count as an antecedent for it.  English speakers and writers rarely produce such reflexives, so examples like the Luttig sentence might easily escape the notice of the grammar and usage mavens.  Even if they did collect such examples, they'd probably be at a loss to say why reflexives (rather than plain personal pronouns) are odd in them.

In English.  Similar examples in Japanese, with the all-purpose Japanese reflexive zibun, are fine.  What's going on?  And why is (2) so much worse than (1), even granting that (1) isn't stellar?

(1)  Luttig expects that a private sector job will better position himself for a Supreme Court vacancy.

(2)  I'm speculating about Luttig that a private sector job will better position himself for a Supreme Court vacancy.

[Original example from Victor Steinbok, who got it from Ann Althouse.]

First, a little background about English reflexives.  (Seasoned syntacticians can skip this part.)  The big generalization (subject to some provisos) here is that in English reflexives and their antecedents must belong to the same clause, in a special sense of "belong":

Definition: An occurrence X of some expression within a sentence Y belongs to the smallest clause within Y that contains X.

Clause-Mate Condition: Reflexive pronouns and their antecedents must belong to the same clause.

Look at (1), a pared-down version of the original Luttig sentence.  The clause that the reflexive pronoun "himself" belongs to is "(that) a private sector job will better position himself for a Supreme Court vacancy", and this clause does not contain the (intended) antecedent for "himself", namely "Luttig"; instead, "Luttig" belongs to the (larger) clause that comprises all of sentence (1).  Similarly for (2).  Neither (1) nor (2) satisfies the Clause-Mate Condition, so both should be ungrammatical, and equally so.
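For readers who like to see a definition run, here is a minimal Python sketch of the Clause-Mate Condition as just stated. It is only an illustration, not an analysis: the `Clause` class and the functions `smallest_clause` and `clause_mates` are names invented for the purpose, the clause bracketing of (1) is entered by hand, and words are identified by their spelling, so a sentence with a repeated word would need real occurrence tracking.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clause:
    """A clause: the words it contains directly, plus any embedded clauses."""
    words: List[str]
    subclauses: List["Clause"] = field(default_factory=list)


def smallest_clause(clause: Clause, word: str) -> Optional[Clause]:
    """Return the smallest clause containing `word` (None if it is absent)."""
    # Look inside embedded clauses first, so the deepest match wins.
    for sub in clause.subclauses:
        found = smallest_clause(sub, word)
        if found is not None:
            return found
    return clause if word in clause.words else None


def clause_mates(sentence: Clause, reflexive: str, antecedent: str) -> bool:
    """True if both words belong to the same clause, in the sense defined above."""
    home = smallest_clause(sentence, reflexive)
    return home is not None and home is smallest_clause(sentence, antecedent)


# Sentence (1): "himself" is in the embedded that-clause, "Luttig" in the main clause.
embedded = Clause(
    "a private sector job will better position himself for a Supreme Court vacancy".split()
)
sentence_1 = Clause(["Luttig", "expects", "that"], [embedded])

# A control sentence in which reflexive and antecedent share a single clause.
control = Clause("Luttig positioned himself well".split())

print(clause_mates(sentence_1, "himself", "Luttig"))  # False
print(clause_mates(control, "himself", "Luttig"))     # True
```

The check returns a plain yes or no, which is exactly why the condition by itself predicts that (1) and (2) should be equally bad.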

When I first looked at the Luttig sentence, I thought that Filisko chose a reflexive pronoun for its emphatic or intensifying effect (one of the motives for antecedentless myself and yourself).  Reflexives are good for this because they are weightier than accusative personal pronouns: they have two syllables rather than one, and always bear some accent, while accusative pronouns are usually unaccented.  And maybe that's all there is to say about the matter.

But then I noticed that the clause that the reflexive belongs to is the complement of a deverbal noun denoting a thought, "expectation", and specifically LUTTIG'S thought; Filisko could have written "in his expectation that...", but the identity of the expecter is perfectly clear for the version "in the expectation that..."  Then the light came on and the word logophoric popped up in my head.

Logophoric pronouns are used in complements of verbs of saying or thinking to refer to the person responsible for the words or thoughts. (For some general discussion, see Peter Sells's influential "Aspects of Logophoricity" in Linguistic Inquiry, 1987.)  Reflexive pronouns in some languages have logophoric uses, probably on the grounds that the speaker or writer is taking the viewpoint of the person whose speech or thought is being represented.  The Japanese all-purpose reflexive zibun has logophoric uses; the literature on zibun is now enormous, but for a recent survey of some relevant facts, see Yukio Hirose's article "Viewpoint and the nature of the Japanese reflexive zibun" (Cognitive Linguistics, 2002).

In any case, "himself" in (1), and in the original Luttig sentence, is in the right place for a logophorically used reflexive, while the reflexive in (2) is not -- which might explain why (2) is so much worse than (1).

It turns out that I am not the first person to suggest that English, or at least some varieties of English, might have logophoric uses of its reflexive pronouns.  The most useful discussion I've been able to unearth (in an admittedly quick search) is in a handout for a talk ("The interpretation of logophoric self-forms, and some consequences for a model of reference and denotation") given by Volker Gast at the 5th Discourse Anaphora and Anaphora Resolution Colloquium in 2004.  (There is also a fair amount of literature on logophoric and other non-clause-mate reflexives in earlier stages of English, where they were apparently more frequent than they are now.)  It's a handout, but a very detailed one, with a fair number of attested examples and some bibliography.

In fact, I am not the first person to discuss, right here in the halls of Language Log Plaza, what might be logophoric uses of English reflexive pronouns.  Back in January, a guest posting by Chris Culy (no slouch on logophoricity), "Getting ourselves in trouble", started from the Darrell Waltrip quote

He told me I talked and talked and talked, and eventually I'd say something that would get myself in trouble.

(which has the right context for a logophoric reflexive)  and went on to explore some other non-clause-mate reflexives in English that are pretty clearly not logophoric -- all this in response to Geoff Pullum's straightforward deploring of George W. Bush's "ourselves" in

And so long as the war on terror goes on, and so long as there's a threat, we will inevitably need to hold people that would do ourselves harm.

(which is certainly not logophoric, but is 1st person, so might fall into another category of non-clause-mate reflexives).

So it might be that the original Luttig sentence comes from a variety of English -- one that is not mine, or Steinbok's, or, I would imagine, Pullum's -- where logophoric reflexives are possible.  There's a lot of variation in language, after all, and so far as I know, no one has ever claimed that logophoric reflexives are impossible in English on some principled grounds, which means that finding varieties that have them should be cause for a small celebration: here's something that languages can do, and, by golly, here are varieties of English (who knew?) that do it.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 05:58 PM

How do you spell "xenophobia"?

Blogospheric reactions to last night's results from the Scripps National Spelling Bee, in which Katharine Close of Spring Lakes, NJ vanquished Finola Hackett of Tofield, Alberta, have tended toward the facetiously jingoistic. "Jersey Girl Triumphs Over Canadian Menace," crows Wonkette. And the blogger on My All Time Top 5 adds: "Honestly, it's hard to come up with a result more satisfying than New Jersey sticking it to Canada in direct competition. (Well, I guess the other girl could have been French.)" If you were wondering about the Canadian participation, it turns out the "national" spelling bee actually accepts contestants from the entire Anglosphere (or at least wherever there's an official sponsor); this year there was a European speller (Amanda Suarez, winner of the European Spelling Bee), as well as representatives from American Samoa, the Bahamas, Guam, Jamaica, New Zealand, Puerto Rico and the U.S. Virgin Islands.

But a lot of the bloggers' orthographic nationalism was directed toward the words used in the final rounds of the competition. Ursprache beat out Weltschmerz? Kundalini? Izzat? What izzat? And there might have been whispers of conspiracy when Hackett, whose home schooling in Canada naturally includes extensive study of French, kept getting one Gallicism after another: guilloche, douane, machicotage, esquisse, tutoyer. (C'mon! Tutoyer? You're telling me this thing wasn't rigged for bilingual Canadians?)

So how come our spelling bee has been taken over by foreignisms? The obvious answer is that these are the hardest words for most Anglophones to spell. Once you get past the usual Anglo-Saxon roots and Latin and Greek combining forms, things start to get tricky. And more recent borrowings from foreign languages are more likely to have a phonemic/graphemic mismatch from an English speaker's perspective. (Older loanwords very often develop either spelling pronunciations or pronunciation spellings to bring the written representation more into line with what is pronounced.) The kids who train for the spelling bee must therefore be savvy enough to know something of the sound systems of major world languages and how words from those languages get adapted in spoken and written English. That's why it was a bit of a shocker when Hackett started spelling Weltschmerz with a v, neglecting a basic rule of German orthography.

Perhaps the best take on the spelling bee's foreign tilt is this finely wrought piece of journalistic satire from The Swift Report:

Group Objects to Words of Foreign Origin in National Spelling Bee

During this year's national spell-off, contestants were forced to puzzle out words of Spanish, Greek, Latin American, homosexual, even French origin. Now some native-born bee watchers say they've had enough. If they get their way, spelling bees from elementary schools to the nation's capital will soon be conducted in English only.

The article goes on to quote a representative from the (fictitious) lobbying group ProEnglishFirst: "Our position is that if you're going to spell in this country you ought to be spelling words that are native to our language." As Wonkette ruefully observes, in these days of rampant linguistic isolationism, the Swift Report piece "doesn't clearly read as satire — which is kinda scary."

Posted by Benjamin Zimmer at 02:34 PM

Northeastern's school song: Get me rewrite!

Last night was the glittering spectacle of the 11th annual Northeastern Alumni Night at the Pops, at Symphony Hall in Boston, featuring a warm congratulation to outgoing President Richard Freeland and the introduction to the university community of its new president, Joseph Aoun. And since this is the first time an active researcher in the linguistics profession has taken over the presidency of a major research university, naturally Language Log's Boston-area linguistics reporting staff was there to cover the event.

For Barbara and me, the musical highlight was the Gershwin selection: "Summertime" never sounded so lush and emotional, and the performance of "Rhapsody in Blue" made you want to go home and throw away your recordings of it lest they sully the memory. That orchestra is extraordinary. Symphony Hall was packed to the rafters, alumni from classes as far back as 1957 and even earlier were there in a great mood, and champagne flowed downstairs for donors. Welcoming applause for President Aoun was loud and enthusiastic. Everything was great. Except for one thing. They had to do the school song, of course. And they have a problem. The primary school song, bearing the not exactly original title "Alma Mater", is syntactically a real stinker. President Aoun, you are a syntactician. You have got to have the lyrics of that song reworked.

For a performance of the song, click here to listen to it done a cappella by The DownBeats. Now let's look at the lyrics:

Oh, Alma Mater here we throng
And sing your praises strong.
Your children gather far and near
And seek your blessings dear.
Fair memories we cherish now
And will forever more.
Come, let us raise our voices strong.
Northeastern we adore.

To start with, the opening line has a ludicrous verb choice. One does sometimes talk about crowds thronging a sports stadium, but the intransitive use of the verb is almost unknown. Nobody says "Let's make sure we throng at the Alumni Night on June 1." The only excuse for this absurd verb is (as in so much bad verse) that you need something that a bunch of other words will rhyme with. But the only thing rhymed with this is strong — which is used twice in eight lines! That's 25 percent!

I'm also not sure about gather far and near. People came from far and near for the evening at the Pops, but does gather far and near make any sense at all? There are essentially no Google hits at all for this phrase. A close look at the handful of candidates (other than the NEU song web page) reveals that there is a verse in a Scottish Christmas song with the lines "Clansmen and kith / Will gather, far and near", but the comma is crucial: it allows far and near to be a loose adjunct to clansmen and kith rather than a complement or modifier of gather. The few other distinct hits all seem to be from songs or verse and they're a sorry bunch. Basically this phrase is gibberish despite its two or three past citations.

But worse is to come, when we look at the syntax.

Attributive adjectives are syntactically required to be positioned before the head nominal in English (the postpositive adjective construction as in anyone competent is highly restricted, and not relevant here). I know that in archaic poetry, and bad recent poetry that imitates it, we get lots of exceptions; but it is not what the syntax of ordinary contemporary English permits, and you need a lot of pressure to allow poetic license to justify it. In this miserable verse three out of the four attributive adjectives are placed postnominally in defiance of the ordinary syntax of the language: *praises strong for strong praises; *blessings dear for dear blessings; *voices strong for strong voices. Only fair memories satisfies the syntactic conditions. (All the adjective choices are of course pathetic. Blessings are dear? Memories that are fair? Do you talk like this?)

[Update: it is a very interesting fact that people have been emailing me to say that they thought strong was being used as an adverb: that sing your praises strong was supposed to mean "sing your praises strongly (in a loud clear voice)". The thing is, this would be non-standard. Hit it hard is Standard English. [!]Treat me nice is not. And certainly [!]We should oppose it strong is not (I'm not sure strong works well as an adverb even in non-standard dialects). The writers of these execrable lyrics surely intended to write in the language that is Northeastern's medium of instruction. And that would be Standard English, where the use of zero-derived adverbs (the ones like hard that look exactly like their adjective cousins) is extremely restricted. So no, you cannot read strong as an adverb. But it's very interesting that the post-nominal adjective reading is so unnatural that some of you tried to parse it as an adverb.]

But the lamest thing is (and I know you will see what I mean, President Aoun) the verb-final last line. This uses a construction that linguists often call "topicalization" (not a good name: it doesn't make anything topical, or turn anything into a topic). The Cambridge Grammar uses the better term complement preposing. At this point I have to get slightly technical, but not unbearably so.

The Cambridge Grammar (chapter 16, section 3, pp. 1372-1376) distinguishes focus complement preposing from non-focus complement preposing, and discusses the different discourse conditions of the two, but nothing that could make the construction appropriate seems to hold here. In fact, the construction can't really be used at all in most contexts where its meaning might suggest it was usable. If you saw your friend's mother yesterday and you want to tell your friend about it the next day, you say:

I saw your mother!

It would be bizarre beyond belief to say this instead:

???Your mother I saw!

This is how Yoda tends to talk. But he has an excuse: notice, he is of an alien species. The key element of Yodic syntax is use of complement preposing where the discourse conditions cannot possibly motivate it, leading to simple clauses with pointless Complement - Subject - Verb order. And that's what you've got in this song.

The complement preposing construction works best when you have two noun phrases, one with property A and the other with property B, and you want to contrast them by introducing each up front and then saying what the two contrasting properties are:

This color, I like; that one, I absolutely hate.

So this would be a powerful pair of lines (albeit subject to the charge of negative campaigning, since it sort of downplays the adorability of another university):

Harvard, we admire
But Northeastern, we adore!

I'm not recommending that the song should hurl this mild insult across the Charles River, you understand; I'm just pointing out that it would be a stylistically good use of complement preposing.

But to have Northeastern we adore just sitting there as the last line of a song that is all about Northeastern, where there is no contrast at all, implied or explicit, is bathos at its worst. It's utterly and profoundly lame. It has to go.

President Aoun, this won't be the most popular thing you do as president, but the tough decisions are best made right at the start. You have to ditch that song, and ditch it now. You don't need a song that reads as if it was written by Yoda. I'll come back for the next Northeastern Alumni Night at the Boston Pops next June, and I want to hear different lyrics to that song. (The young woman who sang it was great, by the way. Keep her.)

Posted by Geoffrey K. Pullum at 10:23 AM

Name that Tune ('s language)

Reader Sven Godtvisken writes:

I would like to solicit Language Log to help with a problem originally posed to me by a friend. This friend sent me a song, but she lacked the track information, as it came from a copied CD (the creator obviously did not put this information on the CD.) Now, the song is in a language unknown to both of us...

I think this is probably too easy if I give you the whole song (and I don't want to bring down some transnational version of the RIAA on Sven and his friend and her CD supplier -- not to speak of me) so here's the first verse and the refrain. Extra points if you figure it out by searching on-line dictionaries for transcribed words, but if you just recognize the language, that's great too, and it's even better if you know the band. Send me your answers and I'll post a summary.

[The answer to Sven's question is obviously a matter of contingent fact, but its wording reminds me of a certain genre of logic puzzle. So as a bonus, I'll add a link to Neil Sloane's notes from G4G7, which include a link to Peter Winkler's paper "Seven Puzzles You Think You Must Not Have Heard Correctly". They're all great, but I especially liked "The Dot-Town Suicides":

Each resident of Dot-town carries a red or blue dot on his (or her) forehead, but if he ever figures out what color it is he kills himself. Each day the residents gather; one day a stranger comes and tells them something -- anything -- non-trivial about the number of blue dots. Prove that eventually every resident kills himself.

Comment: "Non-trivial" means here that there is some number of blue dots for which the statement would not have been true. Thus we have a frighteningly general version of classical problems involving knowledge about knowledge.

"Love in Kleptopia" is also special fun, since even quite young children can understand it.]


Posted by Mark Liberman at 05:07 AM

"Big" in Japan (and Bali)

Matt of No-sword, an excellent blog on all matters Japanological, recently brought up an interesting case of lexical borrowing across multiple languages:

I always assumed that the company name Bikku Kamera meant "big camera", and was just an example of "bad man/Batman" double consonant devoicing. But a co-worker noted yesterday that according to Wikipedia, I was wrong:

創業者の新井会長が、バリ島を訪れた際に現地の子供たちが使っていた「ビック、ビック」という言葉に、「偉大な」という意味があると聞いて社名に使った。
On a trip to Bali, company founder Arai heard local children using the phrase bic, bic, and, told that it meant idai (great, grand), used it as the company's name.

On the other hand, it seems that this Balinese bic itself derives from English "big". This would mean that my "devoiced /g/" theory was accurate as far as it went, but the assumption that this took place within Japanese was mistaken.

Lesson: Loanwords in Japanese are a psychedelic fever swamp.

Bad news, Matt. This cross-linguistic telephone game looks to be even more feverishly complicated than that.

Matt's working theory is that Mr. Arai heard Balinese children saying "bic, bic" and that the reduplicated word bic (presumably pronounced as [bik] or [bɪk]) is a nativized version of English big with final devoicing. I find this theory improbable on several counts. First, in my research in Indonesia I've never heard of children in Bali or elsewhere using the English word big in this way. To be sure, big has been borrowed into Indonesia's national language and its related regional languages like Balinese — mostly in set English phrases like big boss (often spelled as big bos) or Big Bang. I suppose it's possible that some Balinese kids picked up the word big and began using it repetitively to mean 'great, grand,' but if they did, we wouldn't expect final devoicing as in [bik] or [bɪk].

Devoicing is unlikely because Balinese, like other nearby Austronesian languages, has no problem with final /g/. The Balinese wordlist on Robert Blust's Austronesian Basic Vocabulary Database includes such words as beseg [bəsəg] 'wet' and kerug [kərug] 'thunder'. Another Balinese word for thunder also has final /g/: grudug, which can be intensified through echo reduplication as gradag-grudug. (Even more evocatively, the sound of a sudden thunderclap can be onomatopoeticized as jeg gradag-grudug!) Other common Balinese words ending in /g/ are gedeg 'angry', jegeg 'beautiful', belog 'stupid', kreteg 'bridge', togog 'statue', and awig-awig 'village regulations'.

What makes it even less likely that the children overheard by Arai were saying [bik] or [bɪk] as some sort of nativized version of big is the fact that the voiceless velar stop [k] generally doesn't occur word-finally in Balinese. The letter k, when it appears orthographically at the end of a Balinese word (like pianak 'child' or barak 'red'), is actually pronounced as a glottal stop [ʔ].

So what did Arai actually hear? Could the Balinese children have been saying big, big in mimicry of English, which was perceived by Arai himself as devoiced, thus yielding the company name BicCamera? That's a possibility, but I think there's a more likely source for Arai's bic, and it has nothing to do with English. In the national language of Indonesian, an extremely common word is baik [baiʔ], meaning 'good, fine'. In conversation, baik is frequently repeated as baik, baik, much as one might say "okay, okay" in English. (There's also a reduplicated word baik-baik, meaning 'well, carefully, in good condition'). Most children in Bali's urban centers code-switch between Balinese and Indonesian (or, increasingly these days, just stick to Indonesian). I'd wager that the children Arai heard were actually saying baik, baik, which doesn't quite mean 'great, grand' but is close enough. (Perhaps Arai wanted to aggrandize the source of the company name by giving it that gloss, or perhaps the meaning was simply embellished in translation.) I'm not clear on exactly how baik might have ended up as Japanese bic, though I suppose in rapid speech [baiʔ] might sound something like that to the untrained ear.

So to recap, it appears that a Japanese businessman heard Balinese children using a common Indonesian expression, shifted the semantics and phonology a little bit, and ended up with a company name that can be interpreted as a devoiced version of a common English word. Ain't globalization fun?

Posted by Benjamin Zimmer at 01:33 AM

June 01, 2006

Tutoyer, koine, tmesis, Ursprache

Sure, most Language Log readers know those words (or should), but what about seventh and eighth graders? Those linguistic terms were among the words spelled by the two finalists in the closing rounds of the 2006 Scripps National Spelling Bee. In fact, Ursprache was the winning word, spelled correctly by Katharine Close of Spring Lakes, NJ. (Perhaps her championship word portends a career in historical linguistics?) Congratulations to Katharine, the first winner of the spelling bee to hail from New Jersey — and the first to win on live prime-time TV. Katharine beat out Finola Hackett of Tofield, Alberta, who was stumped by another Germanism, Weltschmerz.

[On the American Dialect Society mailing list, Larry Horn commented:

I was waiting for the young girl who had been given "tmesis" to spell to ask the Associate Pronouncer (yes, that's an actual officer) "Do you mean as in fanfuckintastic?" But she forbore. ]

Posted by Benjamin Zimmer at 10:15 PM