October 31, 2007

I am neither America nor a snowclone

It's been over a month since I last issued this edict.  Here's the Halloween version:

Not every reworking of an idiom, cliché, proverb, catchphrase, memorable quotation, or title represents a snowclone.  In fact, most such reworkings are just playful allusions to the original.

I say this because Mark Liberman reported a reworking of the Colbert title I Am America (And So Can You!) as She's Famous (And So Can You) and labeled it a new snowclone, and then Geoff Pullum followed up with yet another variation on the Colbert theme, Colbert for President, and you can too.  There's no snowclone here, and there probably never will be one -- just people riffing on a notable syntactic peculiarity of the original, Verb Phrase Ellipsis (VPE) in which the base form be is omitted (for details, look here).

As I've said several times in the past, clear snowclones are themselves formulas, and the snowclone template itself contributes meaning to instances of the snowclone.  Here's a summary from my website (based on a Language Log posting from 2005):

[Snowclones] have two-part histories, a first phase in which a fixed model gains currency, a second in which variations are played on the model, sometimes leading to a second fixing, a crystallization of these playful allusions into a snowclone.

... Sometimes, every part of the model that can be varied for effect is:

Eye Guy: Queer Eyes for the Spanish Guys, Straight guys for gay eyes, Homosapien eye for the Neanderthal guy

Brokeback: Backdoor Mounting, Breakdance Mountain, Brokeback Mounties

In snowcloning, these variants become fixed as formulas with open slots in them, and with a mostly calculable meaning: [e.g.] One man's X is another man's Y.  It's still possible to play creatively with the expression, but most variants will fit the template.

So far we have two variants on the Colbert title (surely there will be more), and they share almost nothing but the conjunctive form and the odd VPE.  It's hard to see how this thing could settle into a formula with (a few) open slots in it, and even harder to imagine what the semantic contribution of the formula would be.  This looks like another Eye Guy or Brokeback case, not like that warhorse of Snowclonia, The New Y (X is the new Y).
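For the programmatically inclined, here is one way to make "a formula with open slots" concrete. It's only a minimal sketch, assuming that a snowclone template can be approximated by a regular expression whose named groups stand in for the slots X and Y. The names `TEMPLATES` and `match_snowclone` are made up for illustration, and real snowclone-spotting would need a good deal more than pattern matching.

```python
import re

# Hypothetical templates: each open slot (X, Y) becomes a named group.
TEMPLATES = {
    "the new Y": re.compile(r"^(?P<X>.+?) is the new (?P<Y>.+)$", re.IGNORECASE),
    "one man's X": re.compile(
        r"^one man's (?P<X>.+?) is another man's (?P<Y>.+)$", re.IGNORECASE
    ),
}


def match_snowclone(phrase):
    """Return (template name, slot fillers) for the first template that fits, else None."""
    # Drop surrounding whitespace and sentence-final punctuation before matching.
    cleaned = phrase.strip().rstrip(".!?")
    for name, pattern in TEMPLATES.items():
        m = pattern.match(cleaned)
        if m:
            return name, m.groupdict()
    return None
```

Something like "Pink is the new black" fills both slots of the first template. The two Colbert riffs fit neither pattern, and it's hard to say what pattern one would even write for them.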

As I've also said several times, playful allusions to formulaic expressions are incredibly common.  Here are a few more from various sources:

Teachers Sold Separately (Harper's, November 2006)
The Kids Are Far Right (ditto)
A Farewell to Alms: A Brief Economic History of the World (recent book)

(Some involve imperfect puns, some do not.)  There's no point in trying to collect these things, because new ones crop up every day; you'd soon end up with tens or hundreds of thousands of them -- none of them formulas.

Now, there ARE hard-to-decide cases, and I'll return to some of them in a later posting.  But there are also plenty of clear cases, and at least for the moment it looks like variation on the Colbert title is one of them: clearly not a snowclone.

Posted by Arnold Zwicky at 08:05 PM

"You don't have to say, 'Use English!'"

Greetings from the US Airways terminal at LaGuardia, where I'm waiting for a flight to Ithaca. A propos of the cartoon that Mark posted earlier today, I wish to acknowledge a not-crazy piece of language commentary in a public venue, namely the US Airways inflight magazine, US Airways Magazine. A subsection of a story about how to interlocute with your teenager is devoted to coping with the strange, barbaric language being developed by Young People Today:

That way, when your daughter speaks to you in a language she got straight from a hip-hop CD, you don't have to say, "Use English!" She is using English, but it's in a special sociolect that isn't yours. (In fact, a teenager could argue that the illogical rules and punctilios of correct grammar constitute a code language that's not any different from her own hip-hop lingo.)
The surrounding text even helpfully provides some kind of pop interpretation of identity construction through language choice. So props to you, US Airways Magazine, and contributing author Jay Heinrichs, for exhibiting some evidence of having come into contact with a Ling 101 course, or relevant equivalent content, sometime in the past.

(See how I used 'props' there? Hope I got it right -- it's one of them words I've picked up from Young People Today, who I am sadly apparently no longer one of, what with being hyperconscious of 'props' as an innovation and all (unless I'm having a recency illusion). Consulting the Urban Dictionary, I think I'm pretty close. One of the definitions there says it's short for 'propers', which just sounds crazy to me, but checking in with Wikipedia, I see that that itself might be short for 'proper recognition', and that there's a gesture for communicating this very same notion, the 'fist pound'. I knew about the gesture but not its connection to the word 'props'.)

Update: Ok, so 'propers' is something I've been singing happily to myself without understanding for years, darnit, in 'Respect', a song recorded before I was born; it's short (apparently) for 'proper respect'. What a dweeb!1 Thanks to grixit for that info.

And since I'm logged in here, and one of Mark's other Halloween posts is about science reporting, I thought I'd pass on the link to one of the most spectacularly surreal BBC "science" reporting tours de farce I've run across lately, forwarded by my colleague Andrew Carnie. Apparently H.G. Wells' imaginary future of Eloi and Morlocks is a realistic possibility: today's modern humans will divide into two subspecies, 'gracile' and 'robust', "according to a report commissioned for men's satellite TV station Bravo."

a) Isn't Dr. Oliver Curry a bit young to be bitter and unsuccessful already, and answering calls from the Dogberts at Bravo? Update: So of course a little elementary Wikipedia work would have alerted me to the fact that Dr. Curry feels his essay was taken out of context and misrepresented by Bravo and subsequent articles, surprise surprise...
b) Is this the same TV network known as Bravo here in the US? If so, it's not really accurate to describe them as a "men's satellite network". Update cont: ...and to the fact that 'Bravo' is indeed a men's network in the UK. Thanks to Lance and Barbara Zimmer for these clarifications!
c) 'Science reporting' is supposed to be about 'science', which is often these days made up of a series of claims together with supporting evidence which has appeared or will appear in a peer-reviewed academic journal of some kind. Science does not often originate in special reports commissioned by television networks (though sometimes it shows up there later).

Shouldn't somebody get the BBC Science team a subscription to Science or Nature or something? Their reporters seem to have been reduced to trolling for stories while surfing the late-night offerings. Next thing you know we'll see BBC Science articles about the Bowflex machine or laser-sharpened kitchen knives. Update cont. cont. ...though of course, as Mark points out from the IcelandAir gate at BWI, the late-night trolling hypothesis would explain how they came up with the idea to cover the burning scientific issue of breast enlargement.

1 "Propers" is an interesting formation, assuming it is from "proper respect", though. Clippings usually respect the categorization properties of their head nouns, so I wouldn't expect a reduction from "proper respect" to "proper" to pluralize, since "respect" is a mass noun (*respects). What I guess it must be is a clipping of just the 'pect' part, leaving "proper.res" with subsequent reduction of that "res" syllable, so you end up with a superficially plural-looking form, which is probably then treated like "proper" + "pl", interpreted, probably, like 'scissors' or 'thanks', as a pluralia tantum... and *that* is shortened thence to "props". But this is all just armchair etymologizing; there's nothing in any of the sources I can access from my hotel room here in Ithaca about it (e.g. it's not in the AHD, the OED, or the Cambridge Dictionary of American English). Anyone out there either a) have intuitions or b) know of other sources on it? (It must be in DARE, no?)


Posted by Heidi Harley at 07:00 PM

False logic and linguistic blindness: you could look it up

According to James Hrynyshyn ("a freelance science journalist based in western North Carolina"), "There is no such thing as a 'woman president'":

You know the English language is in trouble when both NPR and the BBC World Service decide that "woman" is an adjective, as in "Argentina has just elected its first woman president." As a copy editor, I had to fix that one numerous times, usually in the copy of young reporters whose excuse was that the proper adjective, "female," was too clinical, and they didn't want their story to read as if it concerned a science project. Oh really?

First, that's no excuse. "Woman" is noun. Look it up.

Well, the English language systematically and generally allows nouns to be used as modifiers, so I don't really need to. But if you insist...

The Oxford English Dictionary's entry for woman has:

II. attrib. and Comb.

6. a. Simple attrib. = 'of or characteristic of a woman or women, feminine, womanly'

Citations are given back to the 16th century:

1542 UDALL Erasm. Apoph. 29 The woman sexe is no lesse apte to learne al maner thynges then menne are.
1621 LADY M. WROTH Urania 104 Woman modestie kept her silent.

And continuing to the present day:

1971 V. CANNING Firecrest vi. 83 He put his arm round her shoulder..and felt through silk the warmth and firmness of woman flesh.

Even more relevant is the specific subentry devoted to the "woman president" type of construction:

b. appos. (a) = 'female', esp. with designations of occupation or profession: woman doctor, driver, -help, journalist, officer, p.c., police officer, -savage, teacher, etc.

Examples of this one are cited back to the early 14th century:

a1300 Cursor M. 29420 If þou wit þi woman frend Find clerk be doand dede vn-hende.

The construction is in common use in the 16th and 17th centuries:

1530 PALSGR. 289/2 Woman coke, cuisiniere.
1617 MORYSON Itin. I. 258 The famous woman poet Sapho.
1632 BROME Court Beggar V. ii. (1653) S3b, What Woman Monster's this?
1659 D. PELL Improv. Sea Ep. Ded. dj, Wee are so wise now, that wee have our woman Politicians.

And it's cited through to the present day as well:

1968 R. L. FISH Bridge that went Nowhere iv. 44, I might have known it would be a woman driver!
1972 L. LAMB Picture Frame xviii. 154 A woman p.c. was clearing an outside drain.
1973 'B. MATHER' Snowline x. 121 I'll send a couple of woman officers along.
1976 R. LEWIS Witness my Death i. 36 You've shown all the worst traits that can be expected in a woman doctor.
1976 Southern Even. Echo (Southampton) 11 Nov. 32/5 A chase through rush-hour crowds ended with a suspected shoplifter escaping into the darkness..as he was pursued by a woman police officer.
1982 A. BROOKNER Providence ix. 108, I wonder why they didn't send a woman teacher.

The plural women is sometimes used in a similar way:

1935 D. L. SAYERS Gaudy Night vii. 147 There are much better ways of enjoying Oxford than fooling round..with the women students.
1971 Guardian 15 Apr. 11/1 The diocese of Hong Kong, the only diocese out of 300 to have stated openly its support for the ordination of women priests.
1981 'A. CROSS' Death in Faculty ix. 106 Most women students..don't really believe women professors actually exist.

I think that the quotes from Dorothy Sayers and Carolyn Heilbrun ("Amanda Cross") pretty much settle the matter; but just for fun, I'll point out that John Dryden's translation of the Aeneid joins the chorus:

1013 "Vain hunter! didst thou think through woods to chase
1014 The savage herd, a vile and trembling race?
1015 Here cease thy vaunts, and own my victory:
1016 A woman warrior was too strong for thee.
1017 Yet, if the ghosts demand the conqueror's name,
1018 Confessing great Camilla, save thy shame."

As for man, it has its attributive uses as well. Returning to the OED, we find (among many others):

1530 J. PALSGRAVE Lesclarcissement 242/2 *Man nourse, novrricier.
1786 C. POWYS Diary May (1899) 225 To the play I went... The men actors at this period do not shine in London.
1893 Ladies' Home Jrnl. Apr. 39/1 There is no impropriety in a man friend writing to you without having asked your permission.
1922 J. JOYCE Ulysses II. 517 What's our studfee?... You fee men dancers on the Riviera, I read.

It's surprising how often this happens. Someone invents a theory about the nature of the English language, entirely unsupported by any evidence or argument beyond his own whim about what is "logical" (or "consistent" or "traditional" or just plain "correct"), and starts hectoring everybody for writing or talking in a fashion that's been normal among educated speakers for centuries -- in this case, 700 years.

You also can generally expect to find that the complainer fails to heed his own advice. In this case, a quick scan of the first third of the front page of Mr. Hrynyshyn's blog turns up at least the following examples of nouns used as modifiers:

sports reporting, sleeping habits, science journalist, reincarnation nut-case Shirley MacLaine, computer models, greenhouse gases, climate change, chemistry professor, Nobel Committee, peace prizes, Ecology Center, bottom line, climate science, "climate porn", media coverage, the climate change front ...

(Geoff Pullum discussed the general question of nouns as modifiers in March of 2005, and it has come up in other contexts as well, for example here.)

[Hat tip: Mark V. Paris]

[Update -- Ben Zimmer points out that William Safire dealt with this question at some length a few months ago.]

Posted by Mark Liberman at 10:51 AM

And so can you (be)

The quick eye of Mark Liberman recently spotted what may be the fastest ever emergence of a new phrase into snowclonehood when Stephen Colbert's book title I Am America (And So Can You!) was picked up by Guy Trebay of the New York Times after just three weeks: Trebay's pastiched article title She's Famous (and So Can You) has just the same syntactic property -- an ungrammatical (or at least strikingly and off-puttingly unusual) deletion of a repeat occurrence of be. [I'm assuming here that I am America (and so can you be!) is fully grammatical and acceptable, and so is She's famous (and so can you be). At least one reader has written to say he disagrees with this. The near-prohibition of deleting non-finite forms of be under identity of sense was studied in a nice doctoral dissertation by Nancy Levin at The Ohio State University some years ago. And Arnold Zwicky gave a careful and serious discussion of the syntax of Colbert's title back in May, in this post.]

Emergent snowclone, thought Mark when he noticed the pastiche. Well guess what: the Current Strawpoll at the Doonesbury site as of today is "Colbert for President, and you can too." Mark really is one heck of an emergent snowclone spotter. And you can too.

Posted by Geoffrey K. Pullum at 09:26 AM

The psychodynamics of science in the media

This morning's Dilbert advances a theory about why the coverage of science in the mass media is so bad.

But I think it's wrong to pin the blame on personality defects and moral flaws in individual scientists and journalists. Nor should we over-emphasize the malign influence of the Dogberts who run the show, and the Rent-a-Weasels who support them. (And while we're cataloguing defects and flaws, let's not forget the crucial role of the folks in the public relations departments, who select topics and write the press releases that seed most science stories. But the flacks are not really the villains here either.)

It's true that things would be better if individual scientists were less willing to over-interpret or mis-interpret in order to make a splash; and if PR people were less eager to encourage and help them; and if individual journalists had the time and the ability to do some critical reading in the primary literature, instead of just decorating press releases with a few quick quotes from experts; and if media executives were not too focused on bean-counting to care one way or another about any of this. But focusing on individual failings ignores the fact that all of the people involved -- scientists and journalists and executives and rent-a-weasels -- are responding to the normal economic and psychological forces within their diverse subcultures, which interact badly in their areas of overlap.

On the whole, the whole system of science and engineering does a pretty good job of creating knowledge and technology. On the whole, the media do a pretty good job making information available to the public. Put them together, and the whole is noticeably less responsible than the parts.

In fact, the subcultures of science and journalism are so ill-matched that it's amazing we don't get a steady diet of stories about imaginary studies, non-existent animals with superpowers, alarmist fantasies, misleading or fabricated numbers, and so on.

Oh, wait...

Anyhow, individual virtue is always to be prized. But if you want better science journalism, you need to change the cultural and economic forces that give us the system we have.

Better education, especially in statistics -- for the audience as well as the creators of the stories -- is probably the one thing that would make the biggest difference. And commentary in the "new media" -- especially weblogs -- adds another cultural force whose influence is mostly positive.

Posted by Mark Liberman at 06:51 AM

Linguistic change in progress

The usual idea is that Young People Today are developing their own strange, barbaric language (see "LOL?", 4/30/2007). So it's nice to see this morning's Zits turning the trope upside down:

Posted by Mark Liberman at 06:48 AM

October 30, 2007

Taboo avoidance in translation: kros words

Deborah Cameron's new book The Myth of Mars and Venus continues to draw press attention in Britain, the latest coming from a column by Damian Whitworth in the Life & Style section of the Times. And just like Susannah Herbert's Times review three weeks ago (discussed by Mark Liberman in an earlier post), Whitworth opens with an ethnographic vignette from the work of linguistic anthropologist Don Kulick (though Kulick goes unnamed here):

In Gapun, a remote village on the Sepik River in Papua New Guinea, the women take a robust approach to arguing. In her pithy new book The Myth of Mars and Venus, Deborah Cameron reports an anthropologist’s account of a dispute between a husband and wife that ensued after the woman fell through a hole in the rotten floor of their home and she blamed him for shoddy workmanship. He hit her with a piece of sugar cane, an unwise move that led her to threaten to slice him up with a machete and burn the home to the ground.
At this point he deemed it prudent to leave and she launched into a kros — a traditional angry tirade directed at a husband with the intention of it being heard by everyone in the village. The fury can last for up to 45 minutes, during which time the husband is expected to keep quiet. This particular kros went along these lines: “You’re a f****** rubbish man. You hear? Your f****** prick is full of maggots. Stone balls! F****** black prick! F****** grandfather prick! You have built me a good house that I just fall down in, you get up and hit me on the arm with a piece of sugar cane! You f****** mother’s ****!”

In her review of Cameron's book, Herbert presents the obscenities of the quoted kros in a similar expurgated style. A casual reader of the Times might be led to believe that Gapun villagers speak a mixture of English and asterisks.

For the relevant passage Cameron cites Kulick's 1993 article in Cultural Anthropology, "Speaking as a Woman: Structure and Gender in Domestic Arguments in a New Guinea Village" (available via JSTOR), which elaborates on the findings Kulick presented in his book Language Shift and Cultural Reproduction. The tiny village of Gapun where Kulick conducted his fieldwork has a population of about a hundred, and villagers speak both the English-based creole language Tok Pisin and a Papuan vernacular called Taiap. As is typical in Papua New Guinea, Taiap is spoken in Gapun and nowhere else, and the language is increasingly endangered as young villagers abandon the vernacular for Tok Pisin. Older speakers still code-switch between Taiap and Tok Pisin (not English and asterisks).

The kros quoted by Cameron (and bowdlerized by the Times) was spoken by a woman named Sake, who was in her early thirties at the time of Kulick's fieldwork. Kulick describes her as the most dominant woman in Gapun, as she "commands respect and vague feelings of fear from all villagers because of the speed and intensity with which she can hurl long monologues of abuse at anyone she feels has imposed on her in some way." The kros (lit. cross; fit of anger) is the verbal genre in which Sake hurls her vituperation, towards her husband Allan and others in the village.

Here is the unexpurgated translation of the kros in question as given by Kulick in the 1993 article, with regular text representing (translated) Tok Pisin and underlined text representing (translated) Taiap:

Though Kulick doesn't provide a transcript of the original kros in Tok Pisin and Taiap, he does give an indication of how he has chosen to translate Sake's invective in a footnote to the article:

In order to give readers a sense of the tone and emotive force of the words used in a kros, I have avoided literal translations and have instead translated vernacular and Tok Pisin speech into a colloquial form of American English. As for the translation of obscene speech, the word I have rendered as "fucking" in Taiap is the word "bad" (aprɔ) plus an emphatic lexeme (sakar) used only in context of abuse. The anatomical references in the obscenity are fairly literal translations of the originals; thus maya pindukunga aprɔ sakar, which I have glossed as "fucking mother fucker," is literally "mother fuck+NOMINALIZER bad EMPHATIC."

Regardless of the expurgation (and the glossing over of the translation issue), it's unclear why Herbert and Whitworth found the kros so noteworthy as to lead their respective pieces on Cameron's book. For Herbert, it presents an opportunity to muse that Cameron "would give a first-class kros, and enjoy it, too." For Whitworth, it's an opening for some marital humor (and some tweaking of journalistic rules for censoring obscenities):

Such a domestic scene may be familiar to some readers, but for most of us arguing with our partners is not quite such an explosive business; except, perhaps, when discussing who is most responsible for a navigational hiccup on the way to lunch at the home of an old flame of our partner’s, or getting to the bottom of who left the ****** ******* cap off the **** ******* toothpaste for the third ****** ******* time this ****** ******* week.

The six-letter + seven-letter combination of asterisks is easy enough to figure out (though motherfucking is more commonly spelled as a single thirteen-letter word without a space or hyphen). But what's the four-letter + seven-letter combination preceding "toothpaste"? I'm guessing Whitworth and his editors missed a couple of asterisks in there, unless it represents a vulgarity so vulgar that it transcends conventional strings of avoidance characters. Sake would be proud.

[Update: Two readers have already suggested that the four + seven asterisk combination could stand for cock-sucking. Kind of a strange way to describe toothpaste...]
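The asterisk-counting game can be written down in a few lines. This is just a sketch, assuming that each run of asterisks stands for exactly one word part of the same length, and that spaces and hyphens in a candidate mark the boundaries between parts. The function name `fits_mask` is invented for illustration.

```python
def fits_mask(mask, candidate):
    """True if each whitespace-separated group in `mask` is as long as the matching part of `candidate`."""
    mask_lengths = [len(group) for group in mask.split()]
    # Treat hyphens and spaces in the candidate as part boundaries.
    part_lengths = [len(part) for part in candidate.replace("-", " ").split()]
    return mask_lengths == part_lengths
```

Run against the Times column, `fits_mask("****** *******", "mother fucking")` and `fits_mask("**** *******", "cock-sucking")` both come out true, while the six + seven candidate fails against the four + seven mask.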

Posted by Benjamin Zimmer at 12:27 AM

October 29, 2007

Statured pitchers, statured scientists

Continuing the baseball theme... Super-agent Scott Boras announced last night that Alex Rodriguez would opt out of his contract with the New York Yankees and declare free agency. Boras (who conveniently made the bombshell announcement during the final game of the World Series) told the Associated Press that A-Rod opted out because he was uncertain whether Mariano Rivera, Jorge Posada and Andy Pettitte would return to the Yankees:

"Alex's decision was one based on not knowing what his closer, his catcher and one of his statured pitchers was going to do," Boras said. "He really didn't want to make any decisions until he knew what they were doing."

This isn't the first time Boras has referred to a baseball player as statured:

Anytime we assume that an amateur player will be a statured major league player, that expectation is to the detriment of the player. ... A statured player like Pujols would be at the top of the market with his historic performance. (Baseball Prospectus online chat, Aug. 9, 2005)

I think the New York fan is a special fan. They expect their team to win, and when the team doesn't win, the statured players are the ones they point to. (New York Post, Feb. 7, 2007)

I haven't spoke to John in well over a year, which is contrary to what general managers do when they have statured players on their team. (Atlanta Journal-Constitution, Oct. 2, 2007)

Statured almost always appears as the second half of a hyphenated compound, with the preceding modifier denoting an extreme of height or build (small/large, short/tall, high/low) or else something in between (mid, middle, medium, average, normal, full). Boras' use of bare statured may be his own particular verbal mannerism, or perhaps he's a fan of E.O. Wilson.

The relevant sense of statured pops up in some older dictionaries — the Century Dictionary (1889-91) gives one definition as "of or arrived at full stature," while Webster's Revised Unabridged Dictionary (1913) has "arrived at full stature." Webster included this sense in his 1828 dictionary, but said it was "little used" even then. Over the years the word has shown up primarily in poetic usage:

Lo, as I gaze, the statured man,
Built up from yon large hand, appears
A type that Nature wills to plan
But once in all a people's years.
"The Hand of Lincoln," Edmund Clarence Stedman (1897)

When Edward O. Wilson published Consilience: The Unity of Knowledge in 1998, consilience (meaning the "jumping together" of scientific knowledge) wasn't the only rare old word that he revived for his argument. In his discussion of the climate crisis, he wrote:

Some will, of course, call this synopsis "environment alarmism." I earnestly wish that accusation were true. Unfortunately, it is the reality-grounded opinion of the overwhelming majority of statured scientists who study the environment. By statured scientists I mean those who collect and analyze the data, build the theoretical models, interpret the results, and publish articles vetted for professional journals by other experts, often including their rivals. I do not mean by statured scientists the many journalists, talk-show hosts, and think-tank polemicists who also address the environment, even though their opinions reach a vastly larger audience. (p. 307)

The turn of phrase "statured scientists" is not original to Wilson, however, since he was apparently echoing the sentiment of the Chairman of the Advisory Committee on the Environment of the International Council of Scientific Unions, James McCarthy, as quoted in Ross Gelbspan's 1997 book The Heat Is On: The High Stakes Battle over Earth's Threatened Climate:

"There is no debate among any statured scientists of what is happening," says McCarthy. By "statured" scientists he means those who are currently engaged in relevant research and whose work has been published in the refereed scientific journals. (p. 22)

The McCarthy/Wilson/Boras usage of bare statured moves the word away from what the Oxford English Dictionary terms "parasynthetic derivatives," i.e., "compounds one of whose elements includes an affix which relates in meaning to the whole compound; e.g. black-eyed ‘having black eyes’ where the suffix of the second element, -ed (denoting ‘having’), applies to the whole, not merely to the second element." Linguablogger Brett Reynolds recently explored compounds of this type in two posts on English, Jack, citing the Cambridge Grammar of the English Language (p. 1709) for the treatment of the -ed suffix in such formations.

Statured, when used on its own, joins other examples of X-ed meaning 'possessing or characterized by X' such as moneyed or cultured. (See also Mark Liberman's commentary on faithed.) But statured strikes me as a bit peculiar, since it's typically associated with adjectives of size, similar to other parasynthetic derivatives like (large/small)-sized or (large/small)-framed. We wouldn't expect sized or framed to appear on its own to mean 'having a full size/frame.' But perhaps statured can take on the extended meaning because the base form stature already sounds somewhat big and important, as opposed to the more neutral-sounding size or frame.

Then again, though we don't find people described as sized without a preceding modifier, we do find the faux-PC-ism person of size, a riff on person of color. And what do you know, in an interview with the New York Post about A-Rod, Scott Boras used "pitcher of stature" rather than "statured pitcher":

We just weren't prepared to make an economic decision like this until we knew the philosophy of the club and what was happening with key players. We are talking about a pitcher of stature, a catcher and a closer.

I like that construction better, if only because it lends itself to a bit of doggerel:

Rub-a-dub-dub, three men on a club.
And who do you think they be?
A closer, a catcher, a pitcher of stature,
But who will remain a Yankee?

(Need to work on the scansion of that last line.)

Posted by Benjamin Zimmer at 11:56 AM

Incoming links

Some recent articles referencing Language Log: Caryl Rivers and Rosalind C. Barnett, "The difference myth", Boston Globe 10/28/2007; Faye Flamm, "Sex with a feminist is great, survey finds", Philadelphia Inquirer 10/29/2007; Carol Lloyd, "Linguists: 'Moist' makes women cringe", Salon 10/29/2007.

Posted by Mark Liberman at 09:45 AM


In honor of the Boston Red Sox' World Series sweep, here's a small linguistic gem from Curt Schilling's blog, for Arnold Zwicky's Verb Phrase Ellipsis collection. This comes from a darker time, back in mid-September when it looked like maybe the Sox would collapse and lose the pennant race to the New York Yankees ("Character test incoming", 9/20/2007):

Ouch. I certainly envisioned the start on Sunday ending in a much different manner than it did, but it didn’t.

That would have been the game on Sept. 16, when Schilling lost 3-4 to Joba Chamberlain and the New York Yankees, after giving up three runs in the eighth inning. But it takes some thought for a Red Sox fan (which I was as a boy, growing up in eastern Connecticut) to decide how to feel about a game that didn't end in a much different manner than it did.

Curt is a total VPE monster -- some less striking examples from later in the same post:

For those second guessing anything, from me being in the game to my throwing a ‘hanging’ fastball, don’t.

I had Jason in a spot to do a bunch of different things to end the AB, and didn’t.

Tip your hat to the Yanks for playing as well as they have too, regardless of how well anyone has done they’ve put themselves into this position by answering the bell when they needed to.

Posted by Mark Liberman at 09:14 AM

Improving moral standards

From his travels in China, Victor Mair sent a photo of a sign that shamelessly flouts the maxim of quantity.

As Victor says, "It's hard to be a civilized citizen at the Wenshu Temple in Chengdu, Sichuan. There are so many rules that must be remembered."

But did the sign-maker intend any Gricean implicature? In a U.S. tourist attraction, a similarly-detailed sign would suggest some strange local history, or at least an eccentric caretaker.

Put this sign in The Onion -- perhaps with a few small tweaks -- and it could be the basis for an amusing feature on the hard life of a temple guard ("The worst? Checking all the elderly worshippers for clean underwear") or the difficulties of a bureaucrat with Asperger's ("Refer to Appendix A for a full list of the 112,367 activities that are specifically prohibited in the temple: disposal of nuclear waste, archery practice, marshmallow roasts, filming pornographic movies, quail hunting, ...").

In China, though, it seems that signs with long lists of diversely forbidden (or recommended) activities are common. So maybe the only semiotic implicature is that "this is a place with rules, for example these".

[John Burke writes:

At Orchard Beach in the Bronx, 50 or so years ago, there was a large sign--I'd guess maybe four feet wide by three feet high--listing the rules. Almost the entire left half of the sign was filled by the word "NO;" the right-hand half, in considerably smaller letters, specified things like "Running, Ballplaying," etc. etc.--quite a lot of fairly inoffensive (as it seemed to my teenage self) beach activities. Then at the bottom, printed across the full width of the sign, the deathless bureaucratic admonition: "This is your beach. Enjoy it!"

A bit of Google Image search turns up plenty of signs like this one:

Perhaps it's just that when the language is more idiomatic, we don't notice the length of the list so much. But still, the Chengdu list does include some things that would seem a bit out of place on a tourist-site sign in the U.S., I think, no matter how they were translated: "Don't waste food. ... Do not be out for small advantages. ... Resist superstition. ..."]

[OK, I give up. Matt Bishop sent this:

West Wickham has got Chengdu beat by a mile -- or at least by a couple of furlongs -- in the explicit-lists-of-bad-acts derby.]

Posted by Mark Liberman at 08:08 AM

October 28, 2007

Linguistics in the funny papers

In today's Sally Forth, we learn that America's 8-year-olds have been taught to Avoid Passive. (We'll see on my Linguistics 001 midterm tomorrow how many 18-year-olds have learned to identify passive-voice clauses accurately, so as to avoid them if they choose to so do.)

That single panel may be a little puzzling, so here's the context:

In other linguistics news from the Sunday paper, Doonesbury illustrates the extent to which George Lakoff has succeeded in making the concept of framing so much a part of everyday vocabulary that a popular comic strip can make it the basis of a pun, without having to set it up or explain it.

Posted by Mark Liberman at 08:21 AM

October 27, 2007

That didn't take long

18 days after the October 9th publication of Stephen Colbert's "I am America (and so can you!)", the Fashion & Style section of the New York Times ran a story by Guy Trebay about Tila Tequila under the headline "She's Famous (and So Can You)".

I'm sure that this is not the quickest snowcloning in history, but I can't think of a quicker one at the moment.

Posted by Mark Liberman at 05:40 PM

Vowels on vacation

The ad (caught in the 11/6/07 Advocate, p. 5) offers:

The most nights in the most ports with the most vowels.

Specifically Hawaiian vowels:

NCL America's 10- and 11-day inter-island Hawai'i cruises aboard Pride of Aloha give you the option of longer cruises with overnights in Honolulu, Kaua'i and Kahului.

(NCL is Norwegian Cruise Lines, though it's not so identified in the ad.)

I wonder if anyone runs pony treks in the consonant-heavy Caucasus, for those who like to tough it up on their vacations.

[Added 10/28/07: Several readers have reminded me of the 1996 Onion piece "CLINTON DEPLOYS VOWELS TO BOSNIA -- Cities of Sjlbvdnzv, Grzny to Be First Recipients".]

Posted by Arnold Zwicky at 11:05 AM

Political semiotics: new significance for old signifiers

It's hardly news that people judge politicians, to some extent, on the basis of superficial characteristics. But what with Larry Craig, Mark Foley, Robert Bauman, Ted Haggard, David Dreier and the rest, the contextual meaning of certain features may be changing, as Ruben Bolling's most recent comic strip points out:

Posted by Mark Liberman at 10:55 AM

Interpreting magical language

Yesterday I complained, again, about someone interpreting a slight statistical tendency in the complex behavior patterns of diverse groups (snap judgments of relative competence based on brief glimpses of politicians' faces account for 5-10% of the variance in vote share) as a categorical fact about the behavior of all individuals at all times (election outcomes are entirely determined by the candidates' appearance). In response, Garrett Wollman writes:

I don't hold out much hope for the public coming to a better understanding of statistics and probability. It's been 223 years since the original text of this quotation was written:

The art of concluding from experience and observation consists in evaluating probabilities, in estimating if they are high or numerous enough to constitute proof. This type of calculation is more complicated and more difficult than one might think. It demands a great sagacity generally above the power of common people. The success of charlatans, sorcerors, and alchemists -- and all those who abuse public credulity -- is founded on errors in this type of calculation.

- Antoine Lavoisier and Benjamin Franklin, Rapport des commissaires chargés par le roi de l'examen du magnétisme animal (Imprimerie royale, 1784), trans. Stephen Jay Gould, "The Chain of Reason versus the Chain of Thumbs", Bully for Brontosaurus (W.W. Norton, 1991), p. 195

(This quotation is taken from <http://en.wikiquote.org/wiki/Antoine_Lavoisier>, but I'm the editor who put it there.)

In the context of Gould's article, he of course takes issue with the bit about "generally above the power of common people", but does not disagree with the sentiment in general. I'm feeling a bit too lazy to grab my copy of /Bully for Brontosaurus/ and reread that essay at the moment, but I suspect whatever Gould was deploring in 1991 can only have gotten worse over the ensuing sixteen years.

Well, one step at a time. I don't think it's unreasonable to expect that journalists and political commentators should come to understand things like how to interpret a scatter plot, what r² is and what it means, what it means to talk about two roughly normal distributions whose average values differ by about ten percent of a standard deviation, etc. More important -- and even easier -- they should learn to demystify general claims that come wrapped in the magical language of statistical sorcery, and ask simple questions like "how many of what?"

In fact, I suspect that things are getting better in most ways, not worse. There is a systematic effort to incorporate ideas about probability and statistics into school curricula in the U.S. -- here are some of the class projects assigned by a widely-used middle school mathematics text:

Is Anyone Typical? Students apply what they have learned in the unit to gather, organize, analyze, interpret, and display information about the "typical" middle school student.

The Carnival Game: Students design carnival games and analyze the probabilities of winning and the expected values. They then write a report explaining why their games should be included in the school carnival.

Dealing Down: Students apply what they have learned to a game. They then write a report explaining their strategies and their use of mathematics.

Estimating a Deer Population: Students simulate a capture-recapture method for estimating deer populations, conduct some research, and write a report.
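For readers who haven't met capture-recapture, here is a minimal Python sketch of the idea behind the last project. It assumes the simplest two-sample version (often called the Lincoln-Petersen estimate); the textbook's own procedure isn't described here, and the function names and herd sizes below are invented for illustration.

```python
import random


def lincoln_petersen(marked_first, caught_second, recaptured):
    """Two-sample estimate: N is roughly marked_first * caught_second / recaptured."""
    if recaptured == 0:
        raise ValueError("no marked animals were recaptured; the estimate is undefined")
    return marked_first * caught_second / recaptured


def simulate_survey(true_population, first_sample, second_sample, seed=0):
    """Simulate one mark-and-recapture survey of a hypothetical herd."""
    rng = random.Random(seed)
    animals = range(true_population)
    # First visit: catch some animals and mark them.
    marked = set(rng.sample(animals, first_sample))
    # Second visit: catch another random sample and count the marked ones.
    second = rng.sample(animals, second_sample)
    recaptured = sum(1 for animal in second if animal in marked)
    return lincoln_petersen(first_sample, second_sample, recaptured)


if __name__ == "__main__":
    # Hypothetical herd of 500 deer; mark 100, then catch 100 more.
    print(simulate_survey(500, 100, 100, seed=1))
```

The reasoning is just proportions: if a fifth of the second catch is marked, the marked animals are probably about a fifth of the whole herd.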

I don't know how well the nation's middle-school teachers understand these concepts -- probably there is a wide range of variation -- but perhaps those who don't know them will be learning along with their students.

My youngest son, who is in the sixth grade, has been given homework assignments based on the collection and interpretation of simple statistics about (for example) evaluations of a school trip, in which participant responses were divided into various subgroups (students vs. teachers, students in different grades, and so on). The techniques involved were fairly simple -- percentages and a few different sorts of graphs. But the assignments asked the students to calculate, represent and reason about group differences in (for example) the proportions of sixth graders vs. seventh graders who enjoyed the trip, at a level of sophistication that I would love to see regularly reproduced in the New York Times or the BBC News.

In my opinion, the biggest part of the problem is shock and awe in the face of unfamiliar ideas presented in an intimidating way. Seeing a new piece of jargon or an unfamiliar equation -- or even suspecting that they might see one if they looked further -- many people seem to freeze up and surrender their intellects, falling back on crude reasoning about group archetypes ("women are unhappy"), exaggerated and simplistic causal connections ("teachers' gesturing makes students learn 3 times better"), and naive reductionism ("the gene for X").

There are some cases where you need to understand an equation in order to evaluate a claim. For example, in order to evaluate the claims of Groseclose and Milyo in their widely-reported study "A Measure of Media Bias", you need to see that their mathematical model (apparently by mistake) embodies the assumption that conservatives care only about authoritativeness in deciding which sources to quote, whereas liberals weigh authoritativeness and ideology equally (see here and here for discussion). This is not very hard to understand. The right-hand side of the relevant equation is the sum of three terms, two of which are single variables while one is the product of two variables, of the form y = a + bx + e. You just have to reason a bit about what b and x are, and what it means to multiply them. This is not exactly the topos of presheaves on the poset of commutative subalgebras, you know?

But even that level of sophistication is not required, most of the time. Usually it's enough to say something like "wait a minute, never mind the beta coefficients and odds ratios for a minute, what are the numbers here? How many people diagnosed with the syndrome did you test, and how many of them had the genomic variation that you identified? And what were the same numbers for the control subjects? OK, so 77% of the people without the disease had this genomic variation, versus 84% of those with the disease? And what is the frequency of the diagnosis in the general population? About 2.7%? So if we used this as a screening test, assuming these proportions apply to the general population, let's see, the contingency table would look like this, right? And so, um, wait a minute, the false positive rate would be 97.1%? OK, thanks, now I see what's going on."
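The arithmetic in that imagined conversation can be spelled out in a few lines of Python. This is only a sketch: the three percentages come from the example above, while the function names and the population of 100,000 are made up for convenience. Note that the 97.1% figure is the share of positive test results that turn out to be false.

```python
def screening_table(prevalence, p_marker_given_disease, p_marker_given_healthy,
                    population=100_000):
    """Build a 2x2 contingency table for a hypothetical screening test."""
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * p_marker_given_disease    # sick, has the variation
    false_neg = diseased - true_pos                 # sick, lacks the variation
    false_pos = healthy * p_marker_given_healthy    # healthy, has the variation
    true_neg = healthy - false_pos                  # healthy, lacks the variation
    return {"true_pos": true_pos, "false_neg": false_neg,
            "false_pos": false_pos, "true_neg": true_neg}


def share_of_positives_that_are_false(table):
    """Of everyone who tests positive, what fraction does not have the disease?"""
    return table["false_pos"] / (table["false_pos"] + table["true_pos"])


if __name__ == "__main__":
    # 2.7% prevalence; 84% of patients and 77% of controls carry the variation.
    table = screening_table(0.027, 0.84, 0.77)
    for cell, count in table.items():
        print(f"{cell:>9}: {count:10.0f}")
    print(f"{100 * share_of_positives_that_are_false(table):.1f}% of positives are false")
```

With these inputs, about 2,268 of the roughly 77,189 people who test positive actually have the diagnosis, which is where the 97.1% comes from.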

This is just middle-school math, plus a few simple but useful concepts like contingency table and false positive rate. And while there may be some intellectuals who are a little fuzzy on percentages, I think most educated Americans can (learn to) handle this stuff just fine. But when was the last time you saw this kind of discourse from a science journalist or a columnist?

I'm not trying to suggest that statistical analysis can or should be replaced by inspection of tables and graphs based on counts and simple derived quantities like proportions and percentiles. But we'd be a lot better off if (for example) journalists and other public intellectuals understood basic concepts built out of these simple parts -- histograms, contingency tables, and so on -- and insisted on understanding research at this basic level before getting to the more sophisticated methods.

The next step would be to understand and apply the general concept of "confidence interval". In fact, I think that most educated people have at least a blurry sort of idea about what this means, even if they don't apply the idea consistently.

And beyond that, we might hope that someday a few percent of the population might understand factor analysis well enough to understand and evaluate the various debates about IQ.

And OK, sure, it would be nice if most of the journalists assigned to the biomedical beat fully understood (say) how stratification is used to control for confounding variables in logistic regression, and why it can fail. But one step at a time.

[Update -- Rob Malouf sends in a link to a post on Tritech.us from 7/13/2007, "Model Fitting, WSJ Style", which suggests that letting journalists loose with numbers and graphs might not always be the wisest policy:

This is what happens when conservative journalists take a crack at statistics. An editorial in today's Wall Street Journal attempts to fit a relationship between a nation's corporate tax rate and the resulting tax revenue, as a fraction of gross domestic product (GDP). The amazing result is shown above, taken directly from the newspaper's website.

Now, few would argue that, taken over the entire unit interval, there should be some sort of unimodal relationship here -- clearly at both 0% and 100% taxation, there will be no tax revenue. This hypothesized relationship is known as the Laffer curve (I'll spare you the obvious pun). However, over the realized range of the data, there is a conspicuous increasing linear trend, albeit with much residual noise. This suggests that (1) this relationship is pretty messy, and not a very informative univariate model, and (2) the optimal rate may in fact lie beyond the upper range of the data! What is clear is that the article's author has no inkling of how to fit regression lines to data. Notice that if you extend the curve on the right hand side, it would intersect zero at about 32%. Apparently a complete tax revolt takes place (don't tell France and the US).

Now, I'm no Wall Street Journal basher (I'm a subscriber, in fact), but this is sophomoric journalism that only Fox News would be proud of. I expect better from one of the country's best national newspapers.

If I were the sort of person who sketched qualitative functional relationships on napkins, I might suggest that a plot of idiocy relative to education starts fairly high (at zero education), rises for a while with increasing education (because a little knowledge is a dangerous thing, especially in the case of arrogant people who think they know what the answer is before examining the facts), and eventually falls (because those of us who devote our lives to education need this faith). But I'm not, so I won't.]

Posted by Mark Liberman at 10:14 AM

October 26, 2007

Snap judgments of competence and the rhetoric of statistics

This morning, Andrew Sullivan linked to a news report about a study said to show that "[a] split-second glance at two candidates' faces is often enough to determine which one will win an election".

Sullivan's comment: "Maybe everything else is just make-work".

Actually, no.

The paper is Charles C. Ballew II and Alexander Todorov, "Predicting political elections from rapid and unreflective face judgments", PNAS published online 10/24/2007.

In their experiments, snap judgments of competence from facial appearance accounted for between 5% and 10% of the variance of vote share, wherefore "everything else" accounted for between 90% and 95%. These different outcomes represent the results of various different experimental techniques -- different lengths of picture presentation, presence or absence of response deadlines, etc. -- applied to different collections of gubernatorial and senatorial races.

To get a graphical sense of what this amount of prediction is like, here's a scatter plot, charting the proportion of subjects who judged a candidate as more competent-looking than his or her opponent, against that candidate's share of the two-party vote, for their largest experiment (89 gubernatorial races) and the best-performing experimental condition (250 msec exposure to the pictures, r² = 0.053, i.e. 5.3% of variance accounted for):

SI Fig. 5. Scatter plot of the two-party vote share for the candidates and their perceived competence (Experiment 1). Each point represents a gubernatorial race. The line represents the best fitting linear curve.

Now, these were two-party races, so you should expect to predict the outcome from the flip of a fair coin 50% of the time, leaving 50% to be predicted from facts about the candidates and the voters. In their experiment 1, predicting 89 gubernatorial races, subjects' snap judgments of competence in their best-performing condition were able to predict the winner 68.5% of the time (the other conditions were right 59.6% and 62.9%). That decreased the chance error rate by (68.5-50)/50 = 37%, which is certainly a considerable help in prediction, almost half as good as the (huge) effect of incumbency for House races. The results in their experiment 3 (35 2006 governor's races) were similar: 68.6% prediction of the winner from subjects' aggregate snap judgments of competence.
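The "decrease in chance error rate" calculation is simple enough to write down as a short Python sketch. The percentages are the ones quoted above; the function name is invented for illustration.

```python
def chance_error_reduction(accuracy_pct, chance_pct=50.0):
    """Fraction of the errors made by guessing that the predictor eliminates.

    A coin flip in a two-party race is right 50% of the time, so it leaves
    (100 - 50) points of error; the predictor removes (accuracy - 50) of them.
    """
    return (accuracy_pct - chance_pct) / (100.0 - chance_pct)


if __name__ == "__main__":
    # Winner-prediction rates reported for the three conditions of
    # experiment 1 and for experiment 3.
    for accuracy in (68.5, 59.6, 62.9, 68.6):
        reduction = chance_error_reduction(accuracy)
        print(f"{accuracy}% correct -> {100 * reduction:.1f}% of chance errors removed")
```

The first line of output corresponds to the (68.5-50)/50 = 37% figure in the text; the weaker conditions remove correspondingly less of the chance error.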

So there's definitely something going on there. But it's a small effect on vote share (accounting for less than 10% of the variance), whose leverage as a predictor of election winners is perhaps made larger by the fact that most races are fairly close, so that swinging a small number of votes can sometimes determine the outcome. This is a long way from justifying the conclusion that people's votes are in general determined (or even very reliably predicted) by their first impression of a candidate's appearance, or the conclusion that everything in politics except what a candidate looks like is "just make-work".

I like Andrew Sullivan's writing a lot -- and his reaction in this case was surely a wry joke -- but it would be a better world if influential people would read and understand the primary sources in cases like this one, rather than relying on the (reliably unreliable) press releases and mass media to tell them what such "studies" and "research" mean. I guess it would be almost as good if the journalists who act as front-line interpreters did a better job.

Of course, this would require everyone involved to have basic statistical literacy, which in this case means understanding what "percent of variance accounted for" means. And people would have to get used to evaluating (reports of) scientific results with the same skeptical care as policy recommendations.

This story is on the wires now, and will be hitting the papers over the next few days, so you can judge for yourself how helpful and accurate the media's interpretations are, and what the uptake from political commentators is like. My own guess is that we'll see plenty more evidence of the kind of pop platonism that characterizes most people's way of thinking about quantitative properties of groups.

[You might wonder how incumbency and snap judgments of appearance interact. Funny you should ask: Ballew and Todorov devote a section of their paper to showing that in these experiments, the effect of snap competence judgments is independent of incumbency. But I was very interested in this note from their "supplementary material":

Incumbency Status and Competence Judgments. Although we showed that the effect of competence judgments was independent of incumbency status for Senate races in our prior work (3), this was not the case for the House races. For these races, competence judgments predicted the winner only in races in which the incumbents won. There are a number of differences between House and Senate races and it is not clear how to interpret the latter finding. There is less media exposure to House candidates than to Senate candidates, and it is likely that many voters are unfamiliar with the faces of their House candidates, a possibility that suggests different accounts of voting decisions in House and Senate races. It was also impossible to obtain pictures of both candidates for all House races and this may have introduced unknown biases in the sample of these races.

Or maybe better campaign organizations hire better photographers and hairdressers, and pick better head shots from the available possibilities?]

[Update -- Andrew Gelman at Statistical Modeling, Causal Inference, and Social Science blogged about an earlier Todorov paper back in April ("Baby-faced politicians lose", 4/27/2007). He counsels care in interpreting such results:

You have to be careful in interpreting the results, however. Todorov et al. seem to be saying that individual voters' visual "inferences of competence" are affecting votes. Another story, perhaps more plausible, is that the more competent-looking people are the ones who rose to political success.

My main point here is that no matter which way(s) the causal arrows point, the overall effect is a fairly small one, which is a very long way from licensing the belief that vote share is entirely determined by appearance.]

Posted by Mark Liberman at 02:03 PM

Monkeys will check your grammar

Ash Asudeh sends this quotation (found through the graces of DaringFireball) from Jason Snell talking about new features in Leopard, the new release of Apple's Mac OS X:

"What I'd never pick: Grammar Check—at last, the most useless feature ever added to Microsoft Word has been added to Mac OS X! With this feature, an infinite number of monkeys will analyze your writing and present you with useless grammar complaints while not alerting you to actual grammatical errors because computers don't understand grammar. Sure, it sounds great on a box—or a promotional Web site—but anyone who knows, knows that grammar checking is a sham. Just say no."

Spot on, Jason. Nice to see some intelligent ranting about this. Computer grammar checking really is terrible. The programs in question devote most of their effort to trying to catch the most easily diagnosed prescriptive shibboleths; for example, it is not too difficult to spot a sequence consisting of "to" followed by an adverb and a verb, so that a split infinitive can be complained about. They also do basically brainless things like looking for masculine third-person singular pronouns (he, him, his, himself) so they can warn you that perhaps you should say "him or her". Since these tasks are so easy, they score some successes there, without being of much use. But it's extremely rare for them to catch anything subtle, tricky, or genuinely helpful.

They can't really even help with standard nonsense like discouraging the use of passives (see this page for a listing of Language Log posts on the passive), because they are basically hopeless at identifying passive clauses -- even more hopeless than college-educated American adults, which is not setting the bar very high. They mostly can't help with subject-verb agreement errors because they are unable to spot which noun the verb should be agreeing with. And they cannot warn you off singular antecedents for they, because they can't figure out which antecedent a given pronoun has. The things they are good at, like spotting the occasional the the typing error, are very easy, and there are very few of them. For the most part, accepting the advice of a computer grammar checker on your prose will make it much worse, sometimes hilariously incoherent. If you want an amusing way to while away a rainy afternoon, take a piece of literary prose you consider sublimely masterful and run the Microsoft Word™ grammar checker on it, accepting all the suggested changes.
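To see how shallow the "easy" checks are, here is a rough Python sketch of two of them: doubled words, and "to" followed by something adverb-like and another word. These patterns are illustrative guesses at the kind of surface matching involved, not a description of how Word or Leopard actually works, and all the names are made up.

```python
import re

# The same word twice in a row, e.g. "the the".
DOUBLED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# "to" + a word ending in -ly + any word: a crude stand-in for
# "to, followed by an adverb and a verb". No real parsing is done.
NAIVE_SPLIT_INFINITIVE = re.compile(r"\bto\s+(\w+ly)\s+(\w+)", re.IGNORECASE)


def find_doubled_words(text):
    """Return each word that appears twice in succession."""
    return [m.group(1).lower() for m in DOUBLED_WORD.finditer(text)]


def find_naive_split_infinitives(text):
    """Return every stretch of text that matches the crude pattern."""
    return [m.group(0) for m in NAIVE_SPLIT_INFINITIVE.finditer(text)]


if __name__ == "__main__":
    print(find_doubled_words("I saw the the cat."))
    print(find_naive_split_infinitives("She wants to boldly go there."))
    # A false alarm: "Molly" ends in -ly, and nothing here is an infinitive.
    print(find_naive_split_infinitives("I gave it to Molly yesterday."))
```

The last call shows the problem in miniature: string matching of this kind has no idea what is a verb, an adverb, or a proper name, and it will also flag perfectly good sequences like "had had".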

Posted by Geoffrey K. Pullum at 12:56 PM

Noun phrases

Or rather, phrases as nouns. We've recently seen phrases as adjectives, adverbs and verbs; this morning's Cathy drops the nominal shoe:

Posted by Mark Liberman at 06:50 AM

Cold comfort for whomever

Ben Zimmer forwarded to me this question from Jay Livingston, about Ben's post on the episode of The Office about whomever ("It's a made-up word used to trick students", 10/25/2007):

I think you're wrong about whomever.  Yes, whom is disappearing, but I hear whomever all the time.  My secretary, for example, uses it as a stand-alone (I'm not a linguist, and I'm sure there's a technical term for this.  "Who's going to take these?"  "Whomever.")  And between you and I, it's used in a similar way as "between you and I," and probably for the same reason -- it sounds more sophisticated.

How can we find out if whomever is indeed on its last legs or whether its stock is still rising?  Do linguists have some way of counting the number of appearances a term makes in everyday speech?

Yes, we do. Imperfect, but good enough to show that Jay is somewhat right and somewhat wrong.

I searched the LDC Online corpus of conversational English, which I've used on some previous occasions to provide data for Language Log Breakfast Experiments™. It involves 14,137 transcribed telephone conversations, comprising a total of 26,151,602 words. Most of it was recorded in 2003 and 2004, though a small portion was recorded in 1991. The participants span a wide range of ages, regions, occupations and backgrounds.

In that collection, we find the following counts and ratios:

who/whom ratio: about 218 to 1
whoever/whomever ratio: 1,036 to 24, or about 43 to 1

(Note: to get the counts given above, I combined 1,008 instances of "whoever" with 28 of "who ever", and 23 instances of "whomever" with 1 of "whom ever".)

So whomever is not exactly common -- a frequency of about 1 in 1.09 million words. Not exactly "all the time", though of course Jay's acquaintances may be dealing from a different deck. And in these recordings, whom is still about seven times commoner than whomever, in absolute terms.

But in fact, these counts do support Jay's impression that whomever is holding out better, in proportion, than whom is.

How can I say this? Well, whom is 218 times rarer than who, while whomever is only 43 times rarer than whoever. Cold comfort, perhaps, but comfort nevertheless.
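Here is the same arithmetic as a small Python sketch, using only the numbers given in this post. The counts for plain who and whom are not repeated here, so only their reported ratio is used; the variable and function names are invented.

```python
TOTAL_WORDS = 26_151_602            # size of the conversational corpus

whoever_count = 1_008 + 28          # "whoever" plus "who ever"
whomever_count = 23 + 1             # "whomever" plus "whom ever"
WHO_TO_WHOM_RATIO = 218             # reported ratio for the plain forms


def words_per_occurrence(count, total_words=TOTAL_WORDS):
    """How many words of conversation go by, on average, per occurrence."""
    return total_words / count


def times_rarer(common_count, rare_count):
    """How many times rarer the second form is than the first."""
    return common_count / rare_count


if __name__ == "__main__":
    per_million = words_per_occurrence(whomever_count) / 1e6
    ever_ratio = times_rarer(whoever_count, whomever_count)
    print(f"whomever: about 1 in {per_million:.2f} million words")
    print(f"whoever/whomever ratio: about {ever_ratio:.0f} to 1")
    print(f"who/whom ratio is {WHO_TO_WHOM_RATIO / ever_ratio:.1f} times larger")
```

The last line is the sense in which whomever is "holding out better": relative to its who- counterpart, it is several times less rare than whom is.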

What about the contexts in which whomever is used? Are these mainly like "between you and I", i.e. contexts in which the objective case is not historically motivated?

Not really -- 17 of the 24 examples are like these:

yeah okay and i'm sure that you know as he gets to know them more he'll
you know consider their husbands or whomever good friends too

all you need is a letter from maybe your pastor or whomever and a letter f- b-
from you

now what what what do you usually watch like the local news or the or like world news with

There are only 7 examples out of 24 where whomever is not sanctioned by case-marking, at least by my construal:

i i mean i i see i'm not too familiar with like the laws that have been passed on something like that or if jesse jackson's trying to or whomever is is trying to
incorporate some law to congress federal law to congress about you know hiring you know about having your quotas and so forth you know

uh and certain things it's it it well it whoev- whomever is uh uh is holding the the highest positions of controlling

situations involved which would you know kind of allow for for people to think um you know like america's the bad guys or even the way you know they're you know whomever is r- ruling the country can spin it off
um to make to make america look like the bad people [laughter]

you know just uh how could the government or the Pentagon uh whom- whomever's in charge there been so careless

but if these terrorists or whomever knows that it's done randomly

and you know not lose yourself in someone else and whatever the words are it's just perfectly said
so whomever's listening from fisher they should go play that song and that's my ideal man [laughter] whomever she's singing about

right and i was like what happened what why didn't your um your investor analyst or your or whomever say 'here it comes'

It's possible that Jay's experience is quantitatively very different from this, but I'm skeptical. I think that it's more likely that he's the victim of a form of the Frequency Illusion, where the relative frequency of striking or characteristic patterns of usage is overestimated.

[Note: in the LDC Online transcripts, the sequence that I've rendered "Jesse Jackson's" was transcribed as "Jesse's actions". That's an example of what can happen when you hire transcriptionists from New Zealand...]

Posted by Mark Liberman at 06:37 AM

Shoe-leather reporting and Gresham's Law of Headlines

Fev at headsup: the blog muses on current standards of evidence among (certain classes of) journalists ("Journalism, Science, Grammar", 10/23/2007), and redefines a term:

That's why we call it "shoe-leather reporting," kids. When you read a column that appears to result from one possibly apocryphal encounter with an elderly acquaintance, one hypothetical party and one afternoon on Facebook, you take off a shoe and hit yourself on the head until shards of leather form a pile at your desk.

He has a modest suggestion:

Maybe when we write about stuff in the observable world, we should assume that people are going to ask us how we know it. And we could assign different values to different sorts of claims about truth. And people who had more evidence to support what they said would get better play, and people who were just blowing smoke would be consigned to the far outer circles of hell. Or something.

Think it'll work?

Not for columnists -- the international taxi-drivers association would never allow it. And as for the rest of the journalistic enterprise, there seems to be some sort of Gresham's Law of Headlines that penalizes those who operate by Fev's guidelines. Overvalued stories drive undervalued ones out of circulation; or more succinctly, sensationalist headlines drive out honest ones. I'd like to say that science works the way he recommends, but in my experience, I'm afraid that this is true only in the aggregate and on longish time scales.

[I also can't help noting Fev's implication that the "far outer circles of Hell" are worse than the inner ones. This seems exactly right, in fact, since the few exceptionally evil denizens of the innermost circles get (and deserve) lots of coverage, whereas the many less extreme sinners in the outer circles are more numerous but mostly (and justly) ignored. Except perhaps by certain novelists.]

Posted by Mark Liberman at 06:29 AM

October 25, 2007

From cringe to offense

On the American Dialect Society mailing list this morning, Charlie Doyle observed that the aversion to the word moist -- which Mark Liberman has reported on here, here, and here -- has ratcheted up, at least at the University of Georgia:

A student in my Shakespeare class announced that the word "moist" (which I had uttered to describe Egypt in Antony & Cleopatra) is offensive to women. Some of the other women in the class concurred (not hostilely--just as a matter of information for a clueless male professor). I was somewhat flabbergasted, and nobody would articulate a reason for the offensiveness--except for one male student's eventual suggestion that the word reminds women of sexual arousal. That association is not at all beside-the-point of my description of Egypt in the play--but why would such a connotation make the word offensive per se? As far as I could ascertain, "damp" and "wet" don't carry whatever stigma attaches to "moist." What am I missing here?!

What started as a cringe by individual people (mostly women) at the word has now been elevated to a perception (at least by some women) that the word is offensive to women in general.

Posted by Arnold Zwicky at 01:22 PM

The great Montana speech police affair

In the past some of the mucky-muck higherups at Language Log Plaza have ragged on me because I chose to forsake the joys of Washington DC urban life and to retire, instead, out here in the wilds of western Montana, where nothing linguistically interesting seems to happen. I'll admit that things do seem to move slowly here, which is actually kind of refreshing. This year even our forest fire season has been overshadowed by California. But interesting language issues do pop up here once in a while. At least Washington Post columnist George Will thinks so. He finds the speech police alive and well in this state. His view is that the University of Montana has severely restricted a conservative student government candidate who was defeated because of the alleged shenanigans brought about by liberal Democrats not subject to the same expenditure limits.

The University of Montana permits candidates running for student government offices to spend no more than $100 in their campus campaigns. Will calls this "merely another manifestation of the regnant liberalism common on most campuses -- the itch to boss people around." (Note: Recognizing that most of his readers won't know the meaning of 'regnant,' Will defines it for us, at the same time giving himself a nice opportunity to display his marvelous vocabulary.) You can read the rest of the story in the link above. It's not clear to me who is actually right in this case but the defeated conservative candidate took the matter to court, lost, then took it to the 9th Circuit Court of Appeals, and lost again. Now it's in the hands of the U.S. Supreme Court Justices, who will have to decide whether or not to take on the great Montana Speech Police Affair.

See, all you doubters, exciting language stuff does happen out here -- once in a while anyway.

Posted by Roger Shuy at 12:29 PM

Dickens, Browning and Follett

The topic is extended modifiers again -- things like "not-getting-stuck-with-a-groom's-man-shorter-than-you good" or "those gee-my-feet-are-killing-me-since-I'm-in-the-infantry blues". Yesterday I quoted a passage from a 1969 article by John E. Crean, who defended these constructions as a "functionally striking and attractive" option that "in recent years ... has gradually been working its way into English". Crean was reacting to the charges leveled by Wilson Follett in Modern American Usage (pp. 137-138 in the 1966 edition indexed by Google Books):

Under the influence of advertisers, American English has slipped into a construction deeply at odds with the genius of the language and more akin to German, in which compounding is a normal practice. Early examples included easy-to-read books for the children and ready-to-bake food for the whole family. This agglutination of ideas into complex phrases requiring hyphens to make them into adjectives goes against the normal articulation of thought, which is food ready to bake and books easy to read or easy books.

This is amusingly reminiscent of Antoine de Rivarol's 1783 argument that (18th-century) French exactly mirrors the inner language of logical thought. Today, email from Ran Ari-Gur points to some evidence that both Follett and Crean were victims of the Recency Illusion.

Ran sent a link to The Cambridge History of English and American Literature, Vol. XIV, The Victorian Age, Part Two, chapter XV ("Changes in the Language since Shakespeare's Time"), § 4. "Changes in grammar". (The series was published between 1907 and 1921. Volume XIV was written by William Murison.)

The story of English grammar is a story of simplification, of dispensing with grammatical forms. Though a few inflections have survived, yet, compared with Old English, the present-day language has been justly designated one of lost inflections. It is analytic, and not synthetic. This stage had virtually been reached by the beginning of the seventeenth century [...]

Further condensation is seen in the wide use in modern English of the attributive noun instead of a phrase more or less lengthy. The usage began in Middle English, and has been vigorously extended in present-day language. It is regularly employed in all kinds of new phrases, as when we speak of birthday congratulations, Canada balsam, a motor garage. Compound expressions are similarly applied, as loose leaf book manufacturers, The Prevention of Cruelty to Animals Act, a dog-in-the-manger policy.

The attributive noun is not an isolated phenomenon in English. It belongs to the widespread tendency whereby a part of speech jumps its category. The dropping of distinctive endings made many nouns, for example, identical with the corresponding verbs; and, consequently, form presented no obstacle to the use of the one for the other. The interchange was also facilitated by the habit of indicating a word’s function or construction by its position in the sentence. This liberty became licence in the Elizabethan age. [...]

An extreme instance of this freedom appears in sentences transformed, for the nonce, into attributes, as when Dickens writes, “a little man with a puffy ‘Say-nothing-to-me-or-I’ll-contradict-you’ sort of countenance”; or into verbs, as in Browning’s lines,

While, treading down rose and ranunculus,
You “Tommy-make-room-for-your-uncle” us.

Murison argues (plausibly, I think) that the freedom to deploy phrases in a variety of syntactic roles arose naturally as English shed its inflectional morphology. In contrast, Follett saw the lack of morphological marking as a problem, not an opportunity:

... the insult to reason consists in the failure to articulate. The reader must unscramble the ideas for himself.

It's certainly true that long complex nominals in English can be hard to parse. But Follett is not talking about things like "Volume Feeding Management Success Formula Award", much less "a puffy 'Say-nothing-to-me-or-I'll-contradict-you' sort of countenance". He's taking aim at some much more straightforward constructions:

All these locutions can be uttered, but sound does not give them meaning. When in advertisements for a directory we are promised 4,000 hard-to-find biographies, and in a dictionary we are asked to note its concern for hard-to-say words, we have reached a point where agglutination sounds like baby talk.

Yes, he really wants us to believe that phrases like hard-to-say words are hard to understand:

The language has no need of such fallacious compressions. They save no time; they corrupt both style and thought, and they leave the user unable to imagine how his meaning is read.

Those Yoplait ads may corrupt style and thought (Patricia Witkins, who sent me the link, certainly thinks so), but it's not because there's any uncertainty at all about the meaning of the phrases used as adverbs. Follett takes this bizarre line of reasoning to the point of recommending that we shun harmless phrases like accident-prone:

If we wish to protect ourselves from this assault on our wits, we must begin by avoiding every form of easy compounding -- e.g. flight-conscious, career- and action-oriented, accident-prone, ... , and all other lumpings of words in which the relation is not either established by usage or controlled by rule. Parking lights deceives no one into thinking of lights that park, because the phrase is formed on a regular pattern; the same is true of hairdresser, lantern-jawed, housebroken, bowling alley, and secretary bird--in all these we know how the elements affect each other to denote a fact or idea.

This is a remarkably incoherent passage. What, I wonder, is the regular pattern that housebroken is an instance of? Does Follett really mean to recommend coinages like carbroken (= "trained not to defecate in the car") and tentbroken (= "trained not to defecate in the tent"), while forbidding accident-prone?

There's nothing obviously irregular about X-prone = "prone to X"-- the OED gives examples with X=accident, suicide and violence, with citations from 1926 onward; and the general pattern of <Noun Adjective>, where the noun is interpreted as a complement of the adjective, was common in English for hundreds of years before that. The OED's first citation for "penny wise and pound foolish" is from 1598.

The same can be said for X-conscious = "conscious of X". The OED gives examples with X=class, colour, clothes, dress, woman, money, history, and weight, with citations starting in 1903.

And again, it's true that lantern-jawed is formed on a regular pattern (club-footed, ham-handed, jug-eared); but so is action-oriented. The OED gives examples of X-oriented with X = family, disease, person, expansion, goal, with citations starting in 1949.

Did Follett really mean to ban all constructions of the form <Noun PastParticiple> or <Noun Adjective> in which the noun is to be interpreted as a complement of the participle or adjective? I'd like to hear him explain this to Elizabeth Barrett Browning ("With a spirit-laden weight did he lean down..."), and to Thomas Hardy ("Oh epic-famed, god-haunted Central Sea"), and to H.G. Wells ("I became woman-conscious from those days onward"), and to pretty much every other English-language writer since Chaucer.

I think that the explanation for this curious passage is clear. Follett, who was born in 1887, was offended by newly-common uses of -prone and -conscious and -oriented as the second element of compound forms. When he wrote "established by usage", he really meant "familiar and comfortable to me". And when he wrote "controlled by rule", he apparently meant nothing at all -- this was just empty bluster to get the reader to accept his prejudices.

Really, it's amazing that a book like this has been repeatedly published in new editions, without correcting its obvious errors of logic and fact. Anyone who buys it should sue the publisher for fraud. This man was never grammarbroken.

[Note: I realize that phrases used as prenominal modifiers are syntactically different from compound modifiers headed by adjectives or participles, and that both are different from several other species of complex nominals, compound nouns, adjective phrases, and so on. They've gotten conflated here because Follett lumps a heterogeneous collection of them together as "Germanisms".]

Posted by Mark Liberman at 09:06 AM

October 24, 2007

It's a made-up word used to trick students

Geoff Pullum warned us a few years back about "the coming death of whom," and last week's episode of The Office provides ample evidence that whomever is similarly on its last legs.

I'd take this exquisitely constructed scene line by line, but there have already been two fine analyses on other blogs: from Neal Whitman at Literal-Minded and from Ed Cormany at Descriptively Adequate. The linguablogosphere doesn't miss a trick.

Related Language Log posts:

"I really don't care whom" (Apr. 17, 2004)
"Whom humor" (Apr. 18, 2004)
"The coming death of whom: photo evidence" (Sep. 10, 2004)
"Talking about whom you are and who you're seeking" (Nov. 9, 2004)
"Whomever controls language controls politics" (Oct. 22, 2005)
"Class consciousness" (Dec. 2, 2006)
"Dog whistles for linguists" (Dec. 21, 2006)
"Whom?" (Jan. 5, 2007)
"Marxist quotation" (Jan. 8, 2007)
"Whom shall I say [ ___ is calling ]?" (Jan. 23, 2007)
"Relevance of a different kind" (Jan. 28, 2007)
"A note of dignity or austerity" (May 3, 2007)
"ISOC, ESOC" (June 18, 2007)
"It's whom" (Aug. 29, 2007)
"Whom was that masked man?" (Sep. 8, 2007)

[Update: My offhand claim that whomever may be "on its last legs" has generated a wide range of reactions. Jay Livingston feels that whom is moribund but whomever is going strong:

Yes, whom is disappearing, but I hear whomever all the time. My secretary, for example, uses it as a stand-alone (I'm not a linguist, and I'm sure there's a technical term for this. "Who's going to take these?" "Whomever.") And between you and I, it's used in a similar way as "between you and I," and probably for the same reason -- it sounds more sophisticated.

Adrian Morgan, meanwhile, takes precisely the opposite position:

In my assessment, there is no "similarity" between the prevalence of "whom" vs "whomever" in present-day English. "Whom" may be on its last legs, but it's still out there, used by at least a significant minority of speakers, and everyone is aware of its existence. "Whomever" had all its legs amputated so long ago that the scars have healed, and, I suggest, has been completely eradicated from most dialects of standard English (even in the formal register).

I think we need some quantitative studies of spoken corpora over time before we can adequately take stock of whom and whomever. But regardless of actual frequency of use, there's no question that uncertainty surrounding usage of a word like whomever can evoke anxiety and confusion, which is what makes the humor of "The Office" work so well.]

Posted by Benjamin Zimmer at 12:49 PM

"Insult to reason" or "functionally striking and attractive"?

This is the third in a series of posts on phrases as modifiers in English, following up on "Phrasally grateful" (10/18/2007) and "Extended adjectives" (10/23/2007). I don't have time this morning to do justice to all the email that I've gotten on the subject, but I very much enjoyed one scholarly exchange from four decades ago, and thought I would share with you the links and a couple of quotes.

A note from Stalina Villarreal connected me to John E. Crean, Jr., "The Extended Modifier: German or English?", American Speech 44(4): 272-278, 1969:

The extended modifier saves the verbiage of a relative clause and avoids the trailing effect of modifiers that follow the noun by tucking all the punch up front between the article and the noun, as in this example: ein anfangs für jeden Studenten ziemlich schwieriges Problem. Word-for-word the phrase translates as 'an initially for every student rather difficult problem', or, restructured into more idiomatic English syntax, 'a problem that is initially rather difficult for every student.'

The word-for-word translation sounds strange in English. Follett's Modern American Usage censures any such English structures, classifying them indiscriminately as "Germanisms" and rejecting them as "deeply at variance with the genius of the [English] language," an "agglutination of ideas into complex phrases . . . against the normal articulation of thought" and an "insult to reason." For him, "the language has no need of such fallacious compressions," which "corrupt both style and thought."

Anyone at home in both Germanic languages is made doubly uncomfortable by Follett's abrupt dismissal. Not only has the construction been accepted and utilized for decades in one Germanic language, German, but in recent years it has been gradually working its way into another, English. No dosage of grammatical prescription will be sufficient to cure the spread of the American extended modifier, because the reasons for its popularity and growth are much the same as those for its original use in German: economy and impact. Journalists, sponsors, commentators, advertisers, entertainers, gossip columnists, and editors alike are utilizing it to deliver the most message with the fewest words. What once might have been esthetically odd or ugly is now functionally striking and attractive.

And to O.C. Dean, Jr., "The Extended Modifier: German, Not English", American Speech 46(3/4): 223-230, 1971.

Born in the officialese of the German Chancelleries of the sixteenth century, the extended modifier construction was quick to find a permanent place in German expository prose and, hence, in the list of constructions that must be mastered by foreigners who want to read German. In a recent article in these pages, John E. Crean, Jr., suggested that English has a similar construction in the increasingly popular use of such ad hoc adjectival phrases as those in "the five pounds thinner girdle" and "a get-tough-with-students guy." There are, however, some important differences between the English and German constructions that lead one to question whether the two are comparable on more than a superficial level. First, the German extended modifier is normally limited to the written German of scholars, bureaucrats, novelists, and newsmen and would be too formal for spoken German or advertising copy. The English construction illustrated above, on the other hand, is prevalent in advertising and in informal writing and conversation, but it is strictly avoided in scholarly writing and would quickly receive the red pencil treatment from teachers of English composition. Second, with the exception of a few expressions imported as calques from English, such as mach es selber 'do-it-yourself (kit)' the type of ad hoc adjectival phrase noted by Crean is almost nonexistent in German. Last, and most important, the German extended modifier and its putative English counterpart are basically different constructions and result from different transformational rules.

"Different transformational rules"? Ah, those were the days, my friend...

In a more modern idiom, Jesse Tseng directed my attention to L. Sadler and D.J. Arnold, "Prenominal adjectives and the phrasal/lexical distinction", Journal of Linguistics 30(1): 187-226, 1994, saying that "My copy is somewhere in one of 40 boxes of books and papers that I haven't unpacked yet". It's not available on line, at least not to me, and I'm not about to interrupt my breakfast blogging hour for a trip to the library. Remember, online articles have more impact! But with or without the citational support, Jesse agrees with O.C. Dean, Jr. that English and German are different:

I don't think that the same term should be used for the English and German constructions. They both involve "heavy premodifiers", but otherwise they are completely different syntactically and semantically (and prosodically, and stylistically...)

I think for English we are dealing with a generalization of the "Army strong" phenomenon that came up a few months ago on LL (but I don't know of an existing term for that, either).

And we mustn't forget that the examples that started all this off were phrases used as adverbs ("ice cream cone on the way home grateful", or "not getting stuck with a groom's man shorter than you good"), not as adjectives. This is an area where English seems to be forging ahead of its Germanic sister.

And right on cue, this morning's Tank McNamara joins the chorus:

More from the mailbag on this later.

Posted by Mark Liberman at 07:12 AM

October 23, 2007

Voilà!  Ear spellings

Those of us at Eggcorn Central get a lot of mail about things that are dubious as eggcorns: simple misspellings, word confusions, morphological reshapings, "demi-eggcorns", etc.  Often it's hard to know quite what to say about particular examples -- "In 1776, America through off its monarchy", from a posting to soc.motss on 9/26/06; you might not know what to say about this one, until you learn that the poster is a notoriously unsteady speller -- but every so often a pretty clear not-an-eggcorn-or-anything-close-to-it case comes along.  I offer you an assortment of spellings for voilà in English: walla, wallah, wala, wa-la, wella, wha-la, vwala, etc.

These spellings have been noted several times in the eggcorn database; Brians's Common Errors mentions "vwala" (with distaste) in its entry for viola/voila; and two correspondents have written me at moderate length on the subject (one last August and one today).  Here at EC, we are dubious that anything more than "ear spelling" is going on here; I don't see any evidence of reanalysis at any level, or of any semantic content introduced in the spellings.  As Pat Schwieterman wrote in the Eggcorn Forum on 10/25/05:

Personally, I think it's just a phonetic spelling rather than an eggcorn. There is a word "wallah" in English, but it's hard to see that people who use "walla(h)" in place of "voila" are thinking of the usual meaning of "wallah."

I'd add that the existing English word wallah (borrowed from Anglo-Indian) has main stress on the first syllable, voilà on the second (with, in English pronunciations of the word, a first syllable either unstressed or bearing a secondary stress).  (In fact, the hyphenated spellings might be an attempt to suggest primary stress on the second syllable, as in ta-da!)

Today's correspondent, Earl Davis, supplied some hits for the "wh" variant (and suggested that this spelling might be an attempt to represent the complex onset /vw/ in French):

"Wha-La! Class Schedule!" [blog title]
Finally done and ready for your perusement...MOSAIC CLASS SCHEDULE Winter 2007...  (link)

"Unfortuneately this gave the younger gneration the liscence to start feathering the sides of their hair, which eventually lead to chopping the sides off--leaving only the long hair in the back. And Wha-la: you have the classic mullet."  (link)

"Splenda is actually just the brand name for sucralose, a sugar derivative, which is made through a patented multi-step process that converts natural sugar cane to a no-calorie, non-carbohydrate sweetener that your body doesn't recognize as sugar or carbohydrate -- so it doesn't get metabolized. Wha-la! It's calorie-free!"  (link)

My August correspondent, Paul Wolman, suggested that the "wella" spelling (which seems not to be nearly as common as some of the others, though the existence of the Wella company makes it hard to tell) might convey some connection to the English adverb well -- something like "Well, there!" -- but I'm inclined to think the "e" is a spelling for a neutral vowel in an unstressed first syllable.  (Still, there might be a few people who've made a connection to well.  People are ingenious.)

A reflection on why ear spellings should be so likely for this word.  If you've heard the word, you probably know how to use it in sentences, but if you haven't seen it in print (or don't remember having seen it in print, or didn't realize that the spelling "voilà" represented this particular word), you're in trouble.  People tell you to look up words if you don't know their spellings, but where do you look in this case?  If you don't know French, or don't recognize the French origin of the word, what would possess you to look under VOI in a dictionary, especially if your pronunciation of the word begins with /w/ (I think this is the most common current pronunciation, at least for people who aren't "putting on", or at least approximating, French)?  So you spell it "the way it sounds".

Posted by Arnold Zwicky at 03:16 PM

Extended Adjectives

A few days ago, I commented on the frequent use of phrases as modifiers in informal English ("Phrasally grateful", 10/18/2007), and asked readers for terminological suggestions. What I was looking for was a way to search the scholarly literature for existing research on the subject, not a catchy name to use going forward. But I wasn't very clear about this, and so many people sent me suggested coinages.

I've only gotten one email with information about an existing term, and that one is in (and about) a foreign language, so it looks like a neologism competition is in order. However, before we get to that, I'll share with you what Seth Knox told me about German.

Here's an example from Der Spiegel, taken from the first sentence of the article "Schlappe für Premier Kaczynski - Bürgerplattform vorn", 10/21/2007:

„Die am späten Abend nach Schließung der Wahllokale im polnischen Nachrichtensender "TVN 24" veröffentlichten Prognosen...“


("The forecast that was announced on the Polish news station "TVN 24" late in the evening after the closing of the polling stations...")

In German I am most familiar with these constructions being called erweiterte Adjektive (“extended adjectives”); in English I’ve heard them most commonly called “extended adjectival constructions,” but I’ve also heard such terms as “extended modifiers” and “pre-noun inserts”.

Also, while poking around on Google to try to find some alternate names for extended adjectival constructions, I came across the blog Experimental Linguistics, run by Yale grad students, wherein "W1ll13 30% Hacker" posted on 12/04/2004 about "Extended Adjective/Adverb Constructions", and referred to the same Yoplait advertisement mentioned by Patricia Witkin in your blog post. Go figure.

As an aside, I’ve discussed the teaching of extended adjectival constructions with other German instructors, and they all (myself included) teach it as a written alternative to relative clauses that would sound utterly ridiculous in English. Now I know of some English equivalents that I can present to my students.

W1ll13 30% Hacker observes that blues lyrics are a good source of examples, citing "I've got the 'gone completely crazy 'cuz my woman done left me' blues". I tried the search pattern "got those" and turned up a bunch of things like Louis Jordan's "I got those 'gee my feet are killin' me, since I'm in the infantry' blues"; or Dolly Parton's "I got those can't stop crying, dishes flying, PMS blues". (A nice phonetic discovery: in that last song, Ms. Parton makes an internal rhyme of almighty and somebody in the line "I got those God almighty, slap somebody PMS blues".)

It looks like Experimental Linguistics (the blog) is moribund, alas -- no posts since 4/30/2006.

In this connection, it's worth revisiting Mark Twain's discussion of "The Awful German Language" (see here), which leans heavily on extended adjectives to achieve the effect that he described as follows:

"You observe how far that verb is from the reader's base of operations; well, in a German newspaper they put their verb away over on the next page; and I have heard that sometimes after stringing along the exciting preliminaries and parentheses for a column or two, they get in a hurry and have to go to press without getting to the verb at all. Of course, then, the reader is left in a very exhausted and ignorant state."

A quick scan of Google Scholar did not turn up any uses of "extended adjective(s)" to describe English, so I think the field is still open for new coinages.

[Update -- James Crossley writes:

This reminds me of my youthful (re-)reading of the Guinness Book of World Records. Listed then as the longest song title in history was, if I remember aright, something called "I'm a Cranky Old Yank in a Clanky Old Tank on the Streets of Yokohama with My Honolulu Mama Doin' Those Beat-O, Beat-O, Flat on My Seat-O, Hirohito Blues."

Relevant? Useful? I can't see how, but I had to get it off my chest. I've been carrying that song title around in my head for more than thirty years.

And after a bit more research, James reports that the lyric in question was written by none other than Hoagy Carmichael.]

Posted by Mark Liberman at 12:55 PM

October 22, 2007

Slogan gap?

It's seven to two, in favor of the Republicans.

After the discussion of an obscure British politician's slogan, I wondered about the current crop of American presidential slogans. I've been following the campaign fairly closely, but somehow none of the current candidates' catchphrases have registered with me. I couldn't bring a single one to mind.

So I checked.

The Republicans:

John Cox: [apparently no slogan]
Rudy Giuliani: "Strong Leadership. Proven Results."
Mike Huckabee: "Faith. Family. Freedom."
Duncan Hunter: [apparently no slogan]
Alan Keyes: [apparently no slogan]
John McCain: "Courageous Service  Experienced Leadership  Bold Solutions"
Ron Paul: "Hope for America"
Mitt Romney: "True Strength for America's Future"
Tom Tancredo: "For A Secure America!"
Fred Thompson: "Security. Unity. Prosperity."

The Democrats:

Joe Biden: [apparently no slogan]
Hillary Clinton: [apparently no slogan]
Chris Dodd: [apparently no slogan]
John Edwards: [apparently no slogan]
Mike Gravel: [apparently no slogan]
Dennis Kucinich: "Strength through Peace"
Barack Obama: [apparently no slogan]
Bill Richardson: "Change and Experience"

So 7 of 10 Republican candidates -- and all the serious ones -- have slogans; but only 2 of 8 Democratic candidates -- and none of the front-runners -- do.

Among the 9 slogans in both parties, there is not a single verb (leaving out the quasi-adjectival participle forms proven and experienced).

Three slogans include periods ("Strong Leadership. Proven Results." "Faith. Family. Freedom." "Security. Unity. Prosperity.")  One has an exclamation point ("For A Secure America!"). Five have no punctuation ("Courageous Service  Experienced Leadership  Bold Solutions" "Hope for America" "True Strength for America's Future"  "Strength through Peace"  "Change and Experience").

Bill Richardson has an underlined word ("Change and Experience"), but there are no italics.
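
For anyone who wants to check the tallies, here is a minimal Python sketch of the count. The slogan strings are copied from the two lists above; the names `with_slogan` and `punctuation_class` are just illustrative helpers, and the choice to ignore the apostrophe in "America's" when classifying punctuation is my assumption about what counts as slogan punctuation.

```python
from collections import Counter

# Slogans as listed above; None stands for "apparently no slogan".
republicans = {
    "John Cox": None,
    "Rudy Giuliani": "Strong Leadership. Proven Results.",
    "Mike Huckabee": "Faith. Family. Freedom.",
    "Duncan Hunter": None,
    "Alan Keyes": None,
    "John McCain": "Courageous Service  Experienced Leadership  Bold Solutions",
    "Ron Paul": "Hope for America",
    "Mitt Romney": "True Strength for America's Future",
    "Tom Tancredo": "For A Secure America!",
    "Fred Thompson": "Security. Unity. Prosperity.",
}
democrats = {
    "Joe Biden": None,
    "Hillary Clinton": None,
    "Chris Dodd": None,
    "John Edwards": None,
    "Mike Gravel": None,
    "Dennis Kucinich": "Strength through Peace",
    "Barack Obama": None,
    "Bill Richardson": "Change and Experience",
}


def with_slogan(party):
    """Return the slogans of the candidates who have one."""
    return [s for s in party.values() if s is not None]


def punctuation_class(slogan):
    """Classify a slogan by period or exclamation point (apostrophes ignored)."""
    if "." in slogan:
        return "period"
    if "!" in slogan:
        return "exclamation"
    return "none"


all_slogans = with_slogan(republicans) + with_slogan(democrats)
tally = Counter(punctuation_class(s) for s in all_slogans)

print(len(with_slogan(republicans)), "of", len(republicans), "Republicans")
print(len(with_slogan(democrats)), "of", len(democrats), "Democrats")
print(dict(tally))
```

The three punctuation classes should add up to the total number of slogans, which is a quick sanity check on any hand count.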

This is a pretty feeble collection (of slogans, I mean). Back in the day, we used to have verbs in our presidential campaign slogans: "Give 'Em Hell, Harry!"; "He kept us out of war"; "I like Ike"; "In your heart, you know he's right"; "Turn the Rascals Out"; "Win with Willkie".

Wikipedia has a list. A striking slogan that I didn't know: "Hoo but Hoover?" It carried the day in 1928.

[Update -- Lane Greene submitted a quotation from Dave Barry Slept Here:

"EXTRA CREDIT: Try to think up a campaign slogan even more inane than 'I Like Ike'. Hint: This is not possible."

I hate to disagree with Dave Barry, whose linguistic judgments are generally impeccable, but I always thought that "I like Ike" had a certain poetic concision.]

[Update #2 -- Charles Neveu writes:

Regarding the Slogan Gap, Kinky Friedman has a number of slogans, so maybe they will bring the average up.

When he was running for Governor of Texas, one of his slogans was "How Hard Can It Be?"
Another, courtesy of Molly Ivins was "Kinky Friedman: Why the hell not?"
Another, having to do with the idiosyncrasies of Texas election law: "Save Yourself For Kinky"
Suggested by a fan: "He never broke his word to the Indians"

I'd give citations, but you can google Kinky Friedman Slogans as well as I can.

They reminded me of the Birthday Party's Nobody For President campaign, whose 2008 slogan is "Nobody Speaks For Me".

I remember some of their other slogans:

Because Nobody Should Have That Much Power
Nobody Knows the Trouble I've Seen
Nobody Loves You When You're Down and Out ...

And remember, Nobody's a bigger fan of LanguageLog than me.]


Posted by Mark Liberman at 06:07 PM

Ambiguous focus of the day

Chris Huhne is running for leader of the Liberal Democratic party in the U.K., and this is his slogan:

Adrian Bailey, who sent in a note about it, thinks that

Surely it should be "The people in charge." However, I'm sure most people will understand it, given that the political concept is so cliched.

Cliched? It seems to be completely incomprehensible.  Adrian thinks that it means "The people [should be] in charge (as opposed to the politicians or the investment bankers)". But Guy Fawkes' blog suggests that "Presumably he wants 'people in charge' as opposed to Alien Lizards?"

I hesitate to offer an opinion about British politics, but my initial reaction was that he meant "People in charge, as opposed to wandering around talking".

Other interpretations from Guy Fawkes' commenters:

There's not much you can fit onto a banner these days. "A fairer society would be one with people like me in charge" is the full message.

A verbless society. Nouns in charge.


A feral society. People on charges.

Posted by Mark Liberman at 05:49 PM

Adding insult to injury: the power of "a"

Sunday's New York Daily News sports section reveals a bizarre case of lawyers making mincemeat of conversational implicature. Victor Washington, who played for a number of different National Football League teams in the 1970s, suffered a number of different on-the-field injuries — including to his back, his elbow, and his right kneecap. The NFL, however, has managed to avoid paying Washington a higher rate of monthly benefits under its retirement and disability plan thanks to a particularly uncooperative reading of the phrase "a football injury":

In 1986, an arbitrator — using the kind of warped logic that would make Kafka green with envy — decided that Washington should get $750 a month for non-football related injuries, instead of the $4,000 he would receive if his problems were football-related. The plan's language, the arbitrator said, specified the higher payment for "a football injury." Since Washington suffered multiple injuries, he was out of luck.
"Who would ever know the letter 'A' had so much power?" Washington asks, half laughing, half moaning.

Washington has been stuck in litigation ever since. He agreed to a lump-sum payment in 1998 without realizing that the arbitrator's perverse construal of "a football injury" had been tossed out by a federal court considering the case of another retired NFL player, Donald Brumm. Washington sued to overturn the settlement, on the grounds that the NFL did not inform him of the Brumm ruling. A federal judge decided in Washington's favor in 2005, but after the NFL appealed, the Ninth U.S. Circuit Court of Appeals ruled last month that Washington is stuck with his current settlement. And all because of a one-letter indefinite article?

Until the early 1990s, the NFL's disability plan stated that a retired player would be eligible for "Level 1" benefits if he has been "totally and permanently disabled" due to "a football injury incurred while an Active Player." He would only receive "Level 2" benefits if his "total and permanent disability results from other than a football injury." In 1987, when seven players sought Level 1 benefits, their claims were submitted to an arbitrator, Sam Kagel. Kagel concluded that a player could only get Level 1 benefits if he was disabled from "one identifiable football injury." Donald Brumm was subsequently denied his Level 1 benefits under the standards set by the Kagel arbitration. A district court then upheld the retirement board's decision because Brumm's disability was not the result of one single injury but rather the cumulative effect of several injuries.

In 1993, however, the Eighth U.S. Circuit Court of Appeals heard Brumm v. Bert Bell NFL Retirement Plan and ruled that Brumm should get his Level 1 benefits after determining that the retirement board had acted "arbitrarily and capriciously." The decision read:

We conclude that the interpretation applied in Brumm's case, if not flatly contrary to the language of the Plan, represents at the least a startling construction. To require that disability result from a single, identifiable football injury when the relevant Plan language speaks of "a football injury incurred while an Active player" is to place undue and inappropriate emphasis on the word "a". "Injury" can mean either an "act or a result involving an impairment or destruction of . . . health" Webster's Third New International Dictionary (1986). Therefore, the key phrase from Section 5.1, "a football injury", could refer to either a single injury (act) or a cumulative one (result). The apparent dichotomy set up by Section 5.1 — between "results from a football injury" and "results from other than a football injury" — is consistent with the latter meaning. In sum, we believe that the Board's proposed construction of the relevant language impermissibly crossed the line between interpretation and amendment.

There seem to be two separate questions of implicature here. First, does the singular indefinite article a in "a football injury" imply that a player can only be injured once to receive the higher level of benefits? That's an uncooperative reading for anyone not trained as a lawyer, since despite its singularity a(n) is not typically considered to have the logical force of "one and only one." Second, even if a(n) is read in the limited, lawyerly way, does the word it modifies, injury, need to refer to a single harming act? The appeals court consulted a dictionary to decide that "a football injury" could in fact be the cumulative result of a number of individual acts on the playing field.

Victor Washington's benefits claim had also been held hostage to the perverse reading of "a football injury." According to last month's appeals court ruling, the arbitrator Kagel had awarded Washington the lower level of benefits in 1987 because his medical experts had not identified "'a injury' that resulted in his having to leave football." That's not a typo: Kagel really referred to "a injury" and not "an injury." That, presumably, was intended to drive home the uncooperative "one and only one" reading, by treating the article a legalistically rather than conversationally. The legal emphasis on the word a would apparently be lost if it underwent the regular addition of the epenthetic consonant /n/ to create an before a word beginning with a vowel like injury.

As I understand it, Washington didn't lose out this time around because of the pesky article a but because of yet another legal technicality. The appeals court ruled that knowledge of the 1993 Brumm decision could only have helped Washington out under the NFL's old disability plan. Since Brumm, the NFL has created new benefits categories for retired players: Football Degenerative and Inactive. Washington's old Level 2 benefits (for a disability resulting from "other than a football injury") translated into Inactive benefits under the new system, for disability arising from "other than League football activities." Washington wanted his benefits to be reclassified as Football Degenerative, for disability from "League football activities." Since the Brumm case only pertains to the old categories, the NFL's failure to disclose that ruling doesn't apply to Washington's consideration under the new categories. I still don't see how Washington's on-the-field injuries could be seen as anything other than "League football activities," but since the troublesome phrase "a football injury" is no longer in the picture, the appeals court felt that Brumm was irrelevant to Washington's current benefits status.

Still, it was a willfully obtuse interpretation of the article a that set Washington down this road in the first place. Sometimes, to quote Mr. Bumble in Oliver Twist, the law is a ass.

[Update, 10/24/07: Rob Pérez writes in:

I read your posting on "a football injury" with interest. In patent law, the opposite is generally true. Where "a widget" is recited in a patent claim, a device having at least one widget would be infringing (assuming all other recited elements were present). If we want to mean one and only one, we would normally say something like "exactly one".

Also, Michael Covarrubias at Wishydig takes exception to my description of the /n/ in an as "epenthetic."]

Posted by Benjamin Zimmer at 11:04 AM


Isabella Bannerman frames the problem:

Posted by Mark Liberman at 08:20 AM

We eventually wound up walking into this complete other study

After trying for a couple of weeks to ignore the news stories about that workplace-swearing study, I've given up -- the flood of "check it out" emails from readers finally led me to look into the research behind the media coverage. This confirmed, once again, that things that seem "too good to check" are probably exactly that.

In this case, you can check it out yourself by reading Yehuda Baruch and Stuart Jenkins, "Swearing at work and permissive leadership culture: When anti-social becomes social and incivility is acceptable", Leadership & Organization Development Journal 28(6): 492-507, 11/6/2007. The authors review (some of) the literature on swearing, and propose "An emergent model for the consequences of swearing at workplace":

This is not a model in the statistical sense, which might be tested in relation to counts or measurements of various sorts of relevant behaviors or characteristics of individuals and groups. Rather, it's a "boxology" that expresses graphically a number of qualitative hypotheses. In particular, the authors adopt an established distinction between two types of swearing, "social swearing" and "annoyance swearing", and hypothesize that social swearing has positive effects on stress release and social cohesion (and therefore on individual and group well being), whereas annoyance swearing has negative effects.

This could all be true, though it's not obvious (for example) that annoyance swearing has negative effects on stress release. And someone might suggest that it's also important to distinguish aggressive swearing, which expresses hostility towards other people, and also to distinguish whether the object of hostility is within the group or outside it. Having had those thoughts, you could change the boxology to symbolize them.

But whatever the boxology, how do the authors of this study test it? They decided that measuring and counting things would be too hard:

While the model presented above fits well with a positivist approach of listing and testing an explicit set of hypotheses, the nature of the subject would not allow for using a conventional quantitative method (e.g. questionnaires). Collecting first-hand data on swearing is a challenging task, and needs to follow guidelines for conducting research on sensitive topics. Firms tend to be reluctant to admit that swearing characterizes part of their operation. A direct, "front door" approach is troublesome for a number of reasons:

* The study will potentially be intrusive, time-consuming, and disruptive to the individuals or organizations involved. The firm may expect some consideration in return for their investment of staff time.
* Obtaining accurate data requires a long time frame. Participants may initially be shy about their conversations being recorded, and will tend to modify their behaviour in order to create a good impression.
* Research of this type is an ethical minefield. The firm will be concerned about its reputation and image, whereas the employees will be concerned about control of the data in order to maintain good relationships with their colleagues and employer.

A more relevant and feasible way to collect sensitive data is action research, where the researcher is engaged in the organization as a participant. This helps to overcome the above mentioned issues, and enables discussion of real life in organizations. The data collection for this study was conducted while the second author was employed in a temporary position in a mail-order warehouse.

In other words, Jenkins had a part-time job, and made some observations about cussing in that context.

That's not quite all they did: they also gathered a set of "cases, vignettes, and examples" in "six focus group discussions (four in a southern USA state, two in England)".

Full time and part time employees, mostly students in class discussions in groups of 10-20, were asked to reflect on both positive and negative use and "application" of swearing in the workplace. They were working in a variety of sectors, and as expected, the responses reflected certain variance amongst sectors and occupational groups.

It's hard to be sure, but this is consistent with the view that the second author attended a university in the U.S., and discussed cussing in some recitation sections where he was a teaching assistant (or perhaps a student); and then repeated the experience at UEA in Britain.

Now, I've got no problems with ethnography; and discussions with students are a great source of ideas and anecdotes. But let's be clear: this "study" consisted of one person's observations about cussing at a part-time job he once had, along with notes from six classroom discussions with students about their own experiences with cussing on the job.

The ironic thing is that pretty much all the reporters writing stories about this research, and most of the people reading those stories, have had at least this wide a range of experience of cussing in the workplace, and probably almost as much experience of swapping stories about cussing among different groups.

Thus they have roughly as much empirical basis as Baruch and Jenkins for evaluating hypotheses about how workplace swearing works and what its positive and negative impacts might be. But because this is a "study" presenting the results of "research", the journalists and their readers are all solemnly considering it as testimony of a qualitatively different character from their own life experience.

Something strikes me as really funny about this. In particular, it reminds me of one of my favorite pieces of science-news satire, "New Study Finds College Binge Drinking To Be A Blast", The Onion, 3/24/1999:

AMHERST, MA—Researchers at the University of Massachusetts released a surprising new study Monday indicating that, contrary to long-held beliefs about its destructive effects, collegiate binge drinking is a fucking blast.

"Data collected at bars and fraternity parties on the UMass campus has yielded unexpected conclusions with regard to the practice of binge drinking," study head Dr. Albert Greaves said. "Over the course of our research, a consistent pattern emerged demonstrating that binge drinking seriously kicks ass."

What I like best about this article is the blending of rhetorical styles -- for example:

According to Greaves, much of the UMass team's research was conducted at a party at this one guy Matt's place. "My colleagues and I were doing beer bongs, keg-stands, Jell-O shots, Jager shots—you name it," Greaves said. "We were totally binge drinking and just having a great fucking time. The best part was the crowd—the study was packed, and there was this amazing random sampling of hot chicks. I was so drunk, I couldn't figure out what the source of the unusually large hot-chick sample was, but by that point, I really didn't care."

When the keg was tapped, Greaves and his team went looking for a place to gather more data. "We heard there was this awesome study on Church Street, but we didn't have the address, so we just went wandering around," Greaves said. "We eventually wound up walking into this complete other study where we didn't know anyone. Unfortunately, it turned out to be totally lame—most of the people there were in the non-drinking control group. We had fun for a little while busting on them, but pretty soon we split."

Another example:

"Dr. Schmid is what we scientists term a fucking booze monster," team member Dr. James Podriewski said. "This one time, we needed a whole bunch of Wild Turkey and tonic water for a study that was just getting going at midnight, so we sent him out to this store that's open until 2 a.m., and we're waiting for, like, hours until he finally comes back, and he doesn't have any of the stuff, but he's carrying this big fucking railroad-crossing sign, and he's all like, 'Guys, check out the sign I found.' It was funny as shit. I swear, I was laughing so hard, I almost left a urine sample all over my pants."

The use of this is especially nice.

For more on cussing from a cross-cultural and comic-strip perspective, check out this Language Log post and the links at the bottom of it, especially this one.

Posted by Mark Liberman at 07:03 AM

October 21, 2007

The speech patterns of Dingburg

The town of Dingburg, Maryland, home of the Pinheads, has (of course) its own peculiar variety of English, heavily influenced by popular culture.  Scholars have long wondered about its source, speculating that it might be some variant of Foreign Accent Syndrome, caused by widespread brain trauma.  But now the truth is out:

(By the way, if you search for "Dingburg" on Google, you're asked if you meant "Edinburgh".  The Scottish Connection!)

Some investigators take a dim view of Dingburg talk:

(Thanks to Laura Whitton, who sent me this e-card.)

Meanwhile, the Pinheads of Dingburg are not only afflicted by the consequences of that misguided government experiment, but are also suffering from an addiction to memes:

Posted by Arnold Zwicky at 11:20 AM

Turn-taking etiquettes

In this morning's Doonesbury, Joanie refers to call waiting as "the last-shall-be-first" function:

I have the same opinion about call waiting, and in fact about cell phones in general, and for that matter about telephones of any kind. It's always seemed weird to me that people who would never consider barging into the middle of someone else's life in other ways, at least not without serious motivation and elaborate apologies, think nothing of making a randomly disruptive phone call. And people who would be surprised and offended by a random interruption in the flesh are completely unperturbed by random phone calls.

Let me add immediately that I use the phone a lot, and don't mind getting calls. For that matter, I generally like having people visit me in real life, even randomly and unannounced. I generally draw the line at call waiting, just because I hate being on the other side of "wait a minute, I've got another call" activity; and I do sometimes ignore incoming phone calls more than some people consider polite.

But the etiquette of phone calls and real-life interventions is very different. And email is another thing entirely; I've recently had to apologize to several correspondents whose emails I had apparently ignored, because I tend to run my email as what computer scientists call a "last in, first out (LIFO) queue", which is the same as what Joanie calls "last shall be first". This is not because I prefer or choose this algorithm, but because it's what email programs tend to promote, just as the call-waiting function does.

Since I get more email than I can comfortably answer, the result is that once a message gets buried under new arrivals, it may never resurface. I regret this, but (so far) not enough to do anything about it. In the case of call waiting, I can choose not to use it; in the case of email order, I'd have to take some positive action, for instance to assign priorities and decide how to act on them.
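For the computationally inclined, here is a minimal sketch of why a "last in, first out" inbox buries old mail. It is a toy, not a description of any real email program: the function name answer_lifo, the message labels, and the figures of three arrivals and two answers per day are all made up for illustration.

```python
def answer_lifo(arrivals_per_day, capacity_per_day):
    """Answer mail newest-first, with a limited number of replies per day.

    arrivals_per_day: one list of message labels per day (hypothetical data).
    Returns (answered, still_buried), each in the order it happened.
    """
    stack = []       # a plain list used as a LIFO stack
    answered = []
    for todays_mail in arrivals_per_day:
        stack.extend(todays_mail)            # new arrivals land on top
        for _ in range(min(capacity_per_day, len(stack))):
            answered.append(stack.pop())     # the newest message goes first
    return answered, stack


# Three messages arrive each day, but only two get answered.
days = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]
answered, buried = answer_lifo(days, capacity_per_day=2)
print("answered:", answered)
print("buried:  ", buried)
```

Under these assumptions the oldest message of each day is never reached, because something newer always sits on top of it. Assigning priorities, as suggested above, would amount to replacing the stack with some other ordering.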

[Update -- Nicholas Waller writes:

In my very first proper summer job after school in 1975 (ie long before mobiles) I was assistant to someone in an electronics company, and I remember accompanying him to talk to the head of the Quality Control department. On arriving at his office, which had a glass door, we could see QC Chap was in a meeting. So my boss said "I can't interrupt him in person; we'll go back to my office and phone him". Which we did, successfully as I recall.

Later, as a publishers' rep in bookstores, I noted how often staff would break off from dealing with a customer who had made the effort to come in to the shop to take a phone call from someone sitting in an armchair at home.

But, oddly, as a customer myself I get edgy if a staff-member ignores an insistent ringing phone to continue dealing with me or another customer. It's stressful, and generally makes the shop sound not well run.

Imagine the difference if shop phones could not ring but could only allow a distant customer to say "excuse me, could you help me?", but walk-in customers carried a large bell that they could ring loudly at will to summon assistance, and keep on ringing until it arrived.

Personally, I have never been that keen on phones, either for receiving or making calls. They do seem a bit rude and intrusive.

The question of cell phone interruption etiquette recently made the news, of course, in the case of Rudy Giuliani's habit of taking phone calls from his wife Judith during speeches and fund-raising meetings. Among the many comments on this, I didn't see any that pointed out how complex and strange the etiquette of phone interruption has now become.]

Posted by Mark Liberman at 10:20 AM

October 20, 2007

More fun with VPE

From an Andy Gill article on D.A. Pennebaker's rockumentary Don't Look Back (on Bob Dylan), in The Independent of 4/27/07 (the Joan in question is Joan Baez):

[quote from Pennebaker:] "I guess I tried to make that film as true to my vision of him as I could make it. But as a storyteller, I wanted there to be stories in it."

Pennebaker was aided in this regard by Bob Neuwirth, a singer and painter who served as Dylan's tour manager. Neuwirth had proved himself Dylan's equal in droll acerbity - he's the one who jokes, "Joan's wearing one of those see-through blouses you don't even want to!" - and he clearly saw part of his job as providing entertaining moments for the camera.

Yes, one of those see-through blouses you don't even want to, with a Verb Phrase Ellipsis (VPE) on the edge.  It takes some work to interpret -- you're likely to think of it as some kind of play on words -- but you can do it, and figuring it out contributes to the humor (a good thing, since the content of the quip is an unpleasant put-down).  The other ingredient in this example is the "anaphoric island" phenomenon, which we seem not to have discussed before on Language Log.

(Hat Tip to Empty Pockets.)

I've talked about playful uses of VPE twice in the last year: John McWhorter's

One could write a whole paper on it (and, as it happens, one is ___!).

and Eric Bakovic's

All completely unnecessary, if you ask me (though, of course, nobody did ___ or is ___).

The first of these postings has an introduction to VPE, which I won't repeat here; what's important here is that in this construction a VP is omitted when it can be supplied from the immediate linguistic context.  (There's a huge literature on the details of VPE.)  In the Neuwirth example, there's a relative clause in which the complement of infinitival to is omitted:

you don't even want to ___

What is this complement?  Something like the VP

see through ___

which itself has a gap in it -- the gap of the relativized NP.

Now, VPE is possible when the ellipted material has a gap of relativization in it, as in this example (from a posting to the OutIL mailing list on 5/17/04):

Doesn't DOMA say the Feds don't have to pay attention to any state marriage laws they don't want to ___?

The omitted complement VP here is

pay attention to ___

with a gap in it, just as in the Neuwirth sentence. 

A bit of a digression now: what fills the gap in pay attention to ___? If you take a relative-clause gap to be filled by its head, then this relative clause is to be interpreted as (that) they don't want to pay attention to any state marriage laws.  But that's not right.  The object of pay attention to should be understood as something like any state marriage laws they don't want to pay attention to ___.  Whoops.  There's a gap in THAT, and we're in an infinite regress. 

What we have here is an instance of "antecedent-contained deletion", usually illustrated by somewhat simpler examples like John read every book that Mary did ___.  The phenomenon has been studied since 1970; there's even a Wikipedia page.  Finding a satisfactory analysis involves abandoning simple antecedent-substitution analyses for the gaps in VPE and relativization, and/or re-considering the analysis of quantification in NPs (as in any state marriage laws and every book).

These are gripping issues for syntacticians and semanticists.  What's important for us here, though, is the question of whether gap-containing VPE is somewhat harder to process than gapless VPE, like the one in

I would like the Feds to pay attention to all state marriage laws, but they don't want to ___.

where the omitted complement VP is the gapless pay attention to all state marriage laws.  My impression is that the relative-clause gap contributes some processing difficulty, but I don't know if there's research bearing on the question.
But in any case, the DOMA sentence doesn't present the kind of processing puzzle that the Neuwirth sentence does.  What's the difference?  The nature of the ANTECEDENT for the omitted VP. 

The DOMA sentence has a VP antecedent, pay attention to ___, in the linguistic context, just as VPE requires.  But the Neuwirth sentence has instead the adjectival see-through, modifying the noun blouse.  Admittedly, see-through is derived morphologically from the verb see (plus an accompanying preposition, through), but it is not a VP, or even a V.  There is no VP see through ___ in the linguistic context to satisfy the requirements of VPE.  To understand the sentence, you have to get the interpretation 'see through' by "going inside" the lexical item see-through.

Another topic from roughly forty years ago, when it was first suggested that lexical items are "islands" for anaphora, that parts of lexical items or referents merely evoked by lexical items cannot serve as antecedents for anaphoric elements (of several different kinds).  Here are typical violations of the Anaphoric Island Constraint (AIC):

I'm a pianist, but I don't own one.  '... don't own a piano'

Flautists can easily take them on planes.  '... can easily take (their) flutes on planes'

I speak Norwegian, but I've never been there.  '... never been to Norway'

There's a huge literature on the AIC.  Early on, it was observed that more morphologically transparent lexical items are less problematic than more opaque ones.  Compare the examples above with:

I'm a piano-player, but I don't own one.  '... don't own a piano'

Flute-players can easily take them on planes.  '... can easily take (their) flutes on planes'

I speak Hawaiian, but I've never been there.  '... never been to Hawaii'

And many examples improve considerably in context.  That is, various factors contribute to easing the task of finding antecedents within islands.  Eventually, some linguists began to argue that the AIC was not a syntactic phenomenon at all, but a pragmatic one, having to do with ease of antecedent retrieval (as related to contextual cues and morphological transparency, in particular); for a summary, see Gregory Ward's 1997 "The battle over anaphoric 'islands': syntax vs. pragmatics" (in Directions in Functional Linguistics, ed. by Akio Kamio).  Some examples present no problem, others are extremely hard to interpret, even in context, and many lie in between.

It seems to me that the anaphoric-island violation in the Neuwirth sentence is in this middle territory.  It takes some work to figure out the speaker's intention, but the puzzle isn't insoluble.  The gap within the omitted VP might contribute to the listener's work, though most of the work is, I think, a consequence of the fact that the antecedent for VPE is not a VP in the linguistic context, but is instead evoked by the lexical item see-through.

So you can play with VPE for humorous effect.  You can also stretch VPE without playful intent, as some of the awkward examples in my previous postings on VPE illustrate.  My current favorite of the latter type is from the final episode of the TV series Charmed, in the following (presumably scripted) exchange:

A: I just want Christy back.
B: You might be able to.

The omitted VP in B's response would seem to be get Christy back, which is not actually in A's statement, but is conveyed by it, if it's understood as meaning 'I just want to get Christy back'.  If I hadn't been a VPE junky and had just been listening like a normal person, I might not even have noticed the off-flavor of B's response.

Posted by Arnold Zwicky at 03:58 PM

To what after shampooing?

A recent xkcd offered a new twist on an old theme:

Some might consider that the mouseover title "Hit Turing right in the test-ees" is just a bit off, given this biographical detail (as presented in Alan Turing's Wikipedia bio):

In 1952, Turing was convicted of "acts of gross indecency" after admitting to a sexual relationship with a man in Manchester. He was placed on probation and required to undergo estrogen therapy to achieve temporary chemical castration. Turing died after eating an apple laced with cyanide in 1954. His death was ruled a suicide.

In a different variety of extended Turing Test, we've put a lot of effort over the past few years into trying to determine the nature of the "translators" who create the English-language labels and documentation for Chinese products and services. Bad computer programs? Incompetent human beings armed with large dictionaries and strange theories about how to choose among lexicographical alternatives? A team of infiltrators from The Onion?

Here's the most recent clue, sent in by Victor Mair (who apologizes for failing to provide his usual analysis of the corresponding Chinese):

The sentence fragment at the end, following "cool off to wash the shape of water after 5 minutes", is puzzling. It suggests that there's more going on here than just injudicious lexical choice.

Posted by Mark Liberman at 11:13 AM

Puzzle of the day: The constitution in B flat?

When David P. Currie read the U.S. Constitution out loud, did he perform it in the key of B flat, using melodies from a strange scale made up of minor thirds and tritones? Probably not, but read on.

Yesterday, I was at four interesting gatherings. There were memorials for Henry Hiz and Bob Lucid, where I learned interesting things about each of them, and even more interesting things about the people who gathered to talk about them. There was a meeting at the library where Laura Brown, author of the Ithaka report "University Publishing in a Digital Age", spoke about the future of scholarly communications.

And Greg Kochanski was here at Penn to give a talk on the topic "Maintaining information contours in the brain", which presented data from Bettina Braun, Greg Kochanski, Esther Gabe and Burt Rosner, "Evidence for attractors in English intonation" (preprint here).

The basic idea of Greg's talk was that people are able to imitate gradient values of pitch contours with reasonable accuracy, but when they copy their own productions recursively, their performances drift gradually towards patterns that you could think of as "attractors" in a sort of iterated map of imitation -- or perhaps as emergent psychological categories.

In conversation with him later in the afternoon, I mentioned a Language Log post from about a year ago, "Poem in the key of what" (10/9/2006), where I evaluated the idea that small-integer pitch ratios (that is, musical intervals) might play a role in the intonational contours of normal speech. I was skeptical of that idea, and still am, but I found it surprisingly hard to disprove it purely on the basis of comparing histograms of fundamental-frequency values in speaking as opposed to singing. We expect to see multiple modes in the histogram of singing pitches, corresponding to the different pitch classes in the abstract musical scale implicit in the performance. But talking is not singing, and so it's surprising that (at least on a phrase-by-phrase basis) we see similar modes in the histogram of speaking pitches.

That same post shows some data from an old paper, Mark Liberman and Janet Pierrehumbert, "Intonational invariance under changes in pitch range and length", pp. 157-233 in M. Aronoff and R. Oehrle, Eds., Language Sound Structure, MIT Press, 1984. Although that paper was basically about how the various parts of a pitch contour scale relative to one another as pitch range changes, the research in question actually began as an attempt to debunk an earlier claim of characteristic musical intervals in speech.

The idea was that if there really are favored pitch classes or pitch intervals in speech, then if you put speakers in a situation that encourages them to show you a wide range of pitch ranges, the resulting distributions of pitch values and pitch relationships should show some clumping around the favored values and/or intervals. Such probability enhancement in certain regions of the distribution might provide evidence about the structure of the underlying process, just as peaks in the graph of interaction cross-section vs. energy do in particle physics.

We didn't find any (multiple) peaks to explain -- which was exactly what I expected at the time -- but we did find some nifty patterns of other sorts. And since positive results are much more fun than negative ones, we didn't even discuss the failure to find what we were originally looking for, focusing instead on the scaling issues.

Therefore, I was somewhat surprised by the multiple modes in the F0 histograms that I made last year. However, I persuaded myself that they mostly arose because of some not-especially-musical characteristics of the fairly short (sentence-sized) pitch contours that I got data from. In the longest passage I tested -- a Sylvia Plath poetry reading about three minutes long -- the histogram was considerably smoother. There were still some lumps, but, I told myself, it's poetry, after all.

I followed up a few days later ("More on pitch and time intervals in speech", 10/15/2006), with a post that looked at plots of dipole statistics as a way to examine pitch intervals in speech. That produced some pretty pictures and a certain amount of puzzlement. And what with one thing and another, I never really got back to it.

But it seems possible that there's a connection between the "attractors" that Braun, Kochanski et al. found, and those puzzling modes and ridges. So I promised Greg that I'd try looking at a longer sample, staying away from poetry. And so late last night, after I got back from the memorial for Bob Lucid, I downloaded David Currie's reading of the U.S. Constitution, available here from the University of Chicago Law School. It's about 50.14 minutes of audio in total. The pitch tracker that I used found 196,128 f0 values in voiced frames, out of 300,845 total analysis frames (at the conventional 100 frames per second), which corresponds to about 32 minutes and 41 seconds of voiced speech.
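The arithmetic behind those durations is easy to check. The sketch below uses only the two frame counts and the 100-frames-per-second rate quoted above; the helper name frames_to_min_sec is my own, and nothing here reproduces the pitch tracker itself.

```python
FRAMES_PER_SECOND = 100  # the "conventional" analysis rate quoted above


def frames_to_min_sec(n_frames, fps=FRAMES_PER_SECOND):
    """Convert a count of analysis frames to (whole minutes, leftover seconds)."""
    total_seconds = n_frames / fps
    minutes, seconds = divmod(total_seconds, 60)
    return int(minutes), seconds


voiced = frames_to_min_sec(196_128)   # frames with an f0 value
total = frames_to_min_sec(300_845)    # all analysis frames
voiced_fraction = 196_128 / 300_845

print("voiced: %d min %.2f s" % voiced)
print("total:  %d min %.2f s" % total)
print("voiced fraction: %.3f" % voiced_fraction)
```

So roughly 65% of the analysis frames were voiced, which is where the 32 minutes and 41 seconds come from.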

After converting the f0 values in Hz to semitones (relative to A 110), and dividing them into quarter-tone bins, here's the result:

As you can see, there are clearly at least two modes about a tritone apart (and I think there's pretty clearly a third component about half-way in between them). Three-quarters of a diminished chord? or some strange mode including those pitch-classes? Or just an artefact of the structure of the document and the speaker's style?
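For readers who want to try the same kind of histogram, here is a minimal sketch of the conversion and binning described above: F0 values in Hz are mapped to semitones relative to A 110 and counted in quarter-tone (half-semitone) bins. The function names and the sample values are invented for illustration; this is not the code used for the plot, and the pitch tracking itself is assumed to have been done elsewhere.

```python
import math
from collections import Counter

A_110_HZ = 110.0   # reference pitch: "A 110"
BIN_WIDTH = 0.5    # a quarter tone is half a semitone


def hz_to_semitones(f0_hz, ref_hz=A_110_HZ):
    """Distance from ref_hz in semitones (12 per octave, on a log scale)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def quarter_tone_histogram(f0_values_hz):
    """Count voiced-frame F0 values in quarter-tone bins.

    Keys are the lower edge of each bin, in semitones relative to A 110.
    """
    counts = Counter()
    for f0 in f0_values_hz:
        semitones = hz_to_semitones(f0)
        counts[math.floor(semitones / BIN_WIDTH) * BIN_WIDTH] += 1
    return dict(sorted(counts.items()))


# Hypothetical F0 samples (Hz), standing in for a pitch tracker's output.
sample_f0 = [110.0, 112.0, 115.0, 155.6, 156.0, 220.0]
histogram = quarter_tone_histogram(sample_f0)
```

On this scale a tritone is 6 semitones, so two modes "about a tritone apart" would show up as peaks roughly twelve quarter-tone bins from each other.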

The lower peak corresponds to a pitch of B flat, which is where I got the jocular title of this post. But I can think of a dozen obvious questions to ask about what's really going on here, and three or four plausible sources of artefact, so let's hold off before concluding that Ken Pike was right about English having four phonemic pitch levels. Or adding the speculation that these might correspond to the four pitch-classes of a scale dividing the octave into minor thirds, as I did in a moment of fatigue-induced unscientific weakness while writing my dissertation, many years ago.

Meanwhile, I've got to give a shout-out to computer technology and the internet. This little experiment took me a total of 20 minutes of elapsed time to perform. About half that time was waiting for the 70-MB mp3 file to download over a marginal wireless connection, and rebooting the elderly laptop on which I converted it from mp3 to wav, when the size of the resulting array caused some simultaneously-running applications to freeze up. For someone like me, who started looking at pitch tracks by tracing overtones on narrow-band (paper) spectrograms 40 years ago, that's amazing.

Posted by Mark Liberman at 07:55 AM

October 18, 2007

You say potato, I say bologna

In the October 22 New Yorker, Michael Schulman reports on his conversations with Majella Hurley, an English dialect coach who is coaching Claire Danes as Eliza Doolittle in a revival of Pygmalion ("You say potato"). Halfway through the piece, Schulman turns his attention to the speech patterns of the current crop of American presidential candidates, and brings into the conversation another dialect coach, "Hurley’s colleague Beth McGuire, who also attended the matinée—she is the dialect coach for another of the Roundabout’s fall shows, 'The Overwhelming,' a drama, set in Rwanda, that involves seven different dialects".

McGuire, who had listened to some of the previous night’s Republican debate, had noticed a striking disparity—aside from the one in economic policy—between Giuliani and his most formidable Republican opponent. “I’m used to listening to Giuliani. He was my mayor,” she said. “So here I am listening to Giuliani going dadadadada”—she made a machine-gun sound—“and then here’s Mitt Romney with this whole other pattern: dah . . . dah . . . dah . . . But Giuliani is who Giuliani is, and how we speak is who we are. Giuliani is a dadadadada guy.”

I wonder what in the world McGuire might have meant by this.

Given that she's getting paid for coaching seven separate Rwandan dialects, I'm sure that she meant *something*. And my general rule in such cases is to assume that the expert had something sensible in mind, and to blame the journalist for screwing it up. But no matter who's responsible for this claimed contrast between dadadadada and dah . . . dah . . . dah . . ., a reasonable reader, I think, would take it to mean that Rudy Giuliani speaks a lot faster than Mitt Romney does, and in particular, that this difference could be heard in the Oct. 9 debate in Dearborn, MI (the debate that McGuire must have been referring to). However, as I reported a couple of days ago ("The tale of the tape: Those fast-talking southerners", 10/17/2007), Rudy Giuliani's average speech rate was 207 words per minute in that debate, while Mitt Romney's was 221 words per minute.

Is there some other quality of their delivery that deserves description as "dadadadada" vs. "dah . . . dah . . . dah . . ."? Here's a comparison of clips from each man's first answer in the Oct. 9 debate, so you can decide for yourself:

Frankly, I wonder whether McGuire (or Schulman) might have gotten Mitt Romney mixed up with one of the other candidates.

But maybe not, because Schulman quotes McGuire saying something just about equally puzzling about the speech patterns of Hillary Clinton.

[McGuire] suggested that Senator Hillary Clinton work on her melodiousness. “Any woman in a position of authority tends to lower her pitch,” she said. “But Hillary doesn’t vary her range a lot. She ends all her sentences on a down glide, which can make her sound masculine and hard.”

OK, maybe Hillary should work on her melodiousness, that's a matter of opinion. But is McGuire really suggesting that a candidate for president of the United States of America should systematically engage in uptalk? That would do wonders for her image, I'm sure.

In fact, Senator Clinton has a reasonable proportion of appropriately deployed non-terminal rises and levels, as in this passage from the Sept. 26 Democratic debate in Hanover NH:

In the transcript below, I've marked the three final-rising phrases in red, and the three final mid-level phrases in purple:

Well, what I have said is that
I will do everything I can to prevent Iran from becoming a nuclear power, including
the use of diplomacy,
the use of economic sanctions,
opening up direct talks.
We haven't even tried.
That's what is so discouraging about this. So then you have the Republican candidates on the other side
jumping to the kind of statements that you just read to us.
We need a concerted, comprehensive strategy to deal with Iran.
We haven't had it.
We need it.
And I will provide it.

I didn't have to look hard for this passage. I haven't counted, but I'd guess that Senator Clinton uses phrase-final rises and levels at least as often as any of the other candidates, and probably more than most. (You'll notice that the clips from Romney and Giuliani, above, involve no final rises or mid-levels -- though I'm sure that they sometimes use such contours.) It also sounds to me as if Clinton varies her pitch range, proportionally, at least as much as the other candidates -- and again, this is something that can easily be quantified.

I suppose that vision researchers get just as discouraged by what journalists write about colors, but let me tell you, it's hard out there for a phonetician.

“This may be a naïve thing to say, but I would hope that people actually listen to content,” said Hurley, voicing a sentiment that is not bloody likely.

Me, I would hope that dialect coaches and New Yorker writers, when commenting on the way people talk, would actually listen to their speech and describe it somewhat accurately, rather than peddling inaccurate stereotypes and generic advice. Perhaps they might even count and measure things, when they want to make authoritative-sounding generalizations and recommendations. But that sentiment is even less bloody likely, apparently.

[Update, 7:00 Friday morning: I thought I should take some of my own medicine, and do a small quantitative comparison of pitch ranges as this morning's Breakfast Experiment™.

For the half-minute audio clips that you can listen to above, here is a plot comparing Hillary Clinton and Rudy Giuliani in terms of the percentiles of their fundamental-frequency measurements. ("Fundamental frequency", or F0 -- commonly pronounced "eff zero" -- is the objective measurement of local periodicity in speech that corresponds to the subjective dimension of "pitch".)

And since the comparison should probably be made in proportional terms, here's the same plot expressed in semitones relative to middle C:

And for those of you who prefer numbers to graphics, here are the tables of data, with ratios and differences (calculated before rounding). In hertz (i.e. cycles per second):

  Hillary Rudy ratio

(In other words, Senator Clinton's median F0 was 180 Hz, while Mr. Giuliani's median F0 was 117 Hz; and similarly for the other percentiles.)

Alternatively, in semitones relative to middle C:

  Hillary Rudy difference

(In other words, Senator Clinton's median F0 was 6.5 semitones below middle C, or between F and F sharp; Mr. Giuliani's median F0 was about 7.4 semitones lower; and so on for their other percentiles.)
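As a rough check on the arithmetic, here is a minimal sketch of how such percentile comparisons can be computed. The helper names are illustrative, the value used for middle C (261.63 Hz, equal temperament with A = 440 Hz) is an assumption, and only the two rounded medians quoted above are used as data.

```python
import math

MIDDLE_C_HZ = 261.63   # assumed: C4 in equal temperament, A4 = 440 Hz


def semitones_re_middle_c(f0_hz):
    """F0 in semitones relative to middle C (negative means below it)."""
    return 12.0 * math.log2(f0_hz / MIDDLE_C_HZ)


def percentile(values, p):
    """p-th percentile (0 to 100), interpolating linearly between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    position = (len(ordered) - 1) * p / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


# The two medians quoted in the text, in Hz (already rounded).
clinton_median_hz = 180.0
giuliani_median_hz = 117.0

ratio = clinton_median_hz / giuliani_median_hz
difference_st = (semitones_re_middle_c(clinton_median_hz)
                 - semitones_re_middle_c(giuliani_median_hz))
```

With the rounded medians, the difference comes out near 7.5 semitones, in line with the figure of about 7.4 quoted above, which was calculated before rounding. A ratio in hertz and a difference in semitones are two views of the same proportional comparison.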

Does this support the view that Senator Clinton "doesn't vary her range a lot", at least compared to Mr. Giuliani? I'd say no. On the contrary, in fact.

Perhaps McGuire meant something different, say that Hillary Clinton's pitch range, though reasonably large in proportional terms, is too consistent from phrase to phrase relative to other politicians. But if you listen to the clips above, I think you'll find reason to doubt this idea as well. A quantitative test will have to wait for another morning, since my breakfast hour is over and it's time to get to work.

The most parsimonious hypothesis about all this, I think, is that McGuire and Schulman's evaluations of the candidates' speech were simply re-packaged stereotypes about groups (New Yorkers talk fast) and individuals (Hillary Clinton is masculine and hard), without any empirical basis whatsoever in the facts of any actual talk at any actual time.

I freely admit that this theory is an expression of my own stereotypes of what certain groups have to say about speech and language. This is what political commentary is like, all too often, and what popular discussions of speech and language are like, nearly always. At least, so it seems to me. I would have hoped for better from the New Yorker, but I'm afraid that in this case, the stereotypes (about journalists' quotations from experts, not about Hillary Clinton and people from New York) turned out to be true.]

Posted by Mark Liberman at 10:50 PM

Go go go!

A couple days ago Mark Liberman followed up on this question from Thomas Mills Hinkle:

Just now, my wife asked the following "Would you mind go checking on the laundry?" The (to me) error "go checking" made me think: what is the deal with go X constructions in English?

I have a speculation about this example, which relates it to the "go V" construction but also treats the verb mind as crucial in the story.

But first a pointer to an essential piece of literature on the subject, our own Geoff Pullum's 1990 article "Constraints on intransitive serial-verb constructions in modern colloquial English" (Ohio State University WPL 39.218-39).

Geoff's article does four things.

One thing it does is distinguish the construction -- which I'll refer to here as QSV (for "quasi-serial verb": I'll go see who's at the door) -- from a number of other constructions that can have the verb GO in them and share some syntactic or semantic properties with QSV, in particular the following (the names are mine, not Geoff's):

Hendiadys, with and: I'll go and see who's at the door;

GoPurp, a purposive construction with a marked infinitive: I'll go to see who's at the door;

AdvIng, "adverbial -ing":  I'll go fishing with you tomorrow.

(There are more constructions with GO in them, but these are the most similar to QSV.)

Another thing Geoff's article does is survey the peculiarities of QSV.  Most speakers allow some motion verbs other than GO as the first verb in QSV: COME and maybe RUN or HURRY.  And for most speakers, QSV is subject to the two conditions in the passage below (from a 2003 abstract of mine that adds some observations to Geoff's):

... (1) the Inflection Condition, which requires that both verbs be in a form identical to the base form (either the base form itself, or the present non-3rd-sg form, as in I go get some wine whenever I can, vs. *She goes get(s) some wine...); (2) the Intervention Condition, which disallows a dependent of  the first verb between the two verbs (*Go out get some wine); (3) a strong preference for face-to-face conversation (so that searching the standard corpora nets very few examples); and (4) a newly discovered statistical asymmetry (observed by searching a database of film scripts), involving a very strong preference (when QSV examples are compared to all relevant occurrences of go and come) for base forms over unmarked presents and, within the base forms, for the imperative over all other uses (with modals, with infinitival to, and in all other contexts).

(Hendiadys, GoPurp, and AdvIng are subject to neither the Inflection Condition nor the Intervention Condition.)

The Hinkle example violates the Inflection Condition: go looks like a base form, but checking certainly is not one; it's a "gerund participle" form ("form N", as I'll call it here), with suffix -ing.  The Hinkle example also violates a condition on complements of the verb mind, which have to be nominal (as in Would you mind a helpful suggestion?); VP complements are possible, but they are in fact nominal gerunds, in form N: Would you mind my suggesting an alternative?  Would you mind seeing who's at the door?  Fixing the complement of mind in the Hinkle example (where it's in the base form) only makes things worse: *Would you mind going checking on the laundry? violates the Inflection Condition twice.  (*Would you mind going check on the laundry? still violates it once, and *Would you mind go check on the laundry? satisfies the Inflection Condition but violates the condition on complements of mind.)
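The counting in the paragraph above can be made concrete with a toy checker. This is only a sketch of the Inflection Condition as stated here, with hand-assigned inflectional categories; the category labels and the function name are illustrative assumptions, and the separate condition on complements of mind is not modeled.

```python
# Inflectional categories whose shape is identical to the base form,
# following the Inflection Condition as quoted above.
BASE_LIKE = {"base", "pres_non3sg"}


def inflection_violations(first_category, second_category):
    """Number of verbs in a go + V sequence that are not in a base-like form."""
    return sum(cat not in BASE_LIKE for cat in (first_category, second_category))


# Examples from the text, annotated by hand (the annotations are assumptions
# made for illustration, not the output of any parser).
examples = {
    "go check": ("base", "base"),
    "go get (I go get some wine)": ("pres_non3sg", "pres_non3sg"),
    "goes gets": ("pres_3sg", "pres_3sg"),
    "go checking": ("base", "ing"),
    "going checking": ("ing", "ing"),
    "going check": ("ing", "base"),
}

for phrase, (first, second) in examples.items():
    print(f"{phrase}: {inflection_violations(first, second)} violation(s)")
```

The checker works on categories rather than spellings because a form like come is spelled the same as a base form when it is a past participle, so spelling alone would not decide whether the condition is met.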

So the Hinkle example looks like a blend -- of QSV (Would you go check on the laundry?) and mind with a form-N complement (Would you mind checking on the laundry?), yielding a complement of the form: go + N-form VP (with semantics distinct from the AdvIng construction).  It could have occurred as an inadvertent speech error.  But, as Mark pointed out in his posting, it looks like some people have combined these features into an actual construction, not (so far as I know) previously reported.  (This sort of thing can happen.  The GoToGo construction -- I'm going home and take a nap -- that I've posted about here several times presumably originated in a telescoping or blending, but it's now just part of some speakers' linguistic systems.)  The question is what the details of the construction are like.  I'll save that discussion for an appendix to this posting, and return now to Geoff's article.

A third thing that Geoff does in this article is go through the previous history of linguists' treatments of QSV and the conditions on it, going back to the 1960s.  The sad lesson here is that the phenomenon was discovered again and again, and people wrote about it almost entirely without citing any of the earlier literature.

Finally, Geoff reports on some judgments he and I collected (way back when) about QSV, judgments indicating that there is a core pattern, with some variation.  But the full envelope of variation appears to include almost every logically possible system.  Mark has now searched for some of the minority variants and found several of them attested.

This general configuration is of considerable interest: people base their systems on what they hear, and in the case of QSV, they are mostly -- but not entirely -- reluctant to go past the evidence they have, in which the Inflection Condition holds.  What  they do here is UNLIKE what people do in many other situations: ok, you haven't heard a passive with a particular verb, but you press on and produce it, and no one bats an eye.  You go beyond the evidence you have.

But for some constructions, people are conservative.  They (mostly) don't generalize to new cases.  You hear I go help them whenever I can, but you (well, most of you) don't go on to say She goes help(s) them whenever she can.  This is an instance of Baker's Paradox, which I discussed here about a year ago in connection with a constraint (for many speakers) on independent possessives that allows Let's meet at Sandy's but not Let's meet at mine (with mine not anaphoric).

Characteristically, in cases where most speakers resist generalization, having apparently learned an arbitrary constraint on some construction (like the Inflection Condition), there are some speakers who strike out anyway.  If you look hard enough, you can find a few people who don't adhere to the constraint; you find things like i called Melanie and went meet her in Moreauville (in Mark's posting), and more.  Sometimes the constraint applies in some dialects but not others; things like (non-anaphoric) Let's meet at mine are widely attested in British English.

A few words about the history of QSV.  Most people reflecting on the matter assume that QSV (go V) is just the "short form" of the "full form" Hendiadys (go and V); as Mark reported, this was Hinkle's own idea.  In general, when faced with a longer and a shorter variant with similar meaning, many people, linguists included, assume that the shorter variant is derived, historically and maybe synchronically as well, by truncation of the longer form.  This is often not true historically, and is very rarely, if ever, a satisfactory synchronic analysis.  In any case, Mark casts doubt on Hendiadys as the historical source of QSV (both have been in the language for a very long time).  The fact that Hendiadys is subject to neither the Inflection Condition nor the Intervention Condition casts further doubt on it as the source of QSV.

In my alternative story (in my QSV abstract),

... (sentence-initial) hortatory go and come with imperatives (Go, get some wine! Come, see how it's grown!) were reanalyzed as forming prosodic, syntactic, and semantic units with them; the resulting construction was then extended from the imperative to other uses of the base form, and then to homophonous finite forms (thus yielding the Inflection Condition). 

This is just a suggestion, but it does offer a possible account of the statistical asymmetries in modern QSV, which reflect the historical trajectory of the construction.

Appendix.  We started with examples of mind with a complement composed of go and an N-form VP, as in:

mind + [ go + [ checking on the laundry ] ]

You can also find examples with try:

If you guys like what you read, try go getting his books

Try go telling a fanatic Muslim something about Jews.

You should try go asking some Koreans.

and probably with some other head verbs as well (I've just started looking at such data; this is a very preliminary report).  What's striking about the examples Mark and I have collected so far is that the head verbs are all in their base form: in imperatives (try go getting his books) or complements of modals (would you mind go getting your sister, should try go asking some Koreans).  Could this be a version of the Inflection Condition, applying here to the head verb?  If so we could hope to find some present non-3rd-sg examples, that is, with the verbs not in the base form but in a finite form identical to the base form -- something like

Whenever I need to know more about hangul, I try go asking some Koreans.

To complicate things still further, there are cases of go plus N-form VP "on their own" (not as a complement to a main verb like mind or try), cases that are not examples of AdvIng:

Don't go getting my hopes up.  [cf. Elton John's 1976 hit "Don't Go Breaking My Heart"]

How does one go getting it back?

Only a few examples so far, but they have base-form go.  Again, we could hope for some present non-3rd-sg examples, pointing to the Inflection Condition.  (It's possible that the two examples above illustrate two different constructions.)

To sum up:  there are apparently two constructions here, call them (arbitrarily) A and B. 

Construction A involves two constituents: the verb GO and an N-form VP.  On the evidence so far, the verb is constrained to its base form (or possibly it's subject to the Inflection Condition).

Construction B involves two constituents: a head V and a base-form complement in Construction A.  On the evidence so far, the head verb is constrained to be in its base form (or possibly it's subject to the Inflection Condition).  Also, on the evidence so far, the verbs that can head Construction B are verbs (like mind and try) that can occur with complements that are plain N-form VPs (I don't mind checking on the laundry, I tried asking some Koreans).  But which such verbs?

Lots of work still to be done here.

One final note: in searching for constructions with GO plus N-form VP, I turned up yet another one I hadn't noticed before, illustrated by

There you go getting obtuse again.

There you go, asking more questions.

This one obviously isn't restricted to the base form.  In fact, it isn't subject to the Inflection Condition:

There he goes getting obtuse again.

It seems not to be relevant to the other constructions we've been looking at.

Posted by Arnold Zwicky at 03:08 PM

Neanderthals may have had headline writing gene

Much of the blame for the public's poor understanding of science must go to a little studied but culturally pivotal genre: news report headlines. Short snappy headlines provide the lazy reader with just enough information to totally misconstrue a story.

Take the evidence reported in today's NYT that Neanderthals may have possessed a FoxP2 gene like that of modern humans. It's a striking result, if it holds up, though it might not. But, as is clear from the article, presence of a modern human FoxP2 gene doesn't establish one way or another whether Neanderthals had modern human speech abilities. The thing is, the nature of the FoxP2 gene establishes neither sufficient nor necessary conditions for speech: see the Language Log coverage of FoxP2 in bats and chimps.

The NYT article, reporting on a complex set of controversial hypotheses and results, doesn't do too bad a job. The headline, however, "Neanderthals May Have Had Gene for Speech", is potentially misleading. Even Nature's coverage, titled "Modern speech gene found in Neanderthals" would tend to confuse those who didn't already know something about genetics.

The very fact that the headline writers use the bare singulars "gene for speech" and "speech gene" suggests, through omission, that this is not just one of many genes linked to speech, but possibly *the* speech gene. But the more important fact (as the LL pieces make abundantly clear) is that FoxP2 doesn't deserve to be called a speech gene at all. At best, it seems to be a vocal articulation gene, although it may well have other as yet unidentified roles. And vocal articulation, even very precisely managed vocal articulation, is not speech.

To speak you need to have language. Otherwise all your twitterings, squawks, grunts and growls are just twitterings, squawks, grunts and growls. FoxP2 has nothing to do with language per se. For example, it is presumably totally irrelevant to speakers of human sign languages. At best, FoxP2 may have been an important factor in the evolution of language... but I feel safe in stating that there must have been many other such factors, both social and genetic.

I understand that "Neanderthals may have had vocal articulation gene" is not such a sexy headline. But please, please can we all stop labeling FoxP2 a/the "speech gene"?

Posted by David Beaver at 03:06 PM

You're not the boss of me

Geoff Pullum recently noted an odd turn of phrase used by British prime minister Gordon Brown in acknowledging his decision not to call an early election: "Anything that happens in Downing Street is the direct responsibility of me." Geoff writes:

What makes "of me" so unusual is that there is a monosyllabic genitive form of the first person singular pronoun, namely my, so normally people will say "my responsibility" or "my spouse" rather than "the responsibility of me" or "the spouse of me". Perhaps it was a planning screw-up: he first embarked on "Anything that happens in Downing Street is the direct responsibility of the prime minister" and then decided on a mid-course correction from "the prime minister" — the third-person reference to himself might have sounded pompous — and changed the last noun phrase to "me". What he ended up with sounded strangely inept.

Inept or infantile? Several helpful readers pointed out the popularity of the phrase "the boss of me," as in the song by They Might Be Giants, "Boss of Me," used as the theme song for the FOX TV show "Malcolm in the Middle" from 2000 to 2006. "You're not the boss of me now, and you're not so big," goes the chorus, seemingly from the perspective of a petulant child addressing a parent or older sibling. (Songwriter John Flansburgh has said it's about his older brother.) Though the phrase "you're not the boss of me" may owe some of its current popularity to the TMBG song, this bit of rebellious kid-speak has been kicking around since the late 19th century.

When I checked up on this a few years ago for the American Dialect Society mailing list, I was able to trace "you're not the boss of me" back to 1953 using then-available digitized newspaper databases. Now, thanks to the wonders of Google Book Search, it's easy to take it back another 70 years:

His sister was going to put her arms around him, but he whirled, and facing her with a very angry face, snapped — "Let me alone; you are not the boss of me now, I tell you, and I'm going to do as I please."
—"As by Fire," The Church, New Series Vol. III, 1883, p. 70

Though this is from a long-ago children's story, the context of a boy bristling at the control of his older sister is strikingly familiar. ("I've been babied till I'm tired of it," complains the brother. "I can take care of myself now, and I don't need any of your bossin'.") The next example I see on Google Book Search is from 1949, in Millie Tool's novel Resurrection Road. The book is unfortunately only in snippet view, but here too the expression appears to be used in the context of inter-sibling squabbling (between brother Astor and sister Star):

"You're too cheeky," said Astor, sticking out his tongue. "You're not the boss of me."
—Millie Tool, Resurrection Road, 1949, p. 63.

The 1953 quote I had found earlier is notable in that it appears in an article about childhood behavior, suggesting that experts in the field already considered "you're not the boss of me" to be a well-known reaction of recalcitrant children:

Put off that visit to grandma, or hers to you, till the peak of this "Try and make me — you're not the boss of me" stage is past.
—"Child Behavior: Better to Ward Off That Crisis. The Gesell Institute." Washington Post, Jul 1, 1953, p. 30

From the 1960s onwards, "you're not the boss of me" seems to have an increasing pop-cultural presence, as in the comic strip I've reproduced in the top right ("The Ryatts," Appleton [Wisc.] Post Crescent, May 4, 1966). But it didn't really take off as a catchphrase until the late '90s. Predating They Might Be Giants by two years, a band called The Meat Joy released the song "Free Kitten" in 1998, with the chorus, "You're not the boss of me." And in a 1999 interview with Barbara Walters, Monica Lewinsky explained that she's always been stubborn: "From the time I was 2 years old, one of my first phrases was, with my hands on my hips, 'You're not the boss of me!'"

Though "you're not..." is the most common frame for "(the) boss of me," it has also shown up in other formulations. For instance, the Galveston Daily News of Sep. 30, 1910 quotes Senator Joseph W. Bailey as saying, "I am not the boss of any man, but no man is boss of me." There the rhetorical use of "boss of me" is easily understood as contrastive, highlighting a chiastic reversal. (I'm reminded of the Qur'anic invocation, lam yalid wa lam yulad, "He begets not, nor is He begotten.")

A more childlike example comes from "Tiger" by the Canadian poet Isabel Ecclestone Mackay (1875-1928), in a posthumous collection of verse published in 1930 (via Literature Online):

There is a TIGER in our hall—
He lies so flat and still
He never seems to move at all,
But, some time, p'r'aps he will!
Some day, when I am grown up tall,
I'll step on him!—you'll see,
I'll teach that Tiger in our hall
He's not the boss of me!

Finally, there's the formulation "I'm the boss of me," attested since the 1950s at least. In these two citations it is discussed as an emblematic assertion of a child's independence:

As a child once reported, "I am the boss of me." Obviously this is the culmination of the development of the ego trait of autonomy.
— Joseph Salomon, A Synthesis of Human Behavior, 1954, p. 42.

"I'm the boss of me." Wherever they pick it up, youngsters from tots to teens make the statement importantly and cling to it. Upon its earliest utterance, watchful, loving non-permissive parents will reply that that's the way it should be — as long as the child is a good boss of him. If he is not, then someone else has to take over.
—"Why Behave?" Lima (Ohio) News, Nov. 8, 1967, p. 35

More recently, both "you're not the boss of me" and "I'm the boss of me" were featured in the 1997 film Boogie Nights. Mark Wahlberg plays the porn star Dirk Diggler, who gives himself a memorable pep talk in the mirror, imagining that he's telling off his svengali Jack Horner (played by Burt Reynolds):

You're not the boss of me, Jack. You're not the king of Dirk. I'm the boss of me. I'm the king of me. I'm Dirk Diggler. I'm the star.

In all of the above cases, "the boss of me" provides much greater emphasis (albeit in a puerile manner) than the unmarked possessive construction, "my boss." Perhaps Gordon Brown was looking for a similar way to emphasize his personal role by referring to "the direct responsibility of me." But it probably does no good for a prime minister to sound like a five-year-old having a tantrum.

(For more discussion of recent "boss of me" usage, from Dirk Diggler to Monica Lewinsky to "Malcolm in the Middle," see the two posts by linguist Neal Whitman guest-blogging in 2004 on The Volokh Conspiracy.)

Posted by Benjamin Zimmer at 02:09 PM

Phrasally grateful

If you run out of conventional adjectives and adverbs, the English language stands ready to help. Just package an evocative phrase or two with an appropriate prosodic inflection, and you're on your way, as this morning's Zits illustrates:

Richard Sproat and I discussed this kind of modification in our paper "The stress and structure of modified noun phrases in English" (pp.131-182 in Anna Szabolcsi and Ivan Sag, eds., Lexical Matters, 1992). I don't have the volume at hand, but if memory serves, one of our examples (taken from a 1980s-era business magazine) was "old-fashioned white-shoe do-it-on-the-golf-course bankers" (though I'm not sure I have the punctuation as it was in the original).

I don't know any good, generally-used term for constructions like this. They are sometimes called "phrasal modifiers" (i.e. "modifiers that are phrases"), but that term is more commonly used to mean "things that modify phrases". Without a convenient indexing term, it's hard to search for discussions on line. If you have a suggestion, please let me know.

[Update -- Patricia Witkin writes:

Your post reminded me of the relatively recent series of Yoplait ads wherein two young women chat over cups of yogurt, taking turns to describe just how deliciously good the stuff tastes.

Man, I can't stand these ads (and, apparently, short men didn't like the bridesmaid one either. In looking for a link to send I found a letter to General Mills taking issue with the "not-being-put-with-an-usher-who's-shorter-than-you good" line)]


Posted by Mark Liberman at 09:21 AM

Ask Language Log: Rotely

Jasprizza Will writes:

A quick question -- is "rotely" a real word? I used it in a report the other day and was corrected and told it wasn't in the dictionary. True enough it isn't -- neither Websters or Oxford. It does get approximately 12,000 Googits and it sounds unexceptionable to me, but is it simply a widespread error or has it reached acceptance?

Rotely is used several times a week in U.S. newspapers. Thus Michele Parente in a restaurant review in the San Diego Union-Tribune of 10/11/2007:

No matter that these dishes must have been made thousands of times before; instead of rotely turning them out, Sala Thai has nailed how to do them right every time.

And Jim Farber in a concert review in the New York Daily News of 10/11/2007:

There were some duds: Howard Jones rotely tinkled out "Tiny Dancer." and the Pernice Brothers had no point of view on "Country Comforts."

Rotely also makes it past the copy editors at the New York Times, at least from time to time; thus Adam Nagourney, "Iowans Check for Dirt Under Giuliani's Nails", 8/15/2007:

Mr. Giuliani signed autographs like a machine, a reminder of just how famous he is. He traveled with what appeared to be an endless inventory of Sharpies, rotely affixing his name to campaign leaflets, books, T-shirts, signs and menus, even when no one was asking.

We can also see rotely used from time to time in scholarly writing. Thus H. Douglas Brown, "Cognitive Pruning and Second Language Acquisition", The Modern Language Journal 56(4): 218-222, 1972.

... retention of material rotely learned is extremely inefficient since forgetting is easily induced by interference ...

Or Helen Thompson, "Review of A Serious Proposal to the Ladies and An Essay on the Art of Ingeniously Tormenting", Eighteenth Century Fiction 18(3): 398-402, 2006:

But a woman's lively understanding of her "own duty" does significantly encroach upon the "Authority" that would enforce rotely feminized "conformity" ...

And in general, the fact that a rare but regularly-derived word isn't in the dictionaries is not a very strong argument. The English suffix -ly is routinely used to make adverbs out of adjectives, and not every creation of that type is listed, or needs to be listed. A minute of searching finds a list of -ly forms that have been used in print by good writers, but don't make the OED: relaxingly, bossily, diabetically, maturationally, etc.

So is it time to start passing out copies of Arnold Zwicky's declaration of lexical independence, "Not a word!" (11/27/2004)?

Perhaps not yet.

The problem is that the regular and sanctioned use of -ly is to make adverbs out of adjectives -- but rote is a noun, at least traditionally. The OED gives us a set of obsolete nominal senses:

1. a. Custom, habit, practice. Obs.
    b. Mechanical practice or performance; regular procedure; mere routine. Obs. (Cf. sense 2.)
    c. A rigmarole. Obs. rare.

with citations from 1315 to 1678. The only modern uses that the OED gives are in the phrase by rote:

2. by rote, in a mechanical manner, by routine, esp. by the mere exercise of memory without proper understanding of, or reflection upon, the matter in question; also, with precision, by heart. a. With say, sing, play, etc.
    b. With know, get, learn, etc.

and a set of attributive uses, either of rote itself or of the phrase by rote:

3. attrib., as rote knowledge, -learning, -lesson, -work; rote-learned, -like, adjs.; by-rote babble, lesson; rote learning, also spec. in Psychol., the learning by rote of meaningless material designed to be free of associations, as a technique in the study of learning.

Some people have (plausibly) interpreted the common attributive uses (such as rote knowledge, rote learning) to mean that rote is an adjective. And if it is, then rotely as an adverb meaning "by rote" is indeed unexceptionable.

But if rote is still a noun (as it seems to be in my personal and subjective grammar), then rotely would have to be the other -ly suffix, which is added to nouns to form adjectives: kingly, masterly, scholarly, manly, womanly. I really should say "was added to nouns", because this engine of derivation runs fitfully at best in modern English. Forms like bookly and floorly can be coined, but half as jokes, as in these examples from the web:

That word is Delight; and this is what I wish for you as readers--Delight, in all of its bookly incarnations.

I'm sure some of you may be grossed out by my lack of floorly cleanliness, but if it doesn't look bad, it doesn't bother me.

So if Jasprizza used rotely as an adverb, she was making a correct deduction from a historically uncertain premise. If she used rotely as an adjective, then she was making a whimsical deduction from an established (though obscure) premise.

Either way, there are arguments on both sides. In such cases, absent strong motivation for using the uncertain form, a conservative choice might be wiser, at least in formal contexts. The rotely quotes above could all have been rephrased with by rote, perhaps with some other re-ordering:

...instead of rotely turning them out, Sala Thai has nailed how to do them right every time.
...instead of turning them out by rote, Sala Thai has nailed how to do them right every time

Or some other word like mechanically might have been used.

... an endless inventory of Sharpies, rotely affixing his name to campaign leaflets, books, T-shirts, signs and menus...
... an endless inventory of Sharpies, mechanically affixing his name to campaign leaflets, books, T-shirts, signs and menus...

For those who want to avoid giving offense, such alternatives might be better choices. Thus does conscience make cowards of us all.

[Update -- Joe Ruby observes:

The correct word would seem to be rotelily. "He played the piece by rote." "It was a rotely performance." "He rotelily played the piece." :-)

I may have written you about this before -- if so, forgive me -- but lawyers use "timely" in the adverbial sense. "The complaint was timely filed with the court but untimely served on the defendant." So there is perhaps a precedent for a formation like "rotely."]


Posted by Mark Liberman at 08:15 AM

October 17, 2007

Programming Language

Ever since my high-school English teacher ran a course on "Electronic Grammar," I've been intrigued by the idea of writing programs to analyze language. Thirty years later the data is more accessible and the programming languages are much easier to use; LanguageLog contains many posts that demonstrate the value of simple programs and plots. The Natural Language Toolkit (NLTK) is designed to make it easy for anyone to write Python programs to access language data and generate tables and plots, and a new version has just been released. NLTK includes a free online book with over 200 graded exercises, including some inspired by LanguageLog such as the above vocabulary growth curve for presidential addresses. It also contains a large software library, 480 MB of data in dozens of languages, interactive graphical demonstrations, and distributions for Windows, Mac OS X and Linux, all free. Although NLTK is now used in over 50 universities, I hope NLTK will go full circle, so that high-schoolers will teach themselves to write programs to analyze language and to test the dubious claims that are often made about language.
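As a minimal sketch of the kind of simple program meant here, the following function computes a vocabulary growth curve: the number of distinct word types seen after each running token. It uses only the Python standard library rather than NLTK itself, and the tokenizer regex, the function name vocabulary_growth, and the sample sentence are all illustrative assumptions, not anything taken from the NLTK book.

```python
import re

def vocabulary_growth(text):
    """Return a list whose i-th entry is the number of distinct
    word types among the first i+1 tokens of `text`."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))
    return curve

# Hypothetical sample text; a real experiment would read in a corpus file.
sample = "the cat saw the dog and the dog saw the cat"
curve = vocabulary_growth(sample)
print(curve)
```

Plotting the returned list against token position gives the growth curve; a text with a richer vocabulary climbs more steeply.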
Posted by Steven Bird at 08:21 PM

The tale of the tape: Those fast-talking southerners

Well, John Edwards, anyhow. He talks about 10% faster than Hillary Clinton, and about 25% faster than Bill Richardson. On the other hand, Mitt Romney, the overall fast-talking champ among the serious presidential hopefuls, has him beat by about 10%. At least, that's the tale of the tape in the most recent Democratic and Republican debates.

The NYT has been trying hard to improve its web presence, with blogs springing up like mushrooms in all the nooks and crannies of its site. They've also begun deploying interactive Flash apps for one kind of analysis or another, including this one for the Sept. 26 Democratic debate, and this one for the Oct. 9 Republican debate, which are meant to help you navigate the transcripts and the video recordings.

I like this approach to multimedia navigation and annotation -- if you're interested in exploring such things for yourself, you might take a look at Project Pad, among other free software efforts to create tools for this sort of thing. Flash produces a nice, crisply responsive user experience -- though it's unfortunate from my point of view that users can't cut and paste content easily. (On the other hand, I suppose that many "content providers" think that this is a good thing).

Anyhow, for this morning's Breakfast Experiment™ I thought I'd take advantage of the rather modest annotation offered by these specific NYT apps, which present a summary by speaker of word counts, speaking times and speech rates (in words per minute). (I obviously don't vouch for the details behind these numbers. Different ways of quantifying word counts and speaking time can produce quite different speech-rate numbers, as discussed here -- and I'm not even sure that the NYT used the same methods to calculate word counts and speaking times in the two debates. So caveat lector...)

From the Oct. 9 Republican debate:

Speaker              Words    Time    Rate (wpm)
Moderator             3018   15:56    189
Sam Brownback         1558    7:58    196
Rudolph Giuliani      2964   14:25    206
Mike Huckabee         1669    8:03    207
Duncan Hunter         1095    5:48    189
John McCain           2099   11:59    175
Ron Paul              1067    5:43    187
Mitt Romney           3084   13:57    221
Tom Tancredo          1377    7:01    196
Fred Thompson         2910   15:57    182
All candidates       17823   90:51    196.2

The Sept. 26 Democratic debate:

Speaker                           Words    Time    Rate (wpm)
Moderator                          3670   21:02    174
Allison King (asked questions)      426    2:31    169
Joe Biden                          1402    7:23    190
Hillary Clinton                    3153   17:14    183
Chris Dodd                         2202    9:56    222
John Edwards                       2478   12:20    201
Mike Gravel                         776    4:27    174
Dennis Kucinich                    1423    7:31    189
Barack Obama                       2582   13:47    187
Bill Richardson                    1948   12:04    161
All candidates                    15964   84:42    188.5

Overall, the Republican candidates are about 4% faster on average; but an unpaired t-test on the candidates' rates suggests that this difference is not statistically significant. It's perhaps worth noting, though, that stereotypes are not in general confirmed: John Edwards, despite his southern drawl, is around the 75th percentile of this bunch. Hillary Clinton, who as a woman should by stereotype be a faster talker than her male competitors, is below the median in this sample.
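For readers who want to check the arithmetic, here is a rough sketch that recomputes speech rates from the word counts and times in the tables above and runs an unpaired comparison of the per-candidate rates. The helper names are illustrative, and the equal-variance (pooled) form of the t statistic is an assumption, since the exact variant of the test used isn't specified.

```python
from math import sqrt
from statistics import mean, variance

def minutes(mmss):
    """Convert an 'M:SS' string to minutes as a float."""
    m, s = mmss.split(":")
    return int(m) + int(s) / 60

def rate(words, mmss):
    """Speech rate in words per minute."""
    return words / minutes(mmss)

# Per-candidate rates (words per minute) transcribed from the two tables;
# moderators and questioners are left out.
republicans = [196, 206, 207, 189, 175, 187, 221, 196, 182]
democrats = [190, 183, 222, 201, 174, 189, 187, 161]

def pooled_t(a, b):
    """Student's t statistic for two independent samples,
    assuming equal variances (an assumption of this sketch)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

t = pooled_t(republicans, democrats)
ratio = rate(17823, "90:51") / rate(15964, "84:42")

print(round(rate(3084, "13:57")))  # Romney's rate from his words and time
print(round(ratio, 3))             # Republican vs. Democratic overall rate
print(round(t, 2))                 # t statistic, 15 degrees of freedom
```

With 15 degrees of freedom, the conventional two-sided 5% threshold for |t| is roughly 2.13, and the statistic computed here falls well short of it, in line with the observation that the party difference is not significant.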

[Update -- Ben Zimmer observes:

Clearly, it's the Southwest where people talk slowly, given the low rates for New Mexico's Richardson and Arizona's McCain. They've obviously learned to conserve their energy to adapt to the desert climate...]


Posted by Mark Liberman at 09:53 AM

The Islamic language family

One small point went unnoticed in Sally Thomason's very sharp two-part critique (here and here) of Tecumseh Fitch's recent short article in Nature. It was spotted by my sharp-eyed Edinburgh colleague Bob Ladd. The artwork accompanying Fitch's article depicts the tree of Indo-European language relatednesses. And the branch leading toward such languages as Russian is labeled ISLAMIC.

It is possible that Fitch never saw this at the proof stage, and will learn about it right here on Language Log. In my experience of writing for Nature, proofs of commissioned commentaries and the like are sent by fax or PDF at a stage where the accompanying decorative artwork isn't necessarily in final form when the author checks the text. If I recall correctly, when Barbara Scholz and I read the proofs of our short piece ‘Language: More than words’ for publication (in Nature 413, issue no. 6854, 27 September 2001, page 367), we hadn't seen the picture that was to appear with it.

Bob and I both felt that the most likely explanation for the slip here (there is of course no such language family as "Islamic", anywhere in the world) does not lie in sheer ignorance. (We don't think anybody believes Russian is in a family of Islamic languages, except for those who have come to believe it in the last few days because they read it in a diagram in Nature, and those who have made a totally unjustifiable leap of inference from President Putin's current visit to the Islamic Republic of Iran.) Our guess would be that (a) somebody's handwriting was misread by an artist who was just copying lettering and needed a second cup of coffee and was not thinking about plausible historical relationships between languages, and crucially, (b) the direction of the slip was reinforced by the very high frequency of the word "Islamic" relative to what was actually intended, namely "Slavic".

Frequency commonly affects the direction that error takes; you may recall my speculation about the case I discussed in "Hammer, jammer, slammer, stammer, grammar". To check the guess about frequency in this case, I looked at the number of pages found by Google News (UK) for searches on the two relevant words. Slavic: 247. Islamic: 52,561. The defence rests. The slip was probably a frequency-reinforced error in handwriting recognition.

Perhaps, though, it would still be a good idea for Nature to print a correction.

Posted by Geoffrey K. Pullum at 07:11 AM

October 16, 2007

Eggcorns on ABC News

For anyone who's near a television this evening, you might want to turn on the 6:30 broadcast of ABC World News with Charles Gibson. Or, if you're not near a television, just click over to the webcast. At the end of the show is a short piece in honor of National Dictionary Day, and I was interviewed for it. I talk about how idioms get their spellings (and meanings) reshaped into new variants, sometimes to the point of meriting dictionary inclusion. Of course, we call those eggcorns around these parts, but unfortunately my discussion of the exciting field of eggcornology was too much for a two-minute segment. If you want a fuller exploration, check out my column on OUPblog that inspired ABC News correspondent Robert Krulwich in the first place.

[I'm not sure how long ABC's link to the video segment will last, so I've archived it here (15MB) and here (4MB).]

Posted by Benjamin Zimmer at 06:04 PM

New frontiers in WTF grammar

In response to my post about "Would you mind go checking on the laundry?", other people are sending me examples of grammatical variation. Thus Dick Margulis:

Coincidentally, there's a current thread on COPYEDITING-L about constructions like "You need your eyes testing."

UK correspondents report that it sounds perfectly normal to them.

Whether or not it's a regional thing, that construction is certainly out there. To me, on the other hand, it sounds like the output of bad machine translation. But right now I need a lecture preparing, so I'll leave this one for the experts. Arnold?

[Update -- under the Subject line "We don't need our grammar correcting", Sarah McEvoy writes:

Just spotted your LL post; it's quite true that the construction "you need your eyes testing" is perfectly normal in the UK, but, to quote Inigo Montoya, I do not think it means what you think it means. You concluded with "I need a lecture preparing", meaning (I think) that you needed to prepare a lecture. However, the construction is never used in that way here; the meaning is passive. If you had said that out of context, I and other UK readers would automatically assume that you needed to have a lecture prepared on your behalf.

It may be worth noting that at school I was once brought up short by a teacher who said (jocularly!) to another pupil, "You want shot!" This sounded really odd to me. I would have automatically expected "shooting".

What, out of interest, would be said in the States? Would it be "you need your eyes tested"? That is heard over here too, but I suspect "testing" would be perceived by the majority of Brits as more correct.

Yes, we (or at least I) would say "you need your eyes tested". But the issue here is not cases like "you want/need shooting", which (ethical issues to the side) are fine on both sides of the Atlantic, as far as I know. ("Your eyes need testing" is thus a normal and unremarkable phrase for Americans, or at least for me.) The problem is phrases of the form "X needs Y V-ing", where Y is construed as the object of V.

As for the active/passive distinction, I guess I thought it would be more of a middle, along the lines of "my eyes need testing" = "I need my eyes testing". Thus, I thought, "my lecture needs preparing" would correspond to "I need my lecture preparing". I was reluctant to believe that the gerund-participle form, which is normally active in meaning, has become effectively passive in this case.

However, perhaps there is some variation on this point among British speakers. Under the Subject line "You need your grammar checking", Vicky Larmour writes:

the construct that you say sounds like "bad machine translation" does indeed sound perfectly normal to my English ears! I'm wondering what the US-English equivalent would be - "You need your eyes tested"? "You need your eyes to be tested"?

(I didn't even notice your use of "I need a lecture preparing" on the first couple of readings, it sounds so natural to me)]

Posted by Mark Liberman at 09:39 AM

Ask Language Log: "Would you mind go checking on the laundry?"

Thomas Mills Hinkle asks:

Just now, my wife asked the following "Would you mind go checking on the laundry?" The (to me) error "go checking" made me think: what is the deal with go X constructions in English?

The context makes it clear that Thomas means X=Verb (as opposed to things like "crazy" or "fishing" or "missing"). Consulting his intuitions, he observes that:

Go X works in:

* Imperatives: "Go check the laundry".
* Combination with auxiliary verbs: "I'll go check", "You should go check", "You might go check..."
* Infinitive phrases: "I like to go check the laundry".

Go X sounds slightly dodgy with:

* "do": "Did you go check the laundry"... "I did go check the laundry"

Go X is unacceptable with:

* Present: *"He go checks", *"He goes check"
* Progressives: *"He's go checking", *"He's going check"
* Gerund phrases: *"I wouldn't mind go checking the laundry", *"I wouldn't mind going check the laundry"

My intuitions are pretty much the same, except that "Did you go check the laundry?" seems fine to me.

Go V-ing works when go is an infinitive or imperative and V-ing is the complement of go, as in the common phrase "don't go looking for trouble". But as Thomas observes, go V as a whole doesn't have any gerund-participle form, and attempts to use go V-ing (or going V, or going V-ing) for this purpose don't work.

Except that for some people, this is apparently not true.

Looking on the web, I find a certain amount of stuff like this:

"He never came out of his room but breakfast should be ready soon.Would you mind go getting him?"Kurmam asked politly.
“That’s my hope,” Barbara replied. “Dinah, would you mind go getting the wheel chair for me. I’m getting tired.”
"Would you mind go getting the water? I need to show Mom this, not just tell her."
"And you said that you wanted a bookbag with the Gryffindor logo? I'm pretty sure we have it over there--" she pointed to a corner in the shop "--would you mind go getting it please?"
"Oh, well good job, Nate! Hey, would you mind go getting your sister?"

Sean: Hey, would you mind go checking the microphones one more time?
because im such a lame loser no one seems to be commenting my icons i posted, and you know, i just lvoe comments, so would you mind go checking them out?

I really don't mind go looking, if you want to.

At least some of these seem to be genuine instances of constructions that some native English speakers find OK, not typos or other mistakes. (There are a smaller number of examples like "I wouldn't mind going get coffee beforehand", but these seem most plausibly to come from leaving out "to" by mistake.)

And it's also apparently not true for everyone that go V is impossible with inflected forms of V.  At least, there are lots of examples of "went V" out there. Some of them may be typos for "went to V", but this set of selections from one sequence of Xanga entries shows that there are apparently some people for whom went V is routine:

wE tAlKeD tO jOe bLoWs aNd dAlE(hE wAs bEiN aN AsS)aFtEr tHaT wE wEnT eAt PiZzA aNd tAlKeD tiL liKe 2:15
wE wOkE uP aRoUnD 9:30 aNd aTe bReAkfAsT tHeN wEnT sWiM.
I saw Dale had called so i called him abck and talked to him for a while.then we ate n watched t.v. then went swim until 5:00.
I woke up at 7:15.Left my house at 8:00.Me and my sister Aimee left and went get Devon V.
N i called Melanie and went meet her in Moreauville.
I woke up at 9 n waited for his call.At aroung 11 o'clock I went get my cell phone n saw that i had a missed call from a # so i called it back....

It looks to me like there's a certain amount of microvariation in the grammar of go V -- this is the kind of stuff that emerges when we can search the informal writing of millions of people.
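Searches of this kind are easy to approximate on any collection of informal text. The sketch below, a rough illustration rather than the method actually used for the examples above, uses two regular expressions to pull out candidate "mind go V-ing" and "went V" strings. The short verb list and the function name are hypothetical, and a pattern match alone cannot tell a genuine construction from a typo, so hits still need to be read by a human.

```python
import re

# Matches "mind go V-ing", e.g. "would you mind go getting him?"
MIND_GO_VING = re.compile(r"\bmind\s+go\s+(\w+ing)\b", re.IGNORECASE)

# Hypothetical, deliberately tiny list of bare verbs to look for after "went".
BARE_VERBS = ("eat", "swim", "get", "meet")
WENT_V = re.compile(r"\bwent\s+(" + "|".join(BARE_VERBS) + r")\b", re.IGNORECASE)

def find_go_variants(lines):
    """Return (pattern label, matched verb) pairs for every hit in `lines`."""
    hits = []
    for line in lines:
        for m in MIND_GO_VING.finditer(line):
            hits.append(("mind go V-ing", m.group(1).lower()))
        for m in WENT_V.finditer(line):
            hits.append(("went V", m.group(1).lower()))
    return hits

# Sample lines taken from the examples quoted in this post.
samples = [
    "Hey, would you mind go getting your sister?",
    "I really don't mind go looking, if you want to.",
    "wE wOkE uP aRoUnD 9:30 aNd aTe bReAkfAsT tHeN wEnT sWiM.",
    "I hope that Thomas went and checked the laundry.",
]
hits = find_go_variants(samples)
for label, verb in hits:
    print(label, verb)
```

The last sample line is an ordinary "went and V" coordination and is correctly skipped, since "and" is not in the verb list.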

The basic go V construction has been around in English for a long time. The OED explains in its entry for go that

32. Instead of, or in addition to, the place of destination, the purpose or motive of going is often indicated. This may be expressed in various ways: a. by the simple inf. to go get : to go and get; to reach; to work hard or ambitiously (cf. GO-GETTER). Now colloq. and U.S.

go look! used to convey a contemptuous refusal to answer a question (obs. exc. dial.; common in Derbyshire).

and gives citations all the way back to Beowulf:

Beowulf (Z) 1232 Eode þa to setle. 1375 BARBOUR Bruce I. 433 Ga purches land quhar euir he may. c1386 CHAUCER Shipman's T. 223 Lat vs heere a messe and go we dyne. c1475 Rauf Coilȝear 157 Ga tak him be the hand. 1542-5 BRINKLOW Lament. (1874) 111 That I shulde go pour out my vyces in the eare of an vnlearned buzarde. 1591 SPENSER Teares Muses 398 Now thou maist go pack. 1602 Narcissus (1893) 87 Come, daunce vs a morrice, or els goe sell fishe. a1625 FLETCHER Mad Lover II. i, There's the old signe of Memnon: where the soule is You may go look. 1668 HOWE Bless. Righteous (1825) 199 We mighte as well go preach to devils. 1724 DE FOE Mem. Cavalier (1840) 71, I bid him go take care of his..things. 1795 Ann. Agric. XXIII. 315 Nor does the drilled corn..go lie (as the farmer calls it) so readily as the broad-cast. 1813 JANE AUSTEN Lett. (1884) II. 216 Your Streatham and my Bookham may go hang

Thomas (the guy who wrote in with the question) suggested a theory about the origins of the construction:

"Go X" seems to be a "short form" for "Go and X", which is what I would say for other forms, such as "He's going and checking the laundry" and "I went and checked laundry."

The "full" form also strikes me as correct where the shortened form would work, as in "I would go and check the laundry, but this blog is really interesting."

So, my question is, what is the deal with these "go" forms? What's the role of "go"? Are there other verbs that get clumped together with other modals/main verbs in bare forms (in infinitives, with modals) but then get re-parsed as coordinate phrases (X and Y) in conjugated forms? (or is that what's going on?)

Since go V has been used in English for such a long time, if it derived historically from go and V, it happened a long time ago.  The OED gives go and V a separate sub-entry, which also draws attention to its connotation, which CGEL (p. 1303) calls an "emotive ... overlay of disapproval, annoyance, surprise, or the like":

c. by and with a co-ordinated verb. In the modern colloquial use of this combination the force of go is very much weakened or disappears altogether. In the positive imperative go is often nearly redundant (cf. L. i nunc, et...); otherwise, to go and (do something) = ‘to be so foolish, unreasonable, or unlucky as to ----’. So in the vulgar phrase (I have, he has, etc.) been and gone and (done so and so).

The citations go back a thousand years, but I'll select one that is fairly accessible to modern readers, and also clearly tinged with disapproval:

1558 SIR T. GRESHAM in H. H. Gibbs Colloquy on Currency App. 6 Againste all wisdome the seyd bishoppe went and vallewid the French crowne at vjs. ivd. 1

CGEL groups go and V with try and V, be sure and V, etc. Here's the discussion of try and V:

This idiomatic construction is syntactically restricted so that and must immediately follow the lexical base try; this means that there can be no inflectional suffix and no adjuncts. She always tries and does her best and We try hard and do our best can only be ordinary coordinations. There are two forms that consist simply of the lexical base: the plain form ... and the plain present tense ... But the verb following and is always a plain form, as is evident when we test with be: We always try and be/*are helpful. In spite of the and, therefore, this construction is subordinative, not coordinative; and introduces a non-finite complement of try. Be sure works in the same way as try, except that the lexical base of be is only the plain form, so this time there is no plain present tense ...: We are always sure and do our best is not possible as an example of this construction (and unlikely as an ordinary coordination).

Other instances of the same general type include sit and V, stand there and V, be an angel and V -- but each case has its own peculiarities, as you can discover for yourself. And the existence of a parallel form lacking the and is peculiar to go; so I don't think that the hypothesis of and-deletion gets us very far in explaining the distribution of contemporary forms.

Still, there are some significant affinities. Checking off the characteristics mentioned by CGEL in connection with the V and V pattern: For most of us, the go of go V must be the lexical base go (whether as the plain form, as in Let's go get it,  or the plain present tense, as in I always go check the water level at noon), and not an inflected form, as in *We went get it, or *I'm going get it immediately. And for most of us, interpolated adjuncts are no good: *Go right now do your homework.

Meanwhile, I hope that Thomas went and checked the laundry.

[Update -- Geoff Pullum writes:

With some reluctance, I changed "can only be an ordinatary coordination" to "can only be an ordinary coordination" because it was in a quote from CGEL. Clearly the words "subordinative" and "coordinative" a couple of lines earlier had apparentatly inductated in you a kind of maniacatal predisposatition to insert "-at-" in lexatemes that etymatologicatally providated no reasonatable excuse for such insertation. But in making my correctation I have destroyed a nice piece of genuinely attestated evidence for a sort of speech error in writating.

I agree that the morphophonemics of typing errors is an unjustly neglected research area. And I've occasionally considered registering "Word's Worst Proofreader" as a service mark with the USPTO.]

Posted by Mark Liberman at 07:50 AM

October 15, 2007

Never talk metric to decent folk

After reading Arnold's post on "Cow-towing to Celsius", I can't resist pointing out that one of the Oklawaha County Doctrines of Citizenship is "Never talk metric to decent folk". And even though there's no metric talk in Gamble Rogers's Bovine Midwifery story, it does put cows in the middle of the culture clash between the good citizens of Oklawaha County and the forces of metrication represented by community organizer Narcissa Nonesuch.

Posted by Mark Liberman at 07:17 PM

Cow-towing to Celsius

The Scientific Activist of 10/13/07 reported on

responses to an announcement by Chief Meteorologist Tim Heller on Houston ABC-13's Weather Blog when he announced that the TV station's weather report would now include temperatures in Celsius in addition to Fahrenheit.

Among the ranting responses was this one, with the wonderful cow-tow in it:

This is just another example of giving in to people who come here from other countries and are too lazy to learn our ways (English, non-metric temps, etc.). Why should they have to learn our ways, we feed them their ways so they don't have to bother. This is a TERRIBLE idea. If I need to know how to convert something from metric to American temps, I will get a book and figure it out or find it on the internet. They should do the same. With the internet, anything can be learned without it having to be fed to us. I don't expect other countries to cow-tow to my English, we should NOT cow-tow to their language and desires to not bother to learn our language and ways.

(Hat tip to Paul Armstrong.)

Before I go on to cow-tow, here are a couple more responses about the Celsius threat to the American Way of Life in which the metric system and language are tied:

NO, on celsius. This is the United States of America. We speak English and use Fahrenheit. Well, I guess you could show wind speed in kilometers, too. Where does it stop? I guess when we become a Spanish speaking nation.

Just another concession to political correctness and liberalism. By compromising our language, our culture, our standards, and the like we only enable those who refuse to assimilate and only wish to be leeches upon our largesse.

There are more.  To be fair, there are also critical replies to the ravings.

Now for cow-tow (also spelled solid, as cowtow, and separated, as cow tow).  This one is in Brians, under cowtow/kowtow, and was noted in a discussion on the Eggcorn Forum back in March.  You can google up a pile of hits; it's all over the place.  The question then is whether this is a simple misspelling, with initial /k/ spelled by the more common C rather than K; or a spelling like pail for pale in beyond the pail (more on this below); or an eggcorn in which cows are somehow involved (a possibility that the posters on the forum found unlikely).  It is, of course, possible that different people have hit on the spelling by different routes.

As background for further discussion, I note that eggcorns come in three types:

Type 1, involving semantic reanalysis of some part of an expression that is not reflected in spelling.  These are HIDDEN EGGCORNS, like the die is cast taken to refer to casting things in molds, rather than throwing dice (in the ecdb here).

Type 2, involving semantic reanalysis of some part of an expression that's reflected in spelling but not in pronunciation, as in the dye is cast, with the expression taken to refer to coloring things (in the ecdb here).

Type 3, involving semantic reanalysis of some part of an expression that's reflected in both spelling and pronunciation, as in mindgrain for migraine (in the ecdb here).

In these classic eggcorns, there is a reanalysis of one or more parts of an expression as representing lexical material not in the original and contributing to the (perceived) semantics of the result.  In types 2 and 3, the reanalysis is reflected in the spelling.

But there are other errors in which one or more parts of an expression are re-spelled so as to replace opaque parts by recognizable lexical material, but without any noticeable improvement in the semantics; what gives rise to them is a drive to find familiar elements as much as possible.  I'll call these DEMI-EGGCORNS.  The errors that I called PAILS in an earlier posting -- named for the pail of beyond the pail -- are demi-eggcorns: they provide familiar parts that nevertheless don't contribute meaning to the resulting expression.

Of course it's possible that once the reanalysis has been made by some people, others will find some way to rationalize the result.  Maybe there are people who think pails are involved when something is beyond the pail.

And maybe there are people who think that cows are involved in cow-tow.  But I'd guess that many people who use this spelling are just pleased to see a familiar element, cow, in the expression, and treat the whole expression as yet another puzzling idiom of English.  That is, I'm suggesting that many occurrences of cow-tow are demi-eggcorns (some probably are simple misspellings) -- of a type corresponding to the type 2 eggcorns above, with spelling altered but pronunciation preserved.

So you're asking if there are demi-eggcorns corresponding to the type 3 eggcorns above.  Here's a candidate, from a discussion on the American Dialect Society mailing list back in September: southmore, as in "freshman southmore junior senior":

Clubs: Track, Basketball, Freshman/Southmore Choir, Powderpuff Football, Intramural Football, & Senior Yearbook Committee, ...  (link)

You were strong as a freshman on this board, but then you suffered from the southmore slump. I was starting to lose hope, but damn son, you are right back ...  (link)

1125 FR - Freshman
1125SO - Southmore
1125JR - Junior
1125SN - Senior  (link)

hey, i am abrahan garza, class of 1997, and was in band from 1994 to 1997 minus my southmore year when i was the mascot.  (link)

This variant (which seems to be widely distributed in the U.S. and, from testimony on ADS-L, goes back to the 19th century) is clearly based on the disyllabic pronunciation of sophomore, with both the vowel and the offset consonant of the first syllable reshaped so as to yield a familiar English word, south, in place of the unfamiliar first syllable of the original.  Maybe some people think the compass point has something to do with the second year of college, but I suspect that the motive for the reshaping is primarily the search for familiar elements, for some people quite possibly encouraged by the south of the equally opaque southpaw.  (Larry Horn, who made this suggestion on ADS-L, noted that historically southpaw is compositional, but with an etymology that hardly anyone appreciates; for most people, it's just an idiomatic compound.  For the etymology, see southpaw in AHD4.)

To sum up, I'm suggesting that there are two drives behind reshapings: to find familiar elements as much as possible, and to find meaning as much as possible.  Classic eggcorns show both effects, demi-eggcorns only the first.  Cow-tow looks like a type 2 demi-eggcorn, southmore like a type 3 demi-eggcorn.
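The two-drive summary can be put as a tiny decision rule. This is only a rough sketch in Python: the class name, the field names, and the yes/no judgments fed into it are my own illustrative assumptions, and deciding whether a reshaping really "adds meaning" for a given speaker is exactly the hard part that the code takes as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reshaping:
    """One reshaped expression, with hand-assigned (hypothetical) judgments."""
    form: str               # the reshaped spelling, e.g. "cow-tow"
    original: str           # the expression it derives from
    familiar_parts: bool    # drive 1: opaque material replaced by familiar words?
    adds_meaning: bool      # drive 2: do the new parts contribute to the semantics?


def classify(r: Reshaping) -> str:
    """Classic eggcorns show both drives; demi-eggcorns only the first."""
    if not r.familiar_parts:
        return "neither"    # e.g. a simple misspelling
    return "classic eggcorn" if r.adds_meaning else "demi-eggcorn"


# Judgments below follow the guesses made in the post; they are not data.
cow_tow = Reshaping("cow-tow", "kowtow", familiar_parts=True, adds_meaning=False)
southmore = Reshaping("southmore", "sophomore", familiar_parts=True, adds_meaning=False)

print(classify(cow_tow))    # demi-eggcorn
print(classify(southmore))  # demi-eggcorn
```

A speaker who does rationalize the result (someone who really thinks cows are involved) would flip adds_meaning to True, and the same form would then count as a classic eggcorn for that speaker.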

An entertaining final note: back in November 2005 on the Eggcorn Forum, Ken Lakritz noted cow toe as a variant spelling of kowtow.  He suggested that showing deference by kissing someone's toe might be involved in this version -- but it could be based on a mispronunciation of written {kowtow} (or {cow-tow}), based on the fact that {tow} can be pronounced like toe.

Posted by Arnold Zwicky at 02:42 PM

On being manifestly wrong

Since I've written about the legal term of art, "manifest injustice," twice now (here) and (here), my headache has taken a new direction, all of my own doing. It seems that the two law professors I quoted in my second post were actually agreeing with each other rather than, as I had assumed, giving me contradictory advice. It would be easy for me simply to say mea culpa, but part of my error and confusion has a slight linguistic flavor, leading to what I hope will be my final post on this topic (but for a neat piece on apologies, take a look at this New York Times article).

For one thing, I miscopied part of Professor Weinberg's message to me, which is a no-no in anyone's book. For reasons unknown even to me, I added a "probably" where it wasn't present and left out the word "manifest" before the word "justice." That was bad enough, but it was his next sentence that I really messed up. It was difficult for me to understand and I guess I just didn't parse it properly. What he wrote was:

That is, if requiring the defendant to stick to his plea would probably lead to injustice, but things aren't really clear and there's room for argument either way, then defendant is stuck with the plea.

My comprehension problem began when I incorrectly assumed that Professor Weinberg intended "then defendant is stuck with the plea" to be a new sentence and that he had somehow forgotten to start with a capital letter. So, in a misguided effort to help, I changed it (note to self: don't ever do this again). I'm now sure that Professor Weinberg intended his compound clause, "but things aren't really clear and there's room for argument either way," to be a parenthetical statement coming between his initial dependent clause and his final main clause, which would make a lot of sense. But I didn't read it that way, so I foolishly tried to be helpful and edit it as a new sentence. Good intentions; bad results. As a result, my post misrepresented Professor Weinberg and made it look as though he was saying the very opposite of what he meant. For this, I apologize to him and my readers. Mea Culpa, indeed.

But the really nifty part of this whole episode is Professor Weinberg's final message to me, correcting my misreading. It began: "Not to prolong any of this or make your headache worse...it [my post] was fun to read...." And, after clarifying what his original message really meant, he concluded this one saying: "Anyway, I know it wasn't intentional, and it's hard to publish a whole lot of blog posts over an extended period of time without coming up with some bloopers."

How many letters of correction ever compliment the writer who misstated his words? And how many such letters ever show any understanding that the errors weren't intentional? Now that's real class!

Posted by Roger Shuy at 11:25 AM

Myanmar is Mama

And Burma is Bama, apparently. And Mama is the literary pronunciation of the more colloquial Bama. John Wells explains:

In Burmese, this name Myanmar is essentially just a variant of the name Burma. It is transliterated as Myan-ma or Mran-ma, and in the local language pronounced something like [ma(n) ma], as against [ba ma] for the traditional name.

According to Wikipedia,

within the Burmese language, Myanma is the written, literary name of the country, while Bama ... (from which “Burma” derives) is the oral, colloquial name. In spoken Burmese, the distinction is less clear than the English transliteration suggests.

So where did those R's come from? They're a British spelling convention to indicate long vowels:

What interests me now, however, is the question of how Americans and other rhotic speakers are supposed to pronounce this name. In both Myanmar and Burma the English spellings assume a non-rhotic variety of English, in which the letter r before a consonant or finally serves merely to indicate a long vowel: [ˈmjænmɑː, ˈbɜːmə].

So any American who says the last syllable of Myanmar as [mɑːr] or pronounces Burma as [bɝːmə] is using a spelling pronunciation based on British, non-rhotic, spelling conventions.

I don't know anything about Burmese, and haven't checked the details involved here -- Bill Poser, who does know something about Burmese, may have more information to offer.

This reminds me that I'm still curious to know the truth about the "Burmese episode" at Yale that "was as funny to outsiders as it was painful for those involved". But there's certainly nothing funny about what's going on in Burma now (Seth Mydans, "A Few Voices From the Deepening Silence", NYT 10/14/2007):

A young man described how the junta has clamped down on social exchange, destroying trust among people:

There is no more connection between people. It’s been broken. In our own neighborhood, the security groups will arrest anyone who is heard talking about these events. Even at tea shops we can’t talk about these things. These thugs will remember who you are and come to arrest you later. We can only talk to people we know on the street and never to strangers now. No one says anything at the market and everything has to be in secret. The bars have emptied out both because no one has any more money and what fun is it to get drunk when you can’t talk?

Even now we don’t dare take our transistor radios to listen to foreign broadcasts outside. Just in the last few days, we have been threatened with arrest by local authorities for doing this in our ward. Anyone with a cellphone or camera will have it confiscated.

Posted by Mark Liberman at 07:17 AM

October 14, 2007

Fitchifying Language Change

This is the second of two posts today on the fitchification of aspects of historical linguistics. The first focused on a misunderstanding of the history of (modernish) linguistics in Tecumseh Fitch's recent Nature article. This one is on his odd views about the nature of language change. Some are just simple mistakes, or at best uncritical acceptance of highly controversial views on the topics he's addressing. Examples are his claim that Proto-Indo-European was spoken `some 10,000 years ago' (the prevalent view among Indo-Europeanists, who are the people who have the linguistic evidence, is more like 6,000 years); his belief that the family tree of related languages was `the crowning achievement' of 19th-century historical linguistics (see comments in my earlier fitchification post today); and his belief that people `don't generally invent words or grammatical forms'. On this last one, anyone who was ever a teenager, or who ever even met one, should know better -- in addition to the world of teenage slang and all those corporate inventors of terms for new products, consider novel constructions like the -f***in'- infix in words like abso-f***in'-lutely. (No, of course those asterisks aren't necessary on an enlightened forum like Language Log. I just think they're cute.) And community-wide examples of deliberate linguistic changes, at all levels of linguistic structure, are turning up quite frequently these days -- changes, that is, that alter an entire language, typically in a small speech community. But Fitch also reveals a more interesting misunderstanding in his conception of what language change, especially lexical change, is.

The problem is that Fitch seems to be conflating three very different change processes. First, there's lexical replacement, as when early English hound was replaced by dog as the generic term for the animal, or when Old English deer was replaced by the borrowed word animal as a generic term for a beast. Second, there's analogic regularization, as when the verb fly, without giving up its inherited past tense flew, acquired an alternative past tense form flied (with a slightly different meaning) in baseball expressions like he flied out to left field. There are also occasional analogic changes in the opposite direction, away from global regularization, as when wear changed from a weak past tense, which would have become weared if it had survived into Modern English, to a strong past tense wore on the analogy of rhyming verbs like tear, swear, and bear; it is not true that analogic change always favors the majority pattern, so Fitch is wrong when he says that irregular verbs `remain only as irregular residues'. And third, there is sound change, which (according to the regularity hypothesis, which dates back to the late-19th-century Neogrammarians) is blind to considerations of morphology, semantics, and -- crucially in Fitch's context -- frequency.

Fitch emphasizes frequency of occurrence of particular words because that is the focus of the two articles he is surveying in his essay (see Mark's post of a few days ago for the links). One set of authors argues that the most frequent words resist analogic regularization; the other set argues that the most frequent words resist change, at least in Indo-European languages. Fitch acknowledges that the `realization that frequency of use has a significant role in language change is nothing new', but claims that `the use of sophisticated methods...to quantify these relationships is an important step forward'. He does not say why it's an important step forward, but never mind. It's nice to have quantitative evidence to support what historical linguists have known for a hundred years or so, and it's not necessarily a trivial result.

Some of Fitch's examples are unfortunate choices as illustrations of his thesis. He says that `high-frequency English verbs retained their ancestral irregular state ("go/went" or "be/was")'. It depends on what he means by `ancestral', of course, but he might be interested to know that some of the irregularities in these two verbs are not all that old. Old English had a regular past tense for go; the earliest example of went given in the Oxford English Dictionary is from 1484, five or six hundred years after the earliest documented Old English. So this is another case like wear, where a regular verb has become irregular over time. The history of be is more complicated. According to the OED, the modern paradigm of this verb is an amalgam of three different Proto-Indo-European verbs -- PIE *es- `be' in the present tense (am, are, is, are, with analogic changes merging several forms); PIE *bhew- `become' in non-finite forms (be, being); and PIE *wes- `remain, stay, continue to be' in the past tense (was, were, again with several analogic mergers). The inherited verb for `become' was still a distinct verb in Old English, and only later merged with the other two combined verbs as the infinitive. So this most irregular of Modern English verbs, like go and wear, has in fact become more irregular in recorded history. Of course these quasi-counterexamples don't affect the statistical patterns that show that frequency affects the tendency for regular formations to spread analogically at the expense of irregular formations. They do suggest that it's a good idea to check one's facts.

Fitch's most serious mistake, though, is not his choice of examples that don't show what he thinks they show. His most hair-raising error is the flat assertion that `frequently used words are resistant to change'. This is clearly true if the type of change is analogic regularization. It may or may not be true of lexical replacement (and I admit that I haven't yet read the articles he refers to, to see if they make any distinctions according to the type of lexical change); it won't be surprising if it holds there too. But it most certainly is not true of lexical changes due to sound change. The Neogrammarians' hypothesis that sound change is inevitably regular runs into problems, primarily (but not entirely) having to do with the differential spread of sound changes through a speech community. However, the fact that the Comparative Method, which rests heavily on the regularity hypothesis, has proved to be effective in the establishment of language families and the reconstruction of proto-languages for all but a very small subset of the thousands of human languages is in itself evidence that regular sound change is the norm, not the exception. And regular sound change is indeed blind to frequency and all other nonphonetic contextual factors. So it is nonsense to say that frequent words resist change unless one qualifies the statement to exclude regular sound change. When Fitch speculates that `new phonological forms might arise less often for high-frequency words because errors of perception, recall or production are less common for frequently used words', his reference to perception and production clearly includes lexical changes due to sound change. He himself does not mention pronouns. A member of one of the two author sets for the quantitative Nature articles (I didn't catch the interviewee's name), interviewed the other day on a BBC radio program, did cite pronouns as examples of unlikely-to-change words, and certainly they are among the most frequent words in Indo-European languages. But pronouns aren't all that unlikely to change; see this earlier Language Log discussion of the `super-stable pronoun' hypothesis (it's a recurring theme).

I've been critical of Fitch's article, so I should emphasize one point on which we agree: he is quite correct (if unoriginal) to say that an `adequate explanation for [what he quaintly calls] glossogenetic phenomena must incorporate individual and collective levels of description, and show why they are necessarily related'. I'm skeptical about the likelihood that statistical analyses imported from other disciplines will solve this difficult problem. It's not that statistical explorations of relationships between frequency and lexical replacement and/or regularization, and in other areas of language change as well, should be ignored; they have produced very interesting results in a variety of domains. But errors of the sort that Fitch makes show clearly that new approaches to historical linguistic analysis will be successful only to the extent that they take into account the results of historical linguistic investigations over the past hundred and fifty years or so. Failing to learn something about a field one wishes to contribute to is all too likely to lead to reinvention of the wheel at best, and to a garbage in/garbage out problem at worst.

Posted by Sally Thomason at 10:00 PM

More gapless relatives

Mark Liberman has just posted about this instance of a gapless relative in (non-standard) English:

How can we provide a service that the consumer goes, "Wow, you really made this easier for me"?

As it turns out, non-standard English has (at least) three types of gapless relatives, two with pronouns instead of gaps, and the type above, with neither a gap nor a pronoun. 

The example above has a NP of the form

a service   that the consumer goes X

(where the head is italicized and the relative clause bolded).  There's no gap in the relative clause corresponding to the head, nor is there any pronoun performing that function.  The head picks out a type of thing (in this case, a service), and the relative clause gives us characterizing details about the particular instance of this type; the example above is roughly paraphrasable as "the sort of service such that the consumer goes X".

This is not an especially convincing example of a gapless relative, since it might be analyzed as merely missing a preposition, that is as a truncated version of

a service   that the consumer goes X about

(Recall our discussion of missing prepositions in a series of postings that began in 2005 and picked up again this year.)

But it's not hard to find other examples that don't submit so easily to this analysis.  Here, for example, is Peggy Noonan in the Wall Street Journal, as reported to the newsgroup sci.lang by Ron Hardin on 6/29/06 (italicization and bolding as above):

Frank Rich is running around with his antiwar screeds as if it's 1968 and he's an idealist with a beard, as opposed to what he is, a guy who if he pierced his ears gravy would come out.

A version with explicit anaphora to his ears is just as problematic as the original:

a guy who if he pierced his ears gravy would come out of them

The pronouns he and his in the subordinate clause if he pierced his ears aren't relevant here, as you can see from recastings with such that:

a guy such that if he pierced his ears gravy would come out (of them)

(which is clunky but standard English) and with a gapped relative:

a guy who if he pierced his ears ___ would have gravy coming out of them

(which is also standard).

One more, from radio station KFJC's Robert Emmett, on the Norman Bates Memorial Soundtrack Show of 3/12/05:

There are films that you are lucky that you don't have to sit through the whole thing.

I'll call these NoPro gapless relatives, to distinguish them from gapless relatives with "resumptive pronouns" in them.  Resumptives are pronouns that function in relative clauses much like gaps do in English.  They are incredibly common in the languages of the world; often they are just the ordinary personal pronouns put to this special purpose.  This is the case in non-standard English usage, as in the resumptive variant of the last ear-piercing relative above:

a guy who if he pierced his ears he would have gravy coming out of them

I'll call this sort of example, where the resumptive is in alternation with a gap, a ResPrince gapless relative -- Res for resumptive, Prince for Ellen Prince, who's studied them (see her 1990 article "Syntax and discourse: a look at resumptive pronouns", in BLS 16.482-97).  Another example, a paraphrase of Michael Moore speaking in his movie Fahrenheit 9/11, as discussed on the OutIL mailing list in October 2004:

Iraq, a country that it has never attacked us, that it has never threatened us...

The gapped variant is fine (and standard):

Iraq, a country that ___ has never attacked us, that ___ has never threatened us...

Still another example, from a NYT op-ed piece by Peter Guralnick on 8/11/07:

Or, as Jake Hess, the incomparable head singer for the Statesmen Quartet and one of Elvis's lifelong influences, pointed out: "Elvis was one of those artists, when he sang a song, he just seemed to live every word of it..."

This one has the extra complication that the relative clause is a "zero relative", with no relativizer, and also a "subject relative", in which the relativized element within the relative clause functions as the subject there.  Subject zero relatives like "There was a farmer had a dog" are non-standard (though they do occur).  Eliminating that non-standard feature still leaves us with a non-standard ResPrince case:

one of those artists who/that, when he sang a song, he just seemed to live every word of it

The gapped version is possible, and standard:

one of those artists who/that, when he sang a song, ___ just seemed to live every word of it

Prince wrote in e-mail at the time that in her 1990 article she

argued that [these resumptives] occur in either nonrestrictives or else (this kind of) restrictives with an indefinite head -- the two kinds of relatives where the relevant entity is evoked by the head alone. (I.e. where the hearer can retrieve or create the discourse entity as soon as the head is uttered, without having to wait for the relative clause, as one has to with definite head restrictives.)

She noted that such resumptives are perfectly standard in Yiddish, though they're non-standard in English.

The ResPrince examples differ from another use of resumptives in English -- to serve in place of gaps in positions from which "extraction" is barred, as in this example from an interview on NPR's Morning Edition on 3/19/07 (the interviewee is talking about Wal-Mart):

They have a billion dollars of inventory that they don't know where it is.

Extraction from inside subordinate interrogative clauses like where it is is generally barred (in the technical terminology of syntax, such clauses are islands for extraction):

*They have a billion dollars of inventory that they don't know where ___ is.

Resumptives are non-standard, but in such cases they're much better than their gapped counterparts, which people usually find incomprehensible, or at least very hard to comprehend.  I'll call this sort of example a ResIsland gapless relative.

A problem in analyzing the data: there are some examples that have a pronoun in the relative clause which is anaphoric to the head of the clause but where there can be some doubt that this pronoun is actually resumptive; these might really be (still more) NoPro cases rather than ResIsland cases.  Here's Holden Caulfield in The Catcher in the Rye, quoted in the New Yorker of 3/14/05, p. 132:

What really knocks me out is a book that, when you're all done reading it, you wish the author that wrote it was a terrific friend of yours and you could call him up on the phone whenever you felt like it.

There's a fair amount of irrelevant detail here.  In particular, the second conjunct in the coordinate object of wish is beside the point.  Also, the it in the subordinate when-clause is not the issue here.  Eliminating this stuff leaves:

a book that you wish the author that wrote it was a terrific friend of yours

The question is now what the status of the remaining it (referring to the book) is.  Is it a resumptive pronoun, or just an ordinary anaphoric pronoun?  Certainly the pronoun is not omissible:

*a book that you wish the author that wrote ___ was a terrific friend of yours

But is that because something has been extracted from inside a relative clause (another kind of island), or because pronoun arguments are not normally omissible in English, as in the example below?

Here's a recent book.  *I wrote.  [meaning 'I wrote it']

Maybe the it is incidental to the matter, as would be suggested by the fact that the relative clause (with head the author) that it's in can be removed, leaving us with a pretty clear NoPro case:

a book that you wish the author was a terrific friend of yours

So maybe the original Caulfield example is a NoPro case too.

I have a few more of these.

A final note: once again, non-standard usage in English reflects syntactic patterns that are standard in other languages.  NoPro relatives are like relative types in (among other languages) Japanese and Korean; ResPrince relatives are like a relative type in (at least) Yiddish; and ResIsland relatives involve resumptive pronouns of a very ordinary sort -- deployed in English to allow expression of meanings that can't be easily expressed by gapped relatives, which are subject to constraints on extraction, while pronouns are not.
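For readers who like their taxonomies spelled out, the three-way classification used in this posting reduces to two questions. Here is a minimal Python sketch; the enum, the function name, and the two boolean inputs are labels I'm inventing for illustration, and the inputs themselves are analyst's judgments (the Caulfield example shows how unclear the first one can be).

```python
from enum import Enum


class GaplessRelative(Enum):
    NO_PRO = "NoPro"            # neither a gap nor a resumptive pronoun
    RES_PRINCE = "ResPrince"    # resumptive pronoun alternating with a gap
    RES_ISLAND = "ResIsland"    # resumptive pronoun where a gap is barred


def classify_gapless(has_resumptive: bool, gap_allowed: bool) -> GaplessRelative:
    """Classify a gapless relative from two hand-made judgments.

    has_resumptive: is there a pronoun doing the work a gap would do?
    gap_allowed:    would the gapped counterpart be grammatical (no island)?
    """
    if not has_resumptive:
        return GaplessRelative.NO_PRO
    if gap_allowed:
        return GaplessRelative.RES_PRINCE
    return GaplessRelative.RES_ISLAND


# "a service that the consumer goes X"
print(classify_gapless(False, False).value)  # NoPro
# "a country that it has never attacked us"
print(classify_gapless(True, True).value)    # ResPrince
# "inventory that they don't know where it is"
print(classify_gapless(True, False).value)   # ResIsland
```

The sketch deliberately ignores gap_allowed when there is no resumptive pronoun, since a NoPro relative has no position for a gap to occupy in the first place.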

Non-standard (and regional and social) varieties are languages, period, and can be expected to exhibit phenomena found in varieties (standard or non-standard) of other languages.  As a result of this fact, studying a range of non-standard (etc.) varieties of one language can provide a rich set of data -- not unlike those derived from typological studies of the ordinary sort and fieldwork on little-known languages -- about what languages can be like.  This is a commonplace among linguists, but it's not always appreciated by other people.

Posted by Arnold Zwicky at 04:02 PM

Fitchifying the History of Linguistics

A few days ago Mark applauded the publication in Nature of several articles on linguistics, including one by W. Tecumseh Fitch, `An invisible hand' (a News and Views essay surveying the other two articles, which are quantitative studies of lexical change). Mark's right, of course, that it's good to see linguistics featured so prominently in such a prominent journal. But as far as Fitch's article is concerned, linguistics might have been better served by non-publication.

Fitch's view of the history of linguistics is distorted, and so is his understanding of language change. There are other oddities in his article -- his Just-So-Story explanation of the slide into pejoration of some English terms for women (hussy, wench, etc.), for instance -- but the main problems are historical. The misunderstandings are worth discussing because they aren't confined to Fitch, and because the real stories are instructive. Addressing the two topics will take some space, so I'll divide the discussions into two posts, starting here with the history of linguistics.

Taking August Schleicher and Jacob Grimm as his prime examples of 19th-century historical linguists, Fitch presents the intellectual transition from historical to synchronic linguistics this way:

Unfortunately, many historical linguists entertained quasi-mystical ideas: August Schleicher...believed that languages are living things, and Jacob Grimm posited a Sprachgeist -- an internal spirit of a language driving it to change along certain lines. Twentieth-century linguists rejected such fanciful notions, and emphasized the capacity of individuals to produce and understand utterances. Noam Chomsky famously characterized this as a conceptual shift from a historical preoccupation with `E-language' (a set of externalized utterances) to an emphasis on `I-language' (principles internalized by the language learner).

That is, according to Fitch, historical linguistics failed because its main practitioners had nutty ideas, so that it had to be replaced by the truly scientific linguistics of Noam Chomsky and his followers. What actually happened is more like this: modern synchronic linguistics got its start early in the 20th century and developed steadily for the next 50+ years, becoming increasingly widely practiced (and more fashionable), until the publication of Chomsky's Syntactic Structures in 1957 launched the meteoric rise to fame and glory of generative grammar. Historical linguistics certainly stopped being the main game in town, even before 1957; but as a field it's still going strong, building on a very solid late-19th-century foundation. It is one of the most successful historical sciences you'll find anywhere.

Both Schleicher and Grimm were major contributors to the development of 19th-century historical linguistics, and their occasional flights of fancy are far outweighed by their substantive contributions. But -- and here's where Fitch's account begins to go off the rails -- both of them made their contributions before the great breakthrough that still ranks as one of the major achievements in the entire history of linguistics, and for that matter in historical science, period: the Neogrammarians' formulation of the regularity hypothesis of sound change, which led directly to the development of the Comparative Method, through which "genetic" relationships among languages (via descent with modification from a single ancestral language) are established and sizable chunks of long-vanished parent languages are reconstructed. Schleicher and Grimm were not Neogrammarians; the achievements of the Neogrammarians were not fanciful; and those achievements were neither replaced nor superseded by synchronic linguistics. Fitch might have realized this if he had not imagined that `[t]he crowning achievement of these early linguists [read: historical linguists] was a family tree of languages'. Schleicher is famous for his family tree, but it was the Neogrammarians, not Schleicher, who turned historical linguistics into a historical science by developing a rigorous and spectacularly successful set of methodological principles. And in any case, it was not intellectual flaws in late-19th-century historical linguistics that led to the rise of synchronic linguistics.

The actual story is more interesting. Ferdinand de Saussure is often, and with justice, called the father of structural linguistics: it was Saussure's early 20th-century work that inspired the emergence of a new synchronic science of language. His central idea (quoting from Wikipedia) was that `language may be analyzed as a formal system...apart from the messy dialectics of real-time production and comprehension'. In other words, Saussure emphasized the importance of system -- structure -- in the study of synchronic language states. But long before he founded structural linguistics (and foreshadowed the distinction between E-language and I-language), Saussure was an Indo-Europeanist. He began his studies in Leipzig in 1876, the very place and year of the Neogrammarian manifesto, as expressed by the great Slavist August Leskien:

Die Lautgesetze kennen keine Ausnahmen!

(in English: Sound laws know no exceptions!)

In 1879, still a student, Saussure published his famous Mémoire sur le système primitif des voyelles dans les langues indo-européennes (Thesis on the primitive system of vowels in the Indo-European languages). This is the initial proposal of the theory that later came to be known as Laryngeal Theory. The significance of Saussure's proposal (formulated when he was all of about 20 years old!) is hardly confined to Indo-European (IE) linguistics. It was in fact the first major structural analysis of a language in Western linguistics -- the language, in this case, being Proto-Indo-European, the reconstructed ancestor of the many IE languages. Saussure took the extraordinarily messy and numerous patterns of IE vowel alternations, the so-called ablaut alternations, and reconstructed a much neater and simpler system for PIE by hypothesizing the existence of three consonants that had not survived into any of the then-known IE languages, ancient or modern. He could not have done this if he had not started with a profoundly structural notion of the language system; and the structural notion itself, though it was developed in a dramatic way by Saussure, was prefigured by the Neogrammarian breakthrough, the regularity hypothesis of sound change -- which also makes sense only if language is viewed as inherently systematic.

Saussure's Laryngeal Theory at first met with considerable skepticism from historical linguists. His approach was novel, to put it mildly, and the idea of reconstructing unknown, unattested consonants did not appeal to traditionalists. Acceptance came slowly over the next fifty years or so, aided especially by the decipherment of Hittite, which turned out to have retained some of the consonants that Saussure had reconstructed solely on structural grounds. The triumph of Laryngeal Theory also owed much to the fact that it turned out to be fruitful, permitting the explanation of alternations in morphemes and paradigms that had been wholly mysterious under pre-Laryngeal Theory reconstructions of PIE vowels.

The moral: Far from representing a sharp break with a misguided and inferior past, as Fitch would have it, the rise of synchronic linguistics was an outgrowth of the achievements of the late-19th-century Neogrammarians. Saussure learned Neogrammarian principles as a student at Leipzig; he applied the Neogrammarian structural concept to an essentially synchronic analysis of Proto-Indo-European structure; and thereafter he continued to develop his structural ideas until, in the last decade of his life, he earned his status as the father of structural linguistics.

Posted by Sally Thomason at 01:15 PM

He knows you won't call him on it

John Wells writes from London to point out that The Guardian (October 13, page 32; link here) publishes these words of advice on ending business letters (in a piece by John Harris, but apparently quoted from Ben Harris, the editor of The Oxford Guide to Effective Speaking and Writing):

Finishing: useful tips: If you are expecting a response, use the present continuous form of the verb in the last paragraph, ie: "I look forward to hearing your response". If, however, you want to bring the correspondence to a close, let your grammar reflect this: "We regret we cannot assist you further with your query".

Why he believes the exact form chosen from the verb paradigm would influence the likelihood of a letter getting a response I have no idea, but one thing is clear: the person giving this grammar-based advice cannot tell which verbs are in the present tense (continuous or otherwise) and which are not. It is quite astonishing to see that someone whose job involves knowing about grammar and usage and style and telling others about it should be incapable of distinguishing tensed verbs from non-tensed or telling one construction from another, but the evidence that he doesn't know what he's talking about grammatically is clear.

What is meant by "the present continuous form of the verb" here, I think, must be the progressive aspect construction, as in I am waiting. But the example given does not contain an instance of that construction. I look forward to hearing your response has the verb look in the present tense followed by forward and a preposition phrase with the head to. The complement of the preposition to is a subjectless gerund-participial clause, hearing your response. The verb hear is not in the present tense (it is a participle, and thus shows no tense), and it does not form part of a progressive aspect ("present continuous") construction.

My claim that there is no progressive here is not just some kind of intuition about meaning; it can be supported by independent syntactic evidence. Notice that *I am knowing the answer is ungrammatical, or at least extremely hard to imagine being naturally used, because a purely stative verb like know does not normally occur in the progressive; but I look forward to knowing the answer by tomorrow night is fine, because it is not an instance of the progressive.

I don't think there is anything else Harris could have meant to refer to other than progressive aspect.

In A Comprehensive Grammar of the English Language by Lord Quirk and his colleagues (Longman, 1985) the index entry for "continuous aspect" just contains an arrow pointing to "progressive".

The term continuative (not "continuous") is used in The Cambridge Grammar of the English Language (CUP, 2002) for an aspect involving continuation of a situation through a time period and up to the point of orientation, as in It must have been here for a thousand years as opposed to It must have been awful to witness that; but this has nothing to do with the example under discussion.

Henry Sweet, in his New English Grammar, Part I (Clarendon Press, Oxford, 1891, p. 102) does use the term "continuous", but only to refer to a sense of the present tense involving habituality (He lives in the country) rather than recurrency (He goes to Germany twice a year), and again this has nothing to do with the case at hand.

Harris cannot possibly mean that look is the recommended form for the purpose of getting people to write back, because the example he uses for contrast (We regret we cannot assist you further with your query) uses exactly the same form in the main clause, the plain present regret and cannot. There is no potentially relevant contrast between the grammar of the two examples except that the first contains a gerund-participle.

What is going on when the editor of a book on how to write offers grammar advice and opinion in a leading English-language newspaper and gets the relevant grammar utterly wrong?

The first thing to note is of course that the state of grammar knowledge among today's intellectuals is at a low ebb, and linguists must bear some of the blame for not having had enough influence to improve things over the past century.

But the other factor, it seems to me, is that Harris knew he didn't have to haul down The Cambridge Grammar of the English Language or any other reference book, because he knew you wouldn't call him on it. He thought no one would call him on it.

These days, if you make a confident assertion about language and include some technical term like "present continuous form of the verb", people just assume you're an expert and get out of the way. He didn't have to look anything up, because he thought you wouldn't, and no reader of The Guardian would, and he could count on that.

He underestimated us members of the public just a little: he didn't get it past John Wells, and thanks to John's alert, he didn't slip it past Language Log.

Posted by Geoffrey K. Pullum at 11:30 AM

Ask Language Log: gapless relatives

According to Louise Story, "The New Advertising Outlet: Your Life", NYT 10/14/2007:

“We want to find a way to enhance the experience and services, rather than looking for a way to interrupt people from getting to where they want to go,” said Stefan Olander, global director for brand connections at Nike. “How can we provide a service that the consumer goes, ‘Wow, you really made this easier for me’?” [emphasis added]

Lou Hevly asks:

For me this "that" should be "so that", "for which" (a bit stilted) or perhaps "where". My question is whether you believe that the slip was inadvertent or whether this kind of construction has become common.

I'm afraid that I find myself in the position of the pediatrician who amused me, one December day long ago, by diagnosing an infant's rash as "winter eczema", and providing a small sample of an over-the-counter ointment.  In fact, I can't do as well, because at least the doctor translated the symptom into Greek, whereas all I can offer is a transparently all-English diagnosis: "gapless relative clause". And I don't have any ointment to offer.

Even worse, the pediatrician was able to assure me confidently that winter eczema has become very common, due to the dryness caused by central heating, while in contrast, I don't have any idea whether gapless relative clauses have gotten commoner (or less common) in English over the past few decades or centuries.

On the other hand, the doc got paid.

To clarify the terminology: a standard English relative clause includes a "gap" that is semantically connected to its head. Here are two examples from the same NYT article, with the gaps made explicit:

In the 1980s, Nike began the large television campaigns that __ propelled the brand to global fame.
That’s the world we’re all afraid of __.

But there's no reason in principle that a clause without a gap shouldn't be used in a semantically similar way (although English discourages this):

... the smell that Kim was making __ by frying onions ...
... the smell that Kim was frying onions ...

I can add a few other relevant nuggets of information:

Standard English allows a somewhat similar function in clauses introduced by such that -- thus from Columbia University's Guidelines for Review of Misconduct:

For these reasons it is essential that the Vice President with the assistance of the department chairperson or institute or center director and any ad hoc committee created to conduct an inquiry or an investigation foster an attitude such that the accuser is treated fairly and reasonably.

Gapless relatives are found in standard versions of some other languages, notably Korean and Japanese (see e.g. Jong-Yul Cha, "Relative Clause or Noun Complement Clause: Some Diagnoses").

The gapless relatives with that are conceptually similar to what is sometimes called "linking which" (see e.g. "Linking 'which' in Patrick O'Brian", 11/14/2003).

And some people find that Introduction to Syntax (available without a prescription) is as effective as cortisone ointment in reducing the irritation.

Posted by Mark Liberman at 08:02 AM

October 13, 2007

Regional speech rates

In a recent post, Arnold Zwicky mentioned that

An NPR Morning Edition story this morning by Melanie Peebles about gubernatorial candidate Bobby Jindal speaking in the little town of Gramercy, Louisiana, refers to the "locals, who tend to draw out vowels in a speech pattern born of front-porch sitting".

Arnold quoted Charlie Doyle's ironic retort that "northerners talk fast because they sit uncomfortably on those little stoops", and noted that "non-linguists tend to hold to a folk belief that differences between varieties, including geographical and social dialects, have a deep explanation".

There's something else going on here, too, which is that people hear what they expect to hear -- patterns that correspond to their stereotypes, but may not exist as mere behavioral facts.

There are two different stereotypes at work (or play) in Arnold's example -- rural people talk slow compared to city people, and southerners talk slow compared to northerners. Put it all together, and you've got some seriously lazy-mouthed southern hicks.

Or maybe not.

For this morning's Breakfast Experiment™, I did a quick scan of time-aligned transcripts from a published speech corpus, which I had previously used to compare male and female talkativeness and speech rates.

In one section of this corpus, there were 2,329 speakers recorded in Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, South Carolina, Tennessee, Texas and Virginia -- I took these as "southern" speakers. There were 2,397 from New York and Pennsylvania -- I took these as "northern" speakers.

The mean speaking rate of the "southern" speakers was 173.9 words per minute.

The mean speaking rate of the "northern" speakers was 173.5 words per minute.

In case you think that there might be some crucial information hiding in the rest of the distribution, here's a comparison of the percentiles:

If we look at individual states, we find that the 185 people recorded in my home state of Connecticut averaged 174.9 wpm, beating the 107 slow-talking folk from Alabama, who averaged a mere 170.0. However, the 88 citizens of Louisiana who were represented in this collection crossed the wire at a blistering average of 178.1 wpm -- and this was in 2003, way before Katrina washed away their front porches.
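A calculation of this kind can be sketched in a few lines of Python. The post does not say exactly how the rates were computed, so everything here is an assumption for illustration: each speaker's transcript is taken to be a list of (start, end, text) segments with times in seconds, a speaker's rate is total words divided by total segment duration, and the group figure is the mean of per-speaker rates. The function names and the toy data are invented.

```python
from statistics import mean, quantiles


def words_per_minute(segments):
    """Speaking rate for one speaker.

    segments: list of (start_sec, end_sec, text) tuples (assumed format).
    Words are counted by whitespace splitting; time is summed segment duration.
    """
    words = sum(len(text.split()) for _, _, text in segments)
    seconds = sum(end - start for start, end, _ in segments)
    return 60.0 * words / seconds


def group_summary(speakers):
    """Mean and quartiles of per-speaker rates.

    speakers: dict mapping a speaker id to that speaker's segments.
    Needs at least two speakers, because quantiles() requires two data points.
    """
    rates = [words_per_minute(segs) for segs in speakers.values()]
    return {
        "n": len(rates),
        "mean_wpm": mean(rates),
        "quartiles": quantiles(rates, n=4),
    }


# Invented toy data, not taken from any corpus.
toy_group = {
    "spk1": [(0.0, 3.0, "a b c d e"), (5.0, 8.0, "f g h i j")],
    "spk2": [(0.0, 3.0, "one two three four five six seven eight nine ten")],
}
summary = group_summary(toy_group)
```

Running the same summary on a "southern" and a "northern" dictionary of speakers would give the two means and the percentile comparison side by side.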

Seriously, stereotypes aside, there's no indication of any meaningful group differences in any of this stuff. I haven't seen any credible measurements from other sources suggesting a different answer, either. Now maybe the differences come out in different sorts of interactions, or with different ways of calculating speech rate, or at a different phase of the moon -- I don't know. (If you have some evidence about this, please let me know.)

Meanwhile, pending any evidence to the contrary, I'm putting the slow-talking southern hicks into the same category of mythology as the gabby women.

But just to show that I'm a regular guy, I'll join in the ethnic-stereotyping fun with an anecdote and a joke.

These don't exactly deal with speech rate, but they do address the idea that some individuals or groups might be in a bigger hurry than others.

First the (true) story. Once I was in a grocery store in Austin, TX, waiting in the checkout line. At the front of the line, an elderly woman who had bought a quart of milk was engaged in a long, long conversation with the checkout clerk. I missed the beginning of it, and couldn't hear it very clearly anyhow, but it was something about her nephew, and some neighbor's dog, and someone taking a trip, and her latest medical procedures. It went on for what seemed like hours -- I suppose it must have been ten or fifteen minutes -- and I was tired, and annoyed at having to wait in line for this.

After the old lady finally toddled out, the next woman in line asked a question that perfectly expressed my exasperation: "What was *that* all about?" But she was innocent of irony. The clerk responded "well, you won't believe this, but ..." and they went through the whole thing again, with enthusiastic footnotes and commentary, for another fifteen minutes.

Now the joke. A guy from the city is taking a Sunday drive in the country. As he passes an orchard, he sees a farmer standing under an apple tree near the road, holding up a small pig who is eating apples off the tree. Amazed, he pulls over, gets out, and asks the farmer what in the world he's doing.

The farmer says "well, Petunia here can't reach the apples by herself, so I'm giving her a little help."

The city slicker can't believe what he's hearing. "Isn't that an amazing waste of time?"

The farmer responds, puzzled: "What's time to a pig?"

Posted by Mark Liberman at 07:54 AM

October 12, 2007

Not pregnant under-age southern girl seeks lawyer with view to marriage

The following AP story is getting lots of press:

Ark. Judge Upholds Marriage Law Error
By ANDREW DeMILLO – 1 day ago

LITTLE ROCK (AP) — An error in a new law that allows Arkansans of any age — even toddlers — to marry with parental consent must be fixed by lawmakers, not an independent commission authorized to correct typos, a judge ruled Wednesday.

The law, which took effect July 31, was intended to establish 18 as the minimum age to marry while also allowing pregnant minors to marry with parental consent. An extraneous "not" in the bill, however, allows anyone who is not pregnant to marry at any age if the parents allow it.

You can tell how passionately the AP cares about language issues by the fact that the negation bit is right up there in the second paragraph. And CNN, who care even more about sloppy language, run this same story (without further editing, at a first glance) as Misplaced 'not' in Arkansas law allows babies to marry.

Well, golly gee, I care about sloppy language too. But I care about sloppy thinking more. And here I think the "misplaced" negation is not the problem. The "misplaced" negation is merely symptomatic of a deeper problem, namely the fact that a really idiotic piece of legislation was obviously drafted by a pre-law undergrad in the midst of a heavy night's drinking, and passed by the legislature before sunrise so everyone could get some shut-eye. This AP/CNN story should not be about sloppy language, but about sloppy thinking, though the CNN headline itself suffers from seriously sloppy language.

The reason I say that the negation is the symptom rather than the problem is that the law seems equally odd with or without it. The  law in question goes by the proud name of Act 441 of 2007, Section 1, and was passed in March.  Here are the relevant parts:

(b)(1) In order for a person who is younger than eighteen (18) years of age and who is not pregnant to obtain a marriage license, the person must provide the county clerk with evidence of parental consent to the marriage.

(2) The county clerk may issue a marriage license to a person who is younger than eighteen (18) years of age and who is not pregnant after the county clerk receives satisfactory evidence of parental consent to the marriage under subsection (c) of this section.

So yeah, as you see, the law says that the County Clerk may issue a license to a minor who is not pregnant. But to say that this venerable act suffers from a "misplaced not" doesn't quite hit the mark. First, CNN's use of the term "misplaced" is itself misplaced. It suggests that the negation should have been somewhere else, as in e.g.

"The county clerk may not issue a marriage license to a person who is younger than eighteen (18) years of age and who is pregnant after the county clerk receives satisfactory evidence of parental consent"?

Or "The county clerk may issue a marriage license to a person who is not younger than eighteen (18) years of age and who is pregnant after the county clerk receives satisfactory evidence of parental consent"?

Or "The county clerk may issue a marriage license to a person who is younger than eighteen (18) years of age and who is pregnant after the county clerk does not receive satisfactory evidence of parental consent"?

It's hardly credible that moving the negation somewhere else in the statute would succeed in producing a piece of law that could conceivably have represented the will of the people. Then again, the people elected the bozos who passed the law, so who knows.

Next, shouldn't CNN say "misplaced negations", plural, since the phrase not pregnant appears in both clauses? That is, to the extent that someone misplaced a negation, they did so on at least two occasions. Well, ok, another very minor point. Let us allow that "misplaced" isn't quite right, and "negation" (singular) also isn't quite right, and that what CNN really meant was a pluralized version of what AP put in the article: "extraneous negations" rather than "misplaced not." We shouldn't look at moving the negations, but at removing them. However, removing them, while it would limit the application of the law so that e.g. 8-year-old boys could not marry, would scarcely produce good law. The statute would then imply that, given parental approval, a 12-year-old girl could be married provided she was first inseminated.
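The difference between the two wordings can be made concrete with a small Python sketch. It models only the literal condition in subsection (b)(2) as quoted above, with and without the word "not"; it ignores the rest of the Arkansas Code and is in no way a legal interpretation. The function names are invented for illustration.

```python
def covered_as_passed(age, pregnant, parental_consent):
    """Literal reading of (b)(2) as enacted: under 18, NOT pregnant, consent shown."""
    return age < 18 and (not pregnant) and parental_consent


def covered_negation_removed(age, pregnant, parental_consent):
    """Same clause with "not" deleted: under 18, pregnant, consent shown."""
    return age < 18 and pregnant and parental_consent


# An 8-year-old boy with parental consent.
boy_as_passed = covered_as_passed(8, False, True)
boy_negation_removed = covered_negation_removed(8, False, True)

# A pregnant 12-year-old girl with parental consent.
girl_negation_removed = covered_negation_removed(12, True, True)
```

On this literal reading the 8-year-old is covered by the clause as passed but not by the negation-free version, while the negation-free version does cover the pregnant 12-year-old.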

Or would it? That would appear to be a matter of legal interpretation, and goes beyond what a linguist can pronounce on. But you don't need to take my word for it, since in this Arkansas Government website report, the Arkansas Attorney General provides a relevant opinion. First, he gives some context for the new law. In part, it appears to replace Arkansas Code section 9-11-102, which the Attorney General reports as saying:

(a) Every male who has arrived at the full age of seventeen (17) years and every female who has arrived at the full age of sixteen (16) years shall be capable in law of contracting marriage.

(b)(1) However, males and females under the age of eighteen (18) years shall furnish the clerk, before the marriage license can be issued, satisfactory evidence of the consent of the parent or parents or guardian to the marriage.

In fairness to the drafters of the legislation, and I use the plural here on the assumption that a large group was partying that night, I doubt that they intended their act to wipe out 9-11-102. Rather, I guess they must have wanted to strengthen it, perhaps by limiting its application to non-pregnant minors. Their intention was then to leave (a), immediately above, as it stands.  However, the Attorney General concludes:

Act 441 of 2007 indeed appears to amend A.C.A. § 9-11-102 to place no limit on the age at which parties may obtain a marriage license with parental consent. In my opinion, the clerk must issue the license if the statutory requirements, including parental consent, are met.

But in that case, the problem is not just one of misplaced negation. The issue of whether there is a negation before each occurrence of the word pregnant in Act 441 cannot bear in any obvious way on whether the entire new statute is intended to strengthen the earlier 9-11-102, or whether it is intended to replace parts of it.

As it happens, the Arkansas Code Revision Commission did in fact attempt to amend Act 441 by removing the negations. This was the attempt that was rebuffed this week in the court decision that brought this whole legislative potboiler to the AP reporter's attention. But even if the Commission had succeeded, based on the Attorney General's interpretation of Act 441, it must be the case that the variant non-negated Act 441 would still have allowed a 12-year-old to get married. Provided she was ovulating and slept around enough.

Can that possibly have been the intent of the Arkansas legislature? Don't look at me... I don't know the answer to that question. But an apparently thorough and well-informed article in Arkansas Online suggests that what the legislature intended was to ban marriage of 16- and 17-year-olds who are not pregnant, and allow marriage of 16- and 17-year-olds only if they are pregnant. And if the Arkansas Attorney General's opinion is to be trusted (who better?), then no amount of adding or taking away negations in Act 441 would have yielded a law that satisfied this intent.

Why then, did the Arkansas Code Revision Commission attempt to fix things by removing the negations? Were they drinking from the same still as the original drafters of the bill? Not necessarily: the Commission is hampered by the fact that its powers extend only to correcting typos. So they had no choice but to treat what was wrong with Act 441 as involving a minor typographical error. Taking out the negations, a farcical and half-assed solution at best, was all they could hope to do given their limited jurisdiction, and even this turned out to be beyond them. However, a side-effect of the Commission's attempt at patching up an awful piece of legislation was that the press came away thinking that the ugly ramifications of Act 441 really did result from a minor typographical error. A great story, perhaps, but I just don't buy it.

Both Act 441 as passed, and the variant Act 441 with the negations removed, are strange pieces of legislation. I'm reminded of an earlier Language Log theme which began with Mark Liberman's discussion of whether Derrida can be "even wrong".

Mark told us about a parlor game in which you have to guess which of two alternatives, one containing a bogus negation, is a real Derrida quote. The simple point was that if you can't tell which is right out of the negated and non-negated variant, then perhaps the original doesn't make sense. Same thing goes here. It's scary when those putting a statute on the books don't notice the difference between a version with a negation and a version without. You gotta ask: if extra negations pass unnoticed, can Act 441 even be wrong?

Act 441 is not wrong, it's idiotic.

Posted by David Beaver at 07:31 PM

Manifestly as a term of art

Two esteemed law professors have written to enlighten me about my recent post discussing the use of "manifestly" by the lawyer for Senator Larry Craig (Billy Martin) and the judge who refused to withdraw Craig's plea of guilty in that now famous incident at the men's room of the  Minneapolis airport. I suspect I'll be hearing still more from the law profession and my headache will get even worse.

A friend, Professor Janet Ainsworth, writes:

The reason both lawyers were using the term, "manifest injustice," is that it is indeed the legal term of art appropriate to the resolution of the motion to withdraw the guilty plea. If someone enters a guilty plea but seeks to withdraw it before sentencing, the appropriate legal standard needed to justify the withdrawal is "good cause," which can be satisfied if the plea bargain was not lived up to by the prosecutor or if the defendant overlooked a valid defense to the charge that his lawyer couldn't have reasonably discovered at the time of the plea entry, newly discovered evidence casting doubt on the guilt of the defendant, for example.

After sentencing, however, the burden on the defendant to justify a plea withdrawal is higher than before sentencing. That higher standard is described as the necessity to show that it would be a "manifest injustice" to fail to allow him to withdraw the plea. Now it isn't good enough to merely show an overlooked defense; you have to show that the plea was constitutionally defective in some way, the product of coercion, or without appropriate recitation of the rights being waived by entering the plea. I suppose the reason for the higher standard post-sentencing is that we think that mere "buyer's remorse" is more likely an explanation of a change of heart after sentencing than before and that finality in judgment is a more significant factor once the entire matter is disposed of than earlier in the process.

I appreciate that weasel words are often used in the course of legal argument, but this isn't an example of one of them. Terms like "manifest injustice," "good cause," and "plain error," etc. are labels used to articulate and distinguish the particular burden needed to sustain a party's motion. In that sense, they are no more objectionable than any other term of art used by experts in a domain, like "set" used by mathematicians or "command" used by syntacticians.

So here I learn that "manifest injustice" is an acceptable and necessary term of art in law used by both the defense and the judge, and that mathematicians and linguists use terms of art too. I think I understand this. But another law professor, Jonathan Weinberg, read my post and wrote:

For what it's worth, Martin and the judge were using "manifestly" in different senses. The judge was applying a legal rule under which a defendant can recant his plea only in situations where requiring the defendant to stick to his plea would probably lead to injustice, but things aren't really clear and there's room for argument either way. Then the defendant is stuck with the plea. That's because the importance of finality in criminal adjudication trumps any (arguable, unclear) injustice. "Manifest" has important content here.

Martin, on the other hand, is just using the word in a Garner-ian sense, as an all-purpose "maybe if I throw this word in a sentence, it will sound more convincing" as an intensifier.

Okay. If I read these responses accurately (I'm sure they'll tell me if I haven't), we seem to have two somewhat different views of the use of this term of art. Professor Ainsworth says both the judge and Craig's lawyer were using it more or less properly. Professor Weinberg votes for the judge's proper use but is a bit skeptical about Martin's.

My headache isn't getting a whole lot better.

Posted by Roger Shuy at 06:24 PM

Louisiana vowels

An NPR Morning Edition story this morning by Melanie Peebles about gubernatorial candidate Bobby Jindal speaking in the little town of Gramercy, Louisiana, refers to the

... locals, who tend to draw out vowels in a speech pattern born of front-porch sitting

Ah, the slow pace of rural life made audible.

When I posted about this to the American Dialect Society mailing list, Charlie Doyle quipped:

And northerners talk fast because they sit uncomfortably on those little stoops.

The larger point here is that non-linguists tend to hold to a folk belief that differences between varieties, including geographical and social dialects, have a deep explanation.

So ordinary people are inclined to ask linguists why questions, like:

Why do Southerners speak with a "drawl"?

Why do some Middle Westerners say needs washed rather than needs washing?

Why do many Western Pennsylvania speakers say gum band instead of rubber band?

Why do kids introduce quotations with be like rather than say?

Why do working-class New Yorkers say dese and dose instead of these and those?

The answers that linguists give are rarely fully satisfying to the questioners.  Mostly, we explain the history of a variant, if we know it or can find it out, and we appeal to general mechanisms of change -- of sound change, syntactic change, semantic change, borrowing, lexical innovation, and so on.  So we say that the construction in needs washed is just a continuation of a pattern in the speech of Scots-Irish settlers in the U.S.  When pressed further, we explain that the construction makes syntactic sense: the subject of needs washed is understood as the object of the verb WASH, so the semantics here is a lot like the semantics of the passive, and we use the past participle form (washed) in the passive, so why not use it here?

At this point, our questioner is likely to say that that's all fine and good, but why did the Scots-Irish, and not other people, innovate this variant?   Why did only certain groups "simplify" the pronunciation of these to dese?  Why did Western Pennsylvania speakers of German descent carry over the German Gummiband as gum band, while abandoning most other items of German?  Why did people start using be like to introduce quotations only in the last century, if the construction is so natural and useful?

Any decent answer to these more detailed questions -- all of them questions about particular events, involving particular bits of language and particular people, at particular times and in particular places -- is going to involve some appeal to randomness or chance along the way, and most non-linguists just hate the idea that so many things might happen in language for no good reason.  (Actually, a great many people just hate the idea that ANYTHING happens for no good reason; there's a reason for everything, they say.)  So they take an appeal to randomness as a confession of failure on the part of linguists.  And they cast about for other reasons: anatomy, the weather, geography, social customs (like front-porch sitting), group character (like the toughness of working-class New Yorkers), whatever.

People can be inventive.  Several times when I've noted, in the phonetics and sociolinguistics sections of introductory linguistics courses, the unrounded (indeed, often spread-lipped) variant of the vowel in good for many California speakers, especially young speakers along the SoCal coast -- a nice way of making unrounded high back vowels real for students who might not hear the actual vowel sounds of Japanese very well but who are probably familiar with the SoCal good vowel through its association with the "surfer dude" stereotype -- my students have offered their explanation for the phenomenon: the SoCal speakers smile a whole lot when they talk, and the SoCal good vowel is what you get when you smile while producing the vowel.  My students largely remain unmoved by my objection that the vowels of dude and load are not similarly unrounded for these speakers, who mostly shift those vowels towards the front while preserving rounding in one way or another.  Or by my objection that there are plenty of other groups whose members are stereotyped as smiling an awful lot -- women, for example -- but who don't in general show the unrounding.  There just has to be an explanation; it couldn't be by chance.

Posted by Arnold Zwicky at 03:23 PM

Flexible lawyerspeak: manifestly

Recently I noticed a phrase used in Senator Larry Craig's failed attempt to have his guilty plea reversed by the court. The CNN report points out that Billy Martin, Craig's lawyer, argued that it was "manifestly unjust" to deny Senator Craig's request to quash his guilty plea.

Now most of us think we know what "manifestly" means in its adjectival and adverbial forms. It's probably along the lines of the definition in  Webster's New Collegiate Dictionary:

Manifest adj.  1. readily perceived by the senses and esp. by the sight  2. easily understood or recognized by the mind: obvious  syn see evident--manifestly adv

The judge countered that there was "no manifest injustice" here, adding that Craig was a career politician with a college education and was of at least above-average intelligence. He ruled that the plea would not be withdrawn since it was accurate, voluntary and uninfluenced by the police interview. In other words, to him and most who read it, Craig's plea could be said to be manifestly clear. It appears that attorney Martin meant that a judgment against Craig would be manifestly unjust based on his manifestly clear admission of guilt. Or something like that.

I bring up "manifest" here because some 45 years ago David Mellinkoff, in his historic The Language of the Law, listed over a hundred words and expressions used by lawyers because they are so usefully "flexible." Other critics call the items on Mellinkoff's list "equivocal expressions" or "weasel words." His list includes "manifest" on page 22.

We probably can't expect lawyers to use words the way the rest of us do, but when terms like "manifestly" pop up, I sometimes search for enlightenment about them by going to Brian Garner's Dictionary of Modern Legal Usage. Here's what he has to say about "manifest."

Manifest, adj., often functions in suspect ways in legal writing: "Someone has observed that whenever a lawyer says something or other was the manifest intention of a man, 'manifest' means that the man never really had such an intention." Jerome Frank, Law and the Modern Mind 30 (1930); repr. 1963. This word is one of those vague terms by which lawyers create an appearance of continuity, uniformity, and definiteness [that does] not in fact exist. Id.

If we are to follow Garner's definition, we might ask if both Martin and the judge in this case were just toying with one of those suspicious and "flexible lawyer words" that create the appearance of something that does not in fact exist. If Garner is right here, both Craig's lawyer and the judge meant exactly the opposite of what we are led to believe they were trying to mean. This would seem to tell us that Craig's lawyer was actually admitting that the plea was fair and just while the judge was saying that it wasn't.

This is giving me a headache. I think I'll go lie down a bit.

Posted by Roger Shuy at 01:02 PM

The direct responsibility of me

When British prime minister Gordon Brown was asked if he was personally responsible for calling off the election he had been widely rumored to be planning, he told the press: "Anything that happens in Downing Street is the direct responsibility of me and I will always take that full responsibility myself." Very unusual wording there: "the direct responsibility of me".

It would have sounded so much more natural for him to assign the responsibility to the office, and say: "Anything that happens in Downing Street is the direct responsibility of the prime minister." What makes "of me" so unusual is that there is a monosyllabic genitive form of the first person singular pronoun, namely my, so normally people will say "my responsibility" or "my spouse" rather than "the responsibility of me" or "the spouse of me". Perhaps it was a planning screw-up: he first embarked on "Anything that happens in Downing Street is the direct responsibility of the prime minister" and then decided on a mid-course correction from "the prime minister" — the third-person reference to himself might have sounded pompous — and changed the last noun phrase to "me". What he ended up with sounded strangely inept. An ill-planned sentence to end an ill-planned week.

I say it was an ill-planned week because the British press and the opposition in parliament have absolutely jumped all over Brown about that election he never said he would call. They've been accusing him of "bottling" it. The verb, which is unfamiliar to me in this sense (though I know the phrase "a lot of bottle", meaning "a great deal of courage and aggressive attitude", which is quite the opposite of what they're saying Gordon Brown has), seems to be a terrible accusation of cowardice. Conservative Party activists have been dressing up as bottles with Brown-like faces to taunt him. A terrible week, and a sentence-planning fiasco to cap it all, poor man.

[Update: Several people have pointed out to me that the phrase the boss of me is becoming familiar. Among other things, You're not the boss of me is the title of a song by They Might Be Giants, and the phrase is starting to spread via blogs like this one and this one.]

Posted by Geoffrey K. Pullum at 11:49 AM

Berzerkistani phonology

In case you missed it earlier this week: when Duke is told in this Doonesbury strip that the name of the President-for-Life of Greater Berzerkistan is Trff Bmzklfrpz, he repeats the surname back in puzzlement: "Bmzklfrpz?" The Berzerkistani official who is escorting him to see the President-for-Life explains: "It's pronounced 'Ptklm'." Scholars of the Berzerkistanian language family of Central Asia will not be surprised (colloquial Berzerkistani is known to have a rather different phonology from the older Classical Berzerkistani that is still in ceremonial use), but I thought some Language Log readers might find the interchange interesting.

Later, when Duke actually gets beyond the Corridor of Broken Crystal into the presidential presence, he addresses the President as "Excellency", and is rebuked in this strip: "You are to address me as President-for-Life Trff Bmzklfrpz!"

Duke replies: "I wish I could, sir."

"No problem," says the President-for-Life; "We'll have your tongue surgically curled."

Further evidence for the specialists there regarding the exact pronunciation of the contemporary reflexes of the complex series of Classical Berzerkistani apico-coronals.

Posted by Geoffrey K. Pullum at 04:41 AM

Statistics on BBC radio

I thought Mark's discussion of the almost wilful ignorance of basic statistical concepts in our culture was not just fascinating but very important. And here is a footnote to it. A few minutes ago I heard on BBC's Radio 4 morning news program "Today" a reporter explaining that blue tongue disease in sheep could cause "up to 70 percent mortality in some animals."

I suppose there have been days when I felt 70 percent dead, but that's not what she meant. She just wandered from talking about a statistical fact concerning populations to talking about effects on individual sheep. She didn't mean "in some species of animal", if that's what you're thinking, because she had already limited the claim to how devastatingly serious blue tongue disease is in sheep.

Doubtless just an on-the-fly slip; but in a national news broadcast, it seemed to me a significant indication of quantitative carelessness nonetheless.

As if to emphasize the point, within a few minutes on the same program another story, about a UN report on childbirth mortality, mentioned that half a million women die each year in childbirth and that the worst thing was that this figure "has remained unchanged" for a number of years. If that claim were true, it would of course be great news: since the population of the world has been increasing and the number of births is linearly related to the total population, this would mean that deaths in childbirth, per thousand births, were constantly declining from year to year. But again, that's not what was meant, since the hook for the story was a new report concerning the failure of the situation to improve. I think it was just one more case of pulling out a constant number when only a percentage would tell us what we needed to know. And that, along with its converse (giving us a percentage where only an absolute quantity would make sense) must surely be the most prevalent of all the conceptual slips in talking about quantities that Mark addressed.
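The arithmetic behind that point can be sketched in a few lines of Python. The half-million deaths figure is the one quoted in the broadcast; the annual birth totals are made-up round numbers, chosen only to stand in for a growing population, and are not real data.

```python
# Constant absolute number of deaths, as quoted in the radio story.
DEATHS_PER_YEAR = 500_000

# HYPOTHETICAL annual birth totals, invented purely for illustration:
# they rise each year, as they would in a growing population.
hypothetical_births = [120_000_000, 125_000_000, 130_000_000, 135_000_000]


def deaths_per_thousand_births(deaths, births):
    """Return the rate of deaths per 1,000 births."""
    return 1000 * deaths / births


rates = [deaths_per_thousand_births(DEATHS_PER_YEAR, b) for b in hypothetical_births]

for births, rate in zip(hypothetical_births, rates):
    print(f"{births:>11,} births -> {rate:.2f} deaths per 1,000 births")
```

With the numerator held fixed and the denominator growing, the rate can only fall, which is why an "unchanged" absolute figure would, taken literally, be good news.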

Update: Jonathan Weinberg points out to me that in fact "the (very careful) UN report goes on to place the 529,000 figure between a lower uncertainty bound of 277,000, and an upper uncertainty bound of 814,000, maternal deaths." In other words, they do not really know whether the figure is even roughly constant — not to within one to three hundred thousand maternal deaths. That was not mentioned in the radio stories about the UN report, and makes the error of asserting that things are getting worse (or not getting better) much worse.

Posted by Geoffrey K. Pullum at 02:00 AM

October 11, 2007

Fantasies of illicit peeve-ranting behind the bikesheds

On Monday, I linked to a sort of semi-interview of Deborah Cameron in the London Sunday Times, and I commented that the writer seemed "patronizing (though perhaps he's only being British, it's sometimes hard for me to tell the difference)". A friend who was born, raised and educated in the U.K. wrote to say that to him, the article gave off "a whiff of personal animus bordering on misogyny", and I used that reaction as the basis for a post yesterday about lexical clues to sentiment classification.

This afternoon, a strongly contrary reaction from a British reader reached me via Arnold Zwicky, to whom the author had sent it out of frustration at being unable to reach me by inventing possible addresses at Stanford University, which knows me not. (A note to others who may encounter the same problem: there's this new-fangled American invention called "google", and if you ask it about Mark Liberman, the first item is my home page, which has my email address on it.)

I was going to add his critique as an update to one of the original posts, but it's both long enough and entertaining enough to merit a post of its own. However, you might want to (re-)read the original discussion before plunging in ("Are pop gender studies from Uranus?", 10/8/2007; "Sentiment classification at the Sunday Times", 10/11/2007). The Sunday Times article that started it all off was Ed Caesar, "Talking tosh on Mars and Venus", 10/7/2007.

The reader's comments:

To me, as a Brit, the Mars/Venus article you quote comes across as rather strongly upbeat, and without any clear "negative markers" against the author or the work. I shall translate, as best I can, into American for you.

"So it turns out" → "As anyone with sense suspected all along," (with extra irony. This is not a phrase ever used to express surprise, or at least, verging on never in heavy-irony passages like this one).

"that after all the rows about the washing up, the shopping and the school run,"→ "despite much petty silliness to the contrary"

"men are not from Mars nor women from Venus." → "the obvious is true."

"Both sexes are, rather prosaically," → "Both sexes are, as expected by any normal rational person not living in an ivory tower or cloud cuckoo land" (prosaic does not mean "boringly" but something closer to "common-sense" or "rational")

"from Earth."→"from the bloody obvious place that we all knew they were from, at least all of us who weren't dropped on our heads as babies."

"And, despite anecdotal evidence"→"Despite unreliable, unfounded andunrespectable evidence" ("anecdotal evidence" is a hugely strongnegative marker, considered synonymous with "filthy lies of the deluded", and carries the strong suggestion that the people reportingthe anecdotes are crystal-waving charlatans or six-fingered country hicks taking a break from reporting alien anal probes and encounters with Nessie).

"to the contrary, men and women do speak the same language." → "another blatantly obvious thing is also true"

[In summary, this first paragraph, then, is pouring heavy sarcastic scorn onto those who ever believed the Mars/Venus thing. There /are/ negative markers in the heavy emphasis on the blatant obviousness of the points, possibly the strongest being "anecdotal evidence", but "the school run" is also heavily derogatory, summoning imagery of the Wrong Type of middleclass Better-than-the-Joneses Helicopter parenting.]

"At least we do according to Deborah Cameron, Britain's pre-eminent" → "Deb C says this, and she's ours, Britain's, and we are proud - she is pre-eminent, which means awesome, and it is of this pure British awesome that she is made." ("Britain's" and "pre-eminent" are two huge positive markers here: the reviewer is being almost American in his slavish adulation).

"feminist philologist (not often that you meet one of them)" → "She is that rare breed, the feminist philologist. Not just a linguist, because that sounds like something you can get a GCSE in just for turning up, but a philologist, which sounds like you need to be smart for it. And feminist ones don't grow on trees, no siree, you don't get many of them for a pound".

"and the current Rupert Murdoch professor of language and communication at Oxford University."→"She has some more claims to awesome, too. She knows her stuff, and is better qualified than those pop-science 'tards on the M/V theory.

[This second paragraph could hardly have been more glowing. "Please excuse me while I flaunt her qualifications some more. I am clearly a fan and possibly in love with her. Odds are the editor had to forcibly prevent me from including her whole CV."]

"Cameron, 48, is a firebrand" → "Cameron, not a n00b, is a vehement fighter against establishment nonsense" ("firebrand" is another strong positive marker, about the most positive way to describe an anti-establishment thinker: compare the negative markers of which there is no shortage: "malcontent", "rabble-rouser", "rebel", "revolutionary", "troublemaker", "provocateur", "heretic"...)

"with an impressive list of" → "with a wonderful list of" ("impressive" is a clearly positive marker, contrast "tediously long", "lengthy", "weighty", etc.)

"pet peeves," → "things about which she is outspoken" (definitely not a negative marker, more a trying-to-be-neutral marker. There is no suggestion that these are things about which that one should not become irate, but rather the suggestion that she understands not everyone would become as riled about them as she would.)

"including Tories, Darwinists, GNER's passenger service announcements, Big Brother's language "so-called" experts, man-hating "pseudo-feminists" and societies for the protection of the semicolon. Don't get her started on Lynne Truss."→ "everything that every rational person loathes, and any sensible person will love reading a good rant about." (Here he is careful to temper any possible negative connotation of the earlier use of "feminist" by distancing her from the "man-hating" ones; the remainder of the list is an itemisation of things that his readers would be expected to share; note the scornful use of "Tories" rather than "Conservatives": it is clear that he shares these points of view and agrees with their selection as things to have a "pet peeve" about.)

[This paragraph is even more glowing than the last. The reviewer doubtless has pinned a picture of the author up by his desk and gazes at it for hours at a time, indulging in fantasies of illicit peeve-ranting behind the bikesheds.]

"But the subject that has irked her most recently -- enough for Cameron to dedicate an entire book to bludgeoning its brains out" → "Cameron served up a sound and well-deserved thrashing to an irksome subject"

"-- is what she calls The Myth of Mars and Venus, published last week by Oxford University Press." → "I like making confusing sentences, so you can't tell if this is a book, or a thing against which she is arguing, or what"

[Incoherent though it is, this final paragraph shows Cameron as a powerful and dynamic author who can deliver a categorical smackdown to long-held but incorrect beliefs.]

If you had read the original article in the tone of Jeremy Clarkson, you might have more readily understood where he was coming from and understood where those negative markers were aimed. He was not aiming them at the author: so much is clear from all the positive markers. He is aiming them at the hypothesis against which she is arguing.

OK, I cry uncle. If the "sound and well-deserved thrashing" hadn't won the day, the "fantasies of illicit peeve-ranting behind the bikesheds" certainly would have done the trick.

Clearly, I failed to take into account a crucial characteristic of British culture that I had no excuse to forget, since I've blogged about it. In the memorable words of AA Gill:

The English aren't people who strive for greatness, they're driven to it by a flaming irritation. It was anger that built the Industrial Age, which forged expeditions of discovery. It was the need for self-control that found an outlet in cataloguing, litigating and ordering the natural world. It was the blind fury with imprecise and stubborn inanimate objects that created generations of engineers and inventors. The anger at sin and unfairness that forged their particular earth-bound, pedantic spirituality and their puce-faced, finger-jabbing, spittle-flecked politics. ...

Anger has driven the English to achievement and greatness in a bewildering pantheon of disciplines. At the core of that anger is the knowledge that they could go absolutely berserk with an axe if they didn't bind themselves with all sorts of restraints, of manners, embarrassment and awkwardness and garden sheds.

[Underlining the difficulty of interpreting sentiment by ethnographic methods in the British isles, this note arrived recently from a third anonymous source:

The tone of the original article is definitely patronising at the start, but not towards Deborah Cameron and/or her ideas in particular, nor towards those of her opponents (as your anonymous correspondent somewhat bizarrely claimed). Rather it is the ingrained flippancy of the British journalistic classes towards academics - the main purpose of science is merely to provide funny stories to entertain normal, decent humanities-educated folks at the end of a hard-working day.

I think your anonymous academic friend read it wrong too. If I had a pound for every expat-Brit I've met who blames their expat status on endemic "tall-poppy syndrome" back home, I'd be a man with quite a lot of pounds. There was no particular misogyny - I think Caesar would have been equally flippant if he'd been interviewing a male academic. And seeing as Mr Caesar is an employee of Rupert Murdoch, it'd be surprising if he truly believed "thou surely shalt not hold a chair named after the Dirty Digger".

I'm still trying to figure out which planet the second anonymous commenter was from :-)]


[And yet another British cultural consultant has yet another reaction, this time fairly close to my original impression:

I guess that column reads differently to every Brit who reads it.

To me -- as, I would guess, to you and to your academic friend -- "Britain's pre-eminent feminist philologist" sounded like damning with very faint praise by virtue of the (presumed) smallness of the set. If praise had really been meant, he could have written, say, "one of Britain's pre-eminent feminist academics, philologist Deborah Cameron"; that would have been a deal less mealy-mouthed, while still including the same information. Instead, he even hammers home the point of the smallness of the set in which she has such status ("don't meet one of them every day").

The list of "pet peeves", too, seems calculated to make the reader sneer in its eclecticism (she's obviously a real whinger) and irrelevance (she even whinges about whingers!), and to my eye the inclusion of Darwinism made things especially bad (what, she's anti-evolution too? What kind of intellectual credentials are these?). He adds to this belittling by introducing her book as a "bludgeoning to death" (strong word!) of a phenomenon which he had previously framed quite clearly as a load of very self-evident tosh -- thus neatly suggesting that she is breaking a very insignificant butterfly upon far too large a wheel.

This, again, is hammered home by the ironic description of 'MafM, WafV' as 'seminal': here, he is saying, is a woman who is throwing the entire weight of her big-fish-in-a-little-pond authority into tearing a new asshole for a book which any half-sentient life-form is going to know from before even opening the cover is a lightweight bit of trash. Tremble, oh ye mortals!

What's really odd is that aside from a snark about her "rehearsed" Uranus quip the tone is not sustained through the remainder of the interview/article. Maybe he was just trying too hard to write amusingly; or maybe he took against Dr Cameron personally (consistent with your academic friend's sense of personal animus). Either way, I can't concede that this was a gushing eulogy. As to reading it like Clarkson? The man's a pillock -- it just makes the article sound insufferably arrogant and unthinking as well as rude.

No, no -- I have to agree with you and not with your anonymous British correspondents. Ghastly introduction.

For those of you who are (as I was) unfamiliar with the evocative but obscure word pillock, the OED explains it:

1. orig. Sc. The penis. Now Eng. regional (north.) and rare.

2. Chiefly Brit. colloq. (mildly derogatory). A stupid person; a fool, an idiot.

With that gem of antique gender stereotyping, perhaps we can declare the subject closed, if not settled.]

Posted by Mark Liberman at 03:37 PM

Lexical retrieval woes

Sophie Harrison, reviewing Peter Nadas's Fire and Knowledge, NYT Book Review, 9/7/07, p. 19:

No one writes a palindromic phrase like Nadas.  On writing: "The ideal literary sentence may be born of imagination or experience, but it must gauge its imagination within its experience and its experience within its imagination." Melancholy is "the sensation of a void of knowledge or an awareness of a void of sensation."

Harrison then adds one of her own:

The discovery that there is an essay titled "Hamlet is Free" brings a feeling of sinking or a sinking of feeling, depending on how one looks at it.

But these are instances of chiastic phrases, not palindromic ones. 

Chiasmus and palindromes both involve reversals, but in very different ways.  In chiasmus, X ... Y is paired with Y ... X, while a palindrome reads the same forwards or backwards (either character-by-character or word-by-word).  So chiasmus vaguely resembles word palindromes, like these examples (supplied to the American Dialect Society mailing list by Ben Zimmer in response to my posting there about Harrison's "palindromic phrase"):

So patient a doctor to doctor a patient so.

Girl, bathing on Bikini, eyeing boy, finds boy eyeing bikini on bathing girl.

You can cage a swallow, can't you, but you can't swallow a cage, can you?

Bores are people that say that people are bores.

Women understand men; few men understand women.

More examples here

Note that word palindromes (for sentences with at least four words) will necessarily have chiastic parts.  But the examples from Nadas and Harrison are not word palindromes.  And what most speakers of English think of when they hear "palindrome" is the character palindrome, as in the palindromic words "level" and "civic" and the character-palindromic sentences "Able was I ere I saw Elba" and "A man, a plan, a canal: Panama" -- but these are even more distant from chiasmus than word palindromes are.
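The two kinds of palindrome can be told apart mechanically, and a rough Python sketch makes the distinction concrete. The function names here are my own, and the tokenization (lowercase, punctuation ignored, apostrophes kept inside words) is an assumption made for illustration rather than any standard definition.

```python
import re


def words(text):
    """Lowercase the text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def is_word_palindrome(text):
    """True if the sequence of words reads the same in both directions."""
    sequence = words(text)
    return sequence == sequence[::-1]


def is_char_palindrome(text):
    """True if the letters alone read the same in both directions."""
    letters = [c for c in text.lower() if c.isalpha()]
    return letters == letters[::-1]


zimmer = "So patient a doctor to doctor a patient so."
elba = "Able was I ere I saw Elba"
harrison = "a feeling of sinking or a sinking of feeling"

print(is_word_palindrome(zimmer), is_char_palindrome(zimmer))      # True False
print(is_word_palindrome(elba), is_char_palindrome(elba))          # False True
print(is_word_palindrome(harrison), is_char_palindrome(harrison))  # False False
```

Harrison's phrase fails both tests: it swaps "feeling" and "sinking" around a pivot, which is chiasmus, but the words in between do not mirror one another.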

It looks like Harrison reached into her stock of technical terms and pulled out a wrong (but semantically related) one.  This is a surprising error from someone who makes a living as a free-lance writer, including reviewing books (mostly fiction and criticism, apparently) for a variety of reputable publications: The New York Times, The Guardian, Granta, The London Review of Books, The New Statesman, and so on.  But then we've complained here many times about people whose professional lives concern language in a significant way but who misuse the technical terminology of grammar, rhetoric, poetics, historical linguistics, etc.

Posted by Arnold Zwicky at 02:28 PM

Bad Darwinism

Blake Stacey at Science After Sunclipse ("Darwi-friggin-ists", 10/10/2007) was bothered by one word in Ed Caesar's characterization of Deborah Cameron as "a firebrand with an impressive list of pet peeves, including Tories, Darwinists, GNER's passenger service announcements, Big Brother's language 'so-called' experts, man-hating 'pseudo-feminists' and societies for the protection of the semicolon". The word that struck a nerve: "Darwinists".

Americans have grown sensitive to the word "Darwinism" and its variants, since in the United States, scientists are more apt to say "evolution by natural selection," without attaching Charles Darwin's name to the idea. It is the creationists who refer to modern biology as "Darwinism," perhaps projecting a religious view that truth derives from prophecy onto the science they despise.

Of course, "Darwinists" was Caesar's word in that case, not Cameron's. But Blake looked on the web and found a 1997 essay "Language: Sociolinguistics and Sociobiology", in which Cameron uses phrases like "[e]xplanations ... of a vulgar Darwinist kind" and "today's Darwinians". She concluded that

... paradoxically, its bleak certainty is what makes Darwinism so compelling: that may be why it is rapidly emerging as the most powerful secular grand narrative available to fin de siecle westerners. It's not just that the evolutionary narrative speaks to us about who we are and why; more importantly, it does so with a confidence and clarity other narratives lack. Darwinism affirms, contra Marx, that the point is not to change it, for we cannot change our nature. It thumbs its nose at most variants of feminism, suggesting that sexual difference in its most stereotypical forms is irreducible and essential. It cocks a snook at postmodernism, a movement dedicated to destabilising all master narratives, by robustly declaring that there is, indeed, such a thing as human nature (evolutionary psychology is the study of how natural selection has shaped it).

Language is an important test case for Darwinist ideas. Like sexual behaviour, language-using has both a clear biological basis and an obvious social or cultural element; unlike sexual behaviour, however, language-using is exclusively a human trait. Any story that purports to tell humans about our 'nature' is bound to be partly a story about language. When today's Darwinists attempt to annex for nature aspects of linguistic behaviour that were previously taken to be the province of culture, they are continuing what is actually a very old debate, in which 'language' is a figure for humanity itself. Ultimately this is a struggle over competing narratives of human nature, one celebrating flexibility, variety and the possibility of change while the other, more austere, offers coherence and stability. Finally, I am less interested in which narrative is 'true' than in why these particular stories of language are ones we seem to want to hear retold in every age.

Blake's comment:

... oversimplifications which elevate gender stereotypes to gospel truth are the problem of evolutionary psychology, not evolution, biology or the scientific method, and it is deeply misleading to heap sins upon "Darwinism" while not bringing the specific subject of evo-psych into the discussion until almost the very end.

I'm sympathetic, and would go further -- simple slagging of "vulgar darwinism", however much deserved, feeds the religiously-motivated opposition to all thinking about biological evolution, whether structural, physiological or psychological. And contra Cameron, I'm more interested in which narratives are true than in why we want to hear them told; and I believe that biological evolution is a key part of the truth about what human beings are like, including how they think and act.

However, I think that Blake Stacey is wrong to accuse Deborah Cameron of importing negative associations from America to hook onto words like darwinist and darwinism. He wrote:

On this side of the Atlantic, hearing a person say "Darwinism" is a red flag that you're dealing with a creationist or, at least, a person whose knowledge of science derives primarily from creationist claptrap. To me, calling evolutionary biology "Darwinism" makes as little sense as calling all modern music "Beethovenism." The year is no longer 1859; we understand many things which Darwin did not, although we stand on his shoulders.

In British usage, "Darwinism" is much more synonymous with evolution in general. (Richard Dawkins is a prime example of this tendency.) I find this unfortunate, partly because it slights all the relevant discoveries we have made since, from Mendel's time to the present day, and partly because it provides unwarranted ammunition to creationists over here. Still, that's the way they talk.

The OED gives us a citation suggesting that derivatives of Darwin were terms of opprobrium even before Charles Darwin came on the scene:

1880 Nature XXI. 246 Coleridge invented the term 'Darwinising' to express his contempt for the speculations of the elder Darwin.

(This would have been Erasmus Darwin, Charles' grandfather, who believed that "all warm-blooded animals have arisen from one living filament".)

And another, suggesting that famous British writers continued into the 20th century to use darwin-derived words to suggest that certain ideas were not quite the thing:

1920 G. B. SHAW in Public Opinion 13 Aug. 160/2 It has restored faith in Providence to a Darwinised world.

And Cameron's 1997 argument, minus the linguistic focus, is closely foreshadowed in an 1876 poem by (British) Edward Dowden, "Darwinism in Morals":

High instincts, dim previsions, sacred fears,
---Whence issuing? Are they but the brain's amassed
Tradition, shapings of a barbarous past,
Remoulded ever by the younger years,
Mixed with fresh clay, and kneaded with new tears?
No more? The dead chief's ghost a shadow cast
Across the roving clan, and thence at last
Comes God, who in the soul His law uprears?
Is this the whole? Has not the Future powers
To match the Past,---attractions, pulsings, tides,
And voices for purged ears? Is all our light
The glow of ancient sunsets and lost hours?
Advance no banners up heaven's eastern sides?
Trembles the margin with no portent bright?

(A good question, if a bad poem.)

A quick scan of {Darwinism} in Google News suggests that it now has specific and mostly negative connotations on both sides of the Atlantic, occurring mostly in just two contexts: as the opponent of "creationism" or "intelligent design", or in the phrase "Social Darwinism". Looking over the first four pages, I found that about 60% of the hits put "Darwinism" into opposition with creationism or intelligent design; about 30% were in the phrase "social darwinism"; and the remaining roughly 10% were discussions of Richard Dawkins, or extended uses like "technological darwinism" (invoked to explain the success of the iPod), or things like this screed from David Warren, "The limits of science", 9/23/2007:

The philosophical position corresponding to scientism is called "Positivism," and was systematized by Auguste Comte (the man who coined the term "sociology") in the 19th century. He was building upon the revolutionary heritage of the French Enlightenment; but he was also expressing the God-like aspirations of parlour atheism in the Victorian age -- its "determinism," or faith that once everything is known, everything can be predicted. Lamarckianism, Darwinism, Marxism, Freudianism, and Phrenology were, to my mind, five other expressions of this naive determinism, that belong today in a Museum of Failed Victorian Ideas.

I didn't find any examples at all of "Darwinism" in the context of a simple evolutionary explanation without piles of explicit ideological baggage heaped on top of it.

Searches for {Darwinism} and {Darwinist} on Google Scholar produce similar results, except that (of course unflattering) references to "social darwinism" and "social darwinist" now outnumber references to the creationism-v.-evolution debate. In particular, it's notable how reliable these terms are as indicators of negative sentiment (either directly on the part of the writer, or indirectly in discussions of the views of others). For example, Jerry Fodor's 1998 LRB review of Pinker's How the Mind Works and Plotkin's Evolution in Mind ran under the headline "The Trouble with Psychological Darwinism", and darwinism is just about as good an indicator as trouble that Jerry is not going to be dispensing praise.

In this case, darwinism is not just the London Review's choice for a convenient headline word -- Fodor uses it in contexts like these:

It's their Darwinism, specifically their allegiance to a 'selfish gene' account of the phylogeny of the mind, that most strikingly distinguishes Pinker and Plotkin from a number of their rationalist colleagues [...] I'm particularly interested in how much of the Pinker-Plotkin consensus turns on the stuff about selfish genes, of which I don't, in fact, believe a word.


But it's the inference from nativism to Darwinism that is currently divisive within the New Rationalist community. Pinker and Plotkin are selling an evolutionary approach to psychology that a lot of cognitive scientists (myself included) aren't buying. There are two standard arguments, both of which Pinker and Plotkin endorse, that are supposed to underwrite the inference from nativism to psychological Darwinism. The first is empirical, the second methodological. I suspect that both are wrong-headed.


The literature of Psychological Darwinism is full of what appear to be fallacies of rationalisation: arguments where the evidence offered that an interest in Y is the motive for a creature's behaviour is primarily that an interest in Y would rationalise the behaviour if it were the creature's motive. Pinker's book provides so many examples that one hardly knows where to start.

Anyhow, I agree with Blake that what's happened to the terms darwinism and darwinist is a darn shame. But it's not Deborah Cameron who did it. And at least on the evidence of her recent book, The Myth of Mars and Venus, she often deploys her terminology and her rhetoric just in the way that Blake would like.

Chapter 6 is "Back to Nature: Brains, Genes, and Evolution". Cameron starts with a 2005 headline in the London Evening Standard, "Men are Better Shoppers than Women: It's in the Genes". She then describes evolutionary psychology in general terms:

Arguing that some apparently modern phenomenon, like shopping or eating junk food, can best be explained by going back to the Stone Age is the hallmark of a branch of science known as evolutionary psychology. According to evolutionary psychologists, many behaviour-patterns which we might assume to be products of culture are actually the results of biological evolution: they reflect the ways in which our earliest ancestors adapted to the conditions of life on the prehistoric plain.

She then surveys a number of examples, from Simon Baron-Cohen to pop psychology books like Why Men Don't Iron, and discusses and evaluates the evo-psych arguments as they apply specifically to matters of language and communication. I have to say that her tone is much less, well, shrill than Jerry Fodor's -- there are no words like "stuff", "wrong-headed", "fallacies", and so on. This is a dissection, and a rather careful and limited one, not a bludgeoning.

And the words darwinism and darwinist don't occur anywhere in the chapter.

[Note: this post originally characterized Ed Caesar's treatment of Deborah Cameron as "spectacularly sexist". That was (and remains) my impression, but one British reader (whose comments are given in a separate post here) strongly disagrees, and the point is not really relevant to the current discussion, so I've removed the parenthetical.]

Posted by Mark Liberman at 09:37 AM

October 10, 2007

Baboons and Daubert

A recent New York Times article, How Baboons Think (Yes, Think), describes recent, interesting research by Dorothy Cheney and Robert Seyfarth on the Moremi baboons in Botswana. I won't comment on the language and cognition issues discussed there but I admit that I was very interested in the problem these wildlife biologists have with claims made by others -- that current ape studies can help us understand the evolution of communication:

Dr. Cheney and Dr. Seyfarth are skeptical of claims that chimpanzees have a theory of mind, in part because the experiments supporting the position have been conducted on captive chimps. "It's bewildering to us that none of the people who study ape cognition have been motivated to study wild chimpanzees," Dr. Cheney said.

Many linguists face similar problems. We gather data and experiment with it in our laboratories but many times we can't be sure that our results reflect what happens under more natural, non-laboratory conditions. For example, a great deal of the research on deceptive language is based on experiments in which people (usually undergrads) are asked alternately to make truthful and untruthful statements for a stimulus that is then given to subjects to assess any deception. Not surprisingly, the subjects aren't very good at this. Another way to do this might be to use naturally occurring language data that might be taken from police interrogations or, in some cases, courtroom testimony, that already has been proved to be untrue, and then to produce experimental stimuli from this.

Linguists who study naturally occurring language are somewhat handicapped when their findings are compared with the much neater results produced by controlled experiments. One of the problems facing such linguists who serve as expert witnesses is the standard of admissibility that the US Supreme Court established in 1993 in Daubert v. Merrell Dow Pharmaceuticals, Inc., now commonly referred to as Daubert. Before this decision, the courts had relied on the 1923 decision of Frye v. United States, in which experts were selected using what is called "the general acceptance test," which specified that a proposed expert witness should be qualified in that field, that the testimony should be relevant to an issue in dispute in the case, and that the scientific theory or technique must be shown to be generally accepted by the relevant scientific community.

In most US jurisdictions today Daubert has replaced the Frye general acceptance test with a reliability assessment, an independent judicial evaluation of the reliability of the proposed testimony. The Court suggested four factors to assist judges:

1. Whether the theory or technique has been tested and found to be sound.

2. Whether it has been subjected to peer review and publication.

3. Whether, in respect to a particular technique, there is a high known or potential rate of error and whether there are standards controlling the technique's operation.

4. Whether the theory or technique enjoys general acceptance within the relevant scientific community.

The Court further noted that these criteria are not fixed or invariable but are meant to be helpful to judges, rather than definitive.

For a few years there was considerable confusion about the scope of Daubert. Did it apply only to scientific testing and analysis or did it apply to all forms of expert witness testimony, including groups such as engineers, automobile mechanics, and therapists? In 1999 the US Supreme Court ruled in Kumho Tire v. Carmichael that the Daubert reliability assessment applies to all forms of expert testimony, even to palmists and astrologists.

Generally speaking, linguistic analysis can satisfy the Daubert factors. The scientific nature and credentials of our field help meet factor 1. Publications and peer reviews help with factor 2. Factor 4 is not a problem because linguists get research funding from the National Science Foundation and other government agencies supporting our work. But factor 3 with its potential error rate can pose a serious problem for linguists who study language as it occurs in its natural contexts. Language evidence in the form of tape-recorded conversations is virtually impossible to replicate or convert into controlled experiments. Other social sciences have a similar problem, especially when they are not doing experimental studies.

I've worked on many such cases but the one that was most frustrating was a case in which there were two tape-recorded conversations in evidence. The prosecution claimed that one of them was a genuine conversation but the other one was a faked, staged conversation between the same two men. My task was to try to determine whether or not the allegedly faked conversation was actually staged.

Even linguists who specialize in conversation data haven't given this topic much thought. I couldn't find research studies on this topic so I couldn't tell the court that my findings had been subjected to previous peer reviews and publications (factor 2). Nor could I say that there was any known error rate for this kind of analysis (factor 3). I had compared the known, real conversation with the allegedly staged one and concluded that in terms of the phonology, morphology, syntax, discourse structure, speech errors, pauses, pause fillers, vocabulary, and other factors, there was no substantive difference between them. I couldn't testify that the speakers had or did not have the intention of producing a staged conversation, because that would go beyond the scope of my field (or any other field, as far as I know). But I was prepared to testify that there was no linguistic evidence that would mark the two conversations as different. This didn't satisfy the judge, who from all I could tell may have been predisposed not to allow it anyway. She stuck rigidly to the Daubert factors and didn't allow me to testify.

When I read the complaint of Cheney and Seyfarth about the lack of research on chimpanzees in their wild, natural context, I couldn't help thinking that it parallels the Daubert factors that face some linguistic expert witnesses. Sometimes experimental research, for all its great values, can't tell us all that we really need to know.

Posted by Roger Shuy at 06:17 PM

Verbal (ir)regularity

Emma Marris, "How 'holp' became 'helped'", NatureNews 10/10/2007; Erez Lieberman, Jean-Baptiste Michel, Joe Jackson, Tina Tang & Martin A. Nowak, "Quantifying the evolutionary dynamics of language", Nature 449:713-716, 11 October 2007.

In the same issue, another paper on quantitative modeling of language change: Mark Pagel, Quentin D. Atkinson & Andrew Meade, "Frequency of word-use predicts rates of lexical evolution throughout Indo-European history", Nature 449:717-720, 11 October 2007. And W. Tecumseh Fitch has a News and Views piece surveying both the Lieberman et al. and Pagel et al. papers, "Linguistics: An invisible hand", Nature 449:665-667, 11 October 2007.

In this case, Nature imitates (comic) Art -- One Big Happy for 10/9/2007:

More seriously: it's terrific to see such prominence given in Nature to several pieces of linguistic research. More on this later.

Posted by Mark Liberman at 04:34 PM

Sentiment classification at the Sunday Times

One of the latest fashions in computational linguistics is "sentiment classification", which tries to determine automatically what writers' attitudes towards their topics are. For an interesting account of some recent work in this area by a recent Penn grad, see John Blitzer et al., "Biographies, Bollywood, Boom-boxes, and Blenders: Domain Adaptation for Sentiment Classification".

Most of the information useful to such algorithms comes from particular words or word sequences: thus in the Blitzer et al. paper, positive evaluations of books are associated with snippets like "engaging", "must read", "fascinating" and so on; in contrast, strings like "<num> pages", "predictable", and "plot" tend to be associated with negative evaluations. Strings like "are perfect", "years now", "a breeze" are positive signs for kitchen appliances, whereas "the plastic", "awkward to" and "leaking" are negative indicators.
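For readers who like to see such things spelled out, here is a toy sketch of the general idea: count the indicator strings in a review and add up their signs. The phrases are the ones quoted above; everything else (the +1/-1 weights, the function names, the tokenization) is my own assumption for illustration, and it is certainly not the method of the Blitzer et al. paper, which learns its features and weights from data.

```python
import re

# Indicator strings quoted above. The +1 / -1 weights are an assumption
# made for illustration; a real system would learn weights from data.
INDICATORS = {
    "books": {
        "engaging": 1, "must read": 1, "fascinating": 1,
        "<num> pages": -1, "predictable": -1, "plot": -1,
    },
    "kitchen": {
        "are perfect": 1, "years now": 1, "a breeze": 1,
        "the plastic": -1, "awkward to": -1, "leaking": -1,
    },
}

def normalize(text):
    """Lowercase, map digit runs to the placeholder <num>, and tokenize."""
    text = re.sub(r"\d+", "<num>", text.lower())
    return re.findall(r"<num>|[a-z']+", text)

def ngrams(tokens, max_n=2):
    """Return all word sequences of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]

def score(text, domain):
    """Sum the weights of every indicator string found in the text."""
    weights = INDICATORS[domain]
    return sum(weights.get(gram, 0) for gram in ngrams(normalize(text)))

def classify(text, domain):
    """Label a text by the sign of its score."""
    s = score(text, domain)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

if __name__ == "__main__":
    print(classify("A fascinating, engaging must read.", "books"))
    print(classify("It runs to 600 pages and the plot is predictable.", "books"))
    print(classify("The plastic lid started leaking.", "kitchen"))
```

Note that the same word can count in one domain and not in another: "plot" is a bad sign in a book review but means nothing in this sketch's kitchen-appliance list, which is the sort of mismatch that domain adaptation is meant to deal with.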

A few days ago, I quoted from Ed Caesar's article in the Sunday Times about Deborah Cameron's new book, The Myth of Mars and Venus ("Are pop gender studies from Uranus?", 10/8/2007). I noted that "His tone is just a little bit patronizing (though perhaps he's only being British, it's sometimes hard for me to tell the difference)". To remind you of what I was talking about, here's how he starts -- be on the look-out for negative indicators like "rather prosaically", "pet peeves", "irked", "bludgeoning its brains out":

So it turns out that after all the rows about the washing up, the shopping and the school run, men are not from Mars nor women from Venus. Both sexes are, rather prosaically, from Earth. And, despite anecdotal evidence to the contrary, men and women do speak the same language.

At least we do according to Deborah Cameron, Britain's pre-eminent feminist philologist (not often that you meet one of them) and the current Rupert Murdoch professor of language and communication at Oxford University.

Cameron, 48, is a firebrand with an impressive list of pet peeves, including Tories, Darwinists, GNER's passenger service announcements, Big Brother's language "so-called" experts, man-hating "pseudo-feminists" and societies for the protection of the semicolon. Don't get her started on Lynne Truss.

But the subject that has irked her most recently -- enough for Cameron to dedicate an entire book to bludgeoning its brains out -- is what she calls The Myth of Mars and Venus, published last week by Oxford University Press.

The review by Susannah Herbert, in the Sunday Times of the same day, gives almost exactly the same description of Cameron's book, but with a very different tone. Among the positively-associated words are things like "nuance" and "first-class" -- Caesar's "bludgeoning its brains out" is Herbert's "delightfully spiky":

In the village of Gapun in Papua New Guinea, when a woman is annoyed with her husband, she swears at him for 45 minutes, at the top of her voice so the neighbours catch every nuance. During this "kros" -- the word means "angry" -- the target is not allowed to answer back, nor may anyone interrupt until she's given her feelings full expression.

And what expression it is. The anthropologist Don Kulick recorded a typical kros: "You're a ****ing rubbish man. You hear? Your ****ing ***** is full of maggots. You're a big ****ing semen *****. Stone balls! ...****ing black *****! You *****ing mother's ****!"

When the flowers of English womanhood carry on like this -- at closing time on Friday night in Ipswich, say -- they're thought to be behaving laddishly. When the housewives of Gapun turn the air blue, however, they are only doing what comes naturally to a woman. The village men, apparently, pride themselves on their ability to conceal their opinions and express themselves indirectly: if they need to get a grievance off their chests, they get their wives to do it for them. In Gapun, women are from Mars, men are from Venus.

I sensed early on in this delightfully spiky book that Deborah Cameron -- an Oxford professor of language and communication -- would give a first-class kros, and enjoy it, too. The only problem would be limiting the number of victims to one. Cameron's targets are many: there's John Gray, the author of the psychobabble classic, Men Are from Mars, Women Are from Venus, Deborah Tannen, the author of You Just Don't Understand, Simon Baron-Cohen, the author of The Essential Difference, and the husband-and-wife team behind a slim volume called Why Men Don't Iron.

Some of Caesar's expression of sentiment is more subtle. For example, I noticed that he paired "feminist" with the archaic, rather specialized, and technically inaccurate term philologist rather than linguist, and then emphasized the suggested disciplinary eccentricity with the catty little aside "not often that you meet one of them":

At least we do according to Deborah Cameron, Britain's pre-eminent feminist philologist (not often that you meet one of them) and the current Rupert Murdoch professor of language and communication at Oxford University.

But due to lack of cultural context, I missed what may be the most important features in that sentence. An anonymous academic of British origin, now working in an American university, filled me in.

The Caesar article doesn't strike me as patronising, though the tone is certainly strange. I think this is to do with a generalized wish to debunk self-importance, possibly with a whiff of personal animus bordering on misogyny.

The decalogue for British academics begins:

- thou shalt not self-promote
- thou shalt not be shrill and angry
- thou preferably shalt not come from a bastion of privilege like Oxford University
- thou surely shalt not hold a chair named after the Dirty Digger (= Murdoch)
- if thou art at a bastion of privilege and thou holdest a chair named after the Dirty Digger, thou canst not win.

I gather that "self-promotion" means something like "openly aspiring above your station", and "shrill and angry" means "passionately committed to an unlicensed cause".

Meanwhile, the copy of The Myth of Mars and Venus that I've ordered from amazon.com hasn't come yet, but OUP sent me one from the U.K. After skimming it quickly, I'll go with "delightfully spiky" -- more later.

Posted by Mark Liberman at 10:02 AM

October 09, 2007

The power of blub

Thomas Crimi writes:

I was forwarding your article on the Pirahã resistance to using exact numbers to a friend of mine; when I summarized it I noticed that in software development circles this phenomenon has been written about a few years ago by Paul Graham, which he called the Blub Paradox.

'Blub' is the name given for a middle-of-the-pack programming language (standing in for Java so as to be politically neutral). When some people rave about the great features of other languages (say, Graham's baby, Lisp), the Blub programmer shows complete indifference, not seeing how those features would make a practical difference day-to-day. Meanwhile, for any language worse than Blub, the programmer is able to rail on about how it'd be impossible to do useful work without feature X that Blub has.

From a cursory google, it seems that this appears first in his 2001 article "Beating the Averages".

The lesson of Blub is that even though we may feel superior to 'those' programmers, we all have our own blub and need to attempt to see what other things out there are worth learning.

Yes, I think that the Blub Paradox has considerable explanatory force, and I'm glad to learn about it.

But the political economy of such situations is uncertain, it seems to me. If your way of life (or your programming language, or your way of thinking about group properties) is working for you and your peeps, it's not obvious that it's generally a good idea to invest a lot of time and effort in trying to learn something different, just because some outsider tells you that you should. It might pay off, and then again it might not; and there are various costs, not least the potential personal or social disruption.

Standing pat might sometimes be the best choice, alas, even when the outsiders are right. Thus people who think that they understand statements like "women have more sensitive hearing than men", without translating them into claims about sampled distributions, are fools at best. But the damage that people do to themselves by continuing to use crude approximate semantics, conceived in terms of the properties of prototypes, might not in individual cases outweigh the costs of learning new thinking skills.

That would be true, for example, if most of the damage is due to bad social decisions (whether explicit public policy mistakes, or implicit market outcomes), which any one individual has little control over. As a result, if you invest in better understanding, you may just wind up writing whiny weblog entries about what's wrong with bestselling books and dominant software-engineering practices.

Posted by Mark Liberman at 12:55 PM

Warning labels and recalls: are we protected?

We've all seen the recent spate of media ads for various types of pharmaceutical products. First they tell us how wonderful their products are and then they warn us about a series of possible bad side effects, some of which are downright frightening. Warning about a product they're trying to convince us to buy may seem like an odd strategy, but they're required to warn us and it's doubtful that they'd do so otherwise. But does this make us feel protected and safe? Eric Lipton's New York Times story summarizes the plight of a product that has so far attracted 31 product liability lawsuits.

The product in the news is called Stand 'n Seal, a fast way to seal grout around tiles in kitchens and bathrooms. It has been reported that two consumers have died and at least 80 others have been sickened after using this sealer. Shouldn't the Consumer Product Safety Commission's recall some two years ago have stopped this carnage? Shouldn't retailers have kept the 300,000 recalled cans of this product off their shelves? Even more important, shouldn't the product have been tested before all this happened? And how effective is the warning label on the can?

There's plenty of blame to go around here, starting with the Consumer Product Safety Commission itself. Lipton observes it is "too overwhelmed with reports of injuries and with new hazards to comprehensively investigate and follow up on many complaints." Its laboratory is out of date and doesn't have the proper equipment to evaluate products. Like many bureaucracies these days, it's underfunded and understaffed (so much for alleged waste and fraud in bureaucracies). Even worse is the fact that the US does not require premarket testing of such products. The chairperson of the CPS commission says she's proud of the way her agency gets the consumer informed by recalling dangerous products. But apparently consumers have to be sickened or die before this can happen.

The role of linguistics in product liability cases is usually to focus on the warning labels on the products. I haven't been able to get my hands on a can of Stand 'n Seal yet but some parts of what it says have appeared in various articles about the case. The label proclaims that it "evaporates harmlessly" and in small letters says, "avoid breathing vapors" and "make sure room is well ventilated." In cases such as this I would expect the plaintiffs to compare this with the requirements listed on the Material Safety Data Sheet for Grout Cleaners. One does not find "avoid breathing vapors" there. Instead, it advocates the use of a strong directive, "do not breathe vapors." The semantic difference between "avoid" and "do not" should play a role here. The Material Safety Data Sheet also says that this product, which is made with an aerosol chemical spray that contains Flexipel, should not be used "in restricted areas with poor ventilation." This may seem similar to the wording on Stand 'n Seal's label, "make sure the room is well ventilated," but the MSDS goes on to say that consumers should use an "approved cartridge respirator" and possibly use air scrubbers to prevent exhaust air from contaminating the environment. This advice does not appear on the warning label of Stand 'n Seal. In fact, the advertisements show a drawing of a man wearing no mask while he is spraying the floor.

Linguists who address the problems of warning labels in product liability cases have described the discourse structure that an effective and readable warning should contain, including a description of the nature of the hazard, the danger of the risk, ways that the consumer can avoid the risk, all in a text written in clear, simple language that the consumer can easily and quickly understand. This language should capture the attention of the reader, be understandable to the ordinary person, be direct and explicit, and be visibly readable and comprehensible. Linguistic issues all.

It's notable that those pharmaceutical ads I mentioned above seem to be able to describe the risks of the product in language that the ordinary person can easily comprehend. From all I can tell, they sell the stuff anyway. So being clear and explicit may not be all that damaging to manufacturers, even those who produce grout cleaners. And a clear and readable warning label can usually protect them from product liability lawsuits like this one.

Posted by Roger Shuy at 10:05 AM

October 08, 2007

The wheel of reincarnation

Catchphrase saṃsāra. Compressed into two weeks:

Posted by Mark Liberman at 01:42 PM

Are pop gender studies from Uranus?

In the Sunday Times (of London), Ed Caesar has an article about Deborah Cameron's recent book, The Myth of Mars and Venus ("Talking tosh on Mars and Venus", 10/7/2007). His tone is just a little bit patronizing (though perhaps he's only being British, it's sometimes hard for me to tell the difference):

So it turns out that after all the rows about the washing up, the shopping and the school run, men are not from Mars nor women from Venus. Both sexes are, rather prosaically, from Earth. And, despite anecdotal evidence to the contrary, men and women do speak the same language.

At least we do according to Deborah Cameron, Britain's pre-eminent feminist philologist (not often that you meet one of them) and the current Rupert Murdoch professor of language and communication at Oxford University.

Cameron, 48, is a firebrand with an impressive list of pet peeves, including Tories, Darwinists, GNER's passenger service announcements, Big Brother's language "so-called" experts, man-hating "pseudo-feminists" and societies for the protection of the semicolon. Don't get her started on Lynne Truss.

But the subject that has irked her most recently -- enough for Cameron to dedicate an entire book to bludgeoning its brains out -- is what she calls The Myth of Mars and Venus, published last week by Oxford University Press. In it Cameron tears into such seminal works as John Gray's Men are From Mars, Women are From Venus and Deborah Tannen's You Just Don't Understand: Women and Men in Conversation.

On the other hand, it's a long article in a major publication, and Caesar often gets out of the way and lets Cameron speak:

"The main thing about the book is that I wanted to offer people more than the evidence-free rubbish they get every day," says Cameron. "My pitch was basically the CSI pitch: let the evidence tell the story."

Where the book becomes interesting is when she asks why we have become interested in these myths. "The first point to make is that in the past 20 years we have become obsessed by communication," she says. "And that's not just in relationships; it's in customer care, it's in politics. All problems are seen to be communication problems.

"If, for instance, anyone disagrees with someone else, it's seen to be because they don't understand each other. Well, actually you could understand me and still disagree with me. Likewise, if a train is delayed or cancelled, all anyone's interested in is whether there is an appropriate announcement. Communication has become a substitute for actual problem solving.

"Where this relates to the Mars and Venus books is that they say problems in relationships between men and women are all down to communication. The misunderstandings are not, for instance, about the fact that men and women are both vying for jobs, or power, or status, or time. That's quite comforting to a lot of people.

"There has been a revolution in gender politics -- there is much more blurring between the roles of men and women -- and I think a lot of men and women are uneasy about that. Books like Mars and Venus tell us that although men and women may be very similar on the outside, we are profoundly different on a deeper level -- that we're 'hard-wired' differently."

I haven't read the book yet, but conveniently, as of today, The Myth of Mars and Venus is orderable from amazon.com in the U.S. (though there's a note that "This item is not immediately available to ship"). I'm reluctant to dismiss all claims about differences in communication styles as "evidence-free rubbish" (some further discussion is here), but I'm certainly sympathetic to Cameron's appeal to "let the evidence tell the story".

[Meanwhile, three excerpts from the book have been published in the Guardian: "What language barrier?", "Speak up, I can't hear you", and "Back down to Earth".]

Posted by Mark Liberman at 09:49 AM

But shouldn't that be "weagle"?

The Great Wall Wingle is "the first high-pressure common rail & high-end diesel pickup in China".

Creation embodies charm, technology fulfills leadership. As China's first high-end diesel pickup, Great Wall-Wingle creates a new realm of high-grade Pick-up, and goes forward towards the world as the Chinese Pickup leader. Fashion in every detail demonstrates innovative design concepts, creative design and avant-garde daring style, without any meaningless decoration. Brilliant achievements in their careers enjoy life passion.

As chinamotors.ru tells us, the name Great Wall Wingle is "an event from a combination of words wind (wind) and eagle (eagle)".

The Great Wall press release from the Guangzhou Auto Show explains further that the Wingle is "the first pick up with the ability of dragging, off road and loading and sitting". Its new diesel engine technology "is called INTEC--King of intellectual oil saving". As for style, "Great Wall Wingle is just like a brave and fierce lion from the appearance. The outline design shows an ambitious and arrogant as well as stylish atmosphere." Nor has safety been neglected, as "the design of four door anti-collision beam can utmostly reduces the distortion of rear doors caused by side crash", and in fact the vehicle "can utmostly absorb shock from every side".

Another press release focuses on "Wingle fever in overseas", and observes that "Idea abroad is quite different from that of China. In foreign countries, driving a pick-up is not a lack of pride, but a symbol of fashion and identity." But Chinese patriots should have no fear:

China has ignored this type for years and just regards it as a fractionizing part in motor market. However, it has created miracle that other types cannot achieve! The main symbol of it is that the foreign pick-ups can not be able to export in China since the 90s, because the independent brands leaded by Great Wall Motor have built a solid "Great Wall" to resist the foreign brands.

To be serious for a moment, it's surprising that after investing so much in development, production and marketing of what is no doubt an excellent vehicle, Great Wall didn't feel the need to invest a few thousand dollars in competent translation or creation of English-language blurbage. Then again, perhaps they think they did. Certainly their results are on a somewhat higher level than those of the manufacturers of the fast ether lord commutation machine, or the designers of the aisle signs at Century Mart in Shanghai.

Are the foreign-language promotional materials of American manufacturers this lame? I don't think so -- but then, maybe I just don't know.

[Hat tip: the Engine Room]

Posted by Mark Liberman at 08:37 AM

Shooting down "amateur grammar nazis"

Today's xkcd, "Effect an effect":

Presumably the point is that there really is a verb effect, for which the OED's first gloss is

1. a. trans. To bring about (an event, a result); to accomplish (an intention, a desire).
    b. To produce (a state or condition). Obs.

The origin and progress of this nexus of confusion were described in an old LL post, "Opening for a copy editor at the Associated Press" (12/13/2004).

Randall Munroe's image title text is "Time to paint another grammarian silhouette on the side of the desktop", which I guess is a reference to the practice of painting symbols on the fuselage of fighter planes to record kills of enemy aircraft:

But it's a vulgar error to call ill-informed usage cranks "grammarians".

[Update -- Cathy Prasad asks:

Is the point that if you are busy tripping up the grammar Nazis that spelling Nazis don't catch you? Am I the only one who noticed foriegn?

Good question. I'll beg off, on the grounds that I've already tagged myself as the world's worst proofreader. So far, no one seems to have mentioned it on the xkcd blag, but that might be because there's no entry yet about today's comic. Perhaps the misspelling was cleverly crafted, the bleating of the apparently-careless kid to attract the usage-crank tiger, so to speak. Then again, maybe it was just a misspelling -- Hartman's Law in action. Randall?]

Posted by Mark Liberman at 07:20 AM


John Cole at Balloon Juice ("The Republican Decline", 10/5/2007) says "Hogwash" to David Brooks' theory that the Republican party is collapsing because the "ambition of its creeds" is no longer "restrained by the caution of its Burkean roots". Pointing to the furor over Barack Obama's missing lapel pin, Cole offers another explanation:

For starters, people got tired of being associated with these drooling retards. Then, when they realized that these drooling retards had ideological allies running the show in the Bush administration and then began to experience their idiotic policies, they moved from disgusted to outright hostile.

Like me. It had nothing to do with Burke, and everything to do with what the party had become. A bunch of bedwetting, loudmouth, corrupt, hypocritical, and incompetent boobs with a mean streak a mile long and no sense of fair play or proportion.

My interest in this manifesto focuses on its use of the modifier bedwetting. A couple of years ago, I was puzzled by Tony Snow's use of "the bed-wetting right" to refer to conservatives who were unimpressed by FEMA's response to Hurricane Katrina:

And I guess that being a bed-wetter means "delayed in developing the basic characteristics for moving from infancy to childhood", or "inappropriately infantile", or something like that. Thus the idea must be that a grown-up conservative would defend a Republican administration no matter what. Expecting executive competence is something to outgrow, like diapers. It's hard to believe Snow meant that -- but what else could it be?

Now John Cole is using bedwetting in a string of uncomplimentary modifiers applied to Fox News and Tony Snow's wing of the Republican party. I guess that the common thread is an accusation of excessive complaining, associated with an embarrassing sort of immaturity.

This is an all-purpose insult, used more frequently and more widely across the political spectrum than I realized: {bedwetting conservative}, {bedwetting liberal}. So far, however, the words {bedwetting moderate} seem mostly to have come together in discussions of actual incontinence.

Posted by Mark Liberman at 07:10 AM

October 07, 2007

Teh Holiez Bibul

Omri Ceren has drawn my attention to the fact that some people have begun a project to translate the entire bible into lolcatese, starting out this way:

1. In teh beginnin Invisible Man was invisible, and he maded the skiez and da earths, but he did not eated it.
2. The earths wus witout shapez and wus dark and scary and stuffs, and he rode invisible bike over teh waterz.
3. And Invisible Man sayz, i can has light, and teh light wuz.

This should make it possible to do lolcat computational linguistics on a whole new scale, and perhaps even to train statistical translation systems between lolcat and the many other idioms in which the bible is available. But exposed to cat macros in mass quantities, without any actual cat pictures, I repent. Geoff was right.

[Update -- Annabelle Large writes:

I saw this and went straight to Language Log to find an address to send it to. Then I saw you had a similiar link, but not this picture. I think it is the picture that inspired the wiki. http://riotclitshave.livejournal.com/888051.html

I'm not sure who the author of the picture is -- and I'm also not sure about its direction of influence with respect to the lolcatbible wiki.]

Posted by Mark Liberman at 08:33 AM

October 06, 2007

Qat: words, things, and men

The spring 2006 issue of Verbatim (which arrived a few weeks ago; things are a bit behind schedule) leads off (pp. 1-6) with an evocative piece by Gregory Johnson on chewing qat (the psychoactive leaves of the plant Catha edulis) in Yemen.  I found two aspects of the article notable, one small but striking, the other subtler.

Item 1.  On page 3, Johnson tells us

Now qat has taken over society to such a degree that most outsiders and even some Yemeni view it as a self-inflicted curse that is the root of all modern evils: underdevelopment, poverty, and lack of water.  The rest of us, however, are too busy chewing to pay much attention.  Those who refuse to chew have never experienced kayf, that elusive, nearly untranslatable word, which allows one to melt into the background, becoming perfectly at ease with one's surroundings and oneself.

Whoa!  You experience a WORD?  The WORD allows you to become one with your surroundings and yourself?  I don't think so.  It's like saying

I finally experienced nirvana, a three-syllable word for a kind of transcendent state.

I nearly died from necrotizing fasciitis, the name of an affliction known commonly as "flesh-eating bacteria".

This is a kind of use-mention pun, treating the word and the thing it refers to as the same.  In Johnson's sentence, what's elusive is the CONCEPT; what's nearly untranslatable is the WORD -- untranslatable because the concept is elusive, with no easy counterpart in our outsiders' modes of thinking.  The confusion might be promoted in Johnson's case by two different uses that italicization can be put to in English: for using a non-English word ("the pleasure of Schadenfreude"), and for mentioning words ("polysyllabic is polysyllabic").

The easy solution is to refer to the thing: the state (called kayf) that results from chewing qat is hard to describe; I nearly died from necrotizing fasciitis (whose common name is "flesh-eating bacteria"); etc.

Item 2.  Very close to the beginning, Johnson says

In Yemen, qat chews provide a popular and important forum for debate and dialogue.  Nearly everyone chews ...

And later (p. 5):

Nearly every occasion in Yemen is an occasion to chew.

(Looking ahead: the issue is going to be about "nearly everyone" and "nearly every occasion").

The tale goes on, with stories of long chews involving recitations of poetry, disputations, the complexity of obtaining qat, and so on.  There are hints along the way:

... even old age and toothlessness fail to stop some, as old men use a mahtana 'grinder,' to ...  (p. 1)

But really, wherever you go the scene is the same: men are cursing, jostling, and invoking God's name ... (p. 3)

I've watched men sample nearly half a bag before they purchased it.  (p. 4)

Eventually, we get (on p. 6, right before the end) to:

Men wrestle with their internal jinn in silence until the silence is broken and the music ceases, as someone mumbles a quick goodbye and slips on their sandals.  The chew comes to a rather hurried and untidy end as everyone prepares to leave.  Most men tend to go home to a quiet evening with their families, relaxed and at ease with the world.

Ah.  EVERYONE doesn't chew qat -- MEN chew qat.  The whole business is part of the "men's world" of Arab culture, the public world, which takes place outside the domestic sphere, the private world.

Now I'm NOT saying that this sort of public/private, male/female split is something peculiar to Arab culture.  It occurs all over the world, has been much commented on in the West in the literature on women's history, and still crops up in places like Harvey Mansfield's troglodyte On Manliness (2006).

Instead, what's surprising is that Johnson is so absorbed in the culture he's talking about that he fails to translate for his readers.  He supposes we share the assumptions of the world he's describing, in which only men would gather for affiliation, competition, and verbal displays, in places away from their homes. 

Yes, I know: women have other lives, with their own social configurations, in "private" places.  That's not the point.  The point is that the women are erased in "nearly everyone", and domestic occasions in "nearly every occasion", and a reader from outside the culture could easily fail to appreciate that.

[Full disclosure: I'm a Verbatim board member, but I had no hand in preparing this issue.]

Posted by Arnold Zwicky at 03:05 PM

The long moist tail

Following up on our earlier coverage of "word aversion", especially involving "moist", Ben Zimmer points out to me that there's a Facebook group called "I HATE the word MOIST!", which currently has 71 members. "If the very word 'moist' make you cringe every time you hear it, this group is for you."

One of the members pointed out a problem with the group's name:

possibly the worst word in the english dictionary....i feel sick. Could we possibly censor the "m word" from this page? everytime i look at it I actually want to vomit.

The founder's response:

Unfortunately, I can't change the title because that's impossible and people have to know which word we hate. Because of that, I'm just going to leave the other "m words" there too. The damage is done.

Other writers on the wall use "m***t" or "the m word", or just refer to the word indirectly:


My friends and family taunt me with this word constantly. I hate the way it sounds so nasal at the beginning of the word and that it sounds exactly like the meaning. Like its onomatopoeic or something. Urgh.

I'm glad I've finally found a place I can feel I belong!

I hate it went the word m***t is used in cookery programmes to describe the food. Its not appropriate!!!

M***t haters of the world unite! I despise the sick, repugnant word! It's almost as bad as vaginal! Uurrrrhh - Gotta go chuck now!

AHHH i can't believe there's a group for this! amazing!! I have wanted to barf on anyone who has said m***t for my whole lifeee gah EWWwweewww

But there is a long tail of other possible Facebook word-aversion groups:

You know what else I hate, guys? TENDER ugggh! As in tender loving care. ewewewwwwwwwwwww!
I got to culinary school, so I have to encounter these words in product evaluations a lot. NOT FUN.

Posted by Mark Liberman at 10:40 AM

The Pirahã and us

The Pirahã language and culture seem to lack not only the words but also the concepts for numbers, using instead less precise terms like "small size", "large size" and "collection". And the Pirahã people themselves seem to be surprisingly uninterested in learning about numbers, and even actively resistant to doing so, despite the fact that in their frequent dealings with traders they have a practical need to evaluate and compare numerical expressions. A similar situation seems to obtain among some other groups in Amazonia, and a lack of indigenous words for numbers has been reported elsewhere in the world.

Many people find this hard to believe. These are simple and natural concepts, of great practical importance: how could rational people resist learning to understand and use them? I don't know the answer. But I do know that we can investigate a strictly comparable case, equally puzzling to me, right here in the U.S. of A.

Until about a hundred years ago, our language and culture lacked the words and ideas needed to deal with the evaluation and comparison of sampled properties of groups. Even today, only a minuscule proportion of the U.S. population understands even the simplest form of these concepts and terms. Out of the roughly 300 million Americans, I doubt that as many as 500 thousand grasp these ideas to any practical extent, and 50,000 might be a better estimate. The rest of the population is surprisingly uninterested in learning, and even actively resists the intermittent attempts to teach them, despite the fact that in their frequent dealings with social and biomedical scientists they have a practical need to evaluate and compare the numerical properties of representative samples.

If we project this state of affairs onto the scale of the Pirahã society, with roughly 300 members, we arrive at something like 0.05 to 0.5 people out of 300 who understand how to count and compare quantities in ways that have become essential to the culture. In this respect, I submit, we are exactly like them.
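The projection is plain proportional scaling, and a minimal sketch makes the arithmetic explicit. The population figures and the two guesses are the rough numbers given above; the function name `scaled_count` is just an illustrative label.

```python
# Rescale the rough U.S. estimates to a community the size of the Pirahã.
US_POPULATION = 300_000_000   # "roughly 300 million Americans"
PIRAHA_POPULATION = 300       # "roughly 300 members"


def scaled_count(people_who_understand, population=US_POPULATION,
                 target=PIRAHA_POPULATION):
    """Express the same proportion as a head count in the smaller group."""
    return people_who_understand / population * target


low = scaled_count(50_000)     # the "better estimate"
high = scaled_count(500_000)   # the generous upper guess

print(f"{low:.2f} to {high:.2f} people out of {PIRAHA_POPULATION}")
```

Both results come out below one person, which is the point of the comparison.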

For English-language terms dealing with the comparison of sample statistics, the OED's first citations generally span the period from around 1880 to 1940 (emphasis added):

1885 F. GALTON in Jrnl. Anthropol. Inst. 14 276 The value which 50 per cent. exceeded, and 50 per cent. fell short of, is the Median Value, or the 50th per-centile.

1895 K. PEARSON in Philos. Trans. R. Soc. A. CLXXXVI. 399 The histogram shows, however, the amount of deviation at the extremes of the curve. [Note. The word 'histogram' was] introduced by the writer in his lectures on statistics as a term for a common form of graphical representation, i.e., by columns marking as areas the frequency corresponding to the range of their base.

1894 K. PEARSON in Phil. Trans. R. Soc. A. CLXXXV. 80 Then σ will be termed its standard-deviation (error of mean square).

1895 K. PEARSON in Phil. Trans. R. Soc. A. CLXXXVI. 412 A method is given of expressing any frequency distribution by a series of differences of inverse factorials with arbitrary constants.

1918 R. A. FISHER in Trans. R. Soc. Edin. LII. 399 It is..desirable in analysing the causes of variability to deal with the square of the standard deviation as the measure of variability. We shall term this quantity the Variance.

1934 J. NEYMAN in Jrnl. R. Statistical Soc. XCVII. 562 The form of this solution consists in determining certain intervals, which I propose to call the confidence intervals.., in which we may assume are contained the values of the estimated characters of the population, the probability of an error in a statement of this sort being equal to or less than 1 - ε, where ε is any number 0<ε<1, chosen in advance. The number ε I call the confidence coefficient.

Before 1900 or so, only a few mathematical geniuses like Gauss (1777-1855) had any real ability to deal with these issues. But even today, most of the population still relies on crude modes of expression like the attribution of numerical properties to prototypes ("A woman uses about 20,000 words per day while a man uses about 7,000") or the comparison of bare-plural nouns ("men are happier than women").

Sometimes, people are just avoiding more cumbersome modes of expression -- "Xs are P-er than Ys" instead of (say) "The mean P measurement in a sample of Xs was greater than the mean P measurement in a sample of Ys, by an amount that would arise by chance fewer than once in 20 trials, assuming that the two samples were drawn from a single population in which P is normally distributed". But I submit that even most intellectuals don't really know how to think about the evaluation and comparison of distributions -- not even simple univariate gaussian distributions, much less more complex situations. And many people who do sort of understand this, at some level, generally fall back on thinking (as well as talking) about properties of group prototypes rather than properties of distributions of individual characteristics.
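To make the contrast between prototype-talk and distribution-talk concrete, here is a small sketch using Python's standard `statistics.NormalDist`. The two groups and their parameters are invented purely for illustration (they are not taken from any study): both are assumed to be normal with the same spread and slightly different means.

```python
from statistics import NormalDist

# Hypothetical groups with invented parameters: equal spread, and means
# that differ by a fifth of a standard deviation.
xs = NormalDist(mu=103, sigma=15)
ys = NormalDist(mu=100, sigma=15)

# The difference between one random X and one random Y (drawn
# independently) is itself normally distributed.
diff = xs - ys
p_x_beats_y = 1 - diff.cdf(0)

# Overlapping coefficient: the area shared by the two density curves.
overlap = xs.overlap(ys)

print(f"difference in means: {diff.mean:.1f}")
print(f"P(random X > random Y): {p_x_beats_y:.3f}")
print(f"overlap of the two distributions: {overlap:.3f}")
```

With these made-up numbers, "Xs are P-er than Ys" is true of the means, yet a randomly chosen X beats a randomly chosen Y only a little more than half the time, and the two curves share more than 90% of their area. That is the information the bare-plural phrasing throws away.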

If you're one of the people who find distribution-talk mystifying, and don't really see why you should have to learn it, or perhaps think that you're just not the kind of person who learns things like this -- congratulations, you now know exactly how (I imagine) the Pirahã feel about number-talk.

Does this matter? Well, in the newspapers every week, there are dozens of stories about risks and rewards, epidemiology and politics, social trends and psychological differences, with serious public-policy and personal-lifestyle implications, which you can't understand without understanding distribution-talk. And usually you won't just feel baffled -- instead, you'll think you understand, and draw the wrong conclusions.

In fact, the people who write these stories mostly don't understand distribution-talk themselves, and in any case they believe that they need to write for an audience that doesn't understand it. As a result, news stories on these topics are usually impossible to understand correctly unless you go back to the primary sources in order to recover the information that's been distorted or omitted. I imagine that something similar must happen when one Pirahã tells another about the deal that this month's river trader is offering on knives.

If you're one of the small minority who does understand distribution-talk, and you're thinking "well, maybe all those English majors don't get it, but all the technically-savvy people do", please go back and read this quietly-hilarious account of three geek journalists wrestling with the concept of the "long tail". And then imagine three Pirahã joking around about the number seven.

Posted by Mark Liberman at 06:18 AM

October 05, 2007

These ones (cont.)

My mail about these ones and related matters is piling up alarmingly.  The big news is that the expression is indeed regionally distributed: my U.S. correspondents are mostly dubious about it, but my U.K. correspondents find it unremarkable (and were consequently astounded by my judgment that it was non-standard).  And, not surprisingly, several of these British speakers report that these ones and plain these are not equivalent for them.

First in was Nicholas Widdows, who said that he

... would routinely both say and write 'these ones', without any awareness of any particular difference between the fused head [plain these] and determiner + plural noun constructions ...

but went on to consult the British National Corpus, where he found examples suggesting that these ones has a use that isn't easily available to these, namely to pick out one instance as a representative of a type:

Faced with an array of jelly babies I might point to a red one and say, 'I like these ones.' The fused head could be misinterpreted as referring to all jelly babies; the 'ones' says more clearly "this type".

He continues:

... fused head 'this' points to the thing there, 'this one' to one of the things there, 'these' to all the things there as a group, and 'these ones' to this one I'm pointing to and its fellows of that kind out of the larger group there. I'm not saying that's a rigid distinction, of course, only that that seems to be how I would normally use and interpret the phrase

And ends with an

Unconnected crazy idea: could some people think that 'one' is singular so there must be something dodgy about 'ones'?

Not so crazy: Alex Boulton tells me that some French teachers -- Boulton is at the University of Nancy -- tend to think that ones is an impossible form, period.  This is what you'd think if you appealed to "logic" and also subscribed to the idea that the one of this one is the numeral one -- in which case a plural ones would be "illogical".

However, the ONE of this one and these ones is not the numeral, but an indefinite pronoun, which serves as the head in NPs; the numeral, in contrast, functions (like other numerals) most commonly as a determiner (though, again like other numerals, it can serve as head in some constructions, for example the Numeral of NP construction in one of the dogs).  The numeral is the historical source for the indefinite pronoun, and for the generic personal pronoun (as in One never knows), and for that matter the indefinite article a(n), but these lexical items have all gone their own ways long ago, and each has its own syntax and semantics.  Etymology is not destiny.

Several correspondents have suggested treating these ones as parallel to these two, but I think that the real parallel is between these ones and this one, both of them demonstrative + indefinite pronoun.

[Digression on these two, which has two points of interest.  One point is that it shows that sometimes determiners can be layered, as in these two ideas (cf. the/your two ideas and the/your one idea here; there are other types of layered determiners).  Another point is that the two of these two fuses the determiner two with some indefinite head element -- much like ones, in fact -- so that two in some sense realizes both parts, much as the professor's in I like your idea, but I like the professor's even more realizes both the determiner the professor's and an indefinite head -- much like one -- in a single constituent.  (The fusion idea comes from CGEL, ultimately from the work of Michael Wescoat.)]

In any case, there's nothing wrong with ones in general.  Things like the blue ones and which ones and the ones (with a postmodifier, as in the ones from Chicago) and so on are all fine.

[Another side issue: correspondent Empty Pockets reports having been taught never to use these "as a noun" (that is, as a pronoun), but only "as an adjective" (that is, as a determiner), and suggests that overzealous application of this "rule" might be behind the use of these ones instead of these.  I'm familiar with this proscription, posted about it to ADS-L back in 2003, and intend to re-work that posting for Language Log.  But what's important here is that the proscription is general, applying to anaphoric uses of all the free-standing demonstratives: this, that, these, and those.  (The justification offered for this proscription is that all such uses are "vague".  Yes, this is hogwash; you don't need to write me about the deficiencies of the idea.)  There would be no reason to "fix" only one of these four by supplying one(s), and anyway, in the examples we started with it was deictic, not anaphoric, uses that were at issue.]

Back to the geography.  These ones seems widespread and in no way notable in the U.K.  The BNC, according to Alexander Boulton, has 87 occurrences of these ones and 60 of those ones, which is not a lot in 100 million words.  But the reports from native speakers suggest that the low numbers in the corpus merely reflect the rarity of the situations that would call for these expressions; when they're useful, British speakers use them.
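For a sense of scale, the raw BNC counts can be converted to the usual per-million-words rate.  This is only a sketch of the normalization, using the counts and the 100-million-word corpus size reported above; the helper name `per_million` is an illustrative choice.

```python
BNC_WORDS = 100_000_000  # corpus size as given in the post

# Raw occurrence counts reported from the BNC.
counts = {"these ones": 87, "those ones": 60}


def per_million(count, corpus_size=BNC_WORDS):
    """Normalize a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


for phrase, count in counts.items():
    print(f"{phrase}: {per_million(count):.2f} per million words")
```

Both expressions turn up less than once per million words, which fits the idea that the situations calling for them are simply rare.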

Andrew Cave reports (from Brisbane) that it's also common usage in Australia.  Meanwhile, Dan Asimov started noticing it around him when he moved this summer from the U.S. to Vancouver Island, Canada.  He posted about it on the Wikipedia "talk" page on Canadian English, but the follow-ups to his query don't tell us much about Canadian usage on this point.  Now Fiona Hanington writes -- also from British Columbia -- to say that it sounds completely natural to her.

In the U.S., I have a report from Patrick Whittle, saying it was reasonably common in central Kentucky, where he grew up.  And from Dave Kathman, who grew up in suburban Chicago, saying that it's perfectly OK in his idiolect.  As a card-carrying linguist, Kathman went on to say something about how he uses it:

"These ones" does not have the same meaning or distribution as deictic "these", as you might expect. As far as I can tell, I only use it when I'm trying to emphasize that I'm talking about a specific group of items, contrasting them either with a different but similar group of items, or sometimes with a category to which the items in question belong.  For example:

1. Those apples over there look rotten, but these ones are fine.
2. I'll take these ones.  (Where there is an implied contrast with some other group of similar items that I'm not taking)
3. I usually don't like Cubist paintings, but these ones are really amazing.  (Said in an art museum while standing in front of some Cubist paintings)

Simple "these" is also fine for me in all these examples; "these ones" merely provides emphasis when referring to a very specific, well-defined group.

This account is similar to the one from Nicholas Widdows above, and to a suggestion from Andrew Clegg that the 1 variant, with ones, is more likely when there's an implied contrast.

It's not surprising that people with both the 0 and 1 variants should subtly differentiate them -- as I repeat every week or so here on Language Log, variation is very rarely truly free -- or that the 1 variant, with an explicit head pronoun, should be used for contrast.

On the geography and standardness fronts, things are not entirely clear in North America.  A contributor to the Wikipedia talk page ("the user formerly known as JackLumber") points out that

... if you search the entire text of the Oxford English Dictionary, the only instance of these ones you're going to get is American---a citation from The Young Manhood of Studs Lonigan by James Thomas Farrell: "I know they ain't loaded. But use these ones. Them damn things is jinxed!" (s.v. jinx, verb.)  It ain't exactly Standard English anyways...

But then we have Dave Kathman's judgments, and the practice of a number of Canadians, at least in British Columbia.  It's possible that in North America these/those ones is a variant in the gray area between standard and non-standard -- fully acceptable to educated middle-class speakers in some areas, but not fully acceptable, though not actually stigmatized, to such people in other areas.  I know of cases like this.

For example, we had a discussion on ADS-L back in May about the sentence-initial discourse connective (serving as an additive adverbial) as well, as in

There are financial considerations.  As well, there are the children to consider.

(Please note how I've delimited the as well under consideration here.  There are several other uses of as well, with different syntax, semantics, and sociolinguistic status.  What I'm reporting here doesn't carry over to the others.)

Garner's Modern American Usage (p. 71) identifies sentence-initial linking as well 'also' as a Canadianism:

... this phrase has traditionally been considered poor usage.  But in Canada it's standard.

The facts seem to be that this discourse connective as well occurs with some frequency in both the U.S. and Canada, and that in Canada it is subject to no stigma, while in the U.S. it is viewed by some commentators as at least informal (and by some as unacceptable) and occurs infrequently in "good writing".  Maybe these/those ones is roughly like this as well.

This would probably be a good place to halt posting on ones.  I got into the topic through a search for literature on the regional and social distribution of these/those ones and on its history, but I've found nothing.  I wasn't proposing to investigate these topics from scratch myself, and if I were, collecting e-mail reports from individual speakers about their usage would not be an appropriate research strategy, nor would just googling up examples.  The most you can get from such sources is a sense of what you might look at in a careful study, and a careful study is a very big project.

But before I bow out, here's an intriguing report from Nick Baker on yet another side topic:

Your post 4988 on Language Log reminds me of a usage I found surprising among Jehovah's Witnesses.  The literature and the people are very prone to using <adjective> ones (I haven't counted, so it may not be in every Watchtower, but it's common).  One often sees or hears "such ones", "sheep-like ones", "these ones", etc., mostly referring to people (though I see one reference to principles, which sounds more natural to me), where most people I listen to would probably say "such individuals" or "sheep-like people".  I don't think I've ever heard "such ones", for example, from anybody else, and I don't know if it really has a regional usage base.  (watchtower "these ones" gets me 3480 Ghits.)

Every religion has its own vocabulary, but I was surprised to find a faith-based and apparently non-geographical usage of "ones".  I've heard it from Witnesses with a variety of speech backgrounds; I'm sure they pick it up from the religious community.

Posted by Arnold Zwicky at 03:34 PM

You read it here first

According to recent news reports, this year's Ig Nobel Prize in Linguistics went to Juan Manuel Toro, Josep Trobalon and Núria Sebastián-Gallés, of Barcelona University, "for showing that rats cannot tell the difference between a person speaking Japanese backwards and a person speaking Dutch backwards".

Regular Language Log readers will already know this research well, because we covered it (twice) back in January of 2005, with audio waveforms and sound clips.

I'm ashamed to say that we missed the research by Glenda Browne of Australia, who was awarded the prize in Literature "for her study of the word 'the' and the problems it causes when indexing". On the other hand, we have devoted quite a lot of attention to the word "the", and I would tell you about it if only I could figure out how to search our old posts to find the ones that are relevant.

[Update -- John Cowan deployed some strong search-fu, or perhaps just a great deal of patience, and came up with this list:



[And David Beaver points out that John's search for "definite article" missed

The the the and the thee the
The The, the The The, and The Who]


[Update -- Julia Keenan points out that Glenda Browne's "The Definite Article: Acknowledging 'The' in Index Entries" is available online.]

Posted by Mark Liberman at 02:47 PM

The caring membrane in the brain

Heidi W., who originally told me about the crockus a few weeks ago, wrote recently that she obtained the handout from one of Dan Hodgins' recent presentations on discipline, and she sent along a scan of one of his slides:

Unlike the crockus, the corpus callosum does actually exist. Everything else on the slide is wrong -- and I'm not talking about the missing apostrophes and the misspelling of than. But I don't think that we have to track down the elusive Dr. Alfred Crockus in order to figure out where Mr. Hodgins got the inspiration for this latest neuro nonsense.

There are three assertions here:

  1. The corpus callosum is "the caring membrane in the brain".
  2. Girls' CC is three times larger than boys'.
  3. Boys frequently cannot demonstrate empathy unless it is related to a physical action.

Hodgins himself is a pathetic figure, worth debunking only as an example of the charlatanry that apparently can flourish these days in the demi-monde of pseudoscience that the education industry calls "professional development". But the seeds that grew into these false or misleading ideas are out there, and it's interesting to take a look at the process of re-interpretation, exaggeration and myth-formation that creates them.

In this case, we can start the debunking with Dan Hodgins' own prior work. In my original post on the Crockus, I quoted from an article that he published in the Mott Community College Focus a few years ago, "Male and Female Differences":

Another structural difference, and perhaps the most striking, is the corpus callosum, the bundle of nerves that connects emotion and cognition. In females, it is up to 20% larger than in males, giving females better decision making and sensory processing skills.

This 20% difference was imaginary to start with, so it's not surprising that in Hodgins' collection of powerpoints, it's grown to 300%. If you're going to make stuff up, and you're bold enough to invent a brain region called the "crockus", why not supersize your fabrications about brain regions that actually exist?

But Hodgins did not invent the idea that there might be a corpus-callosum difference that's somehow relevant to sex differences in empathy. We can find the seeds of this concept in Simon Baron-Cohen et al., "Sex Differences in the Brain: Implications for Explaining Autism", Science 310(5749):819-823, 2005.

Although there is a great deal of individual variance in human brain morphometry (21), it is known that the cerebrum as a whole is about 9% larger in men and is also larger in boys (21), a difference that is driven more by white matter than by gray (22, 23). Despite the larger total volume of white matter in men [and despite the conflicting studies of sex differences in specific corpus callosum measures (24)], three-dimensional (3D) morphometry suggests that the ratio of corpus callosum to total cerebral volume is actually smaller in men (22).

The corpus callosum is a bundle of several hundred million axons that make connections between the two hemispheres of the mammalian cortex. Describing it as "the caring membrane in the brain" seems at first like a bizarrely random choice of phrase, something like describing a car's intake manifold as "the braking cavity in the engine". But as we'll see, this is a misrepresentation and exaggeration of a controversial but respectable theory.

Those "conflicting studies" about sex differences in CC size -- SB-C's reference (24) -- are discussed in K.M. Bishop and D. Wahlsten, "Sex Differences in the Human Corpus Callosum: Myth or Reality?", Neuroscience & Biobehavioral Reviews, 21(5) 581-601, 1997, which I quoted a couple of months ago, and also in my original note on Hodgins' Crockus. They assert that "Data collected before 1910 from cadavers indicate that, on average, males have larger brains than females and that the average size of their corpus callosum is larger. A meta-analysis of 49 studies published since 1980 reveals no significant sex difference in the size or shape of the splenium of the corpus callosum, whether or not an appropriate adjustment is made for brain size using analysis of covariance or linear regression."

Bishop and Wahlsten thought it was appropriate to publish these negative results in order to correct some earlier work by Dr. Sandra Witelson, whose claims of a sex difference in corpus callosum size were the subject of many news reports and much speculation.

SB-C's reference (22), which he takes to trump Bishop and Wahlsten, is J. S. Allen, H. Damasio, T. J. Grabowski, J. Bruss, W. Zhang, "Sexual dimorphism and asymmetries in the gray-white composition of the human cerebrum", Neuroimage 18, 880 (2003). Let's accept that (22) is right and (24) is wrong -- what's the claimed sex difference in corpus callosum size?

Using high resolution MRI scans and automated tissue segmentation, gray and white matter (GM, WM) volumes of the frontal, temporal, parietal, and occipital lobes, cingulate gyrus, and insula were calculated. Subjects included 23 male and 23 female healthy, right-handed subjects. For all structures, male volumes were greater than female, but the gray/white (G/W) ratio was consistently higher across structures in women than men. Sexual dimorphism was greater for WM than GM: most of the G/W ratio sex differences can be attributed to variation in WM volume. The corpus callosum, although larger in men, is less sexually dimorphic than the WM as a whole.

(Roughly, gray matter is made up of the cell bodies of neurons, along with short-distance axons and dendrites, glial cells, capillaries and some other stuff; and white matter is made up of myelinated axons that interconnect neurons across longer distances.)

Here's the relevant table of results from Allen et al.:

In terms of percentages, the average male corpus callosum in this study was about 9% larger than the average female corpus callosum. On the other hand, relative to total brain volume, the average male corpus callosum was about 4% smaller. And relative to total white-matter volume, the average male corpus callosum was about 7% smaller.
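The absolute and relative figures are linked by simple arithmetic. As a rough check, here is a minimal sketch that takes the rounded percentages above at face value and backs out the male-to-female ratios of total brain volume and white-matter volume that they imply. The function name is mine, not Allen et al.'s, and the sketch treats ratios of group averages as interchangeable with averages of individual ratios, which is only approximately right.

```python
def implied_denominator_ratio(absolute_ratio, relative_ratio):
    """Male/female ratio of the denominator (e.g. total brain volume).

    If CC_m / CC_f = absolute_ratio and
    (CC_m / V_m) / (CC_f / V_f) = relative_ratio,
    then V_m / V_f = absolute_ratio / relative_ratio.
    """
    return absolute_ratio / relative_ratio


# Rounded inputs taken from the paragraph above (assumed exact for illustration).
cc_abs = 1.09         # male corpus callosum about 9% larger in absolute terms
cc_rel_brain = 0.96   # about 4% smaller relative to total brain volume
cc_rel_white = 0.93   # about 7% smaller relative to total white-matter volume

brain_ratio = implied_denominator_ratio(cc_abs, cc_rel_brain)
white_ratio = implied_denominator_ratio(cc_abs, cc_rel_white)

print(f"implied male/female brain-volume ratio: {brain_ratio:.3f}")
print(f"implied male/female white-matter ratio: {white_ratio:.3f}")
```

On these rounded inputs, the male average comes out roughly 13 to 14 percent larger for total brain volume and roughly 17 percent larger for white matter. That is the sense in which the corpus callosum, at about 9 percent, is "less sexually dimorphic than the WM as a whole".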

It's not clear to me what to make of these differences in absolute and relative volumes -- but none of them are anywhere near a difference of 20% larger in favor of women, much less "three times larger". (It's true that Hodgins is talking about girls vs. boys, not adults; for developmental details, if you care, see my earlier post.)

More important, these average differences emerge from distributions that are very highly overlapped. Allen et al. don't show the actual data points for the corpus callosum, but they do show scatter plots of individuals' gray (or white) matter volume against the G/W ratio, for the left and right hemispheres separately. There are four of these scatter plots -- this is the one that shows the largest male-female difference:

(I've colored the female points blue and the male points red, since the quality of the figure as reproduced online makes it hard to distinguish them without scrutiny.)

OK, I hope you're saying to yourself, what does all this have to do with empathy, anyhow?

Continuing with Simon Baron-Cohen et al.'s 2005 Science article:

Recent hypotheses concerning neural connectivity in the autistic brain postulate an exaggerated version of what may also be going on in the typical male brain: a skewed balance between local and long-range connectivity (48–51). Such a connectivity difference could give rise to a deficit in empathizing, because empathy activates brain regions that integrate information from multiple neural sources (52). In autism, furthermore, long-range connectivity during an empathizing task is abnormally low (53). This notion of skewed connectivity is also compatible with strong systemizing, because systemizing involves a narrow attentional focus to local information, in order to understand each part of a system.

I described SB-C's theory of empathizing vs. systemizing in a post about a year ago, "Stereotypes and facts", 9/24/2006. Since then, several readers have taken me to task for being too credulous with respect to sex differences in empathizing that depend on self-report, as SB-C's data does.

Alen Murn sent me a reference to Nancy Eisenberg and Randy Lennon, "Sex Differences in Empathy and Related Capacities", Psychological Bulletin 94(1): 100-131, 1983. The abstract:

In this article, the literature on sex differences in empathy (defined as vicarious affective responding to the emotional state of another) and related capacities (affective role taking and decoding of nonverbal cues) was reviewed. [...] In general, sex differences in empathy were a function of the methods used to assess empathy. There was a large sex difference favoring women when the measure of empathy was self-report scales; moderate differences (favoring females) were found for reflexive crying and self-report measures in laboratory situations; and no sex differences were evident when the measure of empathy was either physiological or unobtrusive observations of nonverbal reactions to another's emotional state. Moreover, few sex differences were found for children's affective role taking and decoding abilities.

Murn also sent me Richard A. Fabes and Nancy Eisenberg, "Meta-Analyses of Age and Sex Differences in Children's and Adult's Prosocial Behavior", 1998, which surveyed "155 studies yielding 478 effect sizes", and came to essentially the same conclusions. Their interpretation of this pattern is that the apparent sex difference in empathy is probably a difference in social stereotypes and self-presentation, not a difference in actual feelings:

Sex differences in self- and other-reported prosocial behavior may reflect people's conceptions of what boys and girls are supposed to be like rather than how they actually behave.  [...] These findings are consistent with the view that girls' reputations for prosocial behavior are greater than the actual sex difference. [...]

Findings in regard to sex differences in empathy and sympathy, like those for prosocial behavior, vary with the method used to assess empathy-related responding. As mentioned previously, Eisenberg and Lennon (1983; Lennon & Eisenberg, 1987a), in a meta-analytic review, found large differences favoring females for self-report measures of empathy, especially questionnaire indices. No gender differences were found when the measure of empathy was either physiological or unobtrusive observations of nonverbal behavior. In more recent work in which sympathy and personal distress were differentiated, investigators have obtained similar findings, although they occasionally have found weak sex differences in facial reactions (generally favoring females) (see Eisenberg, Martin, & Fabes, 1996; Eisenberg, Fabes, & Miller, et al., 1989). Eisenberg and Lennon suggested that the general pattern of results was due to differences among measures in the degree to which both the intent of the measure was obvious and people could control their responses. Sex differences were greatest when demand characteristics were high (i.e., it was clear what was being assessed) and individuals had conscious control over their responses (i.e., self-report indices were used); gender differences were virtually nonexistent when demand characteristics were subtle and study participants were unlikely to exercise much conscious control over their responding (i.e., physiological indices). Thus, when gender-related stereotypes are activated and people can easily control their responses, they may try to project a socially desirable image to others or to themselves.

Dan Hodgins, however, has a different interpretation:

Boys frequently cannot demonstrate empathy unless it is related to a physical action.

According to Fabes and Eisenberg's review, this might be true, if you interpret it to mean that boys and girls feel and act empathetically to the same extent, but girls describe themselves as more empathetic than boys do, because they've learned that it's expected of them. Somehow, though, I don't think that's how Hodgins meant the audience to interpret his bullet point.

Posted by Mark Liberman at 06:32 AM

October 04, 2007

Just say "these"

Language Log reader Tim Leonard wrote me a little while ago about his wife's use of these ones, and I passed the query on to the American Dialect Society mailing list on 9/29/07, hoping that someone there would know of literature on the usage, especially its regional and social distribution and its history.  Nothing has turned up, and I haven't been able to find discussion of the variant in CGEL -- or, remarkably, in a pile of usage guides I examined.  But having started thinking about the variant, I was moved to write a bit about the syntax of such expressions.

Leonard's original message:

My wife consistently uses "these ones" where I use "these", and to my ear it isn't quite grammatical.  I figured it was a regional variation that was likely (since I probably represent a whole class of people who find her usage jarring) to have become some commentator's bugaboo, and that I might learn something from the usage guides.  But Merriam-Webster's English Usage is silent on "these ones", as is American Heritage Book of English Usage, The Columbia Guide to Standard American English, and Strunk and White.  Google to the rescue, with the web version of Paul Brians's Common Errors in English, but the entry on the web simply disapproves, with no reasoning or citations or commentary, so I've learned very little.

Is this variation familiar to you, or covered in any of your usage guides?  Or can you suggest how else I might learn about it?

And what Brians says:

By itself, there's nothing wrong with the word "ones" as a plural: "surrounded by her loved ones." However, "this one" should not be pluralized to "these ones." Just say "these."

These ones is indeed familiar to me (for many years now); people have reported it to me (with disfavor) many times.  But I don't find it in any of the first 20 usage guides I consulted (thanks again to Rachel Cristy for her help in these searches), despite the fact that it's an obvious candidate for an Omit Needless Words treatment -- that is, for a "secondary" appeal to ONW as backing for disapproval of a non-standard usage.  (I am as often surprised by the things that mostly escape the attention of usage advisers as I am by the things they pile up on.)

As for its regional and social distribution, the variant might have a distribution along the usual lines, but it's also possible that it's merely an occasional non-standard variant that turns up via analogy (to this one) every so often and then spreads from person to person without becoming strongly associated with any social identity, or spreads in this fashion while becoming associated with many different social identities.

On to the syntax.  First, I'll assume that those works like these (and that like this); this could turn out to be wrong, of course.  (Note that Brians mentions only these.  In fact, usage advisers often mention one instance of a phenomenon while neglecting other parallel instances.)

Next, I'll label the variants 0 (without one(s)) and 1 (with one(s)), and I'll distinguish two different uses of the variants -- a "deictic" (D) use, in which the expressions are accompanied by some sort of indicating gesture, perhaps just gaze, and an "anaphoric" (A) use, in which the expressions refer to some entity or entities established in the preceding discourse.  (Undoubtedly, there's more complexity here, but this is enough to expose some big things that are going on.)

I predicted that Leonard's wife doesn't use these ones EVERYWHERE he uses these; rather, he uses these everywhere she uses these ones, but she has both these and these ones.  The non-standard variant is, I believe, available only for D uses, not for A uses. (These predictions have now been confirmed.)

Here's the A pattern, including the Apl case, in which I think everybody (including Leonard's wife) has the 0 variant and NOT the 1 variant:


   I didn't buy it, because it was ugly.  [response] This/That is a strong objection.


   I didn't buy it, because it was ugly.  [response] *This/*That one is a strong objection.


   I didn't buy it, because it was ugly and cost too much.  [response] These/Those are strong objections.


   I didn't buy it, because it was ugly and cost too much.  [response] *These/*Those ones are strong objections.

Now for the deictic uses.  Assume that the speaker is looking at a tray of objects and has been asked to choose one, or some.


   I'll take this/that.


   I'll take this/that one.  [both 0 and 1 are possible, though they're not truly equivalent.]


   I'll take these/those.  [standard variant]


   I'll take these/those ones.  [non-standard variant]

(It's entirely possible, even likely, that some speakers with the 1 variant in Dpl also use the 0 variant on some occasions.  Even Leonard's wife might do this; he's only going to notice her productions when they differ from his own.  I don't know if anyone has looked at within-speaker variation on this point.)

Now to widen the focus a bit: the distribution of 0 and 1 variants differs from construction to construction.  Possessives pattern, I think, like D this/that: standard 0 (mine / yours / etc.), but non-standard 1 (my one(s) / your one(s) / etc.); I don't know whether the constructions co-vary within individuals.  To complicate things further, I think that full-NP possessives are more acceptable in the 1 variant (the professor's one(s)) than possessive pronouns are.

Other constructions are like Dsg this/that in allowing both the 0 and the 1 variant (as above, I'm not claiming that these variants are truly equivalent, only that both are acceptable):

EACH: Each (one) is flawed.

ANY: Any (one) of them will do.

Adj: You take the red pencil and I'll take the blue (one).  You take the red pencils and I'll take the blue (ones).

(As far as I know, no one has objected to the 1 variants here, or to Dsg this/that one, on "primary" ONW grounds.  True, you could save a word, but both variants are standard, so ONW doesn't come up.)

Finally, there are determiners for which only the 0 variant is possible (parallel to A this/that and A these/those), and others for which only the 1 variant is:

0-only ALL: All are flawed.  *All ones are flawed.

1-only EVERY: *Every is flawed.  Every one is flawed.

And that's what I know about this topic at the moment.

Posted by Arnold Zwicky at 12:11 PM

The "gender happiness gap": statistical, practical and rhetorical significance

A couple of weeks ago, two economists at Penn (Betsey Stevenson and Justin Wolfers) finished a paper about changes over time in women's vs. men's self-reported happiness. As usual in social science research, it deals with changes in the distribution of the characteristics of individual members of large and diverse groups.

A few days later, David Leonhardt promoted this research in a New York Times article, along with some other recent work by Alan Krueger. As usual in popular understanding of scientific research, this article's readers concluded that the research revealed properties of all the typical individuals in the groups studied, and especially of themselves. At least, that was the reaction of the hundreds of readers who contributed impassioned and often touching comments, both on the NYT site and in other web forums, and put Leonhardt's article on top of the "most emailed" list for several days.

Several friends and acquaintances mentioned the article to me in passing, and their reactions were similar to what I read online. Women used to be happier than men; now men are happier than women; does this mean that feminism has failed, or that it has succeeded, or that it has partly succeeded and needs more effort, or what? Does this mean that I (or my partner) should do more (or less) of the housework and childcare, or spend less (or more) time on career development, or worry more (or less) about personal appearance, or put more (or less) effort into hobbies and personal relationships?

Those are all good questions, but I wondered how much they really had to do with the original research. The rhetoric of science journalism -- and sometimes the rhetoric of science -- all too easily engages a sort of pop-Platonism that seems to be deeply connected to the way that we think about natural kinds. As a result, small (but statistically reliable) differences in group distributions are seen as essential properties of the groups themselves, and therefore of all the individuals that make them up. Or at least, all the normal or typical individuals. Intellectual and social mischief often ensues.

I wondered to what extent this might have happened in the "happiness gap" case, so I looked into the original papers. I found what I suspected I might -- the changes between groups and across time are small enough, relative to within-group variation, that you need a sophisticated mathematical analysis to see them. So I pointed this out in a couple of blog posts.

Further responses by various parties ensued. To say the least. (You can follow the whole thing in the list of links at the bottom of this post, if you want.)

In particular, Justin Wolfers wrote a couple of blog posts at Marginal Revolution -- "The Significance of Changes in the Gender Happiness Gap", 10/2/2007; and "The Real Significance of Changes in the Gender Happiness Gap", 10/3/2007.

I agree with the implication of Justin's titles -- the second post is much more interesting and important than the first.

In the first post, he argues that I misinterpreted the lack of asterisks (traditionally used to signal statistical significance) on certain lines in their Table 1. I took the missing asterisks to indicate that the analysis of the individual years 1972 and 2006 showed no statistically significant sex difference in self-reported happiness. Here's the table:

And here's what he says about it:

The right way to test for whether women were, on average, happier at the start of the sample is to look at the "Female dummy", which is clearly significant. The right way to ask whether this gender gap has changed is to look at the difference in trends, which is also clearly significant. The last two rows are regression-based predicted values, so we didn't think we should put stars next to these numbers.

OK, my bad. Justin goes on to show that you can limit the regression analysis to just trends in male vs. female happiness inferred from comparing the two years 1972 and 2006, and that when you do, you discover that in predicting the "%Very happy" proportions, "no coefficients are statistically significant", whereas in modeling the "%Not too happy" proportions, all the coefficients are statistically significant.

This is not quite the analysis I had in mind -- I assumed (incorrectly) from their Table 1 that if you analyzed only the data from 1972, or only the data from 2006, not looking for trends over time but just asking about the distribution of responses by sex in a given year, you would not find a statistically significant difference in men's and women's self-reported happiness in either year. (It turns out that the difference was statistically significant in 1972, but not in 2006, according to what Justin says later in the same post.)

But it doesn't matter, because my objection was never about statistical significance, but rather about effect sizes and practical significance, and especially what individual men and women should conclude that this means for them. When I mentioned the need to look at multi-year data to see the trend (although there are thousands of people involved in each year's survey), this was not an attempt to nit-pick the statistical analysis; it was one of a number of ways that I tried to get readers to see this research in terms of (fairly small) shifts in (heavily overlapped) male and female distributions, not in terms of (large, even qualitative) changes in the individual properties of men and women as instances of uniform categories, or in the properties of male and female prototypes.
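The gap between statistical and practical significance can be made concrete with a textbook approximation. The sketch below is not the Stevenson-Wolfers analysis. It assumes two equal-sized groups, a two-sided z-test at the conventional 0.05 level with 80% power, and a standardized mean difference d (difference in means divided by the within-group standard deviation). The function name and defaults are mine.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate respondents needed in each of two equal groups to detect
    a standardized mean difference d with a two-sided z-test."""
    z = NormalDist()                          # standard normal distribution
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)    # critical value for the test
    z_power = z.inv_cdf(power)                # quantile for the desired power
    return ceil(2.0 * ((z_alpha + z_power) / d) ** 2)


for d in (0.1, 0.2, 0.5, 0.8):
    print(f"d = {d:.1f}: about {n_per_group(d)} per group")
```

Under these assumptions a difference of half a standard deviation needs only a few dozen people per group, while a difference of a tenth of a standard deviation needs well over a thousand. With enough respondents, even a shift too small to notice in any individual will clear the conventional significance threshold.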

With respect to the arguments about statistical details, it's hard to improve (rhetorically, if not mathematically) on what Jezebel ("Celebrity, Sex, Fashion, Without Airbrushing") wrote yesterday afternoon, under the category of jezenomics ("Women Have Gotten Less happy, I'll Take My Graphing Calculator Out And Prove It"):

Remember that study on women being less happy than men? Sounds about right, right? The internerds thought so! (Different ways internet commenters said no shit: "Boo hoo, the feminists made their bed and now they have to lie in it with their cats" and "Men are dogs. Dogs are happy. The end" and "Duh, we get Halo, and you get periods." ) But hold on! Some linguists think it's not true! It's an academic freestyle battle! So after the linguists called bullshit (and by the way, what the fuck is up with linguists knowing everything about everything?) the original economists who published the study struck back to say the linguists were wrong, women really were unhappier, and here's their proof:

* Gender happiness gap at the beginning and end of the sample
oprobit HAPPY SEX [aw=wt] if YEAR==1972
oprobit HAPPY SEX [aw=wt] if YEAR==2006
* Changes in the gender happiness gap using only the first and last years
xi: reg vhappy i.SEX*i.YEAR[aw=wt] if YEAR==1972 | YEAR==2006
xi: reg unhappy i.SEX*i.YEAR [aw=wt] if YEAR==1972 | YEAR==2006

Ha ha ha ha, here's a little regression theory for you guys! (Get it? Blow me! Don't you think I'd be happier if you could?)

Maybe the real happiness gap started setting in whatever year it became popular for economists to stop working on the economy by day and getting their wives off at night and started applying advanced calculus to every single mundane happening in their lives including though not limited to why their wives were faking it! Because that happened in 2004.

As Justin Wolfers, who sent me the link, put it: "Funniest thing I've read on a blog in a long time..." After reading it, I think I won't follow my first impulse to ask Justin for the dataset so that I can try some other analyses.

Justin's second Marginal Revolution post takes up some aspects of what I think is the real question here:

As Deidre McCloskey has argued, we shouldn't be in the business of "asterisk econometrics", but instead, figuring out what effects have "oomph".

I can add another Deirdre McCloskey link to Justin's collection here: her Prickly Paradigm Press pamphlet The Secret Sins of Economics, which I blogged about back in September of 2004. What she has to say is fundamental to the scientific side of this whole discussion, and I can't resist quoting her peroration:

Cassandra, you know, was the most beautiful of the daughters of Priam, King of Troy. The god Apollo fell for her and made her a prophetess. In exchange he wanted sexual favors, which she refused. So he cursed her, in a most malicious way. He had already given her the power of prophecy, to know for example what would happen to a science that refused to ask seriously How Much. His curse was to add that though she would continue to be correct in her prophecies, no one would believe her.

Cassandra [to Trojan economists proposing to bring the wooden horse into the city]: The horse is filled with enemy soldiers! If you bring it into the city, economics is lost! Please don’t!

Leading Trojan Economist: Uh, yeah, I see what you mean, Cassie. Good point. Enemy soldiers. Inside. City lost. Qualitative theorems useless for a science. Statistical significance without a loss function equally useless. Economics ruined. Thanks very much for your prophecy. Great contribution. Love your stuff.
[Turning to colleagues] Okay, guys, let’s bring that sucker in!

Turning back to the public rhetoric of science, what Justin Wolfers has to say about the "oomph" of the "happiness gender gap" is cogent and absolutely to the point. You should read his whole post, if you're interested in the topic, but I'd like to comment on two of his suggestions. In general, these are versions of exactly the issues that I raised in my earlier posts:

Her [sic] are a few ways of thinking about the magnitude or "oomph" of the change we document:

  1. Effect Sizes: The coefficient in an ordered probit can be thought of as an "effect size", and hence the relative decline in the happiness of women is roughly one-eight [sic] of one standard deviation of the distribution of happiness in the population.  If you think there is a lot of variation in happiness in the population, this is big; if not, it is small.

It's true that in marketing or politics or sports, an effect size of 0.125 can make your fortune, swing an election, or win the division. And the more the variance in the population, the bigger the effective size of the change you can make by (say) improving your cardiovascular capacity by an eighth of a standard deviation. But more generally, as the Wikipedia article explains, "the most accepted opinion is that of Cohen (1992) where 0.2 is indicative of a small effect, 0.5 a medium and 0.8 a large effect size".

And in particular, when we're talking about the popular interpretation of group differences, effect sizes in the range of 0.1 to 0.2 are really problematic. When ordinary people come to understand how much overlap in the distributions such effect-sizes entail (see my post "Gabby guys: the effect size" for a concrete example), they generally discard the difference as insignificant. This is wrong, but it's wronger to conclude that effect sizes in that range (which are easily shown to be statistically significant, if you have enough data) tell us something about general properties of pop-Platonic prototypes.
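To see how much overlap effect sizes in this range entail, here is a minimal sketch. It assumes, purely for illustration, that both groups follow normal distributions with equal variance on some underlying scale; the actual survey responses are a few ordered categories, so this is a stand-in for the latent scale of an ordered probit rather than a description of the data. The function names are mine.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def overlap_coefficient(d):
    """Shared area of two unit-variance normal curves whose means differ by d."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


def probability_of_superiority(d):
    """Chance that a random draw from the higher group beats one from the lower."""
    return normal_cdf(d / sqrt(2.0))


# 0.125 is the one-eighth figure quoted above; the rest are Cohen's benchmarks.
for d in (0.125, 0.2, 0.5, 0.8):
    print(f"d = {d:5.3f}  overlap = {overlap_coefficient(d):.3f}  "
          f"P(superiority) = {probability_of_superiority(d):.3f}")
```

Under these assumptions, an effect size of 0.125 leaves the two curves sharing about 95% of their area, and a randomly chosen member of the higher group outscores a randomly chosen member of the lower group only about 53.5% of the time.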

Justin's second suggestion for evaluating oomph:

  1. Changing Positions: Let's think of lining up all the men in 1972, in order of their happiness.  In 1972, the median woman ranked between the 53rd and 54th man, happier than a slight majority of men.  By 2006, the median woman is somewhat less happy, ranking between the 48th and 49th man.  (A point of comparison: Moving from the 53rd to the 48th percentile of the household income distribution involves a difference of about $6,000 per year.)

Here's how I presented the same idea in my first post on this topic:

OK, so imagine coming into a door labeled "the room of unhappy people". You enter, and find yourself in a hall with 51 to 54 women, and 46 to 49 men. Do you think that you could decide which sex predominated, without lining everyone up and doing an explicit count?

Now imagine that you walk through two such rooms, where the first one is around 51-to-49 female, and the second is around 54-to-46 female. Do you think that you would notice the direction of difference in the sex ratios, without another pair of line-ups?

More to the point, do you think that you could spin differences like these into today's second-most-emailed NYT story?

If your answer is "yes", then you may have a future as a science writer. (Or, perhaps, as an economist...)

(I used the percentile estimates in the Stevenson-Wolfers paper based on projecting the 2006 median woman and man back onto the 1972 distribution, which is why the particular numbers are a little different. I think.)

Justin's point is that a percentile shift of this size is important. From some points of view, I'm sure that's true. In particular, as he goes on to explain, you'd need a very large change in e.g. unemployment statistics to cause a similarly-sized effect on self-reported happiness.

However, it remains true that the way that men and women reported their own happiness didn't change very much. In his imaginary line-up, the median woman is only moving by 5 places out of 100 over 34 years, and is ending up less than two places out of 100 below the median man. But the public clearly interpreted this research as showing that women as a group are now quite unhappy, both absolutely and also relative to men.
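Justin's line-up and his effect size appear to describe the same small shift in two different units. As a rough consistency check, this sketch converts the median woman's rank among 1972 men into standard-deviation units. It assumes a normal latent happiness scale and uses the midpoints 53.5 and 48.5 for "between the 53rd and 54th" and "between the 48th and 49th"; both the midpoints and the function name are my assumptions, not part of the Stevenson-Wolfers paper.

```python
from statistics import NormalDist

std_normal = NormalDist()  # assumed latent scale, in male standard deviations


def rank_to_z(rank_out_of_100):
    """Convert a rank in a line-up of 100 into a standard-normal z-score."""
    return std_normal.inv_cdf(rank_out_of_100 / 100.0)


z_1972 = rank_to_z(53.5)   # median woman between the 53rd and 54th man
z_2006 = rank_to_z(48.5)   # median woman between the 48th and 49th man
shift = z_1972 - z_2006

print(f"1972: {z_1972:+.3f} SD   2006: {z_2006:+.3f} SD   shift: {shift:.3f} SD")
```

On these assumptions the five-place move corresponds to a shift of about an eighth of a standard deviation, in line with the effect size quoted earlier.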

The empirically-observed trend, in which women have shifted over 34 years from self-reporting on average as a bit happier than men to self-reporting as a bit less happy, could be the result of a tiny shift in most people's moods, or a somewhat larger shift in a few people's moods; but it was interpreted by the public as a qualitative change in almost everyone's mood, to an extreme endpoint.

That's how the public rhetoric of science usually works, unfortunately, and that's why I (as a linguist and also as a citizen) was interested in the topic.

If you want the whole shaggy dog story, here's a list of the sequence of points and counterpoints (including only a small sample of the blog and forum reactions):

8/23/2007: Alan B. Krueger, "Are We Having More Fun Yet? Categorizing and Evaluating Changes in Time Allocation", ms. (Princeton U. and NBER).
9/16/2007: Betsey Stevenson and Justin Wolfers, "The Paradox of Declining Female Happiness", ms. (University of Pennsylvania).
9/26/2007: David Leonhardt, "He's Happier, She's Less So", NYT.
9/26/2007: Jezebel, "Women Less Happy Than Men About Performing Every Single One Of Those Multi Tasks".
9/26/2007: digg.com, "Men Are Now Happier Than Women".
9/26/2007: Mark Liberman, "The 'happiness gap' and the rhetoric of statistics", Language Log.
9/27/2007: Mark Liberman, "Gender-role resentment and rorschach-blot news reporting", Language Log.
10/1/2007: Steven D. Levitt, "Why Are Women So Unhappy?", Freakonomics Blog (NYT).
10/1/2007: Mark Liberman, "Why are economists so misleading?", Language Log.
10/2/2007: Jill Filipovic, "Feminists made their bed, now they have to lie in it alone with their cats", Feministe.
10/2/2007: Amanda Marcotte, "Women: Not really that unhappy avoiding scowling cretins and petting cats", Pandagon.
10/2/2007: Justin Wolfers, "The Significance of Changes in the Gender Happiness Gap", Marginal Revolution.
10/3/2007: Echidne, "The Gender Happiness Gap", Echidne of the Snakes.
10/3/2007: Jezebel, "Women Have Gotten Less happy, I'll Take My Graphing Calculator Out And Prove It".
10/3/2007: Justin Wolfers, "The Real Significance of Changes in the Gender Happiness Gap", Marginal Revolution.
10/3/2007: Steven D. Levitt, "The Debate on Female Happiness Heats Up", Freakonomics.

[Update -- a reader writes:

Last weekend, I saw Justin Wolfers give a talk on his paper "Racial Discrimination Among NBA Referees" at the Stats in Sports conference at Harvard. In the paper, he demonstrates that the racial composition of an officiating crew has a significant effect on the number of fouls called, given the race of the player who committed the foul.

I work as a statistical analyst for a pro basketball team. Many people in the front office called me to talk about the paper, asking for my interpretation. I had to explain that while the results were statistically significant, and that the paper was extremely well done (a rare occurrence when academics step into the sports arena), the effect size was so small that it was overwhelmed by less controversial aspects of the game.

Moreover, I explained that you couldn't make decisions based on these results, since the effect was visible only over a large dataset and that the bias was not tied to any particular player or official, but only the aggregate of all players and officials.

Your posts on gender happiness brought all this back to me. Most people have a very hard time understanding variation within populations. Much of my job is pointing out when the variation is so large that any talk about the "average" is practically meaningless -- which leads to the spectacle of the stats analyst arguing against using stats analysis and for subjective evaluations.

I don't know anything about economics, but I get the feeling that their papers are never about what they're about. Justin's talk used racial bias in basketball to make a larger point about racial bias in society -- but most readers (including me) are primarily concerned with what it says about basketball. I think the same thing is happening here with the gender happiness thing.]


Posted by Mark Liberman at 10:24 AM

October 03, 2007

The finance is enclosedchief: Fire exting wisher box

Victor Mair has sent in a few more examples, collected on his most recent trip to China, of "the proliferating Chinglish that leaves us all breathless".

Here is the banner line of a full-page advertisement for a new office building in a glossy in-flight magazine:


This mystifying English sentence is mirrored by the following Mandarin phrase:

seat of honor;
head of the table
"The seat of honor in the financial district"

Next come several warning signs:

In a stairwell at Lishi Hotel (Beijing):

Have a care!

I had no idea what to take care of until I read the accompanying Mandarin:


Be careful not to hit your head.

In the hallway of the hotel where I stayed in Ürümchi:

In contrast, in the Foreign Languages Bookstore in Beijing, we have:

FIREEXTINGUISHERCASE (no camera with me that day)

This photograph, courtesy of Paula Roberts, was taken in Lhasa:

The mistranslation is due to an overly literal rendering of the first two characters of the Chinese sentence, after which the translator (or the sign painter) apparently just gave up:

there is / has dog, please do not get near

I like the contrast between "have a care" and "have a dog"!

Victor also sent a photo to illustrate the growing role of doubtful English as a fashion accessory in China:

A similar impulse is behind the many strange uses of Chinese and Japanese characters documented by Tian at Hanzi Smatter. The most recent example is an analysis of the katakana subtitles on Kanye West's video for Stronger.

Victor is off to China for a conference on the Origins of Writing at Peking University, and promises to collect more examples along the way.

Posted by Mark Liberman at 06:37 AM

October 02, 2007

Pinker's almer mater

The September 22 issue of The Guardian featured a long profile of Steven Pinker by Oliver Burkeman. It's worth reading, especially if you want to know about some of the extreme reactions that Pinker's work in linguistics and evolutionary psychology has provoked. ("You wouldn't believe the kind of hate mail I get about my work on irregular verbs," Pinker boasts.) But buried in the middle of the piece (as it originally appeared) is an error that Pinker would surely appreciate:

Pinker graduated from Montreal's McGill University in 1976, reading experimental psychology, then completed a PhD in that field at Harvard, in 1979. (He has spent the rest of his professional life in the neighbourhood of Harvard, moving to the Massachusetts Institute of Technology, then back to his almer mater.)

Jenny Davidson spotted the goof right away, and eventually the Guardian ran the following correction in the Oct. 1 paper:

An interview with Steven Pinker referred to his "almer mater"; we meant alma mater.

As Pinker himself pointed out in Words and Rules, "speech errors provide clues on how the speech system is organized." (Just ask our own blunder maven.) This is an error of the graphological rather than phonological variety (and not atypical for a paper endearingly called The Grauniad), but it too yields some linguistic insights.

The first thing to note is that "almer mater" and "alma mater" would be pronounced the same for the non-rhotic speakers that predominate in Great Britain: both would be [ˌalmə ˈmɑːtə], give or take some variation in the vowels (the a of "mater" is sometimes pronounced as [eɪ], for instance). Given this pronunciation, it's not too surprising that a Latin expression would be misrendered in this way, since it's opaque for anyone not carrying around the knowledge of its derivation from alma (feminine of almus) 'bounteous' + mater 'mother'. A non-rhotic speaker might remember the -er of mater and accidentally extend it to both elements of the compound — in this case possibly further encouraged by the subject of the article, Pinker (that's [ˈpɪŋkə] to non-rhotics). I've spotted the error in some other UK media sources, such as this from the Liverpool Daily Post: "Heidi [Range] recently opened the performing arts centre at her almer mater, Maricourt."

The non-rhotic pronunciation of er in unstressed position as [ə] is responsible for a number of peculiar orthographic phenomena — peculiar at least from a rhotic speaker's perspective. For instance, it makes this already cryptic passage from A.A. Milne's Winnie-the-Pooh even more difficult to decipher:

When I first heard his name, I said, just as you are going to say, "But I thought he was a boy?"
"So did I," said Christopher Robin.
"Then you can't call him Winnie?"
"I don't."
"But you said—"
"He's Winnie-ther-Pooh. Don't you know what 'ther' means?"
"Ah, yes, now I do," I said quickly; and I hope you do too, because it is all the explanation you are going to get.

Elsewhere, Milne introduces a similar non-rhotic pronunciation spelling in the name Eeyore, which rhotic readers might be surprised to learn is meant to represent the sound of a donkey, i.e., [h]ee[h]aw.

Another non-rhotic spelling stumper involving -er is the title to the Led Zeppelin song "D'yer Mak'er." That's supposed to be pronounced like Jamaica [dʒə ˈmeɪkə], with the spelling indicating a punnish misunderstanding in a rather lame joke. Wikipedia elucidates:

The name of the song is derived from a play on the words "Jamaica" and "Did you make her", based on an old joke ("My wife's on vacation in the West Indies." "Jamaica?" "No, she went of her own accord.") ... The title, which appears nowhere in the lyrics, was chosen because it reflects the reggae flavor of the song. [Robert] Plant has said that he finds it amusing when American fans completely ignore the apostrophes and pronounce it as "Dire Maker".

Some other non-rhotic pronunciation spellings with er I've come across: proberbly and princerple for probably and principle, Murkarker for Macaca [məˈkɑːkə] (which I found when researching last year's Macaca-gate), and manner (from heaven) for manna (one of several non-rhotic items in the Eggcorn Database).

Finally, there's er itself, used as a written representation of the "pause filler" or "hesitation particle" [əː], which rhotic speakers would tend to write as uh. In his entertaining new book Um... Slips, Stumbles, and Verbal Blunders, and What They Mean, Michael Erard notes that Oliver Wendell Holmes, Sr. (father of the Supreme Court Justice) made an early complaint about this pause filler, using the variant spelling ur:

Once more: speak clearly, if you speak at all;
Carve every word before you let it fall;
Don't, like a lecturer or dramatic star,
Try over-hard to roll the British R;
Do put your accents in the proper spot;
Don't,—let me beg you,—don't say "How?" for "What?"
And when you stick on conversation's burrs,
Don't strew your pathway with those dreadful urs.
(Urania: A Rhymed Lesson, 1846)

For Holmes (born in Cambridge, Massachusetts in 1809), burr(s) and ur(s) would both have been pronounced non-rhotically, with the [əː] vowel. (Similarly, James Russell Lowell, another versifying 19th-century New Englander, used the pronunciation spelling princerple.) The proportion of non-rhotic speakers in the U.S. has since waned dramatically, so that now the spelling of the hesitation particle as er is sometimes (mis)construed rhotically, as if it were pronounced [ɚ] (rhotic schwa) or [ɹ̣] (syllabic r) — thus turning a pronunciation spelling into a spelling pronunciation.

I could go on about linking r's and intrusive r's and such, but I think I've wrung enough analysis out of the Guardian error for one post. (Hat tip, Regret The Error.)

[Update #1: Arnold Zwicky writes:

Googling on "almer mater" pulls up rather a lot of instances from American speakers. Some of these could be mere anticipations in typing or writing -- these are pretty common (my daughter's blog has a recent occurrence of "one one" for "on one", and I have some examples from my own writing, like "an organizing working" for "an organization working") -- but I'd imagine that many are reshapings of "alma mater" based on the knowledge that the expression is Latin and that words in Latin agree with each other in some way, leading to making the spelling of the first word "agree" with the spelling of the second, primarily accented word. ]

[Update #2: The Guardian writer, Oliver Burkeman, sent a very pleasant email. My apologies for originally spelling his name as "Burkmann," which was how Jenny Davidson had it in her post. The inevitable Bierce/Hartman/McKean/Skitt Law of Prescriptive Retaliation strikes again! Mr. Burkeman writes:

What a fascinating post! I feel strangely honoured. I'm tempted to pretend it was a deliberate error in a piece on language, intended to stimulate such an interesting discussion on language, but I don't think I have quite enough gall.
I can confirm that I'm a non-rhotic speaker, and that there's no difference between how I pronounce "alma" and the non-word "almer". I assume that's why I got this wrong, though of course when it comes to my unconscious mental processes, your guess is as good as mine. I did, of course, know the correct spelling of "alma mater", but that just makes things worse, really...
Also, I never studied Latin. That's the decline of the British education system for you. ]

Posted by Benjamin Zimmer at 04:55 PM

Weather report

Today's weather: Troublesome blustering winds from the North Atlantic dissipating quickly over the Philadelphia area. Early morning gender fog front gradually clearing with advanced enlightenment. Chance of snow of any description only slight in Alaska.

Posted by Roger Shuy at 11:23 AM

Losers are from Mars, winners are from Venus

Or is it the other way around? Anyhow, the day after the Guardian featured an excerpt from Deborah Cameron's new book "The Myth of Mars and Venus", we get a "special report" under the headline "Venusians in a Martian's world: How do women fare in parliament?". The lead paragraph:

In 2006, Tony Blair told the House of Commons that the next election would be a contest between "a heavyweight and a flyweight". He predicted that his successor, Gordon Brown, would knock out the Conservative leader, David Cameron, with a "big, clunking fist". These remarks delighted the Cameron camp while appalling many on Blair's own side. Labour supporters feared for their electoral prospects if voters got the idea that, as one journalist put it, "Gordon Brown is from Mars, David Cameron is from Venus".

The cute thing about this take on the mythology of gender, as the report observes, is that it doesn't even require any actual sexes.

The core of the report presents some work by Sylvia Shaw on the behavior of female vs. male MPs in the British and Scottish parliaments. As described (I know it only through this article), her work supports "dominance theories" as opposed to "difference theories" about language and gender:

Officially, arcane rules of courtesy govern the speech of MPs. [...] But in reality the rules are breached constantly. [...]

The influx of more than 100 women MPs in 1997 prompted many commentators, and some of the new MPs themselves, to suggest that women would exert a positive influence by introducing a more civilised way of doing business. [...]

In 1999, the linguist Sylvia Shaw decided to investigate whether any of this was happening. She found that it was not: rather than changing the verbal culture of the House of Commons, women seemed to have adjusted to its adversarial norms. In proportion to their numbers, women spoke as often as men and challenged other speakers to "give way" as readily as men. In short, they were (as MPs at Westminster have to be) assertive in competing for opportunities to speak. There was, however, one significant difference. Women rarely seized the floor "illegally" by interrupting or interjecting comments. In five debates analysed closely by Shaw, men made almost 10 times as many illegal interventions as women. [...]

Women MPs are classic "interlopers": they form a relatively small minority within a historically male institution, and the verbal harassment they face suggests a degree of active hostility to their presence. One logical response to being positioned as an interloper is to do exactly what Shaw found the women MPs did: observe the rules meticulously as a symbolic way of showing that you are worthy to belong. Paradoxically, however, this strategy only underlines the insecurity of those who use it. [...]

Shaw also studied the recently opened Scottish parliament, where once again, the most effective speakers tended to be people who deviated from the official rules. In Edinburgh, however, these rule-breakers were as likely to be women as men. This, Shaw argued, reflected the fact that the Scottish parliament was a new institution, with procedures designed deliberately to be less arcane than Westminster's. The proportion of women members was higher, and they had been there from the very beginning.

The women MPs' problem is clearly not that they have a less assertive or competitive style of speaking than men. That would not explain why there is a difference between the Westminster and Edinburgh parliaments, nor why Westminster women hold their own with men so long as they are speaking legally. The variable that does explain these patterns is not gender as such, but whether or not women are positioned as interlopers. To the extent that their behaviour is different from men's, it is not because they have a different style, but because they have a different status.

Posted by Mark Liberman at 09:03 AM

When I say "you" I mean "me"

From Lameen Souag at Jabal al-Lughat, "Impersonal vs. personal 'you'", 9/9/2007:

In English, "you" is equally used in a literal sense (referring to the addressee) or in an impersonal sense (referring to an arbitrary imagined experiencer.) In Darja, at first sight, it looks the same way - and for speakers of any one gender, this is true. However, looking at speakers of both genders allows you to realise that the distinction is grammaticalised. Addressee "you" agrees in gender with the addressee; impersonal "you" does not agree in gender with the addressee, but with the speaker. Thus a woman speaking to a man will say tṛuħ "you go" when "you" refers to the man addressed, but tṛuħi when it refers to an arbitrary person, like "When you go by bus, it takes a while."

Posted by Mark Liberman at 06:12 AM

October 01, 2007

Why are economists so misleading?

The "happiness gap" coverage continues over at the NYT, where today's Freakonomics blog picks it up ("Why are women so unhappy?", 10/1/2007), and hundreds of readers are once again pouring out their souls in the comments.

But the real question here is not why women are so unhappy, but why economists (and journalists) are so prone to oversell tiny group differences as if they were universal characteristics of the individual group members. Steven D. Levitt answers that implicit question in his lead sentence:

I saw Justin Wolfers a few weeks back, and I joked with him that it had been months since I'd seen his research in the headlines.

There's a parallel question, "Why is the public so gullible?", but the answer to that one has been well understood since the seminal research of Barnum (1869). Overall, the "happiness gap" coverage is a wonderful case study of the role that science journalism has come to have in public discourse.

In a couple of earlier posts, I took a quick look at the U.S. General Social Survey data from the unpublished Stevenson and Wolfers paper "The Paradox of Declining Female Happiness", and concluded that the "happiness gap" is pretty underwhelming, despite the deep chord that the news has obviously struck in the public consciousness. The "gap" doesn't show up in any single year's survey data -- indeed, both in 1972 and 2006, a somewhat larger fraction of women than of men reported themselves to be "very happy". The "gap" can only be seen through the magnifying-glass of a powerful statistical analysis across 34 years, which turns up a trend that is small relative to the year-to-year fluctuations, and tiny relative to the within-group happiness variation estimated by the model (which suggests that the between-group difference in happiness is about 1.9 percentile points).
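
For a sense of scale, here is a minimal sketch of one way to turn a standardized group difference into percentile points. It assumes both groups are normally distributed with the same spread, which is a simplification and not necessarily the calculation behind the 1.9 figure above; the function names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_gap(d: float) -> float:
    """Percentile points separating the average member of one group from
    the median of the other, when the group means differ by d standard
    deviations (assumption: both groups normal, equal spread)."""
    return 100.0 * (STANDARD_NORMAL.cdf(d) - 0.5)


def effect_size_for_gap(points: float) -> float:
    """Inverse of percentile_gap: the standardized difference that
    corresponds to a given number of percentile points."""
    return STANDARD_NORMAL.inv_cdf(0.5 + points / 100.0)


if __name__ == "__main__":
    d = effect_size_for_gap(1.9)
    print(f"1.9 percentile points ~ {d:.3f} standard deviations")
    for example in (0.2, 0.5, 0.8):
        print(f"d = {example}: {percentile_gap(example):.1f} percentile points")
```

Under these assumptions, a gap of about 1.9 percentile points corresponds to a mean difference of roughly 0.05 within-group standard deviations, far below the 0.2 that is conventionally labeled a "small" effect.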

Last week, I failed in my search for the other unpublished paper, Alan Krueger's work on trends in sex differences in the amount of time spent on activity perceived as unpleasant. So I fell back on some general observations about how the poor test/retest correlations of such judgments suggest that his group differences in misery indices are also likely to be extremely small in percentile terms.

But this afternoon, thanks to a tip from Matt Henning, I found Alan B. Krueger, "Are We Having More Fun Yet? Categorizing and Evaluating Changes in Time Allocation", draft version of 8/23/2007. Unfortunately, Prof. Krueger doesn't give us any information about the within-group (or even overall) distribution of his misery index. However, he does provide the table of time-series data that the NYT writer (David Leonhardt) called "an even starker pattern". Here it is -- I've plotted the data from his Table 4A "U-Index from Men and Women Combined". (Men's data is plotted with red 'm' letters, women's data with blue 'w' letters.)

Now, how many of you really think that the question to ask about this graph is "Why Are Women So Unhappy?", as opposed to the other two questions that I asked at the start of this post?

In Table 4B, Krueger massages the data a bit differently, using sex-specific estimates of activity unpleasantness rather than estimates from pooled evaluations, and comes up with slightly different time functions:

In both time-series, I continue to speculate that the within-group standard deviation of the underlying time-allocation data will be a substantial fraction of the mean values; when I have time, I'll check this out in the ATUS microdata, which is available here. But whatever the effect size, I leave it to you to judge how "stark" this trend in mean values really is.

The real effect here, I think, is the public's hunger for "scientific" evidence about sex roles, as a basis for releasing pent-up negative emotions about personal relationships and social status anxiety. The feelings are real, and worth discussing. The "happiness gap" is not, as far as I can tell, at least not on the basis of what the economists have to tell us so far.

[Update -- Mark Thoma at Economist's View complains that Alan Krueger described the effect as a "gradual" one percent change, and that the "even starker pattern" phrase was due to the journalist David Leonhardt. That's fair enough -- I've neglected my usual rule, which is "When in doubt, blame the journalist".

I spread a bit of the blame on the economists in this case as well, however, because I made the (perhaps unwarranted) assumption that Leonhardt learned about these two pieces of unpublished work because the authors took the initiative to tell him about it, and that his spin on it came from them. And Steven Levitt, who is a member of the clan, seems to like Leonhardt's interpretation just fine.

In any event, of course I don't believe that economists as a group are misleading -- the point of my headline was to make fun of the Freakonomics headline, which asked a similarly global "why" question about women's alleged unhappiness.]

[More here.]

Posted by Mark Liberman at 05:36 PM

The myth of Mars and Venus

An excerpt from Deborah Cameron's new book, "The Myth of Mars and Venus", was published today in the Guardian ("What Language Barrier?", 10/1/2007). A sample:

The idea that men and women differ fundamentally in the way they use language to communicate is a myth in the everyday sense: a widespread but false belief. But it is also a myth in the sense of being a story people tell in order to explain who they are, where they have come from, and why they live as they do. Whether or not they are "true" in any historical or scientific sense, such stories have consequences in the real world. They shape our beliefs, and so influence our actions. The myth of Mars and Venus is no exception to that rule.

Posted by Mark Liberman at 10:22 AM

Finch phrase structure?

Ten days ago, I described the recently-published PLoS paper by Gang Li et al., "Accelerated FoxP2 Evolution in Echolocating Bats", which showed that "contrary to previous reports, [the] FoxP2 [gene] is not highly conserved across all nonhuman mammals but is extremely diverse in echolocating bats" ("Wherein I take the bait", 9/21/2007). I briefly mentioned the unfortunate fashion, at its height five or six years ago, for calling FoxP2 the "grammar gene" or the "language gene" -- see Geoff Pullum's 9/5/2005 post "The Continuing Misrepresentation of FoxP2 effects" for some details, and Alec MacAndrew's 3/1/2003 essay "FoxP2 and the Evolution of Language" for more. I expressed the hope that the new research would further discourage careless speculation on this point.

But just a few days later, Seed Magazine published an article (Juan Uriagereka, The Evolution of Language, 9/25/2007) asserting that "we can be confident of the fundamental role of FOXP2 in human language".

Uriagereka doesn't mention the bat results, but he does mention (without specific references) some bold speculations about the role of FoxP2 from his earlier work with M. Piattelli-Palmarini ("The immune syntax: the evolution of the language virus", in Variation and universals in biolinguistics, 2004; and "The Evolution of the Narrow Faculty of Language: The skeptical view and a Reasonable Conjecture", Lingue e Linguaggio, 2005). The Seed Magazine article goes beyond those highly speculative proposals, to an even more extreme suggestion about the role of FoxP2 in "what linguists call parsing: the integrated processes that allow you to reconstruct complex sentences as you hear or see them, to produce them, or to acquire the fundamental parameters of your language as you first experience language as a baby":

Because of the similarities in brain structure and in the syntax of their song, finches must also have this parser. They may not be using it to process complex thoughts because they lack the cortex to generate them, but their songs are certainly intricate enough, and reconstructing their subtle structure from a linear sequence of notes is a remarkable computational task. [...]

Chimps, and our other close relatives the apes, certainly have the hardware for some basic forms of meaning, but all indications are that Neanderthals also had meaningful thoughts, enough to bury their dead or control fire, without much of a language. What they don't have is a way to externalize their thoughts. I'd wager that chimps just lack the parser that FoxP2 regulates. Somehow humans, by contrast, were able to recruit an ancient gene with a relatively ancient function to help us squeeze our thoughts out into the airwaves, much as a finch does with his. We are just thinking apes, with a finch's ability to sing.

On the question of whether there's any evidence that the FoxP2 gene regulates a "parser", I'll refer you back to MacAndrew's essay. As for the vocal abilities of Neanderthals, Steven Mithen, in his 2006 book "The Singing Neanderthals", weighs the evidence and concludes that they did not have compositional language, but did have a highly evolved form of singing, such that "all modern humans are relatively limited in their musical abilities when compared with the Neanderthals". Whether or not this theory is true, it illustrates the lack of any scientific consensus that singing is an innovation of anatomically modern humans (though it must have developed in the hominin line at some point after we split from the other apes).

But it's the stuff about finches in Uriagereka's essay that is most deeply strange. He combines some important themes with what seem to me to be some bizarre fantasies.

Now, I'm a fan of Darwin's idea that human language developed out of love songs. And I'm a great admirer of songbirds, and have put in some time working with researchers interested in the structure of their songs. I think that the intersection of distributional, physiological and genetic studies of birdsong production and perception offers one of the most exciting opportunities for scientific research today. The learning, production and perception of birdsong may well share some principles with human speech and language. But I'm baffled by the idea that finches "must have this parser", where "this parser" is identified exactly with whatever allows humans to learn, produce and perceive linguistic structures.

I'm also baffled by the statement that finches perform a "remarkable computational task" by "reconstructing [the] subtle structure" of their songs "from a linear sequence of notes".

What is the syntax of birdsong actually like? Here's a description from Kazuo Okanoya, "The Bengalese Finch: A Window on the Behavioral Neurobiology of Birdsong Syntax", Annals of the New York Academy of Sciences, Volume 1016 Behavioral Neurobiology of Birdsong Page 724-735, June 2004:

Bengalese finches have been domesticated in Japan for 240 years. Comparing their song syntax with that of their wild ancestors, we found that the domesticated strain has highly complex, conspicuous songs with finite-state syntax, while the wild ancestor sang very stereotyped linear songs.


We may divide birdsongs into two types. When one song note is followed by another song note in deterministic fashion in a single song, or the order of song notes are fixed in each song of a multirepertoire bird, such songs may be identified as "linear" song. The most widely used oscine song system models (the zebra finch, white-crowned sparrow, song sparrow, and swamp sparrow) could all be identified as having linear song syntax. When there are some variations introduced in the ordering of song notes, such a song should be called as a non-deterministic song. Species with non-deterministic song repertoires include the nightingale, starling, willow warbler, and Bengalese finch.

Among these species, Bengalese finches are unique in that their songs are characterized by finite-state syntax. Finite-state syntax refers to a simple form of syntax in which finite numbers of state are interconnected by arrows and a string of letters is produced when state transition occurs. In Bengalese finches, 2 to 5 song notes are chunked together, each of these chunks are emitted at a particular state transition, and the pattern of chunk production follows finite-state syntax.

Here are Okanoya's grammars for the songs of a Bengalese finch and a white-rumped munia (representing the wild ancestors of Bengalese finches):

FIGURE 2. Example of sonograms and transition diagrams of a white-rumped munia song (upper) and a Bengalese finch song (lower).

Okanoya Fig. 2

When Okanoya writes that "Bengalese finches are unique in that their songs are characterized by finite-state syntax", I think he must mean to be claiming that they are the only songbirds whose song grammar involves a rich repertoire of repeated sub-sequences, or something of the sort, as created by the recurrent arcs in the sample song grammar that he provides. I can testify from observation that (e.g.) the syllable sequences in zebra finch songs are well described by a finite-state grammar in which some states can be skipped or repeated. Similar finite-state grammars can also be inferred from other animal behaviors where discrete gestures are serially ordered in well-practiced ways, such as mouse grooming (see e.g. John C. Fentress, "Emergence of pattern in the development of mammalian movement sequences", Journal of Neurobiology, 23(10): 1529-1556, 1992; A.V. Kalueff and P. Tuohimaa, "Grooming analysis algorithm for neurobehavioural stress research", Brain Research Protocols, 13(3): 151-158, 2004). And for more on individual and dialect variation in finch tweeting, see the sequence of Language Log posts on Vinkensport.

But I'm not aware of any evidence that birdsong grammars involve long-distance dependencies, or clausal structures in which subsequences belonging to specific lexical categories can be substituted in designated slots, or any number of other characteristic properties of human syntax; nor do I know of any evidence that birds "parse" these behavioral sequences, in any way beyond the neural equivalent of transition-matrix probabilities. Perhaps they do, but simple assertion doesn't make it so.
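
To make "transition-matrix probabilities" concrete, here is a minimal sketch of a finite-state song model in which the next chunk depends only on the current one. The chunk labels and probabilities are invented for illustration and are not taken from Okanoya's figure; the point is only that such a model can be sampled from, and recovered from data, by nothing more than counting adjacent pairs.

```python
import random
from collections import Counter, defaultdict

# Hypothetical finite-state song grammar. Each state is a chunk of notes;
# each row gives the chunks that may follow it and their probabilities.
# "START" and "END" are bookkeeping states, not sounds.
TRANSITIONS = {
    "START": {"ab": 1.0},
    "ab": {"cde": 0.7, "ab": 0.3},    # "ab" may repeat
    "cde": {"fg": 0.6, "ab": 0.4},    # recurrent arc back to "ab"
    "fg": {"cde": 0.5, "END": 0.5},
}


def sing(rng: random.Random) -> list[str]:
    """Sample one song: walk the grammar from START until END is drawn."""
    state, song = "START", []
    while True:
        options = TRANSITIONS[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
        if state == "END":
            return song
        song.append(state)


def estimate_transitions(songs: list[list[str]]) -> dict[str, dict[str, float]]:
    """Recover transition probabilities by counting adjacent chunk pairs."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for song in songs:
        path = ["START", *song, "END"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for prev, row in counts.items()
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
songs = [sing(rng) for _ in range(2000)]
estimated = estimate_transitions(songs)

if __name__ == "__main__":
    print(" ".join(songs[0]))
    for prev, row in estimated.items():
        print(prev, {nxt: round(p, 2) for nxt, p in sorted(row.items())})
```

Nothing in this model looks further back than one chunk, so it has no way to express a long-distance dependency or a slot that must be filled by a member of some category; that is the gap between this kind of sequencing and human syntax.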

Working out the neural architecture of motor control that underlies such patterns is one of the most exciting research topics around these days, in my opinion. And it's plausible that motor-system patterns are somehow behind linguistic structures, and not only as a matter of evolutionary history -- the "Declarative/Procedural Model" of Ullman, Pinker and others suggests that "mental grammar involves procedural memory and is rooted in the frontal cortex and basal ganglia" (see Michael Ullman, "A Neurocognitive Perspective on Language: The Declarative/Procedural Model", Nature Reviews Neuroscience, 2:717-726, 2001).

But to reify human speech and language abilities as a "parser" located in the caudate nucleus, regulated by FoxP2 and shared with finches -- well, speculation is fun, but this is like the kind of too-specific science fiction that's out of date by the time it's published, and seems merely quaint within a few years.

Posted by Mark Liberman at 08:20 AM