May 31, 2006

"What up, Nick--?"

Last June in the Howard Beach neighborhood of Queens, New York, 19-year-old Nicholas "Fat Nick" Minucci beat black 23-year-old Glenn Moore with a baseball bat. Details are being hashed out in trial as to exactly what the sequence of events and motivations were: Moore admits he was in the neighborhood to steal cars, and Minucci and some friends claim that Moore tried to rob one of them, sparking the later baseball bat attack. But what's getting all the attention is that Minucci used "the N word" while beating Moore.

Or, perhaps just before: Minucci's version is that when Moore tried to rob him, he said "What up, n-----?" To the prosecutors, this means Minucci committed a hate crime that could get him sent up the river for years. The defense, however, are claiming that n------ is now heard so often that it is merely slang, and is no longer a "bad word."

The idea that Minucci's linguistic impropriety is more evil (and interesting) than the beating of anyone for any reason is one thing -- and a sad one, if you ask me (and for the record I am black). However, the defense's cute feint that n----- no longer carries a sting because it's all over rap albums and black men use it with each other is, well, b------- and they know it.

It's very simple. Long ago (and long before the 1980s, contrary to what many seem to think), black people and especially black men recruited n----- as an in-group term of endearment. It was a way of taking the sting from the slur. Today, the word signifies that said n----- is "one of us," no higher than the rest of us. N----- is a democratizer -- among black people.

Indeed, this means that white people are not allowed to call black people niggers (see, I can say it!). This is not difficult. This kind of thing happens in language. In Japanese, for example, instead of the meaning of a word varying with the situation, the word itself can. To give is AGERU if I give something to people, but it's KURERU if someone gives something to me -- but then if I give something to a high-placed person then I SASHIAGERU, and if a high-placed person gives something to me then they KUDASARU it to me.

Well, in English, when said by a white person to a black one n----- is an insult, while if black people use it among themselves, it's a term of endearment. Just as rank is the grand obsession of the Japanese (or at least used to be), race has been the grand obsession of America since the 1960s. Naturally, ceremonial linguistic rituals will arise on its basis, and internalizing them becomes part of the national identity.

When this goes as far as whites getting fired for even uttering the word in reference to it rather than actually wielding it, we have slid into the realm of senseless taboos of the sort that make remote tribal ones look so foreign to us. But the basic idea that a term is used only among groups corresponds neatly to universals of human social bonding and group definition.

If we all hear it "around" more lately, it's mostly on recordings of black men addressing and referring to one another. The rule nowadays even extends to other groups using n----- among themselves -- I have heard it used by Latino, Filipino and Asian teens. Okay. Among a new generation of hip-hop obsessed white kids, one even hears them using it with each other. But still, they cannot use it with a black person any more than a Japanese person can AGERU something to the Emperor.

Minucci and his defense team would have it that if Minucci said "What up, n-----?" to Moore, then he was just using common slang like fella, or the Russian MUZHIK, which means peasant but is used affectionately to mean FELLOW or GUY rather like n----- is. That is, they want us to believe that n----- no longer carries any racial meaning, just as DUDE, now used among women as well as among men, is becoming gender-neutral.

I suppose that would be nice, but we aren't there yet, no matter how many hip hop albums younger white kids have heard, and no matter how much some of them may think of themselves as, on some level, black. Minucci broke the rules on how n----- is used — which is especially clear if he indeed used it repeatedly while beating Moore. Imagine him punctuating each blow with, say, "Pal! Pal! Pal!" A term of endearment? And even if it was just a single "What up, n-----?", them was fightin' words and Minucci and his defense know it very well.

Posted by John McWhorter at 11:00 PM

Grunt and Grumble: sociolinguistic speculation at Slate

A few months ago, it was Jason Horowitz telling us about the "City Girl Squawk" of younger urban women ("The Affect: sociolinguistic speculation at the NYO", 3/22/2006; "Further thoughts on The Affect", 3/22/2006). Now it's Jon Katz telling us about the "Grunt and Grumble" of older rural men. ("Grunt and Grumble: Why do men in the country talk that way?", Slate 5/29/2006).

Both pieces are interesting examples of the genre of verbal caricature. I don't have time to say much about Katz's characterization of rural men right now, since I've got an appointment in a few minutes on the other side of campus (for a cute Language Log post about older-generation rural speech, see this one). But I'll mention one thing that struck me. Horowitz's caricature was aimed specifically at young women in the American northeast; Katz's caricature is pitched as a characterization of all rural men, although his observations are apparently limited to his neighbors in upstate New York.

Upstate New Yorkers are not exactly core Yankees, but they're not that far out in the fractal penumbra of Yankeehood; and there's an old stereotype of rural Yankees, represented by various anecdotes about Calvin Coolidge -- "silent Cal".

Coolidge was both the most negative and remote of Presidents, and the most accessible. He once explained to Bernard Baruch why he often sat silently through interviews: "Well, Baruch, many times I say only 'yes' or 'no' to people. Even that is too much. It winds them up for twenty minutes more."

[...]

Both his dry Yankee wit and his frugality with words became legendary. His wife, Grace Goodhue Coolidge, recounted that a young woman sitting next to Coolidge at a dinner party confided to him she had bet she could get at least three words of conversation from him. Without looking at her he quietly retorted, "You lose." And in 1928, while vacationing in the Black Hills of South Dakota, he issued the most famous of his laconic statements, "I do not choose to run for President in 1928."

And on his first day of retirement, back in his small hometown in Vermont, it's said that he went down to the local store, made his selections, and checked out via the following exchange:

Store owner: [rings up purchase, displays total] Been away.
Coolidge: [counts out money] Ayuh.

But I don't think you'd hear that, even as a joke, in rural Texas.

Posted by Mark Liberman at 10:09 AM

Congratulations to Joseph Aoun

Boston, MA, 7:00 a.m., Wednesday
Joseph Aoun, the highly successful Dean of the College of Letters, Arts and Sciences at the University of Southern California, will soon become the first Professor of Linguistics to assume the top position (president or chancellor) in a major university in the USA.

He has just been named the next president of Northeastern University here in Boston. Congratulations to both him and Northeastern.

Aoun earned his PhD in the Department of Linguistics and Philosophy at MIT, and has served for six years as a dean. He is noted for his excellent record in fundraising.

Further details in the Boston Globe.

Aoun is the first Professor of Linguistics to become a university president in this country, but not the first holder of a PhD in linguistics. The Swedish-born Nils Hasselmo earned a PhD from the Department of Linguistics at Harvard University in 1961, and later served from 1989 to 1997 as the president of the University of Minnesota. His university posts, however, were as a professor of Scandinavian languages and literatures. And Father Lawrence Biondi, S.J., earned an MA in linguistics (1966) and a PhD in sociolinguistics (under Roger Shuy, in 1975) from Georgetown University, and since 1987 has been the very successful president of St Louis University, a Jesuit university of moderate size (11,000 students) in Missouri, and the oldest university west of the Mississippi. Father Biondi has been active in fields other than linguistics (notably theology and university administration) since earning his doctorate.

Other high-ranked linguists who have held US university administrative positions have not been permanently appointed at ranks higher than that attained by the late Victoria Fromkin, who had the title Vice Chancellor for Graduate Programs at UCLA in addition to being graduate dean. It should be noted that Sheila Blumstein served for a while as Interim President at Brown University; Susan Steele was vice provost at the University of Connecticut and then provost at Mills College; Samuel Jay Keyser was an associate provost at MIT; a significant number of professors of linguistics have held deanships (probably a dozen or more); and Alfred Bloom, the current president of Swarthmore College, started at Swarthmore as an assistant professor in psychology and linguistics and is thus certainly an honorary linguist (though his degrees are in adjacent fields: a BA from Princeton in Romance languages and literatures and a PhD from Harvard in psychology and social relations).

Posted by Geoffrey K. Pullum at 07:09 AM

GAN: Whodunnit, and how, and why?

[Victor Mair sent in further analysis of a common but spectacular mistranslation, discussed in earlier LL posts: "A less grand Chinglish" 5/30/2006, which dealt with a button labelled "dry fry" in Chinese and "fuck to fry" in English; and "Engrish explained", which discussed a menu item reading "Hot and spicy garlic greens stir-fried with shredded dried tofu" in Chinese, but "Benumbed hot vegetables fries fuck silk" in English, 3/11/2006. Victor's note follows. ]

The translation of GAN as "fuck" is fairly ubiquitous in China. There are complications, of course, since GAN1CHAO3 on the sign I wrote about must mean "dry fry," with GAN1 in the first tone, whereas GAN meaning "fuck" probably derives from GAN4 ("to do") in the 4th tone. This latter word, furthermore, is written with an entirely different character in the traditional script (幹), though GAN1 and GAN4 have both collapsed into the same three-stroke calendrical graph in the simplified script (干). Furthermore, the actual sign from which I took this example has an arrow next to the GAN1CHAO3 / FUCK TO FRY which seems to be pointing to a button that you're supposed to PUSH to start the frying. Still, if naughty people are intentionally producing these risque, nonsense translations, then the double entendre of GAN1/4 ("dry / fuck") must be taken into serious consideration.

These sites show how widespread the mistranslation of GAN1/4 as "fuck" is:

http://www.flickr.com/photos/xiaming/70761148/
http://www.cameraontheroad.com/?p=1010
http://pangea.stanford.edu/~pvermees/chinglish/index.html (select "Fuck the price" in the radio box).
http://www.alwayson-network.com/comments.php?id=P14329_0_6_0_C

Just google {Chinglish fuck} and you'll get a lot of results.

I am trying to make sense of how this phenomenon actually came about. It seems that the twenty or so different meanings of the three-stroke calendrical graph that is used to write GAN1/4 (a total of three distinct graphic forms in the traditional script -- 干, 乾, 幹 -- all reduced to one -- 干 -- in the simplified script) in Chinglish have all collapsed into the single meaning of "fuck". Wherever that graph occurs, Chinglish speakers will translate it as "fuck".

This is an extremely bizarre situation, because:

a. normal Chinese-English dictionaries do not even give this definition

b. the widespread rendition of GAN1/4 as "fuck" in all sorts of situations where other translations are called for occurs on restaurant menus, official notices, and so forth, and it is not likely that the proprietors would intentionally want to insult or embarrass their patrons

Who's telling the menu-makers and sign-painters to write "fuck" for GAN1/4? They probably don't even know English and probably don't know much Chinglish either. How did this get started? (Perhaps somebody was being intentionally mischievous.) And how did it become such a common phenomenon? That's the real mystery. How is this horrible mistranslation continuing to spread and not being caught by the tens of millions of Chinese who do speak good English?

I'm deeply interested in the linguistic mechanics and the sociolinguistics of this baffling phenomenon. It is almost beyond belief that GAN1/4 as "fuck" proliferates when there are so many other good translations available in different contexts. You'd think that at least they'd write "do" everywhere, or that people who do know English would tell the proprietors to hurry up and change the offending word so as to avoid further embarrassment!

[Guest post by Victor Mair.]

[Update -- Brendan O'Kane wrote:

Hi - long-time listener, first-time caller.

I've been living in Beijing and working as a free-lance translator for some time now. It's pretty common for clients to take a text, hand it off to a (cheaper) Chinese translation company, and then pass it on to me to 'edit' - a lower-paid gig - and so I've seen quite a lot of this kind of thing. My guess, with the disclaimer that Prof. Mair has forgotten more than I'll ever know about Chinese, is that someone ran the Chinese 干炒 in the example in your blogpost, and the menu posted on rahoi.com, through a machine translation program, perhaps Jinshan Kuai Yi or something of the sort, with the offending results.

Can someone verify that there is a common Chinese-English MT program that maps GAN to "fuck"? That would explain a lot, if it's true. The puzzle then would become why outraged customers have not forced a modification of the software... ]

Posted by Mark Liberman at 05:31 AM

May 30, 2006

Supreme Freedom of Speech

Internal conflict is not common at Language Log Plaza. But you would think that when one of our writers disagrees with one of the high mucky-muck elders, there could be all hell to pay. And once in a while conflicts actually happen here, right in our hallowed halls. You may recall that last month there were lots of Language Log posts about the Harvard plagiarist, Kaavya Viswanathan. Geoff Pullum thought the young student/budding chick lit author should be hogtied for copying passages word for word from an earlier novel (see here). Bill Poser wasn't as sure about this and came to her defense (well, sort of anyway) (see here). The rest of us hunkered down at our desks, waiting for a donnybrook to take place. It didn't. Life went on as if no conflict had ever happened (we're special that way).

Why are we so calm and dignified at Language Log? Because we Loggers are civilized people who don't mind disagreeing with each other. No sweat. On to the next exciting topic! But it's a good thing that Language Log isn't part of the government. The Supreme Court just nailed a public employee who claimed that he had been denied promotion for challenging the legitimacy of a search warrant (New York Times). A Los Angeles deputy prosecutor complained to his boss that he had found some serious misrepresentations in an affidavit that his office used to get it. Apparently not a good move, since the deputy prosecutor shortly afterward was reassigned and denied promotion, for which he promptly filed a grievance. He lost in federal court but later prevailed in the Court of Appeals, which upheld his claim that his freedom of speech rights had been violated. Off the case then went to the Supreme Court, which ruled against the deputy prosecutor, claiming that when people enter government service, they have to accept certain limitations on their freedom. The deputy prosecutor was said to be acting in his official capacity, not as a private citizen, when he made his internal complaint about the inadequate search warrant. The Supreme Court's 5 to 4 decision split along the lines of ... well, you probably know how it split without my mentioning it. Many feel that, among other things, this decision could cast an uncomfortable pall on whistle blowers in the future.

So the Supremes now tell us that our right to freedom of speech has some pretty strong limits, especially if you happen to be a government employee. Fortunately, those of us at Language Log work in the private sector where we can disagree with each other all we want, even though we seldom do. We have freedom of speech. That's what democracy is all about -- except for government employees. But now that I think about it, aren't Justices Souter, Ginsburg, Stevens and Breyer also public employees, disagreeing with the majority? I wonder if ... naah.

Posted by Roger Shuy at 11:49 PM

So the search engine can understand

An article by Steve Lohr in the New York Times last April 9, about the way newspapers are using duller headlines online to make sure they get the right pickup by news-hunting web crawlers, contains the following quote from the head of product development and technology at BBC News Interactive, Nic Newman:

"The search engine has to get a straightforward, factual headline, so it can understand it," Mr. Newman said.

Now, if I seem a bit over-cautious here, keep in mind that BBC News is the organization that brought you the telepathic parrot and the three-headed frog, and Language Log is a little bit concerned that loonies have infiltrated the fine organization in question. But if Mr Newman's remark here is taken at face value, he would appear to believe that search engines understand things.

I will present no view here on whether machines might or might not be in principle capable of understanding (you'll want to vote yes if you go with Turing, no if you go with Searle; for now, I'm neutral), but my understanding is that search engines today, on this planet, cannot conceivably be described as understanding anything at all. The headline scanners of Google News do scan headlines and the first paragraphs of stories, and they do pick up enough information to classify the stories (the Google News page is put together entirely by machines, which is a really remarkable achievement). But the scanners simply look for words (letter strings) that are of normally low frequency and thus might be clues to the topic at hand. (For example, they conclude nothing at all from finding the, which occurs in nearly all sentences, but they conclude quite a lot from seeing Iran, which in texts on most subjects is rare.) They don't read for content, get the drift of the story, compare the sense of the paragraphs with their background knowledge and common sense, and chat about the issues with their friends. They tabulate letter strings and do statistical computations.
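To make the frequency idea concrete, here is a minimal Python sketch of that kind of string tabulation. It is not Google's method, which is only described in outline above; it uses a standard smoothed inverse-document-frequency weight as a stand-in, and the names `tokenize`, `make_idf` and `topic_clues`, along with the toy background texts, are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word-character strings; no grammar, no meaning."""
    return re.findall(r"\w+", text.lower())


def make_idf(background_docs):
    """Return a function that gives each string a smoothed inverse document frequency."""
    n_docs = len(background_docs)
    doc_freq = Counter()
    for doc in background_docs:
        doc_freq.update(set(tokenize(doc)))  # count each string once per document

    def idf(word):
        # A string found in every document scores log(1) = 0;
        # a string never seen in the background scores highest.
        return math.log((n_docs + 1) / (doc_freq[word] + 1))

    return idf


def topic_clues(headline, idf, top_n=2):
    """Rank the strings in a headline by count * idf, breaking ties alphabetically."""
    counts = Counter(tokenize(headline))
    scores = {word: count * idf(word) for word, count in counts.items()}
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Hypothetical background collection standing in for a large body of text.
background = [
    "the minister spoke in the house on the budget",
    "the team won the match in the rain",
    "a storm hit the coast in the night",
    "the talks on the budget stall in the house",
]
idf = make_idf(background)
clues = topic_clues("Iran talks stall in the capital", idf)
print(clues)
```

Nothing in the sketch knows English: given a body of text in another language for its counts, it would tabulate that language's strings in exactly the same way.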

The very least one has to admit about machine understanding is that there is a big difference between a search engine algorithm and a genuine understander like you or me — and I'm not saying it necessarily reflects well on me. If you switch a Google-style search engine algorithm from working on English to working on Arabic, it will very largely work in the same way, provided only that you make available a large body of Arabic text from which it can draw its frequency information. (I have actually met people working at Google on machine processing of stories in Arabic. They do not know how to read Arabic. They don't need to.) I, on the other hand, will become utterly useless after the switch. I will no longer be able to classify news stories at all (I don't even know the Arabic writing system, so I can't even see whether Iran is in a paragraph or not).

Call the machines cleverer, or call me cleverer, I don't care, but we're not the same kind of animal, and it seems to me that the verb understand is utterly inappropriate as a term for what Google News algorithms do.

[Added later: People from the programming culture have been mailing me to point out that a metaphorical use of the word ("The compiler won't understand that unless you put brackets round it") is commonplace among programmers. And if Mr Newman is a programmer, the above could well be regarded as unfair. Maybe so. In that case, just ignore the above cautions. But be very aware that metaphor is in play. Google's algorithms are ingenious and they work very well; but they understand things only in a very attenuated metaphorical sense under which you might also say that a combination door lock set to 4357 understands you when you punch in 4357 but not when you punch in 4358.]

Posted by Geoffrey K. Pullum at 05:35 PM

A Less Grand Chinglish

[Guest post by Victor Mair]

(Signs in Photographs Taken by My Student Carley Williams during Her Travels in China)

For each item I give the Chinglish sign, identification of the site where it occurred [in double parentheses ((xxx))], pinyin transcription, literal word-for-word translation, and then an idiomatic English translation; sometimes I omit the latter when the meaning of the word-for-word translation is sufficiently clear.

1. THEREOUT PULL IN GOT ON

((at a Taishan [Mt. Tai] cable car entrance))

   YOU2  CI3   JIN4   ZHAN4    CHENG2  CHE1
   from  here  enter  station  board   car

"Enter the station here and board the car."

2. GODCHOSEN TRAVEL SERVICE

((on a banner held by a guide))

   TIAN1 ZUO4   LYU3YOU2
   Heaven-made  Travel

3. The stairs and Pu Jiang Hotel of the carving arm-rest is like the long history

((on a wall next to a staircase in a hotel))

   DIAO1HUA1  FU2SHOU   DE  LOU2TI1,  YU3  PU3JIANG1    DE  LI4SHI3  YI1YANG4  YOU1JIU3
   carved     handrail  's  stairs    and  north-river  's  history  equally   old/long

"This staircase with carved banister has a history as old as that of the Pujiang Hotel."

4. My beauty comes from your painstaking care and attention

((at a scenic vista))

   WO3 DE  MEI3LI4  LAI2   ZI4   NI3 DE  JING1SHEN2  HE1HU4
   my      beauty   comes  from  your    spirit      protection

"The beauty of these natural surroundings depends upon your conscientious care."

5. Those who suffer from high blood pressure, mental disease, horrifying of highness and liquor heads are refused.

((notice at the entrance to a ride in an entertainment park))

   HUAN4     YOU3 XIN1ZANG4BING4, GAO1XUE4YA1,         JING1SHEN2BING4, KONG3GAO1ZHENG4 
   afflicted have heart disease,  high blood pressure, mental illness,  vertigo

   JI2 XU4JIU3ZHE3              XIE4JUE2 CHENG2ZUO4
   and those who are inebriated decline  ride

"Those who suffer from heart disease, high blood pressure, mental illness, or vertigo, and those who are drunk are not permitted to ride."


The next sign is in a class of its own. It comes from a photograph taken by another of my students named Jeisun Wen. Jeisun encountered this sign in a restaurant that he went to with his girlfriend. Neither of them could figure out what the sign was instructing them to do. I've shown this sign to scores of people but nobody can understand what it means. Because of my long experience in reading ancient Chinese manuscripts, I was able to decipher this most mystifying Chinglish sign within a couple of minutes.

FUCK TO FRY

(written all in capital letters just that way)

This sign is located at the corner of a panel in the center of which is found a tray labeled "CANDIED FRUIT". The corresponding Chinese text for "CANDIED FRUIT" is MI4JIAN4 LING2SHI2, which does indeed mean "candied / preserved [lit., honey] fruit snacks", so there's no real problem there, except that it's a bit odd to say "candied fruit snacks", since MI4JIAN4 traditionally would have been used by itself to signify a type of snack, and there is no need to specify MI4JIAN4 as LING2SHI2.

Now, on to the solution of the difficult part. The corresponding Chinese text for "FUCK TO FRY" is GAN1CHAO3, lit., "dry fry," which doesn't help us to unravel the "FUCK TO FRY" knot. I believe what happened is that a Chinese person asked an English speaker what to write below the GAN1CHAO3 sign that would more or less equal it. The English speaker must have told them to write PUSH TO FRY, i.e., push a button at the corner of the table to heat up the MI4JIAN4 on the tray. Unfortunately, when the Chinese sign painter did the lettering for PUSH TO FRY, P morphed into F, S morphed into C, and H morphed into K (such things can happen when one's handwriting is not perfectly clear!), and the rest is history, immortalized in this eternally perplexing instruction: FUCK TO FRY.

[Guest post by Victor Mair]


[Comment by myl: I never thought I would be in a position to amend Prof. Mair's Chinese philology, even indirectly! In an earlier Language Log post by Ben Zimmer, "Engrish explained", you'll find a related puzzle, a scanned menu item taken from a blog post by Jon Rahoi, an American living in China. One commenter accused Rahoi of photoshopping it; but "an anonymous professor of China studies" rescued Rahoi by offering the following explanation:

Take #1313, "Benumbed hot vegetables fries fuck silk." It should read "Hot and spicy garlic greens stir-fried with shredded dried tofu." However, the mangled version above is not as mangled as it seems: it's a literal word-by-word translation, with some cases where the translator chose the wrong one of two meanings of a word.

First two characters: "ma la" meaning hot and spicy, but literally "numbingly spicy" -- it means a kind of Sichuan spice that mixes chilies with Sichuan peppercorn or prickly ash. The latter tends to numb the mouth. "Benumbed hot" is a decent, if ungrammatical, literal translation.

Next two: "jiu cai," the top greens of a fragrant-flowering garlic. There's no good English translation, so "vegetables" is just fine.

Next one: "chao," meaning stir-fried, quite reasonably rendered as "fries" (should be "fried," but that's a distinction English makes and Chinese doesn't).

Finally: "gan si" meaning shredded dried tofu, but literally translated as "dry silk." The problem here is that the word "gan" means both "to dry" and "to do," and the latter meaning has come to mean "to fuck." Unfortunately, the recent proliferation of Colloquial English dictionaries in China means people choose the vulgar translation way too often, on the grounds that it's colloquial. Last summer I was in a spiffy modern supermarket in Taiyuan whose dried-foods aisle was helpfully labeled "Assorted Fuck." The word "si" meaning "silk floss" is used in cooking to refer to anything that's been julienned -- very thin pommes frites are sold as "potato silk," for instance. The fact that it's tofu is just understood (sheets of dried tofu shredded into julienne) -- if it were dried anything else it would say so.

I believe that this explanation applies to "FUCK TO FRY" as well, and is simpler than the letter-substitution theory.

Also see "A grander Chinglish", "Regale in Basilica"; and from the other side, "Semen, green rice and the rate of internet decay" ]

Posted by Mark Liberman at 10:39 AM

Rhythms of the blogosphere

Last year ("Language: the anti-beer?" 4/23/2005) I mentioned something that's obvious to anyone who tries blogpulse -- the blogosphere, like the ocean, has rhythms on several different time scales. Comparing, say, "paper" and "movie", we can see an inverse correlation at two of these scales:

There's a week-vs.-weekend pulse, which we can also see in pairs like "work" vs. "fun":

And in the case of "paper" and "movie" there's also a semester-sized rhythm, with a decrease of school-related concerns relative to leisure during the Christmas break, and an increase at the end of the spring semester -- and then there's the leading edge of the summer holidays.

The same sort of rhythms are apparent in Language Log's visits and page views. Language Log's weekly rhythms traditionally correlate, alas, with "work" as opposed to "fun" -- up during the week, down on the weekends (ignore the fractional-day numbers for today, May 30):

The weekend of May 21 was something of an exception, due to traffic associated with the opening of The Da Vinci Code.

On a larger time scale, we can see the (negative) effect of holidays superimposed on a general positive trend:

Not all words in the blogosphere resonate to the semesterly rhythm: "work" vs. "fun" seems to show the more local effect of grown-up holidays:

Down in the Language Log marketing department, Arnold Zwicky keeps yelling "come on, enough with the lexicostatistics and grammar already, it's after Memorial Day, we need more movie reviews and travel features!"

Temporarily ignoring this sensible prescription, I'll observe that there's an opportunity here for some interesting lexico-temporal hacking. "Latent Semantic Analysis" and similar techniques find useful relations among words based on the eigenstructure of a term-by-document matrix; does adding the dimension of time contribute anything that is not already implicit in the distribution of words across (atemporal) documents? There are some large weblog and webforum databases where this could be explored.

[Update -- Bob Carpenter emailed a pointer to Krisztian Balog and Maarten de Rijke, "Decomposing Bloggers' Moods", 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics (at WWW 2006). This paper applies ARIMA time-series analysis to "20 million mood-annotated blog posts harvested between June 2005 and March 2006". The authors draw four conclusions:

(i) there is a clear overall decline in the usage of mood annotations; (ii) weather phenomena and holidays have a clear impact on the profile of some moods; (iii) looking at the relative counts, we observe that some moods are stationary, while others decline or climb; and (iv) several moods display changes in their cyclical or seasonal component during the period covered by our data.

This seems both plausible and interesting, but it's not what I was suggesting. My idea was that by modeling word-co-occurrence data relative to significant periodicities, such as weekly or seasonal rhythms, you could learn something about the distributional implications of word content beyond what would emerge only from considering a-temporal co-occurrences.]
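As a rough illustration of the periodicity half of this idea (not of Latent Semantic Analysis itself, and not of Balog and de Rijke's ARIMA models), the Python sketch below builds a weekday profile for each word and correlates two such profiles. The function names `weekday_profile` and `pearson` and the seven toy posts are hypothetical; a real experiment would draw on one of the large weblog databases.

```python
from datetime import date
from statistics import mean


def weekday_profile(posts, word):
    """Share of posts on each weekday (Monday=0 .. Sunday=6) that contain `word`."""
    hits = [0] * 7
    totals = [0] * 7
    for day, text in posts:
        weekday = day.weekday()
        totals[weekday] += 1
        if word in text.lower().split():
            hits[weekday] += 1
    # Weekdays with no posts at all get a share of 0.0.
    return [h / t if t else 0.0 for h, t in zip(hits, totals)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Toy data: one post per day for the week of Monday 22 May 2006,
# school talk on weekdays and leisure talk on the weekend.
posts = [
    (date(2006, 5, 22 + i), "finishing my paper" if i < 5 else "saw a movie")
    for i in range(7)
]
paper = weekday_profile(posts, "paper")
movie = weekday_profile(posts, "movie")
print(pearson(paper, movie))
```

Two words with a strongly negative profile correlation, like the toy "paper" and "movie" here, move in opposite phase over the week, which is information a purely atemporal term-by-document count would not see directly.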

Posted by Mark Liberman at 09:38 AM

May 29, 2006

Tolstoy enlisted to sell Viagra

I just received a spam email (and it got through the filters) containing part of Chapter 18 of Tolstoy's Anna Karenina. It was included as random plain text, to fool the spam filters into thinking it was a perfectly ordinary email from a Russian aristocrat. This does worry me a bit: I'm not sure I want to re-train spamassassin to deep-six all messages containing fine writing, richly delineated characters, and depth of emotion. Seems like throwing the baby out with the bathwater.

The main content of the spam, of course, was contained in an image showing an advertisement aimed at getting me to purchase pharmaceuticals that would give me "an unbelievable sex during all the night". There are certainly unbelievable aspects to its picture of sex.

The message asks me a number of rhetorical questions, the first two of which are: "Wanna be the first in her list? Are you dreaming about her friends beating your time?"

We appear to be talking about a chick with a stopwatch who has so many friends-with-benefits that she has to keep a list. And my goal is taken to be to rise to the top of that list.

The message then asks, provocatively but incomprehensibly: "Wanna her making all your dreams come true in the bed?" I think actually I wanna them making all my emails come understandable in the translation.

But let's move on. The next topic has to do with what other people say: "Would you like to hear from the babes, ‘he was the best man in my life’?" Well, note (for this is Language Log) the third person singular. If the babes are saying, "He was the best man in my life" to me, they're talking about someone else (someone who, if they're all saying it, has bedded all of them).

Perhaps the intent is to ask me whether I would like to hear from the babes that one of their number had been talking about me and had said in that context "He was the best man in my life", meaning by "he", me? Well, I invite you to note also (for this is Language Log) the preterite (or simple past) tense. If she (or they) said "was" rather than "is" when talking about me, then apparently I'm history already. (Who has displaced me, in this scenario? All those "friends" of hers beating my time, I guess.) So the answer is, since they ask, no, I definitely do not want to learn that the babes have been saying "He was the best man in my life."

Following this indirect-discourse puzzler, the message moves directly to the heart of the matter, or rather (for its priorities are clear), the phallus of the matter. It turns out that "your hypersexuality doesn't depend on the size of your penis, it depends on its ability to keep its hard-on up to several hours! And that's the way to deliver the best orgasm to her!"

So there we have it. Hours and hours of pounding, that's what those babes want. Relentless, unstoppable, bed-breaking hammering from a guy (size doesn't matter) with a multi-hour chemically-induced erection that beats all the time records set by everyone else in the long lists of timings and orgasm counts that the little sluts all keep in their diaries. This is the picture of mutual sexual pleasuring that is offered by these people, whom we are expected to trust in matters of pharmacosexual advice.

What we have here is a case of someone trying to sell generic Viagra using advertising copy written by sexually inexperienced male illiterates with small peckers. And the copy comes wrapped in Anna Karenina now. I'm so glad Tolstoy didn't live to see this.

Posted by Geoffrey K. Pullum at 10:09 PM

Selling ignorance

We Americans love to learn about how ignorant we are. At least, you'd think we did, given the steady pulse of news stories about how we can't find Afghanistan on the map, enumerate first-amendment freedoms, and so on. There are some other motivations, I guess, including the traditional sport of grumbling about cultural decay, and educators' interest in persuading the populace that we got trouble; but whatever the reason, there seems to be a small industry whose product is press releases suggesting that most of us are about 10 SAT points above grunting and bashing one another with sticks.

The First Amendment to the U.S. Constitution is a particularly easy peg to hang such stories on. It's important, but complicated -- and so it's easy to find what looks like evidence of ignorance, and it's obvious that this matters, and it's trivial to apply the rhetoric of survey spin to make the point. For some LL discussion of an earlier case, see "Freedom of speech: more famous than Bart Simpson?" 3/3/2006. The truth about public knowledge and opinion in this area matters, in my opinion, and it's worthwhile for people to inoculate themselves against the kinds of spin used to exaggerate public ignorance, so I thought I'd post a little tour of a recent blogospheric example.

A couple of days ago, Glenn Greenwald posted a passionate denunciation of what he sees as recent assaults on First Amendment rights (Unclaimed Territory, "People who don't understand how America works", 5/27/2006). He describes the threats to "[imprison] journalists who publish stories containing information which the Bush administration wants to conceal", and the belief that "[if] you are a U.S. citizen, the President can unilaterally order you abducted and imprisoned; does not have to charge you with any crime; can block you from speaking with anyone, including a lawyer; can keep you incarcerated indefinitely (meaning forever); and can deny you the right to any judicial review of your imprisonment or any mechanism for challenging the accuracy of the accusations." He quotes approvingly from Antonin Scalia's opinion (in Hamdi v. Rumsfeld) that "[t]he very core of liberty secured by our Anglo-Saxon system of separated powers has been freedom from indefinite imprisonment at the will of the Executive...", and concludes that "people who never learned that American citizens can't be imprisoned by Executive decree and without a trial, or that American journalists aren't imprisoned for stories they write about the Government's conduct ... plainly do not embrace, or comprehend, even the most basic principles of what America is".

I agree with Greenwald and Scalia about these issues, and (when the rhetorical underbrush is cleared away) I think that most other Americans do too. But one of the commenters on Greenwald's post suggests that the core of the problem is in the American population, not in certain factions of the American intellectual and governing classes, and supports the case with a quotation from one of the ignorance-mongering press releases I'm talking about:

Glenn, these demagogues are just reflecting the beliefs and understanding of their voters. Sadly enough, a recent poll shows that a significant minority of Americans think the press should have moderate to severe restrictions on its freedom:

* Only 14% of Americans – and only 57% of journalists – can name freedom of the press as a right in the First Amendment.
* 43% of Americans believe the press has “too much freedom,” while 3% of journalists agree.
* 22% of Americans believe government should be able to censor newspapers.
* 72% of journalists said the media is doing at least a good job in reporting information accurately; 39% of Americans agreed.
* Only about one-third (36%) of Americans agree the news media tries to report the news without bias, while 61% claim there is bias in news coverage.

The hyperlinked press release doesn't tell us what questions were asked in what order, but it gives a clue:

Only 14% of Americans, and 57% of newspaper and TV journalists, can name “freedom of the press” as a right that is guaranteed by the First Amendment, according to a new University of Connecticut study.

“Freedom of the press is at the core of America’s brand of democracy,” commented Professor Ken Dautrich who directed the study. “It’s quite surprising that so few Americans can name it as part of the First Amendment. Even more disappointing is the fact that those who use free press rights in their work aren’t more knowledgeable about it.”

When asked to identify the specific rights guaranteed by the First Amendment, “freedom of speech” is cited most frequently (58%) by Americans, followed by freedom of religion (16%). The right to peaceably assemble (10%), and the right to petition government for a redress of grievances (1%) are even less identifiable than free press.

If you're hip to the rhetoric of survey spin, you'll guess at this point that the survey asked people to enumerate first-amendment rights by free recall. They probably weren't given a list of possible rights (real and fake) to pick from; and they probably weren't asked to list the rights guaranteed by the constitution as a whole, or by the bill of rights, or whatever.

Think about this for a minute. Do the Ten Commandments prohibit adultery? I bet that most people would say "yes". What is the number of the commandment that prohibits adultery? I bet that most people can't remember. What does the 8th commandment say? I bet that most people can't remember this either (hint: that's not the one about adultery).

Now, if you want to design and report a survey to show that people are ignorant of the decalogue, you'll ask a question like "what does the eighth commandment say?", and you'll report the results by writing something like "Only 14% of Americans, and 57% of preachers, were aware of the commandment that prohibits stealing".

The UConn press release tells us that "A complete copy of the survey results can be found at: http://www.dpp.uconn.edu", but this is apparently no longer true. However, a bit of general googling suggests that there is a close relationship between the cited survey and one carried out by New England Survey Research Associates for the First Amendment Center: "State of the First Amendment 2005". At least, the report credits "Professors David Yalof and Ken Dautrich" with devising the questions and supervising the survey, and the reported numbers are similar though not identical. According to that report, the very first question in the survey was indeed:

As you may know, the First Amendment is part of the U.S. Constitution. Can you name any of the specific rights that are guaranteed by the First Amendment?

The report gives percentages for a few of the answers, broken down over several years of the survey:

                                1997  1999  2000  2001  2002  2003  2004  2005
Freedom of the press             11%    2%   12%   14%   14%   16%   15%   16%
Freedom of speech                49%   44%   60%   59%   58%   63%   58%   63%
Freedom of religion              21%   13%   16%   16%   18%   22%   17%   20%
Right to petition                 2%    2%   21%    1%    2%    2%    1%    3%
Right of assembly/association    10%    9%    9%   10%   10%   11%   10%   14%
Don't know/refused to answer     N/A   N/A   37%   36%   35%   37%   35%   29%

Just for reference, in case you don't happen to have the first amendment memorized yourself, it reads:

Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the government for a redress of grievances.

(And why not go ahead and memorize it? It's only 45 words long... For that matter, the whole Bill of Rights is just 613 words.) As the survey results indicate, most people associate the first amendment with freedom of speech. I suspect that 63% is a higher percentage than could say what the first commandment requires, in a similar test of free recall.

The poll's questions #3 and #8 are more relevant to the first topic of Glenn's post:

3. Overall, do you think the press in America has too much freedom to do what it wants, too little freedom to do what it wants, or is the amount of freedom the press has about right?

                                  1997    1999 1999(f)    2000    2001    2002    2003    2004    2005
Too much freedom                   38%     53%     42%     51%     46%     42%     46%     42%     39%
Too little freedom                  9%      7%      8%      7%      8%      8%      9%     12%     10%
About right                        50%     37%     48%     41%     42%     49%     43%     44%     47%
Don't know/refused to answer        3%      2%      3%      2%      3%      2%      1%      3%      4%

8. Overall, do you think Americans have too much, too little or just the right amount of access to information about the federal government’s war on terrorism?

                                2002  2003  2004  2005
Too much access                  16%   12%   15%   14%
Too little access                40%   48%   50%   52%
Just about the right amount      38%   38%   31%   30%
Don't know/refused to answer      6%    2%    4%    4%

Those were the results from a year ago -- this year's survey has yet to be posted -- but I'd be very surprised if things had changed so as to put the American public further away from wanting to be kept informed by a free press. Except for the aftermath of the Monica Lewinsky scandal, a solid majority of Americans think that there is too little or about the right amount of press freedom. Public opinion about legal sanctions for publishing classified information no doubt depends on the details of the case, but I'd be surprised if a majority of Americans favored allowing a president to categorize the publication of arbitrary information as a crime.

The results of four additional survey questions were released separately by the American Journalism Review. These results underlined my point even more strongly:

The 2005 edition of the poll, commissioned by the First Amendment Center in collaboration with AJR, found that 69 percent of Americans agree with the statement: "Journalists should be allowed to keep a news source confidential."

This surprised the pundits, who may have been drinking their own kool-aid:

National Journal media columnist William Powers thinks it's "amazing that that many people are behind this principle," saying he would have guessed support would be less than 50 percent.

There was some other good news:

The survey offers one other encouraging finding for the media. Americans endorsed the press' watchdog role, with 74 percent agreeing with the statement: "It is important for our democracy that the news media act as a watchdog on government."

All this despite the public's rather low opinion of the media biz:

... an unnerving 65 percent of those polled agreed with the statement: "The falsifying or making up of stories in the American news media is a widespread problem."

And a mere 33 percent agreed that: "Overall, the news media tries to report the news without bias." That's down 6 percentage points from last year. Among the 64 percent of Americans who disagreed with that statement, 42 percent strongly disagreed.

On all these points, my own opinions line up pretty close to those of the folks who were surveyed. And as a group, we're better informed, more sensible, and more American than the PR spin suggests.

Note that I'm not blaming Greenwald's anonymous commenter. (S)he just swallowed the bait dangled by the press release and its media uptake, as all too many people do. But the real public-opinion situation is a lot better -- and that matters.

[Update -- Zeno points out that the ten commandments are an especially tricky example:

The sixth commandment says, "Thou shalt not commit adultery." The eighth commandment says "Thou shalt not bear false witness." If you think I'm off by one in each case, then it's because you're not using the traditional Catholic numbering of the commandments. (I think the Lutherans use the same numbering.) Catholics have two commandments (9 & 10) about coveting, whereas most Protestants have one omnibus anti-coveting commandment (10). To make up the difference, they split in two the commandment that Catholics consider the first: "I am the Lord thy God, thou shalt have no strange gods before me."

This means, of course, that a Protestant pollster could easily mark me down as someone who doesn't know the Ten Commandments. I'd be a confirming instance of the ignorance of the man in the street. And it would be false witness, too.

So it'll be especially easy to find "ignorant" clergy. Seriously, this is like the issues with counting the "five freedoms" discussed here. Also, it never occurred to me before to wonder: whose numbering gets used on those public decalogue displays that have been a matter of first-amendment contention in recent years? ]

[Update #2 -- Ran Ari-Gur wrote:

I completely agree, and I think a further point is that freedom of the press is really just a kind of freedom of speech, so that it's not unreasonable for someone not to name it if they've already named freedom of speech. (Technically, there may be a line where one ends and the other begins, but if so, we're no longer talking about the basic facts of the Bill of Rights that every American should know.)

On one hand, there's a conceptual thread that ties religion, speech, press, assembly and petition together in the first amendment; and on the other hand, the founders had reasons for enumerating them separately. But in any case, it's certainly true that the development of online media increasingly blurs the boundary between the speech of ordinary citizens and "the press". ]

Posted by Mark Liberman at 12:54 PM

Confusing web language with the web world

Says Adam Cohen in a New York Times article on corporate threats to the democratic ethic of the World Wide Web [print: Sunday May 28, 2006, Week In Review, p. 9]:

"The blogging phenomenon is possible because individuals can create Web sites with the World Wide Web prefix, www, that can be seen by anyone with Internet access."

The remark suggests at least two mistakes. One (relatively minor and perhaps debatable) is linguistic. The other is more technical, and concerns a false belief that is probably fairly widespread.

1. Morphologically, the www in such words as www.languagelog.com is more like a combining form (see chapter 19 of The Cambridge Grammar) than a prefix. While there is great freedom in creating URLs, the most usual practice follows the convention of having a name for a server or a department followed by a name for the site or the company followed by a suffix indicating either a type of domain (commercial, non-profit, educational) or a specific country, and only the last element has to be picked from a predetermined list and has a semantics that the owner does not control. Thus we find email.sjsu.edu for the email server at San Jose State University; ftp.debian.org for the ftp (file transfer protocol) server of the Debian Linux organization; ling.ucsc.edu for the departmental server of the Department of Linguistics at the University of California, Santa Cruz (where I work when I'm not at Language Log Plaza or away on sabbatical); and so on. The www portion of the millions of URLs that have it is really a server name: www.webster.com is the URL for the main web server (or server array) of the Merriam-Webster Corporation in the commercial arena, matching the typical pattern of server + site + domain. To the linguist's eye, the www is more like an initial combining form in a multi-component word (like the geo in geophysical or the psycho of psycholinguistic) than like a sense-modifying prefix (un- in unhelpful or pre- in prenatal).

2. You don't have to have www as the first component of the URL in order to have a blog or any other kind of web site. What you have to have is an Internet-connected machine running http server software. You can see this immediately from any of the livejournal blogs (like clunis.livejournal.com, to cite a random example), or from the fact that ling.ucsc.edu and lib.harvard.edu are web servers, etc. Cohen appears to believe that in order to have a blog, or perhaps in order to have a web site at all, you have to create a site with www as the first element of the URL. You don't. Doing so is purely a very widespread linguistic convention. One more case of confusing your language with your world.
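
The server + site + domain pattern from point 1, and the observation in point 2 that www is only one conventional choice of server name, can be illustrated with a minimal Python sketch. The function name split_host is hypothetical, and the naive split assumes that the last label is the domain and the one before it is the site; it is an illustration of the naming convention, not a general host-name parser.

```python
def split_host(host):
    """Split a host name into (server, site, domain) labels.

    Naive sketch: the last label is taken as the domain, the one before
    it as the site, and anything further left as the server name.
    """
    labels = host.lower().split(".")
    if len(labels) < 2:
        raise ValueError(f"expected at least a site and a domain: {host!r}")
    domain = labels[-1]
    site = labels[-2]
    server = ".".join(labels[:-2])  # empty string when there is no server label
    return server, site, domain


# Host names cited in the post; only one of them uses "www" as the server name.
hosts = ["www.webster.com", "ftp.debian.org", "ling.ucsc.edu",
         "clunis.livejournal.com"]

for host in hosts:
    server, site, domain = split_host(host)
    print(f"{host}: server={server!r} site={site!r} domain={domain!r}")
```

Nothing in the split treats www specially: it occupies the same slot as ftp, ling, or clunis.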

Posted by Geoffrey K. Pullum at 10:46 AM

Tensions between a singular and plural nouns

New York Times cultural critic Edward Rothstein has a provocative column about the Senate vote to declare English the "national language," contrasting the legislation with the European Charter for Regional and Minority Languages. Rothstein's pointed arguments about the divergent American and European approaches to multilingualism deserve serious consideration, but I had a hard time getting past the headline that the Times stuck on the column in the online edition, both in the main Arts section and on the page for the article itself.

It's another case of contracted "headlinese" leading to a curious, if not downright bizarre, grammatical construction.

We are of course expected to read

(1) tensions between [ [a national Ø] and [minority languages] ]

as elliptical for

(2) tensions between [ [a national language] and [minority languages] ]

rather than the ill-formed

(3) *tensions between [ [a national languages] and [minority languages] ].

Presumably the full version (2) was deemed too long for the space allotted for the headline (three short "decks" on the main Arts page and two longer ones on the article page). But the contracted version just doesn't work for me: I stumbled over the headline, expecting to find a singular head noun to agree with the singular article in "a national..." What we have here is yet another flavor of WTF coordination, this time with a singular-plural conflict between the two conjoined NPs.

It would have been so much simpler if the headline-writer had just omitted the singular article a, leaving

(4) tensions between [ [national Ø] and [minority languages] ]

which unproblematically expands to

(5) tensions between [ [national languages] and [minority languages] ].

So what's wrong with that? Well, it's a bit less precise given the context of the article, which considers policies balancing a country's national language (e.g., German) and its minority languages (e.g., Low German, Sater Frisian and Lower Sorbian). So the "tensions" are in a one-to-many relationship (between a single national language and multiple minority languages) rather than a many-to-many relationship. The problem is that there is no concise way to express that one-to-many correspondence without causing an eyebrow-raising mismatch between singular and plural conjuncts. In this case, I would prefer the semantic imprecision of (4) to the grammatical weirdness of (1).

The moral of the story: you can call English "national," but that doesn't make it rational.

[Update: The headline for the column in the print edition (as verified by the Nexis and ProQuest databases) is completely different: "Translated From Spanish (or Lower Sorbian or Breton), With High Emotion." That more creative style of headline-writing doesn't fly in online editions, as the New York Times itself explained in a recent article, "This Boring Headline Is Written for Google."]

Posted by Benjamin Zimmer at 01:45 AM

May 28, 2006

Menu conventions vs. syntax

Maybe, just maybe, there's another way to look at the menu of the EVOO restaurant that Geoffrey Pullum described (see here) as full of noun phrases with many attributive modifiers:

Garlicky Pork Sausage Stuffed Crisp Fried Maryland Soft Shell Crab

On the (possibly weak) assumption that fair-minded restaurant patrons ought to try to read menus from the perspective of the menu writers, I ask the question, "Why did EVOO's menu have a string of words of such complexity?" Were the owners lying in wait for a grammar expert to come in and parse their menu? Or were they merely victims of the long-held conventions of menu presentation?

Let's begin with data. Restaurant menus follow a standard, expected series of meaning slots. I've examined dozens of restaurant menus around the country and I've found that they consistently present their offerings according to this formula:

Slot 1. self-congratulations about the item    our famous, world's best
Slot 2. method of cooking the item             fried, roasted, baked, wood-fired
Slot 3. style of cooking the item              Italian, Cajun, Southern
Slot 4. the food item                          chicken, beef, salmon, pork
Slot 5. serving modification                   sandwich, roll-up

To make this look more scientific than it really is, note the following formula:

+/- slot 1      +/- slot 2      +/- slot 3      + slot 4      +/- slot 5

This says that all slots are optional except slot 4, the food item. Other slots can be used but they don't have to be, depending on the menu writer's discretion and creativity. Even though it's not obligatory for menu items to fill all 5 slots, their order is fixed. You probably won't see many menus offering: "Salmon roasted Cajun our best" or "sandwich French baked our famous."
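
For readers who like their formulas executable, here is a minimal Python sketch of the same constraint: every slot is optional except slot 4, and whatever slots are used must appear in the fixed order. The names SLOTS and follows_formula are hypothetical, and the input is assumed to be a list of slot labels that someone has already assigned to the words of a menu item by hand.

```python
# (slot name, obligatory?) in the fixed order given by the formula
SLOTS = [
    ("self-congratulation", False),
    ("method of cooking", False),
    ("style of cooking", False),
    ("food item", True),
    ("serving modification", False),
]


def follows_formula(filled):
    """Return True if `filled` (slot names in the order they appear on the
    menu) uses each slot at most once, keeps the fixed slot order, and
    includes every obligatory slot."""
    order = [name for name, _ in SLOTS]
    try:
        positions = [order.index(name) for name in filled]
    except ValueError:
        return False  # a label that is not one of the five slots
    in_order = all(a < b for a, b in zip(positions, positions[1:]))
    has_required = all(name in filled for name, required in SLOTS if required)
    return in_order and has_required


# "fried Cajun chicken sandwich": slots 2, 3, 4, 5 in order
print(follows_formula(["method of cooking", "style of cooking",
                       "food item", "serving modification"]))
# "Salmon roasted Cajun our best": slot 4 comes first, so the order is broken
print(follows_formula(["food item", "method of cooking",
                       "style of cooking", "self-congratulation"]))
```

The first call satisfies the formula and the second does not, which matches the intuition that nobody writes "Salmon roasted Cajun our best" on a menu.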

The really top-flight (expensive) restaurants don't just list the food item all by itself. That would not be classy and their customers would certainly not be impressed to see "crab" on the menu without the method in which it was cooked. Nor would they want to order a sandwich denuded of what actually was in it. Self-congratulation is to be avoided in all high-class menus. If you have to say how good the item is, it probably means that it isn't all that great anyway.

Although it has considerable oddness in slot 2, method of cooking, the menu item that Geoff analyzed follows the standard menu slot sequence:

Slot 1. self-congratulations -- not used
Slot 2. method of cooking -- Garlicky Pork Sausage Stuffed Crisp Fried
Slot 3. style of cooking -- not used
Slot 4. the food item -- Maryland Soft Shell Crab
Slot 5. serving modification -- not used

By now you're wondering, "How can 'Garlicky Pork Sausage Stuffed Crisp Fried' possibly be a method of cooking?" This is where the menu writers got into trouble. One could argue that the method of cooking is really "crisp-fried," but that by itself apparently didn't sound classy enough to them. If the method of cooking had stopped with "crisp-fried," and if the writers had insisted on putting "garlicky pork sausage stuffed" somewhere on the menu, it might have made sense to shift it to some other slot. But it doesn't really fit slot 3, style of cooking. Nor would the restaurant want its customers to think that lowly and pedestrian "pork sausage" is part of the classy slot 4 food item, since the owners no doubt wanted to highlight Maryland Soft Shell Crab more than anything else.

So the menu inserts "garlicky pork sausage stuffed crisp fried" into the method of cooking slot, leading to the confusing syntax that Geoff described so well. What appears to be wrong with this slot is that it's missing a preposition, a conjunction, and some punctuation. It seems to mean this:

Crisp-fried (and stuffed with pork sausage)         Maryland Soft Shell Crab
                  Slot 2 (method of cooking)                                    Slot 4 (food item)

From the restaurant's perspective the problem with this is that it becomes an overly long introduction to the most important part of the menu, still to come -- Maryland Soft Shell Crab. The menu writers did their best to follow standard menu conventions but they fell considerably short of making syntactic sense. Had they been courageous enough, it might have been prudent for them to fly in the face of menu slot conventions, reversing the slot order, and say simply:

Maryland Soft Shell Crab, crisp fried and stuffed with garlicky pork sausage
         Slot 4 (food item)                  Slot 2 (method of cooking)

Maybe we should pity the poor menu writers who have to choose between following the conventions of their field and writing with English syntax.

[Update] Gabriel McCall writes that when menu items are offered verbally by servers, they generally follow the reverse pattern. Interesting.

Posted by Roger Shuy at 08:08 PM

And the plural of MacBook Pro is ...

Greetings from the youth and popular culture desk here at Language Log Plaza. We also happen to be the ones who take questions regarding language use in the computing industry; nobody else here much cares about it and so our new telephone system just redirects those calls to our desk (following a fittingly recursive route, natch).

Since I have thrice outed myself as an Apple product fanatic, Language Log reader Jake Seliger recently contacted me directly to ask about how the new line of high-performance Apple notebook computers should be pluralized.

Most Highly Esteemed Professor Bakovic:*

You noted on Language Log that you're fascinated with all things Apple, so I thought you the person to go with a question combining language and Macs.

On Apple websites I've chiefly seen the plural of "MacBook Pro" as "MacBook Pros." Yet I believe MacBooks Pro would be correct as a plural since Pro is just a modifier -- similar to "attorneys general." Is this correct?

Yet when one is referring to the possessive -- concerning the hard drive, for example -- saying, "The MacBook's Pro hard drive died" would, I'm fairly certain, be wrong. So one would say, "The MacBook Pro's hard drive died" instead.

As a result, calling the plural "MacBooks Pro" and calling the possessive "MacBook Pro's" would seem likely to generate confusion.

Is any of this right?

-Jake Seliger

What follows is my reply to Jake's question, suitably edited for Language Log viewing by those of you who may have the same question -- or one very similar to it. (A note to the less fortunate among you who will have to settle for one of the less expensive consumer-level Apple notebooks: they're just called MacBooks, so no linguistic issue for you folks there.)


I think insisting that it should be attorneys general rather than attorney generals (or mothers-in-law rather than mother-in-laws, and other such examples) is a little silly in the first place, and I wouldn't insist on MacBooks Pro over MacBook Pros. There's good linguistic reason for the indecision here: names, titles, and other such labels tend to be analyzed grammatically as compound nouns in English, rather than as phrases. The difference between these two things is in most cases very subtle -- which is part of the problem leading to the issue at hand -- but the following example should help to show that there is one.

Think of the compound noun blackboard, meaning the thing that you write on with chalk. Many blackboards are indeed black in color, but not necessarily; many are green, for instance, but we still have no problem calling them blackboards. On the other hand, take the phrase black board. (The orthographic space between the words only highlights the distinction; in pronunciation, the difference is roughly one of relative stress: BLACKboard vs. black BOARD. Compounds are often, but not always, written without a space.**) A black board cannot be green, and it's not necessarily something you write on with chalk -- it's just a board that happens to be black.

Especially in writing, phrases and compounds often appear to be very similar, as this example illustrates; the differences are mainly things we don't represent orthographically (such as relative stress) and the consequences for meaning: phrases tend to add up to the meanings of their parts, whereas compounds can have specialized meanings of their own that bear less of a relation to their parts. This is what linguists refer to as compositionality: phrase meaning tends to be compositional (transparently composed of the meaning of its parts), whereas compound meaning can be noncompositional.

Another key difference between phrases and compounds is that the parts of a compound can be ordered in ways in which phrases cannot. For example, within a phrase, a modifier overwhelmingly tends to precede the noun that it modifies. (There are particular learned exceptions such as in it came upon a midnight clear, but these are clearly felt by English speakers to be exceptional.) No doubt related, at least in part, to their noncompositionality, compounds can (sometimes) have the order noun + modifier, as in the attorney general and MacBook Pro examples. (Another example like this is bootblack, someone who polishes shoes and boots.)

As you can see, the grammatical rules that are typical of phrases don't necessarily apply in the same way to compounds. The same thing goes for the rule that you're interested in: that the noun part rather than the modifier should receive the plural -s marker. But note that the entire compound in these cases is itself a noun -- they refer to persons (attorney general) or things (MacBook Pro) -- and so the plural -s marker can be thought of as indicating that the entire compound is plural. So, the problem in these particular cases just boils down to the fact that the last part of the compound happens to be the modifier rather than the noun, and so it looks like the modifier is what's being pluralized.

The following diagrams might help. Think of the plural rule as something that says: "give me a singular noun (N), and I'll give you another N with -s attached to it that means the same thing as the original N, except plural rather than singular". Since the compounds you've asked about are Ns that consist of an N and a modifier (M), the plural rule can operate on either of the two Ns, and so you get two possible structures in each case:

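                    -s attached to lower N             -s attached to higher N
MacBook Pro         [N [N MacBook]-s [M Pro] ]         [N [N MacBook] [M Pro] ]-s
attorney general    [N [N attorney]-s [M general] ]    [N [N attorney] [M general] ]-s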

So, both possibilities are technically correct. (For some English speakers there may be a tendency not to put things like plural markers in between the parts of a compound -- see for example this paper, p. 19ff -- but that's a separate issue.) I think the reason folks tend to insist on things like attorneys general is because they stop to think about the internal structure of the compound as if it were a phrase.
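
The two ways of applying the plural rule can also be spelled out as a toy Python sketch. The function names are hypothetical, the compound is assumed to be a simple noun + modifier pair, and the plural rule is reduced to attaching -s, which is all these examples need.

```python
def pluralize(noun):
    """Toy plural rule: take a singular N and return an N with -s attached."""
    return noun + "s"


def plural_on_lower_n(noun, modifier):
    """Apply the rule to the inner N: [N [N noun]-s [M modifier] ]."""
    return f"{pluralize(noun)} {modifier}"


def plural_on_higher_n(noun, modifier):
    """Apply the rule to the whole compound: [N [N noun] [M modifier] ]-s."""
    return pluralize(f"{noun} {modifier}")


for noun, modifier in [("MacBook", "Pro"), ("attorney", "general")]:
    print(plural_on_lower_n(noun, modifier), "/",
          plural_on_higher_n(noun, modifier))
# prints:
# MacBooks Pro / MacBook Pros
# attorneys general / attorney generals
```

The same rule is used in both functions; the only difference is which N it is handed.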

The possessive marker -'s follows a different rule than the plural, so there's no reason for that to influence one's judgment of MacBooks Pro vs. MacBook Pros. While the plural marker attaches to a single noun, the possessive marker attaches to an entire noun phrase (NP). This is shown by examples like the following (which people sometimes find awkward and often try to rephrase, especially in writing, since there are many other ways to mark possession in English):

[NP This guy I know ]'s sister is a fashion designer.

You definitely wouldn't say this guy's I know sister; possession is marked on the entire noun phrase this guy I know, which happens to end with a verb instead of a noun.

In the MacBook Pro's case, the NP consists of just the compound. So, the relevant structure in such a case would be something like the following.

And that's why MacBook's Pro wouldn't work at all.


Jake wrote back to thank me for this reply, and to give me permission to quote all this -- plus he sent this very relevant link. He also writes:

One other thing to note is the abbreviation issue surrounding MacBooks Pro: online, especially at Ars Technica's Macintoshian Achaia, people tend to say "MB" for "MacBook" and "MBP" for "MacBook Pro." As a result, the plural of MacBook Pro becomes MBPs (I usually leave out the apostrophe for plurals). In this case, MBsP obviously wouldn't make any sense because MBP is the entire noun and "Pro" no longer really modifies it.

Final note: it occurs to me that we could just ask the good folks at Apple to consider more carefully what they think the plural of MacBook Pro should be. When confronted with the fact that Walkmen sounds odd to (most) English speakers, and that Walkmans fails on the supposed analogy with the irregular pair man ~ men, Sony apparently decreed that the plural of their insanely popular product should be Walkman® personal stereos. Whaddayasay, Apple -- MacBook Pro notebook computers???

* Actually, Jake's message started with just "Hi," -- one of the reasons I've been told that our youth and popular culture desk exists in the first place. I've felt free to translate the salutation on Jake's behalf.

** This is my argument for why the name of the film The English Patient should be pronounced The ENGLISH Patient, not The English PATIENT -- it's a title, hence a compound. Some folks vehemently disagree; for them, I've got a can of Jolt cola right here.

Posted by Eric Bakovic at 06:26 PM

From the deepest void of neverness

I have an uneasy feeling that just because I offered some modest syntactic reflections on the syntactic complexity of an EVOO menu item ("garlicky pork sausage stuffed crisp fried Maryland soft shell crab") I am now going to be inundated by messages from people who think they have found a noun phrase with an even longer succession of attributive modifiers. These would be people who have, sadly, mistaken me for someone who might give a damn.

Already one of our youth and popular culture correspondents here at Language Log, Eric Bakovic, has supplied me with some examples that he found quoted from a user named Babaquara in an article about online music services (which Bakovic was apparently reading on Language Log company time). He quotes the phrase "dada kraut psych mindblowing conscience expanding sublime acid oriented arcana coelestia weirdness", for example. I will explain my reaction to Eric using the unique trisyllabic word that appears to be widely understood by his generation: whatever.

Certainly, it is possible that the phrase dada kraut psych mindblowing conscience expanding sublime acid oriented arcana coelestia weirdness has roughly nine stacked attributive modifiers; but one cannot really tell, because it all depends on how it is parsed: doubtless "consciousness-expanding" (I add the helpful hyphen) is intended as a syntactic unit, but one doesn't know about "kraut psych" and so on. This is basically the problem one finds with quotes from chimpanzee language: chimps are occasionally reported as having signed things with transcriptions like BANANA BANANA HELP REFRIGERATOR GIMME OPEN BANANA GIMME, and syntactically one does not really know where or whether to begin.

Part of the problem here is that Eric is one of the younger staffers here at Language Log Plaza. They work with headsets on, they have X-men posters on their walls, they talk about whether Lara Croft's breasts in the new Crystal Dynamics video game release are as big as before. The average age in their part of the building is approximately 19. They typically list their hobbies as (i) being wicked cool, (ii) dancing to their iPods in public places, (iii) shopping at American Eagle, and (iv) staying out all night.

One does not see them at EVOO; they dine at places where the menu is a series of brightly colored pictures on glass with lights behind them. Often there is a neon sign in the window saying "BURRITOS AS BIG AS YOUR HEAD".

And their reading material does not fully meet the criteria for being called "language". Another phrase quoted from Babaquara (see it here if you have your parents' permission) puts it well: "ultrahypermegamonstaheavy over the top mammoth freakin mind exploding destroyer psychedelia from the deepest void of neverness." The fact is that the younger Language Log staffers seem well acquainted with the deepest void of neverness. Eric Bakovic has definitely been there. I have seen him pour a can of Jolt cola over David Beaver's head during a disagreement about whether something or other was ultrahypermegamonstaheavy or not.

I simply do not understand half the things they say to each other (if my using the verb "say" is not begging the question there). In normal running text, 37% of the words are nouns; in the cubicles of the Youth and Popular Culture department at Language Log Plaza, 37% of the words are dude.

So I am not necessarily prepared to consider random examples containing huge numbers of attributive modifiers to be within the normal range of non-chimpanzee syntax, if they come from things Eric would read and understand, OK? Is that understood? Let's try to keep some reasonable standards in place here. Call me a fusty old conservative if you like, but I think English is quite lax enough on stacked prenominal modifiers without our seeking data from any mammoth-freakin' mind-exploding dialects in which the word like is used as a punctuation mark. Just don't send me any.

Posted by Geoffrey K. Pullum at 04:20 PM

Da Vinci Q & A

I've finally done my civic duty. I read The Da Vinci Code, and saw the movie. Reading the book was an anti-climax: I have nothing to add to Geoff Pullum's deconstructions (look at the bottom of this post for a list). The cinematic signs and portents were ambiguous: on one hand, the theater was nearly deserted; on the other hand, a sophisticated fourth grader of my acquaintance thought the movie was better than X-Men, though not as good as The Terminal. But I agree with Geoff Pullum that traditional media are generally "Behind the Da Vinci Curve", and as further evidence of the superiority of the new-media coverage, I'd like to draw your attention to a recent post on The Medicine Box ("The Internet Theologian Explains The Da Vinci Code" 5/17/2006).

It begins:

As the responses to my helpful guide on Christianity show, when theological controversies arise, many people wisely turn to an anonymous crank with a web log. Or, as I prefer, to a Big-Time Internet Theologian.

These are good days for us Big-Time Internet Theologians: religious controversies are in the news daily, and many people have probing, searching questions that cannot be answered by relying on traditional, "second wave" sources like books, professors, or subway graffiti. People want answers, and they want them to come with hyperlinks to Wikipedia entries compiled by embittered teenagers.

The first few (questions and) answers:

Q: Who is Dan Brown and what is "The Da Vinci Code"?
A: Dan Brown is the biggest-selling, and therefore best, author of our times, and "The Da Vinci Code" is his masterpiece: a thrilling, shocking journey across thousands of years of history all packed within a pulse-pounding chase across scenic Europe, leading up to the greatest conspiracy of all.

Q: What is the greatest conspiracy of all?
A: The 1954 NIT point-shaving scandal.

Q: What does all this have to do with Jesus? Or, for that matter, Leonardo Da Vinci?
A: The premise of the book is that Jesus was married to Mary Magdalene, and that the two had children, who passed along Jesus' bloodline through generations of French people. Leonardo was the member of a secret brotherhood of painters who protected this secret by painting pictures of men that look like ladies.

This is Language Log, after all, so there is an obligatory linguistic hook:

Q: Why does the dialogue in the book which is supposed to be in French include French words alongside the English translation, like, "Pain is good, monsieur" and "Le capitaine is happy you decided to stay overnight"?
A: That is how the French speak. There is no French language per se, just a few words they throw into English sentences to make themselves seem superior to Americans.

You should read the rest of it for yourself, but I can't help quoting a few more:

Q: The book goes into detail about a group called the Knights Templar. Can you explain what they were?
A: They were the basketball team of Temple University in the 1950s. Philip the French, who was King of Congress at the time, suppressed them because of the NIT point-shaving scandal. In addition to playing basketball, they also guarded the secret of Jesus' French kids by painting pictures of men who look like ladies.

Q: Okay, explain this whole "painting pictures of men who look like ladies" thing. What does it have to do with Leonardo?
A: In 1099, a reggae group called the Priority of Zion was founded to hush up the truth about Jesus' French children. It was felt at the time that if word got out that Jesus had lived in France, it would drive up real estate costs beyond what the knights were willing to pay. So the Priority of Lion was formed to keep the secret. Throughout the centuries, every time someone became prominent in Europe - Botticelli, Sir Isaac Newton, Tintin - they would be enrolled into the Prior of Zionism to help keep the secret.

Q: Doesn't it seem more sensible, if they wanted to keep a secret, not to enroll high profile Europeans?
A: Yes, except that it was hard for many years to avoid famous Europeans. From 1755 to 1914, everyone in Europe was either an author, inventor, or executed king.

Q: So how do the paintings factor in?
A: Leonardo Da Vinci was a member of the Priorities of the Elders of Simon. However, he was terrible at keeping secrets, and felt it necessary to leave little clues in all his paintings about Jesus' Francophone offspring. For example: the Mona Lisa is smiling because Leonardo was feeling smug about knowing where Jesus lived, all the while Raphael was thinking Jesus lived in Jerusalem.

This captures the book's zany dream-logic better than any other reviews that I've seen.

[Note: the identification of "holyoffice" as Terry Mattingly, though based on what I once thought was plausible evidence, is clearly false. Apologies to both Prof. Mattingly and to "holyoffice" for the error, which I've left in place as evidence of my own carelessness.] At this point, though, I need to 'fess up that holyoffice, the author of The Medicine Box blog on Livejournal, is apparently* Terry Mattingly, who also posts on the blog GetReligion ("The press ... just doesn't get religion"). In Real Life, he's director of the Washington Journalism Center at the Council for Christian Colleges and Universities, and author of a weekly column for Scripps Howard. In other words, an old-media infiltrator.

When you're done with as many of those links as you care to follow, you might want to try the glossary of Christian terminology at the end of the post "The Interpretive Dance Theocrats" (Terry Mattingly as holyoffice on The Medicine Box, 5/12/2006), which begins:

Premillenialism
This is the belief among some Christians that, ever since Jan. 1, 2000, it has no longer been possible, in the words of the Prince song, "to party like it's 1999." Postmillenialists are those Christians who believe that it will always be possible to do so, while Amillenialists believe that in this context, "1999" cannot be understood literally, but must be read as an allegorical term roughly meaning "a time at which it is especially appropriate to party."

Rapture
This was a #1 hit in 1980 for Blondie (#5 in the UK), from the otherwise underwhelming "Autoamerican" album. Many Christians now concede that the then-pioneering use of rap in the song sounds a little lame in retrospect. In their best-selling series of books about the song, "Left Behind (Parallel Lines)," Jerry Jenkins and Tim LaHaye defend the rap verse's hip references to Grandmaster Flash and Fab Five Freddy, and maintain that when Jesus returns, all believers will be united in accepting that Blondie's cover of "The Tide Is High" is better than the original.


* I inferred that Terry Mattingly is holyoffice, or perhaps vice versa, from the LiveJournal profile page for holyoffice, which gives The Press doesn't get religion in the "website" slot. Among the folks who post there, Terry Mattingly seemed like the best fit to "holyoffice". If I got that wrong (and two readers have written with scholarly objections to the analysis), I apologize to all concerned. The DVC Q&A is still the only stuff on Dan Brown I've seen that's as funny (and true) as Geoff Pullum's posts. [ Well, I *did* get it wrong, and apologies are certainly in order to both writers.]

Posted by Mark Liberman at 12:59 PM

Monkey words

To balance our occasional complaints about foolish and misleading science reporting, I'd like to commend an article by Nicholas Wade in the NYT ("Nigerian Monkeys Drop Hints on Language Origin", 5/23/2006), on recent research by Kate Arnold and Klaus Zuberbühler ("Language evolution: Semantic combinations in primate calls", Nature 441, 303, 18 May 2006).

Wade's first two paragraphs describe the new research, with an appropriately nuanced claim about its importance:

Researchers taping calls of the putty-nosed monkey in the forests of Nigeria may have come a small step closer to understanding the origins of human language.

The researchers have heard the monkeys string two alarm calls into a combined sound with a different meaning, as if forming a word, Kate Arnold and Klaus Zuberbühler report in the current issue of Nature.

In the third paragraph, Wade sets the stage in an informative and sensible way:

Monkeys are known to have specific alarm calls for different predators. Vervet monkeys have one call for eagles, another for snakes and a third for leopards. But this seems a far cry from language because the vervets do not combine the calls into anything resembling words or sentences.

(This is a reference to the work of Dorothy Cheney and Robert Seyfarth, explained at length in their wonderful book How Monkeys See the World.) Wade then describes the new facts as Arnold and Zuberbühler reported them:

The putty-nosed monkeys have a "pyow" call meaning there are leopards about and a hacklike sound to warn of the crowned eagle. The "pyow" calls attention to a leopard on the ground.

When hearing the "hack" sound, a monkey tends to freeze because movement would betray its position to an eagle.

Dr. Arnold and Dr. Zuberbühler, zoologists at the University of St. Andrews in Scotland, noticed that adult male monkeys in each troupe were combining the "pyow" and "hack" calls.

Playing back a "pyow-hack" call to see how the monkeys interpreted it, the zoologists found it made the troop leave the area.

This lays out what is new in the research: the animals' response to playback of the combined calls. Monitoring response to call playbacks is a technique pioneered by Cheney and Seyfarth with vervet alarm calls, so this is a new application of an old technique. And what does it mean that putty-nosed monkeys use (and respond to) combinations of two different alarm calls? Wade explains:

Researchers studying monkeys and apes have learned that they possess all the basic apparatus needed to make and analyze sounds. But the nonhuman primates did not seem to possess either of the two combinatorial features of language, those of combining discrete sounds into compound words, and of stringing words together under rules of syntax.

Dr. Zuberbühler said that he and Dr. Arnold had not observed anything resembling syntax, but the putty-nosed monkeys, Cercopithecus nictitans, "combined two types of utterances according to a rule and the combination takes on a novel meaning," a procedure perhaps analogous to forming a word from two sounds.

Notice first that Wade correctly reports Zuberbühler's evaluation that this has no apparent connection to the development of syntax, but may have something to do with phonology (which was also David Beaver's more vividly expressed suggestion). Wade (and Zuberbühler) are also appropriately tentative about the connection to phonology.

Logically, there are a number of possible sources for the phonological "duality of patterning" which is robustly observed in all human languages, but exists (at best) in embryonic or allusive forms in non-human animals. This system for making meaningful messages out of intrinsically meaningless parts -- a digital form of message coding -- might arise from assigning meanings to vocal displays that were originally purely formal (like the songs of birds or whales); or it might arise from combining meaningful bits of behavior (like alarm calls) into communicative structures with new (and at least partly) arbitrary associations. I believe that Klaus Zuberbühler's idea is that the putty-nosed monkeys may be taking a first small step along the second path.

Wade ends with a skeptical quote from Marc Hauser:

Marc Hauser, an expert on animal communication at Harvard, said that the observation was very interesting but that stricter criteria should be applied before assuming the combination of alarm calls was similar to the way people combined sounds into words.

"Because there is no evidence that the calls are words or even wordlike, the connection to language is tenuous." he said.

This might be a bit too skeptical, at least as a way to end the piece; but it's an appropriate corrective for the range of spectacular over-interpretations elsewhere in the media. These include headlines like "Monkeys use 'Sentences', Study Suggests" [National Geographic], "African monkey can 'talk in sentences'" [The Independent], "Monkeys Found Using Primitive Linguistic Grammar" [Fox News]; and leads like "Monkeys are able to string together a simple 'sentence', according to research that offers the first evidence that animals may be capable of a key feature of language" [Mark Henderson in News.com.au], or "Researchers said Nigeria's putty-nosed monkey sometimes communicates with a combination of sounds different from others, offering, they say, the first proof that animals may be able to talk" [UPI].

Another good feature of Wade's story: it gives a link (in the margin under the heading "Related") to the original paper in Nature.

Recently, the major journals Science and Nature have been vying with one another in giving prominent display to papers about animal communication research. This is a fascinating topic, in my opinion, and I think that the published research is individually and collectively valuable (even if I sometimes disagree with the interpretation). Animal communication stories evoke deep resonances in our culture, and so these stories generally also make it into news media of all sorts, mostly in bizarrely misconstrued forms.

[ One of the consequences of recent attention from Science and Nature is that the animal communication research featured in the media is not always the most interesting stuff. For example, it's (logically though not journalistically) odd to play up the recent Nature article on putty-nosed monkeys, while ignoring Zuberbühler's (in my opinion even more interesting) 2002 paper on cross-species communication between Diana monkeys and Campbell's monkeys in Côte d'Ivoire ("A syntactic rule in forest monkey communication", Animal Behaviour, Vol. 63 no. 2, Feb. 2002, pp. 293-299).

Here's the abstract from that paper, for those without subscription access:

Syntactic rules allow a speaker to combine signals with existing meanings to create an infinite number of new meanings. Even though combinatory rules have also been found in some animal communication systems, they have never been clearly linked to concurrent changes in meaning. The present field experiment indicates that wild Diana monkeys, Cercopithecus diana, may comprehend the semantic changes caused by a combinatory rule present in the natural communication of another primate, the Campbell's monkey, C. campbelli. Campbell's males give acoustically distinct alarm calls to leopards, Panthera pardus, and crowned-hawk eagles, Stephanoaetus coronatus, and Diana monkeys respond to these calls with their own corresponding alarm calls. However, in less dangerous situations, Campbell's males emit a pair of low, resounding 'boom' calls before their alarm calls. Playbacks of boom-introduced Campbell's alarm calls no longer elicited alarm calls in Diana monkeys, indicating that the booms have affected the semantic specificity of the subsequent alarm calls. When the booms preceded the alarm calls of Diana monkeys, however, they were no longer effective as semantic modifiers, indicating that they are meaningful only in conjunction with Campbell's alarm calls. I discuss the implications of these findings for the evolution of syntactic abilities.

Anyway, given the need to follow Nature's lead in assigning relative importance, I think that Wade's 5/23/2006 NYT story is a model of how to approach such topics in a responsible way. ]

If someone were to treat this research in a longer feature, there are some other kinds of background that would be worth bringing out. As I discussed at greater length in an earlier Language Log post ("Cotton-top tamarins on the road to phonology as well as syntax", 2/9/2004), a number of other animal communication systems are said to

exhibit what Charles Hockett called "duality of patterning": larger patterns made up of well-defined combinations of recurrent, well-defined smaller units.

In that post, I linked to a page (now alas at a different URL, so I've updated the link) offering spectrograms and audio clips of the vocalizations of cotton-top tamarins, including some vocalization types that seem to be combinations of sound classes used independently in other circumstances.

These include repeated sequences called "pulsed vocalizations". In most cases, the "pulsed" form seems just to be an intensified form of the simple call, like the multiple dog barks or whines that David Beaver cites. Thus the "Type F Chirp" is said to be used "During intergroup antiphonal calling of Normal Long Calls. To audible outgroup vocalizations." The corresponding pulsed form, the "Type F Chirp Trill", is "Same as for Type F Chirp. Tilling [sic] indicative of a higer state of arousal than Type F Chirp alone."

But sometimes, there is a suggestion of a different process. Thus the "type D chirp", which is glossed as a "post-food" call used "when an animal actually possesses food or object", has a pulsed form called "hooked chatter", which is used "as infants approach". Perhaps this is because the adults are saying "hey kid, I got something for you here!" And perhaps the "hooked chatter" then acquires a new set of associated meanings, associated with welcoming infants rather than with signaling the possession of food, by the sort of process that Charles Darwin described (in "The Expression of the Emotions in Man and Animals") for the development of meaningful displays (such as snarls) from fragments of gradually-decontextualized behavior.

An apparent typo in another entry raises expectations of greater interest. The "Type E Chirp" is a "general alarm" associated "[t]o sudden visual and acoustic stimuli. To sudden leaping movement by group members if animal startled." In contrast, the "Type A Chirp" is said to be used "During mobbing behavior, to sudden animated stimuli. By some groups to preferred foods. Rarely given to acoustical stimuli."

The "Type E chirp Chatter" is a pulsed form whose usage the cited page describes as "Same as for Type A Chirp only when animal is more highly aroused." That would be neat -- replication shifts the meaning from the expected intensified form of Type E to an intensified form of Type A? That's the kind of unexpected little irregularity that happens in human morphology all the time. Alas, I'm afraid it's just a typo.

There are also some "combination calls", such as the "Type F Chirp + Whistle", said to be used "By individuals less confident than when giving Normal Long Calls. As response to Combination or Normal Long Calls or non-group Type F Chirps. In isolation." The associated sound files suggest that this is the same as a "Normal Long Call" with a "Type F Chirp" substituted for the first of two (or more) rising "whistles". This is interesting and suggestive.

[Full disclosure: Klaus Zuberbühler got his PhD in 1998 from the University of Pennsylvania, where I teach, and I had a very high opinion of him while he was a student, which has been maintained by his subsequent work.]

Posted by Mark Liberman at 08:54 AM

May 27, 2006

Behind the Da Vinci curve

Anthony Lane's movie review of The Da Vinci Code ("Heaven can wait", The New Yorker, May 29, 2006) is wonderful: screamingly funny (IMHO; not everyone agrees). I recommend it if you want further excuses to giggle at (and perhaps not go and see) the ponderously silly movie that has been made out of the much-loved blockbuster of blasphemy. But — if I may raise this sensitive topic — what is it with these print journalists who keep picking up months or years later on linguistic points already well known to the world through Language Log? Look at Lane's paragraph on "renowned" (right hand column below; quotes from TDVC are highlighted in red) and compare it with what you read here (and here) well over two years before, about the same quote (left column; paragraph breaks removed):

Language Log, May 1, 2004:

I am still trying to come up with a fully convincing account of just what it was about his very first sentence, indeed the very first word, that told me instantly that I was in for a very bad time stylistically. The Da Vinci Code may well be the only novel ever written that begins with the word renowned. [...] "Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery" [...] I think what enabled the first word to tip me off that I was about to spend a number of hours in the company of one of the worst prose stylists in the history of literature was this. Putting curriculum vitae details into complex modifiers on proper names or definite descriptions is what you do in journalistic stories about deaths; you just don't do it in describing an event in a narrative.

The New Yorker, May 29, 2006:

There has been much debate over Dan Brown's novel ever since it was published, in 2003, but no question has been more contentious than this: if a person begins reading the book at ten o'clock in the morning, at what time will he or she come to the realization that it is unmitigated junk? The answer, in my case, was 10:00.03, shortly after I read the opening sentence: "Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery." With that one word, "renowned," Brown proves that he hails from the school of elbow-joggers — nervy, worrisome authors who can't stop shoving us along with jabs of information and opinion that we don't yet require.

Notice, I'm definitely not alleging plagiarism here: despite the tongue-in-cheek remark about the question of how many words you have to read before you form your judgment having been much discussed, I'm quite sure Lane thinks he's bringing up a new point, and he certainly does it in an original way: his point about elbow-jogging, information-jabbing writers seems new and fresh. He's no thief. He copies no sentences (and this cannot be said of everyone, can it?). What's more, he actually does new research in Brownian literary stylistics: he looks through the rest of the book and finds another example of an exactly parallel sort: he says that you could perhaps "dismiss that first stumble as a blip", but later on in the book you will find this:

Prominent New York editor Jonas Faulkman tugged nervously at his goatee.

This is a wonderful new example of clunky use of an anarthrous occupational NP preposed to a proper name, one that somehow I had not spotted in 2004. So Lane doesn't just use other people's examples; he is an active data-gatherer.

And yet... One does get a vague sense that he doesn't know he's into a two-year-old project here. He sounds a tiny bit like an intelligent literary stylistician who has just been awakened from a two-year coma and thus attracts a certain amount of eye-rolling at conferences as he brings up points that he thinks are new but they're not.

Listen up, Anthony: it's 2006, and everyone reads Language Log now. The web programmers at your own magazine read it: when I commented on an utterly insane prescriptivism-induced message from the magazine's web site search engine in 2005, they reprogrammed to get rid of it within a week or so (we get "Sorry, there are no results matching that search" now, instead of the hilarious message that I had mocked, "I'm sorry I couldn't find that for which you were looking"). Your boss reads Language Log, your Aunt Meg reads Language Log. The other day I saw a dog reading Language Log (on the Internet nobody knows you're one; though come to think of it, he may have thought it said Language Dog — I don't know what his motive was in reading our stuff, I only know he was sitting up on a stool in an Internet café paging through something by Mark Liberman that had graphs in it).

Has word not yet reached the New Yorker office film desk, whose very métier crucially involves being fully and deeply in touch with current cultural trends? Are they really working without having Language Log bookmarked? Is that why Lane seems to imagine that he is raising a brand new linguistic theme here, rather than hauling out a well-roasted old chestnut from the cheery fire of Language Log?

Posted by Geoffrey K. Pullum at 02:46 PM

Linguistics goes out to dinner

Barbara and I dined last night with her brothers and their combined families at EVOO, on Beacon Street in Somerville, near where (until the end of June) we live. (The restaurant name, by the way, is an acronym, from the initials of Extra Virgin Olive Oil.) Excellent as always. And, linguistics being everywhere, I had the pleasure of noting that one item on the list of specials for May 26 was described using one of the most complex and varied naturally-occurring nominal premodifier constructions I have seen in quite a while:

Garlicky Pork Sausage Stuffed Crisp Fried Maryland Soft Shell Crab

(I omit here the rest of the phrase, which dealt with the accompaniments of Potato Gnocchi, Pesto, Orange Segments, Shaved Fennel and Roaster Pepper Aioli; you may be interested in them, but right now I am interested only in the above nominal constituent.)

The ten words of this syntactically composed phrase yield a fantastically large number of possible parses (the number can be computed using the formula for Catalan numbers, but I am not going to compute it, because if I simply fail to do so, someone with better computational skills will email me with the number later, and will then get their name mentioned on Language Log). If I parse the whole thing correctly, and I think I do, then

  • the noun pork is used as an attributive modifier of sausage;
  • the adjective garlicky is used as an attributive modifier of the nominal pork sausage;
  • the nominal garlicky pork sausage is incorporated (with instrumental meaning) into the complex adjective headed by stuffed;
  • garlicky pork sausage stuffed is used as an attributive modifier of the nominal constituent formed by all the subsequent words;
  • crisp modifies fried to form a complex adjective;
  • the complex adjective crisp fried is used as an attributive modifier of the nominal constituent formed by all the subsequent words;
  • the proper noun Maryland is used as an attributive modifier of the nominal constituent formed by all the subsequent words;
  • the adjective soft is used as an attributive modifier of the noun shell;
  • the nominal soft shell is used as an attributive modifier in a (lexicalized) nominal constituent whose head is the noun crab.

Thus there are four stacked attributive modifiers of crab. It's not at all unusual to have four modifiers of one noun, of course (my phrase "most complex and varied naturally-occurring nominal premodifier constructions" above has three, counting "nominal premodifier" as one, and I could doubtless have tossed in another one without anyone raising an eyebrow). But the variety and internal complexity of these four caught my syntactician's eye; proceeding from the innermost to the outermost, they are soft shell, Maryland, crisp fried, and garlicky pork sausage stuffed. Three of them are syntactically complex, and each of those has a quite different structure from all of the others. The bracketing of the whole thing is like this:

[ [ [ [ Garlicky ] [ [ Pork ] [ Sausage ] ] ] [ Stuffed ] ] [ [ [ Crisp ] [ Fried ] ] [ [ Maryland ] [ [ [ Soft ] [ Shell ] ] [ Crab ] ] ] ] ]

I took all that in at a glance, pocketed a copy of the specials menu so I could explain all this later to you, and turned to a very enjoyable dinner. Linguistics is everywhere. It will even go out to a family dinner with you, and enliven the experience. Linguistics is your friend. Linguistics is like family.

Steve Jones points out that a few hyphens would reduce the ambiguity a lot, and that's true; but the menu was printed with none. Putting hyphens in the right places for optimal parsing is your homework exercise for today.
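
For readers who want to check the Catalan-number count mentioned above, here is a minimal sketch. It assumes every constituent branches exactly two ways (an assumption of this sketch, not something the menu guarantees), in which case the number of distinct bracketings of n words is the Catalan number C(n-1). The function names are purely illustrative.

```python
from math import comb

def catalan(n):
    """Return the n-th Catalan number, C(n) = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def binary_bracketings(num_words):
    """Count the strictly binary bracketings of a string of num_words words.

    Assumption: every non-leaf constituent has exactly two daughters.
    """
    return catalan(num_words - 1)

# Ten words, as in the crab phrase: C(9) = 48620 / 10
print(binary_bracketings(10))  # 4862
```

Allowing flat (ternary or wider) constituents would give a larger count, so the binary figure is best read as a lower bound on the space of conceivable parses.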

Posted by Geoffrey K. Pullum at 10:50 AM

May 26, 2006

Retirement or Phase II

What is it in our makeup that keeps us going when it seems like we're old enough to quit work and we should be trying to take it easy? Today's article in The Washington Post (here) about the failure of two Indy car racing icons to retire and stay retired seems to have currency for lots of us retired folks. Michael Andretti and Al Unser, Jr. are back at racing again this year, having officially retired a year or so ago. As Unser put it, they just can't stay away:

"Both of us couldn't make the clean break," said Unser, who said he's giddy to be preparing for his 18th Indianapolis 500 ... "I thought I could make a clean break away from it ... I'm not doing this for a living. I'm doing this because I love racing, and that's what I want to do."

When I retired from teaching linguistics, I thought I'd spend the rest of my life doing oil painting, exploring the mountains of Montana, and maybe doing some fishing. Now, ten years later, I still haven't touched the huge supply of art equipment my students gave me as a retirement gift. I quickly sold the pickup truck that I planned to drive in the mountains. And it didn't take me long to discover that I don't much like to fish. Meanwhile, it was very hard for me to get linguistics out of my system.

Like Al Unser, Jr., I couldn't make a clean break. I've also noticed this in many of my colleagues who tried to retire. Among others, Dwight Bollinger, Peter Ladefoged, and Eric Hamp easily come to mind. We love linguistics and that's what we want to keep on doing. Like them, I've failed the course, Standard Retirement 101, several times in the past ten years. I don't teach university classes any longer but I continue to consult, review tenure and promotion applications, read new stuff (and old stuff too), give lectures once in a while, evaluate grant applications and book prospectuses, see more of the world, serve on boards, and, mostly, do a lot of writing. There's a better name for "retirement" here. Let's call it "phase II."

For those of us who love what we do and whose minds are still relatively clear, retirement is probably not the time for dropping everything to sun ourselves on the beach or play shuffleboard in Florida. It's the time to reflect, synthesize, and bring together those loose ends that have nagged at us until now. It's the time to write the books and articles that we didn't seem to have time for while we worked every day in the classroom. And it's a freeing time, with no administrative tasks to interfere with our research and writing, no tenure to strive for, no annual course evaluations, no tests to grade, and no more endless university committees to serve on.

Like Unser, I'm giddy (well, happy anyway) about this phase of my life, whether it's called "retirement" or simply "phase II." And I want to report to younger scholars that there's no need to be afraid of reaching this wonderful stage of life.

Posted by Roger Shuy at 01:43 PM

"Not good enough for us, too good for you"

This bumper sticker was posted by James Joyner at Outside the Beltway ("Congressional Double Standard on Warrants", 5/25/2006), with credit for the slogan given to an "anonymous comment on an Orin Kerr post":

The meaning seems intuitively obvious -- Congress is OK with warrantless wiretapping of citizens ("us"), but objects to a warrant-based search of a congressional office ("you"); and the person displaying the bumper sticker thinks this is hypocritical. But it's not so easy to explain how to get there from the words.

I don't mean the referents of "us" and "you", which have to be inferred from the picture of the capitol building and the current associations of "warrant". [Well, maybe this is a problem -- I thought it was obvious that first person ("us") is in the voice of the person displaying the sticker, while "you" is the U.S. Congress; but Melissa Fox (see below) assumed the opposite assignment.] The problem has to do with two quasi-idioms based on good (in "not good enough for" and "too good for").

The second half of the slogan is pretty easy. If we ask Google what sorts of things are "too good for him|them|you", we find a preponderance of punishments: impeachment, US jails, killing, the death penalty, hanging, horse whipping, etc. The idea is that some crimes are so heinous that the legally-prescribed punishments (impeachment, the death penalty), and even extra-judicial sanctions of whatever kind, are not strong enough responses. In the current situation, according to the bumper sticker, mere FBI investigation and search under the terms of warrants granted for probable cause is "too good for" corrupt members of congress. "I say, search 'em all. Now."

But what about the first half of the slogan? In what sense are warrants "not good enough for us"? When we say that "X is not good enough for Y", and X is some sort of object or situation or process or institution, we generally mean that Y wants or needs something more than what X provides. (Y might have a valid reason, or just feel generally superior to things like X...)

100 feet not good enough for you?
Jimmy Carter, a third generation southern Baptist, has come to a painful decision that the ole-time religion is not good enough for him.
... apparently my software's not good enough for them anymore.
If GnomeMeeting is not good enough for you, where is the problem?
If so, the [audio] system is probably not good enough for you.

So the first half of the bumper-sticker slogan ("Warrants: not good enough for us...") suggests that we want something more than what warrants provide. This is confusing, since the main complaint about the warrantless wiretapping was that it is extra-judicial -- if the FISA court had been used, most people would not have objected. So does displaying this bumper sticker (or wearing one of the t-shirts) commit the user to the view that even FISA warrants would have been "not good enough"?

The other possibility, I guess, is that the "warrants are not good enough" complaint was meant to refer not to extrajudicial wiretapping but to the concerns (fairly widespread in the libertarian blogosphere) over "intrusive paramilitary raids" carried out with warrants.

A less generous interpretation of the bumper sticker's sentiment might be: "never mind all this business about warrants, the feds should just search the bad guys (themselves?) and leave the good guys (us) alone". That's a natural human reaction, but not much of a judicial (or logical) principle.

Great bumper sticker, though.

[Update -- Melissa Fox writes:

I was surprised to see that you parsed the slogan from the bumper sticker as referring to the citizenry as 'us' and Congress as 'you'. My reading had been that the capitol dome indicated Congress was the speaker -- so warrants aren't good enough for us (congress) but too good for you (the people); that is, normally a warrant is enough to allow a search, but not for members of Congress, oh no, warrants aren't good enough -- what do they want, a sign from above? Conversely, as you say about the second half (only with opposite referents), the contempt in which Congress seems to hold the American people suggests that we don't even deserve to be protected by due process, to be notified before our homes and offices are searched or our phone conversations recorded, etc., etc. Warrants are too good for us -- which, as it's Congress speaking to us, means they're "too good for you".

That seemed so intuitive to me that I had to read your LL post three times before I got it. :-/

Hmm. I'm not sure whether this is evidence that this is not such a great bumper sticker after all, or that it is even better than I thought :-). If I have time later today, I'll put up an interactive poll so that we can get a sense of how many people construed the slogan in which way...]

[Karen Davis and Fernando Pereira agree with Melissa Fox -- I can see that I've got the back end of the elephant on this one.]

Posted by Mark Liberman at 06:37 AM

May 25, 2006

Not old enough for sex, by half

I was surprised to find that a linguistic point was front and center in Dan Savage's widely published raunchy sex advice column ‘Savage love’ last week.

"I'm a straight guy, 17-and-a-half", wrote an advice-seeking reader whose "Catholic Christian girlfriend" is "still a virgin" and has stubbornly not agreed to have sex even after "more than four months" of dating. Is four months of "stalemate" too long? Should he hold or fold? "Please help", he says.

And Dan Savage (not one of your feelgood, I'm-OK-you're-OK therapists) comes roaring down on him, armed with a crucial linguistic piece of evidence:

...no one who gives his age as "whatever-and-a-half" is mature enough to be having sex himself, much less sitting in judgment over someone else's decision not to have sex.

Dan's quite right, as far as my linguistic intuition goes: there is some vaguely delimited age at which you stop counting your age in steps smaller than one year, and the age at which it seems reasonable to say that a person is truly mature enough for the responsibilities that go along with sexual activity seems (forgive me, sexually active junior high-schoolers) to be broadly located somewhere after that point. I hadn't explicitly noticed that before, but I think Dan has identified a reasonable rule of thumb.

To make it depend on something more robust than my linguistic intuitions, or Dan Savage's, let's look at the Google hit counts for a few relevant phrases. Some will be bad hits (like across sentence boundaries, or followed by "months"), and at 22 there is a bad data point because of a much-quoted historical reference, but the pattern is a descending one from 18,900 hits for two and a half down to zero at twenty-five and a half. It's very clear from a graph on a logarithmic scale:

The raw data follow:

Phrase                          G-hits
aged two and a half             18,900
aged three and a half              631
aged four and a half               468
aged five and a half               134
aged six and a half                134
aged seven and a half              133
aged eight and a half               80
aged nine and a half                81
aged ten and a half                 31
aged eleven and a half              27
aged twelve and a half              32
aged thirteen and a half            17
aged fourteen and a half            21
aged fifteen and a half             32
aged sixteen and a half             31
aged seventeen and a half           35
aged eighteen and a half            16
aged nineteen and a half            10
aged twenty and a half               1
aged twenty-one and a half           1
aged twenty-two and a half          >3
aged twenty-three and a half         1
aged twenty-four and a half          1
aged twenty-five and a half          0
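
To see the descending pattern in these counts without a plotting library, here is a rough sketch that draws a text bar chart on a logarithmic scale. The counts are copied from the raw data; the function names, the bar width, and the choice of log10(count + 1) (so that a count of zero yields an empty bar) are all assumptions of this sketch. Age 22 is omitted because the data give it only as ">3" and the text flags it as a bad data point.

```python
import math

# Google hit counts for "aged N and a half", keyed by N.
# Age 22 is left out: it is reported only as ">3" and flagged as a bad data point.
HITS = {
    2: 18900, 3: 631, 4: 468, 5: 134, 6: 134, 7: 133, 8: 80, 9: 81,
    10: 31, 11: 27, 12: 32, 13: 17, 14: 21, 15: 32, 16: 31, 17: 35,
    18: 16, 19: 10, 20: 1, 21: 1, 23: 1, 24: 1, 25: 0,
}

def log_bar(count, width_per_decade=10):
    """Return a bar whose length grows with log10(count + 1).

    Adding 1 keeps a zero count well defined (it maps to an empty bar).
    """
    return "#" * round(width_per_decade * math.log10(count + 1))

def render(hits):
    """Build one chart line per age: label, raw count, log-scaled bar."""
    lines = []
    for age in sorted(hits):
        lines.append(f"{age:>2}.5  {hits[age]:>6,}  {log_bar(hits[age])}")
    return "\n".join(lines)

print(render(HITS))
```

On this scale each factor of ten in the hit count adds the same length to the bar, which is why a steady decline across several orders of magnitude is easier to see than it would be on a linear axis.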

Endnote: I am expecting to get a certain amount of joshing around the corridors of Language Log Plaza, and perhaps a certain amount of mail from impudent strangers, over what exactly I was doing reading a sex advice column in Boston's raunchiest free weekly, The Dig. The answer is that it is the duty of the Language Log staff to scan the widest possible array of popular media to provide you with the hard, penetrating (oops, 'scuse the metaphor) linguistic analysis you have come to expect. For us, all human linguistic life is worthy of study. There is nowhere that Language Log will not go to find linguistic insights for your interest and reading pleasure: the casinos of the Las Vegas strip; the pages of 18th-century pornography; the sleazier side of the psychological study of transsexuals; graffiti in men's bathrooms... There is nowhere we will not go, nothing we will not read, if it is in the service of linguistic science.

Posted by Geoffrey K. Pullum at 06:50 PM

I have stress! You have stress! Not resolved!

The latest "viral video" to become a global sensation via the Youtube website is a six-minute clip from Hong Kong called "Bus Uncle" (or "Uncle Bus," as Wikipedia currently renders it). It's a cell-phone video shot on a bus, documenting an older passenger chewing out a younger one who dared to tap him on the shoulder for talking too loudly. Here is how a recent AP article describes the encounter:

The film starts out when the protagonist, a middle-aged man, reacts strongly when a young man sitting behind him taps his shoulder to ask him to keep his voice down while talking on the phone.
"I don't know you. You don't know me. Why do you do this?" the infuriated bus rider says, punctuating the sentence by jabbing his right hand downward in the air.
When the young man, who rarely talks back during the lengthy argument, expresses an unwillingness to continue the conversation, the middle-aged man explodes, "This is not resolved! This is not resolved! This is not resolved!" — which has now become a catch phrase in Hong Kong.
He goes on to say, "I face pressure. You face pressure. Why did you provoke me?"

The belligerent man goes on to make some indecent remarks about the younger man's mother, all the while bullying him into shaking his hand to "settle" the dispute. It makes for strangely compelling viewing — indeed, the original video has been viewed about 1.8 million times since it was uploaded on April 29th. And that doesn't include other versions, such as one subtitling the Cantonese dialogue with English and Mandarin (highly recommended), not to mention musical remixes and various parodies. Thanks to Youtube, the "Bus Uncle" catchphrases have become firmly lodged in Hong Kong pop culture. And thanks to the new AP article, English translations of the phrases are now circulating widely. One blogger says she has taken to repeating the translations without even having seen the video. As a public service, I've transcribed and transliterated two of the most popular catchphrases for those who would like to spread the meme in the original Cantonese.

Disclaimer: I don't know a lick of Cantonese, but fortunately these two phrases are lexically and syntactically simple enough that I was able to piece the transliterations together without too much trouble using Adam Sheik's online Cantonese Dictionary. I've provided the original phrases with Jyutping romanization and word-for-word glosses. (See here for how to interpret the tone numbers.)


我 有 壓力! 你 有 壓力!
"I have stress! You have stress!"

ngo5  jau5  ngaat3 lik6    nei5  jau5  ngaat3 lik6
I     have  pressure       you   have  pressure

未 解決!
"Not resolved!"

mei6     gaai2 kyut3
not yet  solve
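
For anyone unsure how to read the romanization, here is a small sketch that splits a Jyutping string into syllables and tone numbers and attaches the conventional name of each of the six Jyutping tones. The function names are illustrative, and the regular expression assumes well-formed lowercase Jyutping with one tone digit per syllable.

```python
import re

# Conventional descriptions of the six Jyutping tone numbers.
TONES = {
    1: "high level",
    2: "high rising",
    3: "mid level",
    4: "low falling",
    5: "low rising",
    6: "low level",
}

def split_jyutping(text):
    """Split a Jyutping string into (syllable, tone number) pairs.

    Assumes lowercase letters followed by a single tone digit from 1 to 6.
    """
    return [(m.group(1), int(m.group(2)))
            for m in re.finditer(r"([a-z]+)([1-6])", text)]

def describe(text):
    """Return one 'syllable: tone N (name)' line per syllable."""
    return [f"{syl}: tone {tone} ({TONES[tone]})"
            for syl, tone in split_jyutping(text)]

for line in describe("ngo5 jau5 ngaat3 lik6") + describe("mei6 gaai2 kyut3"):
    print(line)
```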

With any luck, these catchphrases will achieve the prominence of other Youtube-disseminated memes, such as, say, "Mr. Pibb and Red Vines equals crazy delicious."

[Update: You never know where those "Bus Uncle" phrases are going to show up next. Here's a lovely if enigmatic image from no-sword:

]

Posted by Benjamin Zimmer at 06:05 PM

Public competence, linguistic and otherwise

In response to my call for "broader common-sense discussions" of language-related issues, Ryan Miller wrote:

I just wanted to note that if your hope for public knowledge of linguistics is that it reach the level of public competency that "automobiles, computers, investments, or court cases" have revealed, then you may well be disappointed with your results. Marginal Revolution had quite the head-shaking over the inability of economics graduate students to pass a multiple-choice test on opportunity cost, and I don't gather that most people have any idea what habeus corpus means or what diesel fuel is or the difference between documents and applications (if the latter seems improbable to you, ask any help-desk technician). So I suspect you will either be easily satisfied or easily disappointed.

I recognize that not all public discussion of opportunity cost, habeas corpus, diesel fuel or software is well informed, but at least some of it is. And I continue to believe that weblogs and other social media can provide a sort of inverse Gresham's Law, which I formulated this way: "[O]pen intellectual communities intrinsically tend to generate a virtuous cycle: if there were an order of magnitude more science writing in blogs, there'd be less than an order of magnitude more crap, and more than an order of magnitude more good stuff".

(I'm sure this idea is not original, but I don't recall where I've seen it before. As an idea about rational inquiry in general, it surely goes back at least to Roger Bacon.)

Based on this virtuous-cycle perspective, I'll be satisfied if the amount of discussion increases, and the rate of growth of bullshit is significantly smaller than the rate of growth of sensible and informed stuff.

Ryan cited the infamous Tuttle Software Correspondence as evidence for his view that the "level of public competency" in the computer area might be disappointing to me. On the contrary, I see this as a pretty favorable case -- a large community of well-informed people have been shaking their collective heads over one individual's stubborn misunderstandings. And as far as I can tell, the issue wasn't picked up by the AP or AFP newswires (much less the NYT or the Guardian) and presented by technologically-ignorant reporters under headlines like "Software company defaces town website". If language-related issues and events generally came out this well, I'd be ecstatic, not just satisfied.

[Update -- Mike Albaugh comments:

In the discussion about "Licensed Linguists", you quoted Ryan Miller in re: how the great unwashed don't seem to grasp the difference between applications and documents. Perhaps their guts sense that the diference is not as clear-cut as the Ryan Millers of the world think it is. Perhaps he was one of the ones confidently assuring his friends that there was no possibility of an email virus, on the eve of "I Love You".

A little knowledge is a dangerous thing. If you get very far into any subject, you find yourself sounding like a lawyer, or Rabbi: "It depends".

Well, I'm sure that Ryan knows about rogue email attachments and word-processor macros, and was just waving a hand at the numerous faulty presuppositions that help-desk folks have to deal with every day; but Mike's point is also a valid one. It's a strength of vigorous civic discourse that assumptions are always getting questioned, and compelling questions have a reasonable chance of reaching a critical mass of people.]

Posted by Mark Liberman at 09:06 AM

May 24, 2006

On Learning Mandarin in America

[Guest post by Victor Mair]

Two daughters of a couple whom Li-ching and I have known for many years both went to the same elite American university (one of the very best in the United States; extremely competitive to get into). The father and mother are mainland Chinese who were born in Taiwan, went to National Taiwan University, and attended graduate school in the United States. Both of the daughters were born in America, but the father and mother have always spoken Mandarin to them and required that they attend weekend schools for Mandarin instruction throughout their primary, junior high, and high school education.

The two girls, who are very diligent, intelligent, and obedient (TING1HUA4), probably recognized about 500 HAN4ZI4 before entering college and could speak at the intermediate level, but they could barely write anything and had difficulty reading even children’s books (admittedly, Chinese children’s books are infamous for not being written in a style appropriate for young readers). When they went to their elite university, they essentially had to start all over again at the beginning.

Both of the daughters went through the third-year level of Mandarin at the university. The elder graduated about five years ago, and her Mandarin now is probably at the same level she had achieved when she was in the fifth grade of elementary school – viz., barely functional for reading and writing, and moderately fluent for speaking and listening. Her younger sister is now in 3rd-year Mandarin at the university and is likely going to end up at about the same level in all four skills (reading, writing, listening, and speaking) as her JIE3JIE.

In fact, what prompted me to write this account at all is the sensational realization by MEI4MEI’s mother that she was writing her Mandarin compositions with a software program that permits one to type in English sentences and let the computer convert them into Mandarin! The mother discovered this situation when she read one of her daughter’s compositions and could scarcely believe how ungrammatical and unnatural it was. Such software is apparently in quite widespread use, not just in America, but even in Hong Kong, Singapore, and elsewhere.

Now, JIE3JIE and MEI4MEI are both highly intelligent and hard-working, but their Mandarin is, for all intents and purposes, useless when it comes to reading and writing, and minimal when it comes to speaking and listening – despite the fact that they have spent more than 15 years studying it in school and university (plus at home). And I’m sure that this tragic situation is by no means peculiar to these bright sisters, but is endemic among many (actually most) students across the country and, indeed, throughout the world. How can this be?

Well, as I knew intuitively when I began studying Mandarin in 1968 and have reiterated on countless occasions since then, it’s because there’s far too much emphasis on HAN4ZI4 from the very beginning. I believe that students should NOT be exposed to HANZI for **at least** the first year of instruction, and preferably not for the first two years. Only then should the characters be gradually introduced. Why? The answer is simple: students need to master the basics of the language (pronunciation, vocabulary, grammar, syntax, idioms, etc.) before they are required to memorize hundreds of HANZI. It is essential to internalize the patterns and structures of the language before spending endless hours vainly striving to master large numbers of HANZI. As a matter of fact, students who initially concentrate exclusively on learning pronunciation, grammar, syntax, etc. and are not burdened by having to memorize HANZI actually learn the HANZI more effectively and quickly later on when they do start to acquire them systematically **in relation to the language as a whole.** This has been proven in the ZHU4YIN1 SHI4ZI4, TI2QIAN2 DU2XIE3 (“recognize characters [through] phonetic annotation, speed up reading and writing”) experiment in the People’s Republic of China. It is also borne out by the experience of students who learn Cantonese, Taiwanese, and other Sinitic languages without having to be saddled with the HANZI. It is remarkable how swiftly such students can attain fluency when exposed to a full course of instruction and exercise.

The negative effects of having to learn HANZI during the first few years of instruction are numerous. In the first place, they consume limitless amounts of time that would better be spent on actual language learning. Secondly, they emphasize monosyllabic morphemes over multisyllabic words, which constitute the overwhelming proportion of the lexicon. Third, the HANZI themselves cannot be efficiently memorized in the absence of a prior mastery of the fundamentals of the language, thus a vicious cycle ensues, with neither language acquisition nor control of the script making any noticeable progress.

JIE3JIE’s and MEI4MEI’s mother is still earnestly seeking a way to improve the reading and writing skills of her daughters in a realistic manner. I have a simple solution: make available a large amount of quality literature with phonetic annotation for each character, preferably showing word division marked by spaces (FEN1CI2 LIAN2XIE3). This is comparable to KANJI texts with FURIGANA in Japanese.

Students should not be blamed for being poor learners when it is their teachers who employ outmoded, impractical methods.

Posted by Mark Liberman at 06:34 PM

Broader common-sense discussions, not narrower "licensed" commentary

There's a lot of confident-sounding nonsense out there about language, in print and in conversation. And it's natural for Roger Shuy, who does a lot of work in forensic linguistics, to describe this in terms of the metaphor of "practicing linguistics without a license". However, I'm afraid that some of the nonsense comes from people with impeccable academic credentials, while some of the sensible stuff comes from people who are smart, rational and curious but don't have any diplomas. And the idea of being "licensed to practice linguistics" suggests that we want people to be passive consumers of expertise, whereas the truth is just the opposite. It's great that so many people want to talk about language. Our suggestion is not that they shut up and leave it to the experts, but rather that they put some effort into learning and thinking as well as into writing and talking.

In the case that Roger was commenting on, my beef with Stewart Lee was not that he's a comedian without any particular credentials in linguistic analysis. My problem was that his ideas are plain nonsense, in ways that anyone who can read his essay can easily understand. For example, Lee says that the "pull back and reveal" type of joke doesn't translate well from English into German, because "the rigours of the German language's far less flexible sentence structures" "prevent using little linguistic tricks to conceal the subject of our sentences until the last possible moment". But in his examples, the "reveal" is not determined by the order of words in a sentence, but rather by the order of clauses in a discourse -- a matter in which German differs from English not at all. (For that matter, the sentential word-order in his example doesn't crucially differ between the two languages, though that isn't relevant to his argument.)

You don't need a course in linguistics, much less an advanced degree or any other sort of credential, to see the problem here. You just need to understand the meaning of the words Lee used -- or to look them up on line if you're not sure -- and to ask yourself a few simple questions about the logic of his argument.

The same thing is true about the other case that I mentioned, the claims of Prof. Jean-Claude Sergeant about the paradoxical rigidity of the English language in comparison to French. Prof. Sergeant is certainly as licensed as they come, in formal terms -- "Director of the Maison Française in Oxford, a research centre funded by the French Ministry of Foreign Affairs ... former head of the British and American Studies Department at the ... Sorbonne Nouvelle", recipient of an honorary degree in 2002 from Oxford University. But what Prof. Sergeant had to say about English in relation to French was just as confidently nonsensical, in my opinion, as what Mr. Lee had to say about English in relation to German. And anyone who can read his essay, and cares to take the time to think through what it means, and to spend a few more minutes looking at the facts, can see this.

Like Roger, I'm very much in favor of better linguistics education in the general curriculum. More linguistic knowledge and more analytic skills and more practice in analysis and argumentation would surely be a good thing. But the pay-off, in my opinion, would be public discussion of language that is as vigorous, rational and well-informed as (say) public discussion of automobiles, computers, investments or court cases. And the way to get there is probably to encourage broader discussion -- and therefore more nonsense -- rather than less.

At least, that's what I argued last year about science blogging in general ("Raising standards -- by lowering them", 3/7/2005). I offered a three-point plan for improving scientific communication, whose first point was:

Encourage everyone to think about science, and to write about it on the web, whether they know anything about it or not. And encourage them to criticize what others write, and to read others' criticisms, and to tell their friends about the best stuff that they find, whether in the popular media, or in the technical literature, or in weblogs. I claim that open intellectual communities intrinsically tend to generate a virtuous cycle: if there were an order of magnitude more science writing in blogs, there'd be less than an order of magnitude more crap, and more than an order of magnitude more good stuff. (The same is probably true for science writing in newspapers, though the network effects are smaller there.) This follows from a scientific version of Moglen's Metaphorical Corollary to Faraday's Law: add more wires, lower the resistance, and more intellectual current is induced.

In my opinion, the same thing goes for language-related writing, whether strictly scientific or simply rational. I know that Roger feels basically the same way that I do. It's just that when you hear some of the crap that people come out with, it's hard to resist the temptation to turn on the siren and write them a ticket.

Posted by Mark Liberman at 05:37 PM

Report from the language log security department

Mark Liberman's post (see here) on the Guardian article in which the writer practices linguistics without a license echoes many other Language Log complaints about this crime, which appears to be running rampant throughout our society these days, not just in the press. For example, one evening at a dinner party I was the only linguist present when a psychiatrist sitting next to me felt it necessary to lecture me on the evils of Vernacular Black English, uttering nonsense in virtually every sentence. More recently, I've found that the legal profession has a distressing lack of knowledge about our field. And I won't even go into the problems that educators have with this. But let me relate an incident that happened to me just yesterday.

In my role as one of the security officers at Language Log Plaza, I was forced to issue a citation to a PhD dissertation writer who, in her preliminary draft, falsely referred to part of her research as linguistics. Okay, it was a psychology dissertation and so maybe I should have cut her some slack. But, as a law-abiding linguist, I felt that I had to enforce the law. So I ticketed her for driving her dissertation on the wrong side of the road. In a conciliatory tone, I explained that I was doing this for her own good. I wouldn't want readers of her final draft to accuse her of practicing linguistics without a license.

Like many psychology dissertations, her research was a rather good content analysis of lots of data that she had carefully gathered. Among other things she used a computer program to count the number of times personal pronouns and other language features occurred. She backed this up with a good statistical analysis. The problem is only that she referred to this as a linguistic analysis. She found instances of indirectness but didn't say anything about how this worked. It was simply a good tallying exercise. Same for conditionals, passives, and instances of politeness. No linguistic analysis of these features--only their presence or absence. She used a research approach that was appropriate enough for what she was trying to accomplish but it simply wasn't linguistics.

Mark is quite correct about the way journalists mangle linguistics in articles that compare one language with another. Past Language Log posts, too numerous to mention here, have also dealt with journalistic ignorance of such things as the alleged language learning of birds and animals, how many words there are in the English language, and the claim that there are no words for X in language Y. Maybe it's our job to give citations to offenders but there is also a whole lot of work for us to do in the education of our sister disciplines. We haven't been very good at this.

Posted by Roger Shuy at 12:38 PM

Thriving on confusion in the Guardian

Last year, I wrote about Jean-Claude Sergeant's view of the English language ("Paradoxes of the imagination"; "Another over-earnest comedy of fact checking"). Sergeant, then "professeur de civilisation britannique" at the University of Paris III, informed us solemnly that

In its present configuration, current English is characterized first by an extreme concern for coherence and for explicitness approaching redundancy. The core constituents of the phrase -- subject, verb, complement -- cannot be as easily separated as in French, and the order in which they occur in the phrase is less susceptible of modification.

Now Stewart Lee ("Lost in Translation", Guardian 5/23/2006) tells us that

The German language provides fully functional clarity. English humour thrives on confusion.

It's hard to keep track of the intricate graph of European ethnic stereotyping, involving relative degrees of rationality, punctuality, diligence, food-preparation skills, and so on, though certain north-to-south and west-to-east trends are obvious. But I'm beginning to get the idea that at least one of these relations of mutual prejudice is symmetrical: speakers of language X always think that language Y is less flexible than X, for any values of X and Y.

Lee's conclusions about the German language are based on what must have been a very curious experience:

In December 2004 I accompanied Richard Thomas, the composer of the popular stage hit Jerry Springer The Opera, to Hanover, where he had gained a commission to develop an opera about a night in a British stand-up comedy club. We wrote the words in English and Richard then collaborated on a translation with a talented German comedy writer called Hermann Bräuer.

Throw in a machine-translation researcher, and you've got the premise of a Richard Powers novel.

Anyhow, it turns out that translating the jokes in this opera libretto was hard. You might think that this is because it's hard to translate songs, and hard to translate jokes, and doubly hard to translate sung jokes. However, Lee concluded that blame should be assigned to "the rigours of the German language's far less flexible sentence structures". Specifically, "German will not always allow you to shunt the key word to the end of the sentence to achieve [the] failsafe laugh" associated with "the endless succession of 'pull back and reveals' that constitute much English language humour".

He gives an interesting, quasi-formal analysis of "pull back and reveal" humor (more fodder for that Richard Powers novel):

At a rough estimate, half of what we find amusing involves using little linguistic tricks to conceal the subject of our sentences until the last possible moment, so that it appears we are talking about something else. For example, it is possible to imagine any number of British stand-ups concluding a bit with something structurally similar to the following, "I was sitting there, minding my own business, naked, smeared with salad dressing and lowing like an ox ... and then I got off the bus." We laugh, hopefully, because the behaviour described would be inappropriate on a bus, but we had assumed it was taking place either in private or perhaps at some kind of sex club, because the word "bus" was withheld from us. Other suitable punchlines for this set-up would be, "And that was just the teachers", "I was 28-years-old" and "That's the last time I attempt to find work as a research chemist in Paraguay."

At the risk of being accused of spoiling a good story with a mutant American version of Teutonic over-rationality, I have to point out that this makes absolutely no effing sense at all. I don't mean that Lee's example needs work (although it does -- if you try telling his joke, in English, with any of the proposed punch lines, you're more likely to get puzzled stares than laughs). Let's imagine substituting a better joke, and go on. The "pull back and reveal" joke structure, as Lee describes and exemplifies it, consists of a sequence of clauses: A, B, C ... and then D. Is there any language on earth in which you can't tell a story that way? My knowledge of German barely reaches the ability to read with the help of a dictionary, but that's enough to make me sure that if any language is so bizarrely crippled, it's not German.

[In the particular case of "I got off the bus" as a punch line, German might prescribe "... dem Bus aus" instead of "... off the bus" -- but can that possibly spoil the associated joke, if any? And what if the joke involved "off" rather than "bus" -- "and then I got OFF the bus" -- then German would allegedly have the advantage, right?]

To this incoherent theory about the alleged role of sentence structure in the cross-linguistic rhetoric of alleged humor, Lee adds some equally incoherent stuff about the effect of German noun compounds:

In English there are many words that have double or even triple meanings, and whole sitcom plot structures have been built on the confusion that arises from deploying these words at choice moments. Once again, German denies us this easy option. There is less room for doubt in German because of the language's infinitely extendable compound words. In English we surround a noun with adjectives to try to clarify it. In German, they merely bolt more words on to an existing word. Thus a federal constitutional court, which in English exists as three weak fragments, becomes Bundesverfassungsgericht, a vast impregnable structure that is difficult to penetrate linguistically, like that Nazi castle in Where Eagles Dare.

Penetrating this fortress of balderdash is left as an exercise for the reader. I'll supply one clue, namely a German joke that depends on a pun wrapped in a noun compound:

Aus welchem stahl macht man Autos in Polen? Diebstahl.

Literal translation: "From what steel do they make cars in Poland? Theft." Linguistic background: stahl means "steel"; stehlen means "to abstract, appropriate, steal", preterite form stahl etc.; dieb means "thief", and the compound diebstahl means "larceny". Cultural background: Poland is apparently a notorious destination for cars stolen in Germany.

Can you translate that joke? No, not really. Is it because {English|German} is {more|less} ambiguous or linguistically penetrable than {German|English}? The answer is left to you.

Finally, Lee offers a third "explanation" for why German is allegedly a bad language for jokes. According to him, in German

... [t]here seemed to be no nuanced, nudge-nudge no-man's land, where English comic sensibilities and German logic could meet on Christmas Day and kick around a few dirty jokes in a cheeky, Carry On-style way. A German theatre director explained that this was because the Germans did not find the human body smutty or funny, due to all attending mixed saunas from an early age.

At this point, I'm beginning to think that "Stewart Lee" is the invention of a team of writers from the Onion. There are no doubt some cultural differences in deploying sexual allusions in humor, but are saunas really a ubiquitous fixture of modern German family life, or has "Lee" carelessly displaced this practice a country or two southwards? And aren't there any metaphors for English-German translation that don't involve WWI or WWII?

I don't have any sort of broad empirical basis for evaluating Lee's conclusion:

The geographical accident of Germany has denied Germans the fun we have with language, and it seemed to me that their sense of humour was built on blunt, seemingly serious statements, which became funny simply because of their context.

But my experiences with German friends, and what little I know of German literature and German humor -- Freud's discussions of humor will do for a start here -- leave me very skeptical of the notion that Germans are "denied the fun we have with language" and that "their sense of humour [is] built on blunt, serious statements".

I guess it's inappropriate to expect coherence in an opinion piece by a comedian, even if its veneer of rationality suggests that its description of the English and German languages is meant to be meaningful. And there's some good stuff in the article. Lee mixes his little spurts of ethnic prejudice and his incoherent linguistic analyses into a slurry of interesting anecdotes and jokes -- leading with a pretty good version of the traditional essentialist joke about the ethnically-German child raised by English parents -- which many will enjoy.

Far be it from me to suggest that the Guardian needs theory checkers. And it's obviously impractical to imagine that people in general, and comedians in particular, will ever give up basing their opinions on unsupported and unexamined national stereotypes. But I can hope that someday, people in a position to write for outlets like the Guardian will have gotten some elementary linguistic analysis skills somewhere along the way.

[Hat tip to reader Ben Hadley.]

[Update -- John Cowan writes:

You write:

> At the risk of being accused of spoiling a good story with a mutant
> American version of Teutonic over-rationality, I have to point out that
> this makes absolutely no effing sense at all. I don't mean that Lee's
> example needs work (although it does -- if you try telling his joke,
> in English, with any of the proposed punch lines, you're more likely
> to get puzzled stares than laughs).

That's because "English humor" doesn't mean "humor expressed in the English language", it means "what those peculiar people in the south-eastern part of Ysl Prydain think is funny." And from the English point of view, we Americans are nothing but a lot of anglophone Germans. We can't even get the local-politics jokes in Monty Python episodes. (Seriously, Edward Hall does rate American culture as only slightly less low-context than German culture, and far more so than English, French, or New World Spanish culture. High-context cultures like the English have no problem with finding jokes like that funny.)

That's this Edward Hall, and John is usually sensible and well informed, but is that joke about getting off the bus actually perceived as funny (as opposed to silly) by UK residents? Even those from the London area? ]

[Update #2 -- Karen Davis writes:

You quote Stewart Lee as saying:

At a rough estimate, half of what we find amusing involves using little linguistic tricks to conceal the subject of our sentences until the last possible moment, so that it appears we are talking about something else. For example, it is possible to imagine any number of British stand-ups concluding a bit with something structurally similar to the following, "I was sitting there, minding my own business, naked, smeared with salad dressing and lowing like an ox ... and then I got off the bus."

I find it difficult to think of "and then I got off the bus" as the subject of the rest of it.

Perhaps he meant "the topic of our jokes"? But then, why say "the subject of our sentences" - and why believe that German can't tell that story, with an entire sentence (then I got off the bus) last? And why oh why doesn't he see that in none of his sample sentences was the subject concealed? "I" is right up at the front!

Indeed. The equivocation "subject of the sentences" vs. "topic of the joke" was one of the first things that I noticed about Lee's article. I wound up leaving it out in favor of other confusions, but maybe that was a mistake. Anyhow, there are a lot of interesting fragments of ideas floating around in what Lee has to say. The problem is that no one seems to have taken any trouble to try to clarify what they mean, how they go together, and whether they're true. That's reasonable practice for a stand-up routine, I guess, but it strikes me as out of place in an essay published in what claims to be the "best daily newspaper on the world wide web". ]

[Update #3 -- Margaret Marks of Transblawg writes:

I picked up the Guardian article on German humour too, because Trevor pointed it out. He didn't think it was as ridiculous as I did! Anyway, thanks for the detail.

That getting off the bus joke is just silly to me (from the London area!)

And Germans have several ways of expressing that:

dann bin ich aus dem Bus ausgestiegen
dann stieg ich aus dem Bus aus
dann verließ ich den Bus  (this does have a snappier and more amusing quality)

And they don't have to use a subclause - they can just string a sentence on:

Ich stieg dann aus dem Bus aus
Ich bin dann aus dem Bus ausgestiegen

This Stewart Lee appears to exist, but it is amazing people can write such rubbish.

Anyway, the Germans laugh at my jokes. Should I be worried? (Actually, when I first spent a year here, I got on much better in the six months in Berlin than the six in Franconia, from the witticism point of view. Germany is very varied, and Berlin humour is not the same as Bavarian).

]

[See Abiola Lapite's post at Foreign Dispatches "Nice Theory, Shame about the Facts" (5/25/2006) for a congruent point of view with some excellent examples and clear-headed reasoning about them. I especially liked his closing point:

Usually, when people are looking for foreign languages to construct elaborate Sapir-Whorfian theses of difference around, they're prudent enough to opt for something so exotic and alien that no one is likely to call them on their claims: quite why Mr. Lee chose to use a language so similar to his own for such a purpose - and therefore so easy to check his claims against - is mystifying to me.

Perhaps it's that comedians are not used to being fact-checked. (But then how do we explain science journalists?) ]

Posted by Mark Liberman at 09:18 AM

May 23, 2006

Never mind the prose, plot and gnosticism...

Clay Jones, from the Free Lance-Star in Fredericksburg, Va., reminds us what the real issues are. [Hat tip to Cynthia McLemore.]

Posted by Mark Liberman at 12:16 PM

The most powerful person no one has never heard of

According to a U.S. News article by Chitra Ragavan ("Cheney's Guy", 5/29/06):

[David] Addington, says an admiring former White House official, is "the most powerful person no one has never heard of."

There's one too many negatives in that sentence, or one too few. The article's subhead says it the way the source meant it: "He's barely known outside Washington's corridors of power, but David Addington is the most powerful man you've never heard of."

But overnegation is easy to fail to miss, as we've (shockingly) often observed. [Update -- extra links added 5/18/2007.]

"Negated, or not" (1/21/2004)
"I challenge anyone to refute that this negative is not unnecessary" (1/21/2004)
"Challenge as negation" (1/23/2004)
"Too complex to avoid judgment?" (2/21/2004)
"Who is to be master?" (2/21/2004)
"On not avoiding negatives" (2/21/2004)
"Why are negations so easy to fail to miss?" (2/26/2004)
"Overnegation supererogation" (4/12/2004)
"Another overnegation" (4/27/2004)
"We cannot/must not understate/overstate" (5/26/2004)
"Another overnegation opportunity: yet vs. yet to" (5/31/2004)
"Overstating understatement" (6/22/2004)
"Nothing that cannot impede even by failure" (8/16/2004)
"Rumsfeld overnegates Powell, Powell uses 'fulsome' correctly" (11/16/2004)
"Overnegation alert" (1/11/2005)
"Still unpacked after all these years" (5/17/2005)
"Still upacked: threat or menace?" (5/17/2005)
"The temptation of overnegation" (5/23/2005)
"Things that are rarely better than they normally are" (10/17/2005)
"Never anything but less than precise" (10/20/2005)
"Negation, over- and under-" (12/21/2005)
"On not emerging unscathed" (3/1/2006)
"Not doubting that the door could not be opened wider" (6/5/2006)
"Unlike no other" (7/27/2006)
"It's hard not to read this and not do a double-take" (8/1/2006)
"Been anything so long it looks like not to me" (8/3/2006)
"Overnegation as obfuscation" (8/9/2006)
"Scalar failure" (3/5/2007)
"Everyone was spared no mercy" (3/26/2007)
"Barely missing a chance to overanalyze" (4/1/2007)
"Total undernegation" (4/17/2007)

[U.S. News quote via Andrew Sullivan -- who was focused on the morality of torture, an incomparably more important issue, and apparently missed the extra negation. ]

Posted by Mark Liberman at 07:49 AM

A new rhetorical technique

Or rather, a new term for an old rhetorical technique. John Holbo, "The Mark Steyn Code", describes premature dejoculation: "stripping off the humor so you can paint it on again". As he explains,

Something like this problem actually arises in academic writing. In order to have something original to say, you pretend so-and-so didn’t see something about his own position which, plausibly, he really did. (Not a strawman argument, because you aren’t exactly attacking. A straight man argument. You need them to say something a bit obtuse, to make a space for your cleverness.)

Among the comments, Holbo hints at a relationship to Harold Bloom's theory of misreading, which is described by the Wikipedia as follows:

"Poetic influence, as I conceive it, is a variety of melancholy or the [Freudian] anxiety-principle." A new poet becomes inspired to write because he has read and admired the poetry of previous poets; but this admiration turns into resentment when the new poet discovers that these poets whom he idolized have already said everything he wishes to say. The poet becomes disappointed because he "cannot be Adam early in the morning. There have been too many Adams, and they have named everything."

In order to evade this psychological obstacle, the new poet must convince himself that previous poets have gone wrong somewhere and failed in their vision, thus leaving open the possibility that he may have something to add to the tradition after all.

Posted by Mark Liberman at 06:48 AM

Dan Brown, evangelist?

Laurie Goodstein, in the May 21 2006 NYT, explains that "It's Not Just a Movie, It's a Revelation (About the Audience)":

Even Tom Hanks, the lead actor, called the plot "scavenger-hunt-type nonsense." But it is doubtful the uproar will disappear.

The reason is that "The Da Vinci Code" is, in the sweep of Christian history, a historical marker — encapsulating in one muddled movie an era in which many Christian believers have assimilated a whole lot of new and unorthodox ideas, as well as half-truths and conspiracy thinking, into their faith, while still seeing it as Christianity. Call it Da Vinci Christianity.

In support of this view, Goodstein quotes an evangelical pollster, George Barna, as saying that

25 percent of those who had read the book said it helped them achieve personal growth or understanding.

More exactly, according to the article that Goodstein seems to have been working from, which is posted on the Barna Group's website as "Da Vinci Code Confirms Rather Than Changes People's Religious Views":

Among the adults who have read the entire book, one out of every four (24%) said the book was either “extremely,” “very,” or “somewhat” helpful in relation to their “personal spiritual growth or understanding.” That translates to about 11 million adults who consider The Da Vinci Code to have been a helpful spiritual document.

To place that figure in context, the Barna study revealed that another recently published popular novel about Jesus Christ – Christ the Lord: Out of Egypt, written by Anne Rice – was deemed to be spiritually helpful by 72% of its readers – three times the proportion who lauded Dan Brown’s book.

On the other hand, I imagine that about 300 times more people have read Dan Brown's book, yielding 100 times or so more spiritual influence.

Goodstein also quotes some of Barna's other findings, for example:

"Few people said that reading the book had actually changed any of their beliefs," he said. "That was only 5 percent. Most people said that it essentially reinforced what they believed coming into the book."

Again, 5% of 44 million is more than two million. "Changed any of their beliefs" is a pretty low standard, but comparing apples and oranges, we can observe that the Campus Crusade for Christ, the "largest evangelical organization in the United States" according to USA Today (cited here), says on its website that "Over the past five years, more than 37,900 students made a decision to become a Christian".
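
For anyone who wants to check the back-of-the-envelope arithmetic in the last few paragraphs, here is a minimal Python sketch. The percentages, the 44 million readers, and the 37,900 figure are the ones quoted above; the 300-to-1 readership ratio is only the guess made in the text, not a measurement, and the variable names are invented for illustration.

```python
# Figures quoted in the post. READERSHIP_RATIO is a guess, not a measurement.
DVC_READERS = 44_000_000   # adult readers of The Da Vinci Code (from "5% of 44 million")
DVC_HELPFUL_PCT = 24       # Barna: readers who found it spiritually helpful
RICE_HELPFUL_PCT = 72      # Barna: same measure for Anne Rice's novel
CHANGED_BELIEF_PCT = 5     # Barna: readers who said the book changed a belief
READERSHIP_RATIO = 300     # guessed Da Vinci Code readers per Rice reader
CCC_DECISIONS = 37_900     # Campus Crusade, "over the past five years"

# Integer arithmetic keeps the head counts exact.
helped = DVC_READERS * DVC_HELPFUL_PCT // 100
changed = DVC_READERS * CHANGED_BELIEF_PCT // 100

# Influence relative to Rice's novel: more readers, but a smaller helpful share.
relative_influence = READERSHIP_RATIO * DVC_HELPFUL_PCT / RICE_HELPFUL_PCT

print(f"found it helpful: {helped:,}")
print(f"changed a belief: {changed:,}")
print(f"influence relative to Rice's novel: {relative_influence:.0f}x")
print(f"belief-changers per Campus Crusade decision: {changed / CCC_DECISIONS:.0f}")
```

On these inputs, 24% of 44 million comes to roughly 10.6 million, which is in the neighborhood of Barna's "about 11 million", and a 300-fold readership advantage divided by a threefold gap in helpfulness gives the factor of 100 mentioned earlier.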

The Barna Group article makes the same point in a different way:

“On the other hand,” [Barna] continued, “any book that alters one or more theological views among two million people is not to be dismissed lightly. That’s more people than will change any of their beliefs as a result of exposure to the teaching offered at all of the nation’s Christian churches combined during a typical week.”

Let me suggest a different point, which is plausible though not supported by any polling: TDVC has made more of an impact on Americans' models of literary style and plot construction than exposure to all the teaching offered in all of the nation's English courses over the course of a decade.

The Barna Group article continues:

The people most likely to have altered their religious views in response to the book’s content were Hispanics (17% of those who read the book), women (three times more likely than male readers to do so), and liberals (twice as likely as conservatives). Upscale adults were also much more likely than downscale individuals to shift their thinking based on the novel.

I'm almost ashamed to say that I still haven't read it. It's starting to feel like an unpleasant but unavoidable civic duty to do so.

Goodstein's "Da Vinci Christianity" thesis -- that in America, "many Christian believers have assimilated a whole lot of new and unorthodox ideas, as well as half-truths and conspiracy thinking, into their faith, while still seeing it as Christianity" -- seems consistent with the theme of Harold Bloom's 1993 work The American Religion, which the Publishers Weekly review on amazon.com describes this way:

Without knowing it, American worshipers have moved away from Christianity and now embrace pre-Christian Gnosticism. ... In his most controversial book to date, the Yale professor defines "the American Religion" as a Gnostic creed stressing knowledge of an inner self that leads to freedom from nature, time, history and other selves. Every American, he writes, assumes that God loves her or him in a personal, intimate way, and this trait is the bedrock of our national religion, a debased Gnosticism often tinged with selfishness. The core of this odd, ponderous book focuses on Pentecostals, Christian Scientists, Jehovah's Witnesses, Seventh-Day Adventists and especially Mormons and Southern Baptists--the two denominations Bloom believes will dominate future American religious life. He argues that mainline Protestants, Jews, Roman Catholics and secularists are also much more Gnostic than they realize. He identifies African-American religion, mystical and emotionally immediate, as a key element in the birth of our home-grown Gnosticism around 1800. Bloom is not likely to win many converts to his viewpoint.

I don't know -- with Laurie Goodstein on board, and Dan Brown at the throttle, that train seems to be picking up speed even without Bloom's name on it.

[Update: Laurie Goodstein is not the only one who apparently wasn't paying attention to Bloom's theories and arguments. Adam Gopnik, in his review "Renaissance Man" (New Yorker, Jan. 17, 2005 -- unfortunately not available on line), writes:

A cultural anthropologist, a hundred years from now, will doubtless find, in the unprecedented success of "The Da Vinci Code" during the time of a supposed religious revival, some clear sign that, in the Elvis mode, what a lot of Americans mean by spirituality is simply an immense openness to occult superstitions of all kinds.

I'm skeptical that this characterization of my fellow citizens is a fair one, but in any case, it's odd that Gopnik didn't make the connection to Bloom's presentation of a similar set of ideas, not a hundred years later, but a dozen years before. I guess that Bloom's interest in American vernacular religion seemed so out of the mainstream, in 1993, that even an omnivorous intellectual like Gopnik simply ignored his book.]

Posted by Mark Liberman at 05:50 AM

May 22, 2006

Euphony and usefulness

On the title page of a 1993 manuscript by John McCarthy and Alan Prince, "Prosodic Morphology I: Constraint Interaction and Satisfaction", was this epigraph:

"Bulgermander. It's more euphonious than Weldmander. Weldmander will never stick." William Weld, Governor of Massachusetts, on a redistricting plan he devised with Senate President William Bulger. Boston Globe, July 11, 1992.

Alas, neither Bulgermander nor Weldmander seems to have stuck, since Google's index is currently ignorant of both coinages. However, I was reminded of Weldmander's alleged euphony problems by something Bernard Lewis said in a recent interview about the neologism Islamdom:

In talking of the Christian world ... we use two terms: Christianity and Christendom. Christianity means a religion, in the strict sense of that word, a system of belief and worship and some clerical or ecclesiastical organization to go with it. If we say Christendom, we mean the entire civilization that grew up under the aegis of that religion, but also contains many elements that are not part of that religion, many elements that are even hostile to that religion. ... In talking of Islam, we use the same word in both senses, and this gives rise to considerable confusion and misunderstanding. There are many things that are described as part of Islam, which are indeed part of Islam, if we take the word as the equivalent of Christendom, but are very much not part of Islam — are even alien or hostile to Islam — if we take the word Islam as the equivalent of Christianity. ...

The late Marshall Hodgson, of the University of Chicago, in discussing this issue, suggested that we use the word Islamdom to describe the civilization. A good idea, but it didn't catch on, probably because it's so difficult to pronounce.

As McCarthy and Prince recognized, this sort of explanation makes a lot of sense. It's plausible that some words, like craptacular and truthiness, have got the right sort of mouth feel to make it. But is it really lack of euphony that dooms almost all made-up words to the fate of glemphy and blang? Anyhow, Islamdom is by no means a total failure, since it gets 19,700 web hits on Google.

What happened to Bulgermander? Well, I guess that it referred to a small, local event whose general category already had the similar coinage gerrymander, so that there was little reason to retain it or generalize it. And Billy Bulger became president of UMass in 1996, and his brother James J. "Whitey" Bulger is still at large. So when "Prosodic Morphology I" was posted to the Rutgers Optimality Archive, as John McCarthy explained (p.c.), "cooler heads prevailed" and the epigraph was omitted.

John added:

By the way, the word "Islamdom" brings to mind my first exposure to the word "Islamism" in my youth. Once a month or so, we were made to recite this prayer:

http://www.catholic.org/prayers/prayer.php?p=26

This prayer is a rich tapestry of intolerance, what with the pairing of idolatry and "Islamism" followed by a couple of sentences on the blood guilt of the Jews.  The Protestants and the Orthodox get much milder treatment with their "erroneous opinions".

He's talking about this passage in the "Consecration of the Human Race to the Sacred Heart of Jesus":

You are King of all those who are still involved in the darkness of idolatry or of Islamism; refuse not to draw them all into the light and kingdom of God. Turn Your eyes of mercy toward the children of that race, once Your chosen people. Of old they called down upon themselves the Blood of the Savior; may it now descend upon them a laver of redemption and of life.

At least it's a bit more welcoming than the Saudi-textbook language documented by Nina Shea in yesterday's Washington Post...

The OED dates Islamism to the middle of the 18th century, as a term for the religion we would now call Islam, a word which did not come into use until quite a bit later:

1747 Gentl. Mag. 373 Never since the rise of Islamism [note So the Mahometans call their own religion] has our worship once varied.
1754 Phil. Trans. XLVIII. 755 Before the introduction of Islamism into Arabia.
1855 MILMAN Lat. Chr. IV. i. (1864) II. 169 To subdue to the faith of Islam. Ibid. 213 The potentates summoned by Mohammed himself to receive the doctrine of Islam.

That's clearly the usage in the prayer that John cited. However, the American Heritage dictionary gives Islamism a very different and much more recent gloss:

1. An Islamic revivalist movement, often characterized by moral conservatism, literalism, and the attempt to implement Islamic values in all spheres of life. 2. The religious faith, principles, or cause of Islam.

The Wikipedia article says that Islamism

attained its modern connotation in late 1970s French academia, thence to be loaned into English again, where it has largely displaced "Islamic fundamentalism."

Islamist, meaning "orthodox Muslim" or alternatively "one who is versed in Islamic studies", didn't arrive until the mid-19th and mid-20th centuries respectively:

1855 MILMAN Lat. Chr. XIV. iii. (1864) IX. 108 Caliphs who were, at least no longer, rigid Islamists.
1937 R. H. LOWIE Hist. Ethnol. Theory viii. 97 Westermarck is very widely read, and his original researches in Morocco, though only appraisable by Islamists, bear the earmarks of scholarship.

Again, the AHD offers a more contemporary sense of Islamist as the adjectival form of its noun Islamism, i.e. a certain type of Islamic fundamentalist.

Islamism and Islamist are certainly now successful words, as judged by their 2.4 million and 11.5 million Google hits, respectively. However, since "Islamic fundamentalism" also has 1.77 million Google hits, we can clearly reject the Wikipedia claim that Islamism "has largely displaced 'Islamic fundamentalism'" in English. In fact, on Google News as of yesterday afternoon, "Islamic fundamentalism" got 539 hits, and "Islamism" got only 259. On Yahoo News, "Islamic fundamentalism" got 154 hits, and "Islamism" got 78. I conclude from this that the terminological battle -- if it is one -- is still very much in progress.
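
The comparison is easier to see if each pair of counts is turned into a share. Here is a minimal Python sketch that uses only the hit counts quoted above, which are themselves rough estimates reported by the search engines; the dictionary layout and the function name are just for illustration.

```python
# Hit counts as quoted in the post; search-engine counts are rough estimates.
hits = {
    "Google web":  {"Islamism": 2_400_000, "Islamic fundamentalism": 1_770_000},
    "Google News": {"Islamism": 259,       "Islamic fundamentalism": 539},
    "Yahoo News":  {"Islamism": 78,        "Islamic fundamentalism": 154},
}

def share(counts, term):
    """Fraction of the combined hits for the two terms that belongs to `term`."""
    return counts[term] / sum(counts.values())

shares = {source: share(counts, "Islamism") for source, counts in hits.items()}

for source, value in shares.items():
    print(f"{source}: Islamism has {value:.0%} of the combined hits")
```

On the web as a whole "Islamism" has somewhat more than half of the combined hits, while in both news indexes it has about a third, which is hard to square with one term having "largely displaced" the other.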

In a May 12, 2005 blog entry "Onward, Christianist soldiers?", Ruth Walker wrote that

Google has rounded up 631 hits for me for "Christianist," along with the query, "Did you mean to search for 'Christiano'?" [...]

I figure 631 hits for ‘Christianist’ is the Internet equivalent of seeing the first sliver of the sun coming up over the mountain in the morning.

Now, a bit more than a year later, {Christianist} gets 70,200 hits (though Helpful Google still asks "Did you mean: christiano"). Some of the increase has been promoted by Andrew Sullivan, who has started using the term frequently on his blog The Daily Dish. He has also adopted the nominal form Christianism, for example in his May 15, 2006 essay "My Problem with Christianism" (free version here for non-subscribers).

{Christianism} gets 592,000 Google hits, many more than Christianist -- but that seems to be because the term has a number of prior realms of use, both positive and negative.

Anyhow, my guess is that Islamdom as a term for "Islamic civilization" has failed, while Islamist as a term for "Islamic fundamentalist" has succeeded, not because of their relative euphony, but because of their relative usefulness.

[Update: Ben Zimmer writes

Marshall Hodgson introduced a number of neologisms in The Venture of Islam. He particularly liked the "-ate" suffix for nouns and attributives, as in "Islamicate", "Persianate", and "agrarianate". Other Hodgsonisms include "citied", "technicalistic", and "shari'ah-minded". Many of these terms continue to be used by Islamicists (as opposed to Islamists!) who begin their studies with Hodgson's three-volume classic.

]

[Update #2: I neglected to note (because I had forgotten) that William Safire wrote about Christianism and Christianist in his column of May 15, 2005 -- and I didn't find it earlier, because Google doesn't index behind the Times Select wall. I happened on it in a post by Tristero at Hullabaloo, which in turn I happened on because in an adjacent post he cited a recent post of mine on the striking similarities between a book review by Mark Steyn and a blog post by Geoff Pullum. ]

Posted by Mark Liberman at 05:41 AM

May 21, 2006

Pirahã channels

According to Daniel L. Everett, "Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language", Current Anthropology, Volume 46, Number 4, August-October 2005 (preprint for those without a subscription):

The Pirahã people communicate almost as much by singing, whistling, and humming as they do using consonants and vowels.

Dan explains the contextual functions of these "channels" as follows:

Channel: Functions

a. Hum speech: Disguise; Privacy; Intimacy; Talk when mouth is full; Child language acquisition relation

b. Yell speech: Long distance; Rainy days; Most frequent use – between huts & across river

c. Musical speech ('big jaw'): New information; Spiritual communication; Dancing, flirtation; Women produce this in informant sessions more naturally than men; Women's musical speech shows much greater separation of high and low tones, greater volume

d. Whistle speech (sour or 'pucker mouth' -- same root as 'to kiss' or shape of mouth after eating lemon): Hunting; Men-only (as in ALL whistle speeches!); One unusual melody used for aggressive play

Lots of other cultures have ways of expressing speech as whistling, drumming, or chanting. What seems to be unusual about the Pirahã is the relatively large role of these other "channels" (as Dan calls them) in everyday life. As Dan suggests, this may be connected to the fact that Pirahã has a small number of consonant-vowel distinctions

The consonants of Pirahã are /p b t k g ' h/ and, in men's speech only, /s/, and the vowels are /i a o/

and a relatively complex system of syllable-weight, stress and tone. Whistling and humming preserve the prosodic distinctions and blur or eliminate the distinctions among different consonants and vowels. Thus it'll be easier to understand what someone is humming (for example) in a language where there's more information in timing, stress and tone, and less information in consonant and vowel distinctions.

But in my opinion, the most interesting aspect of this situation is what Dan calls "the sloppy phoneme effect", which allows for a "tremendous amount of variation among consonants" in ordinary speech. For example, there's apparently free substitution of /'/ (glottal stop), /p/ and /k/, as in the word for "head", which can be any of:

'apapaí    kapapaí    papapaí    'a'a'aí    kakakaí

(and so on) but not

*tapapaí    *tatataí    *bababaí    *gagagaí    *aaaí

or other examples where voiced consonants or no consonants are substituted. Some (though not all) of this variation is equivalent to preserving the syllable-weight classes involved in the Pirahã stress rule, and thus sets up equivalence-classes of words in ordinary speech similar to the equivalence-classes of hummed or whistled "channels".
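To make the size of that equivalence class concrete, here is a minimal sketch that enumerates the forms of "head" predicted if the substitution really is free in every consonant slot. That is an assumption on my part: the description above only lists a few variants and says "and so on". The names `FREE_SET` and `variants` are made up for the illustration.

```python
from itertools import product

# Assumption: glottal stop ('), /p/ and /k/ alternate freely in every slot.
FREE_SET = ("'", "p", "k")

def variants(n_slots=3, vowel="a", final="aí"):
    """Yield every form with one FREE_SET consonant per syllable slot."""
    for combo in product(FREE_SET, repeat=n_slots):
        # All syllables but the last are consonant + vowel;
        # the last syllable is consonant + the final vowel sequence.
        body = "".join(c + vowel for c in combo[:-1])
        yield body + combo[-1] + final

forms = set(variants())

print(len(forms))            # 3 consonants in 3 slots: 27 forms
print("'apapaí" in forms)    # attested variant: True
print("tapapaí" in forms)    # starred form with /t/: False
```

Under that assumption a single word has 27 predicted pronunciations, while any form containing /t/, /b/, /g/ or an empty consonant slot falls outside the set, matching the starred examples.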

[A side note for the linguists among you -- I don't know why Dan describes things this way, rather than saying that glottal stop, /k/ and /p/ are actually just allophones of a single underlying phoneme. Perhaps the statistical variation is lexically modulated in a way that makes clear that one word is basically /kaka/ while another is basically /papa/, even though both can in principle be any of [kaka] [kapa] [paka] [papa] (adding in the glottal-stop variants as appropriate)? That would be fascinating but very unexpected.]

To make all this less abstract, here are a couple of examples. First, an .mp3 recording of two boys conversing in musical speech ("big jaw"), from the "enhancements" of the cited Current Anthropology paper. I believe that this is roughly a "hey guess what I did today" conversation, but unfortunately I don't have a transcript, a rendition in normal speaking mode, or any further explanation. Nor do I have any examples of "hum speech" to show you.

Here's the example of whistled speech that Dan gives in the cited article, with spoken and whistled version performed by Dan himself (he kindly sent it from a hotel room in Brazil, where he was on his way to another summer of field work). First the basic sentence, kái'ihí'ao 'aagá gáhí "there is a paca" (audio link):

(syllables) kái 'i 'ao 'aa gái
(tones) H-L L H H+L L H H-L H
(syllable weight)
(stress)
(glosses) paca possible exist-be there
(translation) "there is a paca"

(I guess that this is the "hunting" kind of whistle speech, rather than the "aggressive play" kind.)

Here's a picture showing the pitch contour, spectrogram and waveform of the spoken version (note that I've used "?" to mark glottal stop in this case -- sorry for the lack of consistency):

Here's an audio link to Dan's whistled version. The picture below shows a "narrow-band" spectrogram, to make the pitch of the whistling plain, along with the waveform.

(The whistled version is somewhat longer, in this case -- 2.34 seconds versus 1.759 seconds, or about a third longer. This might just be because Dan is a less-practiced whistler than speaker.)

For any and all of the alternative-channel versions of speech across the world's languages, it would be nice to have a collection of analyzed examples that was large enough for some quantitative analysis. Depending on the complexity of the system and the amount of variability in it, that would be several thousand phrases at a minimum, and of course the more the better. This may exist, at least in museum or library archives, for some of the cases (e.g. Vedic chanting and Yoruba drumming), but I don't know of any published (or even accessibly unpublished) examples.

Posted by Mark Liberman at 08:52 AM

May 20, 2006

The dawn of a new era


Roger Shuy touts the new automated phone system at Language Log Plaza:

One day we'll even perfect the way to have recursive menu items redirect you to one of the menus you've already traversed. Now won't that be a great day? We're so proud!

A fabulous day indeed. Because then our phone system will be A HUMAN LANGUAGE. Recursivity ruulz!

Posted by Arnold Zwicky at 06:32 PM

We proudly announce our new telephone system

We here at Language Log Plaza recognize the need for huge operations like ours to keep pace with the times. Being simple folks, we tend to answer our own phones when they ring. But no more nice guy! From now on if you have serious language problems and try to call our number, you will be greeted by our spanking new automated telephone answering service. Here's what you can look forward to:

(telephone ringing at Language Log Plaza)

Voice: Welcome to Language Log. Your call is important to us and we want to be sure that you talk with the right expert here. Please listen carefully to the following menu:

If you wish to speak with a phonologist, punch 1--but be sure to speak very clearly.

If you is trying to reach one of our grammarian which are here right now, punch 2.

If you wish to speak to a semanticist, punch 3, if you know what I mean.

If you wish to speak to a sociolinguist, punch any variety of numbers.

If you wish to speak to a computational linguist, punch 110100110001110000001100101111010.

If you wish to speak to The Director, punch star.

If you feel that you really need a psycholinguist, hang up, dial 911.

If you wish to make a complaint, punch 0 and tell it to the operator.

If you have the wrong number, hang up and redial the same number.

See how easy this is? Give us a call and see how Language Log can be every bit as efficient as the other large companies and government offices.

Note: We apologize for the fact that so far we haven't figured out how to construct an embedded menu, where our first menu transfers you to another menu, which then transfers you to still another menu. Nor have we developed ways to transfer you to a human operator, who will put you on hold for the standard and obligatory 30 minutes. Our engineers are working on these problems, however, and we plan to have these modern, automated systems in operation within a few months. One day we'll even perfect the way to have recursive menu items redirect you to one of the menus you've already traversed. Now won't that be a great day? We're so proud!

Posted by Roger Shuy at 03:17 PM

The Arabs own him

Half-listening to NPR's Weekend Edition, I heard a man talking about current prospects in the horse-racing world, and he said of one horse (I forget which):

He's a very expensive horse. The Arabs own him.

I thought, what, all of them? The entire population of the continuous region of predominantly Arab nations extending from Mesopotamia to Western Sahara? I suppose he meant that the horse was owned by a consortium of super-rich sheikhs from Saudi Arabia or the UAE. [Update: Quite possibly the horse he was talking about was Discreet Cat, which won the $2 million United Arab Emirates Derby last March 25. Discreet Cat is owned by Godolphin, the racing stable of Sheikh Mohammed bin Rashid al Maktoum of Dubai.] The remark sounded very strange to me. If a group of South American investors owned a racehorse, would he say "The Latin Americans own him"? If it was Don Ho and some well-heeled co-investors from some of the Pacific Islands, would he say, "The Polynesians own him"? It seems unlikely. The Arabs seem to have a much more sharp and unified profile than other widely spread transnational ethnic or linguistic groups. When one of them gets in the news, the news is more likely to be attributed to the entire group than it would be if the same thing had been done by a member of some less salient group.

Posted by Geoffrey K. Pullum at 10:16 AM

Attorney General caught in linguistic snare!

Confusion reigned on Friday over the Senate vote on separate amendments to the immigration reform bill declaring English the "national language" on the one hand and the "common and unifying language" on the other. Sen. James Inhofe (R-OK), sponsor of the "national language" amendment, belittled the softer alternative, saying "You can't have it both ways." But White House spokesman Tony Snow said President Bush supports both amendments, agreeing with the two dozen senators (on both sides of the aisle) who voted for the two measures.

Unfortunately, no one filled in Attorney General Alberto Gonzales, who was in Houston meeting with state and local officials about the enforcement of immigration laws. "The president has never supported making English the national language," Gonzales said after the meeting. "I don't see the need to have legislation or a law that says English is going to be the national language." The White House was forced to backpedal from Gonzales' remarks later in the day, explaining that Bush doesn't believe English should be the "official" language, though "national" is OK. White House spokeswoman Dana Perino clarified:

"The attorney general got caught in a linguistic snare. He took 'national' language to mean what we describe as 'official' language. We have no problem in identifying English, our common linguistic currency as a national language; we also view it more expansively as the "common and unifying language."

Everyone clear now? The word from the White House is: "national" good, "common and unifying" also good, "official" bad. Even if the binding force of the "national language" amendment is tantamount to treating English as "official," that word is still apparently off-limits. Too bad no one told the Attorney General about this fine-grained distinction.

Posted by Benjamin Zimmer at 10:07 AM

Better late

I apologize to our UK readers for not posting this notice before the premiere at SOAS on May 17 of The Last Speakers,

[a] documentary film on endangered languages [that] shows the work of David Harrison and Gregory Anderson on the language of the Ös people who live in log-cabin villages in central Siberia, 3,500 km east of Moscow.

David Harrison is based at Swarthmore, and there will surely be some U.S. showings, which I'll try to write about before they happen rather than after. (And maybe there'll be an online version at some point, to reach the many interested people who won't be lucky enough to be in the audiences?)

Posted by Mark Liberman at 10:07 AM

May 19, 2006

1421 Update

A couple of years ago I wrote about the ridiculous linguistic evidence put forward for the claim in the book 1421 that the Chinese fleet of 1421 reached the Americas. Well, things aren't getting any better. Not only have the 1421 people not answered any of the criticism of their argument, but the new stuff on their web site is if anything even worse than the old.

The web site now includes an interactive map of British Columbia. The red dots represent putative pieces of evidence. Click on one and in theory (but only part of the time, in practice) a description of the evidence appears below. Clicking on the northernmost dot, which looks to be around Dease Lake, produces this:

Inuit = Yin uit (people from Yin) (Martin Tai).

My previous post dealt with the problems with this equation, but its location on the map really brings out the utter incompetence of these "researchers". The vicinity of the dot is not anywhere near Inuit territory. No part of British Columbia is Inuit territory, nor any part of the Yukon except for a strip in the far north, right along the Arctic Ocean. The nearest Inuit would be about 1,000km away. On the page devoted to linguistic evidence, however, the Inuit are said to be found in Vancouver, which is about 700km south of Dease Lake, not to mention nearly 2,000km from their actual location. We're supposed to take seriously the geographical claims of people who haven't a clue as to where the Inuit live and can't tell the difference between Vancouver and Dease Lake?

P.S.: There's a web site devoted to debunking 1421. Check out 1421exposed.

Posted by Bill Poser at 08:40 PM

Anti-Elmorese

According to an article by Linda Greenhouse in today's NYT ("Second Hearing on Detroit Drug-Search Case Shows Deep Divisions on Supreme Court"),

Nonetheless, Justice Breyer proceeded to make it clear that he remained unpersuaded by Mr. Baughman's argument that the Michigan Court of Appeals was correct in refusing to exclude from Booker T. Hudson's trial the drugs the police found when they executed a search warrant by bursting into his home without knocking or waiting for him to open the apparently unlocked door.

There can be no question that both Justice Breyer and Ms. Greenhouse are users of a human language, by the recursion-based standards of Hauser, Chomsky and Fitch.

In fact, this example is pretty much the opposite end of the syntactic stick from the flat structures used by the Pirahã and by Elmore Leonard characters. Using the crude metric of clausal depth featured in my study of secular trends in presidential embedding, Ms. Greenhouse's sentence weighs in with a truly impressive (word-wise) average embedding depth of 5.98, and a spectacular peak of 12. Without even one comma or other internal punctuation.


 0 [Nonetheless, 
 0 [Justice Breyer proceeded 
 1   [to make it clear 
 2      [that he remained 
 3         [unpersuaded by Mr. Baughman's argument 
 4            [that the Michigan Court of Appeals was correct 
 5               [in refusing 
 6                  [to exclude from Booker T. Hudson's trial the drugs 
 7                     [the police found 
 8                        [when they executed a search warrant 
 9                           [by bursting into his home 
 10                             [without knocking 
 10                                   or waiting 
 11                                [for him to open the 
 12                                   [apparently unlocked]
 11                                 door.]]]]]]]]]]]
 
 

(The quantification of depth has been revised to reflect Geoff Pullum's judgment that I was wrong to let "remained unpersuaded" go by without an increment of embedding:

"Remained" is definitely a complement-taking verb; and "unpersuaded by Mr. Baughman's argument..." is definitely a passive clause. So you have undercounted Greenhouse's astonishing hypotacticity: she hits 12.

)
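For anyone who wants to recheck the arithmetic, here is a minimal sketch that recomputes the word-wise average and the peak from the display above. The (depth, text) pairs are transcribed by hand from that display, and treating each whitespace-separated token as one word is my assumption about how the figure was counted; the name `depth_stats` is made up for the illustration.

```python
# (depth, text) pairs transcribed from the bracketed display above.
LINES = [
    (0, "Nonetheless,"),
    (0, "Justice Breyer proceeded"),
    (1, "to make it clear"),
    (2, "that he remained"),
    (3, "unpersuaded by Mr. Baughman's argument"),
    (4, "that the Michigan Court of Appeals was correct"),
    (5, "in refusing"),
    (6, "to exclude from Booker T. Hudson's trial the drugs"),
    (7, "the police found"),
    (8, "when they executed a search warrant"),
    (9, "by bursting into his home"),
    (10, "without knocking"),
    (10, "or waiting"),
    (11, "for him to open the"),
    (12, "apparently unlocked"),
    (11, "door."),
]

def depth_stats(lines):
    """Return (word-weighted mean depth, peak depth, word count)."""
    total_words = 0
    weighted_sum = 0
    peak = 0
    for depth, text in lines:
        n = len(text.split())      # assumption: one token = one word
        total_words += n
        weighted_sum += depth * n  # every word counts at its line's depth
        peak = max(peak, depth)
    return weighted_sum / total_words, peak, total_words

avg, peak, n_words = depth_stats(LINES)
print(f"{n_words} words, average depth {avg:.2f}, peak {peak}")
# prints: 61 words, average depth 5.98, peak 12
```

With these counts the weighted sum is 365 over 61 words, which matches the 5.98 and the peak of 12 quoted above.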

Of course, this isn't the kind of center embedding that tamarins and starlings have been tested on (short versions of); it's almost all embedding of the type that linguists call "right branching" (because at each iteration, it's the right-hand constituent that is subdivided further). But still.

Gene Buckley, who sent in the link, points out that it's not just the depth, it's the negation:

It was only some top-down knowledge of Breyer and the usual votes on cases like this that permitted me to understand the sentence, at least when I was still on my first cup of coffee. The string of predicates "unpersuaded... correct... refusing... exclude...", three of which have some kind of negative meaning, was just too much. What follows is plenty complex as well!

Yes, don't forget "without knocking"!

I've noticed recently that readers of Language Log sometimes misconstrue my opinions, so I'll be explicit: no criticism of Justice Breyer or Linda Greenhouse is intended. I'm proud to be a member of a species that can think and write like that, when it wants to. Nor, of course, do I mean any disrespect towards people who talk like Elmore Leonard characters, with hardly any embedding at all. I'm one of them, sometimes, and not ashamed of it either. It takes all kinds.

Posted by Mark Liberman at 04:05 PM

Hutchisonian science

As Mark Liberman notes in an update to his post, "Request for action from the AAA," Inside Higher Ed now reports that the Senate Committee on Commerce, Science and Transportation has rebuffed Sen. Kay Bailey Hutchison's proposal to cut (or at least drastically deprioritize) social science funding in the NSF budget. The committee's compromise evidently involves a call for "increased support for physical science research," so presumably other disciplines would continue to be supported under the committee's proposal but not at increased levels. We'll have to wait for the text of the revised bill before we know for sure what specific recommendations the committee has made.

I was particularly struck by one paragraph in the Inside Higher Ed article:

Hutchison reiterated her feeling that Congress should "focus on science and technology" because "we are responding to a crisis in our country." Hutchison added that she is "not against social sciences being part of the NSF budget," but that "I want to make sure we focus on the mission we are after." Hutchison appeared to be using a broad definition of social science when she noted that biology, geology, economics, and archaeology are worthy pursuits, but can often stray from the innovation and competitiveness path.

Attention biologists and geologists! According to Hutchisonian definitions, you are now social scientists!

It's downright alarming that someone with such influence on the funding of the NSF and other research-related agencies doesn't seem to have a clue what counts as "physical" or "natural" science and what counts as "social science." One can only wonder why Hutchison felt the need to lump biology and geology in with disciplines that she regards as somehow peripheral to the "mission" of the NSF, like economics and archaeology. My best guess is that Hutchison's comment was intended as an indirect swipe at researchers doing work on such disfavored topics as evolution and global warming. This would be entirely in keeping with what Chris Mooney has identified as the Republican war on science.

Stay tuned for further senatorial edicts from the once august (now increasingly reactionary) body. I await the senator who announces, "I can't define hard science, but I know it when I see it!"

[Update #1: More coverage from ResearchResearch ("Hutchison amendment tweaked, social scientists relieved"):

Sen. Kay Bailey Hutchison, R-TX, had originally sought an amendment designed to ensure a focus upon science and technology fields, but later reached a compromise with Sen. Frank Lautenberg, D-NJ, who objected on the grounds that such language could stifle social science. The original amendment would have instructed NSF to give priority to research grants and activities that contribute specifically to "physical science, technology, engineering, or mathematics," but Hutchison explained that she agreed when Lautenberg added the words "innovativeness" and "natural science" to her proposal. "It's all good. We are really happy," said Barbara Wanchisen, executive director of the Federation of Behavioral, Psychological and Cognitive Sciences.

Nevertheless, Hutchison stuck by the rationale of her original wording. "The awarding of tax money should be to further our goal of innovation and competitiveness in math and science," she told the committee. But, the senator clarified that she is not against the social sciences being part of NSF's mandate and mission. "I think that biology, economics, geology, geography, archeology, are all worthy of our study and there are some great studies going on in the fields of sociology," she said.

So apparently all disciplines would be eligible for prioritized funding, as long as they contribute to "innovativeness." Innovativeness is, of course, in the eye of the beholder. Among Hutchison's examples of an "inappropriately supported" project is a study of how global and national economic changes are affecting urban women workers in Bangladesh. Such a topic is apparently not "innovative" and therefore unworthy of taxpayer funding from Hutchison's perspective. Will NSF-supported researchers now be subject to an "innovativeness" test to decide who gets to move to the front of the funding line?]

[Update #2: Based on the full quote from Hutchison provided by the ResearchResearch article, it may have been unfair for the Inside Higher Ed writer to say that she "appeared to be using a broad definition of social science" (and in turn, unfair of me to say that she called biology and geology "social science"). We know that she said she is "not against social sciences being part of the NSF budget" and further clarified, "I think that biology, economics, geology, geography, archeology, are all worthy of our study and there are some great studies going on in the fields of sociology." So to give her the benefit of the doubt, she may have been trying to recognize the significance of research conducted in a diverse range of fields, from economics and archaeology to biology and geology, rather than simply labeling them all as "social science." We'll know better what she was driving at once we see the full transcript of her remarks.

Also, Mark Liberman observes that a focus on "innovativeness" is nothing new for the NSF:

Having served on many NSF review panels, I can say that NSF proposals are already evaluated for innovativeness, both in terms of intellectual merit and in terms of broader social implications.
The word "innovation" occurs in 23,100 pages on the NSF web site; "innovative" on 7,980 pages; and "innovativeness" on 77. I expect that NSF will respond by increasing that last number.
]

[Update #3: Here's the agreed-upon wording, according to ScienceNOW:

After highlighting the importance of the "physical and natural sciences, technology, engineering, and mathematics," the amendment explains that "nothing in this section shall be construed to restrict or bias the grant selection process against funding other areas of research deemed by the foundation to be consistent with its mandate, nor to change the core mission of the foundation." ]

Posted by Benjamin Zimmer at 01:55 PM

Homo journalisticus

The story in National Geographic News on the putty-nosed monkeys and their combination pyow-hack calls (acknowledgment to Evan Bradley, who made my day slightly sadder by pointing it out to me) is worse than the one David Beaver cites. It is headed:

Monkeys use "sentences", study suggests

Study suggests nothing of the kind, of course.

In fact the story itself reports that the author of the scientific report "cautions that analogies to human language are not always helpful in understanding the utterances of animals." Quite so. I guess content means nothing to the headline writers where science journalism is concerned.

I have no doubt that for a long, long time we shall continue to see stories recognizing language use in dumb animals and birds sitting alongside stories about it being absent in various kinds of humans (Bushmen, undergraduates, primitive tribes, bureaucrats, urban blacks, Danes, male scions of the Bush family, teenagers, Southerners, university administrators, and other despised groups). Because, while it is completely unclear whether the roots of language are innate, there is overwhelming evidence of an innate drive in Homo journalisticus to write stories about talking or understanding being manifested in chimps, gorillas, orangutans, baboons, monkeys, tamarins, bees, dolphins, whales, parrots, starlings, dogs, bats (yes, bats — see below) and I don't know what will be next but perhaps donkeys. And the subspecies Homo journalisticus subeditorialis clearly has a built-in drive to write wild and goofy headlines for such stories.

I am not exaggerating. You might want to look at Holy Bat Chat, Batgirl! Medic Is Cracking Bat Code, about Barbara French, who "has decoded a basic repertoire of bat calls and deciphered the social context in which they are used," before you accuse me of exaggerating. Check this bit, which Mark Liberman pointed out really needed to be quoted:

French believes the animals are using sounds with syntax. To test the hypothesis French, [her collaborator the neurophysiologist George] Pollak, and one of his graduate students are cataloging all the calls, and analyzing the acoustic structure of each, to study how sounds are manipulated to produce different meanings.

During mating season, for example, males produce a "territorial announcement buzz" to woo females. The same sound, albeit at a different intensity and pace, seems to be used to ward off competing males. "It's the difference between saying something sweetly, and screaming those same words — they could have very different meanings," said French.

That's right: they do an intensified angry buzzing sound to make rival horny male bats buzz off, and it's reported as sounds with syntax. Bzzzzzzz! Leave my woman alone! I mean, hello?

You might also want to look at Monkeys Have Accents, Japanese Study Finds, before you suggest that I am overstating. There it is reported that "primate researchers have discovered that Japanese macaques can acquire different accents based on where they live — just like humans." Just like humans! No, I'm not exaggerating.

Why this drive toward drivel in linguistic science reporting? Is there a survival advantage conferred by some trait manifesting itself in credulity concerning animal communication? Further research is needed.

Posted by Geoffrey K. Pullum at 10:41 AM

What does "official" mean?

The problem with nitpicking over whether a particular piece of legislation makes English "official" is that being "official" has no well-defined meaning. Some countries make a distinction in their legislation. For example, Switzerland has three "official" languages (French, German, and Italian) but four "national" languages (the foregoing plus Romantsch). Swiss legislation specifies various ways in which a language that is merely "national" rather than "official", in practice just Romantsch, has a somewhat second-class status. The distinction made in Switzerland, however, is not necessarily carried over in other uses of terms like "official" and "national" elsewhere. In the Northwest Territories, for example, several native languages have "official" status along with English and French, but their status is in fact not the same. There is, for example, no legal right to receive one's education in a native language.

There are uses of "official language" that are apparently outside the scope of the Inhofe amendment. It evidently does not envision denying the status of legal instrument to documents written in languages other than English. But denying US citizens the legal right to receive government services in any language other than English certainly comes close enough to what "official" means in many contexts for it to be quite legitimate to say that the Inhofe amendment makes English the official language of the United States.

Posted by Bill Poser at 03:15 AM

English: official, national, common, unifying, or other?

Has the United States Senate really voted for "official English," as Bill Poser writes? The situation's a bit more complicated than that, as suggested by the AP headline, "Senate sends mixed signals on English." Senators have not actually weighed in on whether English should be made the nation's "official" language, though a House bill along these lines is said to have strong support. Rather, the Senate considered two amendments to the immigration reform act which proposed modifiers to "English" not quite as forceful as that magic word "official."

The first amendment, sponsored by Sen. James Inhofe (R-OK), is intended to "preserve and enhance the role of English as the national language of the United States of America." The amendment passed by an overwhelming vote of 62 to 35, with 10 Democrats joining 52 Republicans in support. This is evidently the first time in history that the Senate has identified English as the "national" language, if not the "official" language. According to various news accounts, Inhofe had originally wanted to use the word "official" but changed it to "national" to draw more support for the amendment. The Chicago Tribune reports that for Official English proponents, the choice of adjective didn't actually matter very much. Tim Schultz, director of government relations of U.S. English Inc., is quoted as saying, "We don't care. We think it's basically the same thing. It's a 'You say potato, I say potahto' kind of thing."

For those on both sides of the debate, what was clearly more important than the cosmetic choice of adjective was the amendment's "teeth": its unprecedented insistence that unless otherwise authorized "no person has a right, entitlement or claim" to obtain government services in a language other than English. Many Democrats were harsh in their assessment: Minority Leader Harry Reid said, "While the intent may not be there, I really believe this amendment is racist. I think it's directed basically at people who speak Spanish."

Immediately after the vote on Inhofe's amendment came another vote, this time for a less binding amendment to the immigration bill sponsored by Sen. Ken Salazar (D-CO). The purpose of Salazar's amendment was "to declare that English is the common and unifying language of the United States, and to preserve and enhance the role of the English language." Inhofe scoffed at the followup amendment, saying, "You can't have it both ways." Apparently, to Inhofe's way of thinking, English can be either "national" or "common and unifying," but not both. From an outsider's perspective this might seem slightly insane, but it makes perfect sense in the context of congressional party politics. The "common and unifying" measure was, to Inhofe, a weak Democratic response to declaring English the "national" language, since "national" is now supposed to be taken as a code word for "official," softened to placate moderates.

The moderates, however, did want to "have it both ways." A total of 23 senators who voted for Inhofe's amendment (roughly split between Republicans and Democrats) crossed over and voted for Salazar's amendment too. This allowed the amendment to pass by a margin of 58 to 39. So the Senate has now told us that English should be recognized as "national," "common," and "unifying," though fewer than two dozen senators liked all three of those adjectives.

Here's the breakdown of the vote so American readers can know on which side of the adjectival divide their elected representatives stand:

English is national, not common & unifying (39: 39R, 0D)
Alexander (R-TN), Allard (R-CO), Allen (R-VA), Bennett (R-UT), Bond (R-MO), Burns (R-MT), Burr (R-NC), Chambliss (R-GA), Coburn (R-OK), Cochran (R-MS), Cornyn (R-TX), Craig (R-ID), Crapo (R-ID), DeMint (R-SC), Dole (R-NC), Ensign (R-NV), Enzi (R-WY), Frist (R-TN), Grassley (R-IA), Gregg (R-NH), Hatch (R-UT), Hutchison (R-TX), Inhofe (R-OK), Isakson (R-GA), Kyl (R-AZ), Lott (R-MS), Lugar (R-IN), McConnell (R-KY), Roberts (R-KS), Santorum (R-PA), Sessions (R-AL), Shelby (R-AL), Smith (R-OR), Stevens (R-AK), Sununu (R-NH), Talent (R-MO), Thomas (R-WY), Thune (R-SD), Vitter (R-LA)

English is common & unifying, not national (35: 1R, 33D, 1I)
Akaka (D-HI), Bayh (D-IN), Biden (D-DE), Bingaman (D-NM), Boxer (D-CA), Cantwell (D-WA), Clinton (D-NY), Dayton (D-MN), Dodd (D-CT), Domenici (R-NM), Durbin (D-IL), Feingold (D-WI), Feinstein (D-CA), Harkin (D-IA), Inouye (D-HI), Jeffords (I-VT), Kennedy (D-MA), Kerry (D-MA), Kohl (D-WI), Landrieu (D-LA), Lautenberg (D-NJ), Leahy (D-VT), Levin (D-MI), Lieberman (D-CT), Menendez (D-NJ), Mikulski (D-MD), Murray (D-WA), Obama (D-IL), Reed (D-RI), Reid (D-NV), Salazar (D-CO), Sarbanes (D-MD), Schumer (D-NY), Stabenow (D-MI), Wyden (D-OR)

English is national and common & unifying (23: 13R, 10D)
Baucus (D-MT), Brownback (R-KS), Byrd (D-WV), Carper (D-DE), Chafee (R-RI), Coleman (R-MN), Collins (R-ME), Conrad (D-ND), DeWine (R-OH), Dorgan (D-ND), Graham (R-SC), Hagel (R-NE), Johnson (D-SD), Lincoln (D-AR), McCain (R-AZ), Murkowski (R-AK), Nelson (D-FL), Nelson (D-NE), Pryor (D-AR), Snowe (R-ME), Specter (R-PA), Voinovich (R-OH), Warner (R-VA)

Not voting (3: 2R, 1D)
Bunning (R-KY), Martinez (R-FL), Rockefeller (D-WV)

[Late update: Initial reports of the Senate vote had Mary Landrieu (D-LA) voting for the Inhofe amendment, but the final tally shows her in the "nay" column.]
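The two vote totals quoted above follow from the group sizes in the breakdown. Here is a minimal Python sketch of that arithmetic. The dictionary keys are made-up labels, and the sketch assumes that every senator in a "not" group actually cast a nay vote on the other amendment rather than abstaining.

```python
# Group sizes taken from the breakdown headings; key names are illustrative.
groups = {
    "national_only": 39,         # yes on Inhofe, no on Salazar
    "common_unifying_only": 35,  # no on Inhofe, yes on Salazar
    "both": 23,                  # yes on both amendments
    "not_voting": 3,
}

# Inhofe ("national") amendment: supporters are the first and third groups.
inhofe_yes = groups["national_only"] + groups["both"]
inhofe_no = groups["common_unifying_only"]

# Salazar ("common and unifying") amendment: second and third groups.
salazar_yes = groups["common_unifying_only"] + groups["both"]
salazar_no = groups["national_only"]

# Every senator should land in exactly one group.
total_senators = sum(groups.values())

print(f"Inhofe: {inhofe_yes}-{inhofe_no}")
print(f"Salazar: {salazar_yes}-{salazar_no}")
print(f"Senators accounted for: {total_senators}")
```

The four groups add up to the full Senate, so no senator is counted twice or left out.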

Posted by Benjamin Zimmer at 01:43 AM

Senate votes for official English

The US Senate achieved a linguistic nadir today in approving 63-34 an amendment by Oklahoma Senator James Inhofe to S. 2611, the Comprehensive Immigration Reform Act of 2006, that makes English the official language of the United States. The main part reads:

The Government of the United States shall preserve and enhance the role of English as the national language of the United States of America. Unless specifically stated in applicable law, no person has a right, entitlement, or claim to have the Government of the United States or any of its officials or representatives act, communicate, perform or provide services, or provide materials in any language other than English. If exceptions are made, that does not create a legal entitlement to additional services in that language or any language other than English. If any forms are issued by the Federal Government in a language other than English (or such forms are completed in a language other than English), the English language version of the form is the sole authority for all legal purposes. [source]

Unless and until passed by the House of Representatives, this is not the law, but as the House is also dominated by Republicans, it may well become so. The vote was not strictly partisan - 13 Democrats voted for it - but the opposition consisted entirely of Democrats with the exception of New Mexico Senator Pete Domenici. The roll call can be found here.

One justification for this is that eliminating services in languages other than English will save $1 to $2 billion (Inhofe speech). That's at most 0.7% of the cost thus far of the invasion of Iraq. In both cases, the dollar figures don't include the human cost. A second is that it will encourage immigrants to learn English, as if they needed encouragement. The myth that immigrants are unwilling to learn English was debunked so long ago you'd think that people would be embarrassed to mention it. The third argument, believe it or not, is that making English official will have a unifying effect! That's rich. Depriving Spanish-speakers in the Southwest and Puerto Rico and American Indians and Eskimos of services in their own languages is obviously a great way to make them feel wanted.

Posted by Bill Poser at 01:20 AM

Parataxis in Pirahã

Take a look at this quote from Elmore Leonard's LaBrava:

"What're you having, conch? You ever see it they take it out of the shell? You wouldn't eat it."

The speaker is Maurice Zola, "five-five, weighed about one-fifteen and spoke with a soft urban-south accent that had wise-guy overtones, decades of street-corner styles blended and delivered, right or wrong, with casual authority".

"Wise-guy overtones", check; "casual authority", OK; but is this guy speaking a human language? Maybe not, according to Marc Hauser, Noam Chomsky and Tecumseh Fitch. If they're right, we need to wait and see whether Maurice comes up with any genuinely "recursive" syntax. Until then, the humanity of his soft urban-south way of talking is uncertain.

In a 2002 article in Science, Hauser, Chomsky and Fitch (HCF) argued that the "aspects of language that are special to language ... only [include] recursion". Steve Pinker, Ray Jackendoff and many others have disagreed. (A quick historical sketch of the debate, with links to the original publications, is here.)

This debate underlies the recent series of articles about whether monkeys and birds can learn "recursive" patterns. (You can read about the experiments on cotton-top tamarins here and the experiments on starlings here, and you can find a list of more than a dozen other relevant Language Log posts here. A short note from Ray Jackendoff, Geoff Pullum, Barbara Scholz and me about this stuff is here.)

By "recursion", HCF mean "computational mechanisms ... providing the capacity to generate an infinite range of expressions from a finite set of elements". "Recursion" in this sense goes beyond the simple combinations of modifiers and heads ("red" + "cow" → "red cow"), or subjects and verbs ("Joan" + "disagree" → "Joan disagrees"), or any other construction that doesn't involve embedding a complex element repeatedly inside another element of the same type. Non-recursive constructions (like modifier+head) are very useful, and such combinations multiply the set of messages that you can make out of a finite set of elements, but they don't "generate an infinite range of expressions" unless they operate recursively.

And "recursion" in HCF's sense also excludes simply stringing together expressions in a sequence. Animal signaling, human or otherwise, is limited to a single item only if an unhappy accident immediately silences the signaler. Otherwise communication, like life, is just one thing after another; and if mere sequence is recursion, then bacterial signaling is recursive. To be syntactically "recursive", a message must involve structural embedding that goes beyond concatenation or juxtaposition.
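The three-way distinction drawn here (finite combination, mere sequence, recursive embedding) can be illustrated with a minimal Python sketch. The word lists, the function names, and the "Joan said that" frame are illustrative assumptions, not anything taken from HCF.

```python
from itertools import product

# Illustrative vocabulary (an assumption made for this sketch).
MODIFIERS = ["red", "big"]
HEADS = ["cow", "airplane"]


def modifier_head_phrases():
    """Non-recursive combination: one modifier plus one head.

    The output is larger than either word list, but it is still finite.
    """
    return [f"{m} {h}" for m, h in product(MODIFIERS, HEADS)]


def concatenate(clauses):
    """Mere sequence: clauses placed side by side, with no nesting."""
    return "; ".join(clauses)


def embed(clause, depth):
    """Recursive embedding: a clause placed inside another clause.

    Each extra level wraps the result in the hypothetical frame
    "Joan said that ...", so there is no longest sentence.
    """
    if depth == 0:
        return clause
    return "Joan said that " + embed(clause, depth - 1)


print(len(modifier_head_phrases()))
print(concatenate(["It was cold", "the snows came"]))
print(embed("Joan disagrees", 2))
```

Only `embed` calls itself on its own output type. The other two functions can make many messages, but the structure of each message never gets deeper.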

That's why Dan Everett ("Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language", Current Anthropology, Volume 46, Number 4, August-October 2005 -- free preprint) can plausibly think that

... the evidence suggests that Pirahã lacks embedding altogether.

even though he presents and discusses examples that he translates as "When I finish eating, I want to speak to you"; "If it rains, I will not go"; "I want the shirt that Chico sold"; "The woman wants to see you"; "He knows how to make arrows well"; "I said that Kó'oí intends to leave"; "There are two big red airplanes"; and so on.

Now, Dan is talking about recursive embedding, not simple modification or combining words into simple clauses. But even so, the most interesting thing about this claim, in my opinion, is that it imposes a lot fewer constraints on what the Pirahã can say than you might think.

Here are a couple of examples with his discussion:

ti gái -sai kó'oí hi kaháp -ií
I say -nominative Kó'oí he leave -intention
"I said that Kó'oí intends to leave."
(lit. "My saying: Kó'oí intend-leaves.")

The verb "to say" (gái) in Pirahã is always nominalized. It takes no inflection at all. The simplest translation of it is as a possessive noun phrase "my saying," with the following clause interpreted as a type of comment. The "complement clause" is thus a juxtaposed clause interpreted as the content of what was said but not obviously involving embedding. Pirahã has no verb "to think," using instead (as do many other Amazonian languages [see Everett 2004]) the verb "to say" to express intentional contents. Therefore "John thinks that ..." would be expressed in Pirahã as "John's saying that. ..."

A similar construction in Elmorese would be something like "My opinion, he's gonna leave".

English complement clauses of other types are handled similarly in Pirahã, by nominalizing one of the clauses:

kahaí kai -sai hi ob -áa'áí
arrow make -nominative he see -attractive
"He knows how to make arrows well."
(lit. "He sees attractively arrow-making.")

There are two plausible analyses for this construction. The first is that there is embedding, with the clause/verb phrase "arrow make" nominalized and inserted in direct-object position of the "matrix" verb "to see/know well." The second is that this construction is the paratactic conjoining of the noun phrase "arrow-making" and the clause "he sees well."

Dan gives some arguments (clitic agreement and so on) for the "paratactic conjoining" theory -- but what is this "paratactic conjoining" anyhow?

The American Heritage Dictionary says that parataxis (contrasted with syntaxis) is

NOUN: The juxtaposition of clauses or phrases without the use of coordinating or subordinating conjunctions, as It was cold; the snows came.
ETYMOLOGY: Greek, a placing side by side, from paratassein, to arrange side by side : para-, beside; ... + tassein, tag-, to arrange.

The OED's gloss allows "connecting words" in general to be left out, not just conjunctions:

The placing of propositions or clauses one after another, without indicating by connecting words the relation (of coordination or subordination) between them, as in Tell me, how are you?.

But there's a tricky point here: in parataxis, is the (contextually apparent) relation between the phrases really there, but just not overtly expressed? Or is it in some sense not there at all?

Another way to put this is to ask whether we're talking about the sentence structure or discourse structure. Most linguists think that the rhetorical structure of a coherent discourse is not encoded in the same way that phrasal structure is. On this view, a story may have a beginning, a middle and an end, but this is a different kind of structure from the organization of a clause into a subject, a verb and object. From the listener's point of view, you might say that syntactic structure is part of the evidence you use to make sense of a sentence, while rhetorical structure is part of the result you get when you've succeeded in making sense of a discourse.

There are a lot of different ideas about what rhetorical structures are like once you figure them out (see here and here and here for some examples), but everyone seems to agree that they can be recursive. For example, a story can contain another story, or an elaboration can contain another elaboration, as readers of this weblog have reason to know.

However, there's a sort of gray area in between sentential syntaxis and discourse parataxis. In English noun compounds like [[sickle cell] anemia] vs. [rat [bile duct]], we assume that there's really a (syntactic) structure there, even though there are no words or other signs that make the relationship explicit. (Well, there's a stress difference in this case, but never mind that for now.) So couldn't there be a similar implicit relationship between apparently "paratactic" words and phrases, at least in some cases?

The reported speech of Elmore Leonard's characters, especially the lower-class ones, is full of concatenated phrases where this question comes up, because the semantic connection between the phrases is clear in context, but is not made explicit.

Sometimes the implied connection is temporal ("When X, Y"), as in this example from Mr. Majestyk:

"We get here," Larry Mendoza said, "this guy's already got a crew working."

Sometimes it's conditional ("If X, Y"), as in this example from the same book:

"Listen," Renda said, "we get to a phone we're out of the country before morning."

There are also examples of juxtaposed noun phrases whose connection is left implicit, as in these two examples (again from Mr. Majestyk):

"All right, I call some more friends. They get us out of the country, some place no extradition, and wait and see what happens."

"That goddam truck of his, he can go anywhere," Renda said. "He told me, he comes up here hunting."

This is all perfectly grammatical vernacular American English, in my opinion; I talk this way myself, in some kinds of casual conversation.

These examples feel similar to Everett's examples of Pirahã parataxis, sharing the property of implying, via juxtaposition, phrasal relationships that English in other styles encodes via explicit clausal or phrasal embedding. Here's how Pirahã does temporal clauses:

kohoai -kabáob -áo ti 'ahoai -soog -abagaí
eat -finish -temporal I you speak -desiderative -frustrated initiation
"When [I] finish eating, I want to speak to you."
(lit. "Eating finishes, I you speak-almost-want")

There is almost always a detectable pause between the temporal clause and the "main clause." Such clauses may look embedded from the English translation, but I see no evidence for such an analysis. Perhaps a better translation would be "I finish eating, I speak to you."

A lot like Elmorese, except that the Pirahã examples are often more explicit about the semantic relationships, as indicated in this case, for example, by the completive and temporal morphemes on eat.

So let's return to the example I started with. Maurice juxtaposes three clauses

[You ever see it] [they take it out of the shell] [you wouldn't eat it].

whose relationship might have been made explicit by adding an "if" and a "when"

If you ever saw [a conch] when they take it out of the shell, you wouldn't eat it.

Once the relationships are made explicit like that, we've arguably got a recursive sentence, since the structure is something like this:

What about the way Maurice said it? Are his three clauses organized in a recursive syntactic structure:

or just a paratactic juxtaposition:

I'm not sure. A lot seems to hinge on the answer, at least in the case of Pirahã. Is "paratactic juxtaposition" like stringing sentences together in a discourse, or is it like combining words and phrases in a sentence? Or are those two alternatives really just versions of the same thing seen from different disciplinary angles? Those are deep questions, to be answered by people who know more about syntax and discourse than I do.
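The two candidate analyses of Maurice's three clauses can be written as bracketings and compared mechanically. The Python sketch below is an illustration only: the node labels and the exact shape of the nested version are assumptions, not a reconstruction of any particular diagram or of Everett's analysis.

```python
# Each structure is a tuple: (label, child, child, ...). Strings are clauses.
# Hypothetical recursive analysis: the "when" clause sits inside the "if"
# clause, which in turn sits inside the whole sentence.
RECURSIVE = (
    "S",
    ("S-if",
     "you ever see it",
     ("S-when", "they take it out of the shell")),
    "you wouldn't eat it",
)

# Hypothetical paratactic analysis: three clauses side by side.
PARATACTIC = (
    "Discourse",
    "you ever see it",
    "they take it out of the shell",
    "you wouldn't eat it",
)


def depth(node):
    """Number of bracket levels above the most deeply nested clause."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])


def clauses(node):
    """The clauses in left-to-right order, ignoring all bracketing."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1:] for leaf in clauses(child)]


print(depth(RECURSIVE), depth(PARATACTIC))
print(clauses(RECURSIVE) == clauses(PARATACTIC))
```

Both structures produce the same string of clauses, which is the difficulty: the spoken words alone do not show which bracketing, if either, the speaker had in mind.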

In any case, I'm confident that HCF are now among those who make a sharp distinction between syntax and discourse structure, and I'm equally sure that "recursion", for them, is a matter of syntax. Chomsky has famously been skeptical for decades about whether discourse coherence is even a problem amenable to rational investigation, as opposed to one of the mysteries that "lie beyond the reach of the form of human inquiry that we call 'science'" ("Problems and Mysteries in the Study of Human Language," Reflections on Language pp. 137-227, 1975). And the recent animal experiments are all about learning "grammatical" constraints on short sequences of uninterpreted sounds, a situation where discourse structure simply doesn't arise. HCF are strictly concerned with recursion in the syntactic structure of sentences, not in the interpreted structure of discourses.

That's why Dan Everett's claim that Pirahã lacks (recursive) syntactic embedding is important. If some human languages (and Pirahã is not the only candidate) lack recursion, then it's hard to see how recursion could be the defining characteristic of human language. I tried to make this point in a humorous way last year ("Homo Hemingwayensis"). So Tecumseh Fitch is heading to Amazonia this summer, according to Elizabeth Davies' May 6, 2006 article in The Independent:

Professor Everett ... will head back to Amazon this summer with a bevy of enthusiastic young PhD students to try to introduce others to the Pirahã and to prove his theories. A mark of how seriously the linguistic world takes his studies is that accompanying him will be W Tecumseh Fitch, one of the three architects of the original theory of universal grammar along with Chomsky and Dr Marc Hauser. The expert is keen to see whether the tribe does indeed refute their long-established theory.

(Given that the theory in question (that human language in the narrow sense is only recursion) was first proposed in 2002, and immediately called into question by Pinker and Jackendoff among others, the modifier "long-established" is a bit of a stretch here. "Universal grammar" is a term with a longer and broader history -- but this post is not yet another critique of linguistic journalism...)

Anyhow, the article doesn't tell us what Fitch is going to be doing. Probably not analyzing the sentence and discourse structures of Pirahã, since that's not the kind of stuff he does. I'd guess that he'll be testing the Pirahã people on the sorts of acoustic novelty-detection tasks that Hauser and Fitch 2004 applied to cotton-top tamarins and Harvard undergraduates. If he can get them to approach the task the way he wants, I'll look forward to learning the results. But I'd also really like to know when parataxis is really covert syntaxis, and what sort of embedded linguistic structures the Pirahã actually use.

[See David Beaver's recent post "And people say we monkey around" for more on the question of concatenated signals from animals.]

[And yes, I'm aware that some of Elmore Leonard's paratactic juxtapositions could alternatively be analyzed in terms of "prosiopesis", Otto Jespersen's term for starting to talk without putting your mouth in gear, e.g. "[with] that goddam truck of his, he can go anywhere", "[in] my opinion, he's gonna leave." But I think there's plenty of evidence that this is not the right story, in general.]

[Update: Dan Everett writes that

This looks great. You have made this all much clearer than I have. So I will happily borrow from you in future discussions.

The New Yorker is going to be doing an in-depth article on this stuff. [...] What you have to say on this will be very helpful.

Actually, what Tecumseh will be doing is checking for recursive reasoning, which I am quite confident that the Piraha have. I see this as independent of their syntax, though, whereas he does not.

I'm encouraged that Dan didn't find any howlers, though I'm still not sure of my grasp of the logic of this question. His description of Fitch's project does suggest that my characterization of Fitch's views is wrong -- if "recursive reasoning" is the same as recursive embedding in syntax, then I suppose that rhetorical structure imposed on paratactic juxtaposition of phrases would count for Fitch as "recursion".]

[In a later note, Dan confirms that the Pirahã do have recursive narrative structures. A simple example would be interpolating a digression into the plot of a story. The same thing of course is also true of even the most paratactic writers. ]

Posted by Mark Liberman at 12:04 AM

May 18, 2006

Request for action from the AAA

[Update: according to Inside Higher Ed

Voicing concern over America's math and science competitiveness, a Senate committee on Thursday unanimously approved legislation that would push physical science research and teaching partnerships involving colleges and government agencies.
[...]
On Wednesday, Hutchison proposed an amendment that would have forced NSF to give funding priority to work that is expected to make contributions in the physical sciences, technology, engineering, or math. By voting time, however, a compromise was reached with Sen. Frank R. Lautenberg (D-N.J.) and, while the language in the bill places special emphasis on the physical sciences, Hutchison's amendment was changed to allow NSF to be flexible with its funding priorities.
[...]
Hutchison's was the lone voice of concern Thursday.
[...]
The bill would also authorize the NSF to give 2,500 additional grants to be used for graduate research fellowships and for the Integrative Graduate Education and Research Traineeship Program, which preps doctoral science and engineering students for interdisciplinary work.

(Thanks to Kai von Fintel for the link.)]

In this case, the "AAA" is the American Anthropological Association, and by several routes today I've gotten copies of an email with the Subject "Urgent Action Required", regarding a "proposed amendment by Senator Kay Bailey Hutchison (R-TX) that would instruct the National Science Foundation (NSF) to direct its resources primarily to the physical sciences." Some further details are available on the AAA website here.

I haven't seen the text of Hutchison's amendment to S. 2802 (the “American Innovation and Competitiveness Act of 2006”) -- for some reason the AAA web page doesn't quote it or link to it. But as described, it would not only defund sociology, anthropology, linguistics and economics, but also mathematics and most of the Directorate for Computer and Information Science and Engineering, whose Digital Libraries initiatives had a significant impact on recent American Innovation and Competitiveness:

In 1996, Google co-founders Sergey Brin and Larry Page were graduate computer science students working on a research project supported by the Stanford Digital Library Technologies Project. Their goal was to make digital libraries work, and their big idea was as follows: in a future world in which vast collections of books are digitized, people would use a "web crawler" to index the books' content and analyze the connections between them, determining any given book's relevance and usefulness by tracking the number and quality of citations from other books.

The crawler they wound up building was called BackRub, and it was this modern twist on traditional citation analysis that inspired Google's PageRank algorithms – the core search technology that makes Google, well, Google.

Without seeing the text of the bill and the amendment, or reading a more extensive analysis of what they say and mean, I'm not sure whether this interpretation of the Hutchison Amendment's effect is valid. In any case, it may be too late to affect this particular committee action one way or another, since the AAA page asserts that "At a meeting of the Senate Commerce Committee TODAY [i.e. Thursday], an authorizing bill – S. 2802 – focusing on American competitiveness will be marked up (i.e. negotiated)", and that phone calls or emails would need to have gone in this morning in order to have an effect on committee members' votes.

On the other hand, this is only a committee vote on "authorizing legislation", so later votes and actions will be at least as important, if not more so. Therefore I suggest that you look into this and act as your political opinions dictate.

[Update: Ben Zimmer writes that

There is no actual amendment (yet) -- Hutchison so far has just been raising questions about what types of projects NSF funds, with vague suggestions that some other agency be in charge of social science funding. So it was a bit misleading for the AAA email to refer to the "Hutchison amendment." The request for action is apparently preemptive, in order to forestall any amendment to the bill that would limit NSF funding to particular disciplines. More here.

Also, there's no indication that Hutchison wants to restrict NSF funding for mathematics (despite the AAA email's reference to "the physical sciences"). In fact, she is quoted by Science as saying: "I want NSF to be our premier agency for basic research in the sciences, mathematics, and engineering. And when we are looking at scarce resources, I think NSF should stay focused on the hard sciences." So it's just those "soft sciences" that would lose out.

Another piece of evidence that one should be careful before crediting mass emails with "Urgent" in the Subject line, even if they come from a semi-reputable source like the AAA. The issue is surely an important one, but it seems to me that the AAA should send out a more accurate picture of what is going on.]

[Update #2: Joshua Tauberer, linguistics grad student, to the rescue:

Hi, Mark. It's a rare moment when my linguistic and non-linguistic activities come together. Details of the bill can be found on my website:

http://www.govtrack.us/congress/bill.xpd?bill=s109-2802

The amendment wasn't linked to or quoted probably because it hasn't been officially proposed as a true amendment, but rather is all happening within committee where public disclosure of things is sadly pretty limited.

Unfortunately this links to the official record on the Thomas website, where the text of the bill is not yet available.]

[Fernando Pereira supplies this link to a "Staff Working Draft" of May 12 on the Senate Commerce Committee web site, which contains the paragraph:

PRIORITY TREATMENT. -- Proposed research activities, and grants funded under the Foundation's Research and Related Activities Account, which can be expected to make contributions in physical and natural sciences, technology, engineering, and mathematics, and other research that underpins these areas, shall be given priority in the selection of awards and in the allocation of Foundation resources.

No amendments are indicated, but this language seems similar to what the Science article suggests that Senator Hutchison is after.]

[Kai von Fintel sent in a link to a press release from the Senate Commerce Committee, indicating that

The U.S. Senate Committee on Commerce, Science, and Transportation today approved S. 2802, the American Innovation and Competitiveness Act, by a vote of 21-0.
[...]
S. 2802 responds to recommendations contained in the Council on Competitiveness' Innovate America Report and the National Academies' Rising Above the Gathering Storm Report. In responding to these reports, the legislation focuses on three primary areas of importance to maintaining and improving United States' innovation in the 21st Century: increasing research investment, increasing science and technology talent, and developing innovation infrastructure.

The bill sets authorization levels for both the National Science Foundation (NSF) and the National Institute of Standards and Technology (NIST). To increase the nation's commitment to basic research, the bill increases authorized funding for NSF from $6.4 billion in Fiscal Year 2007 to $11.4 billion in Fiscal Year 2011. The legislation authorizes NIST from approximately $640 million in Fiscal Year 2007 to $937 million by Fiscal Year 2011, and it establishes a Fiscal Year 2007 level of approximately $110 million for the Hollings Manufacturing Extension Partnership program (MEP), which increases to $130 million in fiscal years 2008 through 2011.

In addition, the bill requires the National Academy of Sciences to conduct a study to identify forms of risk that create barriers to innovation one year after enactment of the bill and every four years thereafter. The study is intended to support research on the long-term value of innovation to the business community and to identify means to mitigate legal or practical risks presently associated with such innovation activities.

The press release says that "A list of amendments that were adopted as part of a manager's package is attached", but the online version doesn't seem to connect to any such list. ]

Posted by Mark Liberman at 07:42 PM

And people say we monkey around


Yeah, yeah, another animal language story. We are sooo excited at Language Log Plaza that we are taking it in turns to bungee jump from Mark's helipad. Geoff P. was kind enough to let me go first, and I'm writing this upside down swaying in the breeze while staring through Ben's window, but he seems kinda busy typing, so even if he could hear me through the inch thick glass, I wouldn't disturb him. Maybe someone will pull me up soon. Anyway, it must be years months weeks since the last time it turned out animals could do so much more than anyone ever suspected.

And can you guess what those smart little critters can do now? They can make not one, but two different sounds. In combination. And the combination means something different from either sound. That's syntax! Everyone is saying so! Of course, it could also be phonology, but everyone isn't saying that. You see, the sounds are so far apart they seem more like words than phonemes. Listen for yourself. Oops, I meant here of course. And Chomsky has argued on many occasions that one of the hallmarks of human syntax is that there are really big gaps, or at least that's how I interpret him. So you can see why these new critters, putty-nosed monkeys no less, are really sending us off the deep end. Gosh, I mean these monkeys almost have compositionality. That would mean that the combined sound had a meaning that was built up of the meanings of the parts.

Based loosely on the work (and I haven't seen the original, so none of my comments apply to it) of Kate Arnold and Klaus Zuberbühler, of the University of St Andrews, as reported in Nature News, in an article subtitled "monkeys string sounds together to create meaning," ehhh, this sentence has a lot of parts to it, a wonder that you can even begin to parse it, and I want to wish you the very best of luck with getting all the way to the end, well, actually, I must confess I'm probably making life unnecessarily tough for you by writing it backwards as well as upside down, you'd never have known, would you, here is a putty-nosed monkey phrasebook you may find useful:

pyow: hey everyone, get away from the lower branches, or some ground beast might get you.
hack: hey everyone, get away from the canopy or an eagle might get you.
pyow ... hack: hey everyone, wherever you are, move.

You're impressed, right? The first time a monkey came up with that innovation the whole pack looked at him like he was crazy. But nowadays it's pretty much accepted. "Pyow hack!" "OK, we're moving, we're moving." (They don't actually say that last part. More of a Gricean inference.)

My dog, see, he's a pretty smart dog. He can make two sounds. He can whimper and he can bark. And sometimes he barks lots of times. And sometimes, if you shut him in the kitchen, he whimpers lots of times. But what he doesn't do is bark and whimper in the same sentence. Except when he wants to play with another dog and you're restraining him and he's excited but disappointed. But that doesn't count, cos it isn't in Nature. There will not be a Nature article "David's dog strings sounds together to create meaning."

Umm, you can pull me up now. Guys? Hello? Is anyone there?

Posted by David Beaver at 05:32 PM

Regale in basilica

A few days ago, Victor Mair sent in a spectacularly mistranslated blurb from a package of mushrooms and seaweed, full of sentences like "It is the masterwork of the curiosity selected by our professional." One phrase in particular was baffling: "It is always the regale in basilica". Victor tracked down the Chinese original, and contributed this analysis:

Well, I went and checked the parallel Chinese text for "It is always the regale in basilica" and this is what I found:

LI4LAI2 LIE4WEI2 HUANG2GONG1 YU4PIN3 HE2 GUO2YAN4 JIA1YAO2

A fairly literal translation of that would be "Throughout history it has been ranked among the royal products [for use in] the imperial palace and the delicacies [to be served at] state banquets."

Now you can get an idea of the brain-crunching that Sinologists have to go through every day.

Posted by Mark Liberman at 07:19 AM

May 17, 2006

Inconceivable!

Even A.O. Scott takes time in his review of the film adaptation of The Da Vinci Code to bash Dan Brown's prose, starting from the very first paragraph:

"The Da Vinci Code," Ron Howard's adaptation of Dan Brown's best-selling primer on how not to write an English sentence, arrives trailing more than its share of theological and historical disputation.

No, he's not about to "borrow" from some of Geoff's posts, but the two obvious digs may be interesting to Language Log readers nonetheless.

(Full disclosure: I haven't read any of Dan Brown's books nor have I seen the movie, and I don't plan to, thanks in no small part to Geoff and A.O. Scott. Consequently, I may have missed other more subtle digs in Scott's review.)

First, there's this curious comment about a pied-piped preposition (emphasis added):

To their credit, the director and his screenwriter [...] have streamlined Mr. Brown's story and refrained from trying to capture his, um, prose style. "Almost inconceivably, the gun into which she was now staring was clutched in the pale hand of an enormous albino with long white hair." Such language – note the exquisite "almost" and the fastidious tucking of the "which" after the preposition – can only live on the page.

Since the comparison is with the following alternative phrasing, isn't it the preposition "into" which is fastidiously tucked before the wh-word "which", rather than the other way around?

the gun which she was now staring into

Of course, this rephrasing makes the "which" unnecessary; it could be replaced by "that" or (better) omitted altogether, so I can almost see why Scott says what he says about the original. But still.

The second dig is very Pullum-esque: Scott takes the "almost inconceivably" bit that he has already shown his distaste for and incorporates it into the first sentence of his outline of the movie plot:

[A]n old man (Jean-Pierre Marielle) is killed after hours in the Louvre, shot in the stomach, almost inconceivably, by a hooded assailant.

Maybe Scott ripped Geoff off after all? Inconceivable!


Posted by Eric Bakovic at 06:54 PM

Big much squib


I signed the e-mail "arnold, who much enjoyed the visit with beth et al. last night" and then realized that this use of much was of interest to me.  I've posted here on determiner much (vs. determiner a lot of), and "much enjoyed" has a different much in it, a VP adverbial of degree, but the various uses of much have a lot in common, including alternation with a lot (of) and an affinity for negative and interrogative contexts, so it was notable that what I wrote had much in a positive declarative clause.  In short order I racked up a list of puzzling properties of the VP adverbial much, beginning with a contrast in acceptability between preverbal positioning and postverbal positioning:

(1a)  ok  I much enjoyed these concerts.  (preverbal)
(1b)  ??  I enjoyed these concerts much.  (postverbal)


I'm inclined to asterisk (1b), but I'll settle for deep disapproval for now.  In any case:

Observation 1: The VP adverbial much is much less acceptable postverbally than preverbally.

Since my initial interest in determiner much had to do with its alternation with a lot of, I tried the VP adverbial a lot in the two positions, and found it to work essentially opposite to much: absolutely unacceptable preverbally, fine postverbally:

(2a)  *    I a lot enjoyed these concerts.  (preverbal)
(2b)  ok  I enjoyed these concerts a lot.  (postverbal)

Observation 2: The VP adverbial a lot is acceptable postverbally, unacceptable preverbally.

This is not so surprising; it's well-known that different adverbials have different privileges of occurrence in the several positions open to them.  Still, it's interesting that much and a lot look like they're parceling out the two positions between them.

On to irrealis contexts, in particular negativity and interrogativity.  After my posting on determiner much, John Lawler wrote me to claim that determiner much and many were in fact negative polarity items (NPIs) -- expressions that are restricted to certain contexts in which the factuality of some situation is not assumed or asserted, notably negative and interrogative contexts -- noting that they had been on his list of NPIs since he started keeping it, "around 1971 or so".  I disputed Lawler's claim, pointing out that determiner much and many are virtually never unacceptable in positive declarative clauses; instead, sometimes they just seem infelicitously formal, and other times they are impeccable, as in this example (one of several such) supplied to me by Marilyn Martin:

My first full year at the Hawaii Film Office has been filled with much joy and much pain. (link)

(Meanwhile, Amanda Kraus wrote to report that determiner much is very common in hip hop culture, citing "the frequent (too numerous to list) calls of 'much love' versus Led Zeppelin's 'whole lotta love'".)

In any case, my attention had now been drawn to irrealis contexts, so I checked out the negative and interrogative counterparts of the questionable (1b), and found them fine:

(3a)  ok  I didn't enjoy these concerts much.  (postverbal)
(3b)  ok  Did I enjoy these concerts much?  No.  (postverbal)

I concluded that there is a small island of NPI-hood in the much world, a place in which the affinity of much (in several of its uses) for negative and interrogative contexts has hardened into a restriction:

Observation 3: Postverbal VP adverbial much is a NPI.  (And its preverbal counterpart is not.)

(There are also a couple of idioms with much in them that are NPIs:  be much of a, as in "He's not much of a linguist" and "Is he much of a linguist?" but *"He's much of a linguist" 'He's an excellent linguist', and be much to look at, as in "He's not much to look at" and "Is he much to look at?" but *"He's much to look at" 'He's attractive'.)

At this point things got weirder.  In my earlier posting I'd pointed out that the alternation between much and a lot of as determiners is complicated by the fact that the modifiers that determiner much can take -- so, that, very, etc. -- are not available for a lot of (and that, correspondingly, quite can modify a lot of but not much), so that when you want to modify these quantity determiners, you'll be forced to choose just one of them, with the result that in many contexts determiner much improves in acceptability just by being modified:

(4a)  ?    With much shrubbery growing in front of it, the house seems dwarfed.
(4b)  ok  With that/so much shrubbery growing in front of it, the house seems dwarfed.

All the uses of much are subject to modification in pretty much the same ways, and this includes the VP adverbial much.  Preverbally, this much is fine unmodified, as in (1a), so it's no surprise that it continues to be fine when it's modified, but the postverbal version shows the amelioration effect in (4):

(5a)  ok  I very/so much enjoyed these concerts.  (preverbal)
(5b)  ok  I enjoyed these concerts very/so much.  (postverbal)

That is, we CAN get postverbal VP adverbial much in positive declarative clauses.  It just has to be modified.  Observation 3 has to be refined:

Observation 3 (revised):  Unmodified postverbal VP adverbial much is a NPI. 

This is a very small island of NPI-hood indeed.

But wait!  There's more.  So far I've been talking about the VP adverbial of DEGREE much; the meaning of much in the two examples of (5) is roughly 'greatly, to a high degree'.  But there's another VP adverbial much, namely a FREQUENCY adverbial with roughly the semantics and syntax of many times.  The frequency adverbials much and many times are possible, though a bit edgy, in postverbal position, but (like a lot, and unlike often) absolutely unacceptable preverbally:

(6a)  ?  I come here  much/many times.  (postverbal)
(6b)  *  I  much/many times  come here.  (preverbal)

(7a)  ok  I come here often.  (postverbal)
(7b)  ok  I often come here.  (preverbal)

We are forced to revise Observation 3 once again, to shrink the island still further:

Observation 3 (second revision):  Unmodified postverbal degree VP adverbial much is a NPI.

Enough of postverbal much for today.  On to preverbal much, as in (1a).  If you do a Google web search on <"I much">, as Thomas Grano did for me yesterday, you get an enormous number of hits, nearly three million.  Suspiciously many of them are "I much prefer".  Googling on <"I much prefer"> shows that about HALF of those original hits have the verb prefer, and that lots of the rest are junk of one sort or another.  Grano began to suspect that most verbs don't allow preverbal much, and we were quickly able to concoct near-minimal pairs like these:

(8a)  ok  I much appreciate your advice.  (APPRECIATE)
(8b)  *    I much believe your claims.  (BELIEVE)

(9a)  ?    I much look forward to her arrival.  (LOOK FORWARD)
(9b)  *   I much expect her to arrive soon.  (EXPECT)

Observation 4 (tentative): The default is for verbs to disallow preverbal degree much.

At the moment, Grano and I have no idea about what properties of verbs -- semantic, phonological, whatever -- might improve them as hosts for preverbal much.  It is known that there are verb-specific conditions on VP adverbials of degree; Pullum and Huddleston (CGEL, p. 579) survey the situation warily:

There are significant differences among degree adverbs.  Some, such as almost, nearly, quite, normally occur only in [preverbal] position.  Others, such as thoroughly, enormously, greatly, occur in either [preverbal] or [postverbal] position.  With this second set, [postverbal] position is the default, and acceptability in [preverbal] position depends on the verb.  Thus He enormously admires them is fine, but we cannot have *The price has enormously gone up.

With much, the situation seems to be:

Observation 5: Some verbs permit preverbal much, and also postverbal much if the much is modified, while others -- the default type, perhaps -- permit preverbal much ONLY IF IT IS MODIFIED, and disallow postverbal much entirely.

For appreciate (in (10)) vs. believe (in (11)):

(10a)  ok  I much appreciate your advice.  (preverbal, unmodified)
(10b)  ok  I very much appreciate your advice.  (preverbal, modified)
(10c)  *    I appreciate your advice much.  (postverbal, unmodified)
(10d)  ok  I appreciate your advice very much.  (postverbal, modified)

(11a)  *    I much believe your claims.  (preverbal, unmodified)
(11b)  ok  I very much believe your claims.  (preverbal, modified)
(11c)  *    I believe your claims much.  (postverbal, unmodified)
(11d)  *    I believe your claims very much.  (postverbal, modified)
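
For readers who like to check a generalization mechanically, here is a minimal sketch that encodes Observation 5 as a rule and compares it with the judgments in (10) and (11). The class labels "permissive" and "default", and the function names, are hypothetical conveniences of the sketch rather than anything proposed above; only the judgments themselves come from the examples.

```python
# Judgments copied from examples (10) and (11).
# Keys: (verb, position, modified); values: "ok" or "*".
JUDGMENTS = {
    ("appreciate", "preverbal", False): "ok",   # (10a)
    ("appreciate", "preverbal", True): "ok",    # (10b)
    ("appreciate", "postverbal", False): "*",   # (10c)
    ("appreciate", "postverbal", True): "ok",   # (10d)
    ("believe", "preverbal", False): "*",       # (11a)
    ("believe", "preverbal", True): "ok",       # (11b)
    ("believe", "postverbal", False): "*",      # (11c)
    ("believe", "postverbal", True): "*",       # (11d)
}

# Hypothetical labels for the two verb types mentioned in Observation 5.
VERB_CLASS = {"appreciate": "permissive", "believe": "default"}


def predict(verb_class, position, modified):
    """Return "ok" or "*" for degree much, following Observation 5."""
    if verb_class == "permissive":
        # Preverbal much is fine; postverbal much needs a modifier.
        allowed = position == "preverbal" or modified
    else:
        # Default type: only modified preverbal much is allowed.
        allowed = position == "preverbal" and modified
    return "ok" if allowed else "*"


def mismatches():
    """List the (verb, position, modified) cells where rule and judgment differ."""
    return [
        key
        for key, judged in JUDGMENTS.items()
        if predict(VERB_CLASS[key[0]], key[1], key[2]) != judged
    ]


if __name__ == "__main__":
    # An empty list means the rule reproduces every judgment in the table.
    print(mismatches())
```

The sketch covers only these two verbs and these eight cells; a third verb type, or a speaker with different judgments, would mean adding rows to the table and probably another branch to the rule.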

Perhaps there are more than these two types.  Grano and I are just getting into this stuff, which is vastly more complex than we'd thought at first.  And we haven't yet looked at how the classification of verbs with respect to degree adverbial much lines up with their classification with respect to other degree adverbials.  And we're sure that there will be some variation here from speaker to speaker.

We also don't know if we're walking on a path that others have traveled on.  It usually turns out that Jespersen or Curme has been there, or Bolinger, or McCawley, just to name the most likely suspects.

[And now, an unsolicited letter of thanks, as the end of my year at the Stanford Humanities Center looms.  First to Thomas Grano, who (as an Undergraduate Fellow at the SHC) has worked with me all year on my project on the advice literature on English grammar, usage, and style in the 20th century; he's scoured this literature for treatments of particular points, collected data (usually by Google searches) on twelve different topics, and joined me in hours of discussion about interpreting what he and I had found.  It's been like having an annex to my mind. 

Thanks also to the SHC staff, for selecting him for a fellowship and providing him with practical support of several kinds, including free lunch whenever he wanted it, and to the office of the Vice Provost for Undergraduate Education at Stanford, which funded that fellowship, oversees the undergraduate honors programs (Grano has also just completed an honors thesis, on pronoun case in coordination), funds the Stanford Introductory Seminars (my advice-literature project grew out of sophomore seminars I taught over the years in the SIS program), and is now about to fund an undergraduate intern for me for this summer, to continue my research on the choice of variant expressions, like much vs. a lot (of).  In two past summers, the VPUE's office has funded interns for me on other pieces of my usage project (on the reflexive themself and on dangling modifiers), as well as interns for the Stanford ALL Project (on innovative uses of all).  The VPUE's office is there to benefit students, but obviously it does a lot for faculty too.

Finally, thanks to the sources of my own funding for this fabulous year: the School of Humanities and Sciences at Stanford, the Department of Linguistics at Stanford, and the Mericos Foundation, through a gift to the SHC's endowment.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 06:29 PM

Starlings, darlings

Monkeys, dogs, whales, chimps, gorillas, dolphins, parrots... Everybody seems to want to find language in animals. And then they turn round and deny it when it is seen in university students. Just twenty minutes ago at a Radcliffe Institute reception I met a woman who thought that for a 20-year-old American to use an expression like I should have went is evidence that something awful has happened to the linguistic capacities of Harvard undergraduates. What is wrong with everybody? Attributing language to a pet animal with a brain the size of a hazelnut but denying it in a fluent speaker of a slightly non-standard variety of English? Talk about a double standard! Listen, I'll grant you that you have found a language-using animal when you can show me one that is capable of plagiarism. No, no, don't tell me parrots and songbirds do it; they don't. They imitate sound streams. They don't even know it's language. To be a plagiarist you have to know what language is, and know how to use it, and know about authorship, and about concealment. It's a highly sophisticated linguistic activity. Nothing unintelligent about it. Like other language skills, plagiarism is not for the birds. Oh, and by the way, talking of birds, Ray Jackendoff and Mark Liberman and Barbara Scholz and I sent a letter to Nature about the Gentner et al. work on alleged syntactic skills in starlings (sigh; yes, I said starlings, darlings). Nature rejected it by lunchtime the next day (science journals almost always reject criticisms by linguists of non-linguists' papers about linguistic topics), but you can read the main part of the text of our letter on the LINGUIST List: follow this link.

Barbara Scholz pointed out to me that Medical News Today headlined its story about the starlings with a remark about how "The European starling ... may also soon gain a reputation as something of a grammar-marm". You just can't parody stuff as dopey as the headlines that get made up for science stories, can you?

Posted by Geoffrey K. Pullum at 06:19 PM

A tale of two copiers

In 1988, Molly Ivins published an article in Mother Jones magazine called "Magnolias and Moonshine". Seven years later, Florence King responded with an article in The American Enterprise magazine, September/October 1995, under the title "Molly Ivins, Plagiarist".

King accused Ivins of three things. The first thing was "gilding the lily". King had written in her 1975 book Southern Ladies and Gentlemen that the southern woman

... is required to be frigid, passionate, sweet, bitchy, and scatterbrained—all at the same time. Her problems spring from the fact that she succeeds.

Ivins quoted this as follows:

In her definitive work, Southern Ladies and Gentlemen, Florence King observes, “The cult of southern womanhood…requires [a female] to be frigid, passionate, sweet, bitchy, animated, and scatterbrained all at the same time…. A horrifying number of us succeed, which accounts for that popular southern female pastime, having a nervous breakdown.”

"Add a l’il more on there, honey, give the folks they money’s worth", King suggests.

The other two alleged authorial crimes are instances of apparent plagiarism. King writes that

My name is strewn through ["Magnolias and Moonshine"], but never where it counts. She credits me on minor observations, but when the subject is politics—her turf—she plagiarizes me.

King cites two instances, both of plagiarism in the paraphrase mode:

IVINS: “Keep in mind that Southerners are so conservative they voted for Franklin Roosevelt, so isolationist they voted for Richard Nixon, so populist they voted for Barry Goldwater, so aristocratic they voted for George Wallace, and that they see nothing peculiar in any of this.”

KING: “The typical Southerner:
—Brags about what a conservative he is and then votes for Franklin D. Roosevelt.
—Or brags about what an isolationist he is and then votes for Richard Nixon.
—Or brags about what a populist he is and then votes for Barry Goldwater.
—Or brags about what an aristocrat he is and then votes for George Wallace.
—And is able to say with a straight face that he sees nothing peculiar about any of the above.”

IVINS: “The Southern passion for military service first astonished the rest of the country in 1898, when Southerners signed up in droves to avenge the Maine. It was the country’s first war since Appomattox, and for 33 years Yankees had questioned Southern loyalty.”

KING: “In 1898, the phenomenon that surprised Americans nearly as much as the explosion of the battleship Maine was the vast number of Southern men who answered the call to the colors. It was America’s first war since Appomattox, and Southern loyalty had been in question for 33 years.”

King was very angry. She is quoted elsewhere as telling reporters that "if we had the right kind of laws in this country I’d challenge her to duel over this." She opens her TAE article by writing that "Most liberals sneer, grate, whine, scream, and picket, but Molly Ivins chuckles wisely and smiles tiredly so everyone will regard her as a lovable cynic", and sprinkles the piece with zingers like "she delivers laid-back wisdom with the serenity of a down-home Buddha who has discovered that stool softeners really work", and "Watching her go through her paces is like watching Ona Munson, who played Belle Watling in Gone With the Wind, doing an imitation of Spencer Tracy playing Clarence Darrow in Inherit the Wind. That’s a lot of wind."

In my opinion Ivins was clearly guilty as charged, although the longest stretch of literal copying was only four words long ("first war since Appomattox, and"), and none of the other literally-copied sequences are more than two words long. This was plagiarism of the paraphrasing type.

Apparently Ivins agreed, because her (apparently immediate) response was to 'fess up and apologize. The December issue of The American Enterprise magazine published an exchange of letters between Ivins and King, under the title "Author, Author!". (Both letters, puzzlingly, seem to have been written before the first TAE article came out. I believe that time worked differently back in the last century, at least in the publishing industry.)

Ivins' letter:

August 16, 1995

Dear Ms. King,

You are quite right. There are three sentences in my article “Magnolias and Moonshine” —one of them a really good political line—that should have been attributed directly to you and are not.

On the third matter you raise in your Author Author! column in The American Enterprise, I have no idea how I managed to attribute to you more than you actually said—perhaps a recollection of something somewhere else in one of your books on the South. But I do not think a mistake of excessive attribution can be considered plagiarism.

I owe you an apology and I hereby tender it. I am deeply ashamed. I regret not giving you credit, and devoutly wish the matter had been brought to my attention earlier so it might have been corrected in subsequent editions and the paperback edition of the book.

I hope this does not sound too defensive to you, but there was no intention on my part to deceive anyone into thinking I had not read the many funny things you have said about the South. I hope my good faith is evidenced by the fact that I did cite you directly six times in the piece and praise one of your books as “definitive” on the peculiarities of Southerners as well.

I was inexcusably sloppy about the three sentences in question, with emphasis on the inexcusably.

Over the years, I have not only quoted many of your wonderful lines about the South in speeches—always, I believe, giving you credit—but also recommended your books to hundreds of people. I realize this does not excuse my lifting lines of yours without credit, but I did want you to know.

As for the rest of your observations about me and my work in your Author Author! column, boy you really are a mean b——, aren’t you?

Sincerely,
Molly Ivins, plagiarist

King's response:

August 24, 1995

Dear Miss Ivins:

Rather than rehash what I call plagiarism and you call careless attribution, I will speak in general terms.

First, the Washington Post, in breaking this story, referred to your “side” and my “side.” How can there be a “side” in this when everyone involved is either a writer or an editor? All of us, by definition, are on the same side—the word side. Every word I write is a piece of my heart, and I presume you feel the same way.

Second, I’m wondering how you managed to recycle me unchanged from the 1988 Mother Jones article into the 1991 book. When I compiled The Florence King Reader, I reread everything I’ve published over the last 20 years. I polished, revised, even rewrote some of the early selections to bring them up to my present standards, and I also prepared a fresh manuscript. This is how you catch mistakes. Anthologies are harder than they look, so please look next time.

Third, your publisher contends that I am seeking publicity by “attempting to hang onto the cape of Molly’s notoriety.” (You may want to take issue with him over his choice of words.) I have no need or wish for “notoriety”; celebrity is bad enough. I already have the only thing I want: the admiration and respect of people who know good writing and love the English language as I do.

Finally, it’s a shame this had to happen because you and I are such a pair of old rips that we probably would have gotten along like gangbusters. Please don’t spoil any more potential friendships.

Sincerely,
Florence King

And now for something completely different.

Recently, Mark Steyn (a witty political commentator, a lot like Molly Ivins) wrote a book review that had some very striking similarities to a 2004 weblog post by Geoff Pullum. Steyn's response to email from Pullum, requesting an apology and a link, was to have an assistant write a note saying that

We cannot see any similarities between Mark's piece and yours other than the quotations themselves, which obviously are the work of Mr Brown, and the grammatical term, which Mark was at pains to credit to you.

It's true that the three quotations are "obviously the work of Mr. Brown", but

Even facts or quotations can be plagiarized through the trick of citing to a quotation from a primary source rather than to the secondary source in which the plagiarist found it in order to conceal reliance on the secondary source. ["Copy Wrong: Plagiarism, Process, Property, and the Law," California Law Review, 1992; quoted in "What is plagiarism?", The Chronicle of Higher Education, 12/17/2004]

However, Steyn's "similarities" are not limited to a selection of quotations from Dan Brown, along with a set of ideas about why those particular quotations are interesting. For an even more striking similarity, the reader should consult the table at the end of this post, comparing Steyn's witticism "Novelist Dan Brown staggered through the formulaic splendour of his opening sentence" to its possible sources: the original quotation from Brown ("Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery"), the title of Pullum's post ("Renowned author Dan Brown staggered through his formulaic opening sentence"), and Pullum's reprise of the theme in the body of the post ("Renowned linguist Geoffrey Pullum staggered across the savage splendor of the forsaken Santa Cruz campus").

Given this (in my opinion clear) sign of influence, do you believe that Steyn didn't read Pullum before writing his review? How credible do you find it that Steyn came up independently with the idea of focusing on Brown's missing the's, and also with the particular examples and their order, and was subsequently given Pullum's grammatical terminology by one of his assistants

... because Mark asked if there was a technical term for a missing definite article and a Welsh University website, which led us to you, suggested the term had been coined by you.

The assistant pointed to an alternative chain of influence, also not credited in Steyn's review:

Mark's interest in this subject was piqued not by your website, with which he was not familiar, but by an item by Mark's former editor at his London newspaper, The Daily Telegraph, on the missing definite article in the first sentence of The Da Vinci Code. Mark mentioned it on a radio show last year and then noticed a similar start in Angels And Demons and wondered if it was a habit.

In a follow-up note, the same assistant insisted again that the idea came from Steyn or from his "colleagues in London", not from Pullum:

Mark had never heard of your website till last week [i.e. after writing the review - myl] and we will be able to demonstrate in court that nobody in our office clicked on two of your three allegedly plagiarized pieces until we received your e-mail. The points you claim Mark stole from you were made by others, including Mark and Mark's colleagues in London, long before we ever clicked on your website, as we would again prove in court.

The issue is not whether Steyn or his assistants clicked on Pullum's "website", but whether they copied Pullum's ideas and words, directly or indirectly. I guess it's possible that Steyn's "former editor at his London newspaper, The Daily Telegraph" borrowed Pullum's ideas and words in 2005, and Steyn then borrowed them from him -- the Telegraph's online archives only go back one year, so I can't check. But Geoff's posts on Dan Brown have been very widely read and circulated on the internet. As I mentioned before, one of them has for some time been on the first page of Google hits for {Dan Brown}. Language Log has gotten roughly four million page views since 2004, and around five percent of these are Geoff's Dan Brown posts, so that something like 200,000 people have read one or more of them. So it's also possible that someone emailed a copy of Pullum's posts to Steyn or to one of his assistants.

Whatever the detailed chain of transmission, I find it very hard to believe that Steyn wrote his Maclean's review "The Da Vinci Code: bad writing for biblical illiterates" without having read Pullum's post "Renowned author Dan Brown staggered through his formulaic opening sentence" (and perhaps others, such as "The Dan Brown Code"). What do you think?

I should mention that Steyn's assistant ended her communications with what might be perceived as a threat:

It is up to you whether you wish to escalate this any further. [...] But, given the intemperate nature of your e-mails, I think it would be better if you spoke to your lawyer and we will refer him to ours.

It's pretty common for people whose words and ideas are copied without attribution to get a little hot under the collar. In contrast to King's public take-down of Ivins, however, Pullum's private request for an apology and a link didn't mention challenging Steyn to a duel, or comment on the looseness of his bowels, or call him a windbag. And as Geoff made very clear, he doesn't see this as a legal issue but as a moral one, where the appropriate and courageous response would be a forthright apology. To my mind, the question here is whether Mark Steyn has as much grace and courage as Molly Ivins does.



[Update (with apologies for adding to what is already an overlong post): Ben Zimmer did a better job of searching the Daily Telegraph's archives than I did, and found the following. On 2/11/2005, Sam Leith's Daily Telegraph "Notebook" included the following item, reprinted below in its entirety:

The Da Vinci Code is an exemplary demonstration of the truth that, more than any other genre, a thriller need not be well written to work. Plotting and pace are all.

But seldom do books manage to grate from before the first word of the opening sentence. "Renowned curator Jacques Sauniere staggered through the vaulted archway…" It's the dog that didn't bark. The first word - "the" - isn't there. My theory is that a shadowy order of monks has stolen Dan Brown's definite article, and is guarding it at an ancient Templar priory.

According to Nexis, this appeared on p. 22 of the 11 Feb. 2005 edition, well after Pullum's widely-circulated posts on Dan Brown. In particular, Leith's little joke about how Brown's novel "[grates] from before the first word of the opening sentence" is similar to what Pullum wrote 9 months earlier in "The Dan Brown code":

I am still trying to come up with a fully convincing account of just what it was about his very first sentence, indeed the very first word, that told me instantly that I was in for a very bad time stylistically.

The Da Vinci Code may well be the only novel ever written that begins with the word renowned. Here is the paragraph with which the book opens. The scene (says a dateline under the chapter heading, 'Prologue') is the Louvre, late at night:

Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery. He lunged for the nearest painting he could see, a Caravaggio. Grabbing the gilded frame, the seventy-six-year-old man heaved the masterpiece toward himself until it tore from the wall and Saunière collapsed backward in a heap beneath the canvas.

I think what enabled the first word to tip me off that I was about to spend a number of hours in the company of one of the worst prose stylists in the history of literature was this. Putting curriculum vitae details into complex modifiers on proper names or definite descriptions is what you do in journalistic stories about deaths; you just don't do it in describing an event in a narrative.

Leith might have come up with this idea independently, or he might have gotten it from Pullum and thought it didn't rise to journalistic standards of sharing credit, or he might have gotten it from someone who got it from Pullum. This sort of recycling of jokes has always been common, if not entirely sanctioned -- Oscar Wilde: "I wish I had said that." James Whistler: "You will, Oscar, you will."

Leith's note is a small thing, in any case: the fifth of five brief items in an editor's column of miscellanies. And it's credible that Steyn originally got the idea of focusing his review on Brown's the's from this note (though his Maclean's article doesn't credit Leith either). But wherever Steyn first got the idea from, I find it hard to believe, as I wrote above, that he didn't use material from Pullum's post "Renowned author Dan Brown staggered through his formulaic opening sentence", and perhaps other posts, in writing his Maclean's review. And the apparent scale of copying in that case, at least in my opinion, rises to the level where an acknowledgment (or after the fact, an apology) would be appropriate. ]

Posted by Mark Liberman at 12:02 AM

May 16, 2006

Recycling grammatical terminology

Christopher Hitchens' latest fighting words column for Slate ("Don't Talk to the Mullahs", 5/15/2006) directs a few desultory insults towards his recent virtual debating partner Juan Cole, while describing Mahmoud Ahmadinejad's letter to President Bush:

It then turns to a pedantic discussion of the wrongness of the whole existence of the state of Israel, which might have been designed to make professor Juan Cole (who thinks that Khomeinist anti-Zionism is a derivation from Persian poetry) look like a fool and an ignoramus.

Though these insults are pedestrian, by Hitchens' standards, the column did feature a notable act of lexicographic creativity.

His innovation occurs in the last sentence of the paragraph quoted below:

The man is as mad as a hatter, therefore, and makes up for his impotence and insanity with many ingratiating assurances about Jesus and his honored place in the Quran and many lachrymose remarks about violations of human rights. He declares that his regime's nuclear program is a matter of "scientific R&D," and he ends with a salutation in Arabic which is given without translation in the news-agency versions that have been made available. The salutation reads, "Vasalam Ala Man Ataba'al hoda." This is a customary signoff by devout clerics, in Iran as well as in Arab lands, and can be approximately translated as "peace unto those who follow the true path." It was a favorite of the late Ayatollah Khomeini's. According to some, it was used as a silkily threatening mode of address by the Prophet Mohammed, who employed it when addressing neighboring states that had not yet converted to Islam. In this declension, it could be interpreted to imply war unto those who did not choose to follow the true path. [emphasis added]

As the AHD explains, declension can mean things like

2. A descending slope; a descent. 3. A decline or decrease; deterioration: “States and empires have their periods of declension” (Laurence Sterne). 4. A deviation, as from a standard or practice.

but Hitchens has no descents, declines or deviations in mind. Instead, I believe, he's taking off from what the AHD gives as the first meaning of declension:

1. Linguistics a. In certain languages, the inflection of nouns, pronouns, and adjectives in categories such as case, number, and gender. b. A class of words of one language with the same or a similar system of inflections, such as the first declension in Latin.

However, Hitchens is not talking about a word, but rather about a phrase used as a "customary signoff", and he's not talking about its processes or categories of grammatical inflection. Instead, he obviously means something like "in this interpretation" or "in this construal" or perhaps "in this context of use". This is a plausible extended meaning, since the process of declension suits a noun, pronoun or adjective for different functions in different contexts, just as the Arabic valediction is said to have different implications when addressed to co-religionists and to others. [Whether this interpretation is linguistically or historically correct is beyond my knowledge.]

It would be unwise to claim that no one has ever used the word declension in this way before, but I'm pretty sure that I've never heard or read it.

It's become standard for people to use parse to mean "examine very carefully, especially with respect to possible alternative interpretations". This was originally a metaphorical extension of its basic meaning of "perform grammatical analysis", but now that grammatical analysis has largely vanished from the public curriculum, the metaphor is all that's left in popular discourse. As Hitchens has realized, a long list of other grammatical terms are also now available for re-use.

But there may be a narrow window of opportunity here: this metaphorical recycling can only happen as long as a fair fraction of the population can access, at least as a vague resonance, the literal meaning of grammatical terms like conjugation, inflection, mood, tense, affix, verb, predicate and the like. Unless there's a renaissance of linguistic analysis in primary and secondary education, this won't be true for long.

[Hat tip to Lane Greene]

Posted by Mark Liberman at 05:52 PM

Ambiguously numbered pronoun

My fascination with all things Apple has me occasionally reading the (likewise, occasionally) funny Crazy Apple Rumors Site. Yesterday's post was mildly funny, with a bit of corporate social commentary thrown in for good measure. What I found most interesting, though, was a very nice example of them being deliberately ambiguous between singular and plural. (Emphasis added.)

Crazy Apple Rumors Site has learned that former Apple General Counsel Nancy Heinen was released from the firm after failing to produce a pair of testicles. [...]

"We knew, of course, that Nancy was a woman," Jobs said. "But she long assured us that she had a pair of testicles that she kept in a safety deposit box somewhere." [...]

Late in April, Heinen reportedly stalled for time by saying that she had "loaned out the testicles to a friend who had forgotten to return them and then went on vacation and [she] couldn't get a hold of them."

"Them" apparently meaning either the friend or the testicles. Apple's male board members were apparently not impressed as they are usually quickly able to get a hold of their testicles.

Posted by Eric Bakovic at 04:56 PM

Locative Epithets as Names

In all the foofuraw about the barbarism of referring to Leonardo da Vinci as da Vinci, nobody seems to have noticed that referring to people by their locative epithets alone is quite common. Here are some examples:

  • (Vincent) van Gogh
  • (Alexis) de Tocqueville
  • (James) van Allen
  • (Johannes Diderik) van der Waals

How many of you even knew the given names of the latter two?

Not only is the use of such names common, but I suspect that it is not necessarily the result of ignorance of the structure and usage of such names in their original language. In fact, one finds the same usage in the original languages. Check out the article on Democracy in America in the French language version of Wikipedia and you'll see that it refers to de Tocqueville, or the article on Van Gogh in the Dutch Wikipedia, which refers to Van Gogh.

Posted by Bill Poser at 01:07 PM

Cutting in line: what would Of Nazareth do?

Mark Steyn's review "The Da Vinci Code: bad writing for Biblical illiterates" starts like this:

It's a good rule in this line of work to respect a hit. But golly, The Da Vinci Code makes it hard. At the start of the book, Dan Brown pledges, "All descriptions of artwork, architecture, documents and secret rituals in this novel are accurate." It's everything else that's hokum, beginning with the title, whose false tinkle testifies to Brown's penchant for weirdly inauthentic historicity. Referring to "Leonardo da Vinci" as "da Vinci" is like listing Lawrence of Arabia in the phone book as "Of Arabia, Mr. L," or those computer-generated letters that write to the Duke of Wellington as "Dear Mr. Duke, you may already have won!"

This paragraph is just about the only part of what Steyn has to say about Dan Brown's book that is not strikingly similar to a Language Log post by Geoff Pullum. That's not to say, however, that it's original.

Actually, Geoff Pullum's 1/18/2005 post "The Kaleidoscope of Power" did say of The Da Vinci Code:

Even the title contains a linguistic error, Adam Gopnik claims in this week's issue of The New Yorker. Leonardo came from Vinci. Da Vinci is not a name. It's a prepositional phrase, like of Nazareth in Jesus of Nazareth. What would Of Nazareth do?

But Geoff credits Adam Gopnik. And this witticism is apparently not original to Gopnik either. A very similar point is made in a post by Emily in the weblog "it comes in pints?", dated almost a year earlier:

Though it still grinds me a little when people refer to Leonardo Da Vinci as Da Vinci, which is like calling William of Orange Of Orange. [March 18, 2004]

And the "of Nazareth" version is used on a web page that claims to have last been modified in October of 2004:

Also, Brown refers to Leonardo da Vinci, as "Da Vinci" , all through his book. Since "Da Vinci" means literally, "of Vinci", that would be the same as us calling Jesus, "of Nazareth", instead of, Jesus, of Nazareth ! [apparently October 26, 2004 or earlier]

The "da Vinci is like of X" meme has been picked up widely: on September 27, 2005, Jay Nordlinger wrote in the National Review:

Okay, a little language, and a little art. I want to say: Et tu, Antonin? While at the Juilliard School the other day, Justice Scalia referred to "da Vinci" — meaning, Leonardo. I'm surprised at him.

The mistake of referring to Leonardo as "da Vinci" is so entrenched, I'm afraid it's uncorrectable. I have had to fight with editors about this: You say "Leonardo," and they want to say "da Vinci," thinking it's his last name — thinking it's the same as saying "Reynolds." They think that, when you say "Leonardo," you're saying the equivalent of "Joshua." Actually, to say "da Vinci" is to say "of Orange," instead of "William."

Nordlinger quotes from Charles Moore's Spectator Diary "not long ago" (i.e. not long before September 2005):

My colleague Christopher Howse has pointed out that you can tell that The Da Vinci Code is rubbish just by its name. Students of art refer to the man in question as 'Leonardo', 'Da Vinci' being simply the identifier of his town of origin. So Dan Brown's title is the equivalent of a book about Jesus being called Of Nazareth. [That is much better than my "of Orange" example.]

(The passage in blue is Nordlinger's quote from Moore; the remark in square brackets is Nordlinger's comment on that quote.)

Depending on when Moore's piece actually ran (I'm not willing to pay 50 pounds for a subscription to The Spectator in order to find out), Moore's colleague Howse may have misled him about the source of the "Of Nazareth" joke by failing to cite Gopnik. But this doesn't matter much, in my opinion -- it seems likely that some form of this insight has been commonplace among intellectuals for a while. It certainly pre-dates Gopnik, and it wouldn't surprise me to find a similar remark -- some re-phrasing of <referring to Leonardo as 'da Vinci' is like referring to X of Y as 'of Y'> -- dating from many years before.

However, it's worth noting that Pullum and Nordlinger both take the trouble to make it clear that the remark is not original to them. Steyn, in contrast, does not. As a result, quite a few bloggers have credited him with brilliance for trotting out his version of this well-worn witticism. For example:

Who looks at the world in quite the way Mark Steyn does? Here he's putting The Da Vinci Code into perspective:

It's everything else that's hokum, beginning with the title, whose false tinkle testifies to Brown's penchant for weirdly inauthentic historicity. Referring to "Leonardo da Vinci" as "da Vinci" is like listing Lawrence of Arabia in the phone book as "Of Arabia, Mr. L," or those computer-generated letters that write to the Duke of Wellington as "Dear Mr. Duke, you may already have won!"

Another one:

Steyn wins in a knockout. It wasn't really even a fair fight:

It's a good rule in this line of work to respect a hit. But golly, The Da Vinci Code makes it hard. At the start of the book, Dan Brown pledges, "All descriptions of artwork, architecture, documents and secret rituals in this novel are accurate." It's everything else that's hokum, beginning with the title, whose false tinkle testifies to Brown's penchant for weirdly inauthentic historicity. Referring to "Leonardo da Vinci" as "da Vinci" is like listing Lawrence of Arabia in the phone book as "Of Arabia, Mr. L," or those computer-generated letters that write to the Duke of Wellington as "Dear Mr. Duke, you may already have won!"

And another:

Quote of the day - Mark Steyn (the Da Vinci Code)

Referring to "Leonardo da Vinci" as "da Vinci" is like listing Lawrence of Arabia in the phone book as "Of Arabia, Mr. L," or those computer-generated letters that write to the Duke of Wellington as "Dear Mr. Duke, you may already have won!"

Or again:

A Steyn-slap to the Da Vinci Code. Nobody does it better. [emphasis added]

But in fact another writer did it at least as well, and in almost exactly the same way, and earlier. This sort of reputational mis-attribution is just what Jonathan Baron was writing about in his post "Plagiarism as probabilistic harm":

Plagiarism exists only as part of a system in which people are rewarded for their work. The reward in these cases is primarily reputation, or a personal record. [...]

When people pass off someone else's work as their own, they butt into the queue. This weakens the system and makes it less trustworthy. The harm from such weakening is probabilistic and not always noticed immediately.

Let me say at this point that I've been a fan of Mark Steyn's writing for some time. He's opinionated, clear, memorable and often very witty. For example, read the opening of his May 7, 2006 column "Moussaoui gets life, the terrorists win":

"America, you lose," said Zacarias Moussaoui as he was led away from the court last week.

Hard to disagree. Not just because he'll be living a long life at taxpayers' expense. He'd have had a good stretch of that even if he'd been "sentenced to death," which in America means you now spend more years sitting on Death Row exhausting your appeals than the average "life" sentence in Europe. America "lost" for a more basic reason: turning a war into a court case and upgrading the enemy to a defendant ensures you pretty much lose however it turns out. And the notion, peddled by some sappy member of the ghastly 9/11 Commission on one of the cable yakfests last week, that jihadists around the world are marveling at the fairness of the U.S. justice system, is preposterous. The leisurely legal process Moussaoui enjoyed lasted longer than America's participation in the Second World War. Around the world, everybody's enjoying a grand old laugh at the U.S. justice system.

Except for Saddam Hussein, who must be regretting he fell into the hands of the Iraqi justice system. Nine out of 12 U.S. jurors agreed that the "emotional abuse" Moussaoui suffered as a child should be a mitigating factor. Saddam could claim the same but his jury isn't operating to the legal principles of the Oprahfonic Code.

Whether or not you agree with Steyn's sentiments, you have to admit that this is brilliant writing. "Upgrading the enemy to a defendant" and "the Oprahfonic Code" are particularly nice touches, in my opinion. And as far as Google knows, these phrases are original to Steyn, as is the observation that Moussaoui's trial lasted longer than America's engagement in WWII. I bet that all the rest of it is original Steyn too, though I haven't checked.

My own experience has been that the students who engage in the more subtle forms of plagiarism are often among the smartest and most verbally adept. Something similar seems to be true of Stephen Ambrose, Doris Kearns Goodwin, Kaavya Viswanathan and the like. Such people are not borrowing in order to make up for their inadequacies, it seems, but in order to help establish or maintain a reputation that they have every reason to believe that they deserve.

[By the way, Steyn could have found a better replacement for his Duke of Wellington example (also a variant of an old joke) by looking in Geoff's August 26, 2004 post "A letter from the Lord Quirk". And if he'd cited it as Pullum cited Gopnik or as Nordlinger cited Moore, he'd have been welcome to it.]

Posted by Mark Liberman at 12:08 AM

May 15, 2006

Strange ifs of the third kind

In a recent New Yorker cartoon by Alex Gregory [April 17, 2006, page 74], a hospital patient, lying strapped down on the operating table and ready to be anaesthetized for an operation, looks up earnestly at the masked surgeon and says:

"You know, doctor, right now I'd really prefer if your sense of humor were a tad less self-deprecating."

The joke about what patients would think of surgeons' traditional operating-room humor is good, and made me smile; but the use of if in the caption really caught my attention. It's not the conditional if, you see — the one that you get in You can get it if you really want. And it's not the interrogative subordinator — the one that has exactly the same function as whether, as in I don't know if I'm coming or going. It's a very interesting if indeed. A third kind of if that almost no grammarians have written about. Let me explain.

The subordinator if that introduces interrogative content clauses is easy to spot: you just replace it by whether and make sure the result is grammatical and has the same meaning. Prefer doesn't take interrogative content clauses: *I'd prefer whether your sense of humor were a tad less self-deprecating is obviously completely ungrammatical. So we can forget that. The cartoon does not have the interrogative subordinator if. What I have to do now is to convince you that we can tell the conditional if from the strange new third kind of if I am claiming English has. And I think I can do that.

My claim is that the patient in the cartoon is using if as a subordinator to introduce a declarative irrealis mood content clause, and that this is one of the grammatical possibilities with the matrix verb prefer.

Here's how to tell that it's not the conditional. Conditional if phrases are adjuncts, and you can always put an adjunct at the front of the clause it belongs to if you want. So we have pairs like this:

(1) a. You can get it if you really want.
b. If you really want, you can get it.
(2) a. I would die of embarrassment if that happened to me.
b. If that happened to me I would die of embarrassment.

In each case, because the (a) example is grammatical, so is the (b) example. Now look at what happens with the cartoon example (which I shorten by trimming the irrelevant stuff at the beginning):

(3) a. I'd prefer if your sense of humor were a tad less self-deprecating.
b. *If your sense of humor were a tad less self-deprecating, I'd prefer.

The version with the if clause at the front is ungrammatical! The situation is exactly comparable to this one:

(4) a. I'd prefer for your sense of humor to be a tad less self-deprecating.
b. *For your sense of humor to be a tad less self-deprecating, I'd prefer.

What's going on here is that you can always front an adjunct, but in general, it is much less likely that you will be able to front a complement clause. The sentence (3a) is comparable to the sentence in (4a) in every way, in fact: they mean just about the same, and I'm saying they have just about the same structure.
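For readers who like their diagnostics spelled out mechanically, the two tests used so far (substituting whether, and fronting the if clause) can be sketched as a small decision procedure. This is only an illustrative sketch in Python: the function name `classify_if` and its two arguments are hypothetical names introduced here, and the grammaticality judgments still have to come from a speaker, not from the code. The judgment that whether cannot replace if in example (1a) is assumed here rather than stated in the text.

```python
from typing import Optional


def classify_if(whether_ok: bool, fronting_ok: Optional[bool]) -> str:
    """Classify an 'if' clause from two speaker judgments (hypothetical helper).

    whether_ok:  replacing 'if' by 'whether' keeps the sentence grammatical
                 with the same meaning.
    fronting_ok: moving the 'if' clause to the front keeps the sentence
                 grammatical (not consulted when whether_ok is True).
    """
    if whether_ok:
        return "interrogative subordinator"
    if fronting_ok:
        return "conditional adjunct"
    return "declarative content clause (complement)"


# Judgments for example sentences quoted in the post.
examples = {
    "I don't know if I'm coming or going.": (True, None),
    # Assumed judgment: 'whether' cannot replace 'if' here.
    "You can get it if you really want.": (False, True),
    "I'd prefer if your sense of humor were a tad less self-deprecating.": (False, False),
}

for sentence, (whether_ok, fronting_ok) in examples.items():
    print(f"{classify_if(whether_ok, fronting_ok)}: {sentence}")
```

The order of the checks mirrors the order of the argument: the whether test disposes of the interrogative reading first, and only then does fronting separate the adjunct from the complement.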

One difference between clauses introduced by if and clauses introduced by for is that for introduces an infinitival clause (it has the verb in the plain form and a to at the beginning of the verb phrase), but if introduces a finite clause in which the verb is in the special irrealis form if the verb has one.

The inflectional system of English is much less complex than it was a thousand years ago, and today there is only one verb that has an irrealis form that is (in spelling and pronunciation) different from its preterite. That verb is be. And even for be, there are only two contexts in which you can tell the irrealis from the preterite: the first person singular (if I were you) and the third person singular (if he were really smart). In all other person/number combinations, were is the preterite form too, but in these two the preterite would be was, so we can see whether the irrealis is being used. And by a stroke of good luck, that was the verb the cartoonist chose as the main verb of the complement clause! If it hadn't been — say, if he had chosen to have the patient say, "I'd really prefer if you stopped making jokes" — I wouldn't have been able to show you that it really is a special irrealis mood form in there, so there are special syntactic properties of this construction.
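The point about where the irrealis can actually be seen is easy to check by laying out the two paradigms of be side by side. The following is a minimal sketch in Python; the table layout and the name `distinguishing_slots` are assumptions made for illustration, while the forms themselves are the ones just described.

```python
# Preterite forms of "be", keyed by (person, number).
PRETERITE = {
    (1, "sg"): "was", (2, "sg"): "were", (3, "sg"): "was",
    (1, "pl"): "were", (2, "pl"): "were", (3, "pl"): "were",
}

# The irrealis form of "be" is "were" in every person/number combination.
IRREALIS = {slot: "were" for slot in PRETERITE}


def distinguishing_slots():
    """Return the person/number slots where irrealis and preterite differ."""
    return sorted(slot for slot in PRETERITE if PRETERITE[slot] != IRREALIS[slot])


print(distinguishing_slots())  # [(1, 'sg'), (3, 'sg')]
```

Only the first and third person singular come out, which is why a third person singular subject like your sense of humor is such a lucky choice for the demonstration.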

One other thing. Like other clauses, these if-introduced clauses are often found in what we call extraposition, with a meaningless it occupying the subject or object slot instead of the clause occupying it: we get It would be silly if you gave up now, and I'd appreciate it if you'd take your hand off my leg. But in these cases, since sentences like It would be silly and I'd appreciate it are grammatical in their own right (with a meaningful it that refers to something), it is not so easy to tell that you are looking at content clauses. There is a separate reading of these sentences under which the it is meaningful and the if introduces a conditional adjunct. That is, the sentences are ambiguous. What is so nice about the cartoon example quoted above is that it doesn't have an it object. And conveniently for my purposes, *I'd prefer is ungrammatical on its own. That tells us conclusively that we have a declarative irrealis content clause in our cartoon caption example. Q.E.D.

To summarize, before I dismiss the class: there are at least three items spelled if. Two of those three are described in every grammar book, but the third has virtually never been described anywhere. This latter one, this strange if of the third kind, introduces declarative content clauses in the irrealis mood. And let me make it very clear: there will be a test on this material, and it will be on the final.

Posted by Geoffrey K. Pullum at 11:43 PM

Linguistics fails again

In the May 13, 2006 New York Times, there's an article by Diana Jean Schemo about the controversy at Gallaudet University over the selection of Jane K. Fernandes as president, "Protests Continue at University for Deaf". The article is an interesting account of an unusual situation, as you'd expect from one of the paper's main higher-education reporters. However, it contains one very odd sentence:

Deaf students here said that American Sign Language, which uses gestures to express words and ideas rather than specific letters, was easier for them to understand than other forms of communication that may translate letters and syntax that they have never heard and that are more difficult to grasp.

I think that this may be a reference to the difference between ASL and "finger spelling", in which the letters of written English are spelled out with a series of hand-shapes. However, the article continues

Erin Moran, who is studying for a master's in counseling and was handing out fliers opposing Dr. Fernandes, criticized her for not banning students from speaking in front of deaf students, instead of using only American Sign Language. When that happens, Ms. Moran said, deaf students feel shut out at an institution that should help strengthen their identity as deaf people with a right to participate fully in the world.

This makes it seem as if Schemo is contrasting sign language with spoken language, implying that speech "uses ... specific letters" "to express words and ideas". Perhaps this is Schemo's confusion, or perhaps it was introduced by an editor working to shorten a longer account of the linguistic issues involved. In either case, it's another example of the common confusion between languages and their writing systems, and another casual journalistic mis-description of speech and language.

Confusions about the nature of orthography and its relationship to language are most evident in discussions of Chinese, but there are plenty of examples within the boundaries of English. For another case involving a smart and well-educated person writing in a major American publication, consider Leon Wieseltier's account of his g-dropping choices in his Sopranos role as "Stewart Silverman". It's clear enough what pronunciation options Wieseltier was getting at, though his description was completely inaccurate; in contrast, it's not at all clear what Schemo meant to tell us about the nature of ASL.

As I wrote about another casually botched linguistic description in the popular press, I blame the linguists. Modern intellectuals are almost entirely bereft of resources for talking about the simplest facts of pronunciation, sentence structure, and meaning. This isn't their fault -- in most cases, no one has ever taught them anything about these topics. My profession has failed in its most basic duty to society.

Posted by Mark Liberman at 06:58 PM

A grander Chinglish

Email from Victor Mair:

From the back of a package full of fancy mushrooms and seaweed products (I boiled a decoction of them for Li-ching last night):

It is made from the exclusive remote mountains or non-polluted maritime space. It is the masterwork of the curiosity selected by our professional. This series use the classy material, fastened on the edible value and health care. It is the pure natural health care. It is according to the pursue of modern to return to the nature and green life. It is always the regale in basilica and the best choice of presenting a gift to friends.

The first clause of the last sentence is particularly precious. I have no idea what it means.

Breathless,

Victor

Posted by Mark Liberman at 04:19 PM

Is Mark Steyn guilty of plagiarism?

I described the facts in an earlier post. It seems clear to me that in Steyn's 550-word discussion of Dan Brown's style, he took the terminology, most of the basic ideas, all of his three examples (in order), a couple of turns of phrase, and his punch line from one of Geoff Pullum's Language Log posts. He credits Pullum by name (though he gives no link or any other sort of source citation) for the term "anarthrous occupational nominal premodifier", but not for the rest of his borrowings. I promised to give my opinion later on, and this is a first installment.

When an undergraduate in one of my classes turns in a paper with a similar amount of uncredited copying, I ask him or her to come see me. We're not talking about something copied wholesale from a published paper or from an internet paper mill -- that would simply get a grade of zero and a referral to the Office of Student Conduct for further action. We're talking about a case where some basic ideas, a series of quotations or examples, and some key turns of phrase are taken from another source without explicit credit.

After laying out the facts, I'd ask for a response, which at first is usually a denial or an excuse. One of the commonest excuses is "But that source is in my bibliography/footnotes!" In that case, I would explain how Stephen Ambrose was accused of plagiarizing Thomas Childers, despite the fact that Ambrose gives Childers "a mention in the bibliography and four footnotes" (according to Fred Barnes' Daily Standard article of January 1, 2002, "Stephen Ambrose, Copycat" which also gives several examples of the copying involved). I'd also show them the coverage of the case where Doris Kearns Goodwin was accused of plagiarizing Lynne McTaggart, starting with what Timothy Noah wrote in a Slate article from 1/22/2002 headlined "Doris Kearns Goodwin, liar: First she plagiarized, then she lied about it":

Did Doris Kearns Goodwin commit plagiarism? "Absolutely not," she tells Boston Globe reporter Thomas C. Palmer Jr. "There were extensive footnotes." Chatterbox has had it with brand-name historians who pretend that the rules allow you to steal someone else's sentences (for examples of Goodwin's theft, click here) provided that you supply a footnote. This is not a gray area.

And I'd urge them to read the rest of the Ambrose/Goodwin coverage, in the hopes of persuading them that more is at stake than just their grade in one undergraduate class. These days, I might take this hypothetical student through the sad tales of Kaavya Viswanathan and William H. Swanson as well, to drive home the lesson about the consequences of plagiarism (though those were cases where no reference of any kind was given, even an inadequate one).

Since students often remain convinced that what they did was OK, since they rearranged words a bit or paraphrased some of the material rather than quoting it, I give them a copy of the special report from The Chronicle of Higher Education, posted 12/17/2004, "What is plagiarism?" I might ask them to read these two paragraphs out loud:

Outright copying of someone else's writing is only the most clear-cut form of plagiarism. The Modern Language Association provides a succinct but sweeping catalog of varieties of plagiarism in its MLA Handbook for Writers of Research Papers: "A writer who fails to give appropriate acknowledgment when repeating another's wording or particularly apt term, paraphrasing another's argument, or presenting another's line of thinking is guilty of plagiarism."

The term "plagiarism" applies to "the imitation of structure, research, and organization," notes Laurie Stearns, a copyright lawyer in "Copy Wrong: Plagiarism, Process, Property, and the Law," an essay appearing in the California Law Review in 1992. "Even facts or quotations can be plagiarized," writes Ms. Stearns, "through the trick of citing to a quotation from a primary source rather than to the secondary source in which the plagiarist found it in order to conceal reliance on the secondary source." In the sciences, "accusations of plagiarism may center on the content of discoveries or the interpretation of data rather than on specific phraseology."

I also try to make sure that the hypothetical student understands that plagiarism is not at all the same thing as copyright violation. As the Chronicle article explains

If Smith copies a chapter from a book by Jones without permission, then the rights of the copyright holder have been violated. But suppose Smith paraphrases the chapter, argument by argument. In that case, Smith will have copied the ideas, but not the expression, of a copyrighted work. If no credit is given, then Jones has every reason to complain about being plagiarized. Still, assuming that Smith has been careful not to borrow any of the language of the original, it will not be an infringement of copyright.

In his essay "Plagiarism, Norms, and the Limits of Theft Law: Some Observations on the Use of Criminal Sanctions in Enforcing Intellectual Property Rights," appearing in the Hastings Law Review in 2002, Stuart P. Green, a professor of law at Louisiana State University at Baton Rouge, writes that copyright law "protects a primarily economic interest that a copyright holder has in her work ... whereas the rule against plagiarism protects a personal, or moral, interest."

I might also try to engage their moral sense with a discussion of why plagiarism is ethically wrong, based on ideas like those that Jonathan Baron lays out in his blog post "Plagiarism as probabilistic harm". I try to explain that I don't think that they're an evil person, but what they did was wrong. In order to try to engage their sense of self-preservation, I underline (as I did at greater length here) that plagiarism in academic and journalistic writing is one of those sins against the social order that our culture often takes seriously, like murder, rather than one of those that it usually excuses, like extramarital sex.

After all of this discussion, what happens next depends on how the student reacts. Usually we have a tense but friendly discussion, at the end of which they agree to do the paper over again. If their first and last reaction were instead to be "I did nothing wrong -- see you in court!", I'd refer the case to my university's Office of Student Conduct and let them sort it out. I'm happy to say that this has never happened to me.

Unfortunately, this is roughly what happened to Geoff Pullum in the case under discussion. As I understand it, the sequence of actions and reactions was something like this. First, a Language Log reader emailed Geoff to tell him that he was mentioned by name in a Steyn piece (no reference given). Geoff googled Steyn and found the Steyn Online web site. He was expecting to see just some passing mention in a piece about something else, but found that Steyn's review of The Da Vinci Code seemed to be developed entirely out of his ideas. Thinking initially that it was merely a piece on the web site, Geoff wrote to Steyn and asked him if he could modify it with links to credit the Language Log pieces that had influenced it. A short time later, after learning that links were now out of the question because the piece was in final form and had already appeared in print in Maclean's (the link from Steyn Online actually pointed to the Maclean's web site), and having had no immediate reply, Geoff wrote again to ask for an acknowledgment and some public attempt to clarify the source of the ideas and examples. At no time did he mention legal action, copyright, or courts, because it was always clear to him (as it is to me) that this is not a matter to which copyright law could possibly be applied.

Steyn's assistant responded (and I paraphrase rather than quoting here) <<Steyn did nothing wrong -- see you in court if you dare to take this further. >>

Mark Steyn is, of course, not a student. So given his attitude, I think it's appropriate to refer his case to the court of public opinion. Make of it what you will.

Posted by Mark Liberman at 11:02 AM

Pulling (to) within: the paper trail

Last week I wrote about the peculiar sports expression "pull (to) within N" meaning 'narrow a differential of points, runs, etc. to exactly N' — and the even more peculiar construction "pull (to) within X-Y," where X-Y represents the score of two teams in a game. I traced the turn of phrase back to the language of boat-racing and horse-racing, which feature competitors shifting positions along a continuous course. When the spatial metaphor was borrowed into team sports like baseball and basketball, where scoring is discontinuous (expressed in whole numbers), the prepositional phrase "within N" came to be understood as 'previously more than N behind, now exactly N behind.' I sketched out a vague chronology for this shift, suggesting that by the mid-20th century it was common for sports commentators to talk about "pulling (to) within N points, runs, etc.," and that the newer version with a game score as the object of "within" was common by the 1970s. I based that chronology on a cursory glance through digitized newspaper databases, finding plenty of examples for the first sense c. 1950 and plenty for the second sense c. 1975.

My imprecision and lack of citational support did not impress Dr. Metablog, who had previously griped about the sportscaster usage of "within." Dr. M wrote:

I can't say that my memory supports his version of history.  I've been listening to basketball games on the radio since just after World War II, when the NBA came into existence, and I'm moderately sure that I didn't hear the idiom "within two points" until the 1980s at the earliest.  It was an unpleasant innovation in language that stuck painfully in my ear.  To the best of my recollection, "within" was an invention of Marv Albert -- one of his extremely limited repertoire of linguistic tics. Others: "from downtown," "served up a facial," " yesss," "a spec-tac-ular move."

Since I was remiss in providing actual citations the first time around, I'll do so below. As it happens, Dr. M has fallen prey to the dastardly Recency Illusion, as these semantic shifts emerged much earlier than I had initially estimated.

First, it's important to recognize that the sporting version of "within" has been percolating for quite a long time independent of its usage in the longer phrase "pull (to) within." For instance, during the 1880 baseball season (back when the National League was the only game in town), the Chicago White Stockings opened up a big lead over the other teams in the standings. At the time, the league championship was awarded to whichever team had won the most games at the end of the season. As the season wound down, the Chicago Tribune kept track of how close the White Stockings were to clinching the championship (a calculation that baseball buffs would later call the "magic number"). On Sep. 10, 1880, the Tribune reported that the White Stockings were "now within four games of the championship goal," meaning that they could lock up the championship with four more wins. Two days later, the Tribune continued the countdown with the headline, "The Chicago team now within two games of the championship." Clearly, "within two games" could not be construed as "less than two games away from," since the article explains that two was the exact number of wins that would guarantee Chicago the National League pennant.

Now what about the full expression with "pull"? By the early years of the 20th century, one variant form of the phrase had already become popular with sportswriters: "pull up to within N points, runs, etc." (This echoed less quantifiable expressions of the time like "pull up to within striking/hailing/speaking distance.") I found that "pull up to within N" was most common in baseball, with cites back to 1891 if not earlier. But by the turn of the century it could also be applied to a variety of other sports, such as bowling, tennis, and the burgeoning pastime of basketball:

Atlanta Constitution, Sep. 5, 1891, p. 6
For five innings the contest was interesting. Then the visitors pulled up to within one of tieing the score, and it became still more so.

Boston Daily Globe, Jan. 12, 1900, p. 4
In the seventh frame the visitors had pulled up to within 37 pins.

Washington Post, Aug. 24, 1900, p. 8 (headline)
Rally fell a run short. Phillies pulled up to within one run of Giants last time at bat.

New York Times, June 9, 1901, p. 20
Miss Jones made a splendid effort when the score stood at 5-3 against her, pulling up to within a single point of the set only to lose it finally after a long struggle by 10-8.

Trenton (N.J.) Times, Oct. 28, 1901, p. 7
During the last ten minutes Millville pulled up to within seven points of Trenton.

The version with "pull up to within" was eventually overtaken by the shorter variants "pull to within" or simply "pull within." Here's a selection of cites from 1924 and 1925 (as above, drawn from ProQuest and Newspaperarchive) showing that these variants were already becoming firmly entrenched by that time:

Danville (Va.) Bee, May 14, 1924, p. 11
McGraw's worried outfit lost their fourth consecutive game to St. Louis yesterday, 8 to 3 and the Cubs pulled to within half a game of second place by defeating Brooklyn, 3 to 1.

Washington Post, June 9, 1924, p. S2
The Cubs, by winning two games from New York, pulled to within one game of the Giants.

Olean (N.Y.) Evening Herald, July 11, 1924, p. 10
Cleveland pulled to within a game of St. Louis by squeezing a 4 to 3 win out of the stubborn Athletics.

Appleton (Wisc.) Post-Crescent, Sep. 26, 1924, p. 17
The Pirates, however, went down with colors flying in the ninth inning when with two out they rallied and pulled to within a run of the champions on Carey's home run drive.

New York Times, Dec. 21, 1924, p. S2
At one time in the first half Franklin and Marshall had pulled to within one point of the Hoboken team, the score standing at 7 to 6.

Washington Post, Feb. 21, 1925, p. S1
With Ryan and McNaney playing in place of Farley and Gitlitz, Bucknell pulled to within 8 points of the Hilltoppers.

Decatur (Ill.) Daily Review, Jan. 18, 1925, p. 9
Taylorville led all the way until the last quarter when the Y staged a rally and pulled within one point of the visitors.

Washington Post, Mar. 23, 1925, p. S1
The bakery team seemed to find itself at the start of the second period and at one time pulled to within six points of its opponents.

Bridgeport (Conn.) Telegram, Aug. 7, 1925, p. 9 (headline)
Senators pull within game of Athletics by twin win over George Sisler's club.

As with the earlier cites, there is no mistaking that "pull (to) within N" meant 'narrow the gap to exactly N.' This is true even when the expression was used to refer to "games behind" in the standings, despite the fact that such calculations can involve half-games. To take the last citation as an example, on the morning of Aug. 7, 1925 the Washington Senators were exactly one game behind the Philadelphia Athletics in the American League standings, not a half a game. (The nifty website Retrosheet verifies this.)

What about the second sense of "pull (to) within," where a game's score is the object of the preposition? It turns out my previous estimate of the 1970s as the time of its emergence was significantly late. (The Recency Illusion spares no prisoners.) I've found attestations all the way back to 1930, with frequency increasing to a high level by about 1950. Here are some cites taken from Newspaperarchive's database of regional papers, providing a range of sports coverage from local beat reporters to nationally syndicated wire stories:

Decatur (Ill.) Herald, Nov. 26, 1930, p. 7
The Banner Blues were slow getting started, and found themselves trailing 6 to 0 when the first quarter ended, but they pulled to within 11-10 by halftime and took the lead in the third quarter.

Nebraska State Journal, Jan. 13, 1937, p. 11
Then Coach Pop Klein put in his reserves and Hebron pulled to within 19-15 at the half.

Charleston (W. Va.) Daily Mail, Aug. 24, 1937, p. 14
Carbide pulled within 3-4 in the sixth when Ware got on base as his third strike got away from the catcher.

Clearfield (Pa.) Progress, Feb. 19, 1941, p. 3
The Bisons made a fourth-quarter surge, pulling to within a count of 25-28, but couldn't quite make the grade.

Kingston (N.Y.) Daily Freeman, Mar. 20, 1945, p. 11
St. John's pulled to within 14-13 at halftime and went ahead at 15 soon after second half started.

Joplin (Mo.) Globe, Feb. 28, 1946, p. 12
Greenfield pulled to within a 27-30 shade starting the last period and kept on the heels of the eventual winners all the way.

Post Standard (Syracuse, N.Y.), Feb. 10, 1947, p. 11
Kentucky had pulled to within 49 to 47.

Traverse City (Mich.) Record-Eagle, Dec. 13, 1947, p. 6
The Vikings, paced by their big center Johns, found the range in the third period and pulled to within a 23-16 count.

The vast majority of the "pull (to) within a score" citations from the 1930s onwards come from coverage of basketball, which was gaining quickly in regional popularity. It's not too surprising that basketball reporters would be the ones to develop this new usage, since the sport involves a great deal more game-score fluidity than relatively low-scoring sports like baseball, hockey and even football.

Finally, I wondered about the application of the "pulling (to) within" idiom to the political calculus of voting. Though I still haven't found much of a parallel to the "pull within a score" expression in the world of politics, citations for "pull (to) within N votes" are easy to spot by the 1940s:

Zanesville (Ohio) Signal, May 10, 1944, p. 1
Edging steadily upward, Atty. Gen. Thomas J. Herbert pulled to within 8,225 votes of James Garfield Stewart today in a stretch finish for the Republican nomination for governor.

Charleston (W. Va.) Gazette, June 25, 1948, p. 1
On the second [ballot] he [sc. Thomas E. Dewey] raided opposition camps, lassoed stray votes from delegation after delegation and pulled to within 33 votes of the glittering goal of 548.

The precise nature of these vote tallies suggests once again that "within" is used to denote an exact differential, though perhaps the 1944 example rounds up to the nearest 25. But if we return to the older variant of "pull up (to) within N," we can find citations in electoral contexts going all the way back to the late 19th century:

Bucks County (Pa.) Gazette, Sep. 7, 1882, p. 2
The sixty-seventh ballot put Evans 3 1/2 ahead of Weand, and on the sixty-eighth Weand pulled up within 1/2, he having 35 1/2, Evans 36, Thropp 29 1/2, Godshalk 15, and Bean 2.

Fitchburg (Mass.) Daily Sentinel, Oct. 27, 1883, p. 3
G. H. Kellogg said that Mr. Thayer was nominated a few years ago and pulled up within 300 of an election.

Frederick (Md.) News, Nov. 7, 1890, p. 3
The result of the canvas, however, so impressed itself upon the public mind that last year Mr. Russell again made the race and pulled up within 6,775 votes of Governor Brackett.

Again, it's possible that "within 300" rounds up the differential to the nearest 100 and "within 6,775" to the nearest 25. The citation from 1882, however, leaves no room for ambiguity: the difference between 36 and 35 1/2 is exactly one half. So it looks like this sense of "within" had already made the jump from sports to politics about 125 years ago. Marv Albert, you've been well and truly exonerated. ("Yesss!")
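The arithmetic behind these vote citations can be checked in a few lines. This is a minimal sketch: the tallies are copied from the citations above, while the variable names and the `is_multiple` helper are illustrative inventions. Note that a tally being a multiple of 25 or 100 is merely consistent with rounding; it does not prove that any rounding took place.

```python
from fractions import Fraction

# Sixty-eighth ballot, as reported in the 1882 Bucks County Gazette citation.
ballot_68 = {
    "Weand": Fraction(71, 2),    # 35 1/2
    "Evans": Fraction(36),
    "Thropp": Fraction(59, 2),   # 29 1/2
    "Godshalk": Fraction(15),
    "Bean": Fraction(2),
}

# "Weand pulled up within 1/2": the gap to the leader is exactly one half.
gap = ballot_68["Evans"] - ballot_68["Weand"]
print("Evans minus Weand:", gap)


def is_multiple(tally, step):
    """Hypothetical helper: True if a reported margin is a whole multiple of step."""
    return tally % step == 0


# Margins that might have been rounded (round figures are only suggestive).
print("8,225 multiple of 25:", is_multiple(8225, 25))
print("6,775 multiple of 25:", is_multiple(6775, 25))
print("300 multiple of 100:", is_multiple(300, 100))
```

Exact fractions are used instead of floating-point numbers so that the half-votes in the 1882 tally compare cleanly.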

Posted by Benjamin Zimmer at 05:59 AM

Some striking similarities

In 2004 and 2005, Geoff Pullum wrote a few Language Log posts about Dan Brown's style. I think that they're among the funniest bits of stylistic criticism since Mark Twain took on "Fenimore Cooper's Literary Offenses", and I'm not alone in being impressed. Two of these posts ("The Dan Brown code", May 1, 2004; and "Renowned author Dan Brown staggered through his formulaic opening sentence", November 7, 2004) are still generally among our top ten pages, and "The Dan Brown Code" has been the third or fourth Google hit for {Dan Brown} for some time. As a result of those posts, Geoff was invited to contribute to a collection called "Secrets of Angels and Demons", published in December of 2004.

Recently, Mark Steyn contributed a book review to the Canadian weekly magazine Maclean's, "The Da Vinci Code: bad writing for biblical illiterates". The online copy is dated May 10, 2006, and in paper form, the material appears on p. 54 of the May 15, 2006 issue. Steyn's piece is about 1300 words long. The first 550 words or so are about Dan Brown's writing; the rest is about the Gospel of Judas. If you read the first portion of Steyn's review along with the two Language Log posts that I've cited (here and here), I believe that you'll notice some striking similarities.

In this post, I'm going to limit myself to pointing out some of these similarities. I'll explain later what I think they mean. [Some opinions are now available here, here, and here.]

The first thing to observe is that Steyn cites Pullum:

The linguist Geoffrey Pullum -- or linguist Geoffrey Pullum, as novelist Dan Brown would say -- identifies this as the anarthrous occupational nominal premodifier, to which renowned novelist Dan Brown is unusually partial.

The reference is to Brown's habit of starting books with phrases like "renowned curator Jacques Saunière", "physicist Leonardo Vetra", and "geologist Charles Brophy". Roughly 400 of Steyn's 550 words on Dan Brown are focused on this intrusion of journalistic style into Brown's novels. Steyn's wording suggests to me that he is giving Geoff credit for the grammatical terminology, not for the stylistic observations or the selection of examples. The reference to Pullum comes after two paragraphs describing two anarthrous examples of Brown's style (out of the three that Steyn quotes), which are presented as if the stylistic observation were Steyn's original reaction as a reader:

So I didn't like the title and then I began reading the book. In the beginning was the word, and Mr. Brown's very first one seems to have gone missing:

"Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery."

And after that I found it hard to stagger on myself. Shouldn't it be "The renowned curator"? What happened to the definite article? Did Mr. Brown choose to leave it off in order to affect an urgent investigative journalistic style? No, it's just the way he writes. Here's the first sentence of Angels & Demons:

"Physicist Leonardo Vetra smelled burning flesh, and he knew it was his own."

The key joke in Pullum's two cited posts was the observation that this phrasing (which Pullum calls "an occupational term is used with no determiner as a bare role NP premodifier of a proper name") is characteristic of journalism and never normally used in fiction, and that Dan Brown nevertheless uses it to start several of his novels. In his Language Log post "Renowned author Dan Brown staggered through his formulaic opening sentence", Pullum illustrates this point by discussing, in order, three quotes from Brown:

1. Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery.
2. Physicist Leonardo Vetra smelled burning flesh, and he knew it was his own.
3. Death, in this forsaken place, could come in countless forms. Geologist Charles Brophy had endured the savage splendor of this terrain for years, and yet nothing could prepare him for a fate as barbarous and unnatural as the one about to befall him.

Steyn's 400-word discussion of Brown's anarthrous style is also structured around the discussion of these three quotes. He elides the last phrase of the last quote, but otherwise he gives the same quotes in the same order, supplying no other examples:

1. "Renowned curator Jacques Saunière staggered through the vaulted archway of the museum's Grand Gallery."
2. "Physicist Leonardo Vetra smelled burning flesh, and he knew it was his own."
3. "Death, in this forsaken place, could come in countless forms. Geologist Charles Brophy had endured the savage splendor of this terrain for years . . ."

Following the last Dan Brown quote, Steyn produces a real zinger of a witticism:

Novelist Dan Brown staggered through the formulaic splendour of his opening sentence.

I'm not the only one who was impressed:

(link) And on a lighter note, I always enjoy Mark Steyn. This is great on the Da Vinci Code: "Novelist Dan Brown staggered through the formulaic splendour of his opening sentence."

Steyn's bon mot was also posted (as the example sentence in a fake Word For The Day post for "anarthrous") on May 11 on Free Republic.

Steyn's witticism is strikingly similar to the language of Pullum's post "Renowned author Dan Brown staggered through his formulaic opening sentence", as displayed in the table below.

  • First column: the first sentence of The Da Vinci Code, by Dan Brown. Words not reproduced in Pullum's two parodies of it are on a blue background.
  • Second column: the title of Pullum's post, parodying Brown. It is colored lilac where it echoes Brown, and pink where words are replaced by others in the parody.
  • Third column: Pullum's jokey echoing of his own title near the end of his piece, mingling Saunière's "staggering" (from The Da Vinci Code) with Brophy's "savage splendor" (from Deception Point). It is colored green where it does not exactly echo the title (column 2) and pink where it does.
  • Fourth column: Mark Steyn's blend of Pullum's two sentences. Yellow signals words that are original with Steyn (there is just one: he replaces "author" with "novelist"); lilac signals words carried through all versions; pink indicates copying from Pullum that is not carried over from Brown.
      Dan Brown's sentence   Pullum's parody   Pullum's repetition   Steyn's blend
  1   Renowned               Renowned          Renowned              -
  2   curator                author            linguist              Novelist
  3   Jacques                Dan               Geoff                 Dan
  4   Saunière               Brown             Pullum                Brown
  5   staggered              staggered         staggered             staggered
  6   through                through           across                through
  7   the                    his               the                   the
  8   vaulted                formulaic         savage                formulaic
  9   archway                -                 splendor              splendour
 10   of                     -                 of                    of
 11   the                    -                 the                   his
 12   museum's               opening           Santa Cruz            opening
 13   Grand Gallery          sentence          campus                sentence
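The word-by-word comparison in the table can also be sketched mechanically. The snippet below is only an illustration, not part of the original analysis: the three sentences are read off the table's columns, the `words` helper is a made-up name, and treating "splendour" and "splendor" as the same word is an assumption made for the comparison.

```python
def words(sentence):
    """Lowercase a sentence, strip punctuation, and normalize one spelling variant."""
    cleaned = [w.strip(".,\"").lower() for w in sentence.split()]
    # Assumption: British "splendour" counts as the same word as "splendor".
    return ["splendor" if w == "splendour" else w for w in cleaned]


pullum_title = ("Renowned author Dan Brown staggered through "
                "his formulaic opening sentence")
pullum_repetition = ("Renowned linguist Geoff Pullum staggered across "
                     "the savage splendor of the Santa Cruz campus")
steyn_blend = ("Novelist Dan Brown staggered through the formulaic "
               "splendour of his opening sentence.")

# Every word that occurs in either of Pullum's two sentences.
pullum_vocabulary = set(words(pullum_title)) | set(words(pullum_repetition))

# Words in Steyn's sentence that occur in neither of Pullum's sentences.
new_in_steyn = [w for w in words(steyn_blend) if w not in pullum_vocabulary]
print(new_in_steyn)
```

A bag-of-words check like this ignores word order, so it is much cruder than the aligned columns of the table; it only picks out which words have no counterpart in either of Pullum's sentences.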

 

[Seven of Geoff's Dan Brown posts are reprinted in our recent book, but of course they are still available in the Language Log archives, along with some others not yet reprinted:

"The Dan Brown code" (May 1, 2004)
"The sixteen first rules of fiction" (May 15, 2004)
"Dan Brown still moving very briskly about" (November 4, 2004)
"Renowned author Dan Brown staggered through his formulaic opening sentence" (November 7, 2004)
"Oxen, sharks, and insects: we need pictures" (November 8, 2004)
"Thank God for film: Dan Brown without the writing" (December 2, 2004)
"The kaleidoscope of power" (January 18, 2005)
"Learning the ropes in the trenches with Dan Brown" (July 14, 2005)
"Don't look at their eyes!" (July 19, 2005)
"A five-letter password for a man obsessed with Susan" (September 10, 2005)

For the many writers in need of material to deal with the imminent opening of The Da Vinci Code movie, there's a lot of good stuff in there that hasn't been re-used yet. ]

Posted by Mark Liberman at 12:21 AM

May 14, 2006

Think this

I recently mentioned that I had read a VF article about Steve Jobs. This led me to check out a recent book from my local public library: iCon / Steve Jobs: The Greatest Second Act in the History of Business, by Jeffrey S. Young and William L. Simon. On p. 236, I found this interesting little passage:

The new Apple billboards, spare and stunning, with a simple message of "Think Different," sprang up everywhere, even painted on the sides of buildings, announcing a fresh start for the company. They boosted employee morale. It didn't matter that the phrase was gratingly ungrammatical; maybe that was even part of its charm.

The "gratingly ungrammatical" bit here depends on two assumptions:

  1. Different is supposed to be an adverb that modifies the verb think.

  2. Different is an adjective; the appropriate adverb would be differently.

So, according to Young and Simon, the message on the billboard should have been "Think Differently". Hmm.

First, it's not clear that different is supposed to be an adverb in this case. Apple started this ad campaign after Steve Jobs' return in 1997; he had been ousted from Apple some 12 years earlier. Those 12 years saw Apple hit its lowest point, with little in the way of the kind of successful product innovation it's now well-known for. Jobs was always a maverick-type, wanting to do what nobody else dared to do (and sometimes failing at it, of course). Different in this case could thus easily be interpreted as a kind of object of think, as if in answer to the question: "What is the one word you think of when you think of Apple and Steve Jobs?" The answer could be "Innovative", or "Awesome", or "Different" -- hence, Think Different.

Also, distinctions between many adjective/adverb pairs have been slowly but surely eroding in English. Different/differently is among these pairs; the OED lists different as an adjective or an adverb, in the latter case meaning the same thing as differently and with the caveat "Now only in uneducated use." I think the erosion has gone so far that the "educated/uneducated" distinction made in this OED usage point comes close to simply separating pedants from most other folks; thus, the ad campaign benefitted from the slight double meaning: Apple thinks different(ly), and (therefore) Apple is different.

(In anticipation of the rumors that will no doubt begin flying about: no, this is not the adverb that David Beaver and I came to blows over. But if David has any problem with this post, he knows where he can bring it.)

[ Comments? ]

Posted by Eric Bakovic at 06:05 PM

Sighted in the wild

An Escher sentence sent in by Don Blaheta: "Don would know how much more true that is than I do!"

Apparently it's something about the month of May:

We cannot/must not understate/overstate ... ? (5/6/2004)
Escher sentences (5/7/2004)
An Escher sentence in the wild (5/8/2004)
Approximate inference and global (in)coherence (5/9/2004)
High plains construction grammar (5/12/2004)
Escher sentences: prior art (5/15/2004)
What is linguistics, and why do they embarrass your international customers? (5/28/2004)
Things that are rarely better than they normally are (10/17/2005)
Asking Dr. Language Log (5/12/2006)

[Update: Marilyn Martin writes

In the '40s an issue of the Reader's Digest had a short section on what they called "wolf sentences."

Two that I remember are

I feel a lot more like I do now than I did before.
Please permit those who are going out first.

Google and the Proquest historical New York Times database both appear to be ignorant of the phrase "wolf sentence(s)": can anyone supply more information? ]

Posted by Mark Liberman at 09:23 AM

May 13, 2006

Mitsuwhatzit


Last week I looked at mispronunciations and misspellings of the name Mitsubishi, in particular Mitsubushi (the winner in the misspelling bee) and Mitsibushi and Mitsibishi (the runners-up), arguing that the problem presented by Mitsubishi isn't in nativizing the Japanese name, but in remembering and retrieving the name correctly.  Then came the mail (it's always like that here at Language Log Plaza): about actual problems in nativizing words (mostly from Japanese), about Japanese car names and their etymologies, about factors that might have helped boost Mitsubushi to the top of the heap, and about still more manglings of Mitsubishi.  Here's the digest.


Problems in nativization (not all of which I understand).  Bill Poser, who has the cubicle just down the row from mine at LLP, wrote to wonder why English speakers so frequently mispronounce (and sometimes misspell) harakiri (as hari-kari) and karaoke (as karioki). 

Part of this -- the [i] instead of [e] at the end of karaoke -- is easy.  Final unaccented [e] is at best marginally acceptable in English, and is normally "fixed" by raising it to [i]; this sometimes shows up in spelling, in the variant karaoki.  You see raising not just in karaoke, but also in, for instance, karate, in Hare Krishna, and in some borrowings from Italian, like the salami and zucchini (Italian salame and zucchine) that M. I. Amorelli asked about (from Sardinia) on ADS-L back in April.

The second vowel of karaoke -- which is sometimes spelled karioki, in line with its most common pronunciation in English -- is a bit trickier.  This unaccented vowel would be expected to come out as a schwa, giving a sequence of vowels that isn't actually unpronounceable in English (it occurs in supraorbital) but is very rare.  So possibly that [i] is just a fix in the direction of a better unaccented vowel before [o].

[Added later 5/13/06: Aaron Dinkin writes to remind me that (unaccented) schwa generally gets raised to [i] before a vowel, as in Judaism and the three-syllable version of Israel, so karioki is just what you'd expect.  Words like supraorbital and (six-syllable versions of) extraordinary don't show raising, because they're "level 2" morphological combinations (in Kiparsky's terms).]

Harakiri > hari-kari is more puzzling to me.  Something like para-teary seems entirely pronounceable to me; it's just an absurd combination of elements.  As it happens, NOAD2 gives a straightforward nativization of harakiri as the first pronunciation, but then admits that the rhyming pronunciation -- the one I hear from everybody except pedants and people who actually know something about Japanese -- is also possible.  AHD4 goes a step further, and gives hari-kari as an alternative SPELLING for the word.  The correct A A I I spelling outnumbers the rhyming version A I A I about 5 to 1 in raw Google webhits, but we're still talking lots of A I A I spellings. 

Google also turns up some A I I I spellings (with the second vowel anticipating the two I's -- and [i]'s -- that follow), about one-third as frequent as the A I A I.  But this version might have served as an intermediate step from A A I I to A I A I: first anticipation, in A I I I, then an improvement of this into the satisfying rhyming pattern of A I A I.
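The vowel-pattern shorthand used here (A A I I, A I A I, A I I I) can be read off a spelling mechanically. The following is a minimal sketch; the function name `vowel_pattern` is made up for illustration, and it looks only at the vowel letters of a spelling, not at pronunciation.

```python
VOWEL_LETTERS = set("aeiou")


def vowel_pattern(spelling):
    """Return the vowel letters of a spelling, uppercased and space-separated."""
    return " ".join(ch.upper() for ch in spelling.lower() if ch in VOWEL_LETTERS)


# The three spellings discussed above; hyphens are simply skipped.
for spelling in ["hara-kiri", "hari-kari", "hari-kiri"]:
    print(spelling, "->", vowel_pattern(spelling))
```

The same function applies to the Mitsubishi variants: for example, `vowel_pattern("Mitsubushi")` gives the I U U I pattern.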

I know, some of you are thinking that the baseball announcer Harry Caray (1914-98), whose name was pronounced just like hari-kari, must somehow be involved here.  But no, as a quick trip to OED2 shows.  The first OED cites under hara-kiri in fact have the spelling hari-kari, and this is in 1856, 1859, and 1862, surely before Harry Caray's PARENTS were born.  (By the way, H.C. was born Harry Christopher Carabina.)  We don't get "correct" spellings until 1871.  In 1888, we get one of each of these spellings, plus the possibly intermediate version hari-kiri.  So whatever is going on here, it's been going on for a very long time.

Japanese car names.  Several correspondents have pointed out that Mitsubishi is a meaningful compound in Japanese: mitsu 'three' plus hishi 'diamond' (in its voiced variant bishi).  The three diamonds are visible in the company's logo.  This has damn-all to do with the pronunciation or spelling of the name, but it's still entertaining.  (Even cooler is Bill Poser's observation that karaoke is also a compound: kara 'empty', as in karate, literally 'empty hand', plus oke, which is, wonderfully, a borrowing of English orchestra, somewhat truncated.  So orchestra traveled to Japan as oke and then came back inside karaoke.)

One other Japanese car name, Isuzu, gives trouble for English speakers.  Here the vowels are fine, but the S Z sequence is problematic.  As one correspondent pointed out, you'll hear the reversed Izusu (even in some old Isuzu commercials!), and occasionally the assimilated Izuzu.  Here the trick is to explain why Isuzu is troublesome but Suzuki is not.  Probably something to do with the voicing of [s] (when spelled with a single S) between an unaccented vowel and an accented one, as in presume.

Facilitating factors.  Back to Mitsubishi.  Several correspondents have suggested things that might tip the scales in favor of I U U I, over its closest competitors I I U I and I I I I.  The most common suggestion is that bushido, literally 'warrior's way' (bushi + do:) and referring to the code of the samurai, favors bushi (U I) in the second half of the name.  Of course, both I U U I and I I U I have bushi, but I U U I has the extra advantage of preserving the first half, I U, of the original.

More important, I'm doubtful that the word bushido is widely known among English speakers, even though it did make it into both AHD4 and NOAD2.  And even more doubtful that many speakers appreciate that bushi is a significant piece of the word bushido -- though there are Bushido Blade video games, and people who play these games are likely to know bushi meaning 'warrior'.  In any case, I think that the most bushi(do) could have done is helped I U U I a little bit.

One correspondent did suggest the word sushi (which is probably the Japanese word most widely known to speakers of English -- outside of brand names, of course) as a factor facilitating I U U I.  This has some plausibility.

Finally, one ADS-L poster suggested that I I U I should be favored because of the English proper name Mitzi.  It is true that of mitsi, mitsu, mutsi, mutsu, bishi, bishu, bushi, and bushu, only the first is pronounced like a generally recognizable English word (though bushi is not too far from bushy).  There was a general feeling on the ADS-L that Mitzi is now a name of too little currency to have much influence in reshaping word forms.

More manglings.  One further correspondent reported that a pronunciation with two [š]'s -- Mishybishy, as he represented it -- "was quite prevalent in Charlottesville VA a decade ago" and that he'd never heard that version anywhere else.  Here we have an anticipation, in the second syllable, of the [š] in the fourth syllable.  Plus the shift to the I I I I pattern.

It turns out that the spelling Mishibishi gets a modest number of relevant Google webhits, and they seem to come from all over the place, including Australia.  Mishubishi, preserving the original vowel pattern, gets even more.

No doubt there are more.  Well, yes, there are a few occurrences of Misubisi.  And Michubichi.  But it's time for me to put the Mitsubishi file away.  I can barely spell the word myself any more.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:53 PM

The geopolitical significance of sentence-final prepositions

At last, a rational argument about why it's a bad idea to end a sentence with a preposition. This long-awaited explanation turned up during a Daily Show bit about an Army essay contest.

Jon Stewart introduces the subject:

It's obviously no secret that the military has been having a difficult time fighting the Iraqi insurgency, but help is on the way. The U.S. Army has recently sponsored a civilian-only essay contest called "countering insurgency", to solicit ideas from the public on how best to defeat the Iraqi insurgency. According to Army journal Military Review, "nothing less than the future of the civilized world might depend on it." Oh, and uh please double space.

After a bit, he brings on John Hodgman, who reads part of one of the submissions, and complains about misuse of the word "literally".

Jon Stewart: All right, John.
Uh, you're reading these counter-insurgency essays for grammar.
John Hodgman: No, and also style and usage. I mean,
you can't fight a War on Terror if you're ending a sentence with a preposition.
Jon Stewart: Wh- Why is that?
John Hodgman: Uh, in the Middle East, that's seen as a sign of weakness.

In addition to critiquing grammar, style, and usage, Hodgman also offers some advice on rhetoric:

Jon Stewart: As to what would be an example of a good essay?
John Hodgman: Well, when it comes to writing an expository essay about counter-insurgent tactics, I'm of the old school.
First you tell them how you're going to kill them; then you kill them; then you tell them how you just killed them.

The contest is for real, and the winner should be announced soon. But Stewart is not quite right in calling this a "civilian-only" contest. The call says:

Anyone conducting serious research on issues related to counterinsurgency is invited to submit papers for consideration. However, papers written by government employees during the course of their government employment are not eligible.

I understand this to mean that active-duty military personnel could enter the contest, as long as the submitted paper was done on their own time.

Hodgman's argument -- or rather a slightly modified version of it -- is the only valid argument for thinking twice about sentence-final prepositions. However, we need to delete the reference to the Middle East: the people who irrationally see sentence-final prepositions as a "sign of weakness" are mostly Americans, I think, even though this particular superstition originates with a catty remark that John Dryden once made about a line in one of Ben Jonson's plays.

[Tip of the hat to John McChesney-Young, who sent a link to the Daily Show piece and supplied the title.]

[If your browser or OS has trouble with the Daily Show link given above, a YouTube link is here.]

Previous Language Log posts on phrase-final prepositions:

An Internet Pilgrim's Guide to stranded prepositions (4/11/2004)
A Churchill story up with which I will no longer put (12/8/2004)
A misattribution no longer to be put up with (12/12/2004)
Better a spectacular blunder than a hint of unseemliness (4/25/2005)
Ending with a preposition (5/17/2005)
More on Canadian French preposition stranding (5/21/2005)
Who are you writing to? (6/2/2005)
If we look, simply, to the French (6/29/2005)
The French aren't really against (6/30/2005)
Avoidance (7/5/2005)
New Yorker search engine stark staring mad (9/20/2005)
Churchill vs. editorial nonsense (11/27/2005)

[Update: Kathleen Burt emailed

To me, "during the course of their government employment" means "while they are still employed by the government" or to put it another way, a person may not submit a paper while still being a government employee.

I feel that this is the plainest interpretation, and also (a separate consideration) likeliest to be correct. I base this on having been, for most of my working life, a government employee. What we would have said, in order to convey what you understand by that sentence, is "while they are on the clock," although of course, salaried employees don't actually clock in.

Kathleen's interpretation may very well be true, though in other contexts the pattern {"in the course of their * employment"} clearly does mean something like "while they are on the clock" or "as part of their duties". Thus

Under the NHS indemnity for clinical negligence, NHS bodies agree to meet the cost of claims for negligent harm caused by NHS staff in the course of their NHS employment, including their involvement in clinical trials.
Employees who lose or damage personal property in the course of their County employment may process a claim for reimbursement through the Claims Review Board as provided for in the Kern County Administrative Procedures Manual.

Presumably NHS employees who have car wrecks while shopping are on their own, as are Kern County employees who put their iPods through the laundry. In any case, the wording in the contest restriction doesn't distinguish between military and civilian government employees.

Another question, though: is "government" restricted by context to "U.S. Federal government", even though there is no explicit limitation? ]

Posted by Mark Liberman at 06:24 AM

May 12, 2006

Water may or may not run through it

Ten years ago, when I moved from the flatland East to the mountains of Montana, I had to learn some new language. I asked my wife, a native Montanan, what people call those V-shaped indentations in a mountain a few blocks from our home. She patiently explained that they are gulches. I had heard other people refer to these as ravines and gullies, and I needed to know the proper term. "How do you differentiate these?" I stupidly inquired. "Gulches have water running through them," she replied.

That seemed like an okay answer until I realized that I had never seen any water running down these gulches. I also remembered visiting Helena's Last Chance Gulch, where there was absolutely no water in evidence. It's now a sort of toney business center in that town. "Oh, there used to be a creek running down it," was her answer. So now I learned that a gulch is a formation where there is now or used to be water running through it, but I'm not sure how an outsider is supposed to know this stuff. There are lots of things called canyons in Montana too, most of which have creeks running down them, so I asked her what differentiated a gulch from a canyon. "Size," she replied. "Big ones are canyons. Little ones are gulches." My next obvious question was, "How big is big?" No answer. Montanans know these things, I guess. Easterners do not. Then there are ravines, which appear to be smaller than canyons but fit the same description: water formed them but it's not necessary for the water to be there now. A gully's formation stems more from a downpour than a creek. What I can't figure out is how I'm supposed to know whether or not water formed these things originally and whether it's still doing it.

Now, in today's local paper, The Missoulian (see here), I see a story about a lawsuit centering on the terms slough and ditch. Here's what happened. Rivers tend to migrate here in Montana, changing their courses every once in a while. When this happens, a slough develops near where the rivers used to flow. A man who owns the property where a slough formed did some channel-changing construction, stocked it with trout, and barred others from fishing in it (a serious problem in this fishing paradise state). It now fits the definition of a ditch. Here's what the newspaper said:

District Judge Ted Mizner, in a long awaited, closely watched decision, has ruled that the slough, which draws most of its water from the Bitterroot River, is "no longer a natural water body." "Perhaps as early as 130 years ago, the Mitchell Slough may well have been considered a natural water body under the Stream Access Law," wrote Mizner..."However, it cannot be seriously disputed that through natural processes the Bitterroot River has migrated to the west and its bed is substantially lower than the bed of the Mitchell Slough."

The judge's decision was a triumph for property owners. So now what used to be a slough has been renamed a ditch. The judge's ruling created a storm of protest from both conservationists and fishermen. They believe the stream access laws allow the public to use that slough. Objections were made that it was the extensive construction done by the property owner that caused the channel to change so much that it's no longer a natural part of the river, redefining it in the process as a ditch. The judge admitted this but still ruled that the water therein is no longer any part of the Bitterroot River.

Although the legal battle here is about whether a man's private construction "improvements" can turn a slough into a ditch, in the process turning a public water access into private property, there seem to be some lexical issues here as well. It's probably okay for me to be confused about the terms we use to describe nature's rearrangements of the landscape, such as gulches, ravines, canyons, and gullies, but I wonder about the right of humans to convert and therefore rename nature-made sloughs into man-made ditches. Maybe, like trademarks, this is another example of law's attempts to control our language.

Posted by Roger Shuy at 01:47 PM

What a difference a comma makes

Pop quiz! In an NYT article entitled "Bolivian Says He Won't Pay Energy Companies", Bolivian President Evo Morales was quoted in one of the following two ways. Guess which one. (Answer below the fold.)

(1) "What we are looking for are partners, not bosses, that exploit our natural resources," Mr. Morales said.

(2) "What we are looking for are partners, not bosses that exploit our natural resources," Mr. Morales said.

Answer: (1), but I'll bet that Mr. Morales actually said (2) (more accurately, I'll bet that he said something in Spanish that means roughly the same thing as (2) as opposed to (1)). In case the distinction in meaning that I'm thinking of here is not clear, here's what I think each of the sentences above means -- and I trust this makes clear why I think Mr. Morales actually said (2), not (1).

(1') What we are looking for are partners that exploit our natural resources, not bosses that exploit our natural resources.

(2') What we are looking for are partners that do not exploit our natural resources, not bosses that exploit our natural resources.

I think that (1') is entailed by (1), while (2') is merely (though quite strongly) implicated by (2).

I'm also willing to bet that the crucial additional comma in (1) was added by a writer or editor, and that it does not reflect what Mr. Morales's interpreter said. The reason I think that is because if the interpreter had said something like (1), I think s/he would have said (1'') instead:

(1'') What we are looking for are partners, not bosses, to exploit our natural resources.

Deceptively small difference between that in (1) and to in (1''), isn't it? But the basic meaning in (1') is conveyed much better by (1'') than by (1), it seems to me.

(Readers may recall that I commented on how weird I find references to President Morales simply as "Bolivian" here; it appears this sort of thing is more common than I'd thought.)

[ Comments? ]

Posted by Eric Bakovic at 01:08 PM

Asking Dr. Language Log

In this morning's mail:

Dear Dr. Log:

A full page ad for the Red Cross in the May 15, 2006 issue of The New Yorker has the following headline

What if harm's way was headed yours?

Why is this so jarring? It doesn't seem to be a straightforward syntactic problem:

What if John's car was blocking yours?

One hypothesis is that "way" has different meanings in "harm's way" -- path that harm is following -- and "your way" -- path towards you, and this mismatch interferes with ellipsis reconstruction (cf. Andy Kehler's thesis). Or is it something simpler?

Referentially Challenged, Philadelphia

A few minutes later, Prof. Challenged wrote back with "something simpler":

The problem is that "Harm's way" is not a sensible subject for "was headed".

He also suggested something more complicated, namely that the Red Cross question might be a kind of Escher sentence.

It's a treat to have correspondents who write in with interesting questions, and then write back with answers. This could become a regular Language Log feature.

[Credit for this Q&A belongs to Fernando Pereira]

[Update: Joe Malin points out that "headed yours" is a bit of old radio operator shorthand, for example:

One memorable Mason wireless dispatch: "Twenty-five torpedo bombers headed yours." The message cost the Japanese Imperial Navy every one of those airplanes, save one. [emphasis added]

So maybe the apparent ellipsis in this case is actually pragmatically-controlled anaphora. An argument against: {"headed yours"} gets only 27 Google hits.]

[Update #2: Paul Kay writes

I think harm's way has to be the object of a preposition, perhaps only one of {in, out of, from}. Also this is one of those PPs that can only be used predicatively:

*The platoon was foolishly relaxing in harm's way.
The platoon was foolishly relaxing, while [they were] in harm's way.
*[I hate it here.] Harm's way is a shitty place/Harm's way sucks.

[Well, maybe "Harm's way sucks" could work as a jeu de mots. It would require a lot of previous context.]

I.e., the problem seems more general than that harm's way isn't a proper subject for headed. I don't think it's a proper subject at all.

Also, it's one of those closely bound PP objects that resist extraction:

??Harm's way, I don't want my son to be put in.
??We reluctantly put the platoon in harm's way that couldn't be avoided.
We reluctantly put the platoon in danger/a dangerous position that couldn't be avoided.

I think Paul is right.

Here's a curious thing. In the large part of English poetry indexed by LION, there are 23 instances of "harm's way", of which 21 are "out of", 1 is "in", and 1 is prepositionless. The lone bare example is the 2nd through 4th lines of Charles Simic's "Ballad" (from Return to a Place Lit by a Glass of Milk, 1974):

A little girl picking flowers in a forest
The migrant's fire of her long hair
Harm's way she comes and also the smile's round about way

(Simic is apparently playing with the fact that we normally say "coming my way" or "coming Paul's way" but not "coming harm's way" -- despite the one web citation "without the Nova Scotia Health Research Foundation, all health research in Nova Scotia will come harm's way." This is the same wordplay as in the ad slogan Fernando cited.)

In quantitative contrast, on the web there are roughly twice as many examples of "in harm's way" as "out of harm's way":

out of 716,000
in 1,500,000
into 225,000
from 48,200
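
As a quick check on the "roughly twice" figure, here is a small Python sketch that computes the ratio and each preposition's share of the total from the four counts listed above. The names `hits`, `ratio`, and `shares` are my own, and the counts are approximate search-engine estimates, so the result is indicative only.

```python
# Approximate web hit counts for "<preposition> harm's way", as listed above.
hits = {"out of": 716_000, "in": 1_500_000, "into": 225_000, "from": 48_200}


def ratio(a, b):
    """Return how many times more hits preposition a has than preposition b."""
    return hits[a] / hits[b]


def shares():
    """Return each preposition's share of the combined hit total."""
    total = sum(hits.values())
    return {prep: count / total for prep, count in hits.items()}


if __name__ == "__main__":
    print(f"in vs. out of: {ratio('in', 'out of'):.2f}x")
    for prep, share in shares().items():
        print(f"{prep:>6}: {share:.1%}")
```

The ratio of "in" to "out of" comes out a little above two, which is consistent with "roughly twice as many".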

Without looking into it, I believe that this represents an idiomatic preference for boldness over protectiveness, perhaps connected to this idiom by the echoes of John Paul Jones' famous remark "I wish to have no connection with any ship that does not sail fast; for I intend to go in harm's way." Jones did not invent the idiom, however -- the OED gives citations for "out of harm's way" from 1661, and "in harm's way" from 1677:

a1661 FULLER Worthies (1840) I. xviii. 61 Some great persons..have been made sheriffs, to keep them out of harm's way.
a1677 T. MANTON Serm. Psalm cxix, civ in Wks. (1872) VIII. 5 To stand nicely upon terms of duty is to run in harm's way.

The web offers a few semi-convincing examples of "harm's way" with other prepositions:

Well, maybe those hurricane shutters can wait until 2007 - after all you slipped by harm's way in 2005, and maybe you'll do the same in 2006, right?
When he had driven Hood beyond harm's way, he returned and made all haste to put his army in readiness for the march to the sea.
Can you lead these 5 other men, and yourself, through harm's way, intentionally, and come out alive on the other side?
...he wanted to create a robotic spy plane that could fly above harm's way at altitudes above 60000 feet.

And there was an episode of the cult TV series Angel named Harm's Way (episode 9, season 5), which excuses various otherwise-odd word sequences on the web.

]

Posted by Mark Liberman at 07:54 AM

Mock Spanish or Mock Mock Spanish?

When the news broke that Cingular Wireless had revoked a cell-phone ringtone featuring Mock Spanish in a poorly conceived joke about border-crossing, I rattled off a post that suspected "racist intent" at work behind the ringtone (which had already drawn the ire of Latino advocacy groups). The post was based on early reports filed before anyone had investigated the origin of the offending ringtone, in which the Southern-sounding voice of an agent for "La Migra" (the Border Patrol) says, "I repeat-o, put the oranges down and step away from the telephone-o. I'm deporting you back home-o." Now we know more about the source of the ringtone, and the details not only undermine some of my initial assumptions but also raise a whole new set of questions to ponder.

The AP has reported that the "Migra" ringtone was the work of Mexican-American comic Paul Saucido. The company that developed the ringtone, Barrio Mobile, has apologized but has said that it was intended as a work of satire. A spokesman for the ringtone's distributor is quoted as saying of Saucido, "His position is that people of Hispanic background need to maintain a sense of humor about the immigration situation."

An interview with Saucido on Gearlog provides some further insight:

The ringtone, "La Migra Alert," is Saucido pretending to be an immigration official with a really bad fake Southern accent, saying, "I'm deporting you back home-o."
The character came from a brainstorming session between him and a few other Latino comedians, Saucido said, citing Dave Chappelle and Carlos Mencia among his comedic influences.
"It was inspired by other comedians who riff on the same stereotype of the immigration officer ... you know how people try to phonetically speak when they talk down to you, like, 'where is the bathroom-o?'" he said.
The ringtone came as part of a package of comic ringtone characters developed by Saucido, including a hovering, novela-obsessed Mexican mom, a Mexican dad, and a "barrio kid" who would say "I can't make it to the phone right now, I'm busy rotating the tires on my low-rider." All of Saucido's ringtones have been removed by Cingular, he says.
"I think because of the times, right now people are a little extra sensitive [about immigration issues,]" he said. "I'm sensitive to this issue! But people obviously leave their senses of humor behind when they get so much fever in them. I thought the Migra character was the last character that would get that kind of reaction."
Saucido says there's "absolutely" room for edgy comedy in the ringtone world.
"I've played it for all my friends and they love them - they're waiting for them to be sold, and they're like, where can we buy them?" he said. "These companies have got to have some backbone to say we bought this content, we believe in it, and we're not going to get rid of it just because the first advocacy group calls racism. Dude, everyone that produced them and worked for them - we're all Mexican!"

Satire's a very tricky thing, and context is key. If Saucido had presented the Mock Spanish of the "Migra" character in the context of a standup routine, his audience would be prepared to hear lines like "I'm deporting you back home-o" as the work of a Latino comic parodying "how people try to phonetically speak when they talk down to you" — Mock Mock Spanish, if you will. But the discursive frame of a cell-phone ringtone is wildly different from a standup act. Saucido's voice was recorded, disembodied from the original context of utterance, and commodified in the form of a downloadable audio file (Cingular ringtones cost $2.49 a pop). And the joke was further decontextualized in news reports about the controversy — especially since the ringtone had already been pulled from the Cingular site, leaving accounts in the print media as the only representations of the original routine. Despite the comedian's amazement at the backlash against the ringtone, it's not too surprising that the radical recontextualization of Saucido's work would lead many observers (including me) to miss the intended satire completely.

Saucido's original bit relied crucially on its own type of recontextualization: it took the condescending use of Mock Spanish by immigration officials and reframed it as satirical social commentary through the mimicry of standup comedy. But the misinterpretation of the joke once it was let loose on the world as a downloadable ringtone only demonstrates the unpredictable effects of recontextualization (and re-recontextualization, and re-re-re...), particularly in a highly charged political atmosphere such as we currently find surrounding the issues of immigration and Spanish language use in the United States. Po-mo types talk about these pragmatic pitfalls in Derridean terms like "iterability" and "citationality" (see, for instance, the work of Judith Butler, who has written extensively on the difficulties of reining in "subversive resignification"). Whatever one's theoretical outlook, the controversy over the "Migra" ringtone would make a fascinating case study in misconstrued speech acts and failed performativity.

Posted by Benjamin Zimmer at 01:11 AM

May 11, 2006

Good story, bad headline

Since we often criticize journalists here on Language Log, I try to praise good reporting on language-related issues when I can find it. And Rafaela von Bredow's May 3 story about Dan Everett and the Pirahã, in Spiegel Online, is very good. She explains the facts, the interpretations and the issues in a clear and readable way. Unfortunately, her work is spoiled by a seriously misleading headline and sub-head -- which I'm sure that she didn't write. As Nicole Stockdale explains, "[h]eadlines are written by copy editors, who battle deadlines to clean up or rewrite the reporter's copy, massage it, then crown it with a spiffy headline". When the editor doesn't understand the story, the result is what you'd expect.

In this case, some anonymous Spiegel editor gave Rafaela von Bredow's story the title "Living without Numbers or Time", and the sub-head "The Pirahã people have no history, no descriptive words and no subordinate clauses. . ." It's true that the Pirahã lack number words, but it's false that they "[live] without Time". It's apparently true that they have no subordinate clauses, but false that they "have no history [and] no descriptive words".

So three of the five cited facts about the language are wrong. That's 40% correct, a failing grade by any reasonable standard. Good stories are often spoiled by bad headlines -- isn't it past time to do something about this dysfunctional aspect of journalistic culture?

[Before going on, I should note that the body of Bredow's article is marred by a couple of unfortunate phrases, like

What the tribesmen didn't realize, however, was that Everett, a linguist, was eavesdropping, and he could already understand enough of the Amazon people's cacophonic singsong to make out the decisive words. [emphasis and dictionary links added]

The idea that the Pirahã communicate via "harsh and unpleasant monotonously rising and falling inflections" is a value judgment added by the reporter or her editor, and it ill behooves a speaker of the much-maligned German language to sling around words like cacophonic. However, there are only a few issues of this sort in the body of the article, and in my opinion they don't spoil its generally clear and insightful presentation of the basic facts and issues.]

For those interested in the aspects of Pirahã treated in the headline and subhead, the stuff about numbers is well covered in the links given here. As for subordinate clauses, Everett does argue that Pirahã lacks them, as has also been claimed for several other human languages. (Here's a sketch of what English might be like if it worked that way.)

With respect to time and descriptive words, I'll quote a few relevant passages from Daniel L. Everett, "Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language", Current Anthropology, Volume 46, Number 4, August-October 2005 (a preprint is available for those without access to a subscription).

Dan's discussion of time and tense does suggest that something a bit unusual might be going on:

I have argued elsewhere (1993) that Pirahã has no perfect tense and have provided a means for accounting for this fact formally within the neo-Reichenbachian tense model of Hornstein (1990). This is an argument about the semantics of Pirahã tense, not merely the morphosyntax of tense representation. In other words, the claim is that there is no way to get a perfect tense meaning in Pirahã, not merely an absence of a formal marker for it. Pirahã has two tenselike morphemes, -a 'remote' and -i 'proximate'. These are used for either past or present events and serve primarily to mark whether an event is in the immediate control or experience of the speaker ("proximate") or not ("remote").

     In fact, Pirahã has very few words for time. The complete list is as follows: 'ahoapió 'another day' (lit. 'other at fire'), pi'í 'now', so'óá 'already' (lit. 'time-wear'), hoa 'day' (lit. 'fire'), ahoái 'night' (lit. 'be at fire'), piiáiso 'low water' (lit. 'water skinny temporal'), piibigaíso 'high water' (lit. 'water thick temporal'), kahai'aíi 'ogiíso 'full moon' (lit. 'moon big temporal'), hisó 'during the day' (lit. 'in sun'), hisóogiái 'noon' (lit. 'in sun big be'), hibigíbagá'áiso 'sunset/sunrise' (lit. 'he touch comes be temporal'), 'ahoakohoaihio 'early morning, before sunrise' (lit. 'at fire inside eat go').

Specifically, Dan thinks that this is one of many linguistic symptoms of a general pattern, in which

Pirahã culture constrains communication to nonabstract subjects which fall within the immediate experience of interlocutors.

There's plenty of room for argument here about whether "nonabstract" is a fair characterization of morphemes that mean things like "remote" vs. "proximate", "other", "temporal" and so on. With respect to talk about tense and time, Dan argues that

[I]n the context of the present exploration of culture-grammar interactions in Pirahã, it is possible to situate the semantics of Pirahã tense more perspicaciously by seeing the absence of precise temporal reference and relative tenses as one further example of the cultural constraint on grammar and living. This would follow because precise temporal reference and relative tenses quantify and make reference to events outside of immediate experience and cannot, as can all Pirahã time words, be binarily classified as "in experience" and "out of experience."

In any case, there's no support for the view that the Pirahã "[live] without time". As one more nail in the coffin of this notion, I'll quote one of the example sentences from Everett 2005:

kohoai -kabáob -áo ti 'ahoai -soog -abagaí
eat -finish -temporal I you speak -desiderative -frustrated_initiation
"When [I] finish eating, I want to speak to you."

By the way, you might think that this example includes a subordinate clause, but Dan says "no":

There is almost always a detectable pause between the temporal clause and the "main clause." Such clauses may look embedded from the English translation, but I see no evidence for such an analysis. Perhaps a better translation would be "I finish eating, I speak to you."

What about the claim that the Pirahã have "no descriptive words"? The only part of the Spiegel article that might have given rise to this preposterous claim is the sentence

Apparently colors aren't very important to the Pirahãs, either -- they don't describe any of them in their language.

Dan Everett does argue that the Pirahã have no basic color terms (though Paul Kay, one of the commentators on the Current Anthropology article, is not convinced). I'm not sure what it would mean for a language to have "no descriptive words", but a couple of additional Pirahã examples should establish that it's not true in this case:

bii -o3pai2 ai3
blood -dirty/opaque be/do
"blood is dirty"

 

kahaí kai -sai hi ob -áa'áí
arrow make -nominative he see -attractive
"He knows how to make arrows well."
(lit. "He sees attractively arrow-making.")

[Update: Julia Hockenmaier raises a possibility that should have occurred to me -- the headline might have been botched in translation. She points out that the headline and subhead in the German version read:

LINGUISTIK: Leben ohne Zahl und Zeit
Das Volk der Pirahã kennt keine Vergangenheit, keine Farbwörter, keine Nebensätze. Das macht seine Sprache zur merkwürdigsten der Welt - und zum Zankapfel der Linguisten.

and comments:

'Zeit' in German means both tense and time, so I think this is just a translation error. Similarly, 'Vergangenheit' means either 'past' or 'past tense' (especially in this list of language-related terms), but not 'history' (that would be 'Geschichte'). And the original doesn't say 'descriptive terms', but 'color words'.

She also asks

By the way, how is this absence of tense different from, say, Chinese? It really doesn't seem that unusual to me.

As I understand Dan's argument, he's claiming that the Pirahã's time-related morphology is consistent with their general cultural pattern, not that it's unique. ]

Posted by Mark Liberman at 08:20 PM

The hispanicization of American baseball, the status of Puerto Rico, and the achievements of Roberto Clemente


George F. Will (yes, THAT George F. Will) reports, in a review of Clemente by David Maraniss, New York Times Book Review, 5/7/06, p. 13:

Baseball has come a long way since the San Francisco Giants' manager Alvin Dark, in 1964, banned Spanish in the clubhouse.  In 1989 and 1990, five of the 26 major-league teams had a starting shortstop from the same Dominican town, San Pedro de Macorís.  In 2005, 29 percent of the players on the 30 teams' opening day rosters were born outside the United States -- 70 percent of them from the Dominican Republic, Venezuela or Puerto Rico.  Among the nearly 1,200 players on the 40-man rosters this spring, 10 of the 16 most common surnames were Hernández, Gonzalez, Perez, Ramirez, Rodriguez, Cabrera, Guzman, Lopez, Peña and Sanchez.

Four things of note here: the main point, which is the hispanicization of American baseball; the identification of Puerto Rico as being outside the U.S.; picking out the 20.3 percent of players who are from the Dominican Republic, Venezuela, or Puerto Rico by taking 70 percent of 29 percent; and the shift from origin in the Spanish-speaking Americas to possession of a Hispanic surname.
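
For anyone who wants to check the arithmetic behind the 20.3 percent, here is a minimal Python sketch. The function name `share_of_all_players` is my own; the two inputs are the percentages from Will's paragraph, and nothing here depends on the actual number of players, which Will doesn't give for the opening-day rosters.

```python
def share_of_all_players(foreign_born_share, share_within_foreign_born):
    """Multiply a share of the whole by a share within that subgroup."""
    return foreign_born_share * share_within_foreign_born


# Figures from Will's paragraph: 29 percent of opening-day players were born
# outside the United States, and 70 percent of those came from the
# Dominican Republic, Venezuela or Puerto Rico.
combined = share_of_all_players(0.29, 0.70)

if __name__ == "__main__":
    print(f"{combined:.1%} of all opening-day players")
```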


Here at Language Log Plaza, we've been remarking on American attitudes (often negative) towards the Spanish language, towards its speakers, and towards Latino/Hispanic Americans in general -- most recently, here, here, and here -- so it's nice to see a little report on how our national pastime has come to rely so significantly on Latino players.

Now, the list: the Dominican Republic, Venezuela, Puerto Rico.  All characterized as "outside the United States".  Puerto Rico is the oddity here.  (It's also relevant to Roberto Clemente, who was Puerto Rican.  And black.)  It has a status that puts it firmly both inside and outside the U.S.  Mostly inside in several respects, some of them described on the website of the Puerto Rico Federal Affairs Administration:

Puerto Rico's relationship with the U.S. Federal Government, as defined by the Constitution of 1952, is in many respects, similar to that of any other state. Matters of currency, defense, external relations and interstate commerce are within the jurisdiction of the U.S. Federal Government. The U.S. Constitution as well as most laws passed by Congress are applicable in Puerto Rico. Residents of the island however, do not pay federal income taxes and do not vote for President.  [On the other hand, since defense is within the jurisdiction of the U.S. government, Puerto Ricans are subject to the draft.]

And Puerto Ricans are U.S. citizens, but then so are residents of Guam and the U.S. Virgin Islands, both of which are U.S. territories.  (This is all so convoluted: residents of American Samoa are U.S. nationals but not U.S. citizens.)

According to that 1952 constitution, Puerto Rico is a semi-autonomous entity, officially named a Commonwealth.  (Kentucky, Massachusetts, Pennsylvania, and Virginia are commonwealths, but of course not semi-autonomous entities.)  The Commonwealth also is a possession of the United States, though not called a territory.  In any case, if Guam and the U.S. Virgin Islands are "outside the United States", which I think would be common usage, then Puerto Rico is even more so.

Still, many of us who live in the 50 states and the District of Columbia tend to think of Puerto Rico as more a part of the U.S. than Guam or the U.S. Virgin Islands -- as not really being "foreign" -- so having it on a list with the Dominican Republic and Venezuela seems a bit odd.

On yet another hand, Puerto Rico shares with the Dominican Republic and Venezuela (as against the United States) the property of having Spanish as an official language.  And that's directly relevant to Will's little history of the Spanish language in the U.S. major leagues.

Could Will have avoided "outside the United States"?  Well, there's a problem here, which we can see more clearly when we ask why he chose to refer to 20.3 percent of the players so indirectly, as 70 percent of 29 percent of them.  Why didn't he just say, "In 2005, of the players on the 30 teams' opening day rosters, just over 20 percent of them came from only three ____ where Spanish is an official language"?  But what plural noun fills in the blank?  Oh dear.  "Countries" or "nations" won't do, because Puerto Rico isn't actually a country or nation; as currently configured, it's not entitled to a seat in the United Nations, any more than Guam is.  "Places" and "lands" are too vague.  "Governmental entities" or the like would be too technical AND too vague.

There are work-arounds, for instance: "In 2005, the Dominican Republic, Puerto Rico, and Venezuela -- in all of which Spanish is an official language -- together supplied just over 20 percent of the players on the 30 teams' opening day rosters."  (Or maybe "1 in 5" rather than "20 percent".)  This avoids the "outside the United States" problem and also the 70-percent-of-29-percent problem, and makes the Spanish language point explicit.  It doesn't note explicitly that only three places account for so much of the rosters, but then Will's original didn't either.

Finally, the shift to Hispanic surnames, which rather muddies things, since the surnames point neither to place of origin nor to the real matter under discussion, the use of the Spanish language.  [Clarification added 5/13/06: The spellings of these surnames above -- with their inconsistency in the use of the acute accent -- are exactly as they appeared in the NYT review.]  For a moment, I entertained the idea that Will was slyly trying to insert the idea that if your ancestors were Spanish-speaking foreigners (from Latin America, at any rate), then you're (still) a foreigner too -- in which case that last sentence would be not merely only indirectly relevant to the topic, but also slimy.  Then I decided he was only noting that Latinos, for some value of "Latino", are all over baseball these days, something that certainly wasn't the case in Roberto Clemente's time, and that Clemente himself, laudably, had a lot to do with that.

The review is mostly about Clemente, and it's sympathetic to and admiring of the man.  It even notes that he was "arrestingly handsome" as well as, in several ways, heroic.  Catch the sympathetic resentment in this report:

Clemente, playing in a city with a minuscule Latino population [Pittsburgh], said he felt like a "double nigger." As late as 1971 -- in one game that year, the Pirates became the first team ever to have nine black players in its starting lineup -- some sportswriters still quoted him in phonetic English: "Eef I have my good arm thee ball gets there a leetle quicker."

This about a man who died tragically, while trying to get aid from Puerto Rico to Nicaragua after a severe earthquake there in 1972. 

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 07:37 PM

No concept of the future, no yuccas either


Juan Forero reports, on the front page of today's New York Times, on a group of Nukak-Makú hunter-gatherers who have emerged from the Colombian jungle to seek refuge in the town of San José del Guaviare.  They are described as classic primitives, people who "have lived a Stone Age life" and are innocent of the ways of the modern world:

The Nukak have no concept of money, of property, of the role of government, or even of the existence of a country called Colombia. They ask whether the planes that fly overhead are moving on some sort of invisible road.

Their conceptual poverty extends, in Forero's somewhat confused account, to at least one basic temporal notion:

When asked if the Nukak were concerned about the future, Belisario, the only one in the group who had been to the outside world before and spoke Spanish, seemed perplexed, less by the word than by the concept. "The future," he said, "what's that?"


But much later in the story, we see that they are perfectly capable of planning for this putatively unconceptualizable future:

That is not to say the Nukak do not have plans.

Ma-be explained that the idea is to grow plantains and yucca and take the crops to town. "We can exchange it for money," he said, "and exchange the money for other things."

Now I don't know what word Belisario used to translate Ma-be here -- yucca is attested as an occasional variant of yuca, the name of a starchy tuber better known as cassava -- but American readers unfamiliar with tropical foodstuffs will mostly be puzzled by the idea that the Nukak hope to grow the spiky agave yucca as a crop.  "Yuca" would have been a better choice, and "cassava" even better than that.

Back to the future.  It's hard to see how Belisario's perplexity was about anything BUT words.  Somebody asked him if the Nukak were concerned about "el futuro", and Belisario asked what "el futuro" was.  End of story.  At this point we can begin to suspect that Belisario's command of Spanish, in particular its vocabulary, is not so great.  And we can begin to wonder what Spanish translations he gave to the other Nukak's reports of their plans for the future: did he use future tense forms?  In any case, the exchange about the future was about a word, not a concept.

So why did Forero report the exchange as being about a concept?  Because, once again, "primitive" peoples are being imagined as deficient in abstract thought.  It's a cousin of "the X have N words (for some large N) for Y, but no word for Y in general, so the X are incapable of conceptualizing Y as an abstract notion".  You know, those poor Eskimos, stuck in an avalanche of highly specific words for snow.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:56 PM

May 10, 2006

Pulling within

Here's an easy bet. Tune in to an upcoming NBA playoff game — say, tonight's matchup between the New Jersey Nets and the Miami Heat — and wait for one team to fall behind by a significant margin. Let's say, for the sake of argument, the Heat fall 10 points behind the Nets. Then wait for the team that's behind to stage a bit of a rally, for instance if the Heat bridge a 10-point gap and make it a 5-point game. I wager that the announcer will say one of two things:

a) And the Heat pull (to) within five!
b) And the Heat pull (to) within 90 to 85! [Or whatever the score is.]

How did "pulling (to) within" a scoring differential, or even more oddly, "pulling (to) within" a score, become the standard sportscaster-talk to describe a losing team rallying against a winning team? The answer lies in how the spatial metaphors of racing contests have been transformed by American team sports.

We're used to hearing the verb "pull" in various competitive contexts to describe one contestant's movements relative to another, as in "pulling even," "pulling ahead," and so forth. This sense of "pull" goes back to the literal sense of pulling on oars in a boat race, attested since the 1860s. It didn't take long for "pull" to get applied to other types of races, such as horse-racing. (And, like so many other horse-racing terms, "pulling ahead" and similar forms quickly entered the world of politics to refer to the jockeying of candidates in the polls.) With "pull" coming to mean "move in relation to the position of another competitor," the expression could be used to quantify the relative distance between competitors, as in "lengths" for horses on a racetrack (a measurement likely borrowed from boat-lengths in crew racing). An announcer at a horse race could say: "Horse A pulls (to) within three lengths of Horse B," with the implication that Horse A was more than three lengths behind and has now passed into a position within three lengths relative to Horse B.

When the idiom of "pulling within" a distance in a race started to get applied to team sports like baseball, football, and hockey, the metaphorical fit was not exact. If a baseball team is down 6-0 and then a player on the team hits a two-run homer, the gap has been narrowed from six to four. Unlike the continuous movement of boat-racing or horse-racing, however, scoring in baseball and other team sports is "discrete," as mathematicians call it: a team's score jumps from one non-negative integer to the next with no fractions in between. So the spatial metaphor of being "within" a certain relative distance doesn't exactly match the context of a baseball team down by a discrete number of runs. Nevertheless, it became common by the mid-20th century for announcers and reporters to talk about teams "pulling (to) within" a certain number of runs, points, goals, or even games in the standings.

Once this usage of "pulling (to) within" a scoring difference was established, the spatially defined sense of "within" began undergoing a subtle change in sporting contexts. It no longer implied necessarily a movement "inside" a given limit, like a number of lengths on the racetrack. Instead, the preposition "within" could simply suggest a narrowing of the differential between two teams' scores. From there it was a short step for the score itself, rather than the differential, to be used as the object of the preposition "within." By the mid-1970s it became rather common for sports reporters to use this new sense in print, as in "The Penguins pulled within 3-2," or "Houston pulled to within 115-112." And it wasn't long before verbs other than "pull" were used in conjunction with the sense of "within" meaning "to a score of": for instance, the AP reported in Game 1 of the Nets-Heat series that Shaquille O'Neal had a late run "that drew Miami within 92-83."

For now, this emergent usage is restricted to sporting competitions. But I wonder how long it will be before we see a news story reporting that proponents of a failed Senate bill pulled the vote to within 52-48. After all, everyone loves a good horse race.

[Update #1: Paul Kay notes that in today's on-line Time Magazine, under the headline "Why Jeb Bush Won Big," Tim Padgett writes: "And when challenger and political rookie Bill McBride pulled to within three points in polls taken earlier this fall, it looked like they [the Democrats] just might [get revenge]." Indeed, "pull (to) within" has long been applied in political contexts to poll differentials measured in percentage points. (Bruce Rusk sends along examples going back to 1972.) What I have yet to see, however, is a political equivalent for "pull within (a score)." That's what I was trying to get at with my posited example of pulling a Senate vote (to) within 52-48.

Meanwhile, here's a mathematical perspective from Adrian Riskin:

I think that at least mathematically speaking it makes sense to describe one team as being within a whole number of points of another team. Mathematicians at least use "within" to denote being in an interval, and the question then becomes whether or not the interval contains its endpoints. Often it does not, as in the case when X is said to be "within epsilon" of Y. This always has the implicit meaning that the absolute value of X-Y is strictly smaller than epsilon. On the other hand, it's not uncommon in other contexts to see "within" used to mean "less than or equal to the actual differential". To find examples of this I searched for the word "within" in mathematical articles on arxiv.org (a scientific preprint server) and found a number of examples where it's used to denote the differential of whole numbers. ...
So I think at least the first usage of "within" by sportscasters that you discuss is not so strange (assuming that being similar to what mathematicians do can be described as being "not so strange"). The second usage I won't try to defend. On the other hand, I'd like to hear a sportscaster say, when the score is 6-3, that the losing team has pulled within 8 of the winning team. It's still true!

Even if it's mathematically defensible to say that Team X is "within N" of Team Y when the difference in scoring is exactly N, I find I'm not the only one who considers the sporting usage a bit annoying. Vivian de St. Vrain, aka Dr. Metablog, posted peevishly about this use of "within" back in March.]
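Riskin's two readings of "within" (strictly less than the stated number, or less than or equal to it) can be sketched in a few lines of Python. This is only an illustration: the function name `within` and its `strict` flag are my own labels, not anything from his note.

```python
def within(score_a: int, score_b: int, n: int, strict: bool = False) -> bool:
    """Return True if the two scores are 'within n' of each other.

    strict=True  -> the "within epsilon" reading: |a - b| <  n
    strict=False -> the inclusive reading:        |a - b| <= n
    """
    gap = abs(score_a - score_b)
    return gap < n if strict else gap <= n


# The score is 6-3, so the gap is exactly 3.
print(within(6, 3, 3))               # inclusive reading: "within 3" holds
print(within(6, 3, 3, strict=True))  # strict reading: "within 3" fails
print(within(6, 3, 8, strict=True))  # "within 8" holds on either reading
```

The sportscaster's "pull within five" at a five-point gap is the inclusive reading; the last call is Riskin's joke that a team down 6-3 is, truthfully, within 8.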

[Update #2: Paul Kay sends along another political example, this time a bit closer to "pull within (a score)":

Both events would help the Democrats pull within parity in one or both houses this fall. (The Left Coaster) ]

[Update #3: These turns of phrase are even older than I had thought. Documentation here.]

Posted by Benjamin Zimmer at 02:56 PM

Colossus

OUP has just (May 4) published "Colossus: The secrets of Bletchley Park's code-breaking computers", edited by Jack Copeland.

According to the blurb on Amazon's site,

The American ENIAC is customarily regarded as having been the starting point of electronic computation. This book rewrites the history of computer science, arguing that in reality Colossus--the giant computer built by the British secret service during World War II--predates ENIAC by two years.

Colossus was built during the Second World War at the Government Code and Cypher School at Bletchley Park. Until very recently, much about the Colossus machine was shrouded in secrecy, largely because the code-breaking algorithms that were employed during World War II remained in use by the British security services until a short time ago. In addition, the United States has recently declassified a considerable volume of wartime documents relating to Colossus. Jack Copeland has brought together memoirs of veterans of Bletchley Park--the top-secret headquarters of Britain's secret service--and others who draw on the wealth of declassified information to illuminate the crucial role Colossus played during World War II. Included here are pieces by the former WRENS who actually worked the machine, the scientist who pioneered the use of vacuum tubes in data processing, and leading authorities on code-breaking and computer science.

A review by Alan Cane in today's Financial Times (subscription only) explains that

... the full significance of the Colossi in shortening the war is now becoming clear, with the release a few years ago of a 500-page document under the curious title "General Report on Tunny" that had remained highly classified since the war.

Written by three of the Bletchley Park codebreakers, Jack Good, Donald Michie and Geoffrey Timms, the report describes how the 11 Colossi were designed and used. Not to break the Enigma traffic - that was the preserve of Alan Turing's "bombes" - but to attack the German's most secret cipher which the Allies codenamed Tunny. This cipher carried the highest grade of German intelligence. Breaking Tunny was key to the success of the D-Day landings.

The "General Report on Tunny" can be found here. The FT review goes on to say that

Computing history, therefore, has to be rewritten. The credit for creating the first electronic computer has so far rested with ENIAC, an 18,000 thermionic valve monster built by Presper Eckert and John Mauchley at the University of Pennsylvania.

ENIAC, however, ran its first program at the end of 1945, two years after Colossus successfully attacked the Tunny codes.

After the war the Colossi were largely broken up, the documentation destroyed and any mention of Colossus or the part it played in the Allied victory suppressed under the weight of the Official Secrets Act.

The article concludes:

Are there lessons to be learned from Colossus? Only that the UK still lacks the skills to profit from its ability to innovate: and that until it acquires them, it will have to be satisfied with the tacit knowledge that "We did it first."

Well, ENIAC didn't bring Philadelphia any lasting dominance in the digital hardware business, either.

[More information can be found in the Wikipedia article on the Colossus, including some discussion of the code-breaking methods that Colossus was designed to implement. You'll see from that description

The Colossus computers were used to help decipher teleprinter messages which had been encrypted using the Lorenz SZ40/42 machine. Colossus compared two data streams, counting each match based on a programmable boolean function. The encrypted message was read at high speed from a paper tape. The other stream was generated internally, and was an electronic simulation of the Lorenz machine at various trial settings. If the match count for a setting was above a certain threshold, it would be output on an electric typewriter.

that it's not clear that Colossus should be considered to be a "general purpose" computing machine: perhaps ENIAC's laurels are safe.]
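To make the quoted description concrete, here is a toy sketch in Python of the compare-count-threshold loop it describes. Everything in it is an assumption for illustration: `toy_keystream` is a made-up stand-in for the electronic simulation of the Lorenz machine (it is nothing like the real cipher), and the names, the 16-bit "tape", and the threshold are mine.

```python
from typing import Callable, Iterable, List, Tuple


def toy_keystream(setting: int, length: int) -> List[int]:
    """Hypothetical stand-in for the simulated Lorenz stream.

    Repeats the low four bits of `setting` as a bit pattern.
    This is NOT the real Lorenz SZ40/42 logic.
    """
    pattern = [(setting >> i) & 1 for i in range(4)]
    return [pattern[i % 4] for i in range(length)]


def count_matches(tape: List[int], generated: List[int],
                  test: Callable[[int, int], bool]) -> int:
    """Count positions where the programmable boolean test holds."""
    return sum(1 for t, g in zip(tape, generated) if test(t, g))


def promising_settings(tape: List[int], settings: Iterable[int],
                       test: Callable[[int, int], bool],
                       threshold: int) -> List[Tuple[int, int]]:
    """Return (setting, count) pairs whose count exceeds the threshold."""
    hits = []
    for setting in settings:
        score = count_matches(tape, toy_keystream(setting, len(tape)), test)
        if score > threshold:
            hits.append((setting, score))
    return hits


# A 16-bit "tape" produced with setting 5, with one bit corrupted.
tape = toy_keystream(5, 16)
tape[3] ^= 1

hits = promising_settings(tape, range(16), lambda t, g: t == g, threshold=13)
print(hits)
```

Note how little of this is "programmable": only the boolean test, the trial settings, and the threshold vary, which is the point about Colossus not obviously being a general-purpose machine.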

[Update: I should have known that I couldn't get away without citing Atanasoff and Zuse.

Linda Seebach wrote:

Eckert-Mauchly don't deserve credit for the first digital computer; John Atanasoff at Iowa State was ahead of them. But he didn't patent it. http://www.cs.iastate.edu/jva/jva-archive.shtml
[section below quoted from cited URL]
The Atanasoff-Berry Computer was the world's first electronic digital computer. It was built by John Vincent Atanasoff and Clifford Berry at Iowa State University during 1937-42. It incorporated several major innovations in computing including the use of binary arithmetic, regenerative memory, parallel processing, and separation of memory and computing functions. On October 19, 1973, US Federal Judge Earl R. Larson signed his decision following a lengthy court trial which declared the ENIAC patent of Mauchly and Eckert invalid and named Atanasoff the inventor of the electronic digital computer -- the Atanasoff-Berry Computer or the ABC.

Clark Mollenhoff in his book, Atanasoff, Forgotten Father of the Computer, details the design and construction of the Atanasoff-Berry Computer with emphasis on the relationships of the individuals. Alice and Arthur Burks in their book, The First Electronic Computer: The Atanasoff Story, describe the design and construction of the ABC and provide a more technical perspective. Numerous articles provide additional information. In recognition of his achievement, Atanasoff was awarded the National Medal of Technology by President George Bush at the White House on November 13, 1990.

Julia Hockenmeier asked:

I was also wondering, how exactly does Zuse's Z3 compare to the Colossus and the ENIAC?

And Mike Albaugh wrote:

Colossus vs. Enigma: sounds like a WWF match. Anyway, neither machine was as "General Purpose" as Konrad Zuse's Z1, built from scrounged metal bits in the living-room of his parents' apartment, in 1936. Of course, within their domains they were undoubtedly faster, as Z1 was purely mechanical. (Z2 was a later version in electro-mechanical form.)

As a docent at the Computer History Museum, I have learned to very carefully lay out the adjectives when describing a "first". There are a huge number of "first" computers, depending on the precise mix of adjectives.

Further details are available from the Wikipedia articles on Atanasoff and Zuse.

Meanwhile, my copy of Copeland's Colossus book has just arrived, which was the point for me here in the first place.]

Posted by Mark Liberman at 11:41 AM

Depreciate and deprecate: stay out of it

Furious altercations down the hall from the water cooler in One Language Log Plaza today. Nunberg was shouting, red-faced: "Doctor Johnson's vocabulary was good enough for him and it's good enough for me!" Several younger staffers were arguing with him: "You gotta move with the times!" Then somebody said, "For heaven's sake, there's only one letter different," and everybody turned and started shouting at once. Should Nunberg have used "depreciate" to mean "to lower in estimation or esteem" (the first meaning in the Webster entry), i.e., "lower the value of by expressing the opposite of appreciation for", hence roughly what "denigrate" or "deprecate" would mean? He did, in this post. And he meant it; it wasn't a typo. And the dictionaries back him up: the word can mean that. Yet a proofreader wrote to Nunberg saying he had a ROTFL moment when he saw it... Well, I don't know. I stayed well away from the whole rowdy scene. I've seen this kind of word-quarrel spiral down into violence, with men dashing glasses of chardonnay in each other's faces. One time I saw young Bakovic and Beaver actually come to blows over an adverb. We care about language here at Language Log. You might want to look up depreciate and deprecate. Their meanings are very close, yet their etymologies are quite different (the first from the Latin pret- root meaning "price", the second from the Latin prec- root meaning "pray"). Did Nunberg pick the right one? Maybe, maybe not. But I'm staying out of it.

Posted by Geoffrey K. Pullum at 09:25 AM

Mock Spanish in the cellular age

In a 1995 paper, the linguistic anthropologist Jane Hill argued that the register of "Mock Spanish" serves as "a site for the indexical reproduction of racism in American English." Though it may be hard to accept Bart Simpson's "Ay, caramba!" or Arnold Schwarzenegger's "Hasta la vista, baby" as racist discourse, even covertly so, there's no denying [*] the racist intent of a cell-phone ringtone recently pulled from the Cingular Wireless website. According to an article in the Brownsville Herald (picked up by the AP wire), the ringtone was called "La Migra," a Spanish term for the U.S. Border Patrol. The Herald describes the ringtone as follows:

In it, a siren is heard, followed by a male voice that says in a southern accent: "Calmate, calmate, this is la migra. Por favor, put the oranges down and step away from the cell phone. I repeat-o, put the oranges down and step away from the telephone-o. I'm deporting you back home-o."

The recording makes extensive use of -o suffixing, a feature Hill observes is one of the hallmarks of Mock Spanish. The most common example of this jocular suffixing is "No problemo," heard along with "Hasta la vista, baby" in the movie Terminator 2. As Hill notes, "No problemo" doesn't derive from Spanish (where the equivalent expression is No hay problema) but rather is simply the English colloquialism "No problem" with -o added. Hill's paper includes many more examples of such suffixation, from routine putdowns like "el cheap-o" to this personal ad in the UC San Diego student newspaper which seems to combine Mock Spanish with Mock Sicilian:

"Don Thomas -- Watcho your backo! You just mighto wake uppo con knee capo obliterato. Arriba!"

Cingular Wireless, to its credit, denounced the "La Migra" ringtone as "blatantly offensive" and pulled it from the site as soon as a reporter from the Brownsville Herald pointed it out. The AP reports that the ringtone was developed by "Barrio Mobile" and was available on the Cingular site beginning in late February or early March (since which time it had only been downloaded eight times). The timing of the discovery is rather inopportune, given how polarized the debates over immigration and Spanish-language usage have become since the eruption of the "Nuestro Himno" controversy (see here, here, here, here, and here). One can only hope that the news of the pulled ringtone might provoke some healthy introspection about ugly stereotypes of Mexican immigrants and the frequent offensiveness of Mock Spanish.

[* Update #1: Paul Postal takes exception to my assertion that "there's no denying the racist intent" of the ringtone:

It is not racist to make fun of Mexican/Spanish accents and cannot be. Spanish is spoken by people of many racial groups and made fun of by many. If an African-American makes fun of Arnold Schwarzenegger's accent, is that racist? Give me a break.

I acknowledge that it was overly glib of me to say "there's no denying..." when there may be many who deny the point. However, I use "racist" here as Jane Hill does in her paper, to imply a "racializing" effect (regardless of whether one consciously imagines Spanish speakers in the U.S. as a separate "race"). Quoting Hill:

I would argue, along with many contemporary theorists of racism such as van Dijk (1993), Essed (1991), and Goldberg (1993), that to find that an action or utterance is "racist", one does not have to demonstrate that the racism is consciously intended. Racism is judged, instead, by its effects: of successful discrimination and exclusion of members of the racialized group from goods and resources enjoyed by members of the racializing group. It is easy to demonstrate that such discrimination and exclusion not only has existed in the past against Mexican Americans and other members of historically Spanish-speaking populations in the United States, but continues today.

As I said, it may be hard to accept that, say, Arnold Schwarzenegger's catchphrases in Terminator 2 have the racializing effect that Hill ascribes to them, but the "La Migra" ringtone makes no bones about its reliance on racist stereotypes of Mexican border-crossers and arguably Mexican-Americans and Latinos more generally. This is an overt type of offensively racializing discourse, I think many would agree, regardless of one's feelings about what Hill identifies as "covert indexes" of racism in other Mock Spanish utterances.]

[Update #2: Turns out the ringtone was a work of satire by a Mexican-American comedian. Details here.]

Posted by Benjamin Zimmer at 01:35 AM

A racy WTF coordination

Joe Gordon spotted a headline that is both off-color (erotically so) and off-kilter (grammatically so) on Drew Curtis' Fark.com, a popular website where users comment on a variety of weird and wacky news