Language Log: June 2004 Archives

June 30, 2004

A small rant

Our field is spectacularly backwards sometimes. The web site for the journal Language is a disgrace. It doesn't even have titles and abstracts up yet for the excellent June issue, which reached me by snailmail some time ago. As for full on-line access to articles, well, the LSA's web site has a page for "LANGUAGE Online" that offers a (dead) link to a Project Muse site that promises access to the volume for 2001.

As I clicked the link and got this helpful page, several spiders and mice scurried back into the woodwork and a puff of dust wafted across the room.

In fact, Project Muse does really offer "Vol. 77 (2001) through current issue", at this address -- and the June issue is already there -- if your institution has a subscription. The dysfunctional LSA page says that individual subscribers also get access, but I couldn't find any information on the Muse web site about that.

There's a free sample issue -- 77.1, March 2001.

You know, I think we could do better.

[Update: let me make it clear that I mean "we as a field could do better than we are now doing", not "some group that I profess to speak for could do better than the people now responsible". But really, under any construal, the current web site for the LSA's journal is a disgrace.]

[Response to comments:
Ken says: "The username and password for individual subscribers to get on to project muse is in the welcome letter that comes to new members and (I would hope) also makes its way to renewing people."
I say: I'm a subscriber, and I read every issue as it arrives, and I go to the meetings more often than not, but I've never seen this information. I don't doubt that it was in some accompanying letter that I didn't read. That doesn't excuse not having updated the relevant page on the journal's web site since 2001, with a dead link for the Muse connection.

Joe Tomei asks "Isn't one explanation the fact that LinguistList has soaked up much of the volunteer effort that online efforts require?"
I respond: I don't know, but LinguistList is not a substitute for a usable web site for the journal Language.

He adds "I have found that almost all the individual subscriber online systems to be problematic, and even the for profit sites (Oxford, Cambridge, Blackwell come to mind immediately) have some serious interface problems."
I reply: That hasn't been my experience. I just checked the web sites for Language Variation and Change, Computational Linguistics, International Journal of American Linguistics, American Anthropologist, American Speech, Speech Communication, and Linguistic Inquiry; all the sites were current and I didn't find any dead links. The APA, the ACL, the AAA, the ADS, etc. all seem to have working web sites that are kept current. To find a site as musty as the site for Language, I have to prospect among third-rate journals; but Language is a first-rate journal.

There are plenty of complaints to make about scholarly and scientific publishing; I think the arguments for Open Access/author-pays approaches are compelling, for example. But this is not an argument at that level -- it's much more basic. When the journal's home page has a three-year-old dead link to the online version, it's like misspelling the editor's name on page 1.]

Posted by Mark Liberman at 10:52 AM | Comments (5)

To pass into a certain condition, chiefly implying deterioration

OK, everybody, listen up. Sit back, hang on to your chair, and brace yourself. I'm going to violate the ultimate taboo.

No, I'm not going to defend the Sapir-Whorf hypothesis. I've already done that. And I'm not going to argue that chimps, dogs and parrots are starting down the road to language. I'm still thinking about that one.

I'm going to venture into territory where no linguist has ever dared to set foot -- until now. I'm going to praise William Safire.

The poor guy writes a few sensible paragraphs on gone missing in the NYT magazine, and he gets artfully slammed for it by our own David Beaver. Now, sometimes Safire deserves a rebuke. Many of his 50 "fumblerules" are recycled nonsense from centuries of self-appointed experts who invented linguistic principles out of thin air and tried to impose them by force of cultural authority. However, his discussion of gone missing is reasonable, and even uses some grammatical terminology correctly.

David expresses surprise that Safire writes as if "the opinions of people who don't have Language Maven stamped on their business cards actually count". This is unfair to a man who once wrote a book entitled "In Love with Norma Loquendi", referring explicitly to Horace's dictum that language will change "si volet usus / quem penes arbitrium est et ius et norma loquendi" ("if it be the will of custom, in the power of whose judgment is the law and the standard of language").

Safire starts with a question from a reader, Daniel Baldwin of New York: "My intuition tells me that the term goes missing is grammatically incorrect," he writes. "Here is a possible explanation: It is proper to link goes with a gerund (e.g., goes fishing) but not with a participle (e.g., goes missing). Am I on the right track?"

Baldwin is all wet, actually. Go fishing involves a different sense of go: transitive sense 3 "to engage in" in the American Heritage Dictionary's entry, rather than intransitive sense 10.b. "to come to be in a certain condition". And fishing in go fishing is probably not a traditional gerund at all, since it can't be replaced by a noun (in contrast to "I regret destroying it" vs. "I regret its destruction").

Safire responds "A technically correct track -- I salute all gerundologists -- but headed in the wrong direction." I take this as a brief, polite way to pat Baldwin on the back for using words like gerund and participle (which Safire is probably as uncertain about as almost everyone else is), though in the service of an analysis that's wrong. Safire goes on to say "This is a tale told by an idiom that leaves many of its users vaguely uncomfortable", and I think this is correct, if a bit compressed.

Here at Language Log, we're not constrained by the petty word count restrictions of a New York Times column, and so I can go into more detail in support of Safire's point of view. As he explains, "one sense of to go is 'to pass from one state or place to another'", and this is the sense that is involved in go missing as well as many other expressions. His analysis is not new, but it's correct. The OED puts go missing under sense 44 of go:

44. To pass into a certain condition. Chiefly implying deterioration. a. With adj. complement: To become, get to be (in some condition). (Cf. COME 25a.) to go less: to be abated or diminished. Also with n. complement: to become, use, or adopt the characteristics of (something specified); to go all ____: see ALL C. 2c; to go bush: see BUSH n.1 9e; to go missing: to get lost; to go native: to turn to or relapse into savagery or heathenism; also transf. (cf. FANTI b); to go ____ on (someone): to adopt a particular mode of behaviour towards or affecting (that person); to go public: to become a public company.

Safire notes that go missing is "British English", and supplies an earlier citation than the OED does. He quotes "a naval correspondent for The Times of London on Aug. 10, 1877, in a dispatch about the Turkish armies in the Balkans" as writing "I was obliged to return to Adrianople to get some supplies, as a box which should have reached me at Tirnova had gone missing." The OED's earliest citation is from 1944:

1944 E. BENNETT-BREMNER Front-Line Airline (1945) viii. 50 Qantas Empire Airways have been called upon to conduct searches for missing aircraft, and it was only natural, therefore, that being ‘Johnny-on-the-spot’ they should be asked to join in when aircraft went missing.

As the OED suggests, "go PREDICATIVE" often suggests that the predicative is a kind of deterioration: go bad, go ballistic, go bananas, go bankrupt, go blank, go cold, go crazy, go dead, go gray, go Hollywood, go lame, go mad, go native, go numb, go nuclear, go nuts, go sour, go vacant, go wrong.

Sometimes the corresponding positively-evaluated condition doesn't work: go bad but ?go good (in the sense of "become good", not as an informal version of "go well"), go crazy but ?go sane. However, there are some positively evaluated conditions in common inchoative collocations with go: go live, go platinum, go blonde.

On the other hand, there are also common adjectives expressing deterioration that don't seem to work well with go, preferring get instead: ?go sick, ?go fat, ?go dizzy, ?go sleepy, ?go antsy.

Note also that the meaning of the predicate is often restricted: thus went dead is fine if you're talking about a phone line or a radio, but doesn't work to mean that an animate being died.

All in all, the collocational propensities of go, in this construction, seem much more like derivational morphology than like normal compositional syntax. A good shorthand term for this situation would be idiom, and that's what Safire calls it. Good for him.

I need to temper my praise with a bit of criticism. Safire says that gone missing "may well stretch our hard-wired sense of syntax." This is completely nuts, at best. Is Safire saying that we are genetically disposed against making an inchoative out of a motion verb and a predicative? This would be weird biology and worse typology. Is he saying that our genes have been programmed by evolution to resist inchoatives that involve becoming lost? or include two-syllable predicative words starting with /m/? This is beyond even the bounds of journalism.

No doubt Safire means something much less grand. I bet his train of thought went something like this: "gone missing is on the edge, stylistically; I'll call this 'stretching our sense of syntax' in order to get the alliteration I'm famous for (remember the 'nattering nabobs of negativism'? those were the days!); and I'll add "hard-wired" as a nod to my fellow language maven Steve Pinker..."

Prof. Beaver expresses uncertainty about whether Safire has read The Language Instinct or not:

NEWSFLASH: Safire reads Language Instinct

Of course, I could be wrong about this. All I have to go on is the following paragraph in Safire's latest On Language piece...

I also can't be certain that Safire has read The Language Instinct, but Google tells me that he reviewed it for the New York Times, calling it "[a] deliciously erudite, if somewhat grainy, glass of Metamucil for the legion of English speakers troubled by irregular verbs."

Lila Gleitman has also testified that Safire has read some of her work, as well as Pinker's, though her experience was apparently not a positive one (perhaps because of the ritual uncleanliness alluded to earlier in this post?):

I believe the worst nightmare for any linguist would come in these three parts:

   (1) being cited by William Safire in the NY Times
   (2) being cited approvingly by William Safire in the NY Times
   (3) being cited approvingly as claiming the OPPOSITE OF WHAT ONE HAS CLAIMED by William Safire in the NY Times

These three nightmarish events happened to me (and my collaborators, Henry Gleitman and Elissa Newport) this week. Probably this was God evening things up for granting me a fond linguists' dream last year, i.e., being elected President of LSA.

Continuing briefly on the subject of go missing as opposed to William Safire... There are some more elaborated structures that seem to be more permissive than simple inchoative "go MODIFIER", like "go (all) MODIFIER on PRONOUN":

(link) Alexa just went all Googley on us.
(link) I asked Bernard for a few moments of his time, and he went all poetic on me.
(link) moveabletype went all 730 on me and purged everyone that had signed up for notification after may 31.
(link) they looked at their desks and scuffed their feet and went all shy on me.

Even just "all" seems to help:

(link) The Day the Universe Went All Funny.

as do some other modifiers:

(link) One of the projects, restoring a historic pier in Swansea, went somewhat pear-shaped when it was discovered the relevant planning permission hadn't been approved.

There also seem to be special cases involving colors (go white, go red, go yellow, go brown, go green) and some other semantic classes.

In fairness to Daniel Baldwin, there really is an issue about the syntactic category of the complement of verbs like seem -- "this seems disturbing" vs. "*he seems sleeping" -- and missing may fall on the wrong side of that line -- "*it seems missing". However, gerund vs. participle is not the issue, which in any case will have to wait for another day.

Update: Eric Bakovic points out to me that the Safire quote on Pinker -- "metamucil" etc. -- is actually from a review of Words and Rules. Oops. I still bet that Safire read The Language Instinct long ago -- or at least had one of his researchers summarize it for him.

Posted by Mark Liberman at 06:20 AM | Comments (4)

June 29, 2004

Speako?

In his recent post, Adam Albright uses parentheses, scare quotes, and a question mark to indicate a certain amount of insecurity with the use of the word "speako" to refer to the oral equivalent of a "typo". So much CYA-age seems unnecessary, though; Adam also provides a convenient link to wordspy.com's entertaining definition of "speako", giving it at least some external validation.

Still, I'm with Adam (assuming I understand his hesitations correctly): "speako" just doesn't sound exactly right, although I also agree with wordspy.com that "wordo" is worse (by far).

At a recent linguistics event in LA, Bruce Hayes used the word "talko" to refer to an oral typo. It made sense in context, but John McCarthy thought that Bruce was offering culinary advice ("talko" being homophonous with "taco" in Bruce's apparently Californiated accent). But even respecting the East Coast vowel distinction, "talko" doesn't quite do the trick, either.

I think the relative unacceptability of all of these forms has something to do with trying to promote -o to the status of a productive nominalizing suffix. But I haven't thought about it much beyond this. As a side-stepping alternative, my wife Karen suggests "mistalk", with final stress like "mistake" ... nah.


Posted by Eric Bakovic at 11:09 PM

His speech lacked syntax

From Arnold Zwicky by email comes a citation to this quote from Barry Bearak's article "Poor Man's Burden" in the New York Times Magazine, June 27, 2004, p. 32; it's about Lula da Silva, the working-class president of Brazil:

His speech lacked syntax; he cut off the S's on his plurals like a peasant.

Bearak at least appears to be illustrating his claim about Lula having no syntax with a comment about something that is either morphological or phonetic. The Brazilian Portuguese of the lower classes, it would seem, drops either the plural -s morpheme (that would be about noun inflection) or any final s (that would be purely about pronunciation). Why is it that journalists make remarks of this kind about language when they don't even have a clue how to distinguish between sentence structure, word endings, lexical choice, pronunciation, spelling, and punctuation? It's like journalists writing about ecology being unable to distinguish between birds, insects, fish, and land mammals. It's like a political journalist having no inkling of the distinction between the legislative, executive, and judicial branches of government. As we have pointed out over and over again on Language Log (don't make me list the links), it isn't good enough. Journalists should know at least something about language if they're going to make derogatory, and supposedly informed, comments on people's speech.

Posted by Geoffrey K. Pullum at 09:17 PM

A bird in the hand is, is...

On the topic of "...is is..." (following up on Mark Liberman's recent posts, here and here), and while we're in the business of speculating about G. W. Bush's grammar, it seems worth mentioning that GWB is also by far the most pervasive "double is" speaker that I've ever heard. This was first called to my attention during his April 13th prime-time press conference, in which he used the construction 3 times in quick succession, along with a related variant:

"The other lesson is, is that this country must go on the offense and stay on the offense"
"And my answer to that question is, is that, again I repeat what I said earlier — prior to 9/11 the country really wasn't on a war footing."
"And so what I'm telling you is, is that sometimes we use military as a last resort..."
"That is an important part of the 9/11 Commission's job, is to analyze what went on and what could have, perhaps, been done differently..."

In fact, he uses this construction so often in his press statements that at times I have wondered whether there's something intentional going on. (Is he stalling for time? do his speechwriters think that the construction sounds folksy? are they reducing the length of his phrases?) A quick survey of statements made by the president during "press availabilities" in the past month shows quite a few occurrences (in a rather small corpus), and the answer seems to be more mundane.

Transcripts from about a month's worth of press statements turned up the following examples:

6/28	"Yes, my sense is, is that there's a hope that we succeed with all the nations sitting around the table. Everybody understands the stakes."
6/24	"But the reason why is, is because her job is to give grant and loan programs for rural development."
6/17	"What I'm telling you is, is that the economy's strong, it's getting stronger."
6/14	"Another problem is, is that people — they feel like it may be too complicated, the procedures may be too complicated to get a drug discount card."
6/10	"What I can tell you is, is that we're going to make sure we fully understand the veracity of the plot line. And so we're looking into it, is the best way I can tell you."
6/10	"The point is, is that we understand that the Iraqi people need help to defend themselves, to rebuild their country — and, most importantly, to hold elections."
6/1	"What did happen is, is that we moved too quickly"
5/25	"The thing about Sid is, is that he is such a loving guy that he wants to help somebody in life." "My view is, is that we need to empower consumers and doctors."

Some of these examples do seem to involve hesitations or disfluencies (as with the "...is, is that, again I repeat what I said earlier..." example; see also the 6/14 quote), and we might wonder whether they were merely "typos" ("speakos"?). In other cases, though, the sentence is short and otherwise unremarkable: "What did happen is, is that we moved too quickly". They occur in both (apparently) scripted and unscripted utterances (though slightly more often, perhaps, in questions than in the text of a speech), and they are transcribed faithfully in White House press releases. He seems to produce them without prior planning, and there is no effort to edit them out. I suppose we'll never know whether he subconsciously associates this construction with folksy, hypermasculine speech (as Arnold Zwicky suggests for some other features). At the most basic level, however, the answer seems to be that, like so many other speakers, he simply finds them grammatical, and uses them routinely. Maybe if I had a PR machine transcribing my every word for a month, someone would find just as many "...is...is..." constructions in my own speech...
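
For anyone who wants to try the same kind of survey on other transcripts, here is a minimal sketch in Python. It is not the procedure behind the examples above, which were collected by reading; the pattern, the function name find_double_is, and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: "is" followed by an optional comma, whitespace,
# and a second "is", both as whole words. The optional comma reflects
# how transcribers tend to punctuate the construction.
DOUBLE_IS = re.compile(r"\bis,?\s+is\b", re.IGNORECASE)

def find_double_is(text, context=40):
    """Return a snippet of surrounding text for each 'is, is' match."""
    hits = []
    for m in DOUBLE_IS.finditer(text):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        hits.append(text[start:end].strip())
    return hits

# Sample text assembled from quotes in this post, plus one control sentence.
sample = (
    "The other lesson is, is that this country must go on the offense. "
    "What did happen is, is that we moved too quickly. "
    "The point is that this is a different sentence."
)

for snippet in find_double_is(sample):
    print(snippet)
```

A pattern like this only finds candidates. It will miss the related variant ("That is an important part of the 9/11 Commission's job, is to analyze..."), and it will also flag ordinary sentences in which two instances of "is" happen to be adjacent ("what it is is unclear"), so every hit still needs to be checked by hand.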

Posted by Adam Albright at 07:06 PM

Information infections in blogspace

An interesting paper from some folks at HP's Information Dynamics Lab: Eytan Adar, Li Zhang, Lada Adamic & Rajan Lukose, "Implicit Structure and the Dynamics of Blogspace." Here's the abstract:

Weblogs link together in a complex structure through which new ideas and discourse can flow. Such a structure is ideal for the study of the propagation of information. In this paper we describe general categories of information epidemics and create a tool to infer and visualize the paths specific infections take through the network. This inference is based in part on a novel utilization of data describing historical, repeating patterns of infection. We conclude with a description of a new ranking algorithm, iRank, for blogs. In contrast to traditional ranking strategies, iRank acts on the implicit link structure to find those blogs that initiate these epidemics.

And here's a cute picture. [link via email from Nick Montfort]

Posted by Mark Liberman at 05:07 PM

Everything you always wanted in a language, and less

Richard Lederer thinks that "... the essential reasons for the ascendancy of English lie in the internationality of its words and the relative simplicity of its grammar and syntax." Trevor at kaleboel thinks that "... English is so popular (and so strongly associated with ideologies of freedom) not because of its status as the world's primary language of intercommunal transaction but simply because it is such a delightful chaos."

I'd bet on Hollywood, GNP statistics and the Pentagon, myself, though Trevor's theory is a sentimental favorite. Anyhow, if languages had commercials, this would suggest a remake of the old Miller Lite debates:

"International and simple!" "Delightful chaos!"

As Adweek points out, Miller's "Tastes great! / Less filling!" campaign provided "a retirement haven for ex-jocks" for more than 15 years (and it's recently been revived using mud-wrestling supermodels). Our version could give aging language mavens something better to do with their time than telling us all what not to say.

An image is beginning to form: it's the bar at the 2005 LSA meeting; William Safire is chatting with Geoff Pullum... no, it's fading, I've lost the signal.

I'm also skeptical that Miller's master slogan will work: "Everything You Ever Wanted in a Language and Less". Maybe for Interlingua; though a quick check with Google suggests that no one has used this modified slogan yet. However, "tastes great" "less filling" is quite popular, with 8,500 Google hits, almost twice as many as "tomorrow and tomorrow and tomorrow" with 4,850.

Posted by Mark Liberman at 04:32 PM

English: international and simple

If Richard Lederer is right, then the newly-free people of Iraq will soon all be speaking English -- in part because English-speaking armies will probably still be occupying their country for quite a while, but primarily because English borrows many words from other languages and because English grammar is relatively simple. (Though this latter fact appears to be somewhat easy to forget.)

Lederer has a PhD in English and Linguistics from the University of New Hampshire, and writes many books on language and other matters. He lives here in San Diego and can be regularly heard on my local public radio station. He co-hosts* a locally-produced radio show on language (on which Geoff Nunberg was once a guest) and makes other public radio appearances, such as during pledge drives (become a regular member for $120 and get an autographed copy of one of his books, become a Producer's Club member for $1000 and get Richard to speak at your favorite function). On November 5, 2003, Lederer appeared as a guest on another locally-produced radio show, promoting one of his new books A Man of My Words: Reflections on the English Language.

I wasn't tuned in at the time, but heard about it later from a colleague. Apparently, Lederer claimed sometime during this appearance that English is "the most cheerfully democratic and hospitable language on earth", and offered as evidence that English speakers borrow words like kimono effortlessly from Japanese while Japanese speakers must adjust English words like baseball, pronouncing it besuboru, in order to borrow them. (The bit in quotation marks is quoted from a subsequent e-mail exchange I had with Lederer, as noted below.)

Now, I'm thinking that this would be a perfect topic to discuss in my 101 class: we could pick this claim apart with the help of judgments from the Japanese-speaking students in my class who will undoubtedly find English speakers' pronunciations of kimono to be atrocious. But, being the responsible academic that I at least pretend to be, I wanted to know exactly, word for torturous word, what Lederer said. So I wrote to Lederer requesting a recording or transcript. After a little back-and-forth, Lederer eventually provided me with an electronic version of a chapter entitled "In Praise of English" from his 1991 book The Miracle of Language in which the claim in question is supposedly supported by evidence.

The main thesis of this 12-page (double-spaced, 12-point type, standard margins) chapter is as follows:

"The emergence of England and then the United States as economic, military, and scientific superpowers has, of course, contributed to the phenomenal spread of the English language. But the essential reasons for the ascendancy of English lie in the internationality of its words and the relative simplicity of its grammar and syntax."

As you might suspect, the evidence adduced to back this thesis up is seriously lacking. The support for the "internationality" claim consists of lists of words borrowed from many languages into English; no comparison is made with borrowings in any other language, lists or otherwise. What's more, not a single shred of evidence for the "relative simplicity" claim is offered: not one rule of English "grammar and syntax" is mentioned, not even obliquely, much less compared to the rules of some other language.

Returning to the kimono vs. baseball example, we can imagine what Lederer might have been thinking: Japanese admits a proper subset of the types of syllables that English admits. But doesn't this make English syllable structure rules more complex than Japanese syllable structure rules? Besides, this all ignores the fact that English and Japanese have nonoverlapping sets of phonemes, and that their phonological rules are also not in a subset/superset relationship. Ask an English speaker to pronounce sukiyaki, with an unrounded high back vowel and high vowel devoicing, or futon, with a bilabial fricative, high vowel devoicing, and a nasal glide, or any of the many other words English has borrowed from Japanese, and Japanese speakers will giggle just as much as English speakers might at besuboru (or any of the many other words Japanese has borrowed from English).

I wrote again to Lederer to point some of this out to him, but never got a response. I did talk to him on the phone recently, however, and took the opportunity to mention that Japanese has sounds that English doesn't have. Lederer responded that he wasn't aware of that, though he didn't thank me for raising this point. You'd think he would, because he also mentioned to me that he considers The Miracle of Language to be one of his "serious" books, and that he writes all of those other popular/humorous books in order to support his writing of the more serious ones. Clearly, "serious" does not mean "held up to basic standards of scholarship" in Dr. Lederer's view.

*(Lederer's co-host recently left the show due to a "dispute over contract language", and there was a search for a replacement co-host. I'm eagerly waiting to find out who, if anyone, got the job.)


Posted by Eric Bakovic at 02:08 PM

Isis Fest, with emergent free-bees

I've gotten a lot of fascinating feedback on my "... is is ..." post, from people who have studied the construction for a lot longer than I have, and in more depth than I'm ever likely to. We're talking about sentences like

(link) The worst thing is, is that procrastination is so easy to stop.

(Be warned: the rest of this post is just lightly-commented references and links. Some more easily-digested commentary is likely to come along later, but if you're interested in the construction, you'll want to look at this stuff.)

Patrick McConvell sent in a couple of early references:

Bolinger, Dwight. 1987. The remarkable double "is". English Today 9:39-40.
McConvell, Patrick. 1988. To be or double be: current change in the English copula. Australian Journal of Linguistics 8.2:287-305.

along with a brand-new set of slides from a talk he gave a couple of days ago, entitled "Catastrophic change in current English: Emergent Double-be's and Free-be's." [warning -- these are ~1MB files -- .ppt, .pdf].

Arnold Zwicky emailed to say that

Locally, we call it "Isis", with a bow to ancient Egypt. (if you know some Mozart, you can even sing it.)

In any case, I have quite a lot to say about it. More bibliography, *lots* more analysis, the observation that the historical source(s) of the construction and its current status aren't necessarily (in fact, aren't) the same, and the observation that there are several distinct systems for its use now. Lord knows how i could get this into a LL posting...

Arnold sent in a bibliography, a bunch of notes, and a handout from an Isis Fest on Memorial Day 2003. He'd like to update the bibliography before posting it, and edit the notes in various ways, but here's the IsisFest handout (.pdf) while we're waiting.

Posted by Mark Liberman at 08:02 AM

The thin line between error and mere variation II: going nucular

[Continued from part 1]

Ordinary people, faced with what are for them deviant, "wrong", bits of language, see nothing but a mistake, period. They are resistant to the linguist's idea that there could be a rationale for the "mistake", even a system to it, or that, in fact, the very same thing could result from different sources or represent different systems. (This attitude presents a tough challenge when we teach beginning linguistics courses -- not only when we talk about dialects, but also when we talk about language acquisition. One of the hardest lessons for many students is that instead of saying what's wrong, what people "can't" or "won't" do, they should be describing what people *do*, and making hypotheses about *why* they do that.)

Geoff Nunberg's "Going Nucular" piece makes a significant advance in trying to get these ideas out to linguistically unsophisticated people. First, it makes an inadvertent/advertent distinction (via the labels "typo" vs. "thinko"); some people say "nucular" because they've inadvertently reshaped the pronunciation to fit a common -ular pattern for learned words (tabular, globular, tubular, vernacular, oracular, popular, spectacular, etc., but especially molecular), but other people say it because they think that (at least in some contexts) this is the way the word is pronounced. What Nunberg doesn't stress is that these days virtually everybody who says "nucular" is in the second group; though the support of other -ular words helps to make "nucular" sound right, these people are saying it because other people say it. (The same point can be made for almost any innovative usage. Though hypercorrection surely played some role in the development of nominative coordinate object pronouns -- the famous "between Kim and I" -- for some time now people with this usage have it because that's what they hear, with some frequency, from the relevant people.)

Second, Nunberg doesn't stop there, but speculates some about the possibility of different systems for the use of "nucular". In particular, he cites at least one speaker for whom "nucular" refers specifically to nukes, with "nuclear" used in expressions like "nuclear family" and "the nuclear material of the cell". This is a tremendous advance, with many analogies in other areas (there are several different systems of nominative coordinate object pronouns, several different systems of multiple negation, and so on), but it stops well short of telling the whole truth. To do that, the whole discussion has to be re-framed.

Instead of talking about "nucular" as a mere thinko, we need to treat it as a variant pronunciation for a word, an alternative to "nuclear". Just like alternative pronunciations for: radiator, apricot, tomato, envelope, and many, many other words (with item-specific variants). So, put aside judgmental attitudes for a while, and ask how people use these alternative pronunciations. There are five types of systems:

Type 1: "nuclear" all the way. (This is my system, for what that's worth.)

Type 2: free variation, or as close as people come to this. While you might be able to discern reasons for one choice or the other in particular contexts, for the most part the motivations for choosing one variant over the other are too context-specific, too idiosyncratic, too much in the moment: inscrutable, in fact. As far as I can tell, that's my situation for the /a/ vs. /E/ pronunciations for "envelope", and for the cursive vs. the printed variants for the capital letter a, even in my first name.

Type 3: variation according to context, say according to formality, with "nuclear" as the formal, fancy, or scientific pronunciation, and "nucular" as the informal, homey, everyday pronunciation. My own pronunciation of "tomato" is mostly /a/ (thanks to living with an /a/-speaker for decades and to residence in the U.K. for significant periods), but more and more I'm inclined to use /e/ when speaking to Americans.

Type 4: variation according to semantics, as in the nucular-nukes variety reported by Nunberg.

Type 5: "nucular" all the way; the -ular pronunciation is *the* pronunciation for the word. There are, I believe, very many speakers of this sort. They understand that other people say the word differently, just as I understand that some people have /ae/ in "radiator" or "apricot", instead of my /e/. That's ok for them, but what I do is ok for me.

Nunberg suggests that George W. Bush might be a Type 4 speaker, but he could well be a Type 5 speaker. Instances of the "nuclear" pronunciation are so rare in his speech as to preclude the other three possibilities.

There's a further dimension to all of this, namely the question of intentionality, or conscious choice. Nunberg is inclined to see GWB as having *chosen* the "nucular" variant, to project a particular persona; in even less neutral phrasing, GWB "puts on" his folksy, Texas-rancher, hypermasculine persona, with the linguistic accoutrements that go along with that.

I don't doubt that some people sometimes consciously re-shape their behavior in certain respects. But I think that most accommodations to social varieties and most constructions of personas via behavior (linguistic and otherwise) happen below the level of consciousness, usually with very little awareness of what features are being chosen or why. (In a sense, this *has* to be true. There are just too many bits of behavior for choices among them to be under conscious control. This is especially true for bits of linguistic behavior, which have to be produced in tiny amounts of time, many at the same time.)

Some years ago it was pointed out to me that when I'm trying to be very precise in talking about linguistics, I use dental rather than alveolar articulations for consonants. Eventually, this astute observer (Ann Daingerfield Zwicky) noted that I'd never done that before I went to graduate school. After some reflection on this odd state of affairs, we realized that I was reproducing the articulations of my graduate school adviser, Morris Halle, in my Serious Linguist persona. All entirely unconsciously, I assure you.

Anecdotes like this could be multiplied endlessly. There's even some research on the matter. As a result, I'd be very very cautious in attributing someone's ensemble of linguistic features to conscious choice. GWB could come to his pronunciation "nucular", his extremely high use of "-in'" over "-ing", and so on without ever thinking any of it through (and without consciously rejecting standard or formal variants). He could get there just by behaving like the kind of person he believes himself to be. Like, in fact, the rest of us.

[Also posted 6/28/2004 to the ADS-L listserv]

Posted by Arnold Zwicky at 06:26 AM

The thin line between error and mere variation (part 1 of 2)

In trying to write some commentary on Geoff Nunberg's discussion of "nucular" (in a 10/2/02 Fresh Air commentary on NPR, now in his collection Going Nucular), I've been reflecting on that thin line between error and mere variation.

Nunberg begins this piece by drawing a distinction between "typos" and "thinkos" -- in my terms, between inadvertent errors, things that are "wrong" for the person who produces them, and advertent errors, things that are ok so far as the producer is concerned but "wrong" from the point of view of at least some other people. (Faced with a typo, you call in the psycholinguist; faced with a thinko, you call in the sociolinguist.)

The distinction is a familiar one in the literature on language errors. In the typo camp, you have, for instance, Fay/Cutler malapropisms (so called from a 1977 article by David Fay and Anne Cutler), like my (alas, only too frequent) productions of "verb" for "vowel", or vice versa, in class lectures. In the thinko camp, you have, for instance, classical malapropisms (so labeled by me in a 1979 article), like "behest [beset] with all these difficulties", written by someone who *meant* to write "behest" (and was willing to defend this word choice). It might be hard to decide, in any particular instance, which kind of malapropism you're looking at, but in principle, with more information about the producer and their intentions, you can sort things out.

But matters are not so clear in the world of thinkos. The deviance of thinkos ranges from extremely high, as in clear examples of classical malapropisms, to extremely low, as in violations of the more fanciful proscriptivist pronouncements, like the one against possessive antecedents of pronouns.

(A side issue: It would be a good thing to expunge the moral language usually applied to thinkos, even by Nunberg, who should know better: Typos "can make you look foolish, but they aren't really the signs of an intellectual or ethical deficiency, the way thinkos are. It's the difference between a sentence that expresses an idea badly and a sentence that expresses a bad idea." (p. 59 of GN). Look at the most extreme case... Someone who writes "behest" for "beset" is certainly wrong. But they aren't morally defective, or evil, or stupid. Technically, they are very specifically ignorant, of one of the zillion facts about the world one might be called on to marshal in everyday life. It's like getting Björk mixed up with Bork, or not knowing at all who Hugo Wolf is.)

The "behest" thing is, yes, an extreme case. But things don't get any clearer as we work towards possessive antecedents. They just get messier and messier, in fact. As soon as we leave the clear "behest" zone (where almost everyone says the usage is wrong for them), we have to confront a world in which usage is contested and variable.

We come first to the Retart Zone, a label I use to honor a poster to the newsgroup sci.lang, 6/24/04:

Called by Peter Daniels on the voiceless final consonant in his insult "What a retart", the poster responds:
And what's wrong with my use of "retart"...it's a perfectly acceptable word when describing those who are SLOW. A retart is a SLOW person.

In later discussion, the pugnacious poster concedes that (some) other people say, and write, "retard", but maintains that *his* version is perfectly fine. That is, he claims that this is a case of variation, not error. He is surely in a small minority in his pronunciation, but probably not a loner; I have no doubt that some searching would turn up others with his pronunciation.

Certainly, there *are* plenty of examples of variation. Some English speakers (I am one) have a voiceless final consonant in "with", some have a voiced final (a fact that I did not appreciate until I gave an exercise in phonetic transcription in an introductory linguistics course); I believe that the voiced variant is statistically the predominant one, by a considerable margin (some dictionaries list only this pronunciation), but theta-speakers like me don't provoke dark looks and snickers with our minority pronunciation. Similarly, some English speakers (including a great many South Africans) have edh rather than theta in the "South" of "South Africa"; I believe that they are definitely in the minority in the English-speaking world, but who am I, an American theta-speaker, to tell South Africans how to pronounce the name of their country? Similarly, many New Yorkers stand "on line" rather than "in line"; they're a small minority in the English-speaking world, and they are aware (at some level) that other people use "in" here, but everybody knows that people speak differently in different places, so where do you get off telling them they're "wrong"?

On the other hand, we do tell "needs V-ed" speakers (again, a small minority in the English-speaking world) that they're "wrong". These folks are aware (at some level) that other people say "needs V-ing", but most of the people they know personally are "needs V-ed" speakers, so from their point of view, they're talking appropriately, and the dark looks and snickers from outsiders are just nastiness.

Even in the Retart Zone, we're in trouble. What's unremarkable variation, and what's a thinko-type error?

But then we get to the Nucular Zone, the Hone-In-On Zone, and the Another-Thing-Coming Zone. The percentage of people who use the (historically) innovative variant steadily increases. (Google web searches have "home in on" somewhat above "hone in on", 64,200 to 35,200 in raw numbers, but "another thing coming" *way* over "another think coming", 21,400 to 5,830.) Those who use the innovative variants are probably aware (at some level) that other people have other variants, but for them this is just unremarkable variation, and their version is, well, *their* version, and perfectly ok.
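The raw counts just quoted can be turned into rough shares for the innovative variant. The following is only a minimal sketch of that arithmetic in Python; the dictionary layout and the function name are made up for illustration, and Google hits count web pages, not speakers, so the shares are at best a loose proxy for how many people use each variant.

```python
# Raw Google hit counts quoted above (June 2004).
# Hits count web pages, not speakers, so treat the shares as a rough proxy.
counts = {
    # "older" here is "home in on"
    "hone in on": {"innovative": 35_200, "older": 64_200},
    # "older" here is "another think coming"
    "another thing coming": {"innovative": 21_400, "older": 5_830},
}

def innovative_share(pair):
    """Fraction of all hits for the pair that belong to the innovative variant."""
    total = pair["innovative"] + pair["older"]
    return pair["innovative"] / total

shares = {name: innovative_share(pair) for name, pair in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On these figures "hone in on" gets about 35% of the hits for its pair, and "another thing coming" about 79%.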

The argument from history isn't going to carry much weight for these people, and anyway it's intellectually disreputable, since very few current standard variants have a pedigree going back to Old English; almost everything was an innovation at some point. How to decide when the ship of language change has sailed?

The argument from authority won't carry much weight, either. I can tell you that *I* (a noted linguist and writer) use "nuclear", "home in on", and "another think coming" (and "too big a dog" rather than "too big of a dog", but don't use positive "anymore", etc.), but you're entitled to ask why I should be telling you how to talk and to note that anyway you think I sound bookish and prissy.

If anything might work, it would be the appeal to the practice of those who are noted for their abilities in writing and speaking -- there's a reason AHD ended up with a Usage Panel, awkward though it turned out to be -- but in fact these experts are quite often divided in their practices and in their opinions, and in any case they're not necessarily models for writing and speaking in other than formal contexts.

The fact seems to be that the line between mere variation and error is largely a matter of intellectual fashion -- lord knows why speaker-oriented "hopefully", restrictive relative "which", split infinitives, logical "since" and "while", etc. get picked on while other variants thrive without criticism -- rather than a result of observation and reasoning. In this context, the label "thinko" doesn't really seem much better than "error" or "mistake".

[on to part 2]

[also posted 6/26/2004 to the ADS-L listserv]

Posted by Arnold Zwicky at 05:23 AM

NEWSFLASH: Safire reads Language Instinct

Of course, I could be wrong about this. All I have to go on is the following paragraph in Safire's latest On Language piece, in a discussion of gone missing:

Is it good grammar? It may well stretch our hard-wired sense of syntax. To critics, a simple is missing would solve the problem. But because gone missing has acquired the status of an idiom, which is ''an unassailable peculiarity,'' it is incorrect to correct it. As the fumblerule goes, ''idioms is idioms.'' Relax and enjoy them.

Our hard-wired sense of syntax? Like, when he says our, does he mean our as in we and us? Meaning that the opinions of people who don't have Language Maven stamped on their business cards actually count? And when he says hard-wired, is he referring to some sort of like, you know, genetic endowment thingy? Not a tablet of official rules given down to Safire on Mount Sinai? Oh my, how sophisticated!

I don't think Safire thought of that stuff about hard wires himself. I think he's been reading linguistics. Or at least Pinker's pop Chomsky. (What? Chomsky and Pinker are related?)

Even more amazing, Safire seems to completely buy the idea that usage can trump what would otherwise be good grammar. All you have to do is find an excuse to label your preferred usage idiomatic, and it's just fine and dandy with the relaxed Mr. Safire. For idioms may be peculiar, but as everyone knows, they are unassailable.

The term fumblerule appears to be Safire's. He has a book about them which I have not read. It is called Fumblerules: A Lighthearted Guide to Grammar and Good Usage, and it seems to be a book primarily devoted to helping you avoid ordinary usage in favor of strange rules. For example, it apparently tells you not to end sentences with prepositions. According to a definition I found which is probably Safire's, a fumblerule is a mistake that calls attention to the rule, and the lighthearted twist to the book appears to be that every rule is stated in such a way as to break itself. I like this idea. Apart from introducing self-referentiality (which, of course, is something I myself try to achieve at least once in every blog post), the fumblerule style has the benefit of making traditional mavenesque prescriptions sound exactly as silly as they are. Here is the preposition fumblerule (which I found in this excellent critique):

Fumblerule #49 Never Use Prepositions to End Sentences With

But the rule Safire cites in the paragraph above, idioms is idioms, is very special. If applied widely enough, it is the exception to end all rules. What if ending a sentence with a preposition becomes idiomatic? Relax! Enjoy it!

Safire, if I read him correctly, is no longer one of the critics, those naive language experts who think that since gone missing can be replaced by is missing, it should be. No, Bill Safire is one of us. Welcome, Bill, we've been waiting a long time for you.

Posted by David Beaver at 03:09 AM | Comments (11)

June 28, 2004

More Dogs of Language

[Note: the author of this post is Arnold Zwicky. It appears over another name by accident]

Hot on the heels of the Jennee the Talking Dog story comes a Palo Alto Daily News opinion column -- 6/28/04, p. 12 -- by Rowland Nethaway (RNethaway@dailynewsgroup.com) summarizing the achievements of Rico and Jennee and putting forward a third contender for "absolute king of the smart dogs": Jim the Wonder Dog, of Marshall MO, who died in 1937 and is memorialized there by a bronze statue, park, and small museum.

To hear Nethaway tell it, when Jim, a black and white Llewellyn setter, was 3, he "amazed [owner Samuel] Van Arsdale [elsewhere: VanArsdale] during a hunting trip when Jim trotted to the shade of an [sic] hickory tree after his owner suggested they rest there. Van Arsdale... then asked Jim to show him an oak tree, which Jim did. Then Jim obliged Van Arsdale by showing him a walnut tree, a cedar tree, a stump, and a hazel bush."

The whiff of Clever Hans gets stronger and stronger as the story unfolds. "For the rest of his life, Jim displayed signs of great intelligence by obeying commands as though the dog could understand English, along with a number of foreign languages, assuming English was Jim's native tongue."

Not only was Jim multilingual, he "could go outside and find a cars [sic] by their colors, their owners, their make and even their license plate numbers. Jim could also pick out people by the color of their clothes or even their occupations. Dogs are supposed to be colorblind."

Even admitting that dogs are not in fact colorblind -- which, if it were true, would have made Jim not merely astonishingly smart, but also telepathic -- and admitting that some dogs distinguish people in uniforms from everybody else, which is at least a stab in the direction of picking out people by their occupations, this story is way too good to be true.

To appreciate just how extravagantly good the story is, compare Jim to Rico. Rico was taught to respond to one of some roughly 200 German expressions (uttered by his owner) by fetching the specific object the expression named. That is, Rico learned responses to what we would think of as proper names (though, as other critics have noted, it's not clear that the dog viewed the expressions as names), denoting individuals. Jim, however, is described as responding to category names.

Now, dogs are certainly capable of acquiring category knowledge (beyond the categories, like the dog category, that are good candidates for innate knowledge): uniformed vs. non-uniformed person, for example. Presumably, they are capable of associating names (pronounced by people) to these categories, although that wasn't actually demonstrated in the Rico study. Presumably, they are capable of picking up some of these associations spontaneously, without explicit training, although spontaneous category learning wasn't demonstrated in the Rico study, either, which used a standard behaviorist stimulus-reward design. Jim, in contrast, is supposed to have spontaneously acquired categories like hickory vs. oak vs. walnut vs. cedar (this is not entirely preposterous, though you could wonder why a dog might have an interest in these distinctions) and to have spontaneously learned associations between these categories and their English names (even supposing that his owner chattered on incessantly to Jim about vegetation, makes of cars, and the like, this really is preposterous).

Well, Jim the Wonder Dog died in 1937. But, like any world-class phenom, he has a website: http://www.jimthewonderdog.com/. His talents, as reported there, substantially exceed Nethaway's account, since he exhibited not only phenomenal abilities in comprehending English (and Italian, French, German, Spanish, and Greek), but also, like Jennee, communicated to human beings, though not vocally: "As Jim could not speak, a variety of answers were written on slips of paper even in different languages and Jim would always pick out the correct one."

As a sideline, Jim predicted the winners of elections, the World Series, and the Kentucky Derby. "And most amazingly, he could predict accurately the sex of an unborn infant." Rather less amazingly, he seemed to know "which fields contained birds and which ones didn't."

There's a tape of Jim performing. A book. And other merchandise. And you can contribute to the Jim, the Wonder Dog Memorial [sometimes printed with a comma, sometimes without]. Everything except links to Clever Hans sites.

Posted by Mark Liberman at 09:52 PM

Scouse is getting scouser

So says Kevin Watson, quoted in an article in the Liverpool Echo.

You can listen to some scouse examples in a .swf animation by Patrick Honeybone.

According to Mike Kemble's "Lernin' Yerself Scouse" page,

Scouse - or to give it its full title, Lobscouse, is of course a food rather than a dialect; it is the native dish of the Liverpudlian, or Scouser. Scouse is to Liverpool what Bouillabaisse is to Marseilles or Schnitzel is to Vienna.

I'd heard the term "scouse" for the Liverpool dialect, but did not know this etymology, which the OED agrees with, and fans of Patrick O'Brian will be happy to learn about -- if they've previously shared my ignorance of the lobscouse/Liverpool connection, at least. Kemble describes scouse as follows:

A simple stew made from the cheapest cuts of meat, usually mutton, boiled with potatoes and onions. The meat ingredient is optional, without which the Scouse becomes Blind Scouse. Either kind is eaten with red cabbage pickled in vinegar. However, like the years of poverty, Scouse is now part of the history and the visitor to Liverpool will search in vain for a restaurant that serves Liverpool's own dish, although it is sometimes possible to find Irish Stew, a direct ancestor, on bills of fare. The author found in a German Cookery Book the following translated recipe.

Labskaus (Sailors dish, original recipe)
Boil a piece of fairly lean salt beef (or equal quantities of beef and ham) till soft and chop it into coarse pieces. Meanwhile boil some potatoes in unsalted water and add a great quantity (!) of small onions which have been braised in butter. Mash all of this together, season with pepper and pour over it enough of the meat stock to produce a mash of soft consistency. This simple dish is extremely tasty and nourishing, especially when taken with pickled cucumber and a glass of beer.

Posted by Mark Liberman at 09:00 PM

jmfuw ovorzuhle mtworo - instant degree!

The first paragraph of an email message I received today read as follows:

jmfuw. ovorzuhle mtworo. nrsjdmu yjesj umxcbd vvxccr kxszujpyv twdmhngn rjgccvx nfanfttgj dlwoikmqa vrbytalpc qbeoaqst rizxdn cigoli kferlsrtt- lagna nuuvna llsaw bkrxmexmi afimqwiqm ezyayo txuyuxkrd sxahjkfio bdndt uozxzikqw. hbgedjmsg fjdru- ypuuukoo okwxjaua uzbao xxoxd rcmtar nkjsf kvqhezojd. oacyce. kcdlvwxxo vwlpufdgl feoiwwo niikoa atlwia dwwjik uhcja jeuio ouiphubfv jwrfewd hkhqu djjuw isdjs

Have you ever had one of these? I think I know what it all meant. And I didn't even need to use my secret decoder ring (though, being a linguist, I do have one, of course, and I have a babel fish too). Let me explain.

The idea, I think, was to send enough plain text gibberish at the top of the message to delude spam filters into thinking that the message is ordinary personal plain text mail. Spam-filtering robots can't really read, of course. They're good at spotting HTML-heavy advertisements, but they don't read on to the end before they make a snap judgment: just as the rest of us sometimes do (admit it), they just browse the first bit to get the general impression. On the basis of that they form their prejudices about whether the message should be blocked.

As a statistical approximation to what English text is like, the above is pretty terrible (three letter u occurrences in a row??), but annoyingly, it worked: my spam robot was fooled (I have had some sharp words with it, and it has undergone some retraining).
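The "three u's in a row" tell is easy to check mechanically. What follows is just a toy sketch in Python, not a description of how any real spam filter works: the function name and the choice of sample (a stretch of the gibberish quoted above) are assumptions made for illustration.

```python
import re

# Matches any lowercase letter repeated three times in a row, e.g. "uuu".
TRIPLE = re.compile(r"([a-z])\1\1")

def words_with_triples(text):
    """Return the words in `text` that contain a run of three identical letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if TRIPLE.search(w)]

# A stretch of the gibberish quoted earlier in this post.
sample = "hbgedjmsg fjdru- ypuuukoo okwxjaua uzbao xxoxd rcmtar nkjsf kvqhezojd"

print(words_with_triples(sample))  # → ['ypuuukoo']
```

Runs like this are vanishingly rare in ordinary English text, which is part of why the gibberish is such a poor imitation of it.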

Following the above appetizer, of course, was the main dish of spam that the message conveyed: an HTML-laden ad for one of those degree mills that give diplomas for money to lazy morons who then can say "I have a Master's degree!" in their job interviews so they get hired in some government job only to be found out and humiliated and fired later for having faked their credentials. Thanks, but no thanks. I happen not to have any Master's degrees, and I'd love to have one, but I don't think I'll get one this way.

Posted by Geoffrey K. Pullum at 06:19 PM

Harkening back

I read Mark's post on "harp back to" vs. "hark back to" this morning and then, not 15 minutes later, coincidentally read an article in which both "harks back to" and "harkens back to" are used (the former in the article's abstract, the latter in the body).

Subsequent Google searching revealed the following:

                      whG                          whG
harpen back to          0    harken back to        15,100
harpens back to*        9    harkens back to       38,400
harpened back to        0    harkened back to**     5,480
harpening back to       0    harkening back to**   13,100
TOTAL                   9                          72,080

* All 9 hits for "harpens back to" are separate copies of the same music review: "The later songs are more bare in a way; the band sheds some of the piano and effects and harpens back to their old sound a little."
** I also found a small handful of hits each for "harkenned back to" and "harkenning back to".

This is not inconsistent with Mark's "harp on" hypothesis, but I think it offers more support for the hypothesis that the "harp back to" variants are cases of (lexicalized) place assimilation: since none of the forms with -en pit the /k/ against the /b/, alternate (assimilated) forms with /p/ are not expected and thus not found.
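For anyone who wants to re-check the column totals in the table above, here is a minimal Python sketch. The counts are copied from the table; the variable names are invented for illustration.

```python
# Google hit counts copied from the table above (June 2004 figures).
harp_counts = {
    "harpen back to": 0,
    "harpens back to": 9,   # all nine are copies of one music review
    "harpened back to": 0,
    "harpening back to": 0,
}
hark_counts = {
    "harken back to": 15_100,
    "harkens back to": 38_400,
    "harkened back to": 5_480,
    "harkening back to": 13_100,
}

# Sum each column to reproduce the TOTAL row.
harp_total = sum(harp_counts.values())
hark_total = sum(hark_counts.values())

print(harp_total, hark_total)  # → 9 72080
```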

Incidentally, the OED lists "harken" as a variant of "hearken", of which most of the senses are more or less the same as the first three of "hark" ("give ear, listen"). None of the senses refer to the hunting-dog-call business, and none are declared to be used with "back".

[ Comments? ]

Posted by Eric Bakovic at 05:18 PM

The Dogs of Language

In the wake of reports about Rico the German Wonder Dog [Science article here, Language Log discussion here, here, here and here], dog owners, I suppose inevitably, are singing the praises of their pets. Now this, from the 6/27/04 Palo Alto Daily News, p. 6: Owner says dog is talking:

The English language is not solely reserved for humans. Jennee, a 10-year-old mixed-breed Laborador retriever and pit bull who resides in Redwood City, has been taught to speak several English words through the lack or presence of food. The canine was bought from a Labrador shelter in Redwood City. Jennee's owner Paul Severino said Jennee's journey through the dictionary began three years ago while his sons were wolfing down a pizza pie and started flipping pepperoni slices in her direction. "My kids used to throw pepperoni slices at her all the time," Severino told the Daily News. "And one day I started giving her commands to say 'rebberoni." [PADN's punctuation] Severino said a package of hot dogs and a tupperware container of ground beef is a familiar and friendly site [sic] for Jennee. The genius canine can also say hamburger, I love you, hungry, London broil and foooooooooood, Severino said. "I've literally had people fall on the floor laughing," Severino said.

There is an accompanying photo, with the caption: "TALKING DOG -- Jennee begs for a treat from her owner Paul Severino of Redwood City." Sweet dog.

The article doesn't say if Jennee rejects hamburger or pepperoni when she calls for London broil. Somehow I doubt it. (Rico, at least, usually fetches the thing he was told to get.)

Severino's description of his training regimen is desperately thin. (For Rico, we have a pretty detailed account of the training.) Merely telling a dog -- even a dog that's devoted to pleasing, not to mention hungry -- to say "rebberoni" just won't do.

By a wonderful accident, the same day's New York Times Book Review reports the paperback publication of Carolyn Parkhurst's The Dogs of Babel, a book that was recommended to me a year ago by a neighbor who knew I liked dogs (well, certainly, her dog) and was a linguist whose partner had just died. Unlike the linguist protagonist of Parkhurst's moving, creepy book, I didn't have my partner die in mysterious circumstances, with the family dog as the only witness, so I didn't assuage my grief by trying to teach a dog to speak (and testify as to the cause of death), as Parkhurst's protagonist does, with predictably unsatisfying results. The book is about love and grief and self-deception and, alas, an underground organization devoted to the mutilation of dogs (modifying them so as to make speech easier). Not always easy to take, but, as I said, moving.

Posted by Arnold Zwicky at 01:47 PM

Big disagree

In a recent Washington Monthly article on Niall Ferguson, Benjamin Wallace-Wells cited a deverbal noun that was new to me. The context is a talk by Ferguson at the Council on Foreign Relations:

In the row in front of me, a broad-shouldered, uniformed officer stood up. "Big disagree here, sir," he bellowed. "Big disagree with your characterization."

It's obvious what this means, and perhaps it's become a conventional idiom of objection in the U.S. armed forces these days, but it hasn't made it onto the internet much. I could find only a couple of examples, neither one military:

(link) Well, I've got to butt in on this one and put in a big DISAGREE.
(link) i have to give a big DISAGREE!

I wonder if this usage comes from response categories in surveys: "AGREE ... DISAGREE."

The American officer's reaction came in response to Ferguson's assertion that recent problems in Iraq are an inevitable consequence of military occupation:

"In behaving the way they did," Ferguson said, "those soldiers and military policemen [at Abu Ghraib] were largely doing to their prisoners what routinely people in the American military do to new recruits."

The American officer who objected went on to say

"The institution I have spent my life in abhors what went on in Iraq," he said. "It's not the way we treat anyone-- a fresh recruit or a plebe at West Point." The crowd clapped vigorously. In less than 10 minutes, Ferguson had pulled off that rarest of Washington double plays, alienating liberals and conservatives alike.

Based on my own experience, which was limited but took place at a time when the U.S. Army might have been rougher with its recruits than it is now, I have to agree with him. Basic training was often painful and humiliating, and not always safe, but nobody was ever attacked by dogs or made to stand on a box holding (even fake) electrical wires. More generally, no one was ever (what I would call) tortured.

I suppose that what Ferguson meant was that military recruits are always subjected to a systematic process intended to break them down and build them up again in a new way. When it was done to me, aspects of this process were things that might be called torture in another context: six weeks of systematic sleep deprivation, or running a couple of miles while holding a heavy rifle over your head, or belly-crawling in freezing mud while drill instructors yell at you and kick you back flat if you get up on your hands and knees, or having your bunk upended at 4:00 a.m. if you don't get out of it before the sergeant can get to you.

But context changes interpretation. These things were done as part of a process of training and initiation, not part of a process of interrogation. I suppose there's a common thread of subjugation, of breaking down someone's will by pushing them beyond their normal physical and social boundaries, but there's still a big difference between something done to recruits or inductees and something done to prisoners. Even if it's the same thing, it's not the same thing.

And there were definite limits. They were often crossed, but not always with impunity. While I was in basic, trainees had a clear concept of what the limits were, and were quick to object when they thought something was wrong. Objections were usually overruled and even punished, but drill instructors who were too fond of hitting recruits, or whose trainees were injured or killed as a result of questionable practices, were brought up on charges and disciplined.

There were definite limits at Abu Ghraib too, and they were definitely crossed, and the people who crossed them are now in trouble. I think that this similarity is another basis for the officer's "big disagree", which I also agree with: even partial and hypocritical adherence to ethical norms is a step forward over no recognition of such norms at all.

I wonder what the history of military initiation is. Some of the techniques are apparently quite old. As I recall, Patrick O'Brian describes seamen in the British navy of 1810 or so being awakened by a call of "Here I come, with a sharp knife and a clear conscience!" Anyone who wasn't out of his hammock in time found himself abruptly on the deck. This is the same technique that was used on us in basic training, and it certainly works. Especially from an upper bunk, it's not a way of waking up that anyone wants to experience twice.

Posted by Mark Liberman at 10:22 AM

Intermittent outages

Over the past 24 hours, there have been intermittent outages in the connection between Language Log's server and the internet. Hundreds of computers in several buildings have been affected, so Penn's highly competent tech support people will no doubt diagnose and fix the problem soon. Sorry for any inconvenience; as always, the Language Log marketing department stands ready to refund your subscription fees in full.

Posted by Mark Liberman at 10:19 AM

June 27, 2004

The thing is is people talk this way. The question is is why? The answer is is (drumroll please) ...

The answer is is I don't know, but I'll speculate anyhow. Here's another case where it seems that a common syntactic pattern is a grammatical confusion:

(link) The thing is is that it all depends on the graphic card's drivers.
(link) The best thing is, is that they are spill proof.
(link) The worst thing is, is that a client believes them.
(link) "The important thing is, is that it is for all the right reasons—for Columbia College and its students," Schlossberg said.
(link) The amazing thing is is that this data can now be represented as a vector (gradient vector).

However, this time I think it's different. In the case of the extra that that I wrote about yesterday, I argued that people are indeed just confused: those are production errors. In the case of the extra is that I'm writing about today, I believe that people are not getting confused, but are producing phrases that are grammatical -- in terms of a non-standard grammar.

I could be wrong, in either or both instances. And some people might argue that it's misguided to make a qualitative distinction between production errors and grammatical modifications. But let's go on a bit anyhow.

Why do I think these are the result of a non-standard conception of English grammar, rather than just a faulty implementation of standard English grammar?

Well, for one thing, many of the examples are way too short for the speaker or writer to be befuddled by length and complexity. For another thing, I know from experience that (at least some of) the people who use double-is constructions don't see any problem with them, on reflection, whereas I suspect that (at least some of) the perpetrators of double-that constructions would see them as errors on careful reading. (Though this is a weak argument at best, since it might only show differences in degree of prescriptive awareness or obedience).

But the main reason is that it's easy to see how to make double-is constructions grammatical. One obvious idea is to treat them as variants of structures like:

(link) What the thing is, is not cowardly, but profoundly and detestably wicked.
(link) What the result is, is that when the carb gets hot, almost all of the clearance at the shaft is taken up by expansion.
(link) How serious the problem is is less important than how serious it feels to them.

These are fully grammatical in standard English -- the first one is a quote from the esteemed prose stylist G.K. Chesterton, from an essay that is well worth reading on other grounds.

This analysis is not without problems. As far as I know, every instance of a double-is construction has a plausible (fully grammatical) alternate with an overt wh-word; but the mapping doesn't always work in the other direction.

Thus I think you could get a non-standard version of the second example above:

The result is, is that when the carb gets hot, ...

but not the first one

???The thing is, is not cowardly, but profoundly and detestably wicked.

So it's not good enough just to say that [the ADJ N is] can sometimes be a noun phrase, roughly equivalent to [What the ADJ N is]. Still, this might work if we insist on the right kind of interpretation for what, which is surely different in the two sentences contrasted above.

The construction doesn't require a clause introduced by that -- it can be a bare sentence

(link) And the key thing is, is we have Elizabeth back right now.

or some kind of indirect question, as in the title of this post, or this internet example:

(link) To restate Rabbi Nachman's point about fear, life is a narrow bridge and the important thing is is not whether you are afraid; the important thing is that you choose to try to cross the bridge.

It seems to be optional -- at least, there are lots of other examples like the one just cited, where the double is phases in and out:

(link) And we've got some problems. One problem is there's misinformation about these cards. Another problem is, is that people -- they feel like it may be too complicated.

There are some examples where the complement (of the second is) is just a nominal:

(link) The answer is is verse 22.

or an infinitive

(link) I think the answer is is to have Thread B not terminate but rather have the Thread A delegate release the Mutex for Thread B when bytes are receieved.

or some other sort of element:

(link) Let’s say the answer is is no.

There's a theory of this presented in the following paper, which I haven't read (though I took 2/3 of the title of this post from its title): Tuggy, David. 1996. "The thing is is that people talk that way. The question is is why?" In Eugene H. Casad (ed.), Cognitive linguistics in the redwoods: The expansion of a new paradigm in linguistics, 713-52. Cognitive Linguistics Research, 6. Berlin: Mouton de Gruyter. Its content is summarized here. Apparently Tuggy points out that this construction gets reinforcement not only from the what-clauses, but also from real production errors, namely disfluent repetitions of is, a very common occurrence.

It's worth noting that this construction, though stigmatized, is widely used by highly educated people. I have a valued colleague who can be counted on to use it several times per lecture, and here's a quote from Bob Moffet, who is a health analyst at the Heritage Foundation and was deputy assistant secretary in the Department of Health & Human Services during the Reagan administration:

(link) But the important thing is, is that it would give individuals and families the right to pick and choose the plans they want at the prices they wish to pay and control their own health care.

I can't resist adding one more quote before closing -- the one that I took the last third of the title from:

(link) You got it; that's the question. The answer is is (drumroll please) "Delegate to the modules".

Posted by Mark Liberman at 10:31 PM

Psycho Simon

A couple of weeks ago, we discussed the fascinating cover story in the July/August Atlantic, in which James Fallows examines the debating histories of George W. Bush and John Kerry. Yesterday, NPR got around to interviewing Fallows on this topic, and as Geoff Pullum pointed out last night, the segment features Scott Simon (the host of Weekend Edition) making a bad joke about "psycho linguists".

Here's a link to the NPR Weekend Edition program segment, which is worth listening to, apart from Simon's boorish intervention, because it includes some audio clips of past debates. If you focus on the passage where Simon makes an idiot of himself, you'll notice something odd.

Fallows: A second hypothesis, which a number of psycholinguists told me, is that there's a particular form of dys-
Simon: Excuse me, I've never heard that term. Do you mean -- linguists who are, are MAD?
Fallows: No -
Simon: or - or - um - or -
Fallows: There's an actual field called "psycholinguistics", believe it or not.
Simon: OK.

Now, Fallows de-stresses "linguists" (and/or contrasts "psycho") in psycholinguists, because he's just been talking about a theory due to "the linguist George Lakoff", but in context, that's not odd. What's odd is that Simon uses the word mad when he clearly means "mentally disturbed", as in the slang sense of psycho for "psychotic". The American way to say that is "crazy". For Americans, mad usually means "angry", except in certain expressions (like "mad with rage") or when imitating British speakers.

I can't figure out why Simon would have chosen this word. Perhaps he associates puerile sarcasm with a British debating style? If so, he was insulting the British as well as the linguists.

It's worth adding that Simon was probably not telling the truth when he claimed never to have heard the term -- presumably this was just his jocular way of intervening to clarify a word that he felt his audience might not understand. Even if Simon has managed to remain unaware of the general use of Greek compounding to name scientific disciplines, the particular terms "psycholinguist" or "psycholinguistics" make the news from time to time. Slate's Today's Papers has two hits over the past few years; the Atlantic has three; the NYT index returns 14; and even npr.org has one -- in a review of The Language Instinct. More recently, Fortune had an article a few weeks ago about Annie Duke that features the term in its first sentence.

Posted by Mark Liberman at 12:12 PM

June 26, 2004

Psycho linguists?

Psycholinguistics was mentioned on NPR this morning by Jim Fallows, editor of The Atlantic Monthly, who was talking about the communication styles of John Kerry and George W. Bush, and the explanation for the latter's evolution in the direction of inarticulacy. The program host, Scott Simon, seized the opportunity to interrupt to say he'd never heard the word psycholinguist, and to ask whether it meant a linguist who's crazy. Ha! Ha! Oh, my sides are aching!

Get a grip, Scott. If someone mentioned psychopharmacology, would that elicit a similar joke? You can guess what compounds with psycho- mean. The difference, I think, is that people realize that pharmacology is a scientific subject that they may not personally know much about, so they accept psychopharmacology as a name for a psychological variant of it; but they don't think there is any such thing as the scientific study of language (everyone's an expert on that; failed sports reporters and political journalists can easily turn their hands to writing books about language), so the term psycholinguistics seems fit for a giggle. Us linguists don' get no respect.

Posted by Geoffrey K. Pullum at 07:36 PM

Oops, I did "that" again

Sometimes people put in more thats than they ought to:

(link) It's strange that given the huge community of programmers at slashdot, that the number of books isn't really that long.
(link) No but seriously, I think it's obvious that because Michael Moore eats frequently that everything he says must not have one iota of truth.
(link) I'm surprised that after a bus falling 100 feet and landing on its front, that anybody survived at all.
(link) I'm sure that given the technological complexities and the demanding financial needs of both theater and national missile defense programs, that we will be able to spend this additional $1 billion.

I used the prescriptive formulation "put in more thats than they ought to" because as far as I can see, this is not a variant or non-standard grammatical pattern, it's just a mistake. I take it as obvious that there's a difference. But in fact, I'm not entirely sure which diagnosis applies here.

[Warning: what follows is some stream-of-consciousness musing about this general question, focused on the "extra that" case. Unless you're interested in both syntax and psychology, you'll probably want to turn your attention to some of our other fine posts...]

Why do I think these apparently extra complementizers are a mistake, rather than a non-standard grammatical pattern? Logically, there are two arguments for such a position: that a "different grammar" theory doesn't work, and that a "production error" theory does.

I won't say anything specific about the failure of "different grammar" theories, except to say that I've tried to make up grammars that would license "... that ADV that ..." patterns, and don't find any of them convincing. I might well be wrong about this, and welcome suggestions.

What about the production error theory? If these repetitions of that are mistakes, why do they happen? The obvious answer is that you put that in before the adverbial, and then forget you did it and put it in a second time after the adverbial. One piece of evidence for this is that the stretch between the two instances of that is often very long:

(link) I'm convinced that given they totally fucked up the planning of this whole thing, and they're totally chained to their neo-con preconceptions, and they totally lied from start to finish, that if the same people are left in charge certainly someday this will all end, but it won't be a good ending.

However, there are a couple of problems with the forgetfulness theory. One is that the mistake sometimes happens with short adverbials, both in writing:

(link) I believe that we have a Creator who made us to live in a certain way, and that therefore that there is ‘natural law’.
(link) This ensures that all students have the same outline information in anticipation of examinations, and that therefore that coverage across discussion groups is as uniform as possible.

and in speech:

(link) MR. BOUCHER: We have not changed our position, and in fact, we believe that the jurisdiction of the International Criminal Court needs to be -- can't be established over nationals of states that are not party to the Rome statute and that, therefore, that Americans and others who are not members of the Rome statute, who participate in UN peacekeeping, need to be protected from some kind of misguided prosecution because of actions they might undertake while participating in those operations.

If the forgetfulness theory is correct, some people have very short memories, at least in some circumstances, or else such examples are the result of different sorts of mistakes, namely careless post-editing in writing, or self-correction and restarting in speech.

We could certainly check whether the double-that construction is more frequent, other things equal, when the adverbial is longer. After a bit of searching for examples, I have the (scientifically unreliable) impression that it is. This is another case where one could perhaps do some valid, quantitative Google psycholinguistics.

A second problem with the "oops I did that again" theory is that the two string positions for that seem to be semantically incompatible. This doesn't invalidate the theory, but it adds some additional implications.

We're talking about phrases with a matrix (like "It's clear", "I'm surprised", "It's obvious", "I'm convinced") and a complement clause ("that S"). Possible locations for an adverbial in such sentences include these three:

(a) ADVERB MATRIX that S ("Therefore I'm convinced that blah")
(b) MATRIX ADVERB that S ("I'm convinced therefore that blah")
(c) MATRIX that ADVERB S ("I'm convinced that therefore blah")

When the adverbial modifies the whole thing, you get (a) or (b):

Moreover, I'm convinced that our alumni and friends recognize and accept the basic premise of what I'm proposing.
I’m convinced, moreover, that this sentiment is shared by the US Administration.

but not (c):

?I'm convinced that, moreover, our alumni and friends recognize and accept the basic premise of what I'm proposing.
?I’m convinced that, moreover, this sentiment is shared by the US Administration.

In contrast, when the adverbial modifies only the complement clause, you get (a) or (c):

I'm convinced that before long she'll be sleeping in it.
Before the rebuild, I'm convinced that oil in the rocker area couldn't drain away fast enough and was getting sucked up by the PCV and blown into the intake manifold.

but not (b):

?I'm convinced before long that she'll be sleeping in it.
?I'm convinced before the rebuild that oil in the rocker area couldn't drain away fast enough and was getting sucked up by the PCV and blown into the intake manifold.

Anyhow, some kinds of adverbial phrases like "given blah" can take either scope. I can intend to say

(1) "given X, I'm convinced that Y", where I mean that X is what convinces me of Y

(2) "I'm convinced that given X, Y", where I mean that I'm convinced of the inference from X to Y.

If you object that the difference in meaning is a subtle one in most cases, you're right. After all, it's not easy to think of a circumstance in which (1) would be objectively true and (2) false, or vice versa.

It seems plausible to me that a speaker or writer may on a given occasion mean both (1) and (2) at the same time, and/or may be in a sort of mixed state of mind, part way in between intending to say (1) and intending to say (2). You can think of this as a mixture of underlying communication intentions, or a mixture of implementational choices, or both; it doesn't matter to this argument.

The point is that meaning (1) licenses the two forms (a) or (b), while meaning (2) licenses the two forms (a) or (c). Since what we actually get (in the "double-that" examples) is a blend of form (b) and form (c), it seems that the folks who produced the "double-that" examples were in a kind of psychological superposition of state (1) and state (2).

That's fine with me. I generally feel like I'm trying to say several different things at once, and I'm fond of the idea that linguistic knowledge and linguistic intentions should be modeled as distributions over linguistic structures, not as specific, individual forms. But others will find this notion less attractive.

I have a feeling that someone is going to write in to complain that all such examples are fully and completely grammatical; and that someone else will inform me that this usage was the norm in Chaucer's English, or 18th-century legal prose, or something. Well, we'll see.

Posted by Mark Liberman at 02:02 PM

Korean blog censorship

According to Blinger, it seems that the South Korean government is blocking access to several weblog hosts (at least blogger and typepad), because "they run a threat of carrying the video of the beheading of Kim Seon Il". Here's an extensive and thoughtful discussion from The Marmot, with more from him here (and some posts in between), including the news that 12 people have been arrested for uploading the video to a P2P site. Here's a story from the Korea Times, which writes that "The Ministry of Information of Communication (MIC) on Thursday said it ordered all the nation's Internet service providers (ISPs) to shut down access to Web sites that carry the execution of Kim Sun-il", and that "As a preemptive measure, the MIC also called for local Internet portals to ban searches using such terms as 'beheading' and 'Kim Sun-il footage.'"

As yet, there's nothing on Google News (at least on the index page), nor in any of the American papers whose websites I've checked.

Posted by Mark Liberman at 08:20 AM

June 25, 2004

Harping back or harking back?

In an essay in Spiked, Dolan Cummings critiques some critiques of critiques of the modern world. He observes that these critiques of critiques all take the same rhetorical stance: the complainers are accused of 'declinism', looking back to a golden age that never existed.

He quotes from three anti-declinists, two of whom accuse their subjects of harking back to a mythical past, while one uses the term harping back.

As he explains,

Leaving aside the question of whether one harps back or harks back to a golden age, clearly it is considered a very bad thing to do. The charge of utopianism works by portraying someone, often unfairly, as a hopeless dreamer, but accusing opponents of nostalgia for a golden age is an even dirtier trick. Not only are they deluded, but they are reactionary too, dreaming of the past rather than embracing the future.

But here at Language Log, we won't leave aside the question of whether one harps back or harks back. First Google:

	whG		whG
harp back to	897	hark back to	27,200
harps back to	592	harks back to	41,800
harped back to	109	harked back to	7,390
harping back to	600	harking back to	29,800
TOTAL	2,198		106,190

So writers on the internet hark back about 48 times more often than they harp back. And a good thing, too, because that's the historically sanctioned idiom.
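For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the totals and the ratio from the counts in the table above. The variable names are mine, and the counts are just the Google numbers as I copied them down, so treat this as a sanity check rather than a measurement.

```python
# Google hit counts copied from the table above (as of June 2004).
harp_counts = {
    "harp back to": 897,
    "harps back to": 592,
    "harped back to": 109,
    "harping back to": 600,
}
hark_counts = {
    "hark back to": 27200,
    "harks back to": 41800,
    "harked back to": 7390,
    "harking back to": 29800,
}

# Column totals, matching the TOTAL row of the table.
harp_total = sum(harp_counts.values())
hark_total = sum(hark_counts.values())

# Overall ratio of "hark back" forms to "harp back" forms.
ratio = hark_total / harp_total

print(f"harp total: {harp_total}")
print(f"hark total: {hark_total}")
print(f"hark/harp ratio: {ratio:.1f}")

# Per-form ratios, pairing the rows of the table in order.
for (harp_form, harp_n), (hark_form, hark_n) in zip(
    harp_counts.items(), hark_counts.items()
):
    print(f"{hark_form} / {harp_form}: {hark_n / harp_n:.1f}")
```

The per-form ratios vary quite a bit from row to row, so the overall figure of about 48 is only a rough summary.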

As the OED explains, hark back comes from a cry used to get the attention of hunting dogs, and if you want, you (or rather the dogs) can hark away, hark forward, hark in, hark off, or hark on as well.

hark, v.
4. intr. Used in hunting, etc., as a call of attention and incitement, esp. in conjunction with an adverb directing what action is to be performed: hence denoting the action .

1610 SHAKES. Temp. IV. i. 258 Pro. [setting on dogs] Fury, Fury: there Tyrant, there: harke, harke. Goe, charge my Goblins that they grinde their ioynts.

a. hark away, forward, in, off: to proceed or go away, forward, in, draw off.

[...]

b. hark back. Of hounds: To return along the course taken, when the scent has been lost, till it is found again; hence fig. to retrace one's course or steps; to return, revert; to return to some earlier point in a narrative, discussion, or argument.

1829 Sporting Mag. XXIV. 175, I must ‘hark back’, as we say in the chace.
1868 HOLME LEE B. Godfrey xli. 225 Basil must needs hark back on the subject of the papers.
1877 CRUTTWELL Hist. Rom. Lit. 223 The mind of Lucretius harks back to the glorious period of creative enthusiasm.
1882 STEVENSON Stud. Men & Bks., J. Knox 349 He has to hark back again to find the scent of his argument.
1895 F. HALL Two Trifles 31 To hark back to scientist..I am ready to pit it against your agnostic.

c. trans. hark on, forward: to urge on with encouraging cries. hark back: to recall. [...]

d. hark after: to go after, to follow.

Where does harp back come from?

First, like any other eggcorn, it's very similar in sound to the original. Second, there is probably some resonance of the phrasal verb harp on, which the AHD defines as "To talk or write about to an excessive and tedious degree; dwell on." Many of the eggcorn examples use harp back to refer to someone's complaints about something, which might well be described as harping on it as well as harking back to it:

If ever you get a tiresome old relative harping back to the good old days...
These people lament the coming of the backpacker age, harping back to the sixties and seventies when you had to drop out of society to get on the trail...
Even when Grahame wrote it he was harping back to a time that he missed...
He made no new concessions and harped back to "bold steps" he had taken and India's non-response to them.
It is still very much harped back to because it was the first and the only full study of what was needed...

To my surprise, Paul Brians' list of errors doesn't have this one.

Posted by Mark Liberman at 10:49 PM

Rural-urban bidialectalism and politics

The June 19-25 issue of The Economist reports (p. 33 of the print edition) that Claire McCaskill, who is running for governor of Missouri, has been documented by the St Louis Post-Dispatch as having pronounced the state's name "Missourah" in a commercial aimed at rural areas but "Missouree" in ads running in the cities, and (if I read the implication right) is being portrayed as two-faced and untrustworthy for it. Funny, it is generally accepted as the height of sociolinguistic sophistication to shift the shiftable aspects of your speech (vowel quality being a prime example) in the direction of the speech of those you are speaking to. A mark of respect, politeness, solidarity [though people may reject your attempt at solidarity if they think you're just imitating them, as Ray Girvan points out to me by email]. We are in awe of the Swiss when we learn that if two of them are speaking to each other in French and an Italian speaker joins them, they are likely to switch into Italian just to be polite. Yet in American politics, linguistic sophistication (like almost everything else) may actually be held against you (as I believe I mentioned once before).

Posted by Geoffrey K. Pullum at 07:20 PM

How fakirs became fakers

In connection with a post on Thomas Jefferson's attempt to learn Gaelic, I read an interesting paper by Jack Lynch entitled "Authorizing Ossian", in which he calls James MacPherson "history's most perfidious literary fakir". Lynch is being unfair to fakirs -- though in a characteristically American way. Fakirs were not fakers, before a series of 19th-century American shifts of meaning.

The OED tells us that fakir was borrowed from Arabic faqīr "poor, poor man", and has variously been spelled fokers, fuckeires, facquiers, faquirs and fakeers. Its meaning is

1. a. ‘Properly an indigent person, but specially applied to a Mahommedan religious mendicant, and then loosely, and inaccurately, to Hindu devotees and naked ascetics’ (Yule).

1609 RO. C. Hist. Disc. Muley Hamet vii. Ciij/2 Fokers, are men of good life, which are onely given to peace.
1638 W. BRUTON Newes from E. Indies 27 They are called Fuckeires.
1704 Collect. Voy. (Church.) III. 568/1 You shall take care to embark all the Facquiers.
1763 SCRAFTON Indostan (1770) 27 Bestowing a part of their plunder on..Faquirs.
1813 BYRON Giaour xi, Nor there the Fakir's self will wait.
1861 DICKENS Tom Tiddler's Gr. i, A Hindoo fakeer's ground.
1874 MORLEY Compromise (1886) 178 A fakir would hardly be an estimable figure in our society.

b. erron. for FAKER, pronounced (ˈfeɪkə(r)). U.S.

1882 in S. Poe Buckboard Days (1936) 99 Thieves, Thugs, Fakirs and Bunkco-Steerers.
1902 A. D. MCFAUL Ike Glidden xvii. 127 Each day brought its new characters, fakirs, peddlers, schemers and promoters.
1903 N.Y. Even. Post 31 Oct. 5 One may see at almost any of the downtown corners a street fakir selling shoestrings.
1932 E. WILSON Devil take Hindmost ix. 87 Some listen to a patent-medicine fakir.

If the 1932 citation is to Edmund Wilson, then Lynch is in good company. But is it?

The OED 2nd edition has 37 citations from "1932 E. WILSON Devil take Hindmost", and the new edition adds a few more. Like the fakir quote, many involve content or word uses that are specifically American, e.g. one of the citations for bellboy:

1932 E. WILSON Devil take Hindmost xxiii. 245 Glimpses as a bellboy of the luxurious life of the hotel.

The date, the prominence given to the work, and the range of topics covered make it seem likely that E. Wilson is indeed Edmund Wilson, the "American writer, critic and social commentator" who died in 1972. And I have a dim memory of having once seen a book entitled "Devil take the Hindmost", perhaps by Wilson. But it's not listed in Wilson's works, and the chronology of his life shows only "American Jitters" as a publication for 1932. Checking the RLG Union Catalog doesn't turn it up either, nor any other "Devil Take (the) Hindmost" by any other E. Wilson.

The printed copy of the OED bibliography in the "Compact Edition" is not helpful, as the only thing by an "E. Wilson" is a 1675 publication by one Edward Wilson entitled Spadalcrene Dunelmensis.

Ah, but the online OED bibliography lists Wilson, Edmund, Devil take the hindmost: a year of the slump (US ed. with title The American jitters) 1932.

OK, this settles it: Edmund Wilson used fakir in the sense of "dishonest sidewalk salesman" or something like that.

Wilson was reflecting a common usage that arose out of the American spiritualism craze of the 19th century, whose context can be glimpsed in this 1890 article on fakirs by Helena P. ("Madame") Blavatsky. Blavatsky complains about the generalization of fakir, from Muslim religious ascetic to Hindu "Yogi" to a sort of streetcorner or marketplace "producer of illusions":

First of all, we ask them why they call the "juggler" a "fakir"? If he is the one he cannot be the other; for a fakir is simply a Mussulman Devotee whose whole time is taken up by acts of holiness, such as standing for days on one leg, or on the top of his head, and who pays no attention to any other phenomena. Nor could their "juggler" be a Yogi, the latter title being incompatible with "taking up collections" after the exhibition of his psychic powers. The man they saw then at Gaya was simply--as they very correctly state--a public juggler, or as he is generally called in India, a jadoowalla (sorcerer) and a "producer of illusions," whether Hindu or Mohammedan. As a genuine juggler, i.e., one who makes no professions of showing the supernatural phenomena or Siddhis of a Yogi, he would be quite as entitled to the use of conjuring tricks as a Hoffman or Maskelyne and Cook.

It's easy to see how the idea of "street magician" was further generalized to "street salesman depending on trickery". Lynch's usage is a further generalization, removed from the context of untrustworthy American sidewalk peddlers in the late 19th and early 20th century, and applied to an 18th-century literary hoax.

There's a whiff of the eggcorn about all of this; at least, the similarity in spelling and sound is likely to have played a role in encouraging what is otherwise an ordinary process of historical meaning shift.

As a closing digression in this divergent post, I'll point out that the White Dog Restaurant, whose menu I discussed earlier, is located in what was once Helena Blavatsky's house, at 3420 Sansom St. in West Philadelphia. The White Dog's name commemorates an important event in her life:

While living on Sansom Street, Madame Blavatsky became ill with an infected leg. During her illness, she underwent a transformation which inspired her to found the Theosophical Society. In a letter dated June 12, 1875, Madame Blavatsky described her recovery, explaining that she dismissed the doctors and surgeons who threatened amputation, ("Fancy my leg going to the spirit land before me!") and had a white dog sleep across her leg by night, curing all in no time.

Posted by Mark Liberman at 09:38 AM

Jefferson the Gaelic scholar

I recently learned that Thomas Jefferson, the well-known 18th-century American linguist and politician, once set out to learn Gaelic.

Impressed by James MacPherson's Ossian, he wrote on Feb. 25, 1773 to a relative of MacPherson's that he had met, one Charles McPherson Albemarle, asking for a copy of the (nonexistent) Gaelic originals:

Merely for the pleasure of reading his works, I am become desirous of learning the language in which he sung, and of possessing his songs in their original form. Mr. McPherson, I think, informs us he is possessed of the originals. Indeed, a gentleman has lately told me he had seen them in print; but I am afraid he has mistaken a specimen from Temora, annexed to some of the editions of the translation, for the whole works. If they are printed, it will abridge my request and your trouble, to the sending me a printed copy; but if there be more such, my petition is, that you would be so good as to use your interest with Mr. McPherson to obtain leave to take a manuscript copy of them, and procure it to be done.

The letter goes on to make it clear that Jefferson proposed to learn the language in order to read the manuscripts:

I would further beg the favor of you to give me a catalogue of the books written in that language, and to send me such of them as may be necessary for learning it. These will, of course, include a grammar and dictionary.

Ossian, which purported to be translated from a body of Gaelic epic poetry analogous to Homer, was in fact mostly faked. However, Jefferson was not alone in being impressed. Among others outside of the British Isles who were deeply affected, Goethe depicted his hero Werther as preferring Ossian to Homer ("Ossian has taken Homer's place in my heart. What a world, into which this magnificent hero leads me!"), in a work based on Goethe's own unhappy love affair of 1772-73; and J.G. von Herder saw Ossian as a key example of "the songs of ancient peoples" (in his seminal essay Briefwechsel über Oßian und die Lieder alter Völker, written 1771, published 1772).

Since Jefferson is in many other ways a typical Enlightenment guy, it's interesting to see him responding so enthusiastically to this early harbinger of Romanticism, in exactly the same time period as Goethe and Herder.

At the same time, it's also possible to read into Jefferson's letter a bit of the early skepticism about Ossian's authenticity, more pointedly displayed in this 1775 letter from Samuel Johnson to MacPherson. Certainly he's taking a very American "show me the facts" attitude.

I'm confident that Jefferson never succeeded in getting the Gaelic originals of the Ossian poems, and I suspect that his plan to learn Gaelic did not go any further forward. He may have been disappointed by his failure to get the Gaelic originals, or distracted by subsequent events. And yet, I'm impressed that he wanted to try, and believed that he could.

Posted by Mark Liberman at 09:33 AM

June 24, 2004

From phrase list to drama

Trevor at kaleboel passes the time, "when a talking pig looks like being the highlight of the televisual entertainment", by transforming a Chinuk-wawa glossary into a tragic drama, with "a romantic comedy set on Wall Street ... on its way". The Effle page says that Ionesco used a similar technique to create The Bald Soprano. As I pointed out last winter, classic glossaries and phrase lists often enter into this process with a suspicious degree of ease, requiring little of the creative invention that Trevor displays in working with the more skeletal Chinuk-wawa material.

Posted by Mark Liberman at 09:42 AM

Wild-caught eggcorns preferred

Q_pheevr cites an eggcorn sighted on OxBlog -- "...soft-peddle news..." instead of "...soft-pedal news..." -- and comments that "I know that there are discovery procedures for finding eggcorns in bulk, but I think I prefer the ones that come by chance, singly, and present themselves as remarkable because of the sense they make".

Posted by Mark Liberman at 09:19 AM

Like a submarine out of the Gobi desert

Last week, I noted that Kyrie O'Connor at the Houston Chronicle has apparently coined a new word: smoothistas, for the people who mix and serve smoothies. This is obviously an analogy to barista, which has come to be used for the folks who work behind the counter at coffee bars.

At the time, I searched via Google and found nothing at all for smoothistas, and for smoothista, only a single usage in Finnish: "jotka hyppää toistaiseksi smoothista kokonaisuudesta esiin kuin sukellusvene Gobin autiomaassa". The spelling made it seem likely to be a borrowing, with the -sta part being elative case, but I can't read Finnish, and didn't have time to check with someone who can, so I left it there.

Now Stefano Taschini has sent email with the translation:

... a finnish friend told me that the sentence you reported ... is a relative clause meaning "that jump out of a seemingly smooth whole like a submarine out of the desert of Gobi."

Looking at the page where you excerpted that sentence from, he also told me that they are talking about music and this daring simile refers to some "riffs in minor mode."

OK, so Ms. O'Connor's international cross-linguistic priority on smoothista is safe. Meanwhile, though, it's been a whole week since smoothista hit print, and if anybody else has picked it up, Google doesn't know about it.

For added value, here's the OED entry for barista, showing a history in English back to 1982, or in the more modern sense to 1988:

A bartender in an Italian or Italian-style bar. Also spec. (orig. U.S.): a person who makes and serves coffee in a coffee bar (the more frequent sense in English).

1982 P. HOFMAN Rome, Sweet Tempestuous Life 24 A good barista can simultaneously keep an eye on the coffee oozing from the espresso machine into a battery of cups, pour vermouth and bitters..and discuss the miserable showing of the Lazio soccer team.
1988 Boston Globe (Nexis) 13 Dec. 61 A feisty but cordial competitor to the larger caffeine chains the [Boston Coffee] Exchange has unfurled a help-wanted poster titled ‘Learn to be a coffee barista’.
1990 Atlantic Nov. 157/2 This ritual unites all the baristas in Italy. But not everyone accomplishes the layer of light-colored crema, or foam, that is the pride of an expert espresso-maker.
1999 Dominion (Wellington, N.Z.) (Nexis) 24 Feb. (Business section) 24 New bariste undertake an intensive training programme which covers the philosophy, history, and science of coffee, and the psychology of service.
2001 Times 7 Mar. II. 5/1 The key to a good espresso lies in the barista..and whether he or she cares enough to do it right.

[Update 6/25/2004: Abnu at Wordlab has posted a note on smoothista from a naming and branding perspective. ]

Posted by Mark Liberman at 08:35 AM

Nativism clings to life at 100 or 101

Under the intriguing heading, "After Century in a Log Cabin, Emma Buck Dies at 100 or 101," the NY Times ran an affecting obituary the other day for Emma Buck, who died on the Illinois farm "originally settled by Miss Buck's maternal great-grandparents, Christian and Christina Henke, German immigrants from East Friesland who came by boat from New Orleans and settled in western Illinois, about 35 miles down river from St. Louis, in 1841."

What caught my eye was that Ms. Buck was described in the piece as "speaking in a thick German accent." That's a dramatic reminder of how tenacious foreign languages could be in rural America in the 19th and early 20th centuries. It's unimaginable that any child born today whose great-grandparents immigrated to America before the Second World War would still be speaking English with an accent -- in fact such a child is almost certain to grow up knowing no more of his or her ancestral language than the names of a few ethnic dishes, at best. But that's the specter Semantic Compositions considers in a thoughtful post entitled "Huntington contra Nunberg," which contrasts some of the things I have said about English-only in a recent LanguageLog post and an earlier article in The American Prospect with the views offered by Samuel Huntington in his recent book Who Are We? (The post is the third of a four-part discussion of Huntington, of which the other parts are here, here and here.)

As SC notes, Huntington acknowledges that patterns of Spanish retention suggest that Hispanics are following the same pattern of linguistic assimilation that earlier generations did -- though, as the Times obituary reminds us, a lot more rapidly than people did a century ago. But Huntington also suggests, as SC puts it, that "this time it's different." SC offers a cogent summary of Huntington's arguments:

Looking at Mexican immigration patterns since 1975, Huntington identifies several features which are distinctly different from previous generations of immigrants: substantially higher proportions of illegal immigration than any other ethnic group, high regional concentration (most notably in the Southwest, Florida, and New York City), persistence (no significant closing of the borders has occurred in the past 30 years), and historical presence (unlike all other migrant groups, Mexicans have a plausible ownership claim to American territory grounded in historical facts), and finally, a government which encourages the mindset that emigrants are still Mexicans first and foremost.

SC's response to this is too complex and nuanced for me to summarize here, but he winds up giving credence to the possibility of the specter that Huntington raises, though allowing that "It is probably 30-40 years too early to attempt to verify these claims empirically." And in his fourth and final post, he gives a qualified endorsement to some of the symbolic measures that Huntington advocates -- a roll-back of Executive Order 13166, for example, which mandates the accommodation of LEP speakers for programs receiving federal funding (apart from the provision of emergency services), and an end to offering driver's licence tests in languages other than English.

The point about licences SC bases on the assertion that

In most states [driver's licenses] remain exclusively the privilege of citizens. American-born citizens presumably are brought up speaking English; naturalized citizens are either presumed to have learned enough English to pass a basic examination, or to be too old to acquire adequate English skills.... The notion that a test in English permitting a privilege with life-and-death consequences is an unreasonable imposition on people who theoretically have undertaken to learn English is itself a mocking of the idea that English was learned. To the extent that American society requires mobility, and this represents a handicap to the economic opportunities of non-English-speakers, Huntington might well rejoin that this is an excellent method for deciding between the validity of his analysis and Geoff Nunberg's. If Nunberg is right, this sort of policy reform should ultimately only serve as a barrier to those immigrants who are too old to learn English; if Huntington is right, the number of citizens driving illegally should skyrocket.

This example is worth considering. For one thing, I don't know whether most states restrict the issuance of driver's licenses to citizens -- frankly, that claim surprises me -- but I do know that there is no such requirement in many states with large immigrant populations, like New York and California. And a good thing, too. When I'm driving my daughter to school in San Francisco, the last thing I want to run into -- figuratively or literally -- is a driver who is ignorant of the rules of the road because he or she had insufficient competence in English to take the license exam. Here as elsewhere, that is, making it harder for LEP residents to access various privileges and services doesn't impose a burden merely on them.

More generally, the argument against giving driver's licences to those with limited English proficiency, like the argument for rolling back 13166, rests on some questionable assumptions. First, it assumes that immigrants will learn English only if that becomes a means to attaining certain legal privileges and government services, rather than out of an interest in acquiring the cultural and economic benefits that English proficiency confers -- and by implication, it suggests that immigrants are too ignorant or lazy, or too much under the thrall of native rabble-rousers, to recognize those advantages. Hispanics are right to bristle at that implication, which has no grounding in fact.

Second, it assumes that a language learned for these reasons alone would be the vehicle for inculcating a stronger sense of identification with the national culture. (There have been many cases in which states have been able to impose a national language on minorities who were otherwise reluctant to learn it -- you think of the Slovaks in Hungary, the Hungarians in Slovakia, the Catalonians, and the Irish -- but it's hard to think of any instance in which that has enhanced the sense of identity with the national culture, in the absence of broader cultural and economic opportunity.)

Third, as I suggested in my earlier post, it presumes that English itself can be the bearer of the values implied by the phrase "Anglo-Protestant creed" -- a kind of irredentist Herderianism that linguists, at least, will recognize as a persistent fallacy in thinking about the relation between language and national identity. Somehow, that is, a people doing their daily business in English will naturally come to identify with the majoritarian cultural values it stands in for. Tell that to the Irish.

In fact, English is too useful and important to imagine that any immigrant group would be willing to turn its back on it in order to maintain a marginal, ghettoized existence. Whether the acquisition of English will continue to bring with it a sense of belonging to a national culture depends entirely on the economic and social opportunities that assimilation offers to immigrants, and on our ability to refashion the idea of American citizenship to meet new challenges. To date, the prospects are every bit as promising as they were a generation ago -- and a lot more so than they were in Emma Buck's day. Paul Starr put this point beautifully in the closing paragraphs of his review of Huntington's book in the New Republic (unfortunately available only to subscribers):

There is a legitimate case to be made.. for a deepened sense of common citizenship in America. If we want Americans to vote and to participate in civic life, citizenship has to matter for them. Huntington is entirely right when he observes that "those who deny meaning to American citizenship also deny meaning to the cultural and political community that has been America." But he is wrong, repugnantly wrong, about how to strengthen that community, and wrong also to suggest that those who disagree with him about the means of doing so are betraying the country.
In the book's foreword, Huntington remarks that Who Are We? has been shaped by his identities as a scholar and a patriot. But he has put distorted scholarship at the service of a misconceived patriotism. The idea of building American identity around an Anglo-Protestant revival would be entirely self-defeating. Far from unifying Americans, Huntington's vision of America as a re-energized Christian society would be deeply divisive. Samuel Huntington's nightmare of an American crackup could come true, but only if more people think as he does.

Posted by Geoff Nunberg at 03:05 AM

June 23, 2004

Next store neighbor

Welcoming Cass Sunstein to the Volokh Conspiracy, Randy Barnett writes

Cass Sunstein was in the office next store in his very first year of teaching and we spent quality time together that year. Now Cass is a "Visiting Fellow" of the Volokh Conspiracy. Welcome to the office next door, Cass!

"Next store", after a bit of consonant cluster simplification, is phonetically similar if not identical to "next door", and "next door" is a semantically non-compositional idiom, and "store" is roughly as close to the meaning of dwelling as "door" is, so "next store" is a likely eggcorn candidate.

And Google doesn't disappoint us:

(link) When I received these little remembrances, I often thought of a comment our next-store neighbor made after he left that first morning.
(link) He needed photos of our next store neighbor's garbage cans.
(link) It's one of those "urban nightmare stories" in which your newly befriended next store neighbor turns out to be a cold blooded mass bomber, and a mastermind who never loses.
(link) The best thing I can say about 'The Girl Next Store' is that it had all the requisite components for a Stupid Teen Movie:
(link) John Brooke is a tutor to the boy next store. The boy next store 's name is Laury.
(link) Hiding already in the alcove was a young man who lived next store to the March's with his grandfather.
(link) US 1880 Census show that Adam & Mary lived next store to Mary's parents.
(link) I think of the elderly couple who lived next store to me, so in love and wondering what was happening to them physically.

This is a new one to me, but not to Paul Brians. No doubt Prof. Barnett used the phrase as a joke :-).

[Eggcorn alert in email from Linda Seebach].

Posted by Mark Liberman at 09:21 PM

Something almost like people skills

I recently re-read Ken Macleod's SF novel Cosmonaut Keep, which is partly set on earth in 2048, after "the Fall of the Wall, the Millenium Slump, the Century Boom, the Unix rollover, the War, the Revolution." It's clever of Macleod to slip the Unix rollover in there -- it's due at 03:14:07 Tuesday, January 19, 2038 (UTC).
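For the curious, the arithmetic behind that date is easy to check. The sketch below assumes the usual story: a signed 32-bit counter of seconds since the start of 1970 (UTC), which tops out at 2^31 - 1. The names MAX_INT32, EPOCH and rollover_moment are mine, purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit counter can hold: 2**31 - 1 seconds.
MAX_INT32 = 2**31 - 1

# Unix time counts seconds from 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rollover_moment() -> datetime:
    """Return the last instant a signed 32-bit seconds counter can represent."""
    return EPOCH + timedelta(seconds=MAX_INT32)


if __name__ == "__main__":
    # Print the moment in roughly the form used in the post.
    print(rollover_moment().strftime("%H:%M:%S %A, %B %d, %Y (UTC)"))
```

One second after that moment, a counter of this kind would wrap around to a large negative number, which is the "rollover" Macleod has in mind.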

Old-fashioned computer operating systems are relevant to Macleod's story in several ways, starting (and ending) with Matt Cairns, a Scottish hacker who makes a living on the fringes of the mid-21st-century computer industry. Cairns introduces himself in chapter 2 like this:

Software project management has always been like herding cats. So I've been told, anyway, by old managers, between snorts of coke in the trendy snow-bars where they blow their well-hedged pension funds. In their day, though, the cats were human, or at least the kind of guys who are now code-geeks. These days, the programmers are programs, as are the systems analysts. My job as a project manager is to assemble a convincing suite of AIs -- not untried, but not too far behind the curve, either -- then let loose marketing strategy webcrawlers to parade their skills before the endless bored beauty-contest of the agencies' business 'bots, take the contracts and ride herd on the whole squabbling mob when a deal comes in.

You need something almost like people skills to do it, but you need to be practically borderline Asperger's syndrome to develop these skills with AI. And when you need code-geeks for the bottom-level stuff, you need to be something of a sociable animal after all. It's a sufficiently rare combination to be worth more than the average wage. I'm an artist, not a technician. It pays the bills.

I think that Macleod has put his finger on something about software project management in all eras. The fact is, the technical side has always had an aspect that is like dealing with really strange and difficult people: legacy systems, software and hardware combinations that don't quite work the way the interface definitions say they should, and so on. And then you do need to deal with other programmers, as well as system architects and perhaps even travelers from the far lands of marketing and ergonomics. There aren't very many people who are good at all this, and few of them can also hack.

Posted by Mark Liberman at 11:26 AM

"In all cases by a possible combination of any of the above"

I'd like to think that Trevor was right when he wrote that

I sometimes suspect that English is so popular (and so strongly associated with ideologies of freedom) not because of its status as the world's primary language of intercommunal transaction but simply because it is such a delightful chaos.

However, some contrary evidence is provided by the failure of the Nilotic language Nuer to become more popular.

According to the sketch of Nuer grammar in the now-online Nuer Project Field Notes, nouns have four cases and two numbers, and "[i]t is difficult to anticipate what the various case forms will be due to the extravagance in noun classes. It appears that the majority of nouns each form a class in themselves".

The sketch identifies the "possibilities of case identification" as follows, indicating that none of these subregularities outnumbers its exceptions:

In genitive singular by the suffix -kä or -ä.
In nominative, genitive and locative plural by the suffix -ni.
In nominative plural by a complete word change.
In all cases by a possible change of medial vowel to any of 3 lengths.
In all cases by a possible change of the stem vowel.
In all cases by a possible change of medial vowel and final consonant.
In all cases by a possible change of tone.
In all cases by a possible combination of any of the above.

The sketch provides a table of sample forms, which are charmingly identified as "some poignant examples".

Verbs are similarly idiosyncratic: "the possibility of stem changes in one verb are numerous. The difficulty is that they follow no distinct and easily grasped pattern."

The Nuer (or Naath as they call themselves) have suffered greatly from the genocidal conflict in southern Sudan, which alas is different from the recently-reported genocidal conflict in western Sudan.

[link to online Nuer materials provided by Language Hat]

Posted by Mark Liberman at 09:10 AM

The most untranslatable word

Perhaps the BBC hasn't had the resources to check or retract its stories on telepathic parrots and mutant frog sex because it's been busy with this:

The world's most difficult word to translate has been identified as "ilunga" from the Tshiluba language spoken in south-eastern DR Congo.

It came top of a list drawn up in consultation with 1,000 linguists.

Ilunga means "a person who is ready to forgive any abuse for the first time, to tolerate it a second time, but never a third time"

1,000 linguists? And they didn't ask me? How will I show my face at the next meeting of the LSA cabal?

OK, in fairness to the Beeb, it wasn't their survey at all. According to the story, it was carried out by "Jurga Zilinskiene, head of Today Translations". And since googling these names turns up no further information about the survey, I'll wait a bit before feeling slighted.

The thing that puzzles me, though, is where Zilinskiene turned up 1,000 linguists who know Tshiluba vocabulary. I'm beginning to get the feeling that this survey might have been a class project in one of Zilinskiene's Problematics courses...

The BBC's intrepid reporter, Oliver Conway, gives us the top three hard-to-translate words from the "survey".

In second place was shlimazl which is Yiddish for "a chronically unlucky person".

Third was Naa, used in the Kansai area of Japan to emphasise statements or agree with someone.

I'm no kind of expert on translation, but if they'd asked me, I would have been tempted to nominate some morphological category like inchoative, or some preposition or determiner. The thing about words like shlimazl is that they have pretty clear definitions, and you can always just borrow them in a pinch -- as English has done with shlimazl, which was even featured on a TV show. I'm not sure how useful ilunga will really be, but if you feel you need it, and none of the available paraphrases or approximations will do, you could just start using it.

Apparently Conway asked some similar questions:

Although the definitions seem fairly precise, the problem is trying to convey the local references associated with such words, says Jurga Zilinskiene, head of Today Translations, which carried out the survey.

"Probably you can have a look at the dictionary and... find the meaning," she said. "But most importantly it's about cultural experiences and... cultural emphasis on words."

Fair enough. But I'm still wondering about that survey. Googling Zilinskiene, I find her described by a feature in the Guardian's jobs section as having "recently won the Shell LiveWire Award for young entrepreneurs". But her company Today Translations seems to be so new that it doesn't yet have a web site indexed by Google, and likewise her LiveWire award seems to be so recent that it's not yet in the LiveWire news archive.

Conway's story makes an amusing little feature, which probably didn't take him any longer to research and write than the 20 minutes that this post took me. But if BBC News were a serious organization, you'd think his editor would have asked him to ask a few linguists or translators for some reactions. Or they could even have assigned a writer with some knowledge of the genuine problems of translation, and some interest in the methods that translators use to solve them.

[Link sent in by Peter Conn.]

[Update: Alexander Koller emails:

not to mention the problem that the notion of a "most untranslatable word" is inherently ill-defined anyway. Surely you need to fix the target language to decide what the most untranslatable word would be. I can easily translate "shlimazl" into German "Pechvogel", and that seems to express exactly the same meaning (although I don't know about the finer points of the cultural references associated with "shlimazl").
Yes, I suppose it means "hard to translate into English". But the news article carefully avoids this level of precision.

Yes. The problem is that Alexander's dashed-off email represents an order of magnitude more thought than Oliver Conway devoted to the topic. Perhaps Conway just re-wrote a press release from Today Translations, or maybe he went so far as to interview Zilinskiene on the phone.]

Posted by Mark Liberman at 08:34 AM

Obsessing about punctuation

I have a confession to make: I have almost no interest in punctuation. Out of respect for the opinions of others, I try to use apostrophes and commas correctly, but I'm less interested in the details of punctuation than in nearly any other topic I can think of. Give me a choice between talking about varieties of dashes and debating the choice of lining material for suit jackets, for example, and I'll be all over the rayon-vs.-polyester controversy. Give me a choice between reading about the order of quotation marks and commas or perusing a random phone book, and I'll dive right into the A's.

Luckily, these are not choices that life has often presented me with. Aside from the occasional copy editor, I've rarely met anyone who focused much on punctuation. So I was skeptical of this paragraph in Louis Menand's review of Eats, Shoots and Leaves in the New Yorker:

The supreme peculiarity of this peculiar publishing phenomenon is that the British are less rigid about punctuation and related matters, such as footnote and bibliographic form, than Americans are. An Englishwoman lecturing Americans on semicolons is a little like an American lecturing the French on sauces. Some of Truss’s departures from punctuation norms are just British laxness. In a book that pretends to be all about firmness, though, this is not a good excuse. The main rule in grammatical form is to stick to whatever rules you start out with, and the most objectionable thing about Truss’s writing is its inconsistency.

Well, I thought, I guess Louis Menand hobnobs with a different class of Americans than I do. But Margaret Marks at Transblawg zeroed right in on this passage, and she agrees with it:

How true this rings. Oh, the times I used to tell my students, ‘You can’t do that. You know, the Americans are even more pedantic than we British are.’ Did they believe me? No - because pedantry is bad and Americans are good.

(Dr. Marks is a professional translator who has dealt with clients from many countries, and in this passage she is explaining the norms of the business to her students.)

Reading this, my instinct was to leap to the defense of my fellow Americans, who are surely... But wait a minute, should that be "fellow-Americans", as Edmund Morris has it in his New Yorker piece on Ronald Reagan:

Merely by breathing, “My fellow-Americans,” he made his listener trust him.

In fact, this seems to be one of the rules that they stick to over there at the New Yorker:

(link) Kerry had made the decision along with three close friends, classmates and fellow-members of Yale's not so secret society, Skull and Bones...
(link) While Lieberman and his fellow-Democrats were doing the bidding of the public-employee unions...
(link) The camera captures one particularly wild-eyed defendant in a green caftan as he extends his arms through the bars of the cage, screams, and then faints into the arms of a fellow-prisoner.
(link) This is what happened when a fellow-critic and I emerged, on December 11th, from a screening of “The Lord of the Rings: The Return of the King.”

This is something that you see in lists of punctuation principles -- it's rule #4 in this list, for example -- but I find it weird. Seeing "my fellow-Americans" in the Edmund Morris article took me aback just as much as seeing a non-standard apostrophe did in Jefferson's letter ("we are disgusted with it's deformity").

While the New Yorker isn't alone in hyphenating this way, it disagrees with the practice of most other American publications, such as the New York Times:

(link) But do Americans really despise the beliefs of half of their fellow citizens?
(link) "American Taboo" has assembled considerable evidence that Mr. Priven murdered one of his fellow volunteers and got away with it.
(link) As European leaders gathered in Brussels over the last few days to negotiate a proposed constitution for their ever-closer union, hundreds of thousands of soccer fans from all over the continent descended on Portugal for the 2004 European Championships, to wave their national flags and jeer at their fellow Europeans.

the Washington Post:

(link) But most of her fellow Christians still do not view gay marriage as a personal threat, Raglin said.
(link) His paper is both a catalogue of recent examples of such partnerships and a call to fellow environmentalists to look more actively for common ground with the world's religious.
(link) "Art brings us closer to our fellow man" -- is true only in a grim, comical sense.

And the Atlantic:

(link) The continental peoples are grave, compared with our jocose fellow citizens, and especially in their hours of business.
(link) ...for all intents and purposes embracing Dylan as a fellow wordsmith, perhaps even a fellow poet...
(link) He had been highly respected by fellow specialists for the papers he wrote while in charge of Lepidoptera at Harvard's Museum of Comparative Zoology...

And...

Shucks, maybe Margaret is right.

Posted by Mark Liberman at 06:41 AM

Writing about writing about writing about writing about a film about Bush

When I originally read Hitchens' Unfairenheit 9/11, gosh, it must have been all of 12 hours ago, I was struck by the choice of words in one paragraph, words which smell as sweet as a very good blue cheese that has just spent a week-long vacation in a warm sock. But I must confess that I didn't notice anything patently illogical. Shame on me!

Now Mark comes along and reads the same paragraph (see here). The odious vapor of insult hits him, but his logical senses are not overwhelmed. With bloodhound acuteness he sniffs right through that heady verbiage and smells, hmm, what is that? No, nothing ungrammatical, the spelling is just fine, and even the punctuation is pristine. But something does not belong. He's off! Unstoppable, he tracks Hitchens' intentions through a forest of semantic weed and trope, and finds in a clearing kilomeanings from anywhere... two hopelessly confused interpretations running round in circles.

The runaway interpretations belong to the following sentences:

To describe this film as dishonest and demagogic would almost be to promote those terms to the level of respectability.
To describe this film as a piece of crap would be to run the risk of a discourse that would never again rise above the excremental.

And these sentences might have been intended to convey that:

This film is worse than dishonest and demagogic.
This film is worse than crap.

But Libermanic decomposition, turned up to 11, so watch out, reveals:

If calling this film dishonest and demagogic would make "dishonest and demagogic" respectable, then the film must add respectability to words used to describe it, so presumably the film itself is respectable.
If calling this film a piece of crap would make the use of "a piece of crap" ubiquitous, then this film is not significantly more crappy than all the other things that would be described as "a piece of crap" afterwards. Given that many of these things that in this hypothetical future would be called pieces of crap are actually pretty darn non-crappy, it follows that the film is not crap.

Wow!

Mark's analysis is pretty convincing. Yet there is, I think, a way out of this mess. Let's look at the full paragraph in all its gory.

To describe this film as dishonest and demagogic would almost be to promote those terms to the level of respectability. To describe this film as a piece of crap would be to run the risk of a discourse that would never again rise above the excremental. To describe it as an exercise in facile crowd-pleasing would be too obvious. Fahrenheit 9/11 is a sinister exercise in moral frivolity, crudely disguised as an exercise in seriousness. It is also a spectacle of abject political cowardice masking itself as a demonstration of "dissenting" bravery.

One way to think about what is going on is in terms of an imagined, implicit question. So ask yourselves this: what is the implicit question which the above paragraph answers?

I suggest the implicit question is a meta-question about Hitchens' own writing, namely: how should Hitchens describe Fahrenheit 9/11 at this point in the article? Hitchens, in the first three sentences of the paragraph, is not talking in general about how people should describe the movie or what descriptors for it would be accurate. If accuracy alone were the issue, Hitchens would not go on to talk about one description being too obvious, since obviousness does not suggest inaccuracy at all. No, he is talking about the words he should choose there and then. He answers the implicit question as if revealing his own authorial thought processes. The first three sentences of the paragraph tell us what the answer is not, and the last two tell us what it is. Yes, in case you wondered, he considered describing the film as dishonest, demagogic, a piece of crap and an exercise in facile crowd pleasing but he rejected those word choices (for which we, the audience, are duly grateful) in favor of sinister, morally frivolous, and so on.

But the above implicit question analysis sets Mark's logic in a new light. If the paragraph in question is a meta-commentary on Hitchens' writing process, then what is the discourse he refers to in the second sentence? Why, it is none other than Hitchens' own discourse, the article Unfairenheit 9/11. So then the risk being run is not that discourse in general would never rise above the excremental. No, the risk is that Hitchens' article would consist of nothing but gutterworthy invective. (Heaven forbid!) There is then, contra Mark, no implication that terming Fahrenheit 9/11 a piece of crap would lead to anything other than Fahrenheit 9/11 being described in excremental terms, because the discourse in question is only about Fahrenheit 9/11.

Let us turn now to the trickier first sentence of the paragraph. The worry Hitchens presents to us is that describing the film as dishonest and demagogic would make these terms appear respectable terms to use in the context of the remainder of the article. Now, this doesn't sound like a disaster to me. But that may be because I am not Hitchens, a master stylist, wordsman and Oxford graduate. In fact, he graduated from Balliol, a college that long, long ago rejected my application to study there as an undergraduate after a positively absurd interview, so I have reason to believe he knows something I don't. I used Google to find out just how smart with words Hitchens is. The results are astounding. He's off the scale. At least by his own ambitious reckoning. For example, in this interview he describes the title of a book of his as involving a triple entendre. Can you imagine the genius it would take to put three different meanings into a single title? Yes, it's true that Eats, Shoots and Leaves has two meanings, but neither of them have anything to do with the subject matter of the book. And Hitchens has written 20 books. With more than 20 titles. Can you even conceptualize just how many meanings that might add up to in total? Well, I worked it out, and it's a lot. Hitchens is obviously someone who cares about word choice. More than me. More than Mark. But I digress.

The point Hitchens is making with the first sentence of the paragraph is a subtle one, so subtle that I wonder whether some daft editor made it more subtle than Hitchens intended it to be. But it is just about possible to discern what the point is. It is that Hitchens thinks that describing the film as dishonest and demagogic would be a rhetorical dead-end for his article. He believes this because, in his view, within the context of an article that details the true nature of the film, the terms dishonest and demagogic would appear quite mild and commonplace, or, as Hitchens puts it, they would be promote[d] [...] to the level of respectability. And as Hitchens makes clear in the final two sentences of the paragraph, he thinks that to give the reader any less than the Full Monty of an evaluation of the film at that point would be wrong. For if you have read the article, you will realize that, title aside (Holy hieroglyphics, Batman - Unfairenheit 9/11 is a triple entendre!!!) when we get to the paragraph under discussion, Hitchens has not yet told us what he thinks of the film. In fact, what he has said up to that point suggests that in some way Moore's film might be a positive contribution, a new and much needed voice for Democratic thinking. What Hitchens has done is use the classic device of setting his target up for a fall. And what is needed is a heavy fall onto spikes, to be followed by a herd of elephants trampling the victim into unrecognizable squishy ucky stuff. The terms dishonest and demagogic are not those spikes. And that is what Hitchens is trying to tell us with the first sentence of the paragraph. Quite why he chose to tell us that, I cannot say. But it certainly confused the hell out of my main man Mark, about whose writing I am writing.

Mark thought that Hitchens was writing about a film about Bush, but, if I'm right, then in the crucial paragraph Hitchens was writing about the process of writing about a film about Bush. Which would mean that in this piece I have been writing about writing about writing about writing about a film about Bush, and am now in an infinite recursion of writing about writing about writing about.... Maybe we should all be writing about Bush instead. Nahh! Not on your Language Log!

Posted by David Beaver at 04:09 AM

June 22, 2004

Overstating understatement

In a Slate review entitled Unfairenheit 9/11, Christopher Hitchens makes it clear that he doesn't like Michael Moore or Moore's new documentary:

To describe this film as dishonest and demagogic would almost be to promote those terms to the level of respectability. To describe this film as a piece of crap would be to run the risk of a discourse that would never again rise above the excremental. To describe it as an exercise in facile crowd-pleasing would be too obvious. Fahrenheit 9/11 is a sinister exercise in moral frivolity, crudely disguised as an exercise in seriousness. It is also a spectacle of abject political cowardice masking itself as a demonstration of "dissenting" bravery.

It's obvious that this paragraph is not part of a positive review. I got a similar impression of Fahrenheit 9/11 from a journalist acquaintance, who saw it last weekend, and said "I hate Bush, but the movie was so unfair that it made me want to defend him". However, my concern here is not with the politics of Moore's documentary, but with the semantics of the first two sentences of Hitchens' paragraph quoted above.

The rhetorical trope in play is a routine one: "To describe X as P is an understatement", where P is some scalar evaluative predicate. P can be something negative:

(link) To describe this split as acrimonious would be an understatement.
(link) To describe this book as patently ridiculous would be an understatement.

or P can be something positive:

(link) To describe this report as timely is an understatement.
(link) So to describe this machine as portable is an understatement.
(link) To describe this room as simply a venue is to understate its place as a truly unique experience.

In either case, the point is that X has property P to an extreme degree, perhaps even to the point where some other description, on beyond P on the same scale, should be used instead. The point is emphasized by making it metalinguistically, stepping outside the descriptive flow to comment explicitly on the terminology to be used, thus suggesting that the author is choosing words with special care.

There are lots of different ways to say "is an understatement", some of them simple:

(link) To describe this book as “a gold mine” or as “monumental” does not do it justice.
(link) to call this "a reach" is being kind.
(link) To call this a mistranslation is too euphemistic, we should call this just what it is; another Christian falsification of their Bible translations...

and some of them more elaborate:

(link) To describe this lot as Limousine Liberals is to slander liberals.
(link) To describe this film as the worst movie that I've seen at the Cannes Film Festival so far is to do a disservice to all other movies that actually attempted to put together a narrative that makes sense on an actual cinematic level.
(link) To describe this maneuver as a tackle would be to make it sound far more polite than it is.
(link) The eponymous Carrie has a mother, of course, and to describe this woman as a clichéd and one-dimensional stereotype would be to pay a compliment to the characterization.

So it's pretty clear that when Hitchens writes

To describe this film as dishonest and demagogic would almost be to promote those terms to the level of respectability.

he means that "dishonest" and "demagogic" are not strong enough terms to describe Fahrenheit 9/11, which is much worse than that. But how does his sentence actually deliver that meaning? Let's stipulate that descriptive contact with this film might make the terms "dishonest" and "demagogic" respectable -- how do we get from there to the idea that the film is significantly beyond "dishonest" and "demagogic" on the scale of partisan propaganda?

The line of argument seems to be something like "X is so much beyond simple criminality that in comparison, a criminal is like an honest person"; but that doesn't imply that calling X a criminal would "promote" the term criminal to honesty. On the contrary, it seems to mean that if we use the term criminal to describe X, then we would have to call regular criminals honest folk. This would degrade the term honest, not promote the term criminal, which in fact has been made to mean something even worse than before.

Likewise, when Hitchens writes

To describe this film as a piece of crap would be to run the risk of a discourse that would never again rise above the excremental.

he clearly wants to say that F. 9/11 is worse than crap. It's somehow so much worse than crap that to call it crap would mean that we could never call anything else non-crap again. I'm getting a glimmer here: does he mean that F. 9/11 shifts the scale of crap so far towards turpitude that all the terms appropriate for all normal objects of disdain just kind of slide off the far end? But when I try to set up a numerical model of the evaluative process that would have this result, I keep getting the opposite outcome, namely that in comparison to something as bad as Hitchens judges F. 9/11 to be, everything else seems good.

Can someone help me out here, with a bit of formal semantics that works the way Hitchens wants it to? Otherwise I'm going to have to conclude that this is a sort of disguised overnegation, a rhetorical thunderbolt that blows back semantically the wrong way.

Be careful, when you do the analysis, not to be fooled by the fact that there is a common rhetorical trope parallel but opposite to the one we've been talking about, of the form "to describe X as P is an overstatement":

(link) To call this a bid is an exaggeration.
(link) To call this a rapids is stretching things a bit, but it made nice subject for the shot.
(link) To call this a consultation is really stretching the definition of the word.
(link) To call this a mall is being very generous.
(link) To call this a comedy is a sign of optimism; to call it a comeback for Murphy is a sign of blind faith.
(link) To call this a 'farm' is perhaps, a little misleading.
(link) To call this a memorial is nuttier than squirrel poop.
(link) For them to call this a crime is an insult to victims of real crimes.
(link) ...to call this a war is an insult to those who fought in wars
(link) To call this a madhouse is an insult to (psychiatric patients)...

Posted by Mark Liberman at 07:17 PM

Inter faeces et urinam nascimur ridemus

Of all the stories I've heard that are based on malapropisms, I think the one that Edmund Morris tells about Ronald Reagan in this week's New Yorker is the most amusing.

Perhaps the best of Reagan’s one-liners came after he attended his last ceremonial dinner, with the Knights of Malta in New York City on January 13, 1989. The evening’s m.c., a prominent lay Catholic, was rendered so emotional by wine that he waved aside protocol and followed the President’s speech with a rather slurry one of his own. It was to the effect that Ronald Reagan, a defender of the rights of the unborn, knew that all human beings begin life as “feces.” The speaker cited Cardinal John O’Connor (sitting aghast nearby) as “a fece” who had gone on to greater things. “You, too, Mr. President—you were once a fece!”

En route back to Washington on Air Force One, Reagan twinklingly joined his aides in the main cabin. “Well,” he said, “that’s the first time I’ve flown to New York in formal attire to be told I was a piece of shit.”

I guess this is also a morphological joke, since feces is from Latin faeces, which is the plural of faex "grounds, sediment, lees, dregs of liquids". As far as I know, the English borrowing has never had a singular form, though the AHD says that it's "used with a sing. or pl. verb".

By the way, apparently it's now OK to write "Cardinal John O'Connor" rather than "John Cardinal O'Connor", according to this wikipedia entry.

Posted by Mark Liberman at 11:18 AM

Chatnannies: still missing

Ray Girvan surveys the on-going Chatnannies story, including links to a New Scientist piece and the compendious history at Waxy.org. It's nice to see that the New Scientist editors are making up for their originally credulous story on this topic, in contrast to the BBC, who just silently removed their story from their web site, and (as far as I know) still haven't retracted their much-ridiculed mutant frog and telepathic parrot pieces, much less apologized. As I asked earlier, isn't it time for the BBC to get an ombudsman?

Posted by Mark Liberman at 07:33 AM

Computational origami

A NYT article on something else that David Huffman did. Another relevant link is here. Amazing stuff. There's no obvious language hook, but as usual, the Language Log marketing department will cheerfully refund your subscription fees if you are dissatisfied in any way.

Posted by Mark Liberman at 12:24 AM

Tongue and cheek, hole in corner

Fernando Pereira emailed an "eggcorn alert": tongue and cheek for tongue in cheek.

This one is among the 780 errors listed on Paul Brians' Common Errors in English site.

You could certainly make up a story to explain the phrase tongue and cheek -- it makes as much sense as a lot of idioms do -- but it's not a sanctioned collocation, even though (as Fernando points out) it has 9,280 Google hits. That's 2,166 whG/bp (web hits on Google per billion pages). The original phrase "tongue in cheek" has 330,000 Google hits, or about 77,009 whG/bp.
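The whG/bp figures are simply hit counts rescaled to a billion indexed pages. Here is a minimal sketch of that normalization. It assumes the index size of 4,285,199,774 pages that is quoted in another post in this archive; the function name `whg_per_bp` is made up for illustration.

```python
# Assumption: Google's index size at the time, as quoted elsewhere in this archive.
GOOGLE_INDEX_PAGES = 4_285_199_774


def whg_per_bp(hits, index_pages=GOOGLE_INDEX_PAGES):
    """Web hits on Google per billion indexed pages (hypothetical helper)."""
    return hits * 1_000_000_000 / index_pages


# Hit counts as reported in the post.
for phrase, hits in [("tongue and cheek", 9_280), ("tongue in cheek", 330_000)]:
    print(f"{phrase}: {whg_per_bp(hits):,.0f} whG/bp")
```

With that index size, the two phrases come out at 2,166 and 77,009 whG/bp, matching the figures above.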

Sometimes it's hard to distinguish between an incompetent editor and a very subtle joke, as in this sentence from a recent AP story:

(link) In a tongue-and-cheek opinion poll released Friday, 30 percent of 1,277 people aged 35 and older, said Swedish success on the pitch would increase their sex drive.

On reflection, it's probably safe to bet on incompetence.

Just to keep the prepositions and conjunctions in balance, I'll pick a random idiom of the form NOUN and NOUN, and see whether we can find NOUN in NOUN. We should be able to count on substitutions in both directions being common, since the sounds are nearly identical except in facultative pronunciations. Sure enough, "hole in corner" has 548 Google hits by comparison to 2,840 for "hole and corner", or about one (possible) eggcorn per 5.2 originals.

However, most of the instances of "hole in corner" are for real:

(link) Cut a small hole in corner of bag; squeeze to drizzle over madeleines.
(link) Install valve in the provided hole in corner post (Fig. 5) and attach water supply line.
(link) ISBN:671-10303-2. cover has small hole in corner, worn and torn from use, book in very good condtion.

Some others are certainly mistaken versions of the idiom, including uses by journalists and (other) intellectuals:

(link) This is odd because similar hole-in-corner meetings to create the fiction of grassroots support for such assemblies are being organised in all England's other eight "Euro-regions" (Scotland, Wales, Northern Ireland and London already have their assemblies).
(link) These considerations indicate that, appearances to the contrary, Qumran was hardly the hole-in-corner establishment the "Essene" hypothesis would lead us to expect.

as well as other uses in less intellectual contexts:

(link) They have been taught so very well by their Boomer parents; who, in turn, were taught so very well by their Authoritarian Hole-in-corner Parents, or A-hole for short.

In fact, I suspect that you have to be pretty literate to know this idiom well enough even to get it wrong. In any case, the mistake is not common enough to make Paul Brians' list.

The OED has citations for "hole and corner" starting in 1835 -- I wonder when the eggcorn "hole in corner" started?

hole-and-corner adj. phr.

Done or happening in a ‘hole and corner’, or place which is not public; secret, private, clandestine, under-hand. Contemptuously opposed to ‘public’ or ‘open’.

1835 FONBLANQUE Eng. under 7 Administ. (1837) III. 205 Hole-and-corner meetings are got up to speak the voice of the nation.
1839 STONEHOUSE Axholme 77 Any manufacturer of the hole and corner political petitions of the present day.
1862 H. KINGSLEY Ravenshoe III. 55 Tell me at once what this hole-and-corner work means.
1878 S. WALPOLE Hist. Eng. I. vi. 600 The Queen's friends declared that the King's supporters were ‘hole-and-corner’ men.

WordNet (and various derivatives and rip-offs thereof) sanctions hole-in-corner as a synonym of hole-and-corner, but I haven't been able to find any other dictionaries that do so. In particular, Webster's 2nd, Webster's 3rd, the OED, the American Heritage Dictionary and Encarta don't mention it.

Posted by Mark Liberman at 12:03 AM

June 21, 2004

Reads, zaps and digresses

Lynne Truss may believe that "people who put an apostrophe in the wrong place ... deserve to be struck by lightning, hacked up on the spot and buried in an unmarked grave", but apparently she herself isn't very careful about where she puts her commas. In a New Yorker review posted today, Louis Menand comes down on Truss like a whole squall line of Jovian thunderbolts, and after his first 1200 words, there's not enough left to hack up and bury.

He starts this way:

"The first punctuation mistake in “Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation” (Gotham; $17.50), by Lynne Truss, a British writer, appears in the dedication, where a nonrestrictive clause is not preceded by a comma. It is a wild ride downhill from there. “Eats, Shoots & Leaves” presents itself as a call to arms, in a world spinning rapidly into subliteracy, by a hip yet unapologetic curmudgeon, a stickler for the rules of writing. But it’s hard to fend off the suspicion that the whole thing might be a hoax."

He ends his discussion of Truss by pointing out that her fans are mad as hell

"and they do not wish to be handed the line that “language is always evolving,” or some other slice of liberal pie. They don’t even want to know what the distinction between a restrictive and a non-restrictive clause might be. They are like people who lose control when they hear a cell phone ring in a public place: they just need to vent. Truss is their Jeremiah. They don’t care where her commas are, because her heart is in the right place."

Having vaporized Truss in about 1200 words, Menand devotes the second half of his review to an interesting series of digressions about the differences between speech and writing and the nature of a writer's "voice". Punctuation plays a minor role in this discussion, and Truss almost no role at all.

Menand may have foolish, hypocritical and incoherent ideas about possessive antecedents, but he can sling a mean lightning bolt, and I think it's fair to say that Truss was asking for it. And after the smoke clears, Menand tells a couple of nice stories about W.H. Auden, James Agee and Luciano Pavarotti.

Posted by Mark Liberman at 11:30 PM

MLA Language Map: now without "density"!

An email from David Goldberg, Acting Director, MLA Foreign Language Programs and ADFL, in reference to the MLA Language Map discussed here last week:

Thank you for the posting on Language Log. We have had a number of comments about our too-casual use of the term density and we are grateful to have been alerted to this. We have changed the wording (it now reads Numbers of Speakers), and hope that an anticipated expansion of the site will include a reflection of actual density of speakers, that is, numbers of speakers of each language in relation to the entire population of each county or zip code.

So, an immediate terminological improvement, and soon, more useful information! It's a wonderful world...

Posted by Mark Liberman at 03:14 PM

Legal redundancy

In response to my post on by no manner of means, Abnu from Wordlab wrote in to suggest that the legal variant by no manner or means is a typical example of lawyer's redundancy, citing the entry for aid and abet in the law.com legal dictionary:

v. help commit a crime. A lawyer redundancy since abet means aid, which lends credence to the old rumor that lawyers used to be paid by the word.

I'm sure that Abnu is right, though this doesn't modify my tentative conclusion that "all those legal by any manner phrases are back-formations from a misconstrual of by any manner of means as by any manner or means".

Other claimed examples of legal redundancy from the law.com dictionary include:

copartner
n. one who is a member of a partnership. The prefix "co" is a redundancy, since a partner is a member of a partnership. The same is true of the term "copartnership."

due
n. and adj. owed as of a specific date. A popular legal redundancy is that a debt is "due, owing and unpaid." Unpaid does not necessarily mean that a debt is due.

rebuttable presumption
n. since a presumption is an assumption of fact accepted by the court until disproved, all presumptions are rebuttable. Thus rebuttable presumption is a redundancy.

Legal English, like Chinese compounding, seems to be a case where redundancy is not widely excoriated as ridiculous and unnecessary. Perhaps lawyers get a pass on this because they are felt to be guilty of more important sins, but it seems more likely that people value the goal of explicitness and completeness enough to forgive a few perhaps-unnecessary words.

Posted by Mark Liberman at 10:25 AM

"Per usual", per usual

Today's edition of Eric Umansky's Today's Papers in Slate includes a phrase that struck me as being one that Umansky favors:

The Marines announced one of their men was killed around Fallujah. Per usual, they didn't give details.

A search of Slate's archives suggests that I might be right.

Here are some earlier Today's Papers where Umansky uses the same phrase:

(link) Today's editorials in the NYT, per usual, confront the crucial issues of the day...
(link) Responsibility for the attack was claimed by the Al Aqsa Martyrs Brigade, which, per usual, the papers describe as a group linked to Yasser Arafat's Fatah militia.
(link) Per usual, Woodward doesn't interject with any context and simply lets the president tick off his talking points..

Furthermore, this seems something that is specific to Umansky, rather than deriving from the topic or context of the daily Today's Papers feature. Other authors for the feature include Benjamin Healy, Emily Biuso, Sam Schechner, Zachary Roth, Hudson Morgan, Avi Zenilman and Michael Brus, but according to the archives, Umansky is the only one of these who has ever used per usual.

Umansky appears to have written 550 of the 2481 Today's Papers pieces in Slate's archives (can the feature really have been running for 2481/365 = 6 years and nine months? I guess so!). Each of these pieces is around a thousand words long, so Umansky has used per usual 4 times in 550,000 words, or about 7.27 per million words, whereas the other Today's Papers writers have used the same phrase 0 times in about 1.93 million words (and yes, I know that it is silly to use three significant figures when my estimate of article length has only one...). If the other writers had the same propensity to use per usual that Umansky does, we would have expected to see about 14 instances. Without doing the statistical calculations in detail, we can guess that 0 is significantly different from 14, over this span of time and text.
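For anyone who wants the calculation spelled out, here is a rough sketch. It takes the counts above at face value (about 1,000 words per piece) and treats occurrences of the phrase as a Poisson process running at Umansky's observed rate. The Poisson model and all the variable names are assumptions made for this illustration, not a calculation from the post, and the sketch ignores the uncertainty in a rate estimated from only four instances.

```python
import math

WORDS_PER_PIECE = 1_000        # rough estimate of article length, from the post
umansky_pieces = 550
total_pieces = 2_481
umansky_uses = 4

umansky_words = umansky_pieces * WORDS_PER_PIECE
other_words = (total_pieces - umansky_pieces) * WORDS_PER_PIECE

# Umansky's observed rate of "per usual" per million words.
rate_per_million = umansky_uses / umansky_words * 1_000_000

# Instances expected from the other writers if they shared that rate.
expected_others = umansky_uses / umansky_words * other_words

# Assumed Poisson model: chance of seeing zero instances given that expectation.
p_zero = math.exp(-expected_others)

print(f"rate: {rate_per_million:.2f} per million words")
print(f"expected from other writers: {expected_others:.1f}")
print(f"P(0 instances | same rate): {p_zero:.1e}")
```

Under these assumptions the chance of the other writers producing no instances at all comes out well under one in a million, which supports the guess that 0 really is different from 14.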

I've noted before that a word or phrase can come to seem characteristic of a speaker or writer, even if we don't encounter it very often in their productions. I gave an example of a common expression, "and yet", which becomes a sort of stock phrase associated with a particular character in a novel, although it's only used three times.

Another curious statistic emerges from the OED's entry for per usual (part of the entry for per, edited for relevance):

III. As an English preposition.

1. By, by means of, by the instrumentality of; esp. in phrases relating to conveyance, as per bearer, per carrier, per express, per post, per rail, per steamer, etc. Also = according to, as stated or indicated by, as per invoice, per ledger, per margin, etc.; as laid down by (a judge) (quot. 1818). So, in humorous slang use, (as) per usual = as usual; also with ellipsis of usual. Also (exceptionally) in other senses, as per this time = by this time, per instance = for instance (cf. F. par exemple). Also in other humorous and extended uses.

1874 W. S. GILBERT Charity IV, I shall accompany him, as per usual.
1922 JOYCE Ulysses 343 As per usual somebody's nose was out of joint.
1923 ‘K. MANSFIELD’ Bad Idea in Doves' Nest 146 So I took her up a cup of tea..as per usual on her headache days.
1938 J. PHELAN Lifer xxi. 212 That's right,..no grounds, as per.
1959 N. MARSH False Scent (1960) i. 12 He'll be bringing his present later on, as per usual.
1960 S. BARSTOW Kind of Loving II. vii. 263, I reckon after tonight we can't carry on as per.
1972 ‘A. ARMSTRONG’ One Jump Ahead i. 13, I came back as per usual about five o'clock.
1977 J. BINGHAM Marriage Bureau Murders i. 9 I'll stay in a pub... As per usual.

The curiosity is that of the eight citations for (as) per (usual), three (Phelan, Marsh and Bingham) are from detective or mystery novels. Maybe four -- I'm not sure about Armstrong.

[Update: another observation on Umansky's frequency of usage... If we take the observed frequency of 4 in 550 pages as a valid estimate of his propensity to use this phrase, we'd predict (4*10^9)/550 as the frequency per billion pages, or 7,272,727. By comparison, "per usual" actually occurs 63,900 times in the 4,285,199,774 pages that Google currently indexes, corresponding to a rate of 14,912 whG/bp ('web hits on Google per billion pages'). Thus Umansky is using "per usual" roughly 488 times as often as the background rate.

If the other Today's Papers writers were using the phrase at the background rate of 14,912 whG/bp, we'd expect 0.03 instances in the 1931 pages that they've collectively written. This is quite consistent with the observed value of 0.]
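The arithmetic in this update can be checked with a short sketch. It uses only the counts given above and, like the update itself, treats one Today's Papers piece as one "page"; the variable names are made up for illustration.

```python
BILLION = 1_000_000_000

# Counts quoted in the update.
google_index_pages = 4_285_199_774
per_usual_hits = 63_900
umansky_uses, umansky_pages = 4, 550
other_writers_pages = 1_931

# Umansky's rate, scaled to a billion pages.
umansky_per_bp = umansky_uses * BILLION / umansky_pages

# Background rate on the web, in whG/bp.
background_per_bp = per_usual_hits * BILLION / google_index_pages

# How much more often Umansky uses the phrase than the web at large.
ratio = umansky_per_bp / background_per_bp

# Instances expected from the other writers at the background rate.
expected_others = background_per_bp * other_writers_pages / BILLION

print(f"Umansky: {umansky_per_bp:,.0f} per billion pages")
print(f"background: {background_per_bp:,.0f} whG/bp")
print(f"ratio: about {ratio:.0f}x")
print(f"expected from other writers: {expected_others:.2f}")
```

The comparison is only as good as the equation of a thousand-word column with an average web page, which is a loose one.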

Posted by Mark Liberman at 09:21 AM

June 20, 2004

Problematics

Margaret Atwood's recent dystopian novel Oryx and Crake is based on the idea that biological science will soon lead to the extermination of the human species, but she takes a lick or two at other fields as well, including the subjects favored by "word people":

Problematics was for word people, so that was what Jimmy took. Spin and Grin was its nickname among the students. Like everything at Martha Graham it had utilitarian aims. Our Students Graduate With Employable Skills, ran the motto underneath the original Latin motto, which was Ars Longa Vita Brevis.

Jimmy had few illusions. He knew what sort of thing would be open to him when he came out the other end of Problematics with his risible degree. Window-dressing was what he'd be doing, at best -- decorating the cold, hard, numerical real world in flossy 2-D verbiage. Depending on how well he did in his Problematics courses -- Applied Logic, Applied Rhetoric, Medical Ethics and Terminology, Applied Semantics, Relativistics and Advanced Mischaracterization, Comparative Cultural Psychology, and the rest -- he'd have a choice between well-paid window-dressing for a big Corp or flimsy cut-rate stuff for a borderline one. The prospect of his future life stretched before him like a sentence; not a prison sentence, but a long-winded sentence with a lot of unnecessary subordinate clauses, as he was soon in the habit of quipping during Happy Hour pickup time at the local campus bars and pubs.

I found this amusing. Imagine if future journalists, PR flacks and advertising folk really took college courses in logic, rhetoric and semantics, applied or otherwise!

I have two comments on the list of courses. First, "advanced mischaracterization" is implausible as the name of an imaginary course in the imaginary field of Problematics, whose experts would surely do a better job of verbal window-dressing on their own product. And second, what happened to syntax? Should we assume that "word people" don't need to know anything about the structure of sentences in order to "spin and grin"? How will they identify those "unnecessary subordinate clauses"? (And while we're at it, there's sociolinguistics, the phonetics of spoken performance, and other courses to fill the syllabus out with ...)

There's been some controversy about this novel's category. Atwood says that it's not sci-fi, it's "speculative fiction", because "[s]cience fiction has monsters and spaceships; speculative fiction could really happen".

"We have a big box, called The Brown Box... it's a brown cardboard box - in which all the research clippings are filed: so there's nothing I can't back up,"

Her research doesn't seem to have included reading much SF, which doesn't always have spaceships, and often has fewer monsters than Oryx and Crake does.

The reviews have been mixed. Michiko Kakutani called it "this lame piece of sci-fi humbug", while Thomas Disch compared it favorably to Brave New World and 1984. For me, it was mostly too silly to be engaging. I was grateful for this, because it would have been a depressing story if it had managed to draw me in.

Atwood is an effective writer, even in the service of a lame story line. And she has a special flair for depressing puns, like the one on "sentence" in the passage quoted above, or this example from one of her poems:

You fit into me
like a hook into an eye
A fish hook
An open eye

[Update 6/21/2004: David Elsworthy observed via email that most likely "Attwood was also poking fun at academics who use the word 'problematize' and (God help us) 'deproblematize'".

He suggested a couple of links (here and here) for examples and discussion. At the second link, Alice Shirrell Kaswell explains the technique as follows:

Author Elizabeth Manus explains the concept thusly:
In academia, reading a text in a new way is generally known as "problematizing" a text.
This technique can be applied to anything.
It offers exciting possibilities for scientists. Problematicization can be applied to anything that is generally accepted as being understood. The result: the subject is no longer understood. This creates an instant infinity of publishing opportunities.

So: "Imagining a future academic discipline of Problematics, informally known as "Spin and Grin", Atwood problematizes postmodernism by equating it with public relations..." Am I using the word right?]

Posted by Mark Liberman at 04:40 PM

"Under God" as "Inshallah"

In support of Geoff Nunberg's uncovering of the old meaning of the phrase "under God" as "contingent on God's will" or "God willing", I observe that Robert Browning clearly intended it in this sense in the passage that I quoted a few days ago from his 1855 poem An Epistle Containing the Strange Medical Experience of Karshish, the Arab Physician:

1 Karshish, the picker-up of learning's crumbs,
2 The not-incurious in God's handiwork
3 (This man's-flesh he hath admirably made,
4 Blown like a bubble, kneaded like a paste,
5 To coop up and keep down on earth a space
6 That puff of vapour from his mouth, man's soul)
7 ---To Abib, all-sagacious in our art,
8 Breeder in me of what poor skill I boast,
9 Like me inquisitive how pricks and cracks
10 Befall the flesh through too much stress and strain,
11 Whereby the wily vapour fain would slip
12 Back and rejoin its source before the term,---
13 And aptest in contrivance (under God)
14 To baffle it by deftly stopping such:---
15 The vagrant Scholar to his Sage at home
16 Sends greeting (health and knowledge, fame with peace)

As I wrote:

I take "under God" in this passage to be modifying "contrivance", to express the conventional caveat "inshallah" meaning "God willing". It might be objected that this is an adverbial use -- but it is such a loose sort of adverbial that it could be placed nearly anywhere, including in the pledge. "One nation (God willing) with liberty and justice for all". Thus we've found another interpretive option!

The point of Browning's usage -- put in the mouth of the Arab physician Karshish -- was that Abib, "all-sagacious in our art", is "aptest in contrivance" to save his patients' lives, but only "under God"; that is, subject to God's will. This expression of divine contingency would have been required by the linguistic culture of Islam, or at least seemed so to Browning; but it is imposed (or at least recommended) by other theologies as well.

Posted by Mark Liberman at 04:36 PM

"(Next) Under God," Phrasal Idiom

In my previous post on "under God," I missed the real meaning of the expression, as Lincoln and others used it -- and so, by a wide mark, did the people who interpolated it in the Pledge.

The OED gives as one entry for under the meaning "In addition to; besides," as in "This woman lovid by wey of synne an other knyght, vndir hire husbond." That sense was obsolete by the 16th century, but it seems to have partially survived in the idiom "under God," for which the dictionary gives a sense, "under God: as a secondary cause or mediate object of gratitude."

That definition may be a little hard to understand, but you can see how the phrase is used when you search for it in the works collected in the Library of America, where it's actually rather frequent in works published before 1860, usually with the meaning "with God's help," or "after God" (with an implicit "of course") in expressions of indebtedness, gratitude, obligation, and the like:

...their labors have certainly been the means, under God, of producing fruits of moral and social regeneration. The United States Democratic Review 1843

On their [Evangelical ministers'] skill, their judgment, their decision, their energy, their faith, will depend under God the glorious result. What are Ministers to do in the Great Controversy of the Age, 1844

He then, thanked him very kindly for his help in our great danger, and said to him, John, ye have been the means under God to save our natural life, suffer me to be a means under God to save your soul, by good information to bring you out of your dangerous errours. Books Relating to America, 1815.

And it occurs in this meaning in Parson Weems' biography of Washington, as well:

"Sons and daughters of Columbia, gather yourselves together around the bed of your expiring father--around the last bed of him to whom you and your children owe, under God, many of the best blessings of this life."

The meaning of the phrase is particularly evident in the variant "next under God," which occurs several times in the collection:

"The death of William Barents put us in no small discomfort, as being the chiefe guide and onley pilot to whom we reposed ourselves next under God." Early English Explorers, 1856

Thereto help me, next under God, the confidence of my fellow-countrymen! Freiligrath's Poems, 1845

In short, the phrase "under God" had nothing to do with God's temporal sovereignty; it was, rather, a way of acknowledging that the efforts of men are always contingent on His providence. And that is how Lincoln intended it, as meaning something like "with God's help, of course":

It is rather for us to be here dedicated to the great task remaining before us--that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion--that we here highly resolve that these dead shall not have died in vain, that this nation under God shall have a new birth of freedom...

Lincoln would have had trouble making sense of the use of the words in the Pledge -- to him it would have been an ungrammatical way of saying something like, "one nation, with God's help (of course), indivisible..." or "one nation, after God, indivisible..." As I said in my earlier post, a strategic misreading of history.

Posted by Geoff Nunberg at 03:07 PM

I Might Have Guessed Parson Weems Would Figure In There Somewhere

In response to the discussions that Geoff Pullum, Bill Poser, and I have been having about the syntax and meaning of "under God" in the Pledge of Allegiance, a lawyer named John Brewer writes to suggest that the phrase might be best understood as a rendering of "sub Deo":

If you Google "sub Deo et lege" (less common variant add another "sub" before "lege") you will see that it's part of a phrase used in English legal, political and constitutional rhetoric since circa 1200, with a revival by Lord Coke in the 17th century during the struggles between Parliament and the Crown, which was a bit of history very well-known to 18th century Americans. The phrase is strongly associated with the tradition of limited government which we inherited from England and then elaborated ourselves. ... I would paraphrase the point of the phrase as traditionally used as being that legitimate political authority as distinct from tyrannical force must be sub Deo et lege.

An interesting observation, but "sub Deo" doesn't seem to have played any role in inspiring the wording "under God." In fact, the actual story has some interest of its own.

Credit for the inclusion of the phrase is often given to the Rev. George Docherty, pastor of the New York Avenue Presbyterian Church, who argued for the change in wording in a sermon given at a Lincoln Day observance on February 7, 1954 at which President Eisenhower was present. In proposing the phrase "under God," Docherty made explicit reference to the Gettysburg Address:

What, therefore, is missing in the Pledge of Allegiance that Americans have been saying off and on since 1892, and officially since 1942? The one fundamental concept that completely and ultimately separates Communist Russia from the democratic institutions of this country. This was seen clearly by Lincoln. "One nation under God this people shall know a new birth of freedom," and "under God" are the definitive words.

It's notable that Docherty scrambled Lincoln's wording so as to make "under God" a noun modifier -- the Address actually reads "that this nation, under God, shall have a new birth of freedom," where its adverbial function is clear.

In fact, the sermon wasn't actually the origin of the proposal, which had been floated by the Knights of Columbus several years earlier (Docherty himself had given the same sermon two years before the 1954 date, but it got no attention the first time around). And the phrase was widely understood as a reference to the Gettysburg Address. As a May 23, 1954 New York Times article explained:

Over on the House side where fifteen such resolutions have been introduced, it was Representative Louis C. Rabaut, Democrat of Michigan, who offered the granddaddy of them all. On April 20, 1953, he dropped his resolution into the hopper at the suggestion of H. Joseph Mahoney of Brooklyn, N. Y. In a postscript on a letter to Mr. Rabaut, Mr. Mahoney wrote: "Why don't you recommend the addition to the pledge of allegiance of the words 'under God.'"

The suggestion impressed Mr. Rabaut greatly because these were the very words found in Abraham Lincoln's Gettysburg Address. He dropped in a bill.

It's likely that Docherty's endorsement had the effect of making the change in wording acceptable to Protestants who may have been chary of adopting what seemed a Catholic proposal.

But where did Lincoln get the words? In Lincoln at Gettysburg, Garry Wills points out that contemporary newspaper reports had Lincoln saying "This nation shall, under God...," rather than "This nation, under God, shall..." (if the papers' version had prevailed, we might have been spared the subsequent reanalysis of the phrase as a noun adjunct, whatever the loss to rhythm). Wills notes that some people have suggested that the phrase was a spontaneous addition inspired by Edward Everett's use of "under Providence" in the speech immediately preceding Lincoln's.

But according to a 1930 book by William E. Barton, also called Lincoln at Gettysburg (referenced by Wills but not in this context), Lincoln took the phrase from the writings of Parson Weems, the confabulator of the story about Washington and the cherry tree. The phrase occurs several times in Weems' biography of Washington:

Sons and daughters of Columbia, gather yourselves together around the bed of your expiring father--around the last bed of him to whom you and your children owe, under God, many of the best blessings of this life.

As James Piereson points out in an article in the Weekly Standard, Weems may have taken the phrase from Washington himself, who in his daily orders of July 2, 1776 wrote:

The time is now near at hand which must probably determine whether Americans are to be freemen or slaves. . . . The fate of unborn millions will now depend, under God, on the courage and conduct of this army.

Piereson, who is executive director of the conservative John M. Olin Foundation, goes on to say that "When Congress added these words to the Pledge of Allegiance, it drew upon a phrase that had a long and meaningful association with the great statesmen and events in the history of the Republic."

Maybe so, but it's worth noting that Washington, Weems, and Lincoln used the phrase adverbially. Its functional shift to a noun adjunct in the Pledge is a small but significant reinterpretation of the past to make it more congenial to current ideological purposes. Not that Parson Weems would have had any problem with that.

Posted by Geoff Nunberg at 08:57 AM

Shooting eggcorns in a barrel

Like blast fishing or jacklighting deer, hunting eggcorns by web search is so easy that it's hardly sporting. All you have to do is to think of a pair of words with the same or similar sounds but different spellings, choose a common usage for one of the pair, and then look for examples in which the other one is substituted.

You'll almost always find plenty of genuine examples:

(link) For one IPFW grad, a flare for the theater led to award-winning costume design
(link) Radiofree West Hartford is looking for few Right Minded Conservatives with a flare for writing and a passion for politics to donate original essays for publication.
(link) Are you a negotiator with a flare for persuasion?

(link) "At last!" replies a hoarse, base voice.
(link) His repertoire includes the main parts for an authentic base voice in the works belonging to Mozart, Verdi, Wagner, Glinka and Mussorgsky.
(link) At a young age this pitch is usually high even for a person with a base voice.

along with a few jokes or puns:

(link) Astronomers Acquire a Flare for Forecasting ... Thanks to two pairs of stars doing the cosmic do-si-do and a marathon radio survey, astronomers are now able to spy brewing stellar storms and predict looming flares on stars other than the sun.

In addition to classic eggcorns like the two given above, where people have made a sensible but wrong guess about which words are used in a particular expression, there are more puzzling cases, where there seems to be little or no resonance from any meaning of the substituted word:

(link) However, they use blatant bate and switch tactics with room locations and reservations.
(link) I was ready for a run around, bate and switch or the wrong item shipped, but none of that happened.
(link) I found great prices and I liked how they didn't bate and switch like I've seen a lot of others do.

Apparently these people are just causelessly confused about how to spell "bait" in this expression -- unless these are hypercorrections caused by spill-over from the bated breath vs. baited breath confusion? In any case, there are certainly some examples where we can be quite sure that we are looking at an out-and-out typographical error:

(link) Nearly all of mankind has heard the oft-repeated lines " We hold these truths to be self-evident, that all men are crated equal; that they are endowed ...
(link) Because every product is not crated equal, we have gathered the largest selection found anywhere.
(link) Not all foam is crated equal, let our experienced staff offer advise on the right urethane foam for your application.

Surely the (staff of the) American ambassador to Ecuador (the author of the first quote above) knows that Jefferson wrote created and not crated -- but I also have to point out that there are 54 examples of "crated equal" and only 4 of "creted equal", as predicted if the attractive force of the alternative word "crated" is having an effect.

Finally, there are examples that appear to be in a special category of non-native-speaker eggcorns. For example, a certain number of physics instructors seem to have gotten the idea that

[equation, shown as an image in the original post]

describes the emissive power of a "block body" rather than a "black body":

(link) Thermal/block-body radiation and the black hole evaporation.
(link) A more quantitative approach on absorption and emission of radiation in terms of absorptive power and emissive power are explained with specific references to block–body radiation.
(link) Modern Physics : Atomic structure, Block body radiation, Photon, de-Broglie’s Waves, Photoelectric effect, compton effect, Mass-Energy conversion relation.

and as far as I can tell, none of them are native speakers of English. Of course, the vast majority of physicists who are not native speakers of English get this right -- and the vast majority of native speakers of English are unaware of the issue in the first place.

Anyhow, finding eggcorns by this method is so easy that one could imagine programming a computer to systematically hunt down all the eggcorns in the language, on purely distributional grounds. The process would be fun and challenging -- especially for the cases where words are not swapped one-for-one, like acorn=egg corn, or intents and purposes=intensive purposes -- and it would probably be interesting to look at the statistical structure of the resulting set.
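For the simple one-for-one case, the recipe above (pick a sound-alike pair, pick a common usage of one member, look for the other member in that slot) can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the pair list, the phrase frames, and the tiny in-memory "corpus" are all made up for the example (the last document is invented, the others are trimmed from the citations above), and plain substring counting stands in for a real web search. It makes no attempt to separate genuine eggcorns from puns or typos, which, as the examples show, still takes a human.

```python
# Toy sketch of one-for-one eggcorn hunting; all data below is illustrative.

# Hypothetical sound-alike pairs: (standard word, possible substitute).
SOUND_ALIKES = [
    ("flair", "flare"),
    ("bass", "base"),
    ("bait", "bate"),
    ("created", "crated"),
]

# Hypothetical common usages of each standard word; "{}" marks the slot.
FRAMES = {
    "flair": ["a {} for"],
    "bass": ["a {} voice"],
    "bait": ["{} and switch"],
    "created": ["{} equal"],
}

# Stand-in for the web: a handful of short documents.
TOY_DOCUMENTS = [
    "Are you a negotiator with a flare for persuasion?",
    "this pitch is usually high even for a person with a base voice.",
    "I was ready for a run around, bate and switch or the wrong item shipped",
    "Not all foam is crated equal",
    "She has a flair for languages.",
]


def candidate_phrases(pairs, frames):
    """Yield (standard phrase, candidate eggcorn phrase) for every frame."""
    for standard, substitute in pairs:
        for frame in frames.get(standard, []):
            yield frame.format(standard), frame.format(substitute)


def count_hits(phrase, documents):
    """Count case-insensitive substring occurrences of phrase in documents."""
    needle = phrase.lower()
    return sum(doc.lower().count(needle) for doc in documents)


def hunt(pairs, frames, documents):
    """Return (candidate, standard, hits) for candidates found at least once."""
    found = []
    for standard_phrase, candidate in candidate_phrases(pairs, frames):
        hits = count_hits(candidate, documents)
        if hits:
            found.append((candidate, standard_phrase, hits))
    # Most frequent candidates first.
    return sorted(found, key=lambda row: -row[2])


if __name__ == "__main__":
    for candidate, standard, hits in hunt(SOUND_ALIKES, FRAMES, TOY_DOCUMENTS):
        print(f"{candidate!r} (for {standard!r}): {hits} hit(s)")
```

The hard cases mentioned above, where the words are not swapped one-for-one, would need something other than a word-for-word substitution into a fixed frame; this sketch does not attempt them.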

Then we'd have to find some other topics to fill this space, but I'm sure that the Lord would provide.

Posted by Mark Liberman at 08:26 AM

June 19, 2004

Transcriptional abduction

Did George W. Bush say "these kind of extremist thugs" or "these kinds of extremist thugs"? One excellent answer is "Who cares?" However, the relentless focus on Bushisms makes silly questions like this a matter of modest political consequence, and so yesterday I looked at the evidence. When I first posted the item, the evidence was equivocal -- the non-standard phrasing "these kind of" was on CNN's web page, whereas all other sources had the standard version "these kinds of".

So Abnu at Wordlab emailed some evidence from transcriptions past, strongly suggesting that "these kind of" is in W's repertoire. His conclusion, based on the citations that I repeat below, was that

One could go on and on finding these kinds of quotes, but I think it's a safe bet that CNN got it right, off their tape, and that other news networks worked from the official "transcript" which was guided by a scripted Press Release.

Fair enough; but, as it turns out, wrong. I later found the original video on CNN's site (to be more precise, I decided I was willing to pay $6.95/month for the right to access such videos), and verified that what W actually said was the standard version "these kinds of extremist thugs."

I haven't tried to track down the audio or video for the transcripts (some from whitehouse.gov) that Abnu cites, but in all such cases, there's a question of whether a particular property of the transcript is a fact about what the speaker said, or a fact about what the transcriber heard, or remembered, or wrote. This is similar to the problem of attributional abduction that we've discussed here over the past few months: when a journalist attributes nonsense to someone who ought to know better, we need to figure out who is really responsible.

Here are Abnu's examples:

(link) According to a recent official transcript, the President said, "We're a great country because we're a free country, and we do not tolerate these kind of abuses."
...
But Bush also seemed to plead with Iraqis to support the U.S. effort in the face of the attacks. "The people have got to understand, the Iraqi people have got to understand that any time you've got a group of killers willing to kill innocent Iraqis, that their future must not be determined by these kind of killers," he said.

(link) BUSH: Well, first let me kind of step back and talk about intelligence in general, if I might. Intelligence is a vital part of fighting and winning the war against the terrorists. It is because the war against terrorists is a war against individuals who hide in caves in remote parts of the world, individuals who have these kind of shadowy networks, individuals who deal with rogue nations.

(link) Bush spoke during a White House press conference. He said U.S. troops will remain in Iraq to help maintain order and stability and help the Iraqi government set up security forces.

He said the U.S. forces in the country will remain under U.S. control. "The American people need to be assured that if our troops are in … harm's way, they will be able to defend themselves without having to check with anybody else other than their commander," he said. "At the same time, I can assure the Iraqi citizens as well as our friends in Europe, that we have done these kind of security arrangements before. Witness Afghanistan. There's a sovereign government in Afghanistan, there are U.S. troops and coalition troops there, and they're working very well together."

Posted by Mark Liberman at 11:11 AM

Recursive titles

Yesterday I posted something on Joshua Macy's review of The Language Instinct. Since Macy called his piece "So what's wrong with The Language Instinct?", I considered titling mine "So what's wrong with 'So what's wrong with The Language Instinct?'?", so that he could respond "So what's wrong with 'So what's wrong with "So what's wrong with The Language Instinct?"?'?", and so on.

In the end I chickened out and called it "Criticizing Pinker the right way", in honor of Larry Brown's team winning the NBA title.

But this all reminds me of a great linguistic in-joke. The thing is, I can't remember all the details -- but if I post what I remember, I hope that I can count on someone else to explain the rest.

In 1950, Robert Hall published a popular anti-prescriptivist book entitled "Leave your language alone." (Everything isn't on the internet yet -- I couldn't find a review or even a summary on line!). By the mid-1960's, Hall's book was already a classic, which everyone in the field knew at least by name. So it was a plausible joke to make up a fake bibliography containing a sequence of polemics and rejoinders including the titles "Leave 'Leave Your Language Alone' Alone", "Leave 'Leave "Leave your Language Alone" Alone' Alone", and so on.

This joke satirized the passions of (anti-)prescriptivism, the polemical propensities of linguists, and the problems of center embedding.
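The embedding itself is mechanical enough to write down. Here is a minimal Python sketch, with a function name and parameters that are my own invention, which wraps a title in the same frame a given number of times and alternates single and double quotation marks at each level, outermost embedded title in single quotes, as in the examples above.

```python
def nest_title(base, depth, prefix="Leave ", suffix=" Alone"):
    """Embed `base` inside `depth` layers of prefix + quoted title + suffix.

    Level 1 is the outermost embedded title and gets single quotes;
    deeper levels alternate between double and single quotes.
    """
    text = base
    # Build from the innermost layer outwards.
    for level in range(depth, 0, -1):
        quote = "'" if level % 2 == 1 else '"'
        text = f"{prefix}{quote}{text}{quote}{suffix}"
    return text


if __name__ == "__main__":
    for depth in range(4):
        print(nest_title("Leave Your Language Alone", depth))
```

With an empty prefix and " Considered Harmful" as the suffix, the same function produces titles of the "'X' Considered Harmful" variety. Readability falls off quickly with depth, which is the center-embedding problem in miniature.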

What I can't remember is whose joke it was, or exactly how it was expressed. I believe that it was in a dittoed sheaf of similar in-jokes given to me by either George Lakoff or Haj Ross, and the author might have been Jim McCawley.

[Update: George Lakoff provided the missing authorship information by email -- it was Robin Lakoff.

"It appeared in a fake Language cover page that Haj, Robin, and I made up -- with Jim's help. It was published in a Festschrift for Jim that CLS put out in 1972 or so."

And the formidably scholarly Q_pheevr supplies more details:

The recursion in question appears in the book Studies out in Left Field: Defamatory Essays Presented to James D. McCawley on the Occasion of his 33rd or 34th Birthday, edited by Arnold Zwicky, Peter Salus, Bob Binnick, and Tony Vanek. One of the many linguistic in-jokes in this collection is something that purports to be the front cover of an issue of Language: The Journal of the Debating Society of America, "communicated by Carolyn Killean, George Lakoff, Robin Lakoff, Michael O'Malley, and Lester Rice." The third item in the table of contents is:
Paul Schachter: Leave "Leave LEAVE YOUR LANGUAGE ALONE Alone" Alone: A Reply to Pulgram's "Leave LEAVE YOUR LANGUAGE ALONE Alone"
SoiLF originally appeared in 1971 (it was reissued by Benjamins in 1992), which places it in the temporal gap between an influential article from the field of computer science and the famously recursively titled reply to that article. E.W. Dijkstra's "Go To Statement Considered Harmful" appeared in 1968, and Frank Rubin responded in 1987 with "'GOTO Considered Harmful' Considered Harmful" (see the Jargon File, s.v. Considered Harmful).

This helps explain why I'm a bit hazy on the details, since I was in the Army in 1971. And I've owned several copies of SoiLF over the years, but it's one of those books with a high vapor pressure.]

[Note -- Q also shamed me into removing the apostrophe from ditto'ed. It's better that way, now I don't have to worry about Lynne Truss lurking in dark corners with a cleaver.]

Posted by Mark Liberman at 08:15 AM

This is not your granddaughter's spelling bee

(First of a planned triblogy on the wonders of Dutch spelling.)

The scene is a lushly appointed red-carpeted assembly room, walls hung with golden-threaded tapestries and heavily framed portraits of monarchs, princes and ministers; even the ceiling is wall to wall fine art. This is the main assembly room of the Secretary of State. Sitting in carved wooden seats, pew-like except for the heavy green upholstery, are sixty worthy citizens of two lands, many of them famous from the worlds of literature, politics, television, theater and sports, while a panel of even more distinguished judges sits on a dais at the head of the room. Two gray-haired gentlemen in full evening dress, white bow tie and tails, stand unmoving in either wing: their sole function in the next hour and a half will be to collect pieces of paper. In the room's two huge upper galleries hundreds of common folk eagerly watch.

The event is televised in one of the most widely watched broadcasts of the year. Almost one in ten of the population is sitting at home, glued to the screen, volume turned up slightly higher than usual, pen and paper in hand. Though many of the worthies are media personalities, and usually able to affect an easy, relaxed air before the cameras, most of them seem a little awkward - they must be nervous. They also have paper in front of them, and fine pens.

A senior television personality, smartly dressed, stands at the center of the room. He has cataloged for us the history of the room and the event, and now he warns us that we have nearly reached the appointed time. He provides instructions for those in the hall, and then gives advice to viewers at home. He tells us we must make sure to be sitting comfortably, and must seal the rest of the world out: close curtains, turn cell-phones off, and shut the door (much as you do when reading LanguageLog.) After more procedural details, he wishes everyone strength. He pulls his spectacles from his pocket, flicks them open with a practiced definiteness and sits them on his face. What is about to happen in this scene is something that happens every year. The outcome is a matter of pride for the two small countries that compete. But what is about to happen?

What is about to happen is: The Great Dictation. Or more properly, for the proceedings are entirely in Dutch and intended only for Dutch and Flemish eyes and ears, it is Het Groot Dictee. The host will read a 200-word text, and all across the Netherlands and Flanders (the Dutch-speaking part of Belgium), people will take dictation. The 2003 text begins: In deze dagen van buiig sinterklaasweer en goeddeels illusoire kerstpakketten.... (In these days of windy Santa Claus weather and largely illusory Christmas packets [i.e. the sort you get from an employer]....) You can watch the entire 2003 proceedings in Realplayer (big/small) or Windows Media (big/small) formats.

Amazing, I hear you cry. Yet you are not amazed at the mere fact of spelling being treated with so much gravitas. No, as a language lover you find that easy to understand, albeit that most of your fellow countrymen couldn't give a toss. What amazes you is that a Dutch Great Dictation would even be possible.

Is not Dutch spelling state controlled, and the responsibility of the Dutch and Belgian Ministers of Culture? Were there not four times in the twentieth century panels of language experts assembled by the two states, panels which legislated new and improved Dutch spelling? Did not the most recent, 1994 commission finally straighten out almost all uncertainty or illogicality left in the language? And did not our very own Bill Poser recently worry that spelling reform would mean the end of even the humble American Spelling Bee, a competition which by comparison with Het Groot Dictee is literally child's play? Yes, yes, yes and yes. So how can it be that educated Dutch and Flemish people still make so many spelling errors that a dictation can be a challenge? How come even the winner (in 2003, a Dutchman) made seven mistakes in 200 words?

Well, Dutch spelling certainly has a logic to it, and is wonderfully deterministic. When you don't know how to spell something, there is almost always a right answer, and the answer is probably in the official bi-nationally approved word list. But deterministic is not the same as easy, as the next installment of this triblogy may clarify. I will tell you about ants fucking. Or ant fucking, as the Belgian minister of culture argued back in 2002. (It was probably just cheap populism, vote seeking before the elections soon afterward.)

[To be continued.]

Posted by David Beaver at 03:40 AM

What For

I don't know anything about English syntax, but Mark's reference to Churchill's example of the avoidance of preposition-stranding reminded me of my favorite example of preposition-stranding. I don't know who came up with it. The context is that a little girl and her father have discussed what her bedtime story will be, but when he goes up to her room, he brings the wrong book. She says:

Daddy, what did you bring that book that I didn't want to be read to out of up for?

It is possible to remove the stranding of the prepositions, but only by a significant rearrangement, including a change from passive to active and the use of why instead of what ... for. The result is quite stilted:

Daddy, why did you bring up that book out of which I didn't want you to read to me?

There doesn't appear to be any way to avoid preposition-stranding if you use what ... for rather than why. Substituting what for for why is ungrammatical:

*Daddy, what for did you bring up that book out of which I didn't want you to read to me?

Moving for to the end makes the sentence acceptable:

Daddy, what did you bring up that book out of which I didn't want you to read to me for?

So it appears that the what ... for construction requires preposition-stranding. Interestingly, in my opinion, although the last version is acceptable, it is still rather awkward. By far the most natural of these sentences is the first, with the long string of stranded prepositions.

Posted by Bill Poser at 01:55 AM

Cooking Verbs

Some languages, such as English, have quite a few words for different kinds of cooking. Just off the top of my head, English cooking verbs include: fry, sauté, braise, deep-fry, stew, boil, simmer, steam, bake, roast, broil, toast, and grill. In contrast, Carrier, the native language of a large part of the central interior of British Columbia, has only two cooking words. Carrier has no infinitive or other obvious citation form, so I'll use the second person singular optative affirmative, roughly "mayst thou ..." which one might use in a recipe. One form is [onliz]. It means "to cook by immersion in hot liquid or steam". So it corresponds to English boil, stew, simmer, and steam. The other form is [oɬtes̪]. This covers everything else, frying, baking, roasting, etc.

Now, you might think that this was a function of the culture. Until pretty recently, the number of cooking techniques available to speakers of Carrier was limited. They could cook directly in the fire, they could bury food under the fire or in the coals, and they could boil it by putting the food and water into birch-bark baskets, heating stones in the fire, and putting the stones into the basket. They didn't have pans or griddles and so couldn't fry things. So it would have been reasonable to distinguish roasting, baking, and boiling or stewing, but it wouldn't make sense to have any of the frying words or to distinguish broiling from baking. Even so, the language actually has fewer distinctions than it might since there is no distinction between roasting and baking. That may be because, from what I know, although baking in a pit or in the coals was something they could have done, Carrier people don't seem to have done it.

What is really interesting about cooking terms is that even languages of cultures that have had a wide variety of cooking techniques for centuries don't always have many basic cooking terms. Japanese distinguishes just 煮る [niɾu] "boil, stew", 蒸す [musu] "steam", 揚げる [ageɾu] "deep fry", and 焼く [jaku] which covers "fry, roast, bake, broil, toast". 焼く [jaku] is really a general purpose verb meaning "burn" and is not limited to cooking. There is also the verb 煎る [iɾu], but in my experience it is just a relatively little used synonym of 焼く [jaku], not another kind of cooking. Within these, there are some more specific verbs. One can distinguish stir-frying 煠める [itameɾu] from other dry-heat cooking 焙る [abuɾu]. My impression at least is that such distinctions are not often made, whereas one hears 焼く [jaku] all the time. One can also distinguish two types of boiling. 炊く [taku] is used specifically for boiling rice while 煠でる [judeɾu] is used for everything else. I didn't initially think of 炊く [taku] as a cooking verb because it has all sorts of other uses in the range "burn, heat, stoke, kindle". Among other things, it is the appropriate verb for burning incense. In Japanese as in Carrier, it is possible to describe the cooking process more specifically by adding adverbs and postpositional phrases, e.g. "gently" or "in a frying pan", but the number of basic verbs is limited.

I've revised the above description of Japanese a bit since I first posted it last night. I hadn't included all of the subsidiary terms, and used what proves to be a non-standard term for steaming. Russell Lee-Goldman had some comments in his blog including various of the more detailed terms, but he raises the question of whether there is anything odd about the Japanese terminology since it includes all of the cooking techniques in common use. I think that it still contrasts with a terminology of the English type because English forces a more detailed set of distinctions. In Japanese the only distinctions that you have to make, and in my experience the only ones that people commonly make, are among boiling, steaming, deep-frying, and everything else. 焼く [jaku] covers a lot of ground that must be divided up in English. In English there is no term that covers frying, roasting, and baking; you've got to make the distinction.

Maybe languages like English with a lot of cooking verbs are unusual. I can't tell. As far as I know, this subject has not been discussed in the linguistic literature, and I don't have a good sense of it even among the languages with which I am familiar. I can read a newspaper or a linguistics article in a fair number of languages, yet I realize that there are only a few in which I have any command of the cooking terminology. It's the sort of thing that you probably only learn if you actually live your life in a language for a while.

Posted by Bill Poser at 01:07 AM

June 18, 2004

Criticizing Pinker the right way

A few days ago, Joshua Macy posted an interesting review of Pinker's The Language Instinct. I agree with his overall evaluation:

It’s entertaining, it’s informative, it’s even funny sometimes, so what more could you want of a popular science/language arts book? My problem is that Pinker comes across, as the song goes, just an inch too sure of himself for me. The Language Instinct gives the strong impression that there are no rival theories worth mentioning, or at least it never bothers to mention any except when relating how Chomsky destroyed this or that primitive misunderstanding.

However, when he treats a couple of examples in detail, Macy undermines his thesis by unfairly misconstruing Pinker's argument.

The issue that Macy picks is one on which Pinker is vulnerable: the difference between mice-eater (which seems OK) and rats-eater (which is disfavored in comparison), and the interpretation and implications of the difference. We've discussed some of the issues on Language Log over the past few months, at least in passing, and I cited a recent paper that's extremely critical of Pinker's position: Haskell, T.R., MacDonald, M.C., & Seidenberg, M.S. "Language learning and innateness: Some implications of compounds research". Cognitive Psychology, 47, 119-163. (2003).

However, Macy's argument against Pinker is just wrong. He starts out by quoting Pinker:

The rules of syntax can look inside a sentence or phrase and cut and paste the smaller phrases inside it. For example, the rule for producing questions can look inside the sentence This monster eats mice and move the phrase corresponding to mice to the front, yielding What did this monster eat? But the rules of syntax halt at the boundary between a phrase and a word; even if the word is built out of parts, the rule cannot look “inside” the word and fiddle with those parts. For example, the question rule cannot look inside the word mice-eater in the sentence This monster is a mice-eater and move the morpheme corresponding to mice to the front; the resulting question is virtually unintelligible: What is this monster an -eater?

and then comments:

Nonsense. What is this monster an eater of? is perfectly intelligible. (As is Of what is this monster an eater? for you anti-Churchillian prescriptivists, which puts it in a form closer to that of Pinker’s question rule.) Compound words of this sort are not indivisible “syntactic atoms” even if the syntactic rule for splitting and rearranging them is more complex than the one that Pinker proposed.

But What is this monster an eater of? is completely irrelevant, because "an eater of mice" is not a compound noun, it's a phrase consisting of a noun phrase connected to a prepositional phrase. Pinker's point was that compound words (like mice-eater) are "syntactic islands" (to use Haj Ross's justly celebrated phrase), in a way that syntactic phrases (like eater of mice) made up of the same morphemes generally are not. Macy's example counts in favor of Pinker's position, not against it.

Macy compounds his error by bringing Winston Churchill into the discussion.

He's referring to Churchill's legendary attempt at a reductio ad absurdum of the stupid prescriptivist prohibition against preposition-stranding: "This is the sort of bloody nonsense up with which I will not put".

The trouble is, Churchill's example is almost (not quite) as unfair as Macy's: "... up with which I will not put" is bad because the fronted string involves not just the preposition with, which is genuinely connected to the which, but also another preposition, up, which is not:

...bloody nonsense [I (will not put up) (with which)] ↔
...bloody nonsense [up with which I will not put ( )]

That's roughly like trying to compose a phrase as

This is something [I (do not agree completely) (with which)] ↔
This is something [completely with which I do not agree ( )]

(as opposed to "This is something with which I do not agree completely ( )", which is syntactically fine, if pragmatically wishy-washy).

I suspect that Churchill knew better -- no one ever accused him of fighting fair when he had a point to make. But it's amazing that people interested in language hardly ever catch him out. This tells us something about the state of syntactic education in the Anglosphere today, alas.

I don't mean any disrespect to Joshua Macy, who is clearly an intelligent, insightful and accomplished person. Nor do I mean in any way to send the message that amateurs are unwelcome. In the field of syntax, I'm an amateur at best myself. But we desperately need to find a way to help people to learn to do elementary syntactic analysis in a coherent way.

Posted by Mark Liberman at 11:46 PM

Predicting random eggcorns

Francis Heaney writes:

Morrissey drops an eggcorn on his new CD (as good as "Vauxhall and I", not quite as good as "Your Arsenal"), in the song "I Like You": "Something in you caused me to / take a new tact with you".

Some other common examples recently sent in by readers include "slight of hand" for "sleight of hand", and "for all intensive purposes" in place of "for all intents and purposes".

As usual, the phonetic difference between the original and the eggcorn ranges from nil ("slight of hand") to small ("for all intensive purposes").

I've pointed out in the past that one can use web search to compare rates of eggcorn usage in different contexts, for instance in news vs. the web at large. I'd like to reiterate here that web search makes it possible to predict eggcorns and investigate their occurrence experimentally -- and even to estimate their rates of occurrence.

Thus I can open a magazine on my desk to a random page, and pick a random phrase -- here is "marginal cost" -- and predict a likely eggcorn -- say, "margin of cost" -- and check the web to find it!

(link) What IT potentially offers, he says, is economies of scale, the possibility of enlarging the scope of educational activities at a relatively low margin of cost, and mass student customization of education.
(link) Leaving the law on one side and turning to economics, it is a well established principle that if one wants to maximise one's returns, one carries out an activity until the margin of cost is equal to the margin of revenue.
(link) Realize, of course, that if the report costs more to compile the margin of cost to the report to one more copy would go down, making one more copy even cheaper in comparison.

The raw rates of occurrence (195,000 whG for marginal cost, 148 for margin of cost) are not necessarily to be trusted. One needs to do some sampling to estimate the rate of valid hits -- and to do that, we really need one additional feature from Google or other web search systems, namely the ability to get a random sample of N hits. But with that proviso, one could really use such techniques to do Google psycholinguistics.
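
The sampling step described above can be sketched in a few lines of Python. The raw counts are the ones quoted in this post; the sample sizes and the numbers of valid hits are invented purely for illustration, and `estimate_valid_hits` is a hypothetical helper, not part of any search engine's interface. It simply scales a raw count by the fraction of a hand-checked sample that turned out to be genuine, with a rough normal-approximation interval.

```python
import math

def estimate_valid_hits(raw_hits, sample_size, sample_valid):
    """Scale a raw hit count by the fraction of a hand-checked sample that was valid.

    Returns (estimate, low, high), where low/high is a rough 95% interval
    from the normal approximation to the binomial.
    """
    p = sample_valid / sample_size
    se = math.sqrt(p * (1 - p) / sample_size)
    low = max(0.0, p - 1.96 * se)
    high = min(1.0, p + 1.96 * se)
    return raw_hits * p, raw_hits * low, raw_hits * high

# Raw counts quoted in the post.
RAW_ORIGINAL = 195_000   # "marginal cost"
RAW_EGGCORN = 148        # "margin of cost"

# Hypothetical hand-checked samples (made-up numbers, for illustration only).
orig_est, _, _ = estimate_valid_hits(RAW_ORIGINAL, sample_size=100, sample_valid=98)
egg_est, egg_low, egg_high = estimate_valid_hits(RAW_EGGCORN, sample_size=50, sample_valid=20)

# Share of the eggcorn among all valid uses of either variant.
rate = egg_est / (egg_est + orig_est)
print(f"estimated valid eggcorn hits: {egg_est:.0f} ({egg_low:.0f}-{egg_high:.0f})")
print(f"estimated eggcorn rate: {rate:.5f}")
```

With only 148 raw hits one could of course check every one by hand; the scaling matters for the larger count, where a random sample of N hits is exactly the feature the search engines don't yet provide.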

Posted by Mark Liberman at 10:42 PM

A CNN-ism?

This afternoon, CNN's front page ran a composite picture of Paul Johnson, under the headline "Captors behead U.S. hostage Johnson", with the following paragraph below the picture:

Al Qaeda militants beheaded an American hostage, posting on an Islamist Web site today three photographs of the head and body of Paul Johnson, who was abducted in Saudi Arabia six days ago. President Bush reacted to the killing by saying, "America will not be intimidated by these kind of extremist thugs."

The presidential quote, as given, exemplifies the non-standard use of singular kind with plural these that I discussed a few days ago.

In contrast, Reuters quoted Bush as saying "America will not be intimidated by these kinds of extremist thugs." So did ABC News, The Guardian, The Australian, the Associated Press, the Washington Post, and the New York Times, among others. This is also the version found in the official White House transcript.

So far, I haven't been able to find an audio clip of the presidential statement. NPR gives the beginning, but cuts it off before the phrase in question.

In the absence of a clip to check, two alternative explanations present themselves:

(1) Bush said "these kinds of extremist thugs" and CNN's sub-head writer blew the quote, whether by hearing it wrong or by copying another story incorrectly;

(2) Bush said "these kind of extremist thugs" and all the other reporters cleaned up the quote.

Given how unwilling the press corps has traditionally been to give W the benefit of the linguistic doubt, I'm going with (1) for now.

Note that in neither case is there any real linguistic blame to be assigned. The sequence "these kind of" has 225,000 hits on Google. It's still a minority taste: "these kinds of" has 2,080,000 hits, or about 9 times as many. But it's the sentiment that matters most, not the formality of its extemporaneous expression.
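
For anyone who wants to check the arithmetic, here is a small Python sketch using only the two hit counts quoted above. The variable names are mine, and the counts are raw Google figures, so the result is no more trustworthy than they are.

```python
# Hit counts as quoted in the post.
these_kind_of = 225_000
these_kinds_of = 2_080_000

# How many times more frequent the standard variant is.
ratio = these_kinds_of / these_kind_of

# Share of the non-standard variant among the two taken together.
minority_share = these_kind_of / (these_kind_of + these_kinds_of)

print(f"'these kinds of' is {ratio:.1f} times as frequent")
print(f"'these kind of' accounts for {minority_share:.1%} of the two variants")
```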

[Update: I found the video of Bush's reaction (recorded at the airport in Seattle) on CNN. It's clear that he says "these kinds of extremist thugs", just as all the sources except CNN's web site had it.

As I wrote in my earlier discussion:

You can make any public figure sound like a boob, if you record everything he says and set hundreds of hostile observers to combing the transcripts for disfluencies, malapropisms, word formation errors and examples of non-standard pronunciation or usage. It's even easier if the critics use anecdotes based on the perceptions and verbal memories of equally hostile listeners.

Here we have a case where Bush was quoted by a national press source using a non-standard way of talking, but the "error" turns out to have been made by the journalist rather than the politician. ]

[Update 6/19/2004: Slate's Today's Papers gives the "these kind" variant of the quote: will we see it appearing as a Bushism? ]

Posted by Mark Liberman at 06:18 PM

Repeating repeatedly

Repeating something is not the same as repeating it repeatedly, though it may be stylistically unwise to say so explicitly:

(link) I am aware that people need to vent, but these many long letters that are repeated repeatedly are too much.

Even when a modifier seems to be genuinely redundant, it may still add something, if only emphasis:

(link) A broad analysis should clarify clearly what these partners and organizations are able and willing to contribute to the European standardization process.
(link) Form questions in order to clarify clearly a problem, topic, or issue.
(link) At least they knew, the point that he made was that the ordinance as it was drawn up, did not clarify clearly what the changes actually were that they were being asked to vote on.

People who write clarify clearly are setting themselves up to be ridiculed, certainly -- but why? It's understood that Chinese languages have changed over the past couple of millennia by adding redundant modifiers, whether to reduce ambiguity or to add emphasis or just for the fun of it, and no one ridicules the Chinese for this.

I have to admit that it's often hard to swallow the results of a similar impulse in English. Since specify is often used to mean nothing more than "say" or "explain" or "list", it's natural to want to add something when you want to specify that a statement, explanation or list needs to be explicit and complete:

(link) We also specify specifically that it contains text data.
(link) Some union contracts, company personnel policies or group health care contracts may specify specifically how long an insured worker or other disabled person is entitled to have group health coverage in the event they are not actually on the job and working.
(link) A request by a School District employee for a release of rights to copyrightable material shall be directed to the Superintendent of the School District, shall be in writing, and shall specify specifically the material for which the release is requested, preferably by submitting two copies of such material with the request.

Natural, but imprudent -- if plain old specify is not specific enough, it would be better to write "explain specifically" or "list specifically" or even "specify explicitly."

I have less sympathy for those who write interact interactively -- what are these people trying to say, I wonder? Has plain interact become bleached to the point where it just means "communicate" or "access" or "use"?

(link) TUG's target is to create a network of people who live and interact interactively.
(link) Development of multiresolution data structures for effective representation of terrain and city models, as well as development of technology to visualise and interact interactively with 3 D-terrain and city models on mobile computers.
(link) Note that the buyer role could also be implemented using a web server, so buyers could interact interactively with “Beautiful Flowers Wholesale” via their web page.

The logic of redundancy can be subtle, but superficial stringwise repetition is striking and easy to ridicule.

Posted by Mark Liberman at 06:27 AM

Delightful chaos

Trevor at kaleboel wrote

I sometimes suspect that English is so popular (and so strongly associated with ideologies of freedom) not because of its status as the world's primary language of intercommunal transaction but simply because it is such a delightful chaos. For this reason I am happy that it shows every sign of resisting the linguistic hygiene and spelling reform brigades and becoming ever more diverse and confusing.

English morphology is not a patch on Russian (or Navaho or Hausa) for chaotic delights, and the English writing system, however chaotic, is surely inferior in chaos to Japanese. But still, I think that Trevor is on to something.

Posted by Mark Liberman at 06:10 AM

Ah me -- ah moo

In response to Geoff Pullum's post about the syntactic and legal status of "under God", Q_pheevr offers an argument with two fascinating scholarly citations:

Hiram Walker, the founder of the Canadian Club distillery, contracted to sell, to a banker named T. C. Sherwood, an Aberdeen Angus cow, Rose 2d of Aberlone, whom both parties believed to be infertile. Accordingly, Rose's purchase price was set, on the assumption that she was of value only qua beef, at five and a half cents per pound (minus fifty pounds for shrinkage). Upon being weighed, Rose turned out, in fact, to be pregnant. Walker reneg(u)ed; Sherwood sued; the case went to the Supreme Court of Michigan.

Sherwood lost, and thereby hangs an argument: is a nation being "under God" like a cow being pregnant, or not?

I'm even more impressed by Q's second citation:

I should probably dispel the illusion I may have just created that I am capable of citing hundred-year-old state supreme court opinions at will. The fact is, Sherwood v. Walker is the only such case at my fingertips, and the only reason I am so well acquainted with it is that my late grandfather, the legal scholar Brainerd Currie, wrote a long poem (mostly in the style of S. T. Coleridge's "Christabel") about Rose 2d of Aberlone. (Members of my immediate and extended family can sometimes be heard quoting the immortal refrain Ah me! -- Ah moo!) A few lines from section 4 will give you a sense of the flavour of the thing:

Now, there's a distinction, as I've been taught,
Twixt a cow that's pregnant and one that's not.
In fact, the fallacy is arrant
That places a potential parent
In even the same taxonomy
With that drain on our economy
That we deprecate by all that's holy—
The wretched beast that's sine prole.

Read the whole thing...

Posted by Mark Liberman at 06:06 AM

June 17, 2004

Legal manners or means

The expression by no manner of means is doubly archaic. The relevant sense of means is best known these days through collocations like ways and means, by means of, means of production, and so on, or through the contrast between means and ends. And manner of is an archaic way of saying kind of or type of, as in "what manner of man is this?"

So by no manner of means is an archaic emphatic form of by no means, just as in no kind of way is a modern emphatic form of in no way. An informal modern example: "To hear one of those amplifiers compared to what we have now, they ain't no relatives in no kind of way."

Originally, by no manner of means was just as straightforwardly compositional as in no kind of way. But as a result of the double archaism, many people no longer understand how this expression means what it does. We can see this by the way they substitute other words into it:

(link) The course material in "Sweet Endings: Coffee, Tea, Port, Dessert Wines and Single Malt Scotch," is by no matter of means frivolous, and at the end of the thirty hours, the students will have acquired a lot more than some good social chatter.
(link) This was by no manner or means a distinguished performance by the winners who must improve out of all recognition if they are to pose a threat to wither Cork or Tipperary in next month’s final.

Likewise not ... by any manner of means is historically just an emphatic way to say not ... by any means. It's analogous to not ... in any kind of way in today's English: "We weren’t related in any kind of way." Here too, we can find plenty of examples of all the expected substitutions:

The designation "Solo" does not say it all, by any matter of means.
From what I have heard, there will be an upgrade in the area, but it won't be a magic cure by any matter of means -- an incremental advance at best.
The legend of The Black Dog of West Peak did not end with the death of geologist Pynchon by any matter of means.

I think you should know that we are not laggards in this field by any manner or means.
I'm not a big royal fan by any manner or means, I just dislike unchivalrous acts more.
In short, one didn't think of him as a warm personality by any manner or means.

They aren't all screwball reports by any matter or means.
I had courses in accounting when I was in college and I was not, by any matter or means, stupid about the fact that the business had to make money.
This list is not complete by any matter or means!

And even some less expected ones!

The Daniela hotel is not "luxury" by any manor of means, but it is a lovely place to stay, and allows easy access to the centre of Rome.

The big surprise, however, is how common by any manner or means is in legal discourse:

(link) "Manufacturer" shall include every person who, in the process of filling or refilling an original package with alcoholic liquors purchased by such person, changes the degree or quality of such alcoholic liquors by any manner or means whatsoever.
(link) It is unlawful for any licensee, or any employee thereof, directly or indirectly to make, disseminate, represent, claim, state, or advertise, or cause to be made, disseminated, represented, claimed, stated or advertised by any manner or means whatever, any statement or representation concerning structural pest control, as defined in Business and Professions Code section 8505, which is unfair, deceptive, untrue or misleading, and which is known, or which by the exercise of reasonable care should be known, to be unfair, deceptive, untrue or misleading.
(link) Upon payment in full to the Animation Writer of the Script Fee and subject to Articles 807 and 811, the Producer shall be deemed automatically granted from the Animation Writer an exclusive licence for the full term of copyright (and any extensions thereof) to exploit by any manner or means, now or hereafter known the copyright in the Script Material.
(link) In this permission form, “disclose” means to permit access to or the release, transfer or other communication to any person (other than school officials SBISD has determined have a legitimate educational interest) by any manner or means, including oral, written, or electronic release to the news media and on the Internet’s World Wide Web.

and so on for more than 350 other cases in Google's current index.

Are these examples just misconstruals of by any manner of means, or are they independent, fully compositional phrases? It's not clear. In other legal writing, we do get things like

(link) No part of this website may be reproduced in any form, by any manner, without explicit written permission.
(link) Any proposed acquisition of property by any manner shall be subject to the approval of the Board.
(link) It is unlawful to take or attempt to take any fish in fresh waters by any manner except in the manner commonly know as angling with handline or with rod and line, or as otherwise allowed by law.

So given that we can say that "no part of this website may be reproduced by any manner" and also that "no part of this website may be reproduced by any means", it seems perfectly compositional to express the disjunction: "no part of this website may be reproduced by any manner or means." And that is exactly what is happening, when the Pennsylvania Game Commission stipulates that

...it is unlawful for any person to:

(1) Hunt or take any game or wildlife by any means or manner or device, including the use of dogs, without first securing and personally signing and displaying the required license.

However, it seems to me that you should do (or be prohibited from doing) something in any manner, not by any manner. In support of that view, I can offer the fact that the OED's examples contain 28 cases of "in any manner", e.g.

1886 Laws Lacrosse ix. §7 The goal-keeper..may put away with his hand or foot, or block the ball in any manner with his crosse or body.
1677 GALE Crt. Gentiles IV. 404 The Divine Wil is universally efficacious, insuperable..nor impedible and frustrable in any manner.

and only four cases of "by any manner", three of which are instances of "by any manner of means", and one of which is the following, which I am unable to parse at all:

1533 CRANMER Let. to Duchess Norfolk in Misc. Writ. (Parker Soc.) II. 255 When it shall be by any manner way void.

If I'm right that manner should take in, not by, then maybe all those legal by any manner phrases are back-formations from a misconstrual of by any manner of means as by any manner or means. If this analysis is correct, then this phrasal "folk etymology" has not only modified the idiom by any manner of means, it's also modified the prepositional affinities of the noun manner -- thus restoring healthy compositionality to a mangled idiom!

[Update 8/3/2006 -- John Cowan writes:

I well remember the tale of an official (and officious) sign saying "PLEASE LEAVE THIS TOILET IN THE MANNER IN WHICH YOU FOUND IT" under which someone had written "You mean by groping around?"
I think the Book of Common Prayer's "prayer for all sorts and conditions of men" may be the ancestor of this strange use of "manner", though googling for "all manner and conditions" comes up void.

]

Posted by Mark Liberman at 05:16 PM

Smoothista

In today's MeMo at the Houston Chronicle, Kyrie O'Connor seems to have coined a new word:

Insight at the Slightly Depressed Smoothie Place. The world would be a lot better if bosses didn't behave like jerks. OK, I just said insight. I didn't say original insight. But here's an example. The formula for making the squid-ink smoothies must be very inexact, because whether the manager or one of the smoothistas makes one, there is always a lot left in the blender. Now, what the smoothistas do is they just get a bigger cup and give me, the customer, the extra. The manager fills the right-size cup and then dumps the rest out. What can be the possible rationale for this? So I won't be disappointed if they ever figure out how to make smoothies the proper size?

Google finds no instances of smoothistas, and the only return for smoothista is a Finnish inflected form in the context "jotka hyppää toistaiseksi smoothista kokonaisuudesta esiin kuin sukellusvene Gobin autiomaassa" (which I hope is not unsuitable for quotation in a family weblog...).

Now, maybe smoothista is already well established down there in Houston, or maybe Ms. O'Connor just coined it, I don't know. But it looks to me like it hasn't been written down before, and I think it's a keeper.

Posted by Mark Liberman at 05:04 PM

Turn about

After two and a half centuries of British complaints about Americanisms polluting the pure fount of the English language, we've now reached the point where American professor Ben Yagoda feels entitled to complain that "Briticisms have passed their sell-by date, and the odor (or should I say odour) is getting a bit rank".

This is more than just the usual objection to pretentiousness, though that stereotype is certainly there. Yagoda seems genuinely offended that his language is being invaded by foreign expressions, in just the same way that Fowler was, a hundred years ago on the other side of the Atlantic.

Posted by Mark Liberman at 12:51 PM

Mysterious Google

Query: antonym
Helpful Google: Did you mean: synonym

The relation is not symmetrical. A query for synonym just gives the list of results, with no helpful suggestion that you might mean antonym. This probably has something to do with the fact that there are 1,230,000 hits for synonym, and only 78,900 for antonym. However, more is going on than just Levenshtein distance and hit counts, since meronym, with the same edit distance and only 1,680 hits, also fails to get any helpful "did you mean" suggestion.
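
To check the claim about edit distance, here is a minimal Python sketch of the textbook Levenshtein computation (insertions, deletions and substitutions, each at cost 1). The function name and the loop over the two queries are mine; nothing here is meant to suggest how Google's "did you mean" feature is actually implemented, which I don't know. Both antonym and meronym come out at three edits from synonym.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute; cost 1 each)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match for free
        prev = curr
    return prev[-1]

# Hit counts as quoted in the post.
hits = {"synonym": 1_230_000, "antonym": 78_900, "meronym": 1_680}

for word in ("antonym", "meronym"):
    print(word, "-> synonym:", levenshtein(word, "synonym"), "edits,", hits[word], "hits")
```

So a rule of the form "suggest a near neighbor with many more hits" would predict a suggestion for meronym at least as strongly as for antonym, which is not what we see.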

[initial observation from a note at SC's site]

Posted by Mark Liberman at 12:25 PM

Feel like that

For many people, "feel like" has become a complex verb that takes sentential complements:

(link) I feel like that I am neglecting my diary.

(link) Within weeks you will notice dramatic changes in your energy levels, and in about 3 months you will feel like that you are a completely different person.
(link) But my problem is that when we talk on the phone he is not much interested in talking and I feel like that he is not interested in me anymore but when I see him he is more passionate than before.
(link) Her and the owner are very close and I feel like that she is telling the owner not to give me the full responsibilities of being a manager because she feels threatened by me.
(link) We feel like that we're making a lot of progress, but a few things cause us to be of continuing concern.
(link) I feel like that they really should have should have taken care of their promise.
(link) I was wondering if you feel like that the Peninsular War is your turf?
(link) Do you feel like that a woman can handle this job?

Apparently phrases such as "I feel like a truck ran over me" are analyzed as analogous to "I believe a truck ran over me", and thus it is felt that a missing that can be supplied in both cases.

The same thing seems to have happened in these examples -- along with some other re-analysis or blending in the cases with referential subjects:

(link) It seems like that you discovered yourself when you found the writer Lorenzo in the story.
(link) The few times that Dantzler was sacked or stopped by the defensive line, it seemed like that they would gain it all back and then some with the next down.
(link) He seems like that he could care less if it is anything worse.
(link) Every time she sees me, she seems like that she wants to kill me.
(link) It's hard to believe that the live tracks were improvised, as they seem like that they had have been composed.
(link) It looks like that I'll be arriving there in the middle of June.
(link) I like him coz he looks like that he is always thinking about something.
(link) i think that she is really pretty!!! but in the first one she looks like that she smoked a pack of cigarets in 5 minutes and cant breath.

[Update: this is apparently not a new development. In a letter from S. Randolph Adkins to his aunt Sarah Adaline Thorne Merritt, datelined Petersburg, Va. July 6, 1864, he complains that "I have not herd from you nor the peeple of that neighborhood in sum time it appears like the peele has all quite riting and has all forgottten the soldiers that is far from home and friends", and points out that:

it dus not appear like that it would be a very difacult matter to rite a fw lines to a friend at a lesure time

]

Posted by Mark Liberman at 10:30 AM

June 16, 2004

Mr. Brain's Faggots

I've visited the U.K. more than a few times, and read many British novels, memoirs, biographies, histories and news articles. But every once in a while, something surprises me.

There's a discussion of the varied history of the word faggot on the two sides of the Atlantic here; the "meatball" meaning is mentioned but not explained. Geoff Pullum suggested in email that the shared meaning is something like "bundle of small pieces of X", where X is meat in the case of meatballs. The OED's sense 3.a. is "A bundle or bunch in general, e.g. of rushes, herbs, etc.", and 3.b. is "A 'bundle', collection (of things not forming any genuine unity)". But all of the examples involve either thin stick-like things (herbs, bones, etc.) or completely abstract things ("faggot of compliments", "faggot of utter improbabilities").

The OED's sense 6.a. is "term of abuse or contempt applied to a woman", with citations going back to 1591, and sense 6.b. "(male) homosexual slang (orig. and chiefly U.S.)", with citations from 1914, is felt to be derived from that.

The OED gives the "meatball" meaning as sense 5., with the definition given only as reference to this quote (as a result of which I missed it the first time I read the entry!):

1851 MAYHEW Lond. Labour II. 227 He..made his supper..on 'fagots'. This preparation..is a sort of cake, roll or ball..made of chopped liver and lights, mixed with gravy, and wrapped in pieces of pig's caul.

It seems to be assumed that 6.a. comes from some concept like "bag of bones" or "bundle of sticks" -- though "meatball" is also a commonplace insult...

[via Wordlab]

Posted by Mark Liberman at 09:43 PM

ISMS

The Institution of Silly and Meaningless Sayings (ISMS) has a database of "more than 900 genuine isms". This category appears to combine what we have taken to calling eggcorns (e.g. legal waver) with mixed metaphors (e.g. "He put his finger right on the nail"), redundancies (e.g. "The Regional RMT (regional regional management team)"), malapropisms (e.g. "He was pruning himself in the mirror"), tautologies (e.g. "The dogs were... they were... just like... animals"), overnegations (e.g. "There's a complete lack of indiscipline in the team") and some other things (e.g. "He's had three attempts on goal and neither of them were any good").

Posted by Mark Liberman at 09:19 PM

MLA Language Map

The MLA has just opened up a site where you can "[v]iew an interactive map showing the density of speakers of thirty-seven languages and language groups", with all kinds of neat features, based on the 2000 census data. The site is pretty heavily loaded, so you might have to wait a while to see your maps, until the rush dies down.

From my own experiences, I surmise that they are not caching maps, even common ones like "English by county for the mainland U.S." As a simple sample, here's a map of the "density" of Armenian speakers by county for the mainland:

[Update: as reader Margaret S. emailed to point out, this is an odd use of the term "density", since it refers to the number of speakers per geographic region (county or zip code) for regions that are highly non-uniform, both in size and in population. It would be nice to be able to see some alternative normalizations, say proportion of the local population, or speakers per unit area -- and perhaps the MLA site should refer to what they now show as "the population of X speakers by county (or by zip code)", or "the number of X speakers ...", rather than the "density of ..."

Whatever the nomenclature, here's a map of the number of Chinese speakers per zip code for the region between Philadelphia and New York:

]
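
The difference between a raw count and the alternative normalizations suggested above is easy to see in a small Python sketch. The two regions and all of their figures are made up for illustration (they are not census data), and the `Region` class is a hypothetical stand-in for whatever records the MLA site actually uses.

```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str
    speakers: int      # speakers of the language in the region
    population: int    # total residents of the region
    area_sq_mi: float  # land area in square miles

    @property
    def share_of_population(self) -> float:
        """Speakers as a proportion of the local population."""
        return self.speakers / self.population

    @property
    def speakers_per_sq_mi(self) -> float:
        """Speakers per unit area, i.e. density in the usual sense."""
        return self.speakers / self.area_sq_mi

# Made-up regions, for illustration only (not census figures).
regions = [
    Region("Large County", speakers=20_000, population=4_000_000, area_sq_mi=4_000.0),
    Region("Small County", speakers=5_000, population=50_000, area_sq_mi=100.0),
]

for r in regions:
    print(f"{r.name}: {r.speakers} speakers, "
          f"{r.share_of_population:.1%} of population, "
          f"{r.speakers_per_sq_mi:.1f} per sq mi")
```

In this invented example the large county has four times as many speakers, but the small county comes out ahead on both normalized measures, which is the kind of reversal that a map of raw counts per county can hide.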

Posted by Mark Liberman at 08:41 PM

Under God an Idiom?

I don't buy Geoff Nunberg's argument that if the meaning of under God were compositional it should occur frequently in other contexts. A phrase will occur frequently in various contexts if people have frequent occasion to express the idea it expresses and if there are not very many ways to express that idea. The rarity of under God is probably due to the fact that these conditions are not satisfied. First, it isn't all that often that people feel the need to express this idea. Nobody wants to express the idea that the United States is physically located beneath a physical God because to my knowledge nobody believes that. And those people who believe that the United States has been, is, or should be "under God" in the sense of being watched over by God or ruled by God or conforming to God's will, are all, in my experience, Christians, and when they express such ideas they speak in more specific terms, e.g. about a Christian nation.

This brings us to the second condition. There are lots of ways of expressing this sort of idea. Many of them wouldn't be good candidates for this particular context because they couldn't be made to fit the syntax or the metrics. Remember, the words under God were an addition to an existing text, which constrained the possible choices of phrasing. An additional constraint is that, although it may be possible to get away with non-denominational deism, it is pretty clear that a reference specifically to Christianity would have been much more problematic and more likely to be struck down by the courts. So it seems very likely to me that under God is not found in other contexts simply because nobody has felt the need to say the same thing subject to the same syntactic, metrical, political, and constitutional constraints.

We can see this by considering the phrase with liberty and justice for all. That phrase too appears to be unique to the Pledge of Allegiance. The only other hits that Google turned up (and that I could spare the time to look at) are obvious references to the Pledge, such as the title of a book. Yet nobody thinks that this phrase is an idiom. It's a perfectly compositional phrase that happens to occur only in the Pledge of Allegiance and references to it.

Indeed, suppose that under God were an idiom. Why wouldn't it occur in other contexts? Except for deadwood known only to scholars, idioms, like morphological irregularities, have to be fairly frequent or nobody would learn them. I submit that the plausible answer is because nobody felt the need to express whatever it is that this means subject to similar constraints.

It seems to me that there are only two possibilities here. One is that under God is compositional and has one of the several meanings in which it presupposes the existence of a single deity. In this case, it is unconstitutional. The alternative is that it is an idiom, created at the time it was added to the Pledge, whose meaning we really don't know. I find that highly implausible, because it means either that the people who proposed the addition and the legislators who voted for it didn't know what it meant or that they knew but have somehow failed to pass this information on to us. Furthermore, it's hard to believe that so many people would care so much about retaining it if it had no meaning. Indeed, if its meaning is really unknown, given that the campaign to insert it was led by the Knights of Columbus, we wouldn't expect such strong support from evangelical Protestants for retaining it. Instead, I would expect indifference from some and support for removing it from others, who would see it as a Papist plot, probably with a Satanic meaning. In any case, if it really is an idiom of unknown meaning, it may not be unconstitutional, but it has no place in the Pledge because it is meaningless.

Posted by Bill Poser at 03:47 PM

Word Wars

"This is not your grandmother's scrabble." [via join-the-dots]

Posted by Mark Liberman at 02:10 PM

Okay, make that "half-assed legomenon"

Mark may have found citations in Victorian poetry for the use of "under God" as a noun modifier, but they're pretty thin on the ground in Modern English prose. A Google search turns up virtually no instances of "under God" applied to nouns like schools, physicians, organizations, lawyers, doctors, and the like (which would be parallel to the poetic examples that Mark mentions), apart from those that clearly involve explicit or implicit references to the wording of the Pledge.

It's true that "under God" sometimes appears in verse and religious writing used adjectivally. That may be because "I am under God" has sometimes been offered as a translation of Joseph's remark in Genesis 50:19, Ne timeatis: numquid enim loco Dei sum? (though "in the place of God" is more standardly used). But it clearly isn't part of modern English idiom.

I have a problem, too, with Bill's suggestion that the meaning of "under God" should be clear here even if the phrase is a hapax legomenon, since phrasal meaning is compositional. But if the meaning of "under God" here could really be deduced compositionally, why isn't the phrase regularly used in that way -- why don't we see frequent references to "an organization under God," "We're all under God around here," and so forth? The explanation could only be either that "under God" as used in the Pledge expresses a compositional meaning that is somehow appropriate only to this unique context -- which is surely not what the people who inserted the phrase had in mind -- or that the phrase is perceived as an idiom that's used only in this context. In the latter case, it's no different from an ordinary single-word hapax legomenon, whose meaning can't be deduced from the single context in which it appears.

Posted by Geoff Nunberg at 01:59 PM

Out with Under God

Two comments on the under God business. First, whether or not the phrase is a hapax legomenon doesn't really matter. We don't in general need to appeal to previous usage to know what a phrase means. This is not because we are so clever but because of the way language works. In general, meaning is compositional. That is, the meaning of an expression can be computed from the meanings of its components. If this were not true, we would not be able to produce and understand novel utterances. In philology a hapax legomenon is almost always a word and is a problem because a single usage doesn't generally give us enough information to figure out what it means. Phrases are rarely considered hapax legomena even if they are in fact unique because so long as we know their components and the construction is a familiar one, we can figure out what the phrase means. If none of the meanings that we compute in this way make sense in context, we may infer that we are dealing with an unknown idiom, that is, an instance of non-compositionality. Only then do we have a problem. In the case at hand, there are some ambiguities as to what the intended interpretation of under God is, but they stem from ambiguities of English syntax and semantics; they have nothing to do with it being a hapax legomenon.

Secondly, although there is some question as to what under God means in this context, this is irrelevant to the question of whether this phrase is properly included in a pledge recited by children in public schools. On all of the interpretations that have been offered, this phrase presupposes the existence of a single deity. Whether it means that the country is physically located beneath this deity or has been under this deity's rule or ought to conform to this deity's wishes, the common factor is that there is such a deity. It is offensive to those who are atheists or agnostics or polytheists. Causing children to recite a pledge containing these words therefore violates their freedom of religion and constitutes an improper establishment of religion.

Posted by Bill Poser at 01:16 PM

An armed society is a polite grammatical society

This PartiallyClips cartoon illustrates a calm, civilized approach to grammatical argumentation, avoiding the fervor that sometimes accompanies discussions of usage. In an earlier post on preposition stranding, I suggested that a great deal of subsequent grief might have been spared, if only Ben Jonson had survived long enough to bring an analogous form of persuasion to bear on John Dryden.

[link emailed by Michael Albaugh].

[Other Language Log links to PartiallyClips, which seems to be our unofficial Official Cartoon: here, here, here, here, here, here, here, here, here, here, and here].

Posted by Mark Liberman at 12:50 PM

Never say never

I agree almost entirely with Geoff Pullum and Geoff Nunberg about "one nation, under God".

As Geoff Pullum says, "those extra claims in the appositional NP are secondary" and "you can put your hand on your heart and pledge a valid pledge of loyalty regardless of whether you think the appositionally tacked-on claims are sound".

And as Geoff Nunberg says, "[the] meaning [of the phrase under God] is up for grabs". The American Heritage Dictionary gives many senses for the preposition under that might be relevant: In a lower position or place than; Inferior to in status or rank; Subject to the authority, rule, or control of; Subject to the supervision, instruction, or influence of; Undergoing or receiving the effects of; In view of; because of; With the authorization of.

However, I must respectfully disagree with Geoff Nunberg when he says that "[t]he phrase is actually a hapax legomenon, at least in the role of locative adjunct... this is the only place in English that 'under God' is used in this way." The LION database of English poetry has 144 instances of "under God", and quite a few of them seem to me to be unambiguously locative adjuncts modifying noun phrases. I speak under correction, not being a syntactician; but there have been few contexts in which errors will be noted more quickly than in a widely-read weblog entry in 2004.

One striking example is this hymn verse by John Wesley (1703-1791) and/or Charles Wesley (1707-1788):

Sent forth by Christ indeed,
His true apostles go,
Through earth the joyful tidings spread
Of heaven display'd below:
Physicians under God
They for His patients care,
And all the grace on them bestow'd
To others minister.

In a similar vein, Caroline Sheridan Norton (1808-1877) wrote in The Child of the Islands (1846):

3106 Lo! out of Chaos was the world first called,
3107 And Order out of blank Disorder came.
3108 The feebly-toiling heart that shrinks appalled,
3109 In Dangers weak, in Difficulties tame,
3110 Hath lost the spark of that creative flame
3111 Dimly permitted still on earth to burn,
3112 Working out slowly Order's perfect frame:
3113 Distributed to those whose souls can learn,
3114 As labourers under God, His task-work to discern.

There are several examples in the work of Robert Browning (1812-1889). For instance, in The Ring and the Book (X. The Pope), we have

1454 What is this Aretine Archbishop, this
1455 Man under me as I am under God,

Well, this one is not really an adjunct, as I understand the term, and it is certainly not modifying a noun phrase, at least directly, but rather is used (I guess) as a predicative complement of to be. On the other hand, Geoff did cite "The U.S. has been under God since its founding" as an example of things that "[p]eople don't ordinarily go around saying". And he was right to do so, since the structure under discussion is often effectively a sort of reduction of such a predicative usage.

Continuing with Robert Browning, from In a Balcony (1843) we have

244 ... This eve's the time,
245 This eve intense with yon first trembling star
246 We seem to pant and reach; scarce aught between
247 The earth that rises and the heaven that bends;
248 All nature self-abandoned, every tree
249 Flung as it will, pursuing its own thoughts
250 And fixed so, every flower and every weed,
251 No pride, no shame, no victory, no defeat;
252 All under God, each measured by itself.
253 These statues round us stand abrupt, distinct,
254 The strong in strength, the weak in weakness fixed,
255 The Muse for ever wedded to her lyre,
256 Nymph to her fawn, and Silence to her rose:
257 See God's approval on his universe!

Another R. Browning example, whose construal is perhaps a bit more uncertain, is from the start of An Epistle Containing the Strange Medical Experience of Karshish, the Arab Physician:

1 Karshish, the picker-up of learning's crumbs,
2 The not-incurious in God's handiwork
3 (This man's-flesh he hath admirably made,
4 Blown like a bubble, kneaded like a paste,
5 To coop up and keep down on earth a space
6 That puff of vapour from his mouth, man's soul)
7 ---To Abib, all-sagacious in our art,
8 Breeder in me of what poor skill I boast,
9 Like me inquisitive how pricks and cracks
10 Befall the flesh through too much stress and strain,
11 Whereby the wily vapour fain would slip
12 Back and rejoin its source before the term,---
13 And aptest in contrivance (under God)
14 To baffle it by deftly stopping such:---
15 The vagrant Scholar to his Sage at home
16 Sends greeting (health and knowledge, fame with peace)

I take "under God" in this passage to be modifying "contrivance", to express the conventional caveat "inshallah" meaning "God willing". It might be objected that this is an adverbial use -- but it is such a loose sort of adverbial that it could be placed nearly anywhere, including in the pledge. "One nation (God willing) with liberty and justice for all". Thus we've found another interpretive option!

Another example can be found in the monumental but (deservedly?) little-known 1877 work Festus, by Philip James Bailey (1816-1902). This passage is from book XX (and no, the line numbering apparently does not begin again from 1 at the start of each book):

13331 How am I answerable for this my soul?
13332 My master, free with me, as fixed with fate;
13333 As a star which moves a certain course in mode
13334 Certain, its liberties are laws; its laws,
13335 Tyrannic, under God. All that we do,
13336 Or bear, is settled from eternity
13337 Endless, beginningless. To act is ours;
13338 Quite sure, not less, all done, or good or ill,
13339 Is for God's glory always, and is ordered.

"Endless, beginningless" indeed.

One last example, from a play in verse by Thomas Lovell Beddoes (1803-1849), entitled Death's Jest-book: Or the Fool's Tragedy. This is from Act II, Scene III:

Athulf.
104 Then all the minutes of my life to come
105 Are sands of a great desart, into which
106 I'm banished broken-hearted. Amala,
107 I must think thee a lovely-faced murderess,
108 With eyes as dark and poisonous as nightshade;
109 Yet no, not so; if thou hadst murdered me,
110 It had been charitable. Thou hast slain
111 The love of thee, that lived in my soul's palace
112 And made it holy: now 'tis desolate,
113 And devils of abandonment will haunt it,
114 And call in Sins to come, and drink with them
115 Out of my heart. But now farewell, my love;
116 For thy rare sake I could have been a man
117 One story under god. Gone, gone art thou.
118 Great and voluptuous Sin now seize upon me,
119 Thou paramour of Hell's fire-crowned king,
120 That showedst the tremulous fairness of thy bosom
121 In heaven, and so didst ravish the best angels.
122 Come, pour thy spirit all about my soul,
123 And let a glory of thy bright desires
124 Play round about my temples. So may I
125 Be thy knight and Hell's saint for evermore.
126 Kiss me with fire: I'm thine.
Isbrand
126 Doth it run so?
127 A bold beginning: we must keep him up to't.

This "bold beginning" will be my ending; there are several other clear cases in the list from LION, and many ambiguous ones, but I'll spare you.

I should add in closing that the title of this post is a bit misleading. Sometimes it's fine to say "never". There certainly are examples of putative English usage that have never occurred and never will occur, other than as slips of the tongue or pen, unless the language changes in basic ways. But if you see a simple grammatical construction or lexical usage happen once, it's probably wrong to say that it's never happened before, and it's almost certainly wrong to say that it'll never happen again.

So a better title would have been "Never say hapax". At least not with respect to a language as a whole, as opposed to a fixed corpus, in which there will always be plenty of hapax legomena.
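
For a fixed corpus, at least, finding the hapax legomena is a mechanical matter. Here is a minimal Python sketch, assuming a crude tokenizer (lowercased runs of letters and apostrophes) and a toy one-line "corpus"; the function name hapax_legomena is made up for illustration, and a real study would need a more careful notion of "word":

```python
import re
from collections import Counter

def hapax_legomena(text):
    """Return the sorted list of word forms that occur exactly once in text."""
    # Crude tokenization (an assumption): lowercased runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sorted(word for word, n in counts.items() if n == 1)

sample = "one nation under God, one nation indivisible"
print(hapax_legomena(sample))  # ['god', 'indivisible', 'under']
```

Note that this only tells us what is unique within the sample; it says nothing about the language as a whole, which is the point of the caveat above.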

Posted by Mark Liberman at 10:25 AM

Dysfunctional Shift

There's an interesting historical sidelight to Geoff Pullum's observations about the slippery syntax of "one nation under God." The phrase is actually a hapax legomenon, at least in the role of locative adjunct, a point I made in a "Fresh Air" piece I did on the pledge that appears in my book Going Nucular.

"Under God" was originally taken from the Gettysburg Address, but Lincoln used it adverbially: "...this nation, under God, shall have a new birth of freedom" (i.e., "under God" modifies have). But in the Pledge, the phrase was somehow changed to a modifier of the noun nation. And a very odd one, at that -- this is the only place in English that "under God" is used in this way. People don't ordinarily go around saying things like "The U.S. has been under God since its founding" or "Organizations under God will benefit from the faith-based initiative measure."

The uniqueness of the construction leaves its meaning up for grabs. Is that the under of "under heaven," the under of "under the marketing manager," or the under of "under orders"? Does it mean that we believe in God or that we report to Him or that we have His personal attention? It's anyone's guess.

It may be that the functional shift of "under God" was simply the result of a misreading of the Gettysburg Address, but if so it was a serendipitous one. What better way to signal the doctrinal neutrality of the state than to express our official deism so obscurely?

Posted by Geoff Nunberg at 02:30 AM

June 15, 2004

Grammatical uproar at LiveJournal -- news at 11

The folks over at LiveJournal are having a ~~brawl~~ lively discussion over "than me" vs. "than I", occasioned by this cartoon:

Jason's mom Andrea is playing the prescriptivist role in this exchange, as well as the straight man ("straight mom"?) role in the first round of the joke. As is often the case with such prescriptions, the underlying grammatical analysis is faulty. The issue is discussed on pp. 1113-1117 of the Cambridge Grammar of the English Language: in sequences like "than X", where X is a single element, is X a reduced clause ("no one has more intelligence than I [do]"), or an immediate complement ("no one has more intelligence than me")? CGEL points out that there are some examples that can't possibly be reduced clauses:

It is longer than a foot.
He's inviting more people than just us.
He's poorer than poor.
I saw no one other than Bob.

On the other hand, CGEL gives (more complex and equivocal) arguments that some examples should be considered to be reduced clauses. The conclusion is to agree (details of terminology aside) with Ken Wilson's sensible views in the Columbia Guide to Standard American English:

Than is both a subordinating conjunction, as in She is wiser than I am, and a preposition, as in She is wiser than me. As subject of the clause introduced by the conjunction than, the pronoun must be nominative, and as object of the preposition than, the following pronoun must be in the objective case. Since the following verb am is often dropped or “understood,” we regularly hear than I and than me. Some commentators believe that the conjunction is currently more frequent than the preposition, but both are unquestionably Standard. The eighteenth-century effort to declare the preposition incorrect did succeed in giving trouble, not least because it called the than whom structure into question, but it too is again in good order: He is a fine diplomat, than whom we would be hard-pressed to find a better.

(Wilson's entry was also cited by Lexabear in the LiveJournal melee).

[Update: Abnu at Wordlab wrote to say that "The goings-on over at LiveJournal did not amount to a brawl nor a lively discussion, but rather, a brouhaha." I have to admit that he's right.]

Posted by Mark Liberman at 05:59 PM

Reading Chinese Menus

It isn't often that major international news breaks here on Language Log, but I think we've done it this time. An anonymous source (Helma Dik) has just informed me that Jim McCawley's classic The Eater's Guide to Chinese Characters has been reprinted by the University of Chicago Press.

The late Jim McCawley [pdf] (1938-1999), who is greatly missed, was renowned among linguists for his expertise on Chinese food (among many other things). This book is a practical guide to reading Chinese menus. It explains the structure of typical Chinese menus, a variety of culinary terms, and even the conventions for writing prices while taking the reader through several real menus. It contains a total of 23 menus, some handwritten. These are provided with printed equivalents. The latter part of the book is a Chinese character dictionary containing words likely to be used in menus. It does not use the traditional radical system, so for people familiar with Chinese dictionaries it takes a little getting used to, but the system used is probably easier for non-specialists. Unfortunately, it went out of print some years ago, so this reprinting is a great development.

If you don't already have it, and aren't already thoroughly literate in culinary Chinese, you should order it immediately. The University of Chicago Press will be glad to supply it, as will Amazon.com. I suggest buying at least two copies. You'll probably want one for home and one for away, and if you lend it out, you shouldn't count on getting it back. Remember, there is a deep connection between linguistics and Chinese food. Now that it is available again, I'm sure that the better linguistics departments will supply copies to their entering graduate students.

Posted by Bill Poser at 05:44 PM

Antipleonasm and reredundancy

In response to my post on sentences like "The two men share greatly differing views", Arnold Zwicky, who is at home on babysitting duty, emailed to point out that

"Share" instead of "have" in "share differing/different/diverse" is, I think, just the flip side of this alternation in "share the same/identical", which is often disapproved of as a "pleonasm" or "redundancy".

Does that make it an antipleonasm? or perhaps a disredundancy?

Arnold goes on to say:

Even more exciting, it's easy to find combinations of "share the same" and "both have the same" (also sometimes noted as a pleonasm): "both share the same", which nets 11,400 Google web hits. Reredundancy!

Well, hang on to your hat, Arnold, because we can also find "both share different" and "all share different":

(link) Since some students find it difficult to work with other students, because they both share different knowledge, characteristics, and interests.
(link) Both male and female Eastern Rosellas are monomorphic and both share different shades of colours some more brighter than others of the same sex.
(link) The professors both share different views of the text; this makes the discussions very lively and exciting.
(link) I live with many cats, and they all share different gifts.
(link) Not only does each member of the band claim a different musical background, but the[y] all share different ties to Nipe as well.

Is this disreredundancy? or perhaps redisredundancy?

I'm relieved to be able to report that Google's index does not contain any instances of "both share the same different" or "all share the same different" -- that might have been too much excitement for one day!

It could have happened, though -- the "three charges" mentioned below could be said to "all share the same different velocities...":

(link) Start the simulation with three identical charges, traveling at the same different velocities perpendicular to a magnetic field using the button below.

... at least this should be possible for a speaker of the "share different" persuasion, which I am not.

Posted by Mark Liberman at 05:24 PM

How To Cook and Eat in Chinese

In a piece in last Sunday's New York Times Magazine [registration required] Jason Epstein, a former editor at Random House, talks about an unusual book on Chinese food that he edited. It was called How to Cook and Eat in Chinese and the author was Buwei Yang Chao, by profession a medical doctor. Mrs. Chao's husband was Yuen Ren Chao (1892-1982), a famous linguist. He is known particularly for his work on Chinese dialects, to which the Yuen Ren Society is devoted. To phonologists of a certain generation, he is known for his 1934 paper "The non-uniqueness of phonemic solutions of phonetic systems" which appeared in the Bulletin of the Institute of History and Philology of the Academia Sinica, part 4: 363-397 and is reprinted in Martin Joos (ed.) (1957) Readings in Linguistics pp. 38-54. Y. R. Chao was also known for the romanization of Chinese that he promoted, the Gwoyeu Romatzyh, whose distinguishing feature is that it indicates tone by changes in the letters used, not by diacritics. For an example, you might want to look at this version of Professor Chao's famous translation of an episode from Alice in Wonderland into Chinese. The connection between linguistics and Chinese food runs deep.

Mrs. Chao described the process of the composition of the book thus:

I speak little English and write less. So I cooked my dishes in Chinese, my daughter Rulan put my Chinese into English and my husband, finding the English dull, put much of it back into Chinese again.

Her daughter Rulan, by the way, is Rulan Chao Pian, Professor Emerita of East Asian Studies and Music at Harvard. The book is considered a classic by many, arguably the first to introduce authentic Chinese food to Western readers. It is now out of print.

The article does introduce one confusion. In discussing 点心 Epstein says they are called tien-hsin but that this is now transliterated as dim sum. This might give the impression that the difference is merely a change in transliteration system. Actually, tien-hsin is the Wade-Giles romanization of the Mandarin Chinese for 点心, while dim sum is a transliteration of the Cantonese. The current pinyin transliteration of the Mandarin is diǎn xīn. In Gwoyeu Romatzyh it is dean shin.

Posted by Bill Poser at 03:44 PM

base26

A visualization of (pieces of) the space of 4-letter-words in English, by toxi, using processing.

It would be nice to do some appropriate dimensionality reduction first, which would enable (a version of) the whole 4-letter-word space to be shown at once, or to be seen from perspectives different from those provided by adjacent letter pairs. The same approach would also allow visualization of the whole vocabulary, or various pieces of it. It would be interesting to try similar things in the space of pronunciations, or the joint spelling/pronunciation space.
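
To make the "base26" idea concrete, here is a minimal Python sketch of reading a word as a base-26 numeral, and of splitting a 4-letter word into two letter-pair coordinates. This is only an illustration of the arithmetic, not toxi's code: the function names are hypothetical, and the choice of the first and last letter pairs as the two axes is an assumption about what one such projection might look like.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 letters: 'a' = 0 ... 'z' = 25

def word_to_base26(word):
    """Read a word as a base-26 numeral and return its integer value."""
    value = 0
    for ch in word.lower():
        # ALPHABET.index raises ValueError for anything outside a-z.
        value = value * 26 + ALPHABET.index(ch)
    return value

def pair_coordinates(word):
    """Map a 4-letter word to (first pair, last pair), each in 0..675."""
    if len(word) != 4:
        raise ValueError("expected a 4-letter word")
    return word_to_base26(word[:2]), word_to_base26(word[2:])

print(word_to_base26("aaaa"), word_to_base26("zzzz"))  # 0 456975
print(pair_coordinates("word"))                        # (586, 445)
```

The whole space has 26^4 = 456,976 points, which is why some form of projection or dimensionality reduction is needed before it can be shown on one screen.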

Posted by Mark Liberman at 11:47 AM

Who knew?

Voidoid guitarist Robert Quine, who committed suicide recently, was W.V.O. Quine's nephew.

[via radio free narnia]

Posted by Mark Liberman at 11:37 AM

Sharing diversely

Some people write as if share is the plural form of have:

(link) Rep. Michael McNulty (D, NY), my Congressman, and I share very divergent views, he doesn't represent me.

(link) The two men share greatly differing views and ideas to the acquisition of grammar in humans.
(link) Even neighborhoods twenty or thirty miles apart share differing environmental burdens. (The number of high-ozone days in Houston and along the Ship Channel is not similar to the low number of days in Galveston.)
(link) LINKS FOR COUPLES WHO SHARE DIFFERENT FAITH TRADITIONS.
(link) There are societies in this world in which people share very different styles of conditioning and, therefore, very different styles of aging.
(link) All of the wolves seem to share very different pasts and come from very different parts of India.
(link) To say this is only to say that Siegel and I share very different views of philosophy, and about why it's important to our lives.
(link) Four different models will be available, where each of the four has similar features but sharing very different looks.

Perhaps the idea is that if you parcel something out into shares, everyone gets a different piece; and therefore it makes sense to talk about A and B sharing properties X and Y, where you mean that A gets X and B gets Y. But all these examples seem odd to me -- the writers and I clearly share somewhat different meanings for share :-).

This divergent sharing may be a new development -- otherwise it's surprising that none of the thousands of poets in the LION database have used the word sequences "share different", "sharing different", "share differing", "share diverse", "sharing diverse" etc.
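
For anyone who wants to run the same kind of check on their own text collection, here is a rough Python sketch of the search. It assumes plain-text input, and the lists of verb forms, intensifiers and "difference" words are my own guesses at a reasonable pattern, not an exhaustive inventory (and certainly not the LION query syntax):

```python
import re

# Forms of "share", an optional intensifier, then a word of difference.
# The word lists are assumptions for illustration only.
DIVERGENT_SHARE = re.compile(
    r"\bshar(?:e|es|ed|ing)\s+(?:(?:very|greatly|somewhat)\s+)?"
    r"(?:different|differing|diverse|divergent)\b",
    re.IGNORECASE,
)

def find_divergent_sharing(text):
    """Return every matched word sequence, in order of appearance."""
    return [m.group(0) for m in DIVERGENT_SHARE.finditer(text)]

examples = [
    "The two men share greatly differing views.",
    "The two men share the same views.",
    "COUPLES WHO SHARE DIFFERENT FAITH TRADITIONS",
]
for line in examples:
    print(find_divergent_sharing(line))
# ['share greatly differing']
# []
# ['SHARE DIFFERENT']
```

A pattern like this will also pick up unremarkable cases (for instance, people who share different things at different times), so the hits still need to be read one by one.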

Posted by Mark Liberman at 11:16 AM

Signs or symbols? Words or tools?

In response to my posts about Rico (the border collie who can fetch in response to about 200 different spoken words), Mark Seidenberg wrote to draw my attention to a paper that he co-authored nearly 20 years ago:

For me, Rico is deja vu all over again, since the discussion so closely recapitulates responses to the early reports about Kanzi, the bonobo, who was also supposed to be an incredibly fast learner, have a remarkable vocabulary, etc. We even used the Helen Keller case in a 1987 article about Kanzi.

(Seidenberg, M.S., & Petitto, L.A. (1987). "Communication, symbolic communication, and language." Journal of Experimental Psychology: General, 116, 279-287.)

Mark's note agrees with the interpretation of the Helen Keller "water insight" story that I took from Walker Percy:

I think it's highly doubtful that Helen Keller's memory of this event is accurate, or that there is a naming-insight-moment that occurs in normal child development. However, her description (and the dramatization of it in The Miracle Worker) does effectively bring out the idea that there might be something about having the concept of a name, rather than mere associations between linguistic forms and objects (or even classes of objects).

Kerim Friedman also sent email on this point, citing his post last fall at Keywords that quotes Keller's autobiography (following a comment by Lisa at Language Hat's site) in support not only of general pre-disease language acquisition, but even of specific knowledge of the word "water":

I am told that while I was still in long dresses I showed many signs of an eager, self-asserting disposition. Everything that I saw other people do I insisted upon imitating. At six months I could pipe out "How d'ye," and one day I attracted every one's attention by saying "Tea, tea, tea" quite plainly. Even after my illness I remembered one of the words I had learned in these early months. It was the word "water," and I continued to make some sound for that word after all other speech was lost. I ceased making the sound "wah-wah" only when I learned to spell the word.

This further undermines the mythic image of a moment of electric insight -- it becomes more like Plato's notion of learning as remembering -- but as Mark says, the main point of the story is that naming is more than associating signs with objects. In Walker Percy's way of talking, the key difference is between a symbol and a sign. I quoted Percy's 1975 essay, which in turn quoted Langer's 1957 book and various even earlier sources, because I wanted to avoid triggering the understandable frustration that animal communications researchers feel when they accuse skeptics of "moving the goalposts" after each new demonstration of animal abilities. The 1987 Seidenberg and Petitto paper also helps to make it clear that these issues have been out there for some time.

Mark's email continues:

Most critics of the animal language research (over what is now a 30+ year period) have tended to grant that the animals in question had acquired lexical knowledge comparable to a young child's, but drew the line at syntax. Whereas I took the position that the animals couldn't even name; if you look at their behavior closely it deviates from what you would expect if the animal really knew what a name is. And if they don't know what names are, how could that be taught?

In the 1987 paper, Seidenberg and Petitto wrote that "intriguing results from an ongoing study of two pygmy chimpanzees" suggest that "their behaviors are similar to nonlinguistic gestures used by 9- to 16-month-old children". They argue that to characterize animals' behaviors "in terms of a general notion of symbolic communication ... is unsatisfactory because the term is not clearly defined and it is not clear what range of behaviors it subsumes. Terms such as gesture, symbol, and word ... are not equivalent."

The crux of their conclusion is that when chimpanzees use signs in natural contexts, they "mand" in the sense of Skinner (1957)[see below]. They argue that "[t]his behavior is surely communicative; the important point is that it does not require knowledge of words or symbols at all. Under the appropriate circumstances, a smile can serve the same function."

Putting it another way, they write:

Our view is that Kanzi's behaviors are more like the use of tools than the human use of language. Tools are the instruments by which we attain certain outcomes. They are not symbols.

Those interested in philosophy of language will note that this is similar to Searle's Chinese Room argument, which they cite.

Mark's recent email points out that

The fact that Rico is always comprehending language in the context of fetching is related to our observation that Kanzi (and other chimpanzees like Nim Chimpsky, with whom I worked for a while) used their "language" for instrumental purposes like getting food. I would also note how notoriously difficult it is to assess comprehension in children, let alone dogs. How much information does the child/animal need to have in order to perform a task (like picking out the referent of a word)? Often such tasks can be done on the basis of very partial knowledge that would not necessarily license the attribution that they know the words.

Rico doesn't change anything about the linguistic capacities of animals; however, his use of mutual exclusivity ("process of elimination" is how they are describing it in the news reports) to determine the likely associate of a new word makes him a very smart fellow.

Specifically, they suggest that

Kanzi has learned about the instrumental functions of lexigrams in the experimental context. He does not know that lexigrams designate, represent, symbolize, or name objects and events; rather, he knows how to use them in order to effect desired outcomes such as obtaining objects, being allowed to engage in favored activities, or receiving the approval of his trainers. Lexigrams are the means by which he obtains positive responses from the teachers who control these outcomes. This account explains several aspects of the chimpanzee's performance that would otherwise seem incidental or unrelated.

The origin of this hypothesis was their own work in the late 1970s with another chimpanzee, Nim, from which they concluded that "he seemed to know the outcomes associated with producing signs--that is, their pragmatic functions--not the concepts associated with them, or what they named, or that they were names at all".

One of the reasons for this conclusion is the difference between behavior in "vocabulary tests" and in "naturalistic exchanges".

The vocabulary test shows that Kanzi can associate lexigrams with pictures of objects and spoken words; conversely, he can associate spoken words with lexigrams and pictures. This behavior is remarkable, and the fact that he learns quickly without the procedures used in previous studies is important. The difficult question is whether this behavior indicates that lexigrams function as symbols or names. The same question applies to the similar, though more limited, behavior observed in pigeons, who are capable of learning to associate an arbitrary response with exemplars of categories such as trees or bodies of water. It also applies to the early communicative behavior of children.

In the naturalistic exchanges, Kanzi used lexigrams much more broadly:

Kanzi knows something about the outcomes associated with lexigrams and uses them--communicatively--to effect these outcomes. Thus, he has learned to produce juice in contexts where his trainers will interpret it as appropriate, thereby facilitating outcomes such as receipt of juice or a trip to the juice location, not because juice designates specific objects.

Those who think that the chimps have learned words are prone to interpret this broadening as something like metonymy (though Seidenberg and Petitto do not use that word):

For example, if Kanzi learns to touch the symbol for strawberries when he wants to travel to the place where they are found, when he is asking for one to eat, and when shown a photograph of strawberries, he will probably extract the one common referent (red sweet berries) from all of those different circumstances, and assign to that referent the symbol, strawberries. (from Savage-Rumbaugh et al. 1986)

But as Seidenberg and Petitto argue,

Whatever the plausibility of this assumption, it requires empirical validation. We agree that in producing strawberry Kanzi recognized something common to all the situations in which it is used, but what? Is the common element strawberries or the fact that these are the situations in which strawberry can be used with a positive result?

S & P suggest one type of empirical test:

In experiments by Markman and Hutchinson (1984), a child was presented with a picture of a target object (such as cow), a taxonomically related alternative (pig), and a thematically related alternative (milk). Thematically related items were causally or temporally related to the target; other examples include door/key (object/instrument) and ring/hand (object/location). The experimenters introduced a novel word for the target object (e.g., sud). The child's task was to choose the picture that represented another sud ("See this? It is a sud. Find another sud that is the same as this sud."). On approximately 80% of the trials, children chose the taxonomically related alternative rather than the thematically related alternative. Hence, their initial hypothesis about an unfamiliar word is that it refers to a category of objects rather than objects that happen to be causally or temporally related.

In using a lexigram in reference to an object such as juice and to the location where it is found, Kanzi was apparently responding on the basis of temporal and causal relations between these entities.

S & P also observe that

It is possible to extract from child language corpora utterances that resemble Kanzi's.
For example, a child is observed to consistently use the word daddy in reference to her father and no other person. The child then sees the father's special easy chair when he is not present, points to it, and says "daddy." Therefore, the child is using daddy in reference both to a person and to the location where he is often found, much like Kanzi used juice in reference to a drink and its location. The comparison between child and chimpanzee turns on whether the same processes and types of knowledge underlie their respective utterances, something that cannot be determined by examining individual utterances. Our claim is that the bases of these utterances are in fact very different.

Read the paper for the rest of the argument. Two lessons of the long history of research in this area:

It's hard to design experiments that distinguish clearly between knowledge of signs and knowledge of symbols, or between the use of communicative gestures as tools and linguistic communication.
Some animal communication researchers don't believe that there is a difference. In some cases, this may be because they have considered the distinction and rejected it; but in other cases, they have not really thought about the issue.

Still to come: a close look at the Rico article, and at Paul Bloom's discussion of it.

* B.F. Skinner proposed categorizing instances of speech according to how they are reinforced -- see this description. Attempting a brief summary... his categories were echoic, mand, tact, intraverbal and autoclitic. An echoic utterance is a repetition of another's pattern ("Cookie" in response to "Can you say cookie?"); a mand is a demand or directive ("Cookie" in order to get a cookie); a tact is an utterance that provides information rather than attending to a state of deprivation ("Cookie" to name a picture of a cookie); an intraverbal is a discourse-related word like "please", or a purely associative conversational sequence (you say "milk", I say "cookie"); autoclitic speech seems to be internally-directed speech, or perhaps everything left out of the other four categories -- namely just about all linguistic behavior?

[Update 6/16/2004: interesting comments, and informal experiments with another dog named "Rico", at The Panda's Thumb.]

Posted by Mark Liberman at 09:42 AM

June 14, 2004

More on scientific and scholarly publishing

Nature Web Focus has a fascinating forum on "access to the literature".

Here are some short quotes from interesting recent pieces by Daniel Greenstein,

I believe that the business model of commercial publishing, which once served the academy's information needs, now threatens fundamentally to undermine and pervert the course of research and teaching. Put bluntly, the model is economically unsustainable for us.

Stevan Harnad,

Lawrence found that in computer science citations were three times higher for Open Access articles than for papers only available for payment in print or online. Kurtz et al. have since reported similar estimates in astrophysics, and Odlyzko in mathematics.

We are carrying out a much larger study across all disciplines, using a 10-year sample of 14 million articles from the Institute for Scientific Information (ISI)'s database; initial results, for the field of physics, show Open Access articles being cited 2.5 to 5 times more than articles that users' institutions must pay to access online...

and Theodore and Carl Bergstrom:

For Reader Pays publishing, data on price and quality of journals are abundant. For the nascent Author Pays model, not much historical information is available and the academic world has not yet had time to adapt fully to the technological capabilities of the Internet. To predict the eventual shape of the academic publishing market, it is useful to think about the economic fundamentals of this industry and their likely effects.

Posted by Mark Liberman at 10:32 PM

"Controversial cult" given state OK in Texas?

Like other Unitarians, Rivka at Respectful of Otters was outraged when Texas Comptroller Carole Keeton Strayhorn denied a tax exemption to the Red River Unitarian Church in Denton, and pleased when the decision was reversed. I'm not sure whether she'll be amused or appalled to read the reversal described in the following terms by the Agape Press Christian News Service:

Texas officials have reversed an earlier decision denying tax-exempt status to a controversial religious cult in that state. The state's comptrollers office initially ruled that the Red River Unitarian Universalist Church was not a religious organization for tax purposes since it did not have one unified system of belief. However, after a review by the agency's general counsel, that ruling was reversed.

Cult is one of those words, like terrorist, that gets people worked up, either because they think it's being used where it's not appropriate, or because they think it isn't being used where it's deserved.

It's easy to see why, given the American Heritage Dictionary's primary definition "A religion or religious sect generally considered to be extremist or false, with its followers often living in an unconventional manner under the guidance of an authoritarian, charismatic leader." Encarta agrees, more or less, giving as the first sense "a system of religious or spiritual beliefs, especially an informal and transient belief system regarded by others as misguided or unorthodox" (though "informal" strikes me as an odd choice for a characteristic feature of cults).

The strongly negative connotations are fairly recent, since the OED's definitions are considerably more neutral:

1. Worship; reverential homage rendered to a divine being or beings. Obs. (exc. as in sense 2).

2. a. A particular form or system of religious worship; esp. in reference to its external rites and ceremonies.
b. Now freq. used attrib. by writers on cultic ritual and the archæology of primitive cults.

3. transf. Devotion or homage to a particular person or thing, now esp. as paid by a body of professed adherents or admirers.

The free online Merriam-Webster splits the difference, putting the "unorthodox or spurious" stuff in third place:

1 : formal religious veneration : WORSHIP
2 : a system of religious beliefs and ritual; also : its body of adherents
3 : a religion regarded as unorthodox or spurious; also : its body of adherents
4 : a system for the cure of disease based on dogma set forth by its promulgator <health cults>
5 a : great devotion to a person, idea, object, movement, or work (as a film or book); especially : such devotion regarded as a literary or intellectual fad b : a usually small group of people characterized by such devotion

As further evidence for the shift in meaning, Webster's 2nd unabridged (1913) has

1. Attentive care; homage; worship.

2. A system of religious belief and worship.

whereas Webster's 3rd (1961) has a much expanded entry, with the "unorthodox or spurious" stuff starting in sense 4:

1 : religious practice: WORSHIP

2 : a system of beliefs and ritual connected with the worship of a deity, a spirit, or a group of deities or spirits

3 a : the rites, ceremonies, and practices of a religion : the formal aspect of religious experience
b Roman Catholicism : reverence and ceremonial veneration paid to God or to the Virgin Mary or to the saints or to objects that symbolize or otherwise represent them (as the crucifix or a statue) --- called also cultus --- compare DULIA, HYPERDULIA, LATRIA

4 : a religion regarded as unorthodox or spurious
also : a minority religious group holding beliefs regarded as unorthodox or spurious: SECT

5 : a system for the cure of disease based on the dogma, tenets, or principles set forth by its promulgator to the exclusion of scientific experience or demonstration

6 a : great or excessive devotion or dedication to some person, idea, or thing
esp : such devotion regarded as a literary or intellectual fad or fetish
b : the object of such devotion
c (1) : a body of persons characterized by such devotion
(2) : a usu. small or narrow circle of persons united by devotion or allegiance to some artistic or intellectual program, tendency, or figure (as one of limited popular appeal)

Other information on what cult has come to mean in the 21st century, at least to some people, can be found here, here, here, and so on. It hardly seems appropriate to use the word in reference to the Unitarians, who are about as far as one can get from the fervor, authoritarianism, dogmatism, charisma and coercion that the anti-cult sources point to -- unless cult is just taken to mean "religion that I dislike", in which case the Agapeans have as much right to call the Unitarians names as others do to apply the same terms to the denominations that the Agapeans favor. However, if I were a Unitarian, I think I'd respectfully draw the attention of my Agapean brothers and sisters to Matthew 7 and John 8.

[link to Agape Press sent in by a Philadelphia Unitarian of Texan origin]

[Update 6/15/2004: Rivka explains in detail why some evangelical Christians call UU a "cult" (because they define "cult" as "any group of people that worship anything or anyone other than Jesus Christ, and believe anything contrary to His word or the Word of God according to the Bible"), and also links to a Southern Baptist's thoughtful article on how to evangelize Unitarians.

The same author (Cky J. Carrigan) also has articles on converting Mormons and Baha'i, and on dealing with Pokemon, among many other topics. His home page explains that his first name "Cky" "is pronounced 'Ky' which rhythms with 'Why'", but doesn't explain where it comes from. Google is not helpful here because the returns for "Cky" are dominated by a very unchristian rock band, a TV station in Winnipeg, the code for the airport in Conakry, Guinea, and (of course) the parsing algorithm. Dr. Carrigan's parents are said to be "full time evangelists and missionaries" rather than computational linguists, so I guess that the unusual name must come from some missionary experience.]

Posted by Mark Liberman at 07:12 PM

One nation [head], under God [adjunct]

Another 50-year anniversary today: exactly a half century ago, Congress added the words "under God" to the Pledge of Allegiance that every schoolchild in America recites each day:

I pledge allegiance to the flag of the United States of America, and to the republic for which it stands, one nation under God, indivisible, with liberty and justice for all.

(There's often some dubious capitalization in written presentations of this text; I've dropped it, and capitalized normally.) An atheist plaintiff from (of course) California has sued to get his daughter's school to stop indoctrinating her, against his wishes and beliefs, by making her recite the obviously religious bit about God. The state is supposed to be rigidly separate from religion in the USA; here it looked like it wasn't. The Supreme Court today sidestepped the issue by saying that the plaintiff didn't have standing to argue on behalf of his daughter on this matter (he doesn't have custody, and the little girl is a Christian and has no objections to the words above).

So does the added phrase "under God" make the sentence quoted above into religious propaganda on the part of the state, or not? Naturally, you're expecting that Language Log will tell you, since we know what phrases and sentences do. So read on.

Actually, I rather fear I don't really know. Sometimes you need a linguist, sometimes a lawyer, sometimes a priest. Today you may have to go away disappointed. But there is one relevant linguistic point I can make.

Under God is a locative adjunct in the structure of a noun phrase (NP). The NP in question is one nation under God, indivisible, with liberty and justice for all, and it's actually in apposition to another NP, the republic for which it stands, which denotes part of what allegiance is being pledged to (the other part is the flag). Adding an appositional NP has the effect of conventionally implying that the appositional NP is also a valid description of the other NP that it's attached to. For example, when you say Ray Bradbury, the science fiction author, you're referring to Ray Bradbury, and also adding a secondary claim that he can be correctly referred to as a science fiction author. If it turned out later that he'd never written any of those stories, he plagiarized everything from a housewife in Cedar Rapids, Iowa, you'd still have referred to him.

The complex appositional NP in the pledge does commit the pledger to a number of claims: that the USA is (1) a single nation, (2) located under God in some sense, (3) indivisible, (4) having liberty for all, and (5) having justice for all. One or more of those claims might in principle be false. One can imagine people objecting to (1) because of Guam and Saipan and Puerto Rico; or objecting to (3) on the grounds that of course the nation could be broken up by secession if all parties agreed. One could imagine radical prisoners' rights advocates objecting to (4) or (5) because there is no liberty or justice for a felon imprisoned for years and permanently deprived of any future chance to vote (not my point of view, but the point is that one could imagine someone arguing this). The objection that (2) is false because there is no God seems on a par with these other imaginable objections.

But the thing about these conventionally implied extra statements is that they aren't the main point. Suppose someone said to me, in front of my lawyer, I hereby permanently and irrevocably give to you this garage, the very one where the Hewlett-Packard company was born. If I find out later that Hewlett and Packard never worked there, it's the wrong garage (the real one is in Palo Alto, and if you happen to walk along the right street you can see it, there's a plaque outside), then I may be annoyed with you, and say that you gave me some bad information about my new garage, but one thing is for damn sure: it's my garage now. No question about that. The main thing your utterance did was to give me the garage free and clear. The stuff in the appositional NP was secondary, and although its falsity does mean you said something false, that doesn't undercut what the main part of the utterance accomplished. So what I'm saying is that even if there is no God, it doesn't matter. The pledge is valid anyway.

Now, I'm not going to say qua linguist whether including one adjunct containing the NP God is religious indoctrination or not. I'm inclined toward saying it's not, since the adjunct makes no real assertion about what religious beliefs you should have; but I'm happy to leave that to the lawyers. All I'm going to say is that the patriotic atheist need not worry too much: the pledge is primarily a linguistic formula for announcing one's loyalty to the flag of the United States of America and to the republic for which it stands. If it is not actually located under God because there's no such entity, or it's not indivisible because voluntary secession is constitutional, that doesn't matter. Those extra claims in the appositional NP are secondary. You can keep an open mind about the truth of any or all of them.

So I'm not settling the matter of the suit (it will probably come up to the Supreme Court again), but I am saying, on this US flag day, that you can put your hand on your heart and pledge a valid pledge of loyalty regardless of whether you think the appositionally tacked-on claims are sound, and so can your kids. So don't hold back from saluting our star-spangled banner because of doubts about the extra adjunct they put into the pledge 50 years ago today.

Posted by Geoffrey K. Pullum at 06:30 PM

Dealing with these type situations

Listening to C-SPAN with half an ear the other day, I heard some retired general saying something like "... in this type situation, we..."

It occurred to me to wonder: what part of speech is type, in this construction? The plural form "these type situations" makes type seem more like an adjective or quantifier than a noun; but then we also have "these type of situations" along with (the expected form) "these types of situations."

Curiously, we don't seem to get "*this kind situation" or "*these kind situations", but we do get "these kind of situations".

Some examples:

You're a Taurian, you will have faced this type situation many times before ...
Animal Control Officers must deal with vicious and wild animals. Do you feel you could handle this type situation?
Also, officer safety suggestions for dealing with these type individuals will be discussed and resources available to departments dealing with these type situations.
If you have any doubts concerning these type situations, discuss with your immediate supervisor.
A dietitian can be very helpful in these type of situations to make sure that your body receives all of its nutrients.
It also makes sense to have free Iraqi forces and interpreters there to talk to people in the vehicles, so you don't get in these type of situations.
Also, successful supplier/vendors needs to invest in training of manpower for these kind of situations.
We are improving error reporting in these kind of situations (and also, you will be able to save even if the document is not valid).

As you can see in these examples (and others easily available on the web), these type constructions are used by native speakers -- though I don't use them myself, other than as a joke, either in speech or in writing. Furthermore, the contexts of use seem to be more formal rather than less formal. In fact, my stereotype for this pattern is someone who is not highly literate and is trying to speak or write formally and clearly. What might have been "stuff like this" in a less formal context comes out as "this type stuff".

That's just a conjecture, but here are some Google counts for various words in various relevant patterns. I'll mostly leave interpretation for another day, except to point out that the solecism "these kind of things" is about 1/4 as common as the correct "these kinds of things".

One curious thing, though: compare the counts for "these types of Xs" and "these kinds of Xs" in the second table. They're almost identical, so much so that I thought at first I must have screwed up transferring the numbers to the table. But I checked, and that's really what Google is telling me. Is this because type and kind are really in such an exact numerical balance in this case? Or is this telling us something weird about Google's strategy for encoding strings and counting hits?

	this type X		this type of X		Ratio
	whG	whG/bp	whG	whG/bp
situation	902	210	55,100	12,858	61
guy	4	0.9	911	213	228
idea	5	1.2	580	135	116
thing	871	203	77,000	17,969	88
individual	94	22	4,280	999	46
person	598	140	12,400	2,894	21
woman	36	8.4	922	215	26
man	143*	-	1,970	460	-
circumstance	12	2.8	580	135	43
car	326	76	13,900	3,244	43
vehicle	223	52	6,890	1,608	31
automobile	18	4.2	536	125	30
truck	44	10	917	214	21
bicycle	17	4.0	736	172	43
bike	22	5.1	741	173	34

*skewed by phrases like "To do this, type man command at the system prompt".
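
For anyone who wants to check the arithmetic in the table above, here is a minimal Python sketch. It assumes that "whG" is a raw count of Google web hits and that "whG/bp" is hits per billion pages indexed. The index size of roughly 4.2852 billion pages is not given in this post; it is inferred from the table's own numbers, so treat it as an assumption. The function names are made up for illustration.

```python
# Assumed size of Google's index in billions of pages. Not stated in the
# post; inferred from the table (e.g. 902 hits ~ 210 per billion pages).
INDEX_BILLION_PAGES = 4.2852

# A few rows copied from the table:
# word -> (hits for "this type X", hits for "this type of X")
COUNTS = {
    "situation": (902, 55_100),
    "thing": (871, 77_000),
    "person": (598, 12_400),
}


def per_billion_pages(hits):
    """Normalize a raw hit count to hits per billion indexed pages."""
    return hits / INDEX_BILLION_PAGES


def of_ratio(bare_hits, of_hits):
    """How many times more frequent 'this type of X' is than 'this type X'."""
    return of_hits / bare_hits


for word, (bare, with_of) in COUNTS.items():
    print(
        f"{word:10s}"
        f"{per_billion_pages(bare):8.0f}"
        f"{per_billion_pages(with_of):10.0f}"
        f"{of_ratio(bare, with_of):8.0f}"
    )
```

The Ratio column is independent of the assumed index size, since the normalization cancels when one count is divided by the other.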

(whG in all cells)	(A) these type Xs	(B) these types of Xs	(C) these type of Xs	(D) these kind Xs	(E) these kinds of Xs	(F) these kind of Xs
situations	383	27,300	1,810	4	27,100	2,340
guys	67	1,780	603	*	1,770	749
ideas	15	2,690	109	1	2,690	780
things	1,590	81,300	5,330	*	81,000	20,500
individuals	54	2,080	218	*	2,080	96
people	500	18,900	4,230	*	18,900	7,350
women	43	593	348	*	596	296
men	62	1,850	622	*	1,850	752
circumstances	16	2,880	178	0	2,880	518
cars	176	5,060	725	2	5,050	517
vehicles	14	3,640	459	0	3,590	102
trucks	14	199	36	0	160	46
bicycles	10	247	29	0	243	45
bikes	24	396	55	0	294	145

*counts dominated by kind = nice
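
The two observations made before the tables (that "these kind of things" is about a quarter as common as "these kinds of things", and that columns B and E are nearly identical) can be checked directly from the numbers. This is only a sketch over a handful of rows copied from the second table; the helper names are hypothetical.

```python
# Rows copied from the second table: noun -> (B, E, F), where
#   B = hits for "these types of Xs"
#   E = hits for "these kinds of Xs"
#   F = hits for "these kind of Xs"
ROWS = {
    "situations": (27_300, 27_100, 2_340),
    "things": (81_300, 81_000, 20_500),
    "people": (18_900, 18_900, 7_350),
    "trucks": (199, 160, 46),
    "bikes": (396, 294, 145),
}


def kind_of_share(e_hits, f_hits):
    """Frequency of 'these kind of Xs' relative to 'these kinds of Xs'."""
    return f_hits / e_hits


def relative_gap(b_hits, e_hits):
    """Difference between columns B and E as a fraction of the larger one."""
    return abs(b_hits - e_hits) / max(b_hits, e_hits)


for noun, (b, e, f) in ROWS.items():
    print(
        f"{noun:12s}"
        f" F/E = {kind_of_share(e, f):.2f}"
        f"   |B-E|/max = {relative_gap(b, e):.3f}"
    )
```

On these rows the B and E counts stay within about one percent of each other for the high-frequency nouns and drift apart only for the low-count ones (trucks, bikes), which is the pattern that looks suspicious.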

A few other relevant (and real) examples:

This is exactly the type thing I'm looking for.
List owners often describe the type things that are allowed and not allowed on their lists.
I support the troops and all, but it is not my type thing.
And, one of your type things is that you have to build up in yourself that you are tougher, you can take it, this is why you are here.
Do you think it depends on how many type things have to fit in the store and which products make the most money?
This is a type thing that the Zoning regulations need to deal with for the sake of all property values.

Here's an idea -- maybe the source is pairings like

a joke-type thing	a joke type of thing
a 1980s-type thing	a 1980s type of thing
moth-type things	moth type of things
just a few admin-type things	just a few admin type of things

which relate structures like ((moth type) things) and (moth (type (of things))). The plausibility of this connection is increased by the fact that we don't get "*some moth-kind things."

But in "moth-type things", type could be the head of a compound noun, an option that's not available in "these type things". And this still doesn't really help with "these kind of things", "these type of things" -- or for that matter "a few admin type of things" -- where it seems that "type" is semantically plural even though it's morphologically singular.

Someone has probably straightened all this out in a CLS paper or the like -- I await enlightenment.

[Update: as pf emailed to point out, Hawaiian pidgin has developed kind (spelled "kine") along the same lines as type in "standard non-standard" English, producing examples like "fun kine stuffs". ]

Posted by Mark Liberman at 02:26 PM

June 13, 2004

W the debater

Readers who are interested in the question of George W. Bush's linguistic skills (some of our previous discussion of this is here, here, here, here and here) will want to study James Fallows' cover story "When George Meets John", in the July/August Atlantic (the story is not yet on the web, so go buy the magazine!).

After reading Fallows' article, I'm more convinced than ever that Jacob Weisberg's Bushisms enterprise is an equal mixture of cultural prejudice, cheap shots and gullibility, and that the Democrats accept the view of Bush as a tongue-tied boob at their peril.

Some quotes from Fallows' article:

Recently I saw an amazing piece of political video. It was ten-year-old footage of George W. Bush, and it changed my mind about an important aspect of the upcoming campaign. [...]

...it was the hour in which Bush faced Ann Richards [in the debate during the 1994 Texas gubernatorial election] that I had to watch several times. The Bush on this tape was almost unrecognizable--and not just because he looked different from the figure we are accustomed to in the White House. He was younger, thinner, with much darker hair and a more eager yet less swaggering carriage than he has now. But the real difference was the way he sounded.

This Bush was eloquent. He spoke quickly and easily. He rattled off complicated sentences and brought them to the right grammatical conclusions. He mishandled a word or two ..., but fewer than most people would in an hour's debate. More striking, he did not pause before forcing out big words, as he so often does now, or invent mangled new ones. "To lay out my juvenile-justice plan in a minute and a half is a hard task, but I will try to do so," he said fluidly and with a smile midway through the debate, before beginning to list his principles.

Richards's main line of attack--in fact, her only one--was that Bush had done so poorly in a series of businesses that he would be over his head as governor. Each time she tried this, Bush calmly said, "I think this is a diversion away from talking about the issues that face Texas"--which led him right back to the items on his stump speech ("I want to discuss welfare, education. I want to discuss the juvenile-justice system ..."). When talking about schools he said, "I think the mission in education ought to be excellence in literature, math, science and social science"--an ordinary enough thought, but one delivered with an offhand fluency I do not remember his ever showing at a presidential press conference.

[...] The man on the debate platform looked and sounded smart and in control. If you had to guess which of the two candidates had won the debate scholarship to college and was about to win the governorship, you would choose Bush.

Fallows goes on to say that "I bored my friends by forcing them to watch the tape--but I could tell that I had not bored George Lakoff, a linguist from the University of California at Berkeley, who has written often of the importance of metaphor and emotional message in political communications."

After seeing the Bush-Richards debate from 1994, Lakoff concluded that it should be used to teach successful debate technique. Within the first few words of each reply Bush had figured out how to use the question as an opening not simply for his major campaign themes but also for the personal and emotional messages he wanted to project.

Fallows also raises the question of why Bush has seemed so different in recent years.

Yolette Garcia, who as the executive producer at KERA-TV, in Dallas, had supervised negotiations for the Bush-Richards debate, says that in those days Bush was noted for his poise and ease in public appearances--including the informal Q&As he has tried to avoid as President. [...]

Obviously Bush doesn't sound this way as President, and there is no one conclusive explanation for the change. I have read and listened to speculations that there must be some organic basis for the President's peculiar mode of speech--a learning disability, a reading problem, dyslexia or some other disorder that makes him so uncomfortable when speaking off the cuff. The main problem with these theories is that through his forties Bush was perfectly articulate. George Lakoff tried to convince me that the change was intentional. As a way of showing deep-down NASCAR-type manliness, according to Lakoff, Bush has deliberately made himself sound as clipped and tough as John Wayne. ...

I say: Maybe. Clearly Bush has been content to let his opponents, including the press, think him a numbskull. ... But to me the more plausible overall explanation is the sheer change in scale from being governor of Texas to being President of the United States.

I think we need to start out by observing that part of the change may be in the audience rather than in the performance. Bush's disfluencies as president have been relentlessly exaggerated by Weisberg and others, and this has probably affected Fallows' perceptions along with everyone else's. That said, I'm inclined to agree with Fallows about the reasons for whatever real effect there may be.

Fallows gives three specific examples of poor verbal performance by Bush on the national stage: one in a debate with McCain on 2/15/2000, another on Meet the Press in 2/2004, and the last in a prime-time press conference in April. In the 2000 debate, the problem was not lack of fluency, but rather confrontation with an angry McCain over some campaign ads accusing McCain of abandoning veterans. This was not so much a linguistic failure as a moral one, and it put Bush in the position of defending the indefensible. In that case, "staying on message" was no help at all, since the "message" was an absurd counter-charge that McCain's ads had compared Bush to a Democrat.

In the two cases from 2004, Fallows' diagnosis is that Bush did not "even go through the motions of politely trying to connect the question that was asked to the on-message theme he had previously decided to stress." It was indeed a rhetorical flaw for Bush to fail, under stress, to give due deference to the ritual pretense that the norms of conversation are being observed in a discussion between a politician and the press. However, this is surely more a matter of temperament -- or lack of practice -- than basic linguistic facility. "Poise and ease" may be involved, but there is hardly any reason to invoke "some organic basis" such as dyslexia or a learning disability. Fallows' explanations seem adequate: a lot was at stake, and not just for Bush personally; and he lacked recent practice in interactive press rituals. The daily experience of being president, in modern times, must tend to exaggerate the natural arrogance of anyone who makes his way to that position; and this makes the ritual stance of politician as a conversational equal with the press all the more unnatural, and all the more dependent on acting skills.

I'd also add that it's not helpful to raise the question of whether a change in self-presentation -- if one exists in this case -- is "intentional". I suspect that this is Fallows' word, not Lakoff's. As qualified support for the plausibility of (what I take to be) Lakoff's position, I can refer back to recent posts on acquired disfluency as a sign of (male) status in some cultures (here and here).

Fallows gives equal space to a discussion of John Kerry's debating skills, to which he also gives high marks. It should be an interesting campaign.

Posted by Mark Liberman at 03:45 PM

Modeling banality

Mimi Smartypants starts a recent post with this:

I like those moments where you say or hear an unusual sentence and think: Wow, I bet no one else in the entire world said that today.

She goes on to give examples (with context and/or links) like these:

I want this to be more like a yo-yo than it can realistically be.
I blame the mango.
The patient had a history of ingesting inadequately cooked frogs.
Don't feed your racist toothpaste to the cat.

Both formal syntax and statistical language modeling have their flaws, but I think they (and we) can agree that Ms. Smartypants is too modest in her aspirations. It's a fair bet that no one else in the entire world said any of her sentences, not only on the day she noticed them, but in the whole previous history of the world. (Well, "I blame the mango" might be an exception.) Furthermore, this is true not only of the striking examples that she cites, but of the great majority of all the sentences that she and her interlocutors use in their everyday lives.

But we can also agree that she's right to notice something striking about the particular examples that she quotes. Each of them involves some pragmatically or semantically unexpected juxtapositions, like toothpaste being racist or toothpaste being fed to a cat.

Even the shortest and most ordinary lead sentences in today's Philadelphia Inquirer are surely unique:

Carlos Silva chatted up his former teammates Friday afternoon at the Metrodome.
Home-sale prices have exploded throughout the Philadelphia region in the last five years.

The thing is, these are examples of completely ordinary and even banal ideas: a recently traded athlete talking with former teammates, home prices rising in a relevant span of space and time. The particulars are variable: which athlete, which stadium, what place and time. The choice of words is also variable -- "chatted up" or "talked with" or "spent some time talking to"; "have exploded", "have risen explosively", "have soared"; etc. Take the cross-product of all the variables and you get an astronomically large number of possible banalities, only a tiny fraction of which will ever actually occur, even in the conversations and writings of billions of people.

You don't have to work very many combinations to get beyond what's on the web. Thus the string "chatted up his former" is unknown to Google, even though you can find things like "chatted up his old Quake 2 buddies" and "talked with his former teammates" and "spent some time talking to his former teammates". The string "Carlos Silva chatted" is also unknown to Google, but you can find "Carlos Silva spoke" and "Michael Silva chatted" and so on.

Because such alternatives are pretty common, the sentence "Carlos Silva chatted up his former teammates Friday afternoon at the Metrodome" is never going to strike anyone as unique or original, in the way that "Don't feed your racist toothpaste to the cat" is. The "Carlos Silva" sentence has probably never been said or written before, but there's a real sense in which it's more likely than the "racist toothpaste" sentence.

Even a crude bigram model might yield this result, though it's hard to overcome the difference between twelve words and eight. Slightly more sophisticated statistical models can account for the perceived difference between the famous sentence "Colorless green ideas sleep furiously" and the various ungrammatical re-orderings of the same words. Still more sophisticated models consider the frequency of modifier-head or verb-argument combinations -- to the extent that they can estimate them -- and would be capable in principle of noticing that toothpaste is rarely racist, or that it is rarely fed to cats. However, I'm not sure that our models are yet able to capture the true banality of the two Inky ledes, or the true originality of Ms. Smartypants' examples, because such models have no notion of the frequency of conceptual fragments except insofar as these are pretty directly represented in word strings or at least in local syntactic relationships among words.
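
For readers who want to see what "a crude bigram model" amounts to, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the five-sentence toy corpus is invented, the function names train_bigram and avg_log_prob are hypothetical, add-one smoothing is just the simplest way to give unseen word pairs a nonzero probability, and averaging the log probability over the number of bigrams is one possible way to sidestep the difference in sentence length. It is not a claim about how any real language model was built.

```python
import math
from collections import Counter

# Invented toy corpus (an assumption for illustration, not real data).
TOY_CORPUS = [
    "he chatted up his old buddies",
    "he talked with his former teammates",
    "she spent some time talking to his former teammates",
    "prices have risen in the last five years",
    "do not feed the cat",
]

def train_bigram(sentences):
    """Count unigrams and bigrams, adding sentence-boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def avg_log_prob(sentence, unigrams, bigrams):
    """Mean add-one-smoothed log2 probability per bigram of the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # one extra slot stands in for unseen words
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Counter returns 0 for pairs and words that never occurred.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log2(p)
    return total / (len(tokens) - 1)

unigrams, bigrams = train_bigram(TOY_CORPUS)
banal = avg_log_prob("he chatted up his former teammates", unigrams, bigrams)
odd = avg_log_prob("feed your racist toothpaste to the cat", unigrams, bigrams)
print(banal > odd)
```

On this toy corpus the banal sentence scores higher per bigram than the "racist toothpaste" one, simply because more of its adjacent word pairs have been seen before. Note that the model still knows nothing about toothpaste or racism as concepts; it only counts neighboring words.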

Posted by Mark Liberman at 12:02 PM

Pronouns are like sweaters: Kusunda revisited

I should be a little clearer about my reasons for being excited about the Kusunda case.

Fellow lister Bill Poser is quite right that historical linguists have every reason to be skeptical of Greenberg and Ruhlen's claims. I properly ought to have written that historical linguists "assume" that this work is shaky for the very real reason that so much of it HAS been proven to be full of holes. Although I do think that interdisciplinary evidence is making it clearer by the year that Greenberg and Ruhlen have been on to something, way too often their arguments prove to have been founded upon misanalyses of data.

Some relevant demonstrations to consult are Stefan Georg and Alexander Vovin, "From mass comparison to mess comparison: Greenberg's Indo-European and its Closest Relatives" in the latest issue of DIACHRONICA, and pages 287-303 of a book called THE POWER OF BABEL whose author's name momentarily escapes me.

My feeling, as I flagged above, is that the theories of Greenberg and Ruhlen may not be demonstrable through linguistic evidence alone, but that in combination with archaeological, genetic, historical, and even phenotypic evidence, the general thrust of these theories may require acknowledgment. Ruhlen, coruscating arguer though he is, is not always quite explicit on this point: that the language data usually will NEVER be able to make a solid case.

But on the Kusunda-"Indo-Pacific" link, I would like to venture two observations, in full acknowledgment of Bill's comments.

First, certainly "Indo-Pacific" has not been demonstrated with anything approaching authority. But I am not sure that we risk essentializing indigenous people in not just proposing -- but considering it highly likely -- that if:

Humans are now demonstrated by mounting genetic and archaeological evidence to have migrated out of Africa along the southern Asian coast and sailed to New Guinea;
Today, there are scattered remnant populations on the southern Asian coast who physically resemble people of New Guinea, to an extent that no humans elsewhere on earth do;

then there might be a historical relationship between (a subset of) languages of New Guinea and those of the remnant populations in question. And as such, if there are superficial resemblances between words in these languages, then this might be a distant echo of that relationship.

Most likely, words for things like CAT and DESPAIR have drifted too far apart in these languages for there to ever be a nice family tree (even with borrowings acknowledged) like the one for Indo-European. The linguistic data alone, then, will never provide anything but suggestions. But if other evidence draws a historical link between the people in question, then I wonder whether linguists need necessarily assume that the linguistic parallels still mean nothing at all.

Second, indeed, pronouns are borrowed all over the world. Certainly it is not that pronouns are NOT borrowed -- my point was that their borrowing is the marked case. And what most interests me is that where this happens, there is close and usually long-term contact between people. This is the kind of thing that allows people to get cozy enough to use parts of each other's languages beyond the usual exchange of words for food and gadgets.

And what makes the pronominal similarity between Kusunda and languages like Juwoi interesting is that these people have not been in contact at all. Nor are the Kusunda a coastal people, nor is there a string of related languages or cultures extending from the Kusunda homeland down to the coast of India across from the Andaman Islands.

If Kusunda and Juwoi were spoken next door to one another, then the pronominal resemblance would be a yawn, as it would also be if either the Kusunda or the Juwoi were imperialist colonizers.

But imagine spending, say, a year in constant contact with Spanish speakers. When it was over, you might well find yourself saying HERMANO for "brother" or, depending what you were doing over that year, BORRACHO for "drunk." But how likely is it that you would start popping up with things like "So, what do TÚ want to do today?" Pronouns are borrowed, sure -- but they resist it.

We allow acquaintances to borrow a book, a kitchen appliance, some twine. But if someone borrows our sweater, they are likely either a close friend or a lover. Pronouns are like sweaters. When you see pronoun resemblances between languages spoken by two tiny, isolated groups, I suspect that something is up.

And then there is the possible "echo" of these features eastward in other languages, although we may choose to remain agnostic as to whether there is an "Indo-Pacific" group.

Thus I fully understand that Greenberg and Ruhlen do not have a great track record when it comes to following up on their ideas by checking the grammars. But there may be gold nuggets at the bottom of the pan.

Posted by John McWhorter at 10:14 AM

Bugaboos

I'm a couple of weeks late learning about the copepods in New York City's water, which the original complaint apparently called "insects", and which the International Herald Tribune refers to as "bugs". The University of Michigan's Animal Diversity Web makes it clear that these oar-footed critters are aquatic crustaceans, and not insects at all. Second cousins once removed of insects, at best. The phylum arthropoda splits into subphyla that include crustacea and uniramia; there are five classes of crustaceans, of which the maxillopoda are one; and the maxillopods include "barnacles, copepods, mystacocarids, tantulocarids, branchiurans, ostracods, and related groups". Down the other branch of the arthropod family, the uniramian subphylum includes millipedes, centipedes, and insects as classes.

The Smithsonian has a great copepod page, with pictures and a lot of other interesting information (though there are no recipes :-)).

Whether or not copepods are bugs is less clear. Middle English bugge is thought to have been derived from Welsh bwg "ghost", and originally meant "object of terror, usually imaginary", a sense that survives only in forms like bugbear, bugaboo, bogey and boggle. At some point, the word came to be used for insects, especially members of the order heteroptera, which biologists call "true bugs".

The other extended senses of bug are varied, specific, and curiously unrelated to one another. The OED's compilation includes:

a. "A person obsessed by an idea; an enthusiast." (e.g. litterbug, firebug).
b. "A defect or fault in a machine, plan or the like."
d. "A microbe or germ; also, a disease."
f. "A concealed microphone."

The OED provides citations for sense 3.b. from 1889 on:

1889 Pall Mall Gaz. 11 Mar. 1/1 Mr. Edison, I was informed, had been up the two previous nights discovering ‘a bug’ in his phonograph--an expression for solving a difficulty, and implying that some imaginary insect has secreted itself inside and is causing all the trouble.

and more of the history is available in the current edition of the Jargon Lexicon, which points out that "in jargon the word [bug] almost never refers to insects. Here is a plausible conversation that never actually happened: 'There is a bug in this ant farm!' 'What do you mean? I don't see any ants in it.' 'That's the bug.'"

In this sense, the copepods in the New York City water supply are bugs. At least, they're certainly not features.

Posted by Mark Liberman at 06:48 AM

June 12, 2004

Canine intelligence

Well, I tried to improve Geoff Pullum's mood with the story about Helen Keller learning the meaning of water, but it didn't work. Geoff wrote me that "[a]ctually, the stuff about Helen Keller annoys me almost as much as the stuff about dogs and parrots." And then Ray Girvan wrote to give me no end of well-deserved grief, about the discrepancies between Keller's autobiography and Sullivan's diaries, and about the fact that Keller was 19 months old before she went deaf and blind, and so might well really have "remembered" many words rather than learning them for the first time at age 8.

So I'll try again later, with a more serious discussion of canine speech understanding. Meanwhile, Margaret Marks came through by sending a link to a good dog joke:

A dog walks into a butcher shop, spends a number of minutes looking at the meat on display, and eventually indicates with a nod of his head and a bark that he would like some lamb chops.

The butcher, thinking the dog would know no better, picks up the lowest quality chops in the shop.

The dog barks furiously and continues to bark until the butcher selects the finest chops from the display counter.

The butcher weighs the meat and asks the dog for $5.90. Again, the dog barks furiously until the butcher reduces the bill to the correct price of $3.60.

The dog hands over a five dollar note and the butcher gives him 40 cents in change. Once again, the dog barks continuously until the butcher tenders the correct change. The dog then picks up his package and leaves the shop.

Now, the butcher is extremely impressed and decides that he would like to own a dog so clever. He shuts up shop and follows the dog to see where it goes.

After ten minutes or so, the dog climbs the steps to a house. When it gets to the top, it shakes its head as though in frustration, gently places the package of meat on the floor and, standing on its hind legs, rings the doorbell.

A man opens the door and starts to yell obscenities at the dog. As he does so, the horrified butcher leaps up the steps and begs the man to stop. "It's such an intelligent dog," he says, "surely it doesn't deserve this kind of treatment."

He then went on to explain how the dog had procured the best lamb chops in the shop, insisted on paying the advertised price and quibbled over incorrect change!

The man looked at the butcher and said, "Intelligent he may be, but this is the third time this week he's forgotten his keys".

This joke is a good illustration of one of the four main ways in which human language and language use seem to differ from communication among non-human animals, and also from interactions between humans and other animals. But more of that later.

Posted by Mark Liberman at 06:54 PM

At loggerheads

In one post about the G8 summit at Sea Island, Abnu at Wordlab comments first about Bush and Chirac being "at loggerheads", and second about summit-related efforts to preserve loggerhead sea turtles by letting schoolchildren from around the world propose names for individual animals to be tagged with GPS devices.

Loggerhead is one of those complex words that seem to work pretty well, despite being at best quasi-compositional in today's English. For most of my life, I've perceived it as a compound like bridegroom or bell hop, whose pieces fit together by allusion rather than by systematic composition. The only living meaning of logger -- someone who cuts down trees -- seems irrelevant except as a dimly resonant association, and the connection with head is almost equally obscure. According to the OED, logger means "a. A heavy block of wood fastened to the leg of a horse to prevent it straying ... b. Lumps of dirt on a ploughboy's feet ... [or] c. ‘Meat which is sinewy, skinny, lumpy, “chunky”, or not worth cooking’", and is "[apparently] a word invented as expressing by its sound the notion of something heavy and clumsy", but I didn't know any of that until I looked it up just now, and I doubt that many readers did either.

Relevant meanings of the compound form loggerhead, as given in the OED, include

1. a. A thick-headed or stupid person; a blockhead.
2. a. A head out of proportion to the body; a large or ‘thick’ head.
3. a. An iron instrument with a long handle and a ball or bulb at the end used, when heated in the fire, for melting pitch and for heating liquids.
4. a. ‘An upright rounded piece of wood, near the stern of a whale-boat, for catching a turn of the line to’
6. As the popular name of various heavy-headed animals. a. (Also loggerhead turtle, tortoise.) A species of turtle, Thalassochelys caretta.
8. pl. in various phrases. to fall, get, go to loggerheads: to come to blows. to be at loggerheads: to be contending about differences of opinion; also, rarely, to come to loggerheads.
[The use is of obscure origin; perh. the instrument described in 3, or something similar, may have been used as a weapon.]

Of these meanings, the only ones that I've encountered in common use are the turtle name and the idiom "at loggerheads". Given how little explicit metaphorical support the idiom has, it's interesting that it survives so well.

The earliest citation for any of the senses of loggerhead is Shakespeare with the S-word version:

1588 SHAKES. L.L.L. IV. iii. 204 Ah you whoreson logger~head, you were borne to doe me shame.

The turtle is almost three quarters of a century behind:

1657 R. LIGON Barbadoes (1673) 4 The Loggerhead Turtle.
1697 W. DAMPIER Voy. (1729) I. 103 There are 4 sorts of sea turtle... The Loggerhead is so call'd, because it hath a great head.

and the various idioms for fighting or contention just a couple of decades after that. The first citation for the "at loggerheads" version is from 1831, and the idiom doesn't seem to settle firmly into that form until the 20th century:

1680 KIRKMAN Eng. Rogue IV. i. 6 They frequently quarrell'd about their Sicilian wenches, and indeed..they seem..to be worth the going to Logger-heads for.
1681 Trial of S. Colledge 49 So we went to loggerheads together, I think that was the word, or Fisty-cuffs.
1755 SMOLLETT Quix. (1803) I. 66 The others..went to loggerheads with Sancho, whom they soon overthrew.
1806 JEFFERSON Writ. (1830) IV. 63 In order to destroy one member of the administration, the whole were to be set to loggerheads.
1831 J. W. CROKER in C. Papers 25 Jan., I hear from London that our successors are at loggerheads.
1887 FRITH Autobiog. I. xxiv. 347 The Lord Chancellor..and the Bishop came to loggerheads in the House of Lords.
1955 Bull. Atomic Sci. Mar. 90/3 Uranium men and oil and gas producers had long been at loggerheads due to the fact these natural substances frequently occur on the same site, though at different horizons.
1955 Times 19 May 4/2 The jury would not have much difficulty in getting rid of that suggestion, because those two were obviously at loggerheads.
1975 J. GARDNER Killer for Song i. 13 ‘James, it's good to see you.’ His expression was at loggerheads with the words.

Speaking for myself, I learned only a few years ago that loggerheads were "iron instruments with long handles and balls or bulbs at the end", and then realized that being "at loggerheads" might involve a metaphorical reference to fist-fighting -- arms being the handles and fists being the balls or bulbs at the end. The occasion was reading in Patrick O'Brian's The Commodore: "...They had been sparring, in a spirit of fun, with loggerheads, those massy iron balls with long handles to be carried red-hot from the fire and plunged into buckets of tar or pitch so that the substance might be melted with no risk of flame. 'They are sober now, sir; and penitent, the creatures.'" [This is one of the terms covered at Gibbons Burke's page Nautical Expressions in the Vernacular].

Before reading O'Brian, I always thought of "being at loggerheads" as a relatively immobile sort of head-to-head grappling, sort of like two turtles pushing against one another, or two logs bumping ends, though I had never formulated the idea consciously.

[Update: abnu at Wordlab adds a reference to a variant form of the word:

In his otherwise thorough analysis, the linguist did not comment on the variation of loggerhead in the southern dialect, lager head, which was probably as descriptive of the Sea Island Summit delegates as the turtles.

Harbor Island is a sleepy little oceanfront community that exudes relaxation and rejuvenation. While they do have tennis courts and swimming pools, we prefer sandy beach strolls, chicken-on-a-string crabbing, and watching the lager head sea-turtle hatchlings make their way to the ocean.

This description from "High on Life in South Carolina's Low Country" conjures up images of drunken turtles stumbling across the beach. Isn't language wonderful?

The "lager head" variant was new to me, but I suspect it is more of a sporadic folk etymology (i.e. "eggcorn") than a regional variant. I did manage to confirm Abnu's conjecture about the diplomats attending the conference -- one of the German delegates is shown at the right.

Google has 412 examples of "lager head" and another 1,190 of "lagerhead", nearly all of which are not turtles. Among the 40 or 50 that I checked, references to beer foam and to beer enthusiasts are roughly in balance. However, there are a few other lager-head turtles in the mix:

(link) By the way they should bring that show back. If only to ensure that one hour could pass without an appearance from lager-head turtle lookalike Kenny Chesney. He must have pictures of some high level exec with a goat.
(link) This sterling bracelet is for the lager head turtle enthusiast.
(link) Here is a friendly Lagerhead turtle we came upon and he let us swim and photograph him for awhile!
(link) Samantha and I are counting on the LagerHead Turtle for our livelihood.

When I ask for "lager head", Google's spelling correction algorithm is good enough to ask whether I meant to search for "loggerhead".]

[Update #2: searches for "log ahead turtle", "log or head turtle", "largerhead turtle", "lacquerhead turtle", etc., all come up empty; but one Dutch person has been seduced by "lockerhead turtle":

(link) Het was een enorme lockerhead turtle die een nest ging maken.

]

[Update 6/14/2004: another theory about the at loggerheads metaphor is found in this entry from John Ciardi's "A Browser's Dictionary" (1980), sent in by Jerry Kreuscher:

logger: 1. In Brit. dial. of unknown origin. A wooden Block. A chopping block. A knob. [Perh. ult. < log, but ??] 2. Am. whaling. A snubbing post built into the bow of a whaleboat. The line attached to a harpooned whale was coiled around the logger. [This usage also suggesting a poss. derivation from log.] loggerhead A knob head. A blockhead. [The root ref. is to a knob on the end of a stick. And so in the names of various animals with unusually large heads, as the loggerhead turtle.]
at loggerheads In hot dispute at close quarters, as if bumping heads. By ext. In hot argument, giving it to one another head to head and verbal scald for verbal scald. [A long-handled large iron ladle for pouring melted tar and molten was earlier called a loggerhead. In naval engagements up to early XVII, ships often grappled while sailors used such loggerheads for scalding the enemy with tar, oil, or water that had been brought to a boil in caldrons set up on deck in brick-and-sand pits. The galleass of the XV and XVI in the Mediterranean, with its very high sheer, was esp. well designed for such warm outpourings upon lower-lying vessels. See galleywest.]

]

Posted by Mark Liberman at 05:26 PM

The S-word and the F-word

Back in January, I told the story of some 4-year-olds' concern for taboo language:

A: Do you know the bad words?
B: Yes. My mom says them all the time.
C: Mine too.
A: I know the S word.
C: [covering her ears] Don't say it! Don't say it!
B: [trying to put his hands over A's mouth] That's the worst one! Don't say it, we'll get in trouble!
A: I'm going to say it! "STUPID." There, I said it.
C: No! No! You can't say that! Don't say it again!

As I failed to point out at the time, there is scriptural support for their worry, at least with respect to the closely-related F-word -- in Matthew 5:22.

21 Ye have heard that it was said by them of old time, Thou shalt not kill; and whosoever shall kill shall be in danger of the judgment:
22 but I say unto you, That whosoever is angry with his brother without a cause shall be in danger of the judgment: and whosoever shall say to his brother, Raca, shall be in danger of the council: but whosoever shall say, Thou fool, shall be in danger of hell fire. [Matthew 5:21-22, King James version]

The NASB translation is:

"But I say to you that everyone who is angry with his brother shall be guilty before the court; and whoever says to his brother, 'You good-for-nothing,' shall be guilty before the supreme court; and whoever says, 'You fool,' shall be guilty enough to go into the fiery hell."

In the original Greek of the gospel, these would have been an R-word and an M-word. Here's the version from Perseus, in transliteration (you can go to Perseus and re-set your preferences for the Greek-letter version, if you want it):

[22] Egô de legô humin hoti pas ho orgizomenos tôi adelphôi autou enochos estai têi krisei: hos d' an eipêi tôi adelphôi autou Rhaka, enochos estai tôi sunedriôi: hos d' an eipêi Môre, enochos estai eis tên geennan tou puros.

The two Greek insults referenced in the passage are transliterated as rhaka and môre (vocative of môros). The Liddell-Scott Greek lexicon glosses rhaka simply as "Hebr. word expressive of contempt", referencing Matt. 5.22; though I suppose that it should be Aramaic, not Hebrew... L-S glosses môros as "dull, stupid" (of people) or "insipid, flat" (of taste), which is the source of English moron, as the OED explains:

[< ancient Greek μωρόν, neuter of μωρός, [...] foolish, stupid (further etymology uncertain: a connection with Sanskrit mūra foolish, stupid, is now generally rejected). Perh. cf. earlier MORIA n.
Ancient Greek μωρόν is used as a noun in the sense ‘folly’, but is not used to denote a person (the neuter usually represents inanimate categories).]

Jesus would have been speaking Aramaic, so Môre is Matthew's representation of some Aramaic insult relating to lack of intelligence or sense, and more hurtful than Rhaka.

It's easy to read Matthew 5:22 as supporting the 4-year-olds' belief that such insults are religiously forbidden behavior, but the theology here (as often) is obscure to me, since in Matthew 23:17, Jesus himself calls the scribes and pharisees môroi kai tuphloi "ye fools and blind".

Anyhow, Matthew 5.22 and its relationship to a phrase in Shakespeare's Merchant of Venice has been discussed in various weblogs recently, because of a comment by "Jennifer" in a discussion at reason on the NEA. An anecdote in an anonymous comment is even less certain to be true than an anecdote told by someone who identifies themselves, but this one is plausible:

I taught "Merchant of Venice" to seniors one year; in it there's a line where one character is insulting another, by saying something along the lines of "He damns the ears of all who hear him, by calling him 'fool.'" One of the kids asked me what that meant, so I explained that one of the lesser-known verses of the Book of Matthew has Jesus saying that anyone who calls another a fool will be damned. [...] I went on to talk about the very funny use Voltaire made of that in his essay "The Jesuit Berthier" (an angel tells a priest to stop giving his stupid, boring sermons, because instead of winning souls for God he's endangering the souls of all who hear him, because they all call him a fool), and explained also that this is why cartoony villains in movies developed the habit of using "Fool!" as their default insult; for people familiar with the Bible, the fact that the villain always says "Fool!" is just one more proof that this is an evil, evil dude.

"So anyway," I said to the class, "back in Shakespeare's day, when people were far more familiar with the Bible than they are now, instead of insulting someone by saying 'You are a fool,' you'd say 'You are a--well, I can't SAY what you are because then I'd go to hell.' That's what he's doing in the play."

Next day I get called into the principal's office; some parents were FURIOUS that I had told their kids that Jesus said anyone who says 'fool,' will go to Hell.

"But he did," I pointed out.

"It doesn't matter, Jennifer. You can't insult kids' religions."

"Well, the kid asked me what that line from the play meant! What was I supposed to do?"

"Just tell him you don't know."

Posted by Mark Liberman at 01:12 PM

June 11, 2004

The strange, new sight

In the wake of the Science article about Rico the border collie, I thought it might reduce Geoff Pullum's blood pressure to read three paragraphs from the autobiography of Helen Keller, featured in Walker Percy's essay The Delta Factor. The quotation from Keller is introduced by a few sentences of Percy.

Then I began thinking about what happened between Helen Keller and Miss Sullivan in Tuscumbia, Alabama, on another summer morning in 1887. You recall the story. The heart of it is in three short paragraphs. Earlier, Helen had learned to respond like any other good animal: When she wanted a piece of cake, she spelled the word in Miss Sullivan's hand and Miss Sullivan fetched the cake (like the chimp Washoe, who gives hand signals: tickle, banana, etc.). Then Miss Sullivan took her for a walk.

We walked down the path to the well-house, attracted by the fragrance of the honeysuckle with which it was covered. Someone was drawing water and my teacher placed my hand under the spout. As the cool stream gushed over one hand, she spelled into the other the word water, first slowly then rapidly. I stood still, my whole attention fixed upon the motion of her fingers. Suddenly I felt a misty consciousness as of something forgotten--a thrill of returning thought, and somehow the mystery of language was revealed to me. I knew then that "w-a-t-e-r" meant the wonderful cool something that was flowing over my hand. That living word awakened my soul, gave it light, hope, joy, set it free! There were barriers still, it is true, but barriers that could in time be swept away.

I left the well-house eager to learn. Everything had a name, and each name gave birth to a new thought. As we returned to the house every object which I touched seemed to quiver with life. That was because I saw everything with the strange, new sight that had come to me. On entering the door I remembered the doll I had broken. [She had earlier destroyed the doll in a fit of temper.] I felt my way to the hearth and picked up the pieces. I tried vainly to put them together. Then my eyes filled with tears, for I realized what I had done, and for the first time I felt repentance and sorrow.

I learned a great many new words that day. I do not remember what they all were, but I do know that mother, father, sister, teacher were among them--words that were to make the world blossom for me, "like Aaron's rod with flowers." It would have been difficult to find a happier child than I was as I lay in my crib at the close of that eventful day and lived over the joys it had brought me, and for the first time longed for a new day to come.

In another essay, Interpersonal Process, Percy wrote:

... one can use the word mean analogically and say that thunder means rain to the chicken and that the symbol water means water to Helen Keller. But the symbol does something the sign fails to do. It sets the object at a distance and in a public zone, where it is beheld intersubjectively by the community of symbol users. As Langer put it, say James to a dog, and, as a good sign-using animal he will go look for James. Say James to you, and if you know a James, you will ask, "What about him?"

"Langer" is Susanne Langer; Percy doesn't footnote the reference, but I believe it is to Philosophy in a New Key, Harvard University Press, 1957.

Paul Bloom raises related issues in his "Perspective" (Science, Vol 304, Issue 5677, 1605-1606, 11 June 2004) on the article about Rico (Juliane Kaminski, Josep Call, Julia Fischer. "Word Learning in a Domestic Dog: Evidence for 'Fast Mapping'". Science, Vol 304, Issue 5677, 1682-1683, 11 June 2004). Rico's reaction to "James" is not specified, but as I understand the Kaminski et al. article, Langer predicted it correctly. The exciting news -- and it is genuinely interesting, I think -- is that Rico can learn quickly, sometimes even in just one trial, what a vocal sign like "James" refers to. I have some questions about how Rico classifies vocal noises phonetically -- in other words, what counts as an utterance of "James", versus (say) "chains" by the same speaker versus "James" by a different one -- but that can wait for another day.

[Update: Ray Girvan emails that:

However, some have criticised Langer's 'sudden dawning of meaning' interpretation of Keller's experience. John McCrone, particularly, has argued from the documented sources that the standard Keller story is highly romanticised and based on unreliable personal recollection.
"Helen may have remembered her awakening to language as a sudden revelation at the garden pump, but Annie Sullivan's diary tells that it took many weeks of fingerspelling on Helen's hands before connections started to be made in her mind". - http://www.btinternet.com/~neuronaut/webtwo_features_keller.htm

This seems very likely. Certainly Keller's narrative is literature, not science. But Percy's comment on Keller's story is that she had simply compressed into a few hours an experience that normally takes place over a few years, in the normal development of children. On this interpretation, the exaggeration and romanticization improve the narrative but don't change its basic form.]

Posted by Mark Liberman at 02:55 AM

Lo Catalanisme

One of the founding fathers of the Catalan national movement was Valentí Almirall (1841-1904). His most important work was his book Lo Catalanisme, published in 1886, my copy of which I came across recently. Catalan is dialectally diverse, but one tendency in recent Catalan politics has been the promotion of a standard variety to the exclusion of the other dialects. How ironic that the title of the founding work of the Catalan movement is incorrect in the standard, in which lo should be el.

[Update 2004/06/11: There is a nice map of Catalan dialects here.]

[Update 2004/06/15: Trevor at kaleboel provides some information about Catalan dialectology and takes issue with this post.]

Posted by Bill Poser at 12:04 AM

June 10, 2004

So now it's dogs that understand language (sigh)

My friend Nathan Sanders of Williams College points out to me an AP news report headlined Research Shows Dogs Understand Language. It relates to an article in Science about a border collie that "understands more than 200 words and can learn new ones as quickly as many children."

"Just a little something to increase your blood temperature a few degrees," quips Nathan, with the usual smileyface emoticon that signals quipping. How well he knows me. Blood pressure as well as temperature. I raged and fumed and hurled several medium-sized pieces of furniture across the room. As he well knows, I'm so sick of crappy brain-dead reports by moron journalists about completely fictional animal communicative abilities (and the credulous dimwits who welcome these stories with open arms, like American Kennel Club board member Patti Strand, who immediately hailed the report as "good news for those of us who talk to our dogs").

Don't get me wrong: it's not that I have any objection at all to scientific results on whether dogs can remember enough of a correlation between human speech sounds and specific toys to go fetch the right one in response to the right word, which is what we are talking about in the case of this story. Perhaps there are even some insights to be gained about the complexity of the tasks the brains of lower animals (particularly mammals) are wired to accomplish (though not much, because nobody doubts that mammals are capable of associating large numbers of aural stimuli with particular behavioral responses). It's the confusion of that with understanding language that drives me nuts. (It must have been Stupid German Linguistics Day on American public radio stations today, because in addition to a story on All Things Considered about a contest to find the most beautiful word in the German language, two of the afternoon programs put out by NPR covered Rico, both with the host babbling about the dog "understanding language.")

The trained object-fetching behavior of Rico, the border collie that this German research is talking about, has nothing at all to do with understanding language. The behavior is comparable to what you would have shown if you demonstrated that you had trained your goldfish to swim to a given object in its tank when you showed it a card with a given letter of the Greek alphabet. By all means attempt that too, if you think it would be interesting science. But don't bring it to me for my approval under a headline saying Research Shows Goldfish Can Read Greek, that's all! Unless you actually enjoy seeing the veins standing out in my neck as I hurl some more defenseless chairs and coffee tables and goldfish tanks around the room.

Posted by Geoffrey K. Pullum at 01:44 PM

Kusunda

The claim that John McWhorter referred to, that Kusunda, a poorly known language of Nepal, is related to "Indo-Pacific", has been making the rounds of the historical linguistic grapevine for a while now. I haven't yet had time to study carefully the paper that has now appeared in the Proceedings of the National Academy of Sciences, but a few comments are possible off the cuff.

First, there is no such language family as "Indo-Pacific". "Indo-Pacific" is the name that Joseph Greenberg gave to a putative language family that included all of the non-Austronesian languages of New Guinea, the languages of the Andaman Islands, and the languages of Tasmania but not the other languages of Australia. As usual with Greenberg's work, the proposal was supported by virtually no evidence of the sort considered probative by historical linguists, and the proposal has not been accepted by most historical linguists.

Here is the evaluation given by George van Driem in volume I of his 2001 book Languages of the Himalayas:

Racial notions have continued to be uncritically applied to language groupings. As late as 1971, Joseph Greenberg resurrected the old idea that "the bulk of non-Austronesian languages of Oceania from the Andaman Islands on the west of the Bay of Bengal to Tasmania in the Southeast form a single group of genetically related languages for which the name Indo-Pacific is proposed." This hypothesis is identical to Finck's 1909 family of "Sprachen der ozeanischen Neger", a group for which indeed the name "Indo-Pacific" had already been in use, with its roots in the "Pan-Negrito Theory" of physical anthropologists (cf. Skeat and Blagden 1906: 25-28). Appropriately, Roger Blench has described the Indo-Pacific hypothesis as "essentially a crinkly hair hypothesis". (pp. 139-140)

The linguistic evidence which Greenberg adduced for Indo-Pacific is unconvincing, and lexical look-alikes and superficial typological similarities in languages cannot convincingly demonstrate a theory of linguistic relationships conceived solely on the basis of the physical attributes of the speakers. (p. 141)

The other important point that can be made is that it is possible for pronouns to be borrowed. Here are a few examples:

The entire pronominal system of Pirahã, a language of the Brazilian Amazon, was borrowed from the Lingoa Geral, the trade language once widely used in the area.
English they and them are loans from Scandinavian.
Young Thai speakers currently use /mi/ and /ju/, borrowed from English me and you, much of the time. This provides an escape from the very complicated native Thai honorific system. These loans have not completely replaced the traditional pronouns, but they have come into wide use, and this situation provides an example of how and why pronoun borrowing resulting in complete replacement might come about.
Japanese has quite a few words roughly equivalent to English I/me. One of them is 僕 [boku], which is used by men in casual circumstances. This is a loan from Chinese. Its original meaning is "manservant", a meaning that it retains in a few fairly obscure compounds in Japanese.

For further information, check out this paper on Pronoun Borrowing [PDF document] by fellow Language Logger Sally Thomason and Dan Everett.

The idea that resemblances in pronouns are good evidence of a genetic relationship because they can't be borrowed has been around for quite a while and continues to be promoted by people like Ruhlen, but it just isn't true. There are some other things to say about resemblances in pronouns, which many linguists now suspect involve sound symbolism, but that's another topic.

Let me close by disagreeing in one detail with John's statement that:

Almost to a man, historical linguists assume that the attempts by Ruhlen and the late Joseph Greenberg proposing that languages of great antiquity retain traces of their origins in single ancestral languages are irresponsible.

It isn't a matter of assuming. The reason that most historical linguists reject the claims of Merritt Ruhlen and his ilk is that when they are investigated they turn out to be badly flawed. The "evidence" presented isn't probative. It consists of lists of words that kind of, sort of, look similar and mean similar things. There's no reason to think that the similarities they present are not due to chance. A large percentage of the data turn out to be wrong. The statements that they make about the history of historical linguistics and the lessons that we should draw from it are false. It's true that at this point when we hear about the latest claim by these people we think "Oh, another crank proposal", but that reaction is based on their lousy track record, and we don't say anything until we've actually looked at it.

Posted by Bill Poser at 01:12 PM

Can relationships between languages be determined after 80,000 years?

I know -- Merritt Ruhlen is insane. Almost to a man, historical linguists assume that the attempts by Ruhlen and the late Joseph Greenberg proposing that languages of great antiquity retain traces of their origins in single ancestral languages are irresponsible.

But are we really being fair?

Ruhlen and some associates have published a paper in a place so obscure they might as well have scratched it on a cocktail napkin somewhere.

(The site to consult is http://www.pnas.org/cgi/content/abstract/101/15/5692.)

But what they show is, to me, spinetinglingly interesting.

Kusunda is a moribund language of Nepal, once assumed to be Sino-Tibetan like a good Nepalese language often is, but which upon examination reveals itself to be something else.

And that something else is, of all things, Indo-Pacific. That is, Papuan. Properly, the Indo-Pacific group extends westward to, for example, the Andaman Islands tucked under Burma in the Bay of Bengal.

And there is spoken a language called Juwoi. And the correspondences between Kusunda and Juwoi are too close to be called an accident. What is crucial about the correspondences is that they involve pronouns. There are two things about pronouns. First, they tend to hew to their original state much longer than more general words for things like dogs and blogs. English has DOG and French has CHIEN, but both have held on to ME for "me" in their grammars. Second, languages do not exchange pronouns much. Usually, a language's pronouns are original stock, not the result of later bartering. So the Japanese say BEISUBORU for "baseball," but their word for "I" remains WATASHI.

So in that light, we must take note that Kusunda for "I" is CHI, where in Juwoi it is TUI. T becomes CH constantly over time: witness how many Americans say "chree" instead of TREE. Then Kusunda for "my" is CHI-YI -- and in Juwoi, TII-YE. Kusunda for "you": NU. In Juwoi, NGUI. "Your" in Kusunda: NI-YI. In Juwoi: NGII-YE. Note that pattern of sticking on a YE or YI -- this is too close to be an accident. "He/she" in Kusunda is GIDA. In Juwoi it is KITE -- and if you think about it, G is basically K enunciated in a slightly different way.

And yet there is no way that the Kusunda have been helicoptering over the millennia to the Andaman Islands. And certainly not to Western New Guinea, where in the Seget language, Kusunda's CHI, NU, GIDA come out as TET, NEN and GAO (remember that CH comes from T all the time). And then way over on the Solomon Islands east of New Guinea, the same pattern echoes: where Seget has NEN for "you" and GAO for "he," Savosavo has NO and GO.

The beauty of this is that according to the latest reports, New Guinea was occupied by humans 75,000 years ago (the source to consult is Stephen Oppenheimer's THE REAL EVE). Now, as it happens, the Kusunda are reported to have the same small, dark physical appearance as Andaman Islanders, unlike other Nepalese people. This suggests that they are remnants of the trek of early humans along the South Asian coast from Africa to New Guinea and Australia. Data is as yet unclear as to exactly when people like the Andaman Islanders got where they are now. But many suppose that humans left Africa to beachcomb South Asia 80,000 years ago, which means that the Kusunda would have found their place between then and 75,000 years ago.

Which means that the likenesses between Solomon Islands languages like Savosavo and Kusunda today represent a relationship almost 80,000 years old!!!

This flies in the face of a common wisdom taught to historical linguists, that it is impossible to trace parentages between languages much further back than ten thousand years or so, or that after that, sound changes will have erased any perceivable likenesses.

I doubt that we can trace any words as far back as the very first language -- i.e. Greenberg and Ruhlen's "Proto-World." But I suspect that the Kusunda data suggest that linguists ought to give these guys a break on their analysis of pronominal patterns in Native American languages similar to the ones between Kusunda and Indo-Pacific.

In my own THE POWER OF BABEL I dutifully trash the "Proto-World" hypothesis, as per my tutelage in the world of historical linguists. But we must let data speak, and I have a hard time seeing this Kusunda-Papua connection as mere happenstance.

Just think -- a handful of elderly people in Nepal, given a charter trip to the Andaman Islands, would find that they had eerily similar words in common with the people they met there. And language parallels can help support archaeological and genetic research. I hope that linguists will take a look at this Kusunda paper and give it a good, fair chew.

Posted by John McWhorter at 04:01 AM

June 09, 2004

At long last

Today is the fiftieth anniversary of an event that should not go unremarked on Language Log: it's exactly half a century today since a pair of well-crafted sentences rang out across a Congressional hearings room in Washington DC and began a process that was of great importance to the integrity and honor of our country:

Have you no sense of decency, sir? At long last, have you left no sense of decency?

In the early 1950s, Senator Joseph McCarthy was famous for his aggressive anti-communist stance, and speeches in which he claimed to be in possession of long lists of names of communists in the State Department, the military, and elsewhere in government. He made full use of his position as chair of the Senate Committee on Government Operations and its Permanent Subcommittee on Investigations. He destroyed the careers of many people by claiming that they had belonged to communist front organizations or associated with communists. His success at this owed a lot to the fact that he was able to play (as Harvard law dean Erwin Griswold put it) "judge, jury, prosecutor, castigator, and press agent, all in one."

On June 9 in 1954, McCarthy was pursuing a somewhat peripheral vendetta against the Army over the drafting of a member of his staff. The vendetta had already dragged through over thirty days of Congressional hearings. At one point, out of sheer malice, McCarthy decided to place into the record the quite gratuitous information that the law firm representing the Army, Hale and Dorr of Boston, employed a young lawyer, Fred Fisher, who — though he was by this time a Republican — had once (in law school and for a few months thereafter) belonged to a chapter of a leftist organization, the Lawyer's Guild.

Fisher was not even on the team that was representing the Army in Washington; he worked in Hale and Dorr's Boston office and had nothing to do with the case at hand. But his career could well be over if he was publicly smeared as a communist, and that would be a blow McCarthy could strike against the senior Hale and Dorr attorney who was representing the Army, Joseph Welch. As McCarthy launched into the speech that would place it on record that Fisher had been in the Lawyer's Guild, Welch went on the offensive, arguing against him fiercely, castigating him personally ("Until this moment, Senator, I think I never really gauged your cruelty or your recklessness"), begging him not to go on. "Let us not assassinate this lad further, Senator; you've done enough," he cried; and as McCarthy showed that he was going to go on regardless, Welch added: "Have you no sense of decency, sir? At long last, have you left no sense of decency?" (The quote is often given with Have you no shame? included, but that is not what Welch said; see the transcript here or the quotation from official Senate history files here; the latter gives a link to the PDF version of the full original hearings. For an NPR report with the original audio, go to NPR's rundown list and click on the relevant story.)

From the moment of Welch's eloquent and much-quoted utterance, Joseph McCarthy's reputation started to wane, and before long it had collapsed. He lost his popularity with the public (his altercation with Welch was seen live on TV, and the newspapers the next day recorded it in print for those who didn't see it). Ultimately he was censured by his Senate colleagues. When he died three years later after a period of alcohol abuse he was a broken man. Never was there a clearer example to show that sometimes, in the face of real evil and dangerous power, one person can stand up and win a battle with a simple speech act.

[Note for syntacticians: Welch's syntax is rather old-fashioned for American English (though not so much for British speech): his famous outburst has two features that are rare in American speech and one that is rare in every dialect. (In what follows I put "[%]" in front of a sentence if only some Standard English speakers would accept it.) First, the sentence illustrates subject-auxiliary inversion with the possession sense of have, which Americans don't use much any more (they tend to say Do you have a pencil? rather than [%]Have you a pencil?). Second, his utterance also uses non-verbal negation, which is less common (and more formal) than verbal negation (Americans say I don't have a pencil rather than [%]I have no pencil). And third, the second sentence, Have you left no sense of decency?, shows a rather unusual adjunct placement for left (Americans would say Don't you have any pencils left rather than [%]Don't you have left any pencils?). At long last sounds a bit odd, too; Welch means it in the sense "Hasn't this gone on long enough?", which is not quite the same as the modern sense "finally" (as in At long last Iraq has an Iraqi-led government), though of course they are close. Fifty years is not a long time in language change, but already English is sounding a little different from when Welch spoke.]

[Small revisions made to the text after fact-checking on June 10th, 2004, and a spelling correction on October 13, 2006. —GKP]

Posted by Geoffrey K. Pullum at 07:10 PM

With eggcorn aforethought?

You can usually produce an example of an eggcorn by taking a common idiom or collocation, and inventing a different word sequence with a similar sound. If the substituted words have relevant meanings, so much the better; and if the original collocation is archaic or otherwise non-compositional, that improves the chances still further. Using this method, I created the candidate eggcorn "malicious forethought", from the legal term "malice aforethought". Google has 2,990 hits for "malicious forethought", and 21,400 hits for "malice aforethought", so it looks like we have a winner.

However, it's not so simple. One of the ghits for the putative eggcorn is in the entry for kill in Webster's Revised Unabridged Dictionary (1913): "To murder is to kill with malicious forethought and intention." What's going on here? Is it possible that "malicious forethought" is not really an eggcorn?

The OED's definition for murder cites "the unlawful killing of a human being with malice aforethought; often more explicitly wilful murder." Overall, the OED has 10 full-text matches for "malice aforethought" in the 2nd edition, and 9 more in the new edition; with no matches at all for "malicious forethought".

In the entry for aforethought, the OED gives the etymology as a calque of an Old Law-French term:

[f. AFORE adv. + thought: see THINK. Apparently introduced as an English translation of the Old Law-Fr. prepense in malice prepense.]

The OED's entry for malice has an etymological narrative that is consistent with this but a little more complicated:

[< Anglo-Norman malice, malise, malisce, Old French, Middle French, French malice (12th cent. in senses 1a, 3, and 5, 1314 in sense 4a, 16th cent. in sense 1d, 17th cent. in sense ‘desire to tease’: see sense 1a) < classical Latin malitia < malus bad (see MAL-) + -itia -ICE.
With malice aforethought (see sense 2), cf. post-classical Latin malitia excogitata (1235, c1323 in British sources), malitia preconcepta (c1300, 15th cent. in British sources), malitia precogitata (1304, 1391 in British sources). With malice purpensed (see sense 2) cf. Anglo-Norman malice purpensé.]

Webster's 2nd defines aforethought adequately, with a reference to the relevant legal term: "Premeditated; prepense; previously in mind; designed; as, malice aforethought, which is required to constitute murder. Bouvier."

Webster's 3rd (1961) seems to have thought better of the "malicious forethought" business. In its entry for kill, it says "MURDER implies motive and usu. premeditation in a criminal human act", dispensing entirely with any morphological derivatives of malice. As far as I can tell, the phrase "malicious forethought" does not occur anywhere in the text of the 3rd edition. I qualify this statement because the only on-line search available to me is via a ProQuest service, offering something called a "keyword search" that in other tests seems willing to find word strings in the middle of entries, but turns up nothing for "malicious forethought". In any case, the 3rd edition's entry for aforethought is significantly more complete than in the 2nd edition:

[malice aforethought trans. of AF malice purpensee; malice prepense alter. of earlier malice prepensed (trans. of AF malice purpensee), fr. E malice + obs. E prepensed premeditated --- more at PREPENSE]

: deliberate malice : premeditated malice

specif : malice in fact or implied malice in the intention of one who has had sufficient time to act with premeditation in the doing of something unlawful (as in doing serious bodily harm to another person or as in murdering another person)

To sum up, it seems that the redoubtable 1913 Webster's Unabridged really did perpetrate an eggcorn. However, this is a very strange case, since the "mistaken" phrase means essentially the same thing that the "correct" one does, and has the extra advantage of being compositional in contemporary English.

In the technical jargon recently invented for the purpose, the count of 2,990 Google hits for malicious forethought translates to 698 whG/bp (web hits on Google per billion pages), while malice aforethought's 21,400 ghits is 4,994 whG/bp.
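For anyone who wants to check the arithmetic, here is a minimal sketch of the normalization. It assumes the index size is the page count Google was advertising at the time (about 4.29 billion); that figure is not stated in this post, and the function name is just a label chosen for the illustration.

```python
# Assumption: index size taken from the page count Google advertised in mid-2004.
# It is not given in the post; it is used here because it reproduces the figures.
GOOGLE_INDEX_PAGES = 4_285_199_774


def whg_per_bp(hits, index_pages=GOOGLE_INDEX_PAGES):
    """Web hits on Google per billion indexed pages, rounded to an integer."""
    billions_of_pages = index_pages / 1e9
    return round(hits / billions_of_pages)


# Raw hit counts quoted in the post.
counts = {
    "malicious forethought": 2_990,
    "malice aforethought": 21_400,
}

rates = {phrase: whg_per_bp(hits) for phrase, hits in counts.items()}

for phrase, rate in rates.items():
    print(f"{phrase}: {rate} whG/bp")
```

With that assumed index size the two counts come out at 698 and 4,994 whG/bp, matching the figures above. The point of the normalization is only that a rate per billion pages stays comparable as the index grows, whereas a raw hit count does not.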

[Update: 6/10/2004: Elissa Flagg emailed to point out another variant, "malice of forethought". She observes that "It has a relatively small number of Google instantiations at 533 (most of which are in reference to legal proceedings, although several seem to be about a punk/metal band of that name), but it's the version I've always thought I was hearing." ]

[Update 2/25/2005: Andrew Gray emailed with yet another variant: "malice and forethought", which he observed in an IM conversation. There are 880 Google hits for this version, though some of them seem to be fully compositional cases for which "malice aforethought" could not be substituted (e.g. "Malice and forethought, the essentials of an evil will, were shown not to be necessary for organizing and structuring the machinery for the efficiency of the death camps. ") Like "malicious forethought", this is an especially tricky example since it has an entirely appropriate compositional meaning in most contexts in which it might be substituted for "malice aforethought", and is blocked only because of the prior existence of the almost-homophonous phrase. ]

Posted by Mark Liberman at 05:01 PM

A soul candidly acknowleging it's fault

Lynne Truss, author of Eats, Shoots and Leaves: the Zero Tolerance Approach to Punctuation, says that she is "not a pedant, but a stickler", by which she means that "people who put an apostrophe in the wrong place ... deserve to be struck by lightning, hacked up on the spot and buried in an unmarked grave." Visiting Colonial Williamsburg, I've recently learned how lucky it was, for all concerned, that Ms. Truss was not living in Virginia in the last third of the 18th century.

In 1771, Robert Skipwith, the brother of Thomas Jefferson's future wife, asked for "a catalogue of books to the amount of about 50 lib. sterl.". Jefferson, who had lost his family library in a fire in 1770, and had been busily building a new one, responded with a list of 148 titles in 379 volumes, costing several times the cited limit: "such a general collection as I think you would wish and might in time find convenient to procure. Out of this you will chuse for yourself to the amount you mentioned for the present year and may hereafter as shall be convenient proceed in completing the whole."

In the cover letter that Jefferson sent with his catalogue, he felt the need to excuse the inclusion of works of fiction, by arguing that

[w]e are ... wisely framed to be as warmly interested for a fictitious as for a real personage. The field of imagination is thus laid open to our use and lessons may be formed to illustrate and carry home to the heart every moral rule of life.

In addressing this question, Jefferson uses the possessive form of it four times, twice spelled as "its" and twice spelled as "it's":

A little attention however to the nature of the human mind evinces that the entertainments of fiction are useful as well as pleasant. That they are pleasant when well written every person feels who reads. But wherein is its utility asks the reverend sage, big with the notion that nothing can be useful but the learned lumber of Greek and Roman reading with which his head is stored?

I answer, everything is useful which contributes to fix in the principles and practices of virtue. When any original act of charity or of gratitude, for instance, is presented either to our sight or imagination, we are deeply impressed with its beauty and feel a strong desire in ourselves of doing charitable and grateful acts also. On the contrary when we see or read of any atrocious deed, we are disgusted with it's deformity, and conceive an abhorence of vice. Now every emotion of this kind is an exercise of our virtuous dispositions, and dispositions of the mind, like limbs of the body acquire strength by exercise. But exercise produces habit, and in the instance of which we speak the exercise being of the moral feelings produces a habit of thinking and acting virtuously. We never reflect whether the story we read be truth or fiction. If the painting be lively, and a tolerable picture of nature, we are thrown into a reverie, from which if we awaken it is the fault of the writer. I appeal to every reader of feeling and sentiment whether the fictitious murther of Duncan by Macbeth in Shakespeare does not excite in him as great a horror of villany, as the real one of Henry IV. by Ravaillac as related by Davila? And whether the fidelity of Nelson and generosity of Blandford in Marmontel do not dilate his breast and elevate his sentiments as much as any similar incident which real history can furnish? Does he not in fact feel himself a better man while reading them, and privately covenant to copy the fair example? We neither know nor care whether Lawrence Sterne really went to France, whether he was there accosted by the Franciscan, at first rebuked him unkindly, and then gave him a peace offering: or whether the whole be not fiction. In either case we equally are sorrowful at the rebuke, and secretly resolve we will never do so: we are pleased with the subsequent atonement, and view with emulation a soul candidly acknowleging it's fault and making a just reparation.

I did notice the errant apostrophes, but did not find that they spoiled the sentiment. I was also easily able to get past the non-standard spelling of acknowledge. In fact, I'll acknowledge that I didn't even notice it until I saw it in the title of this entry, which I cut-and-pasted from the online html version of Jefferson's letter.

This does not seem to have been a commonplace spelling of the time, but just an idiosyncratic mistake by Jefferson, since the OED has:

1590 SHAKES. Com. Err. V. i. 322 Thou sham'st to acknowledge me in miserie.
1597 1 Hen. IV, III. ii. 111 Through all the Kingdomes that acknowledge Christ.
1611 BIBLE Wisd. xii. 27 They acknowledged him to be the true God, whome before they denyed to know. Prov. iii. 6 In all thy wayes acknowledge him, and he shall direct thy pathes.
1651 HOBBES Leviathan I. x. 43 He acknowledgeth the power which others acknowledge.
1762 GOLDSM. Cit. W. (1837) iv. 16 An Englishman is taught to acknowledge no other master than the laws which himself has contributed to enact.
1781 GIBBON Decl. & F. III. 65 The authority of Theodosius was cheerfully acknowledged by all the inhabitants of the Roman world.

If all this means that I'm not a stickler, but a pedant, so be it. I like to think that Jefferson would have taken the same view.

Posted by Mark Liberman at 08:11 AM

June 08, 2004

Those obstinant n's

I found Gene Buckley's example of inclimate weather delightful, partly because it reminds me of one of my favorite bedtime stories: a little paper written by Otto Jespersen in 1902, called "The nasal in nightingale" (Englische Studien 31, pp. 239-242). It seems that English speakers have been having a hard time with nasals before consonants for centuries—probably ever since the sound change that deleted [n] before consonants in weak positions, as in an tomato > a tomato, mine life > my life, in the > i' the, and so on. Certainly, the phonetic difference between [ɪŋkləmə̃ʔ] (with a nasalized vowel in the final syllable) and [ɪŋkləməʔ] (without) is so subtle as to be hardly noticeable in many instances—especially when the nasal [m] right before it is exerting its own nasalizing influence. In the case of inclement ~ inclimate, the tantalizing connection with climate breaks the tie. But what if there's no handy eggcorn to be had?

Interestingly, the usual trend seems to be to err on the side of caution, adding extra nasals "just in case." Jespersen points out that several English words have been altered in this way: nightigale > nightingale, passager > passenger, messager > messenger, etc. For many centuries, the country to the west of Spain was known as Portingall (Chaucer, Epilogue to the Nun's Priest's Tale: "Him nedeth nat his colour for to dyghen With brasile ne with greyn of Portyngale".) Other fun examples that used to be more popular than they are now include skelinton, milintary, and cementery. And this process is by no means dead: Google turns up lots of hits for things like dormintory, compensantory, exanctly, Ambercrombie and Fitch, and even celenbrate. Another stunning example was recently given to me by Jaye Padgett, who admitted that a family member of his says [ompən] for open.

Confusion among -ate ~ -ant pairs is even more prominate, since both are legitimant suffixes. I myself must confess to saying obstinant, inordinant and indiscriminant, with nasalization from the preceding nasal misinterpreted as belonging to the suffix. (I myself don't say fortunant or unfortunant, but a fair number of people seem to.) Even without the preceding nasal, though, folks often feel compelled to sneak in an extra one:

[link] "From the moment of walking in, they would see how lowfully inadequant it was."
(lowfully??? Someone's been listening to Tom Brokaw too much—this one merits a post all of its own!)
[link] "In protected harbours or areas of low water movement more delicant and faster growing seaweeds can dominate"
[link] "But still, I'm willing to trust the judgement of an illiterant, dyslexic, no hoper barman!"
[link] "Mrs. Murrary was very passionant and caring."
[link] "I'm Anglican, which is about as close as you can get to being Roman Catholic without having celebant priests"

There are, of course, some that go the other way round, too:

[link] "They probably reported a delinquate payment because of this girl's ridiculous incompetence"
[link] " Learn to write, pronounce, and recognize the consonates and vowels."

For the most part, however, substituting -ate for -ant/-ent has to be encouraged by folk etymology (as in inclimate) or by the existence of a related -ate verb:

[link] "IP is the dominate protocol in the internet today"
[link] "The benefits to the participates and sponsors include opportunities to learn about and influence formation of international standards..."

(A notable exception is the word elephate for elephant, which pops up a generous handful of times on Google)

One last aside: English isn't the only language that has had sporadic insertion of nasals: the etymology of Spanish manzana 'apple' is mattiana > maçana > mançana, and the use of sandwinch for sandwich seems to be at least as common in Romance languages as it is in English.

Posted by Adam Albright at 05:37 PM

Liberal gemination

Gene Buckley's example of gemination-swap in dissapointed (in response to these other posts on the topic) hits on what may be a general "principle" of English spelling: given a choice, doubled consonants prefer to come early in the word. This effect is seen most clearly in cases where conservation of geminates is violated: enemy is more often written ennemy than enemmy, accommodate shows up more often as accomodate than acommodate, and so on.

The following charts illustrate this principle, with raw Google hits. We see that when spellings fail to hit the correct target (in dark grey), they tend to slip down and to the left (first consonant doubled, second singleton).

enemy		C2
enemy		m	mm
C1	n	8,950,000	349
C1	nn	14,500	1

eradicate		C2
eradicate		d	dd
C1	r	706,000	13
C1	rr	6,910	1

accommodate		C2
accommodate		m	mm
C1	c	3,150	5,160
C1	cc	587,000	5,740,000

assassinate		C2
assassinate		s	ss
C1	s	7,090	617
C1	ss	13,600	243,000

commission		C2
commission		s	ss
C1	m	370,000	173,000
C1	mm	32,500,000*	50,500,000

*The hits for commision are wildly inflated here, because many sites are linked to with misspelled links, but don't actually contain the misspelling themselves. I imagine this is a problem for all of these counts, but it is especially dramatic here. (Thanks to Paula Aden for pointing out the reason behind this strange behavior to me.)
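The "down and to the left" pattern can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis: it copies the hit counts from the five charts above, sets aside the cell holding the correct spelling, and reports which misspelled cell has the most hits. The names CHARTS and most_common_misspelling are made up for illustration.

```python
# Google hit counts copied from the five charts above.
# Each key is (C1 doubled?, C2 doubled?); "correct" marks the standard spelling.
CHARTS = {
    "enemy": {
        "correct": (False, False),
        "hits": {(False, False): 8_950_000, (False, True): 349,
                 (True, False): 14_500, (True, True): 1},
    },
    "eradicate": {
        "correct": (False, False),
        "hits": {(False, False): 706_000, (False, True): 13,
                 (True, False): 6_910, (True, True): 1},
    },
    "accommodate": {
        "correct": (True, True),
        "hits": {(False, False): 3_150, (False, True): 5_160,
                 (True, False): 587_000, (True, True): 5_740_000},
    },
    "assassinate": {
        "correct": (True, True),
        "hits": {(False, False): 7_090, (False, True): 617,
                 (True, False): 13_600, (True, True): 243_000},
    },
    "commission": {
        "correct": (True, True),
        # 32,500,000 is the inflated "commision" count flagged in the note above.
        "hits": {(False, False): 370_000, (False, True): 173_000,
                 (True, False): 32_500_000, (True, True): 50_500_000},
    },
}


def most_common_misspelling(chart):
    """Return the (C1 doubled?, C2 doubled?) cell with the most hits,
    ignoring the cell that holds the correct spelling."""
    wrong = {cell: n for cell, n in chart["hits"].items()
             if cell != chart["correct"]}
    return max(wrong, key=wrong.get)


for word, chart in CHARTS.items():
    c1_doubled, c2_doubled = most_common_misspelling(chart)
    print(f"{word}: C1 {'doubled' if c1_doubled else 'single'}, "
          f"C2 {'doubled' if c2_doubled else 'single'}")
```

For all five words the winning misspelling is the cell with the first consonant doubled and the second single, though the commission count is subject to the caveat above.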

If the word has just the second consonant doubled, then "conservation of geminates" should conspire with "geminate early" to make the doubling shift to the first consonant. Indeed, this does often occur—though global degemination is also very common.

resurrection		C2
resurrection		r	rr
C1	s	65,800	2,680,000
C1	ss	98,700	7,900

recommend		C2
recommend		m	mm
C1	c	600,000	30,900,000
C1	cc	300,000	115,000

For words that already obey the "geminate early" rule, switching the doubling to the second consonant is certainly not unusual (this is the Karttunen > Kartunnen error). For many words, however, the most common misspelling seems to be to violate conservation of geminates, and write the word with no doubled consonants at all (upper left corner).

attitude		C2
attitude		t	tt
C1	t	409,000	2,820
C1	tt	9,670,000	2,160

imminent		C2
imminent		n	nn
C1	m	9,870	29
C1	mm	1,690,000	87

Another one of my favorites is mayonnaise, which is sometimes written mayonaisse (but more often mayonaise):

mayonnaise		C3
mayonnaise		s	ss
C2	n	94,000	7,330
C2	nn	663,000	886

Amusingly, gemination can even spread to the y: mayyonaise (42 hits) or mayyonnaise (20 hits); but no misspelling of this word is nearly as common as simply degeminating across the board (mayonaise), in violation of conservation of geminates.

The preference for degemination is also true of many of the other examples that have been discussed in previous posts. Dissapointed may show up 288,000 times on Google, but disapointed gets a whopping 1,360,000 hits. The same point is made by the Jennifer/Jenifer/Jeniffer data, in which Jenifer outnumbers Jeniffer by a ratio of 4:1. In fact, it's not clear that conservation of geminates is much of an effect at all, given that the same effect could be achieved by the independent forces of (1) global degemination (recommend > recomend), and (2) spontaneous first consonant gemination (enemy > ennemy, recomend > reccomend). There's something intuitively appealing about the idea (encouraged, perhaps, by the number of times that we hear utterances like "that's with 1 c and 2 m's"), but more data is needed. The particular consonants involved could be playing a role here, too.

Interestingly, this same overall pattern (general preference for degeminating, followed by a noticeable occurrence of "gemination swap") is also discussed by Badecker (1996) for dysgraphic patients. (See reference below)

Finally, it should be noted that some words genuinely do seem to obey the Karttunen generalization, with gemination preferentially switching from C1 to C2. One such word is cinnamon:

cinnamon		C2
cinnamon		m	mm
C1	n	17,600	149,000
C1	nn	2,010,000	645

The question of why cinnamon is different from mayonnaise is left as a matter for future research.

References:

Badecker, William (1996) Representational properties common to phonological and orthographic output systems. Lingua 99, 55-83.

Posted by Adam Albright at 05:54 AM

Genocide - present tense

A while back we talked about the New York Times' decision to acknowledge that the Armenian Holocaust is properly referred to as genocide. It took them 89 years. I guess better late than never. Other than a few Turkish heroes who defied their government to save Armenians, no one did anything. With the exception of Denmark, no government acted specifically to stop the Holocaust in World War II. Ten years ago in Rwanda, no one stopped the Tutsi Holocaust.

In the Darfur region of the Sudan the Holocaust is happening again, and no one is doing a damn thing to stop it. Sudanese Arabs are killing and driving out Africans. According to USAID Administrator Roger Winter (Testimony before the US House of Representatives, April 1, 2004):

The government is arming Arab militias to systematically attack civilians, while engaging in a policy of terror, murder, rape, and devastation. This is forcing a mass migration of hundreds of thousands in what amounts to an ethnic cleansing campaign.

The Janjaweed "militia", more accurately "death squads" or "bandits", controlled and supported by the Sudanese government, has made one million people homeless and killed at least 30,000. Estimates of the number who will soon die, as a result of violence or through hunger, thirst, and disease, range from 300,000 to 1,000,000. The Sudanese campaign is directed especially at women. This article in the Montreal Gazette describes the systematic mass rape and kidnapping of women carried out by the Janjaweed. A really comprehensive and detailed report on the situation is available from Human Rights Watch.

The European Union finally got around to issuing a mild declaration on May 26th. The Arab League managed to issue a press release expressing concern over "gross human rights abuses" though they tiptoed around the fact that the Sudanese government is responsible and blamed "inter-tribal warfare". UN Secretary General Kofi Annan has expressed concern, as have some other organizations and politicians, but nobody is actually doing anything. By the time they get around to issuing a strongly worded declaration, thousands more will be dead. The only way to deal with a rabid dog is to shoot it; talking about it doesn't help. What is needed is immediate military intervention.

Genocide is an obscene word that should be used only in reference to the past. We have the chance to bring that about. Let's do it.

[Update 2004/06/12: Pressure to take action is growing, but there is still no real action. There is a lot of information here.]

Posted by Bill Poser at 01:09 AM

June 07, 2004

Alan Turing

Today is the fiftieth anniversary of the death of Alan Turing, a pioneer of computer science who also made important contributions to formal language theory. Turing created the Turing Machine, the abstract machine which occupies the highest position in the Chomsky hierarchy. He also proposed the Turing Test for deciding whether a machine is intelligent.

In addition to his theoretical contributions, Turing was one of the cryptanalysts who worked at Bletchley Park between 1939 and 1945 decrypting intercepted Nazi communications. Together they are credited with changing the course of the war. Ironically, if he were alive today the United States military would not employ him: he was gay.

Posted by Bill Poser at 10:57 PM

People whose email reveals them to be gay

There must be something wrong with me. I just don't seem to be afraid of Gmail's supposed dire threat to my privacy. Google's plan for its Gmail service is that email content will be scanned automatically on the server so that possibly relevant ads can be placed in the margin when the messages are viewed. Imagine: you email me to say that surf's up and maybe we should wax up our boards and go catch some waves, and the Google text scanner does a keyword scan and decides to drop ads for O'Neill's Surf Shop and a wetsuit repair company in the margin so I see them while I'm reading your message. For my willingness to have these ads on my screen, I get a gigabyte of free storage. Seems like a great idea. But instead people are trying to bring lawsuits to have Gmail stopped, and I am supposed to be terrified of the dark threat of what Google, and maybe even The Government, might do to us. Have a look at the scenario (attributed to an anonymous hacker) that is described in a recent article by Annalee Newitz in Metro Santa Cruz, a free paper in the little paranoid town where I live. Imagine:

...an anti-gay group buys Gmail ads that are targeted at people whose email reveals them to be gay. When these gay people click through the targeted ads, they land on the anti-gay website, which allows the website owners to log their IP addresses — and since IP addresses are often traceable to real-world addresses, the anti-gay group could possibly use targeted Gmail ads to compile a hit list of gay people, complete with directions to their targets' homes.

It is particularly ludicrous for this alarmism to be published in a town as extraordinarily gay-friendly as Santa Cruz, but set that aside; perhaps in Alabama or east Texas gays would be tracked to their lairs via their IP addresses if only the pesky perverts could be identified. Also set aside the issue of Google's company policies (as Newitz goes on to say, Sergey Brin of Google points out that such targeting wouldn't be allowed by the company's policies anyway, even if it were feasible). Forget these issues. This is Language Log, and my concern is with the linguistic angle here: What the hell does Newitz mean by "people whose email reveals them to be gay"?

Remember, a machine is supposed to do the matching of ads with emails, and do it very fast. Is it really possible that Newitz, or the hacker she quotes, truly believes that an algorithm can determine from the text of an email whether or not the sender is gay?

Here's the text of two messages I've received in the past year, side by side, both from close personal friends of mine who are syntacticians. (Purely by coincidence, both use the smileyface ASCII emoticon in these random passages I scooped from my files.) The two are of different sexual orientations. Get to work on a perl script that will figure out which one is the homosexual:

I finally did send in an abstract. My cousins in Philadelphia dropped it off at the main post office and they claimed there that it would be delivered on Tuesday. We shall see... (whether it gets there on time, and whether it is accepted...;-) I'd like to work with you on the problem of the limits between adjectives and prepositions. I've been collecting examples of the use of PPs as the predicative complements of "seem" type verbs and also modification of P(P) by "very" and things like that. The data are interesting I think. I also have a student who wants to do a qualifying paper on extraposition from object. So, she's reading Postal and Pullum 88, and I'll also get back to you on that problem.

You are truly a prince to be willing to do this on such short notice, Geoff! I checked with the powers that be, and they will cut you a check for giving the talk. Should be enough for a car rental, gas, parking, plus a decent meal somewhere. We will rearrange our schedule to make room for you on Monday. If you want to come for the full 2 hrs, I'm sure the students would enjoy interacting with you. If you prefer to do only one hour, you can pick whether to do the first or second. We meet in building 160, room 127. That is to the left of the main entrance to the university. It's on the first floor, and it's a showcase high-tech classroom, with all kinds of futuristic equipment.

Reading that phrase "people whose email reveals them to be gay" reminded me once again that even quite intelligent members of the general public have absolutely no idea of what is known about language, what is possible and what is not. They believe both too much and too little. They'll believe things that are wildly and absurdly false (like that parrots or monkeys can hold intelligent conversations); yet they won't believe things that are uncontroversially true (like that genitive noun phrases are allowed to be antecedents of pronouns in English, or that African American Vernacular English is more than just Standard English spoken badly).

Surprisingly (in view of the fact that men are reputed to be from Mars and women from Venus), computer analysis of text can't even reliably tell male from female authors. How likely is it that textual analysis will be able to tell which sex would be a given author's preference for someone to cuddle up in bed with? If this is the sort of thing that is supposed to make me terrified that Gmail will destroy my privacy, then I'm sorry, I just don't seem to be able to muster the expected level of terror.

Posted by Geoffrey K. Pullum at 09:29 PM

Oxymoronic Titles

My all-time favorite combination of titles is one I came across on the title page of a Rumanian medical monograph published in the 1960s, during the Communist period. I no longer remember the exact Rumanian words, but they amounted to:

Comrade Doctor Professor Academician
"comrade" since we are all equal, followed by three academic titles, since some are more equal than others.

Posted by Bill Poser at 06:47 PM

Dear Dr Geoff

The journal Behavioral and Brain Sciences (BBS) publishes papers with accompanying peer commentary from other scholars. To acquaint potential commentators with new papers proposed for publication it mails abstracts out to thousands of scientists around the world. The automated system that does this has some kind of error in either the database or the program, because it picks out the wrong part of my name. My automatically-generated message always begins with Dear Dr Geoff. It's funny how annoying this tiny slip is.

For some reason, titular prefixes like Dr and Professor go only with a surname (what Americans unwisely call a "last name" — i.e., the bit that comes first in Chinese and many other languages). The Sir to which knights are entitled, on the other hand, goes with the forename (the one Americans unwisely refer to as the first name, the one that is last if you're Chinese, and the British usually refer to, just as unwisely, as the Christian name).

The situation with Lord is much more complex, and I got it wrong here in the first version of this post. (I'm very grateful to Harold Hungerford and Jesse Sheidlower for helping me to get it right. For a really full and detailed technical discussion of correct forms of address in British English, click here. As Jesse points out, this is probably the only document you will ever see in which one of the subheadings reads "Marquess's Eldest Son's Daughter". I'll summarize only part, and very briefly.) There are two kinds of lord in the British peerage system: those who get the title as a courtesy (by being the younger son of a duke or a marquis), and those who actually get it conferred on them (whether by inheritance or direct conferring). A courtesy lord like the fictional Lord Peter Wimsey can be addressed by Lord plus first name; hence the Dorothy Sayers title Lord Peter Views the Body. But an actual recipient of a peerage is addressed by Lord plus whatever name he chooses at the time of receiving the status. (If it's a hereditary title, the lord's eldest son inherits both title and name.)

Let's take as an example the most distinguished and honored grammarian alive, the former Randolph Quirk, who rose through the ranks from ordinary Mr through doctor of philosophy, professor, vice chancellor, knight, and lord. From Mr Quirk he became Dr Quirk and Professor Quirk and Vice Chancellor Quirk. Then he was knighted, and became Sir Randolph Quirk, and at that time it would have been acceptable to address him in a letter with Dear Sir Randolph. Later he was elevated to a peerage (he was made a baron, in fact), and he chose to be known by the name Quirk. So now his legal signature is just "Quirk", and he is officially referred to as Professor the Lord Quirk. A shorter form would be Lord Quirk. Close buddies of his still call him Randolph, of course, and he can still publish as Randolph Quirk, but that isn't his legal name any more. His legal name is the new one he has been given by the Queen: he is the Lord Quirk.

In consequence, *Lord Randolph is not grammatical (or at least, not in accord with the social conventions, given that Randolph is classified as a forename) as a reference to Lord Quirk. (Many people make mistakes on this point, as did I.) Likewise, *Sir Quirk would have been completely ungrammatical as a reference to him when he was a knight; and back in the days when he was a mere PhD and Professor and Vice Chancellor of the University of London, *Dr Randolph and *Professor Randolph and *Vice Chancellor Randolph were completely ungrammatical as references to him (the correct forms would have been Dr Quirk, Professor Quirk, and Vice Chancellor Quirk).

There's no particular logic behind any of this; these are arbitrary facts of English grammar, like the fact that "Professor" and "Dr" can never co-occur with "Mr" (*Mr Professor Pullum is utterly ungrammatical in English, whereas the equivalent Herr Professor Pullum is grammatical in German).

Every time I see a message from BBS beginning "Dear Dr Geoff" it seems like a message from a very inexperienced foreign student who is trying to figure out the syntax of names and titles in English by guesswork. I want to reach out and help the poor student by explaining the syntax of Dr. But of course there's nobody back there, just an automated system incorporating an unimportant error in programming or data entry.
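For the curious, the rule the automated system would need is small. Here is a minimal Python sketch of it, covering only the titles whose behavior is described above (Dr and Professor take the surname, Sir takes the forename). Nothing is known here about how the BBS mailer actually works; the function name salutation and the two title sets are hypothetical, and Lord is deliberately left out because the name it takes is chosen by the peer and cannot be computed from forename and surname.

```python
# Titles that attach to the surname versus the forename, as described above.
# These sets are illustrative, not a complete list of English titles.
SURNAME_TITLES = {"Dr", "Professor"}
FORENAME_TITLES = {"Sir"}


def salutation(title, forename, surname):
    """Build a letter opening that puts the right part of the name after the title."""
    if title in SURNAME_TITLES:
        name = surname
    elif title in FORENAME_TITLES:
        name = forename
    else:
        # Refuse to guess for titles whose rule is not recorded here (e.g. Lord).
        raise ValueError(f"no rule recorded for title {title!r}")
    return f"Dear {title} {name}"


print(salutation("Dr", "Geoff", "Pullum"))      # Dear Dr Pullum
print(salutation("Sir", "Randolph", "Quirk"))   # Dear Sir Randolph
```

A system that produces "Dear Dr Geoff" is presumably passing the forename field where the surname belongs, or has the two swapped in its database.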

Posted by Geoffrey K. Pullum at 06:26 PM

Conservation of gemination: another example

Gene Buckley emailed:

In another good example of orthographic gemination, I was just reading something that contained the well-attested spelling "dissapointed", which also seems to partake of the feeling of a zero-sum transfer of doubling.

Google has 298,000 for "dissapointed", more than twice the 130,000 for "dissappointed" with two doubled letters. The correct spelling has over 4 million.

Gene added that

Interestingly, "apointed" is quite uncommon (6,850), supporting the view that the doubled "ss" in the longer word plays a crucial role.

and ended by asking

I wonder how rare a hit has to be in order to be statistically indistinguishable from a recurrent typo, rather than an intended (though erroneous) spelling.

I'm not sure that this dichotomy is really dichotomous. I'm sure that there are some cases where a wrong spelling is just a careless slip of the fingers (or the brain), which the perpetrator knows well to be wrong and recognizes immediately in proofreading. At the other end of the continuum are stubbornly held wrong opinions about how something should be spelled. But in between there are many gradations, and in fact several dimensions of error with many gradations each.

Gene's note refers back to previous Language Log posts here, here, here and (originally) here.

Posted by Mark Liberman at 05:09 PM

June 06, 2004

Psycholinguistics in the logging industry

A little while ago, Sally Thomason's interest in sustainable forestry turned up this quote:

Mammal sightings: black bear, bobcat, cougar, white-tailed deer, mule deer, elk, chipmunk, ground squirrel, flying squirrel, snowshoe hare, mice, and vole

which led her to wonder why "the zero-plural-for-game-animals usage in this list" doesn't apply to the mice. Mice aren't game animals, but as Sally points out, neither are voles and chipmunks (though in Never Cry Wolf, Farley Mowat claims to have survived for a time by hunting small rodents and eating them whole, in order to prove a point; but I digress).

Sally insightfully suggested that the mice/vole business might have something to do with the morphology of the non-head elements of English compound nouns:

My first thought was that the asymmetry between the plural form mice and the singular forms vole, etc., must have basically the same motivation as the parallel asymmetry in certain types of compounds, discussed, I think, by Peter Gordon: mice-eater but rat-eater, with *rats-eater impossible.

But maybe not, because as I recall (and I might be misremembering), Gordon's explanation for the asymmetry in compounds had to do with late vs. early plural formation, depending on whether it was the default regular plural (as in rats) or an irregular plural like mice; and that wouldn't be relevant for the list in the logging book. I also don't think the asymmetry is an idiosyncrasy of this author, because it sounds fine to me, and replacing mice with mouse doesn't. So what governs this pattern in the list of mammal sightings?

The same issue about noun compounds came up last winter, as Geoff Pullum and I exchanged some thoughts on the rigors of fieldwork and the distribution of activities centers in Las Vegas, Santa Cruz and Chapel Hill NC.

I was a bit surprised not to have gotten more email about those activities centers, because the (near) prison riot over strict transitivity that Sally described in another post is a tea party in comparison to what happens when you toss this topic into a roomful of psycholinguists. A clear presentation of one side of the controversy is Haskell, T.R., MacDonald, M.C., & Seidenberg, M.S. "Language learning and innateness: Some implications of compounds research". Cognitive Psychology, 47, 119-163. (2003):

In noun compounds in English, the modifying noun may be singular (mouse-eater) or an irregularly inflected plural (mice-eater), but regularly inflected plurals are dispreferred (*rats-eater). This phenomenon has been taken as strong evidence for dual-mechanism theories of lexical representations, which hold that regular (rule-governed) and irregular (exception) items are generated by qualitatively different and innately specified mechanisms. Using corpus analyses, behavioral studies, and computational modeling, we show that the rule-versus-exceptions approach makes a number of incorrect predictions. We propose a new account in which the acceptability of modifiers is determined by a constraint satisfaction process modulated by semantic, phonological, and other factors. The constraints are acquired by the child via general purpose learning algorithms, based on noun compounds and other constructions in the input. The account obviates the regular/irregular dichotomy while simultaneously providing a superior account of the data.

I don't recall that either side of this controversy has addressed the "game animal plural" facts that Sally has pointed out, but I'm not sure, and I can't easily check because I'm traveling, with limited time and limited internet access, and the relevant books and papers are back at Penn. However, I think that Sally is right to connect these two phenomena.

One of the results featured in the Haskell et al. paper (cited above) is that various semi-regular plurals (like wolf-wolves) are intermediate in acceptability as the first element of compounds -- so compounds like wolves-eater are rated as somewhat better than compounds like rats-eater, but somewhat worse than those with singular first elements (like rat-eater or wolf-eater), and (presumably) also a bit worse than mice-eater. Putting this together with gradient results from a corpus study, they argue that the facts are against dichotomous theories (like level-ordering or theories based on a regular/irregular dichotomy).

One could do an analogous experiment on the "game animal" zero plural phenomenon: if we were to add canis lupus to the Mammal sightings list, how do we feel about wolf vs. wolves, in comparison (say) to squirrel vs. squirrels? Haskell et al. would predict (I think) that wolves fits better than squirrels but not as well as mice.

For me, I think it depends on where in the list it comes -- near bear and cougar, I think I'd tend to say wolf, but near squirrel and vole, I'd be happy to say wolves. I think. My confidence in this judgment is somewhere between small and tiny.

Google tells us that "hunting wolves" is unexpectedly common, relative to the analogous patterns with other game-like animals. On the other hand, "hunting mice" is unexpectedly rare. Overall, it's not clear, in explaining these data, that the regularity of the plural formation is a major influence:

"hunting" + noun	singular	plural	ratio (singular/plural)
wolf/wolves	862	1,980	0.44
tiger/tigers	933	1,650	0.57
squirrel/squirrels	1,170	1,990	0.59
mouse/mice	995	1,000	0.99
musk ox/musk oxen	68	61	1.12
fox/foxes	3,050	2,320	1.31
bear/bears	9,880	7,040	1.40
cougar/cougars	750	398	1.88
elk/elks	12,800	36	356
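The ratio column is just the singular count divided by the plural count. A minimal Python sketch that recomputes it from the counts in the table and sorts the rows the same way follows; the names HITS and ratios are made up for illustration, and a recomputed value may differ from the printed column in the last digit because of rounding.

```python
# Google hit counts for "hunting <noun>", copied from the table above:
# pair -> (hits with the singular form, hits with the plural form).
HITS = {
    "wolf/wolves": (862, 1_980),
    "tiger/tigers": (933, 1_650),
    "squirrel/squirrels": (1_170, 1_990),
    "mouse/mice": (995, 1_000),
    "musk ox/musk oxen": (68, 61),
    "fox/foxes": (3_050, 2_320),
    "bear/bears": (9_880, 7_040),
    "cougar/cougars": (750, 398),
    "elk/elks": (12_800, 36),
}

# A ratio above 1 means the bare singular ("hunting elk") is the more common form.
ratios = {pair: singular / plural for pair, (singular, plural) in HITS.items()}

# Print the rows from most plural-favoring to most singular-favoring.
for pair, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{pair:<20} {ratio:8.2f}")
```

Sorting by the ratio puts wolf/wolves at one end and elk/elks at the other, with the irregular mouse/mice and musk ox/musk oxen in the middle rather than grouped together at either extreme.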

Posted by Mark Liberman at 10:10 PM

More on Spelling Reform

There's an interesting new blog called Via de Argilla, whose author states his or her intentions thus:

In iste blog, io intende de scriber super linguas, de traducer novas super linguas in interlingua, o de contar altere cosas varie que me sembla interessante.

[In this blog, I intend to write about languages, to translate news about languages into Interlingua, and to talk about various other things that seem interesting to me.]

The main language of the blog is Interlingua though most if not all of it is also available in Dutch. I've never studied Interlingua, but I find that I have no difficulty reading it. It's an artificial Romance language, like Esperanto, but seems to me rather like Italian. The author doesn't care for Esperanto though; at the top of the current page you'll find the author's argument for the superiority of Interlingua over Esperanto. Among other things, he says:

The raison d'être of Esperanto is stubborn ideology.
The raison d'être of Interlingua is living language.

What brought me to Via de Argilla was the discussion of spelling reform, which among other things links to my recent post. There are also links to Onze Taal (Our Language), an interesting Dutch website (in Dutch), an article about resistance to German spelling reform (in German), an article about a protest at the US national spelling bee entitled Protesters decry English language's illogical spelling and the author's previous piece on spelling reform.

The author doesn't much care for spelling reform. His comment on the protest at the US national spelling bee is:

Funny that people often claim that something is illogical although they really mean that they are not intelligent enough to understand its logic.

It's true that people sometimes criticize aspects of the writing system that do in fact make sense, but I also have to say that there are things that make sense that are nonetheless undesirable. For instance, it is much easier to write if any given sound is written only one way. There are factors that may counterbalance this, but other things being equal, it's desirable. So it was a good thing to eliminate the historical spelling of some [o:]s in Japanese as [amu]. When such historical spellings were in use, you couldn't tell how to write [o:] just from the sound. You had to know how that sound was spelled in particular words and grammatical forms. That made it more work to learn to write. Similarly, writing knight with a k and night without one requires extra memorization in modern English since they have the same pronunciation. Historical spellings in both Japanese and English do indeed have a logic to them, but it is the logic of history. The fact is that such spellings impose a considerable extra burden on those learning to read and write, and other things being equal, it is best to reduce that burden.

Via de Argilla's argument against spelling reform begins with the claim that reformers have a false view of languages, namely that writing is simply a means of recording speech. This is true to an extent. Written language is often a special register, distinct from most styles of speech, and the information conveyed by writing and by speech is not identical. He mentions, for instance, that writing does not convey intonation, which is generally true, though some writing systems have devices that convey some aspects of intonation. Writing may also convey information not present in speech. In English, for example, the words knight and night are distinguished in writing but not in speech. Similarly, in Mandarin Chinese he, she, and it are distinguished in writing but not in speech. At the same time, all known written languages are based on spoken languages, and although there are little distinctions like those mentioned, basically it is true that when people write they are trying to record, albeit in slightly modified form, what they have to say. Via de Argilla is right that written language often preserves etymological and morphological facts about the vocabulary that are lost in pronunciation. The question is, is it worth doing this if it makes it difficult for people to write? If you're trying to promote mass literacy, you want to minimize the unnecessary differences between spoken and written language. The history of the language is an additional, less important segment of knowledge, one that can perfectly well be taught separately from literacy. Indeed, historical spelling only contains a fraction of the information one needs in order to understand linguistic history. Rather than tormenting beginning readers and writers with historicizing spellings, it would be better to let them learn to read and write easily and at a later point in their education teach them some real historical linguistics.

Via de Argilla's other argument is that historical spelling alleviates the effects of linguistic variation. If writing closely reflects pronunciation, and if pronunciation varies, we either have to get everyone to use a single standard as the written form, or we will have many different written forms. This is true, up to a point. Historical spellings do sometimes provide a common spelling that can be given a different pronunciation in different dialects. The error in this argument is that only certain kinds of linguistic changes, and therefore only certain differences among dialects of a language, work this way. Where dialects differ in what words they use, it doesn't do any good to use a historical spelling. The words soda, pop, and coke would not be written the same way in the spelling of any stage of English. Similarly, no choice of spelling can eliminate differences in word-formation or in syntax. Indeed, only certain differences in pronunciation can be glossed over by historical spelling.

It is also important to note that some historical spellings do not serve to unify dialects. For instance, there is to my knowledge no variety of English that preserves [k] in word-initial [kn] clusters. Providing a unified spelling for English dialects therefore does not motivate writing an initial k in knight.

It's true that spelling reform requires care and that creating a good writing system is not as simple as choosing one letter per sound, but the English spelling system is such a baroque mess that it creates real problems for those learning to read and write. Functional illiteracy is a major problem in the United States and other English-speaking countries. This is due in part to the idiotic methods used to teach reading and writing, but the task is made much more difficult by the spelling system.

Posted by Bill Poser at 08:50 PM

Improved technology: On envelopes and WordPerfect

I just mailed a paper to a friend of mine in Australia, using one of a new batch of very sturdy and light large white envelopes made out of some plasticky stuff that will never tear. It was a new and improved product, it said on the back by the flap. I should have been instantly alarmed, knowing that improved products are nearly always worse than the pre-improvement version in at least some ways. But like a credulous fool I turned it over and began writing out the address. Little did I know...

The main improved feature was that a very heavy and robust imitation thread pattern had been introduced into the plasticky fabric. Doubtless they assumed that everyone would be using self-adhesive labels, not writing on the envelope the way I do (either to save stationery costs or because I can't find my labels). But writing on the envelope was like writing on a plastic tablecloth under which someone had inadvertently left a piece of that wire netting gardeners use to protect plant roots from gophers. The simulated threads were so robust that my ballpoint pen jumped off each one in a direction that could not be predicted to a tolerance of less than about a millimeter. It did this about eight times per inch. The result was that my handwriting looked like that of a person with moderately severe dementia. Andrew will wonder what happened to age me so fast and make my hand so trembly. But I was not too amazed to discover this disastrous feature of the new product. As I believe I may have said before, every upgrade is a downgrade.

Well, nearly every upgrade, anyway. Listen, I'm a cruel man, but fair, and I have to admit something to you. My earlier post on technology grumbled that WordPerfect had been getting worse rather than better: each new version I had experience with since about version 5.1 had either lost some useful function or added new bugs or both. Nonetheless, I still have to use WordPerfect every day, whatever the problems with it, because of collaborations with colleagues who use it. So a few weeks ago, when I noticed a real bargain price on WordPerfect 11, I decided (against my inner promptings) to acquire it (there were hopes that it might have enhanced capabilities in such matters as writing in HTML format).

The bargain price goes along with a side story, which, being utterly undisciplined, I will digress to tell. It was from a company that at first appeared to be doing something quite illegal, because the CD-ROM that came out of the slim envelope they sent me said plainly on the label that if I had not purchased it "in conjunction with original equipment" there could be "civil and criminal penalties". I was about to call Corel (the owners of WordPerfect) and the Santa Cruz police, sign a power of attorney for Barbara's use while I was inside, and place myself under citizen's arrest to wait for the arrival of the software authorities who would come in a van and take me away, when I somehow happened to notice that the slim envelope was not quite empty. I shook it, and out slid a small hardware device. Two plastic sockets connected to each other by four colored wires. I think it is a connector for internal disk drives. Original equipment! I was safe from prosecution: my WordPerfect 11 had been purchased in conjunction with original computer equipment, and I was morally entitled to the price I got (a discount of about 80% — and why not, since version 11 was by this time so long in the tooth that version 12 has now been released).

So instead of turning myself in to the software police I installed WordPerfect 11. The install was rapid and efficient, and the program started right up and worked. (Believe me, with some Windows software it has taken an hour or two to get that far.) And since I did all that whining about previous versions in the earlier post, let me now tell you this: it's a beautiful product. It's better than earlier versions in various ways, I'm pleased with it, I'm using it all the time, and the final hard copy of the book I'm writing with Rodney Huddleston will be produced with it.

Now, don't misunderstand, I'm not saying things are totally great. There are numerous little stupidnesses in the redesign of some of the editing menus that do illustrate my slogan (did I mention it already?) that every upgrade is a downgrade; for example, it used to be possible to put a word in superscript with six swift keystrokes, and now it isn't, you have to leave the keyboard at one point and reach for the mouse to click on a button and make a menu choice, which slows the operation down by about 300%. I have also had two or three WordPerfect files that arrived here from Australia, where they were created with version 6 for DOS, that it could not open at all (it repeatedly hogged the entire CPU load and then froze). And more seriously, if this program was an SUV rather than a word processor, I'd be dead from fatal multiple rollover crashes, because it has bombed many times, often badly enough to trigger Windows XP's sending a message to Microsoft about the error, and once badly enough that on recovery it automatically sent its own message direct to Corel about its state of ill health.

You may be puzzled to hear this litany of complaints about a program that I seemed actually to be praising in the paragraph before; I give with one hand and take away with the other. But you don't understand: what I have described is good behavior compared to the usual utter shit one takes from Windows programs! WordPerfect 11 often goes whole days with no crash (I seldom had such a day with WordPerfect 6.1 on Windows 95). I have several times been able to recover from a WordPerfect crash without having to reboot the entire machine (that was never possible with earlier versions of Windows).

I'm just trying to give credit where it's due. WordPerfect 11 is a vastly better product than Microsoft Word: faster, slicker, more convenient for the ordinary user, much better endowed with keyboard shortcuts, and above all, much more transparent and accessible for the expert user (there has been no change to that crucial Reveal Codes feature, which enables you to know exactly what invisible format-encoding control characters are in your document and exactly where they are). Living with Windows is still a purgatory of frustration and abuse, but such things are relative: compared to the alternative, i.e., compared to Word, I would recommend WordPerfect 11 for anyone who needs high-quality WYSIWYG word processing. I only wish it were available for Linux and Mac operating systems: it used to be, but not any more, so to use it you have to endure the nightmare of Windows and thus do business with the evil wizards of Redmond. But for the millions who have to live with Windows anyway, WordPerfect 11 (or 12) is a great choice for WYSIWYG word processing needs, and I wouldn't want anyone to think that my hostility to premature forward steps in technology that are really backward steps had blinded me to that fact.

[I own no stock in Corel, nor in any subsidiary or supersidiary of theirs, nor do I know anyone who works for them. This is not a sponsored announcement, it's a disinterested critical opinion from a customer.]

Posted by Geoffrey K. Pullum at 03:43 PM

Great communication

Among the kind things being said around the world this morning about the late President Ronald Reagan, the truest concern his linguistic skill with English. The excerpts from his speeches replayed on National Public Radio reminded me once again that he was truly a master of spoken presentation. In fact, for relaxed and comfortable yet crystal-sharp delivery of political speeches, to either an audience or a TV camera, he may have been the best there ever was. Direct and personal, scarcely ever a hesitation or a flub. And people who knew him well report that this was not just a matter of being a TelePrompter-driven automaton: he displayed similar skills when winging it with no script at all. The "great communicator" sobriquet was well deserved.

Posted by Geoffrey K. Pullum at 11:34 AM

Stoics

A new blog, the stoa consortium, from an established outfit.

[via The Adhumlan Conspiracy]

Posted by Mark Liberman at 08:27 AM

June 05, 2004

Wait, wait, don't spell me

Let me say one other thing about the much-admired NPR newsreader Carl Kasell, whose unusually open long o vowel sound I recently discussed elsewhere: I don't know how often people spell his name correctly, but it strikes me as extraordinarily difficult. I got it wrong many times when writing about him. The problem is... Well, if you have any difficulties with spelling at all, ever, don't read on, because what I'm going to say now will really screw you up forever.

The problem is the same one (whatever it might be) that I have with correctly naming the fine Swedish mystery author Hannell Menking (I may have that name slightly wrong). The problem in Kasell's case is that one could plausibly imagine that any of at least the following list of spellings might be correct (I'm assuming American pronunciation here, so spellings like Castle are relevant):

Carl Casel	Carl Casell	Carl Cassel	Carl Cassell
Carl Kasel	Carl Kasell	Carl Kassel	Carl Kassell
Karl Casel	Karl Casell	Karl Cassel	Karl Cassell
Karl Kasel	Karl Kasell	Karl Kassel	Karl Kassell
Carl Castle	Karl Castle	Carl Kastle	Karl Kastle

Several parts of several of these have some foundation from either English or German personal or place names (Carl Sagan, Karl Rove, Howard Cosell, Cassell publishers, the city of Kassel...). And there are just too many to choose from that would all sound exactly the same. And since English will never have a spelling reform, we are just stuck.
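For anyone who wants to check the size of the problem, here is a minimal sketch that enumerates the spellings in the table above. It assumes the pattern visible in the table (Carl or Karl, an initial C or K, a single or double s, a single or double l, plus the Castle/Kastle variants); the function name `kasell_spellings` is made up for this illustration.

```python
from itertools import product

def kasell_spellings():
    """Enumerate the candidate spellings following the pattern in the table."""
    spellings = []
    # Regular variants: Carl/Karl + C/K + a + s/ss + e + l/ll
    for first, initial, s, l in product(("Carl", "Karl"), "CK", ("s", "ss"), ("l", "ll")):
        spellings.append(f"{first} {initial}a{s}e{l}")
    # The "Castle" variants, which sound the same in American pronunciation
    for first, initial in product(("Carl", "Karl"), "CK"):
        spellings.append(f"{first} {initial}astle")
    return spellings

if __name__ == "__main__":
    names = kasell_spellings()
    print(len(names))
    print("Carl Kasell" in names)
```

Sixteen regular variants plus four Castle variants give the twenty entries in the table.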

Sorry. I did warn you. But now you'll never again be able to spell the poor man's name correctly.

Posted by Geoffrey K. Pullum at 08:48 PM

The 256 names of... that Swedish guy

I'm afraid I have to offer readers of Language Log an apology. In an earlier post I admitted to being unable to remember the correct name of an excellent Swedish mystery writer named Hanking Mennell. Or that might be Manking Hennall (sp?). I can't remember. The apology is because I said that in my view there were 48 possibilities for Hanning's name. But recently I checked for one of them (Hennall Menking) to see if it was correct, and it wasn't on my list! The algorithm that I used to generate the list was wrong (I'm really not a good programmer). So I started again from scratch, and I am now confident that I have it right. There are a frightening 256 plausible-looking possibilities for this man's name.

Here's the list for reference. The correct name is somewhere in this table. Do look him up. (Sorry I can't help, but since I glanced at the table I've completely forgotten, again, which one is the correct name, so you're on your own):

Hankall Manking	Henkall Manking	Mankall Hanking	Menkall Hanking
Hankall Manning	Henkall Manning	Mankall Hanning	Menkall Hanning
Hankall Menking	Henkall Menking	Mankall Henking	Menkall Henking
Hankall Menning	Henkall Menning	Mankall Henning	Menkall Henning
Hankell Manking	Henkell Manking	Mankell Hanking	Menkell Hanking
Hankell Manning	Henkell Manning	Mankell Hanning	Menkell Hanning
Hankell Menking	Henkell Menking	Mankell Henking	Menkell Henking
Hankell Menning	Henkell Menning	Mankell Henning	Menkell Henning
Hanking Mankall	Henking Mankall	Manking Hankall	Menking Hankall
Hanking Mankell	Henking Mankell	Manking Hankell	Menking Hankell
Hanking Mannall	Henking Mannall	Manking Hannall	Menking Hannall
Hanking Mannell	Henking Mannell	Manking Hannell	Menking Hannell
Hanking Menkall	Henking Menkall	Manking Henkall	Menking Henkall
Hanking Menkell	Henking Menkell	Manking Henkell	Menking Henkell
Hanking Mennall	Henking Mennall	Manking Hennall	Menking Hennall
Hanking Mennell	Henking Mennell	Manking Hennell	Menking Hennell
Hannall Manking	Hennall Manking	Mannall Hanking	Mennall Hanking
Hannall Manning	Hennall Manning	Mannall Hanning	Mennall Hanning
Hannall Menking	Hennall Menking	Mannall Henking	Mennall Henking
Hannall Menning	Hennall Menning	Mannall Henning	Mennall Henning
Hannell Manking	Hennell Manking	Mannell Hanking	Mennell Hanking
Hannell Manning	Hennell Manning	Mannell Hanning	Mennell Hanning
Hannell Menking	Hennell Menking	Mannell Henking	Mennell Henking
Hannell Menning	Hennell Menning	Mannell Henning	Mennell Henning
Hanning Mankall	Henning Mankall	Manning Hankall	Menning Hankall
Hanning Mankell	Henning Mankell	Manning Hankell	Menning Hankell
Hanning Mannall	Henning Mannall	Manning Hannall	Menning Hannall
Hanning Mannell	Henning Mannell	Manning Hannell	Menning Hannell
Hanning Menkall	Henning Menkall	Manning Henkall	Menning Henkall
Hanning Menkell	Henning Menkell	Manning Henkell	Menning Henkell
Hanning Mennall	Henning Mennall	Manning Hennall	Menning Hennall
Hanning Mennell	Henning Mennell	Manning Hennell	Menning Hennell
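
For the record, here is a minimal sketch of one way to generate the entries in the table above. It is an assumption about the pattern, read off the table itself, not the program actually used: one name begins with H and the other with M, each has the vowel a or e followed by n and then k or n, and exactly one of the two ends in -ing while the other ends in -all or -ell. The function name `candidate_names` is made up for this illustration.

```python
from itertools import product

def candidate_names():
    """Enumerate two-word names following the pattern seen in the table."""
    names = set()
    for first_initial, second_initial in (("H", "M"), ("M", "H")):
        for v1, v2, c1, c2, ending, ing_first in product(
            "ae", "ae", "kn", "kn", ("all", "ell"), (True, False)
        ):
            # Exactly one of the two words ends in -ing; the other in -all/-ell.
            suffix1, suffix2 = ("ing", ending) if ing_first else (ending, "ing")
            first = f"{first_initial}{v1}n{c1}{suffix1}"
            second = f"{second_initial}{v2}n{c2}{suffix2}"
            names.add(f"{first} {second}")
    return sorted(names)

if __name__ == "__main__":
    names = candidate_names()
    print(len(names))
    print("Hennall Menking" in names)
```

Under these assumptions there are seven independent two-way choices, so the sketch produces the 128 combinations that appear in the table as shown here, including the Hennall Menking that the earlier list missed.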

Posted by Geoffrey K. Pullum at 05:52 PM

"Pro-Life" vs. "Anti-Abortion"

This is at third or fourth hand, depending on how you count, but it seems worth spreading around. It's an item that appears on p. 24 of the May 10, 2004, issue of High Country News, in the "heard around the West" section (no, that lower-case h is not a typo):

Political correctness can sometimes get out of hand. The Tucson Weekly reports that a perfect example occurred at The Los Angeles Times, when a reviewer described a 19th-century opera by Richard Strauss as "pro-life". The reviewer meant that it celebrated and affirmed life. An over-zealous editor followed the newspaper style-book's recommendation, and changed "pro-life" to "anti-abortion", giving the opera a whole new theme.

I assume that the perpetrator really was a live editor rather than a not-quite-clever-enough set of programmed instructions designed to keep the term "pro-life" out of the newspaper. But it's a bit hard to believe that the reviewer didn't realize that her/his loaded term would fail to convey the intended meaning to many readers (unless readers of opera reviews are really, really different from us ordinary mortals).

Posted by Sally Thomason at 05:20 PM

Duck Dialects Mimic Human Dialects?

The Guardian reports that ducks in Cornwall have different accents from those in London. According to Victoria de Rijke, Lecturer in English at Middlesex University:

The Cornish ducks made longer and more relaxed sounds, much more chilled out. The cockney quack is like a shout and a laugh, whereas the Cornish ducks sound more like they are giggling.

Victoria de Rijke, by the way, is the leader of the Quack Project, which has made recordings of children from different linguistic backgrounds (Arabic, Tamil, Vietnamese, etc.) illustrating the way in which the sounds of common animals are mimicked in their language.

The story was picked up by Agence France-Presse and also appears in the Australian News in Science and the South African IOL. However, this version contains the somewhat implausible suggestion that:

The result was that the ducks' "accents" mimicked those of the humans in their home region.

That isn't what Dr. De Rijke said. If you read through the direct quotes in the story in The Guardian, what she said is that the London ducks "speak" as they do because, in contrast to the Cornish ducks, they have to make themselves heard over the noise of city life. She then compared this to the difference between the Cockney and Cornish accents in English. Her thesis is that both ducks and humans are responding to differences in their environment. I'd need more evidence to persuade me that this is true of the differences in English, but it is very different from the hypothesis that the ducks have mimicked human regional accents. She didn't say that; Agence France-Presse did.

The problem of attributional abduction has been discussed a number of times here on Language Log. When a news item attributes to someone a view that we find strange, we need to be careful not to rush to attribute it to the putative source; in practice the strangeness is often introduced by a reporter or editor. In this case we can be sure that the error was introduced by Agence France-Presse because we know which version is the original and which the derivative. The journalist may have misunderstood, but it is possible that the problem here is a poor choice of words, namely mimicked, which implies copying, rather than something like paralleled, which does not.

Posted by Bill Poser at 12:02 PM

Barrister Idem Idem

I don't usually pay attention to spam, but this caught my eye:

I am Barrister Idem Idem. I am the Personal Attorney to Engr Robert Fitzpatrick, who worked for a Multinational company in Nigeria. On the 31st of oct 1999, my client,and his wife were involved in a aircrash on board Egyptain boeing 767...

I bet his brother is Ibid Idem.

Seriously, Idem is a reasonable Nigerian name, as in the case of this (eminently respectable) chemical engineer, but I wonder whether the string's frequency in legal documents had some role here...

Posted by Mark Liberman at 10:54 AM

What do wine tasting notes communicate?

Following up on our earlier discussion of winetalk, I did a PsycInfo search on "wine". I didn't come up with much, to my surprise, but there were a few interesting things. One was Frédéric Brochet and Denis Dubourdieu, "Wine Descriptive Language Supports Cognitive Specificity of Chemical Senses", Brain and Language, Volume 77, Issue 2, May 2001, Pages 187-196. The authors analyzed several large corpora of tasting notes, in both French and English, using a method that factored the vocabulary into "classes" based on co-occurrence patterns, and concluded that

(1) Class number and organization are different among experts so that each expert has his own discourse strategy. (2) Wine language is based on prototypes and not on detailed analytical description. (3) Prototypes include not only sensory but also idealistic and hedonistic information.

Most striking for me was how strongly Brochet and Dubourdieu claim a sort of "indeterminacy of radical translation" for winetalk, based not only on the idea that different individuals connect words to experience differently, but also on the idea that the underlying sensory apparatus is not really shared:

...the number of classes and their nature are broadly different among subjects. Lawless (1984) demonstrated that experts were not significantly able to recognize wines based on a description given by others, even when they were experts. Given the small number of terms common to the several authors studied here, it seems clear that wine descriptions are deeply individual and that they make sense mainly to the taster him- or herself. The results confirm that a consensual language for the description of wine does not exist and that only "individual" languages appear in published works. The analysis of the compiled corpus showed only convergence through color. Category divergence was confirmed by Berglund (1973), who demonstrated with basic odorants that flavor categories do not exist at an interindividual level but that they were accurate for individuals. These differences in language used to describe taste sensations may arise from genetic differences among individuals (Buck, 1993). Olfactory receptors may be encoded by a very large multigene family, so the probability that two individuals will possess the same receptors is very low. This diversity is enhanced by the diversity of learning associated to chemical senses. Individuals do not learn to designate odors in the same way so that a same sensation, a same signal, will be categorized differently, which will lead to different denomination, i.e., different languages. This shows that communication of wine sensory properties is not accurate (Lehrer, 1975).

It's obvious to start with that tasting notes don't tell us anything about the wine that was tasted, except insofar as we can make inferences from the linguistic reactions of the person that did the tasting. Brochet and Dubourdieu are arguing that such inferences must be very indirect ones at best, and that in fact the descriptions convey almost nothing about the wine, but only something about the taster's descriptions of his or her reactions. These in turn are doubly decoupled from our own, first by the different choice and use of terminology, and second by a different underlying biology of sensory receptors. Unless, perhaps, we allow our own intrinsic perceptions to be overshadowed by the influence of what we've read.

I'm reluctant to accept this argument. For one thing, I suspect that you could use the same techniques of linguistic analysis to make a similar argument with respect to nearly any sort of linguistic description of human experience, perceptual or otherwise. It's well known that descriptions are reliably inconsistent across individuals, even experts from the same subculture (see e.g. Furnas et al., "Statistical semantics: Analysis of the potential performance of key-word information systems", Bell System Technical Journal. Vol 62(6, Pt 3), Jul-Aug 1983, pp. 1753-1806). This doesn't necessarily mean that the underlying perceptions and categorizations are incommensurable or even that the descriptions are not communicatively effective.

Also, the idea of a strict logical distinction between sensation and expectation probably makes even less sense in the case of food and drink than it does elsewhere. From what little I know of the psychophysics, it seems likely that the integration of perception and expectation takes place at all levels of processing, and at time scales ranging from recent individual experience to evolutionary development across generations. On this view, we might value winetalk for the same reason that we value a lot of other writing: not because the writer's perceptions are the same as our own, but because they're different.

It's still a rather bleak view of communication: either the text we read is just pumping some energy into resonances of our own neuronal system, in a way that is only accidentally connected to the pattern originally expressed; or the text overshadows our own reactions, affecting them only by partly replacing them with an echo of the expert's views, disconnected from our own sensory experience. It's nicer to think of a message being composed, sent, received, understood, evaluated, and acted on.

With respect to the authors' hypothesis that "[o]lfactory receptors may be encoded by a very large multigene family, so the probability that two individuals will possess the same receptors is very low", I haven't been able to find any concrete information one way or the other. However, this research finds that "humans have accumulated mutations in odor receptor genes four times faster than have the other primates", and that "[o]lfactory receptors are the largest 'superfamily' of genes in mammals, with over 1,000 different genes. But in humans, more than 60 percent of these genes no longer work." It's not clear (to me at least) how much variation in this superfamily exists among the current human population. However, from what I can see, the situation might well be as the authors suggest: there are a thousand or so featural dimensions that are available in principle; there is a high mutation rate; there is presumably relatively little selective pressure on the details of the system, as long as adequate overall identification and discrimination can be maintained; and apparently in recent times there has been little selective pressure in the hominid line even to maintain the acuity of the system as a whole.

Another cross-cultural note: Brochet and Dubourdieu use Max Reinert's Alceste software package, which performs a sort of factor analysis of term-by-document matrices, and seems to be fairly widely used by researchers in France. Reinert describes his approach here, in a way that is somehow very French, despite the pervasive influence of Anglophones such as Harris, Wittgenstein and Peirce. Alceste seems to have a lot in common with Latent Semantic Analysis, originally proposed here, and widely used in the Anglophone world -- but neither tradition seems to be aware of the existence of the other one. This is not strictly a matter of country and language, since there is (for instance) an LSA site at Grenoble, but there is still an interesting cultural divergence here.
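
To make "term-by-document matrix" and "co-occurrence patterns" a bit more concrete, here is a minimal sketch. It is neither Alceste nor LSA (LSA, for one, goes on to apply a singular value decomposition to such a matrix); it only builds the matrix and compares terms by the documents they occur in. The tasting notes are invented for illustration, and the names `NOTES`, `term_document_matrix`, and `cosine` are hypothetical.

```python
import math
from collections import Counter

# Invented toy "tasting notes", for illustration only (not real data).
NOTES = [
    "cherry plum tannin oak",
    "cherry plum spice",
    "lemon mineral crisp",
    "lemon crisp apple",
]

def term_document_matrix(docs):
    """Map each term to its vector of counts, one entry per document."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted({word for doc in docs for word in doc.split()})
    return {term: [c[term] for c in counts] for term in vocab}

def cosine(u, v):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

if __name__ == "__main__":
    matrix = term_document_matrix(NOTES)
    # Terms that occur in the same notes get similar vectors.
    print(cosine(matrix["cherry"], matrix["plum"]))
    print(cosine(matrix["cherry"], matrix["lemon"]))
```

Terms that keep turning up in the same notes end up with similar rows, and grouping similar rows is one simple way to arrive at vocabulary "classes" of the general kind described above.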

Posted by Mark Liberman at 07:42 AM

From scratch

The expression from scratch, as I understand it, means starting from the basic ingredients, without using premade components. If you were to tell me that you had made cookies from scratch, I would take you to mean that you started out with flour, sugar, butter, and so forth. You'd be telling me that you didn't use a mix or premade cookie dough. As far as I know, that is what this expression means to everybody. I wouldn't take you to mean that you had never been taught how to make cookies, did not discuss cookie-making with anybody, and did not consult a cookbook. The reason I mention this is that, believe it or not, the meaning of the expression from scratch comes up in the latest episode in the software wars.

I've talked before about GNU/Linux, the free software movement, and SCO's attack on Linux. Recently a new front has opened in the assault on the free software movement. Ken Brown, the President of the Alexis de Tocqueville Institution, has written a book attacking free software. Some of his claims have been posted on their website, and a draft has been circulated to selected people. The Alexis de Tocqueville Institution has pretensions of being a policy think tank, but seems to be basically a corporate mouthpiece. It is funded by various corporations, including Microsoft, and right-wing foundations. The Disinfopedia has an informative article about AdTI. Among other things, it is a shill for tobacco companies.

One of Brown's bizarre claims is that Linus Torvalds could not have written the Linux kernel without lifting much of the code from somewhere else. His idea is that Linux is a derivative of Minix, a Unix-like operating system developed as a teaching tool for his operating systems course by Andrew S. Tanenbaum. Brown has not a shred of direct evidence for this. Basically, his argument is that a university student couldn't have written an operating system by himself. As it happens, we can be quite sure that Torvalds did not copy Minix because we can compare the code and see that there was no copying. The source code for both is publicly available, here for Minix, and here for Linux. In fact, we know a priori that Linux couldn't be a copy of Minix because Linux is an example of what is called a monolithic kernel and Minix is an example of what is called a microkernel. Of course, Brown knows nothing about operating system design and so can't be expected to understand this.
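
Comparing two bodies of source code for copying is not mysterious. The following is only a minimal sketch of the idea, using Python's standard `difflib` module on two strings; it makes no claim to be the method anyone actually used on Linux and Minix. The function names and the `min_length` threshold are assumptions made for this illustration.

```python
import difflib

def shared_lines(source_a, source_b, min_length=20):
    """Return non-trivial lines that appear verbatim in both sources.

    Short lines (a lone brace, "return 0;") are ignored because they
    show up in nearly every C program and prove nothing.
    """
    def significant(text):
        stripped = (line.strip() for line in text.splitlines())
        return {line for line in stripped if len(line) >= min_length}
    return sorted(significant(source_a) & significant(source_b))

def similarity(source_a, source_b):
    """Line-level similarity ratio between 0.0 and 1.0."""
    matcher = difflib.SequenceMatcher(
        None, source_a.splitlines(), source_b.splitlines()
    )
    return matcher.ratio()

if __name__ == "__main__":
    a = "int main(void) {\n    return schedule_next_task(queue);\n}\n"
    b = "void start(void) {\n    return schedule_next_task(queue);\n}\n"
    print(shared_lines(a, b))
    print(similarity(a, b))
```

A real comparison would also have to cope with renamed identifiers and reformatted code, which a verbatim line match like this one cannot see.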

AdTI's press release and the draft of the book have elicited a great deal of criticism, including devastating responses from some of the principals. Alexey Toptygin, whom AdTI hired to compare the Linux and Minix source code, has publicly stated that he found no copying and has published the results of his comparison. In a message to Andrew Tanenbaum, he says of Ken Brown:

...pay no attention to this man; to the best of my knowledge he is talking out of his ass.

Andrew Tanenbaum, the author of Minix, has explicitly denied that Torvalds copied Minix. You'd think he'd know, wouldn't you? Dennis Ritchie, one of the creators of Unix, has publicly denied Brown's claim that he conducted an "extensive interview" and has made available both sides of his email correspondence with Brown. Eric S. Raymond, President of the Open Source Initiative, to whom AdTI made the mistake of sending the draft of the book, has published a critique in which he characterizes it as

...a steaming pile of crap, full of anti-factual distortions, scare-mongering, and FUD.

AdTI President Ken Brown has issued a reply to his critics. Amazingly, in spite of Minix author Andrew Tanenbaum's denial and the failure of the code comparison to show any copying, he is still pursuing the line that Linus Torvalds must have based Linux on Minix. Among other things, he writes:

Yet Tanenbaum vehemently insists that Torvalds wrote Linux from scratch, which means from a blank computer screen to most people. No books, no resources, no notes -- certainly not a line of source code to borrow from, or to be tempted to borrow from.

Of course, Torvalds had books on operating system design and other resources. But as I mentioned above, writing from scratch doesn't mean that you've never read a book on the subject and never discussed it with anyone. So it would seem that there are just two possibilities here. One is that Ken Brown's notion of the meaning of from scratch is different from everybody else's. The other is that he's deliberately trying to make it look like Tanenbaum said something utterly implausible when he knows perfectly well that that is not what Tanenbaum meant.

The kicker here is that it makes no difference whether Tanenbaum's statement is correct. Suppose, for the sake of argument, that he got it wrong, that he said that Torvalds used no books or other resources, and that Torvalds in fact did. Would that tell us that Torvalds copied from Minix? Of course it wouldn't. Brown is actually trying to pull off a double fraud here: first, to convince us that Tanenbaum's statement about the origin of Linux is implausible, and second, to convince us that if Tanenbaum made such an implausible claim it would show that Linux was copied from Minix. It almost makes me think that Brown isn't an idiot; maybe he's an evil genius.

Posted by Bill Poser at 01:25 AM

June 04, 2004

More eggcorns

From Joshua at Logomacy: a "hard road to hoe", "take another tact", "tow the line". Ratio of original/eggcorn Google hits: 6.1, 5.9, 2.2 respectively.

Posted by Mark Liberman at 07:55 PM

Pre-requiems

Noted by Nora at Bathtub Adventurer:

"I just have certain pre-requiems when I'm going into a relationship"

I'm surprised to find that Nora's post has the only "pre-requiem" in Google's index. It's not only a tiny little poem, it's an original, even unique one.

Posted by Mark Liberman at 03:35 PM

Internally grateful

Francis Heaney points out a strange Christmas ornament with the legend "internally grateful". That one seems to be a joke, though it's hard to be sure, but there are many folks out there for whom the same phrase is an eggcorn. This case is an interesting one, since "internally grateful" is well formed and perfectly sensible, but examples like those below seem almost certainly to be based on misconstruing the cliché "eternally grateful":

(link) If anyone can solve my problem, or tell me how to tell word to open as readonly (I suspect this may solve the problem) I will be internally grateful.

(link) right now my life is going great thanks to friends & family & all the other people who have helped in my continuing recovery, especially my wife Carol & my son Mark, who I'm internally grateful for all their love & support.

(link) He also states that he is internally grateful to his partners and constituents who worked so hard, so many hours, and who helped make the draft a historic and memorable event.

(link) I am internally grateful and consider it an honor and privilege to be listed among my music peers and heroes

(link) I am internally grateful to have been touched by her love and her spirit.

(link) The Stone Roses really did change my life and for that I am internally grateful.

Posted by Mark Liberman at 01:45 PM

Formally to indicate

Jeff Erickson at Ernie's 3D Pancakes has noted an ambiguity in his tenure letter:

With this letter, we invite you formally to indicate your acceptance of both the privileges and the reponsibilities of tenure at the University of Illinois, preparatory to the transmission of our recommendation to the President and the Board of Trustees.

As Jeff asks: "Did they formally invite me to indicate my acceptance, or did they invite me to formally indicate my acceptance? Just to be safe, I assumed the latter and formally indicated my acceptance."

The silly rule about not splitting infinitives often creates unnecessary ambiguities. But as Jeff's quotation illustrates, administrative prose is a cornucopia of ambiguity and vagueness. For example, the quoted sentence makes it clear that the invitation is preparatory to the transmission, but is vague as to where the indication of acceptance falls in this process. If Jeff didn't respond -- formally or informally -- would the rest of the process have been blocked? Obviously he played it safe and responded, both formally and immediately, but the letter invites this response without making it clear what the consequences of various other courses of action might be.

I've noticed that administrative instructions and regulations, even quasi-contractual ones like the letter Jeff quotes, are often rather vague. I suspect that this is not entirely accidental, since it offers administrators (academic and otherwise) a useful amount of interpretive discretion. In this case, the UIUC administration certainly would like to know whether someone is going to say "yes" before they take the trouble to formalize a tenure offer; but perhaps in some cases, they need to go forward without this assurance.

And as a practical matter, they can't stop someone from backing out after the fact. At least, university administrations never try to do so, as far as I know. In this respect, faculty contracts are asymmetrical: a faculty member would almost certainly go to court if a tenure contract were capriciously terminated by the university, but faculty often quit suddenly, or back out of formal offers at the last minute, without getting sued.

Thus the wording of this letter may represent an attempt to induce a commitment from Jeff, in a situation in which the university administration is not really able to require it, or at least doesn't want to establish a policy requiring it. Until the trustees act, as I understand things, they can't legally offer him a tenured position, and until they make the formal offer, he can't formally accept it. But they'd like to have as strong a contingent commitment from him as possible.

I don't guarantee that I've understood the legal situation correctly here, but I think my analysis of the impulse behind the vaguely-expressed temporal and causal connections among invitation, acceptance and transmission is probably accurate.

Posted by Mark Liberman at 09:31 AM

Writing System Reform

People sometimes wonder why we couldn't reform the awful English spelling system. Some of the relevant factors are:

People want those who come after them to suffer the same way they did.
The ability to spell correctly serves as a sign of social class. If spelling were easy, this marker would not be available.
Those who learn to read in a revised system would have difficulty reading everything published prior to the reform.
The market for spellcheckers and monolingual dictionaries would be greatly reduced.
Spelling bees would disappear.

There has been debate in Japan about whether Chinese characters should be eliminated, replaced by the exclusive use of ひらがな [hiragana] and カタカナ [katakana] or by roman letters, since the late 19th century. After the Second World War the push for romanization became particularly intense. At one point, it seemed likely that the Occupation authorities would impose romanization. A spelling reform did in fact take place, but it just reduced the number of Chinese characters in official use, simplified the form of many characters, and eliminated historical spellings. For instance, 行こう [iko:] "let's go" used to be written 行かむ [ikamu], which is how it was actually pronounced some hundreds of years ago.

[For the curious, what happened is that the /m/ dropped out. Then the /a/ and the /u/ merged into a long [ɔ:]. At this point Japanese had five short vowels but six long vowels, with a contrast between [ɔ:] and [o:]. After a while, [ɔ:] merged with [o:]. The same changes explain why the city of Kobe [ko:be] is written 神戸, which we would expect to pronounce [kamibe]. Actually, you'd probably expect it to be [kamito] or [kamido], but for simplicity's sake we'll assume that you already know that 戸 is pronounced [he] in some placenames, and of course you know that [he] is likely to become [be] at the beginning of the second member of a compound noun. First [kamibe] became [kamube]. Then the changes just described led from [amu] to [o:].]

Although it seems very unlikely that Japan will shift to romanization, there is still an organization that promotes the use of romanization, the Japanese Romanization Society. You can easily see how unlikely the Society is to attain its goal by looking at its newsletter, and you don't even have to be able to read Japanese. Here is the first page of the issue of August 1, 1992, which I happen to have on hand. The lead article celebrates the centennial of the Chinese romanization movement. I used to be on their mailing list, but I guess I fell off it due to moving around.

As you can see, most of the page is in the usual Japanese mixture of Chinese characters and kana. I'd say that of the issues I've read, not one was more than 50% in romanization. That isn't because anyone would be unable to read it. All Japanese learn romanization in Grade Five and also study English. Even the Romanization Society can't quite bring itself to make the change.

Posted by Bill Poser at 01:32 AM

June 03, 2004

Joshua Macy comes clean

Joshua Macy, a man of many blogs ( Foolippic, Books Do Furnish a Room, BlogRiPpinG, BlogLatin, etc.), has recently started a new one, Logomacy, for "a lot of stuff relating to words and language".

In his 11th post, he fesses up:

I’m a prescriptivist. I admit it. I’m even a prescriptivist of a type that’s catalogued in Language Log: A Field Guide to Prescriptivists, namely a Fashion Prescriptivist.

Fashion—how an admired group talks. Deviation is alienation.

In my particular case the admired group is quite specific: Jane Austen. I recognize the point that descriptivist linguistics makes, that linguistic fashions are arbitrary and that there is nothing intrinsic in Standard English, or any other form, that makes it superior. But if the choice is arbitrary, then we might as well plump for the version of the language that lets us read Pride and Prejudice (and, as an added benefit, English literature since then). It would be a tragedy if there came a point where “It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.” needed a gloss for the modern native English-speaking reader as the opening of Romeo and Juliet now does ...

He's right, you know. It would indeed be a tragedy, though at the moment, Jane Austen's social norms may need more of an explanation for young Americans than her linguistic ones do.

Posted by Mark Liberman at 09:47 PM

Wait, wait, don't transcribe me

An unusual number of occurrences of words like home (and other words with the same vowel) during the NPR news this morning enabled me to spend some concentrated time confirming something I had vaguely noticed before about the distinctive and much-admired speech of Carl Kasell. (I'll need to transcribe a few words in the International Phonetic Alphabet in what follows; the symbols should show up all right if your browser handles HTML codes in the region above #256.) The standard wisdom about American English that I've been teaching in my phonetics class is that a word like home would be pronounced [hoʊm]; but Carl Kasell (not that I want to make him feel self-conscious about this) doesn't have anything like that pronunciation. Paradoxically — since he is so much a standard for American speech that NPR runs a weekend call-in news quiz show ("Wait, Wait, Don't Tell Me") on which the prize for winning listeners is getting Carl Kasell's voice on their home answering machine — his speech is actually quite idiosyncratically distinct from what is commonly thought of as standard. His pronunciation of home is [hɔːm]. (For phoneticians whose browsers have let them down: he uses Cardinal 6 instead of a diphthong beginning with Cardinal 7.)

Cardinal vowel number 6, transcribed [ɔː], is not normally found in standard American English pronunciation except where the r-sound [ɹ] follows; thus horn is pronounced [hɔːɹn]. Most British English dialects don't have [ɹ] after the vowel of a syllable, so for them horn is pronounced [hɔːn]. Carl Kasell has that vowel in home and phone and go and coast, etc. Those who win his voice on their home answering machine will find he is telling their friends that they're not [hɔːm] right now.

Their friends probably won't mind, though. It's odd, but sometimes personal idiosyncrasies of speech go completely unnoticed, especially from prestigious people. People get all excited or furious about some minor phonetic aspects of speech like alleged mispronunciations of George Bush or Brad Pitt (not that people know how to describe the facts very well), but then they will completely miss salient features of the speech of other people in the public eye or ear. Perhaps people don't hate Carl Kasell as much as they hate George Bush or Brad Pitt.

Posted by Geoffrey K. Pullum at 01:29 PM

June 02, 2004

(Mis)spelling Gandhi

Shankar Kalyanaraman observes that people often write Gandhi as "Ghandi". In fact, this misspelling is much commoner than either of the other two errors, "Ghandhi" and "Gandi":

	dh	d
gh	8,220	261,000
g	2,260,000	78,000

The difference is even larger considering that many of the "gandi" hits are really examples of gandi.net or other completely different but equally valid words. We've commented on this pattern of errors many times before, for example with respect to Jennifer, tomorrow, parallel, Karttunen, Attila, and so on. What this means in the case of Gandhi is that people know there is an "h" in there somewhere, and just one of them, but they're not too sure where it is. As a result, the omission of the "h" after the "d" and the insertion of an "h" after the "g" are not statistically independent processes.
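The non-independence can be checked directly from the four counts in the table. The sketch below is only an illustration: it treats the raw Google hit counts as if they were a clean sample (which they are not, as the gandi.net caveat shows), and the variable names are made up for the example. It compares the observed number of "Ghandi" hits with the number we would expect if adding an "h" after the "g" and dropping the "h" after the "d" were independent errors.

```python
# Google hit counts from the table above (figures quoted in the post).
counts = {
    ("gh", "dh"): 8_220,      # "Ghandhi"
    ("gh", "d"): 261_000,     # "Ghandi"
    ("g", "dh"): 2_260_000,   # "Gandhi" (the correct spelling)
    ("g", "d"): 78_000,       # "Gandi"
}

total = sum(counts.values())

# Marginal rate of each error, ignoring what happens at the other position.
p_extra_h = (counts[("gh", "dh")] + counts[("gh", "d")]) / total   # "h" inserted after "g"
p_missing_h = (counts[("gh", "d")] + counts[("g", "d")]) / total   # "h" dropped after "d"

# If the two errors were independent, "Ghandi" (both errors at once)
# would occur at the product of the two marginal rates.
expected_ghandi = total * p_extra_h * p_missing_h
observed_ghandi = counts[("gh", "d")]

print(f"expected under independence: {expected_ghandi:,.0f}")
print(f"observed:                    {observed_ghandi:,}")
print(f"observed / expected:         {observed_ghandi / expected_ghandi:.1f}")
```

On these counts the expected figure comes out near 35,000, against 261,000 observed, so the double error is several times more common than independent single errors would predict.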

It's no doubt also relevant, in this case, that "gh" is a commoner sequence of letters in English than "dh" is, by a large factor.

I haven't seen a model of spelling/misspelling that does a very good job of predicting such patterns. The spelling-correction algorithms that I'm familiar with tend to assume independence of string-local edits, which is obviously wrong.

Posted by Mark Liberman at 09:22 PM

More on winetalk culture

Ray Girvan has emailed links to two articles on the culture of winetalk, which in turn led me to some other links of equal interest. Ray's first link was to a piece on the Berry Bros. and Rudd site, quoting Hugh Johnson as "[accusing] Jilly Goolden and Oz Clarke of 'attracting ridicule' to the beloved subject of his expertise":

"I don't really want my favourite subject to be ridiculed. There is a problem when these people list all these flavours and aromas they think they have detected. It then gets on to the label of the bottle and what you are looking at appears to be a recipe for fruit salad."

Ray's other link is to a Guardian story by Tim Radford from 11/2001, which says that "[n]ew research by French scientists suggests that wine snobs may not always know their asti from their old beaune."

They analysed the language of wine sniffing, and found that the words selected to describe the bouquet were more likely to be linked to the wine's colour than to its scent.

When buyers and wine pundits hold a glass of chardonnay or gewürtztraminer to the nose, they tend to evoke yellow imagery such as straw, melons, honey or apricots.

When they sniff a glass of ripe merlot from the warm south, they reach for dark, rich words like raspberries, tobacco and tar. In other words, people see red, and gush purple prose.

New Scientist today reports that Gil Morot of the national institute for agronomic research in Montpellier and colleagues suspected a case of unconscious synaesthesia - the jumbling of scents, sounds and colours in the brain - when it came to words for wine.

The crux of the experiment was a procedure in which aroma descriptions for white bordeaux were compared with and without the addition of "an odourless red dye."

Dr Morot - who says the results show that "olfactory descriptions are completely subjective" and that smell cannot be divorced from other senses - is to publish his results in the journal Brain And Language. It is likely to join other classics of sip-and-spit literature.

The referenced article was a bit hard to find, since it seems that Radford (or his editor) spelled the author's name wrong. The correct citation appears to be Gil Morrot, Frédéric Brochet and Denis Dubourdieu, "The Color of Odors", Brain and Language, Volume 79, Issue 2, November 2001, Pages 309-320.

Here is their abstract (emphasis added):

The interaction between the vision of colors and odor determination is investigated through lexical analysis of experts' wine tasting comments. The analysis shows that the odors of a wine are, for the most part, represented by objects that have the color of the wine. The assumption of the existence of a perceptual illusion between odor and color is confirmed by a psychophysical experiment. A white wine artificially colored red with an odorless dye was olfactory described as a red wine by a panel of 54 tasters. Hence, because of the visual information, the tasters discounted the olfactory information. Together with recent psychophysical and neuroimaging data, our results suggest that the above perceptual illusion occurs during the verbalization phase of odor determination.

More recent (?) research by Morrot and others is discussed in this article at pleinchamp.com ("le site expert des professionels agricoles" -- "the expert site of agricultural professionals"), which mentions (in French) that when ten "grands sommeliers" were asked to describe (their tasting perceptions of) a sample of red and white wines, the vocabulary overlapped by only 5%; but when they were blindfolded and asked to identify the same wines as white or red, their performance was only between 60% and 70% -- where flipping a coin would be expected to score 50%. There was no difference in this respect between the "grands sommeliers" and "simple students who were novices in the area". In another experiment, in which they were asked to classify 18 wines according to their "grande région française d'origine", the same sommeliers made an average of 13 errors. I get the impression that Dr. Morrot has a point of view on this subject, so one should look carefully at the experimental reports before accepting the conclusions; but there certainly seems to be a prima facie case here for the view that winetalk has a strong component of, shall we say, poetic imagination.

The Guardian article also says that

In February, an Arizona linguist reported that 20 years ago, wine buffs talked of vintages as clean or cloying, piquant or plump, or even transcendental or twiggy.

By last year, the latest bottlings had become, among many other things, dumb, precocious, harmonious, barnyard, reticent and even intellectual.

The Guardian does not give the "Arizona linguist" a name, even a misspelled one, or identify the publication. The linguist was probably Adrienne Lehrer, whose work on wine is discussed in this interesting article "Delirious Description" by Natalie MacLean:

Wine writing continues to evolve into ever-more esoteric language that seems far removed from the actual experience of smelling and tasting wine. ... What could be prompting this proliferation of purple prose? ...

Dr. Adrienne Lehrer, professor emerita of linguistics at the University of Arizona, has been studying this topic for twenty years. According to her recent report, Trends in Wine and Trendy Words, wine description is getting more precise and intense. A wine today isn't simply balanced, it's "integrated" or "focused." In contrast, an unbalanced wine is "muddled" or "diffuse." A full-bodied wine is now "chunky" and "big-boned"; a light-bodied wine "svelte" and "sleek."

"I'm interested in this from a linguistic point of view, because wine writers are pushing the language and making up metaphors," Lehrer says. "When critics try to describe thirty Californian chardonnays, they often find that the wines are similar—but it would be boring to read the same thing all the time. So they jazz up the descriptions to keep readers engaged."

While compiling her glossary of frequently used wine adjectives, Lehrer discovered that the high-growth tasting terms include "barnyard funk," "transcendental," "intellectual" and "diplomatic." "Funky was used a lot," she says. "I don't know whether it has any specific meaning that's different from the way that it's used elsewhere."

MacLean also describes how

in Evelyn Waugh's 1945 novel Brideshead Revisited, two young men mock social pretense when they describe the wine they're tasting:

"It is a little, shy wine, like a gazelle."
"Like a leprechaun."
"Dappled, in a tapestry meadow."
"Like a flute by still water."
"And this is a wise old wine."
"A prophet in a cave."
"And this is a necklace of pearls on a white neck."
"Like a swan."
"Like a unicorn."

MacLean ends with a discussion of the "serious attempts ... to standardize wine tasting vocabulary", especially the Aroma Wheel, "developed in the early 1980s by Ann Noble, professor emerita of oenology and viticulture at the University of California, Davis. The inner circle of its concentric rings notes the most basic wine adjectives, such as 'fruity' and 'floral,' while the sub-divided middle and outer rings provide more descriptive terms such as 'grapefruit,' 'strawberry jam' and 'asparagus.'" In addition to Noble's "Aroma Wheel", MacLean also provides a link for downloading a free .pdf of the Mouth Feel Tasting Wheel, developed by Richard Gawel and others at the University of Adelaide. Google tells me that Gawel is also the developer of the Recognose Wine Aroma Dictionary, which appears to be a collection of scratch-n-sniff cards providing reference smells. At only US $58, it's a bargain as a nerdly way to ease cultural anxieties! If only there were a digital version, perhaps with a little keypad for recording your tasting notes, and bluetooth for beaming them back to your server, of course along with metadata gleaned from a code on the bottle...

[Update: Jason Streed emailed to remind me of this Calvin Trillin New Yorker article from 2002, in which Trillin administers his own red/white taste test, and concludes that

"...experienced wine drinkers can tell red from white by taste about seventy per cent of the time, as long as the test is being administered by someone who isn't interested in trying to fool them."

Although Trillin only ran three subjects with eight trials each, this exactly coincides with the claim by Morrot et al., who (in the 2001 Brain & Language article) cite 70% as typical performance in distinguishing red wine from white by smell alone (though they add, unsurprisingly, that "we found that the rate of success varies significantly according to the wines used"). As the standard reference on this topic, they cite a paper I have not looked up, Sauvageot, F., & Chapon, M. (1983). "La couleur d'un vin (blanc ou rouge) peut-elle etre identifiée sans l'aide de l'oeil?" Les Cahiers de l'ENS. BANA, 4, 107-115. For some reason, no one told Trillin about this paper -- or about the work of Morrot et al. -- although he consulted with the expert oenologists at Davis, including Ann Noble. ]
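For a rough sense of how far 70% on 24 trials is from coin-flipping, here is a minimal sketch of a one-sided binomial tail calculation. The count of 17 correct answers is an assumption made for illustration (it is simply the whole number closest to 70% of 24, not a figure reported by Trillin), and the function name is made up for the example. Pooling three subjects' trials as if they were independent coin flips is also a simplification.

```python
from math import comb


def tail_probability(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


trials = 3 * 8   # three subjects, eight trials each, as described above
correct = 17     # ASSUMPTION: about 70% of 24; not a reported count

p_by_guessing = tail_probability(correct, trials)
print(f"P(at least {correct} of {trials} right by pure guessing) = {p_by_guessing:.3f}")
```

Under these assumptions the probability comes out at about 0.03, so a result like this is unlikely to be pure guessing, but with so few trials it says little about whether the true rate is nearer 60% or 80%.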

Posted by Mark Liberman at 09:56 AM

Grand Cru Smackdown

Here's some serious and substantive disagreement behind the flurries of winetalk modifiers.

The subject: 2003 Chateau Pavie, from the Premier Grand Cru Classé estate in St.-Emilion, Bordeaux.

Robert Parker's evaluation: "an off-the-chart effort", 95 to 100 points on a 100-point scale.

Jancis Robinson's evaluation: "completely unappetizing", "ridiculous", 12 points on a scale of 20.

It's not only the ratings that vary: expert descriptions include both "hint of wood" and also "huge wood"; both "needs more fruit" and also "powerful, super-concentrated dark berry and mineral flavors".

It's clear from the SFGate article by Roger Voss that there are many dimensions to this story: tradition vs. innovation, friendship vs. objectivity, and (apparently not least) ego vs. ego. I also take this tale as evidence that whatever else is going on in winetalk, it's not purely a sort of lexicalized read-out from human mass spectrometers (despite the implications of the Wine Aroma Wheel or the Recognose Wine Aroma Dictionary):

Blackberry? yes. Blueberry? a little. Strawberry? no. Gasoline? no. Rubber? yes. Tar? somewhat. Oak? moderate. Grass? yes. ...

That's not to say that the writers are intentionally fabricating their descriptions, or that the categories are physiologically meaningless and without any intersubjective validity, or even that such descriptions need to be intersubjectively valid in order to be interesting and useful to readers. Remembering Emily Dickinson's poem

    A sloop of amber slips away
      Upon an ether sea,
    And wrecks in peace a purple tar,
      The son of ecstasy.

may enhance your appreciation of a certain sunset, without corresponding exactly to everyone's perception of the same scene.

Since Ms. Dickinson is not available to give us her opinion, we'll have to make do with these fuller descriptions of the 2003 Chateau Pavie from a variety of contemporary experts (taken from Voss' story). Alas, it would cost each of us a couple of hundred dollars to experience for ourselves the tastes they're talking about -- and part of the obvious context here, as I've observed, is the linguistic scaffolding of a cultural construction intended to persuade us that it would be worth it.

Michael Broadbent: (14 of 20 points) -- "Very deep, extraordinary nose. Slightly fishy, tarry; fairly sweet, full bodied, powerful, dense and again tarry."

Clive Coates: (no rating) -- "Anyone who thinks this is good wine needs a brain and palate transplant. This wine will be scored simply as undrinkable."

James Lawther: (4 of 5 stars) -- "Big, powerful wine in the super-ripe mould. Rich, confit nose of dark fruits woven with liquorish-vanilla oak. Almost portlike. Palate full and fleshy with a muscular tannic frame. Firm, persistent finish."

Charles Metcalfe: (90-94 of 100 points) -- "Samples and opinions varied. This has very high tannins -- daunting and mouth-drying at present -- and there is a great deal of oak flavor. But behind all this (is) fresh, raspberry fruit with length and perfume. Not one to approach for a long time, but should be excellent in time. Something of a return to traditional form for this estate."

Robert M. Parker Jr. (96-100 of 100 points) -- "An off-the-chart effort ... a wine of sublime richness, minerality, delineation and nobleness .... Inky/purple to the rim, it offers up provocative aromas of minerals, black and red fruits, balsamic vinegar, licorice and smoke. It traverses the palate with extraordinary richness as well as remarkable freshness and definition. The finish is tannic, but the wine's low acidity and higher than normal alcohol (13.5 percent) suggests it will be approachable in 4-5 years . . . A brilliant effort, it, along with Ausone and Petrus, is one of the three greatest offerings of the right bank in 2003."

Jancis Robinson: (12 of 20 points) -- "Completely unappetizing overripe aromas. Why? Porty sweet. Oh REALLY! Port is best from the Douro, not St.-Emilion. Ridiculous wine more reminiscent of a late-harvest Zinfandel than a red Bordeaux with its unappetizing green notes."

James Suckling (95-100 of 100 points) -- "Super-ripe and almost jammy. Very New World on the nose but impressive. Bordeauxlike on the palate. Berries, raspberries and strawberries. Hint of wood. Full-bodied with ripe and round tannins and a long finish. Chewy. Got to like this."

Stephen Tanzer (92-95 of 100 points) -- "Explosive, super-ripe aromas of cassis, violet, minerals and licorice. Thick on entry, then chewy as a solid in the mid-palate, with powerful, super-concentrated dark berry and mineral flavors and enough ripe acidity to give the wine shape and freshness. Best today on the great building finish, which features huge but thoroughly ripe tannins and palate-saturating berry and mineral flavors. An impressively rich, structured wine that wears its high alcohol gracefully."

Roger Voss (87-89 of 100 points) -- "With its aromas of raisins and sweet jam, this wine smells like port. To taste, there are dark tannins and huge wood. It is very tough and needs more fruit. Black and toasty and overripe. Maybe the fruit will develop, but it will take many years."

Some other discussion:

A decanter.com article by Adam Lechmere, quoting from an "impassioned three-page letter" from the chateau owner; and the decanter.com description "One of the top Saint-Emilion's of the vintage. Deep, intense colour. Ripe, dark fruit and liquorish nose. Rich, layered extract, velvety texture of fruit with a muscular tannic structure. Powerful but ripe and balanced with a long fresh finish."

A discussion of the "rumpus" by Neal Martin at wine-journal.com, and his lengthy evaluation, in which he mostly avoids taking sides, but does mention that "the hedonistic delight had given way to something with less delineation and poise".

A story on the "spat" by Catherine Lowe at Wine International.

Posted by Mark Liberman at 01:21 AM

Apologia pro risu suo

Back on May 11, in response to a post of mine about winetalk, Semantic Compositions added some observations in a similar vein about flowery and evocative language among audiophiles. However, Steve (aka Language Hat) wrote in a comment "I was disappointed in Mark's post; I hate to see him joining the bandwagon of people making easy jokes about winetalk." I've been meaning to get back to this ever since.

I think that Steve is right to defend the oenophiles, but wrong to feel that they need defending. We can wonder -- and even laugh -- at the spectacular profusion of new linguistic species in some subcultural ecology, without concluding that these flocks, herds and groves lack value. On the contrary, the urge to catalog and describe sublanguages accurately implies a certain amount of respect. And if some of the specimens are striking, even preposterous, well, so much the better.

Here's Steve's comment in full:

I was disappointed in Mark's post; I hate to see him joining the bandwagon of people making easy jokes about winetalk. Yes, it sounds silly to outsiders (as any form of jargon does); yes, it can be overdone. But it's absurd to pretend that it's nothing but pretentiousness and pulling the wool over people's eyes -- it's a specialized set of descriptions for very particular taste/smell sensations. I am by no means a wine expert, but I've taken courses, and I assure you there is a great deal of chemical information about the components of taste that can be learned by sniffing tubes with essence of chocolate, herbs, &c, and learning to discern them as components of the sensory impact of wine. To snicker at connoisseurs for their "barnyard flavors" is as much know-nothingism as to sneer at literature critics for analyzing the symbolism in Donne's poetry. The fact is that there is such a thing as a barnyard flavor (or as the French more directly put it, "merde"), and it's an excellent thing in Burgundies, and I've tasted it myself. So there.

(Really, you'd think linguists would know better than to make fun of peculiar subcultures!)

I agree that the ritual tasting descriptions are usually a serious attempt to describe real experiences. The narrative structure of most such descriptions also arises naturally. An example emerged at our dining room table a few weeks ago. I had made myself a cup of camomile tea, and as our 8-year-old son smelled the aroma, he said "Mmm, vanilla! that smells good!" So I offered him a spoonful. He tasted it, looked surprised, and said "Hmm, bland. Not much taste. Just a little flowery." Then after a few seconds he wrinkled up his face and said "Ooh, bitter." I asked him if he wanted any more, and he said "No, definitely not." So this little experience involved three stages: the aroma, the flavor, and the aftertaste; followed by a rating.

Serious tasters break the time series down into more stages, and they use a larger and more elaborate vocabulary to try to convey their experience, but the idea is the same. However, the language of the wine-tasting subculture has developed in specific ways that are, let's say, underdetermined by the linguistic and physiological background; and therein lies the interest for a linguist, an anthropologist, or just a student of the human condition.

In David Shaw's May 12 Matters of Taste column in the LA Times (registration required), he complains about some of the ways in which this language sometimes develops. Shaw is a serious foodie himself, as you can see in his wine descriptions in this April 7 column:

"Sugar Daddy" is a big, dark, brooding wine with a long finish. I drank it with a charred rare New York steak and thought they stood up to each other like two superb heavyweight fighters.

My favorite Red Car wine so far, though, is "The Stranger," which reminded me of a pudding made with ripe blackberries, boysenberries and a hint of blueberry.

However, in his May 12 column, Shaw observes that some of the descriptions that are intended to be attractive are actually off-putting:

...an e-mail from an Orange County wine store quoting Parker's lavish praise (96 points!) for an Australian Shiraz. Included in this wine's "peppery" aroma profile, Parker wrote, were hints of "melted asphalt" and "Band-Aid."

...another Parker review that described a 98-point wine from southern Italy as having "a gorgeous bouquet of scorched earth"...

Shaw is also skeptical about the degree of detail and the obscure flavor categories in some descriptions:

Indeed, heretical though it is for someone who writes about wine and food to admit this, I often find that I can neither taste nor smell all the various fauna, flora and fruit (or animal, vegetable and mineral) components that serious wine critics consistently say are present in wine.
So maybe I'm just jealous of their astonishingly sensitive noses and palates. But I'd bet that many wine drinkers have the same problem I do. Maybe they can sense a bit of, oh, say, citrus in a wine's bouquet, but "truffle, wet stone, spicy oak ... mineral bath ... ripe apple ... and white tobacco" (whatever that is) all in one wine?

And he also recognizes that over-the-top winetalk can be funny:

with the possible exception of the passages about sex in those romance-cum-ravage novels known as "bodice-rippers," there are few writings in the English language filled with more unintentionally laughable prose than wine reviews.

In particular, he sees humor in the anthropomorphizing kind of winetalk (which I haven't written about yet):

Unfortunately, too many wine writers often seem to think they're writing about other human beings, not about a beverage. I remember one wine, for example, being described as "a rather seedy aristocrat," another as "sort of a blowsy blond," a third as "suave and easygoing." Just last month, I heard Anthony Dias Blue speak on the radio of Petite Syrahs that are "downright cantankerous."

I still have in my file from several years ago my all-time favorite such wine description, a wine that the critic said had "tried to summon a bit more seriousness but its supple femininity gave way quickly to shimmering fruitiness."

This is a bit unfair, coming from someone who just a few weeks before wrote about "a big, dark, brooding wine" squaring off against a charred steak "like two superb heavyweight fighters". But in the end, Shaw's complaint is not so much that winetalk is silly as that it's exclusionary:

My real concern about the language used in so many wine reviews is not so much that they offend my prose sensibility; it's that they intimidate and turn off many new and potential wine drinkers. They contribute to the mystification and mythification of wine, to the layman's sense that the enjoyment of wine requires a certain amount of ritual and the decoding of impenetrable mysteries.

My own perspective is different. It doesn't bother me that winetalk is often exclusionary -- that comes with the subcultural territory, whether the niche is pinot noir, motorcycle racing, or phonetics. And as for mystification, mythification and ritual, those are the bubbles that culture gives off as it ferments -- harmless as long as no one gets hurt, and even interesting and fun, as long as no one takes fizz for fact. And if a mystery or two gets decoded along the way, so much the better.

I hope that Steve will accept this long-winded apology for having seemed to make fun of a peculiar but serious subculture. Because I have some really hilarious stuff on coffeetalk to lay out, before long, and I'd hate to disappoint him again.

Posted by Mark Liberman at 12:11 AM

June 01, 2004

Silly Academic Titles

The Chronicle of Higher Education contains a piece entitled Assistant Directors of the Underclass, Unite! [subscription required] by Robert M. Kahn, about the silly titles that have been cropping up in academia.

As he suggests, some of them are due to downsizing resulting in the combination of what were formerly separate administrative units:

Division Chair for Engineering-Related Technologies, Health, Mathematics, Nursing, and Sciences

Some seem to be the titular equivalents of vanity license plates, intended to make the holder feel unique:

Assistant Professor of Nonformal Educational Processes

And some result from political correctness:

Gender Faculty Specialist

Director of Social Equity

But I have to disagree with Kahn's assessment of some of the titles that he lists under the heading Products of Adjectival Impairment:

Coordinator of Organized Research: Organized Research is probably what is more usually called Sponsored Research, but is perhaps more accurate, including group research projects that receive no external funding.
Director of the Office of Academic Student Instructional Support: Academic Student Instructional Support is very likely support for academic students as opposed to vocational students. It needs to be qualified as instructional support to contrast it with, say, financial assistance or psychological counselling.
Coordinator of Liberal/General/Interdisciplinary Studies: liberal, general and interdisciplinary are not synonyms. In many institutions, liberal studies would be some sort of liberal arts program, while general studies usually indicates a smattering with no focus at all. Interdisciplinary studies might deal with individually designed interdisciplinary degrees, though in at least one case that I know of it is basically a euphemism for "postmodern stuff". I suspect that this person's job is to oversee all of the various degrees that are not traditional departmental degrees. The double-disjunction is a bit awkward, but it isn't as awkward as Coordinator of Degrees That Are Not Traditional Departmental Degrees, and it doesn't sound as funky as Coordinator of Non-Traditional Degrees, which sounds suspiciously like it involves a lot of meditation and ingestion of controlled substances and things like that.

What Kahn doesn't mention is that you need to watch out for the acronyms that result. A friend of mine back in British Columbia has just become the Coordinator of what in Canada is usually known as something like First Nations Educational Support Services at the University College of the Cariboo. First Nations is a Canadian cover term for Indians and Inuit and Métis, but at UCC they have decided to use Aboriginal instead of First Nations. They've also decided that it isn't necessary to specify that the support services are educational. The resulting unit is therefore Aboriginal Support Services, which means that my friend is now the Coordinator of ASS.

Posted by Bill Poser at 11:46 PM

A thousand things to say about language

For those who like to watch the odometer roll over to a new power of 10: Mark just posted item number 1000 to Language Log. It's not actually the thousandth post to appear here, because some items got as far as being drafts with numbers but didn't in the end get published (one was withdrawn by request of someone who got cold feet when he saw his words posted here). But to a rough approximation, Language Log has now attained a publication rate in the region of a thousand items a year.

Posted by Geoffrey K. Pullum at 09:51 PM

The child or the savage orator...

I've recently been reading McGuffey's Fifth Eclectic Reader. 1879, Van Antwerp, Bragg & Co., Cincinnati and New York. (This is a revised edition of the book originally prepared by Alexander McGuffey in 1844. My copy was originally owned by one Myrtle Blackburn.)

This book starts from the premise that

The great object to be accomplished in reading as a rhetorical exercise is to convey to the hearer, fully and clearly, the ideas and feelings of the writer.

It aims to accomplish this by first inculcating twelve "rules", laid out in 30-odd pages of front matter divided into sections on Articulation, Inflections, Accent, Emphasis, Modulation and Poetic Pauses, and then providing 117 Selections in Prose and Poetry for the student to practice on.

The work starts out promisingly enough with

RULE I.--Before attempting to read a lesson, the learner should make himself fully acquainted with the subject as treated of in that lesson, and endeavor to make the thought, and feeling, and sentiments of the writer his own.

It further adds a helpful dose of Romantic ideology:

REMARK.--When he has thus identified himself with the author, he has the substance of all rules in his own mind. It is by going to nature that we find rules. The child or the savage orator never mistakes in inflection, or emphasis, or modulation. The best speakers and readers are those who follow the impulse of nature or most closely imitate it as observed in others.

You might think that this would lead to a sort of "method acting" approach, but you would be wrong. The remaining eleven rules, with their various paragraphs, sub-paragraphs and listed exceptions, are an amazing combination of sensible observation, invented prescription and incoherent fantasy, all presented at an extraordinary level of analytic detail. To give the flavor of this material, I've quoted below the first half of the section on "Inflections".

I'll have more to say later about the substantive ideas on intonation presented therein. As an attempt to "[go] to nature [to] find rules", it's just as faulty as the corresponding effort in the same tradition to analyze syntactic structure.

However, my first response to this material is not censure but awe. It's extraordinary that in 1879, it was thought to be reasonable to ask children in grade school to assimilate and apply explicit linguistic analysis of this degree of complexity.

Here's the first half of the section on "Inflections".

III. INFLECTIONS

Inflections are slides of the voice upward or downward. Of these, there are two: the rising inflection and falling inflection.

[...]

Both inflections are exhibited in the following question:

Did you walkˊ or rideˋ?

In the following examples, the first member has the rising and the second member the falling inflection.

EXAMPLES.*

Is he sickˊ or is he wellˋ?
Did you say valorˊ, or valueˋ?
Did you say statuteˊ, or statueˋ?
Did he act properlyˊ, or improperlyˋ?

[* These questions and similar ones, with their answers, should be repeatedly pronounced with their proper inflections, until the distinction between the rising and falling inflections is well understood and easily made by the learner. He will be assisted in this by emphasizing strongly the word which receives the inflection; thus, Did you RIDEˊ or did you WALKˋ?]

In the following examples, the inflections are used in a contrary order, the first member terminating with the falling and the second with the rising inflection.

EXAMPLES.

He is wellˋ, not sickˊ.
I said valueˋ, not valorˊ.
I said statueˋ, not statuteˊ.
He acted properlyˋ, not improperlyˊ.

FALLING INFLECTIONS

RULE VI. -- The falling inflection is generally proper whenever the sense is complete.

EXAMPLES.

Truth is more wonderful than fictionˋ.
Men generally die as they liveˋ.
By industry we obtain wealthˋ.

REMARK.--Parts of a sentence often make complete sense in themselves, and in this case, unless qualified or restrained by the succeeding clause, or unless the contrary is indicated by some other principle, the falling inflection takes place according to the rule.

EXAMPLES.

Truth is wonderfulˋ, even more so than fictionˋ.
Men generally die as they liveˋ, and by their actions we must judge of their characterˋ.

Exception.--When a sentence concludes with a negative clause, or with a contrast or comparison (called also antithesis), the first member of which requires the falling intonation, it must close with the rising inflection. (See Rule XI, and §2, Note.)

EXAMPLES.

No one desires to be thought a foolˊ.
I come to buryˋ Caesar, not to praiseˊ him.
He lives in Englandˋ, not in Franceˊ.

REMARK.-- In bearing testimony to the general character of a man we say,

He is too honorableˋ to be guilty of a vileˋ act.

But if he is accused of some act of baseness, a contrast is at once instituted between his character and the specified act, and we change the inflections, and say,

He is too honorableˋ to be guilty of suchˊ an act.

A man may say, in general terms,

I am too busyˊ for projectsˋ.

But if he is urged to embark in some particular enterprise, he will change the inflection, and say,

I am too busyˋ for projectsˊ.

In such cases, as the falling inflection is required in the former part by the principle of contrast and emphasis (as will hereafter be more fully explained), the sentence necessarily closes with the rising inflection.

Sometimes, also, emphasis alone seems to require the rising inflection on the concluding word. See exception to Rule VII.

STRONG EMPHASIS

RULE VII.--Language which demands strong emphasis generally requires the falling inflection.

EXAMPLES.

§1. Command or urgent entreaty; as,

Begoneˋ,
Runˋ to your houses, fallˋ upon your knees,
Prayˋ to the Gods to intermit the plagues.

O, saveˋ me, Hubertˋ, saveˋ me! My eyes are out
Even with the fierce looks of these bloody men.

§2. Exclamation, especially when indicating strong emotion; as,

O, ye Godsˋ! ye Godsˋ! must I endure all this?

Harkˋ! Harkˋ! the horrid sound
Hath raised up his head.

For interrogatory exclamation, see Rule X, Remark.

SERIES OF WORDS OR MEMBERS.

§3. A series of words or members, whether in the beginning or middle of a sentence, if it does not conclude the sentence, is called a commencing series, and usually requires the rising inflection when not emphatic.

EXAMPLES OF COMMENCING SERIES

Wineˊ, beautyˊ, musicˊ, pompˊ, are poor expedients to heave off the load of an hour from the heir of eternityˋ.

I conjure you by that which you profess,
(Howe'er you came to know it,) answer me;
Though you untie the winds and let them fight
Against the churchesˊ; though the yeasty waves
Confound and swallow navigationˊ up;
Though bladed corn be lodged, and trees blown downˊ;
Though palaces and pyramids do slope
Their heads to their foundationsˊ; though the treasures
Of nature's germens tumble altogetherˊ,
Even till destruction sickenˊ; answer me
To what I askˋ you.

§4. A series of words or members which concludes a sentence is called a concluding series, and each member usually has the falling inflection.

EXAMPLE OF CONCLUDING SERIES

They, through faith, subdued kingdomsˋ, wrought righteousnessˋ, obtained promisesˋ, stopped the mouths of lionsˋ, quenched the violence of fireˋ, escaped the edge of the swordˋ, out of weakness were made strongˋ, waxed valiant in fightˋ, turned to flight the armies of the aliensˋ.

REMARK.--When the emphasis on these words or members is not marked, they take the rising inflection, according to Rule IX.

EXAMPLES.

They are the offspring of restlessnessˊ, vanityˊ, and idlenessˋ.
Loveˊ, hopeˊ, and joyˊ took possession of his breast.

§5. When words, which naturally take the rising inflection, become emphatic by repetition or any other cause, they often take the falling inflection.

Exception to the Rule.--While the tendency of emphasis is decidedly to the use of the falling inflection, sometimes a word to which the falling inflection naturally belongs, changes this, when it is emphatic, for the rising inflection.

EXAMPLES.

Three thousand ducatsˋ: 't is a good round sumˊ.
It is useless to point out the beauties of nature to one who is blindˊ.

Here sum and blind, according to Rule VI, would take the falling inflection, but as they are emphatic, and the object of emphasis is to draw attention to the word emphasized, this is here accomplished in part by giving an unusual inflection. Some speakers would give these words the circumflex, but it would be the rising circumflex, so that the sound would still terminate with the rising inflection.

RULE VIII.--Questions which can not be answered by yes or no, together with their answers, generally require the falling inflection.

EXAMPLES.

Where has he goneˋ?		Ans. To New Yorkˋ.
What has he doneˋ?		Ans. Nothingˋ.
Who did thisˋ?		Ans. I know notˋ.
When did he goˋ?		Ans. Yesterdayˋ.

REMARK.--If these questions are repeated, the inflection is changed according to the principle stated under the Exception to Rule VII.

Where did you say he had goneˊ?
What has he doneˊ?
Who did thisˊ?
When did he goˊ?

Posted by Mark Liberman at 06:01 PM

Double-tongued word wrester

A new website from Grant Barrett. "Words from the fringes of English -- found, researched, defined, and posted for your comments."

Posted by Mark Liberman at 02:27 PM

Inclimate weather

Gene Buckley emailed to point out the widespread adoption of the eggcorn inclimate weather, which has 11,000 whG (web hits on Google), or 2,567 whG/bp (web hits on Google per billion pages). The original phrase inclement weather has 173,000 whG or about 40,372 whG/bp, so the original is only about 16 times commoner than the eggcorn. This is a genuine folk-etymology-in-progress, not a simple misspelling, since the morphologically incoherent "incliment weather" and "inclemate weather" have only 719 whG and 73 whG respectively.
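For anyone who wants to check the whG/bp arithmetic, here is a minimal Python sketch. The index size of roughly 4.285 billion pages is an assumption, back-figured from the numbers quoted above (173,000 hits against about 40,372 whG/bp) rather than stated anywhere in this post, and the function name `whg_per_bp` is made up for illustration.

```python
# Assumed size of the search index, in billions of pages. This value is
# inferred from the figures in the post, not quoted from it.
ASSUMED_INDEX_BILLIONS = 4.285


def whg_per_bp(hits, index_billions=ASSUMED_INDEX_BILLIONS):
    """Normalize a raw hit count to hits per billion indexed pages."""
    return hits / index_billions


eggcorn_hits = 11_000     # "inclimate weather"
original_hits = 173_000   # "inclement weather"

print(f"eggcorn:  {whg_per_bp(eggcorn_hits):,.0f} whG/bp")
print(f"original: {whg_per_bp(original_hits):,.0f} whG/bp")
```

Since both counts are divided by the same index size, the normalization leaves the original-to-eggcorn ratio untouched; its only use is to make counts comparable across indexes of different sizes.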

"Inclimate weather" is especially common for a case in which the error is not a recognized orthographic word of English, and so will be caught by any spelling-correction program. Compare (say) feint of heart or reigns of power or honing in on, where spelling correction would have to use a phrasal lexicon and a modest imitation of artificial intelligence.

I surmise that this is why inclimate is rare (though not absent) in journalistic writing, unlike (some of the) eggcorns that map words onto words. Journalists (or their editors) presumably use spelling correction programs. By the numerical evidence, they are also slightly more literate than the public at large, when voyaging beyond the safe harbors of wordlist-based spelling correction:

	whG (web)	whG (news)
inclimate weather	11,000	1
inclement weather	173,000	1,080
original/eggcorn ratio	15.7	1,080
feint of heart	3,920	1
faint of heart	151,000	223
original/eggcorn ratio	38.5	223
reigns of power	7,770	28
reins of power	23,700	119
original/eggcorn ratio	3.05	4.25
honing in on	12,500	29
homing in on	23,900	92
original/eggcorn ratio	1.9	3.2
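
The ratio rows in the table can be recomputed from the raw counts. This is only a sketch in Python: the counts are copied from the table, while the `rows` layout and the `ratio` helper are illustrative names of my own.

```python
# Each row: eggcorn, original, then hit counts copied from the table:
# eggcorn (web), original (web), eggcorn (news), original (news).
rows = [
    ("inclimate weather", "inclement weather", 11_000, 173_000, 1, 1_080),
    ("feint of heart", "faint of heart", 3_920, 151_000, 1, 223),
    ("reigns of power", "reins of power", 7_770, 23_700, 28, 119),
    ("honing in on", "homing in on", 12_500, 23_900, 29, 92),
]


def ratio(original_hits, eggcorn_hits):
    """How many times commoner the original is than the eggcorn."""
    return original_hits / eggcorn_hits


for eggcorn, original, egg_web, orig_web, egg_news, orig_news in rows:
    web = ratio(orig_web, egg_web)
    news = ratio(orig_news, egg_news)
    print(f"{original} vs. {eggcorn}: web {web:.2f}, news {news:.2f}")
```

With a single news hit for an eggcorn, the news ratio is just the raw count for the original, so the two largest news ratios rest on very thin data.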

These are poignant examples of humanity's search for meaning in its linguistic experience, thwarted only in part by technology.

Posted by Mark Liberman at 04:25 AM