Language Log: September 2007 Archives

September 30, 2007

Language Log only pretty strong

A little while ago, Geoff Pullum firmly laid down the law about the name of this blog: it is "strong", that is, anarthrous, lacking the definite article the:

One hundred percent of the references to Language Log by the people who actually write for Language Log say Language Log. None of us call the site the Language Log. And what made us arbiters of good taste? Well, we created Language Log, and coined its name. We coined it as a strong proper name. The sporadic use of the Language Log by others is a sign of imperfect learning.

But alert reader Tim Leonard has observed that one contributor to Language Log sometimes uses the arthrous version of the name; from my own website:

I am an occasional contributor to the Language Log ... ; to the American Dialect Society mailing list; and to Chris Waigl's eggcorn database.

What's going on here? Something subtler than my having learned the name imperfectly.

Although I believe that my writing on Language Log itself uniformly treats the blog's name as anarthrous, occasionally in writing for other audiences (as above) I go arthrous. Here are three more instances:

Several contributors to the Language Log do not share Garner's animus towards however, regarding it as an acceptable alternative to but. (link)

... in postings to the American Dialect Society mailing list and the Language Log (a group blog on linguistic issues) ... (link)

For fun, I did occasional postings to the American Dialect Society mailing list and to the Language Log. (link)

In three of these, the pairing with the arthrous the American Dialect Society mailing list might have promoted my use of the, but I don't think I would necessarily have omitted the article otherwise. In any case, my practice is variable, though with a very strong preference for anarthrousness.

I believe that my inclination to occasionally use the article stems from the generalization that proper names with singular count common nouns as their heads are mostly arthrous -- arthrousness is the default, though there are a number of islands of anarthrousness -- and that this generalization is very strong for certain types of proper nouns, in particular names of organizations (the Federal Bureau of Investigation, the American Academy of Arts and Sciences), institutions (the Massachusetts Institute of Technology, the Frick Museum), and publications (the San Francisco Chronicle, the Daily Telegraph, the New York Observer). The generalization makes sense, because singular count nouns require a determiner to be usable as argument NPs, and for a proper name this determiner would naturally be definite -- a possessive (Craig's List) or the definite article the.

A blog has something of the character of an organization, an institution, AND a publication, and the word log is a count noun, so we'd expect the proper name Language Log to be arthrous, like the Destin Log (a newspaper in Destin, Florida), the HobbySpace Log (a website for space enthusiasts), and the Cruise Log (on USA Today, with information on cruise travel). There's then a good reason why outsiders (people not directly connected to the blog) often treat the blog name as arthrous, and why I sometimes go arthrous when talking to outsiders.

Further complexity. Sometimes a normally arthrous name occurs in a truncated form, for quick reference or in a heading. So there's a webpage with the heading Science Log (truncated), on which we find an arthrous full version:

The Science Log was compiled by Dee Davis, Science Officer for the USS Texas, International Federation of Trekkers.

An outsider might easily take the heading Language Log to be a truncated version of an arthrous name.

Meanwhile, there's genuine variation in article use, as on the Famosa Slough website, where this San Diego wetland is referred to both with the article and without:

The Famosa Slough is a 37-acre wetland between Ocean Beach and the San Diego Sports Arena area.

FRIENDS OF FAMOSA SLOUGH is a group of concerned citizens whose goal is to restore Famosa Slough as a natural wetland preserve.

An outsider might guess that Language Log was like Famosa Slough, with both variants acceptable. The fact that references to Language Log ON Language Log are consistently anarthrous could easily escape a reader's notice. Why should anyone be keeping track of such things?

Still further complexity. Some proper names are anarthrous by local custom: though we'd expect the names to have an article, locals conventionally use the shorter version. As a slogan: familiarity (sometimes) breeds anarthrousness. Since I last wrote about (an)arthrousness, in acronyms and initialisms, people have been writing me about the initialism CIA, which would be expected to be arthrous on general principles, and is indeed so used by outsiders, but which has been widely reported to be anarthrous for those associated with the agency. Nathan Austin notes that Harry Mathews's recent book (part memoir, part fiction) about the agency is entitled My Life in CIA and has the narrator saying (p. 66):

I asked Patrick if there was anything particularly useful he could pass on to me "about the CIA." "The first thing to remember is that nobody connected with the agency calls it the CIA. It's plain CIA."

So even if you noticed the consistent use of Language Log on Language Log, you might think that that was just an insider thing, and that the other variant was allowable.

In fact, when Mark and Geoff started the blog, they chose an unexpectedly anarthrous name for it, and now Geoff would like to stipulate that only this usage is acceptable. But I can't see how people should have been expected to LEARN this.

One further level of complexity. Perhaps Mark and Geoff were thinking of Language Log not as an ordinary proper name but as a title -- like the title of a book, movie, newspaper column, musical group, etc. If so, then the form of the proper NP is entirely up for grabs; as Geoff Pullum noted in The Great Eskimo Vocabulary Hoax, titles don't even have to be constituents (Geoff lists book titles like If on a Winter's Night a Traveler and Dancer from the Dance). So you could name a rock group Statue of Liberty or Federal Bureau of Investigation even when these expressions would require an article when used as ordinary proper names. So maybe Language Log is a title like these, in which case the person who chooses the title gets to stipulate its form. The Stanford linguistics department's rock band is Dead Tongues; the Dead Tongues is just wrong -- not far off, but wrong. Much like Geoff said about the Language Log above. But then the (an)arthrousness of proper names would not be relevant to the question.

(There's a whole lot to be said about titles, and about articles in them, but I'll save that for another day.)

Posted by Arnold Zwicky at 02:59 PM

Grammarians and the road to national recovery

I'm once again reading Clive James, probably the most valuable of the many wonderful imports from Australia to have benefited British culture over the past forty years. A new fourth volume of his Unreliable Memoirs (yes, the first three were decades ago, and I took them to form a trilogy, but now there is more): North Face of Soho: More Unreliable Memoirs (London: Picador, 2007). He always makes me giggle. And when I reached page 150 I almost cheered:

If all the accomplished but not especially interesting would-be writers became schoolteachers and taught grammar, the country would be on the road to recovery.

Yeah! (I thought for one second); grammar could put the country on the road to recovery! But then immediately (funny how fast you can go off people) I saw the other side of it.

Hey! (I now thought); Wait a minute! Who are you calling a not especially interesting would-be writer? Grammarians aren't failed writers who had to go into teaching because they couldn't get their stuff published, you prejudiced Aussie oik! Grammarians are proud and strong, you mealy-mouthed limp-wristed jumped-up fly-blown pusillanimous talentless antipodean cravat-wearing book-reviewing little heap of dingo dung! I'm insulted. I teach grammar because it's important, and cool, and fascinating, and because it's a lot more fun than being a musician in a rock band, that's why. But I've seen this sort of insulting stuff before: we express ourselves temperately and respectfully but people still diss us. The way Cummings did, for example. We grammarians don't get no respect.

Posted by Geoffrey K. Pullum at 09:58 AM

Weisberg wins

According to Sheryl Gay Stolberg, "Bush Trips Over 'Children', and That's the Official Truth", 9/28/2007, NYT:

At an appearance with city officials, including Mayor Michael R. Bloomberg, Mr. Bush sought to spotlight his signature education bill, No Child Left Behind. The president pronounced himself pleased by a recent report that math test scores have improved, citing it as evidence that the law is working.

“Childrens do learn,” the president said, “when standards are high and results are measured.”

Stolberg's story focuses on the fact that the authors of the official White House transcript removed the errant -s, and then re-inserted it on the instructions of the new White House press secretary, Dana Perino, who told reporters that

“You know, the president — it is no secret — sometimes makes grammatical errors,” Ms. Perino said, adding, “The integrity of the transcripts are very important to me, and I’ve made that clear.”

No one seems to have noticed that Ms. Perino's own statement contains a lovely example of "agreement with nearest" ("the integrity ... are"), a frequent phenomenon that is generally felt to be a violation of grammatical norms. So you might expect me to use this as the taking-off point for yet another rant about the hypocrisy of Jacob Weisberg's Bushisms industry.

Or failing that, I could spend some time on the irony of journalistic sanctimoniousness in this case, given how much more careful the White House transcribers are than the spectacularly careless direct-quotation practices of journalists, including those at the New York Times.

But I'm starting to get a sort of Ancient Mariner vibe on these topics. ("Starting?", I hear some of you saying...) So I'll take this one in a different direction.

Adding an extra regular plural marker to an irregular plural form ("mices", "geeses", "childrens") is a common mistake for children at a certain stage of language acquisition, but it's very rare for adult native speakers of English to do this. I can't cite any quantitative studies on the frequency of adult errors of this type (if you have some, please tell me), but that's what intuition tells me.

However, it looks to me like that's what happened in this case. Here's an audio clip of the passage in question:

Though W does pause a bit after "childrens", there aren't any indications of a phrasal planning mix-up, like starting to say "children's r(esults do improve)" and then changing to "do learn". Nor is there any obvious source for the -s from perseveration, anticipation or interchange. Nor is "childrens" likely to be part of the spoken norms of W's youth in Midland, Texas (much less Kennebunkport, Maine, Phillips Academy or Yale).

It's true that there's a (fictional?) southern dialect form often spelled "chilluns", which I especially associate with the Uncle Remus stories:

"'I can't clim', Sis Cow', sez Brer Rabbit, sezee, 'but I'll run'n tell Brer Bull,' sezee; and wid dat Brer Rabbit put out fer home, en 'twa'n't long 'fo' here he come with his ole 'oman and al his chilluns, en de las' one er de fambly wuz totin' a pail."

The Cambridge History of English and American Literature wrote about Joel Chandler Harris's Uncle Remus dialect, "The plurals of all nouns tend to become regular. Thus Uncle Remus says foots (feet), toofies (teeth), and gooses (geese), though the old plural year is retained." That's not quite right, since Uncle Remus has chilluns and not childs -- but this may have been Harris's invention anyhow, since all the recorded examples that I can recall are like John Lee Hooker's "Boogie Chillun", without the final -s.

On the other hand, a web search turns up some apparently authentic transcriptions with "chilluns". But in any case, fictional or not, the Uncle Remus dialect is not any of W's native dialects.

No, I'm afraid that this is just a morphological error, and a rather weird and unexpected one. It's strange to have a president who says things like that. This time, Jacob Weisberg wins.

[Update -- Leslie Katz has pointed out this 9/28/2007 comment on Ms. Perino's "agreement with nearest".]

Posted by Mark Liberman at 08:18 AM

September 29, 2007

Couldn't be more

A nice little exercise in the interaction of negation, modality and conversational implicature, courtesy of Dilbert:

The construction "X couldn't be more Y" has become a commonplace way to say "X is very Y", as in these examples from recent news articles:

Enter Pavlik, 25, who couldn’t be more different than those three.
When it comes to loving the triumphs and feeling passionate about the game, they couldn't be more alike.
I couldn't be more pleased.
.. these linemen couldn't be more determined not to give up sacks ...
In “If I Did It,” the analogy couldn’t be more appropriate.
The Republican nomination race, where fluidity is the name of the game, couldn't be more different.
Star reporter Linda Diebel's latest book couldn't be more timely – it's a biography,
... he couldn't be more impressed.
Calhoun couldn’t be more proud of Okafor ...
The two sides, Germany and Brazil, couldn't be more different.

These are the first ten hits for a Google News search on "couldn't be more", and it's striking that in three of them, Y is different, and in one more, it's alike. The other six values for Y are pleased, determined, appropriate, timely, impressed, proud -- all positively evaluated terms.

The next pageful of ten results couldn't be more similar:

the 1988 comedy that paired Arnold Schwarzenegger and Danny DeVito as siblings, who couldn't be more different.
I couldn’t be more pleased
100 Saints director Ethan McSweeny couldn't be more pleased that nothing stood in the way of Smith joining their team
I couldn’t be more happy with the effort our kids put forth.
The contrast in the teams’ philosophies couldn’t be more stark.
I couldn’t be more excited about finally getting images by Richard Kern on view in Chicago
... the focal points of a Bison attack that couldn't be more perfectly balanced.
The differences between these two candidates couldn't be more striking.
the cast truly loves the film and couldn’t be more proud of it.
I couldn't be more pleased with how the hospitals in the county reacted

Well, of course it could have been more similar, really, but it was pretty close. This time we got seven positive evaluations (pleased (3), happy, excited, perfectly balanced, proud) and three same-or-different (different, contrast ... stark, differences ... striking).

But when people ask whether X couldn't be more Y, they're not asking whether X is very Y -- they're generally expressing a wish that X should change in the direction of Y. And the characteristic values of Y become rather different (though in non-ironic uses, Y of course remains positively evaluated, and predication of similarity, generally expressed as "more like so-and-so", also remains common). The first ten Google hits for "couldn't you be more":

Couldn't you be more original than that?
Why couldn't you be more like Marcia?
Couldn't you be more compassionate?
Couldn't you be more specific?
Why couldn't you be more like John Travolta?
Couldn't you be more direct about it?
Couldn't you be more reasonable and less reactionary?
couldn't you be more careful about where you put your feet just once in a while?
Couldn't you be more original?
Why couldn't you be more like "Seinfeld" or "Friends"?

And "couldn't they be more":

Why couldn't they be more up front?
Couldn't they be more user friendly ...
Could they be more original ...
why couldn't they be more like my friend?
couldn't they be more grateful?
Couldn't they be more useful actually policing ...
why couldn't they be more understanding of him?
couldn't they be more metaphorical?
Oh, why couldn't they be more like me?
Why couldn't they be more like him?

For fans of overnegation, there's a dog-that-didn't-bark problem here. If people get confused about phrases like "the importance of this position cannot be underestimated" because it's hard to process sentences that combine a modal, a scalar predicate, and a negative or two, why doesn't anyone seem to have any trouble with the likes of "I couldn't be more unhappy about this" ?

Posted by Mark Liberman at 08:39 AM

The Barry White effect

You've probably seen the news coverage: "Deep-voiced men 'have more kids'", says the BBC (and Newsweek uses the same headline, minus the quotes); "Women may favor men with deep voices", says Science Daily; "Deep-voiced men make more babies", says AFP; "Barry White's Secret", says the Guardian; "Men with deep voices may be more fertile", says the Boston Globe; "Women prefer men with deep voices", says The Telegraph; "Deep voices equal more babies", says Fox News; "Deep-voiced men likely to have more children", says Reuters.

I'm happy to say that these news stories are focusing, for once, on a sex difference in humans that actually exists, and really is under genetic control. Men do have lower voices than women, by a proportion that is on average much greater than the overall difference in size between the sexes. And this particular form of sexual dimorphism is apparently not shared with our relatives the chimps and gorillas, so it must have evolved during the same period that human speech and language did.

Therefore, starting at some point during the last five million years or so, there must have been a selective advantage for male hominins with lower voices. And according to the featured study (C.L. Apicella, D.R. Feinberg, F.W. Marlowe, "Voice pitch predicts reproductive success in male hunter-gatherers", Biology Letters, published online 9/25/2007), evidence of this selective advantage can still be found today.

Their abstract:

The validity of evolutionary explanations of vocal sexual dimorphism hinges upon whether or not individuals with more sexually dimorphic voices have higher reproductive success than individuals with less dimorphic voices. However, due to modern birth control methods, these data are rarely described, and mating success is often used as a second-rate proxy. Here, we test whether voice pitch predicts reproductive success, number of children born and child mortality in an evolutionarily relevant population of hunter-gatherers. While we find that voice pitch is not related to reproductive outcomes in women, we find that men with low voice pitch have higher reproductive success and more children born to them. However, voice pitch in men does not predict child mortality. These findings suggest that the association between voice pitch and reproductive success in men is mediated by differential access to fecund women. Furthermore, they show that there is currently selection pressure for low-pitch voices in men.

What Apicella et al. did was to collect "voice recordings and self-reported reproductive histories" from "49 men between the ages of 19-55 (M=38.18; s.d.=11.38) and 52 women between the ages of 18-53 (M=32.71; s.d.=9)", recruiting from nine camps of the Hadza people, who are hunter-gatherers living on the savannah in Tanzania. The cultural background:

[The Hadza] number approximately 1000. Women dig for tubers and gather fruits, while men mainly collect honey and hunt animals. Marriages are not arranged, so both sexes are free to choose their spouses, though the approval of their parents is often sought. The Hadza are mostly monogamous although approximately 4% of men have two wives [...]. The divorce rate is fairly high [...], so the mating system can be best described as serial monogamy. Approximately 20% of Hadza stay married to the same person their whole life, and divorce when it occurs is often the result of women not tolerating men's extramarital affairs [...]. Women's extramarital affairs appear to be mostly the result of women deciding to take new husbands when their old husbands leave camps for extended periods of time ...

It's important that the statistical effects of male voice pitch were apparently not simply the result of a selective advantage for larger men; and indeed the researchers didn't find any effect of various measured size differences:

Anthropometrics including height, weight and upper-arm muscle mass were also collected but are not reported here, since they do not predict any reproductive outcomes in either sex after controlling for age, and did not significantly increase the r² values in the models reported below.

All that said, we need to observe that the effect of voice pitch was a statistical tendency, a much weaker effect than I'll bet most readers of the news stories are imagining. (Those stories exaggerated and sensationalized the results of this study; in other news, the sky is still often said to be blue, and water is still widely reported to be wet.)

Here's a graph from Apicella et al., showing number of offspring plotted against voice pitch, after correcting for age:

Fig. 2: Residuals from regression of the number of children born on age plotted against voice pitch. This scatter plot shows a negative relationship between male voice pitch and the number of children born. Note that this relationship remains unchanged when the person with most children born is removed from the analysis.

The two factors of age and voice pitch explained about half of the variance in number of children that men report were born to them (r²=0.505). The authors don't report what fraction of the residual variance is accounted for by voice pitch, after the linear effects of age are taken out.

The authors feel the need to reassure us that "this relationship remains unchanged" when the one outlier guy with many kids (just under 5, in the plot above, after correcting for age) was taken out. (The Reuters story says that he had ten kids, and calls him "the man in the study with the deepest voice", which is an obvious falsehood.) There's another outlier -- that one guy with an average voice pitch of about 175 Hz, whose residual number of children was negative two after correcting for age. With both the super-daddy and the counter-tenor removed, the relationship in that scatterplot is going to look a lot less impressive, though probably there is still an effect.

[Let me climb on my soapbox for a minute, and rant about how there's no excuse for journal editors not to require authors to provide full tables of their data, not just scatter plots and statistical test results. Then we wouldn't have to speculate about things like this.]

As the authors suggest, there are several different possible lines of causation here. The most obvious one is sexual selection (the "Barry White effect"):

Most studies have found that women find lower pitch male voices to be more attractive [...] and judge them to be more dominant, older, healthier and more masculine [...] , while men find higher pitch voices in women to be more attractive, subordinate, feminine, healthier and younger [...] . Furthermore, women's preferences for low-pitch voices in men are greater in the fertile phase of the menstrual cycle [...] and when judging for short-term sexual relationships [...] , suggesting that low voice pitch, like other masculine traits, may signal mate quality [...] .

But male pitch is also correlated with testosterone levels (since it's pubertal response to testosterone that lowers it):

... studies have found that high testosterone levels predict low voice pitch in adult men [...] , and that voice pitch is causally linked to pubertal testosterone levels [...]

and so the (small) reproductive advantage of voice pitch might be the result of other effects of differential testosterone levels, rather than the effects of voice pitch as such.

Whatever the mechanisms of the selective effect, it's certain that there must have been one, and it's pretty clear that it started after the genus Homo separated from the other hominids. It's plausible that this was part of the process that led to human speech and language -- Darwin thought that vocal courtship displays probably came first, and I tend to agree with him:

... [I]t appears probable that the progenitors of man, either the males or females or both sexes, before acquiring the power of expressing their mutual love in articulate language, endeavoured to charm each other with musical notes and rhythm.

On this view, speech is music plus semantics. Whatever the sequence, it would be interesting to know just when hominid laryngeal sexual dimorphism evolved -- and we should in principle be able to figure that out, now or not too far in the future.

We would need to know which genomic variations are responsible for the effects of testosterone on larynx growth during puberty in humans (unfortunately, I don't think that anything about this is now known). Then we could use phylogenetic reconstruction methods to estimate the time depth of the innovation. The estimates would have all the problems that such estimates generally do, and even an accurate date for the change(s) would be susceptible of several different interpretations. But still, it would be a step forward.

If you're interested in more background, here's the section on the larynx from the lecture on Language and Gender in my lecture notes for Linguistics 001 at Penn:

The larynx

Males and females differ little in stature before puberty, but post-pubescent males are about 8-9% taller. According to a database maintained by NIST, the male children in their sample averaged about 3% taller at age 2, and less than 1% taller at age 10, whereas males average about 9% taller at age 18. According to a 1977 publication from the National Center for Health Statistics, at age 2 the 50th percentiles for males and females are identical; at age 10, girls are .6% taller (in the 50th percentile), and at age 18, males are about 8% taller.
With respect to the length of the vocal folds (the tissue in the larynx that is responsible for producing voiced speech), this overall difference between the sexes is magnified by approximately a factor of seven: the vocal folds of post-pubescent males average about 50-60% longer than those of females of the same age (length of the overall glottis or length of the anterior glottis in the figure and table below).

Top view of the vocal cords

AC     anterior commissure
VP     tip of vocal process
AnAC   angle of bilateral vocal folds at AC
GWP    glottic width at vocal process level
LEG    length of entire glottis
LAG    length of anterior glottis
LPG    length of posterior glottis
LMF    length of membranous vocal fold

Measurement        Male    Female   Ratio M/F
AnAC in degrees    16      25
LMF in mm          15.4    9.8      1.57
GWP in mm          4.3     4.2      1.02
LAG in mm          15.1    9.5      1.59
LPG in mm          9.5     6.8      1.40
LEG in mm          24.5    16.3     1.50

(Data and picture from Hirano, M, K Sato and K Yukizane; "Male-female difference in anterior commissure angle", in S. Kiritani, H. Hirose and H. Fujisaki, Eds., Speech Production and Language, Mouton de Gruyter, 1997. The study involved excised larynges from 10 males and 10 females, average age 58 for the males and 66 for the females.)

As a result of these laryngeal changes, adult human males have significantly lower voices than females do, out of proportion to their rather small difference in average height. Though the pitch of anyone's speech depends very much on circumstances, under comparable conditions, (adult) human female voices are likely to show pitches roughly 75% higher than those of male voices. This difference reflects not only the difference in vocal cord length, but also a difference in vocal cord mass -- and perhaps some socially-conditioned factors as well. A graph showing data from various studies is reproduced below (taken from Kent 1994):

Because the larynx also drops lower in the neck in post-pubescent males, the overall adult male vocal tract length is about 15% longer on average. This means that resonance frequencies (including the formant frequencies that determine vowel quality) are also about 15% lower in adult males as compared to females. This is about 175% of the difference expected on the basis of the average overall size differences (8-9%). This difference also means that adult males are even more subject to the risk of choking on aspirated food that is a price the human species pays for adapting its vocal organs to speech.
None of the other species of apes shows a similar sexual dimorphism of the vocal organs, although overall size differences between the sexes tend to be larger in other apes than in Homo sapiens.
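
The arithmetic in this section can be re-derived in a few lines. The Python sketch below is only a rough check, not part of the lecture notes: it recomputes the male/female ratios from the Hirano et al. measurements, then compares the quoted percentage gaps. The midpoints 8.5% and 55% are assumptions standing in for the ranges "8-9%" and "50-60%" given in the text.

```python
# Male and female measurements in mm, copied from the Hirano et al. table.
measurements_mm = {
    "LMF": (15.4, 9.8),
    "GWP": (4.3, 4.2),
    "LAG": (15.1, 9.5),
    "LPG": (9.5, 6.8),
    "LEG": (24.5, 16.3),
}

# Ratio M/F, rounded to two decimals as in the table.
ratios = {
    name: round(male / female, 2)
    for name, (male, female) in measurements_mm.items()
}

# Assumed midpoints of the ranges quoted in the text (not source figures).
height_gap_pct = 8.5   # males "about 8-9% taller"
fold_gap_pct = 55.0    # vocal folds "about 50-60% longer"
tract_gap_pct = 15.0   # vocal tract "about 15% longer"

# How much larger the laryngeal gaps are than the height gap.
fold_vs_height = fold_gap_pct / height_gap_pct
tract_vs_height_pct = 100 * tract_gap_pct / height_gap_pct

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
print(f"vocal fold gap / height gap: {fold_vs_height:.1f}")
print(f"vocal tract gap as % of height gap: {tract_vs_height_pct:.0f}%")
```

Under these assumed midpoints the vocal-fold gap comes out at roughly 6.5 times the height gap, in line with "approximately a factor of seven", and the vocal-tract gap at roughly 176% of it, in line with "about 175%".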

Posted by Mark Liberman at 07:11 AM

September 28, 2007

Cartoon quotatives

In the last panel of this Zits strip, we see teenage quotatives in action -- but they're a bit behind the times:

First, quotative go, and then two occurrences of quotative (be) all. Both of these, and quotative (be) like as well, are stereotyped as characteristic of teen talk. People think teens use quotative all "all the time". But these days it's easier to find examples in comic strips than on the street. All rose as a competitor to like and go (and, of course, say) from the early 1980s to a peak roughly a decade ago and then declined; meanwhile, the combination (be) all like entered the competition, and by now most occurrences of quotative all are in this combination, and like on its own is the quotative of choice among the young (and is used by many others as well). This history is described in two recent papers from the Stanford ALL project:

John R. Rickford, Isabelle Buchstaller, Thomas Wasow, & Arnold Zwicky. 2007. Intensive and quotative all: Something old, something new. American Speech 82.1.3-31.

Isabelle Buchstaller, John R. Rickford, Elizabeth Closs Traugott, Thomas Wasow, & Arnold Zwicky. 2006. The sociolinguistics of an innovation in decline: quotative all. Paper presented at NWAV 35. Submitted for publication.

So Jeremy, above, sounds several years out of date.

Note also the occurrence of DISCOURSE MARKER like in the second panel. This has been widespread (among speakers of all ages) for a long time, and people have been objecting to it for a long time, characterizing it as a "meaningless tic" or an "empty filler" and the like (though there's a considerable literature arguing that it has a variety of meanings and uses). In addition, people tend to lump quotative like and discourse-marker like together, even though they have obviously different syntax and semantics. (When Patricia O'Conner filled in for William Safire in the NYT Magazine "On Language" column back on the 15th of July and cited me -- and Jennifer Dailey-O'Cain and Geoff Pullum -- on quotative like, a number of non-linguist friends wrote me to report on their annoyance at like. But, though O'Conner had gone to some trouble to distinguish the quotative from the discourse marker and said that her column was specifically about the quotative, the messages from my friends were all about the discourse marker.) So it's likely that prejudice against the discourse marker has slopped over onto the quotative. In any case, a great many people are passionately negative about discourse-marker like, quotative like, AND quotative all -- a response that needs some explanation. I'll save that for another posting.

Posted by Arnold Zwicky at 02:17 PM

Have another think

Mark wonders why the OED claims "have another thing coming" is derived from "have another think coming," and yet provides a first citation of the former from 1919 and of the latter only from 1937. The truth is, the updating of the OED takes place in piecemeal fashion, and not every entry can be replenished with new findings at the same time. When the OED first added "have another thing coming" to the entry for thing in 2004, the earliest citation given was from 1981. That was quickly antedated, first by Jesse Sheidlower with a cite from 1959, and then by me with a cite from 1919. So the thing entry was revised yet again to incorporate the new antedatings. The entry for think, however, hasn't been touched for quite a while. When it's finally updated, it can include cites much earlier than the one from 1937. I've reproduced the earliest cites I've found so far on the right. The first is from the Washington Post of April 29, 1897, and the second is from the Chicago Daily Tribune of September 24, 1898. But I certainly don't expect those first cites to stand for very long, as digitized databases of newspapers, magazines, and books continue to expand at a rapid pace.

Posted by Benjamin Zimmer at 09:22 AM

Another thing coming

Adam Liptak, "Verizon Reverses Itself on Abortion Messages", 9/28/2007, quotes one of the people who didn't like Verizon's (later reversed) decision to bar NARAL from sending text messages to its activists:

“I’m a supporter of abortion rights, but I could be a Christian-right person and still be in favor of free speech,” Mr. Hoag said. “If they think they can censor what’s on my phone, they’ve got another thing coming.”

Google has 146,000 hits for "another thing coming", most of which are not the Judas Priest song, vs. 49,300 for "another think coming", which I'm pretty sure is the original expression. (Arnold Zwicky observed thing's internet victory back in June of 2004 -- though the totals were much smaller then, 21,400 to 5,830.)

Although the OED explicitly agrees about the direction of development

to have another thing coming [arising from misapprehension of to have another think coming ...]

the earliest citation for the "misapprehension" is in 1919:

1919 Syracuse (N.Y.) Herald 12 Aug. 8/3 If you think the life of a movie star is all sunshine and flowers you've got another thing coming.

which is 18 years earlier than the earliest citation given for the "correct" form:

1937 Amer. Speech XII. 317/1 Several different statements used for the same idea -- that of some one's making a mistake...[e.g.] you have another think coming.

The fact that Adam Liptak (and his copy editor) rendered Mr. Hoag's quotation with "thing" means (I guess) that they must think that the "misapprehension" is correct. I'd say that they've got another think coming, except that there's some evidence of internal NYT copy-desk dissension (or perhaps lack of interest) on this point. The NYT archive shows 15 instances of "another thing coming" since 1987, compared to 31 of "another think coming".

Not all of these (on either side of the think/thing divide) are in quotations. Thus Ira Berkow wrote ("Liberty Subdues Jitters, and Nearly Comets", 8/27/2000),

But such desperate talk may well have accorded insight into the hearts and minds of the Liberty players. It seemed they were digging themselves a mental hole, but no one could contest that they had a reason.
For the amateur psychologists, however, they'd have another think coming.

And Christopher Bray's book review "Escape From Ischiano Scalo", 9/10/2006:

Sufficient to say that the novel builds, with heartbreaking clarity, to one character’s utterly unpredictable demise. Anyone who thought “The Bicycle Thief” had told them all they needed to know about injustice has another think coming.

On the other hand, Kirk Johnson "Voters, Their Minds Made Up, Say Bin Laden Changes Nothing", 10/31/2004, has the lead sentence:

If Osama bin Laden imagined, in releasing a threatening new videotape days before the presidential election, that he could sway the votes of Kerry supporters like David and Jan Hill and Bush supporters like Paul Christene, he has another thing coming.

And Judy Battista wrote on 3/6/1999 in "St. John's Gets Revenge, and a Shot at UConn":

The payoff finally came last night when, in the waning seconds of a game full of brutal defense and extraordinary intensity, point guard Erick Barkley, who had promised that Miami had another thing coming if it thought this third game would be a cakewalk, stripped Johnny Hemsley of the ball as the clock wound down to single digits, securing a 62-59 victory over the ninth-ranked Hurricanes and St. John's first trip to the Big East final since 1986.

This one doesn't seem to be in the eggcorn database yet.

[Update -- if you're worried about the time sequence of the OED's citations, here's something to set your mind at ease. Or maybe not.

"What was Due the Professor", New York Times, June 9, 1901 (reproduced in full):

She was a Normal School girl and had taken the Regents' examination in Latin. Comely, well-dressed, alert, and rather "proper" in her mannerisms, she would no doubt take great offense if told that she was so addicted to slang that she dropped into it without having any more than a subconscious knowledge of the fact. And yet this is what happened. The examination was over and the papers were being collected.

"Miss ___," said the chief examiner to the young woman, "did you not look on Miss ___'s paper for answers to these questions?"

"No, Sir," snapped the girl, with eyes ablaze.

"Well, Professor ___ thought he saw you do so."

"Well, Professor ___ has another think coming, " retorted the candidate, who expects some day to have in her care a part of the growing population of New York City.

So it seems that as with "home/hone in on", the two versions of this expression have been more or less in sociological equilibrium since the beginning.]

[More here.]

[Update 9/30/2007 -- Today, a Metafilter post on why 0.999... is equal to 1.0 was hijacked by a debate on "have another think/thing coming". Feelings among the Mefians run surprisingly high on this question. There were 211 comments at last count, more than twice the number of the next most active post for the day.]

Posted by Mark Liberman at 06:27 AM

September 27, 2007

Langage SMS

Maybe it was the color scheme in the photograph in Victor Mair's guest post earlier today, but I'm reminded of a novelty mini-ruler I saw while vacationing in France earlier this summer, describing various SMS text message abbreviations in French. (Disclaimer: I don't even text in English, much less in French. I have no idea of the extent to which the shortcuts described on the ruler are used by French SMSers, if any.)

Some of the examples, like bap = "bon après midi" (= "good afternoon"), are simple initialisms (like lol = "laughing out loud"); others, like bcp = "beaucoup" (= "much") are words recoverable without vowels and/or certain consonants (like mbrsd = "embarrassed").

Most interesting (to me) are examples that take advantage of the phonetic similarity between the name of a letter/number and another word or sub-part of a word. For example, a12c4 = "à un de ces quatre" (= "see you later", lit. "until one of these four") features three numbers, two of which correspond to numbers in the phrase being abbreviated (1 = "un", 4 = "quatre"), but the third corresponds to a similar-sounding word (2 = "de", the number being "deux"); it also features two letters, one of which abbreviates a word its name sounds like (c = "ces").

Just thinking of numbers: in English, the pronunciations of 2, 4, and 8 have the highest likelihood of substituting for (parts of) words (e.g., "too"/"to", "for", words in "-ate"); in French, it's 1 and 2 that predominate -- but there's also 9 = "neuf" (= "new"), as in koi29 = "quoi de neuf?" (= "what's new?"), or 6 = "six" (pronounced like "see"), as in GT o 6né = "j'étais au ciné" (= "I was at the movies"), or 5 = "cinq" (pronounced like "sang"), as in C pa 5pa = "c'est pas sympa!" (= "that's not nice!"), or (my favorite) 100 = "cent" in D 100 = "descends!" (= "come down!").

Anyway, click on the pic and check out the other examples on the novelty ruler -- and if you don't know any French, find someone who can explain them to you. They're kinda fun.

[ Comments? ]

Posted by Eric Bakovic at 07:44 PM

Gender-role resentment and Rorschach-blot news reports

Yesterday, the New York Times opened up its web site to comments on David Leonhardt's article "He's Happier, She's Less So", which explained that "two new research papers, using very different methods" both concluded that "there appears to be a growing happiness gap between men and women". I posted about this article ("The 'happiness gap' and the rhetoric of statistics") because I've become interested in the rhetorical strategies of science journalism.

In this case, there's another variable: clever marketing. A reader asked this morning, "does the NYT strike you as particularly fond of edgy young economists doing edgy young things?" If so, it's because they know their market. Leonhardt's article got 700 comments in some 20 hours, before they closed the door to more.

I haven't read the whole list, but you don't have to go far into it to realize that essentially the entire readership of the NYT fell for the characteristic journalistic misrepresentation that I identified: "turning small differences in group distributions into categorical statements about group properties".

And you don't have to read all 700 comments to realize that there are a lot of unhappy people out there.

Here's a sample from the first couple of dozen:

Men are dogs. Dogs are happy. Voila.
I agree with the most likely explanation noted namely that women now have more to excel at and on the other hand fail at.
For women, it's all in the details. For men, not so much. It's time to follow the lead of the GM workers and go out on strike.
As soon as I read "What has changed..." I knew that the 'explanation' was the female juggle.
It's because many of the gains for equality for women in the job markets made in the 70's and 80's have been eroded away by the Bush Administration.
We're less happy because we're tired. [...] It's too much. I'm ready to crash and burn.
In the '70's, women thought there was a chance at finally having an intellectually stimulating career, without sacrificing having a family. 30 years later, we are struggling with the rigidity of our culture that continues to make that unattainable.
...it takes nothing more than a shiny object or a breast to make men happy. They really are shallow, lazy, morons.
Women are presented with impossible aspirational models. Men, in my opinion, are less reflective.
Men appear happier now because behind every happy man is a good woman giving her life to make him that way. They have everything they had in the 1970s plus their wife's second income.
All of the talents and skills that most women are naturally good at (being a good friend, supporting their communities, managing a household and raising children), have been devalued by our society. Instead we are supposed to go into the workplace and act like men - what a waste.

The comments over at digg.com are drawn from a different distribution -- of equally unhappy people, who were equally eager to swallow Leonhardt's misrepresentations hook, line and sinker:

Oh boo hoo! Feminists made their bed, now they have to lie in it alone with their cats.
Duh, we get Halo, and you get periods.
Men are getting smarter and aren't handing over every penny they make to a pretty face. This makes woman sad.
no shit, work sucks...welcome to what men have been putting up with for hundreds of years
So can we now finally cut the crap that being a housewife is just as hard as holding a traditional job?
Internet (read:porn) caused this. We don't need women anymore.
It is because women wanted to go to work, like a kid wants a toy lawnmower after watching his dad mow the lawn. Now that they have to work they realise it sucks so they are sad.

OK, everybody, take a deep breath and listen: THERE IS NO HAPPINESS GAP!

Every year since 1972, the General Social Survey has been asking a big demographically-balanced sample of American men and women "Taken all together, how would you say things are these days? Are you a) very happy, b) pretty happy, c) not too happy."

Neither in 1972 nor in 2006 was there any statistically significant difference between men and women in the distribution of their responses! And in both 1972 and 2006, the proportion of women who said "very happy" was a little bit higher than the proportion of men who gave that response (though again, in neither year was the difference distinguishable from chance fluctuations).
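For readers who want to see what "not statistically significant" means for a three-way response like this, here is a minimal sketch in Python. It uses a plain Pearson chi-square test on a 2x3 table, which is a simpler stand-in and not the method used in the research discussed here. The counts are invented purely for illustration and are not GSS data; the function name is likewise hypothetical.

```python
import math


def chi_square_2x3(row_a, row_b):
    """Pearson chi-square statistic and p-value for a 2x3 table of counts."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            # Expected count if response were independent of sex
            expected = row_total * column_total / grand_total
            stat += (observed - expected) ** 2 / expected
    # Degrees of freedom = (2 - 1) * (3 - 1) = 2; for 2 degrees of freedom
    # the chi-square tail probability is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2)
    return stat, p_value


# HYPOTHETICAL counts: very happy, pretty happy, not too happy
men = [300, 550, 150]
women = [320, 540, 140]

stat, p_value = chi_square_2x3(men, women)
print(f"chi-square = {stat:.2f}, p = {p_value:.2f}")
```

With these made-up counts, women say "very happy" two percentage points more often than men, yet the p-value is far above any conventional threshold, so a difference of that size in a sample of 2,000 would be indistinguishable from chance.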

So what is everyone talking about? Well, some economists fit a complicated statistical model (called an "ordered probit") to the whole sequence of survey results from 1972 to 2006, and this analysis suggests that women have become a little tiny bit less happy relative to men over that whole time period. But the effect is so small that you can't actually see it in the statistical analysis for any one year; the effect is much smaller than the amount of year-to-year jiggle. That's true even though the General Social Survey involves a huge sample, much bigger than is normally used for opinion polls: 4,500 people in 2006.

And then there's some as-yet-unpublished stuff about how the amount of time per week that men say they spend in activities that they find unpleasant has decreased by about 3.6% over the past 40 years, whereas the amount of time that women report spending on unpleasant activities has remained about the same. Since the data remains unpublished, it's hard to know what to make of this, but I'm betting that the between-group change is minuscule relative to the within-group variation, just as in the GSS analysis. If you want to read more about the research, check out this post and the link therein. For now, I'll just reproduce the crucial graph of GSS responses over time:

So look: you can stop trying to explain the happiness gap, because there's nothing much there to explain, at least about differences between men and women.

This doesn't invalidate the struggle of people juggling work and family and friends and sex and everything, or the concerns about (un)changing gender roles -- but we should be able to talk about those things without projecting them onto marginal social-science analyses showing possible differences between groups and across time that are at best a small fraction of the within-group variation.

Still, it does look to me like there are some gaps worth commenting on here.

There's the gap between the people who comment on the NYT web site and the people who comment on digg.com.

There's the gap between female resentment of men and male resentment of women, at least as seen in the fraction of the population who felt moved to comment in various forums on this article.

And most important, there's the gap between what you read in the papers and the truth.

[Today (Sept. 28, 2007) Leonhardt's parable has moved up to the #1 "most emailed" slot on the NYT web site. And the story has been picked up by the TV networks and other media outlets. ]

[More here.]

Posted by Mark Liberman at 04:30 PM

Hu's on Jeopardy

Jeopardy has been doing pretty well recently on the linguistic front, but last night they got one wrong. The clue asked for two English words that are homonyms of the names of the President and Premier of China. The desired response was who and when. The current President is Hú Jǐntāo (胡锦涛); his family name is indeed pronounced like English who. The current Premier is Wēn Jiābǎo (温家宝). His family name is not, however, pronounced like English when. In the pinyin romanization, the letter <e> is pronounced [ə] in most environments, including this one, so his name actually sounds more like English won.

Posted by Bill Poser at 02:58 PM

Forbidden Language and "Virtuous" (i.e., Disgusting) Conduct

[Guest post by Victor Mair]

This poster is from the Silk Street Market in Beijing.

Since the exact same sign is also found in the Pearl Market (a bustling multi-story "mall") in Beijing, it is probably standard issue for a certain type of store complex. Now, one who has not spent much time in Beijing may think that salespersons could not possibly speak to customers in terms that are as crudely offensive as the "forbidden words", but I have personally heard every single one of these expressions in stores throughout Beijing.

I shall not provide a commentary on all ten of the items that have supposedly been prohibited by the management, because most of them are fairly straightforward. I shall, however, note only that No. 3 sounds worse in English ("shit") than it does in Mandarin (HU2SHUO1BA1DAO4 ["nonsense; rubbish"]). No. 6, BEN4DAN4, literally means "stupid egg," but should be rendered as "fool; idiot."

The translator failed to rise to the occasion on Nos. 8-10. These last three items do become progressively more elusive and difficult to translate, but my attempts are below.

8. "If you can't afford it, don't ask blindly / look blindly" (i.e., don't ask stupid questions / don't keep looking blindly [at the goods]).

9. "Are you a man?" (a challenge to the potential buyer's masculinity [a *real man* would buy this thing right away without pussyfooting around]). Keep in mind that most of the sales personnel who would be hurling this sort of charge would be women.

10. "Just look at how disgusting / revolting you are!" QIAO2 NI3 NA4 DE2XING4. Literally, "look you that virtuous-behavior / nature." The use of DE2XING4 ("virtuous-behavior / nature") to mean virtually its opposite is a good example of the penchant for extreme FAN3HUA4 ("irony; facetiousness") that is pervasive among various social groups in China.

As a matter of fact, the first few times I heard this put-down back in the 80s, it was reduced just to the last two syllables -- DE2XING4 -- and I couldn't for the life of me understand what was so bad about saying "virtuous behavior / nature" to someone. But I could tell unmistakably from the withering tone of voice and the dismissive half-turning away that left the customers wilting like delicate flowers under a scorching sun that something awful had been uttered. Furthermore, to achieve the full effect, DE2XING4 has to be enunciated with exactly the right rhythm and emphasis. The DE2 is a long, drawn-out, rising tone, and the XING4 is a rapidly falling tone -- sometimes accompanied by a snort and a sneer.

According to my jealous yet awed friends from other parts of China, only a Beijing woman can articulate DE2XING4 with just the right nuance to reduce the object of her contempt to smithereens: DUUUUHHH-SHEENG! (I should mention, however, that I have heard Nanjing women and Ningbo women unleash totally different types of indirect invective that are every bit as impressive and devastating in their own way as the DE2XING4 of a Beijing shopgirl.)

To make a few general observations before closing, note that the recommended expressions on the left are much more verbose in comparison with the forbidden insults on the right, which, except for No. 7, are uniformly laconic. (I think it will be next to impossible to train Beijing sales girls to produce such lengthy, unctuously polite expressions. They prefer to speak in short blasts and bursts.) Furthermore, it is remarkable once again (as I have pointed out in previous posts) that bilingualism is such a conspicuous feature of the contemporary scene in China (at least in the big cities), with English playing an increasingly prominent role in so many areas of daily life. Finally, it is worth pondering why the quality of the translations of the acceptable expressions, while far from perfect, is so much better than that of the forbidden expressions. Though I have some ideas about why that is so, I leave it to my readers to come up with their own explanations for this curious phenomenon.

[Photo from 1largeprawn]

Posted by Mark Liberman at 01:54 PM

Treason in Georgia?

No matter how hard I try, I just can't seem to retire. I've failed at it over and over again. I'm a sucker for interesting new law cases and so when an old friend, Larry Barcella of the Paul Hastings law firm in Washington DC, called me to help him with a treason case in the Republic of Georgia, I quickly agreed. It's a fascinating case with international implications that I won't get into now but you can read brief summaries in the Christian Science Monitor (here) and in Russia Today (here).

On May 4, 2006 a meeting was allegedly held in Tbilisi in which the participants were said to have plotted to overthrow the government of Georgia. Four of the people who claim to have attended this meeting became the prosecution's best witnesses in the treason trial that followed, along with seven others who admitted that they did not attend this alleged meeting but who were told about it by others (note: hearsay evidence appears to be allowed in Georgia's legal system). Although the judge closed the trial to the public, including the media, reports about some of the human rights issues managed to trickle out in a variety of places, including Harper's Magazine on July 10 and July 20 of 2007.

The alleged convener of this alleged May 4 meeting was Maia Topuria, a leader of one of the many political opposition parties in Georgia. Forensic linguistics gets into the picture because eleven key government witnesses produced handwritten "confession statements" about what took place at this meeting. Lawyers for the defendants asked me to analyze these statements. I did so and produced a fifty-page expert report containing six charts. The forensic linguist's role here is to analyze the language to determine whether there is linguistic evidence of some kind of influence on those who produced it, but not to try to explain the source or cause of such influence. That's an issue for the lawyers to explore.

My report was translated into Georgian by a court appointed translator and all of the witness statements were officially translated into English. This was a bench trial so my report was read by the judge and was also used extensively by all the defense lawyers, including Lawrence Barcella, Melinda Sarafa, and Gela Nikolaishvili, in their cross examinations and closing statements.

One crucial question for the defense was whether the eleven witness statements, given to at least four different police investigators over a three-month period of time, were produced independently (as the prosecution claimed) or made under the direction and influence of the police (as the defense claimed). If the latter was the case, their authenticity and truth could be seriously questioned. Linguists analyzing written texts apply the conventional tools of linguistics, such as discourse analysis, lexical analysis, and syntax analysis, to the texts, comparing each witness's statement with the others to discover what is similar and what is different. And that's what I did here.

Topic sequencing turned out to be important. Even though the police investigators claimed that the witnesses wrote their statements independently, my charts comparing all of the witness statements with each other evidenced a remarkably similar (very often identical) sequential order in the topics they introduced. Different people who claim to have witnessed the same event (or got their information second-hand) may be expected to talk about some of the same facts in common, but it is highly improbable that they will report them in a similar topic sequence or, as happened very frequently here, in the identical topic sequence.

I also compared the words and expressions used by the eleven witnesses. It can be expected that some words will be used in common but it is highly improbable that both the witnesses who claim to have been present at the alleged meeting and those who were not there but were told about it by others will select identical words and expressions to the extent found in these statements. In my report I cited 32 examples, most of which were also sequenced identically. Examples of these include "many thousand meeting," "the growing character of," "imitation of rushing into the chancellery," "shooting in the direction of," "elucidation by television," "pretext of protection," "to liquidate the ministers," "highly class professionals," and "make a decision to retire." The lexical inventory of Georgian is large enough that we could expect to find far more variability in these (and many other) words and expressions than appears in what the witnesses wrote.

A remarkable similarity of sentence and clause structure was also found in the writing of the witnesses. My report cited 32 examples of these. One example shows the following identical syntax: "In the beginning of May of 2006, I do not remember the exact date, I was in the central office of the party." Three of the witnesses wrote this in exactly the same way, while the others varied only slightly, omitting the "I do not remember the exact date." The other 31 similar or identical sentence structures were also commonly used by the witnesses.

But two of the witnesses, who claim to have heard about the May 4 meeting from two different people who allegedly attended it, produced unbelievably similar statements. At one point they produced 14 consecutive, very long sentences in which all of their words, syntax, and punctuation marks were completely identical. 770 consecutive words, exactly the same!
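To make the idea of "consecutive identical words" concrete, here is a minimal sketch in Python of one way to find the longest run of words that two texts share. It is only an illustration and not the procedure used in the expert report; the function name is hypothetical, and the two sample strings are the sentence variants quoted above, not the full witness statements. It relies on `difflib.SequenceMatcher` from the standard library.

```python
from difflib import SequenceMatcher


def longest_shared_run(text_a, text_b):
    """Return the longest run of consecutive words common to both texts."""
    # Crude tokenization: lowercase and split on whitespace, so punctuation
    # stays attached to words and must match too.
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    match = matcher.find_longest_match(0, len(words_a), 0, len(words_b))
    return words_a[match.a:match.a + match.size]


statement_a = ("In the beginning of May of 2006, I do not remember the exact "
               "date, I was in the central office of the party.")
statement_b = ("In the beginning of May of 2006, I was in the central office "
               "of the party.")

run = longest_shared_run(statement_a, statement_b)
print(len(run), "words:", " ".join(run))
```

Independently written accounts of the same event would normally share only short runs like this one; a run of hundreds of identical words, punctuation included, is what makes the two statements described above so hard to explain as independent.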

One of these two witnesses was asked during cross-examination for the meaning of some of the words he had written. He said "liquidate" meant "to devastate," "dispute" meant "a TV debate," and "imitation" meant "an attempt." He said he could no longer remember what he meant by "spontaneous," although he knew what it meant when he wrote it. Equally astonishing at trial was when the defense showed videotaped evidence that on the day one of the witnesses said he was giving his supposedly independent statement to the government, he had actually spent the entire day in his office.

And the result of all this? One might expect evidence like this to cause the judge to acquit. Not so. In late August, all of the defendants, including Maia Topuria, were found guilty of treason. Topuria was sentenced to eight and a half years in prison. The lawyers are taking this case to the European Court of Human Rights.

The judicial system of Georgia needs a lot of help. Whatever shortcomings the American system may have, these must pale in comparison to those in the Republic of Georgia.

Posted by Roger Shuy at 01:22 PM

September 26, 2007

The "happiness gap" and the rhetoric of statistics

More precisely, I'm talking about the rhetorical translation of statistical results into journalistic generalizations.

In today's NYT, David Leonhardt ("He's Happier, She's Less So") tells us that in a recent survey, men and women "often gave similar answers about what they liked to do". However, "there were also a number of activities that produced very different reactions from the two sexes" -- for example, men said that they found being with their parents unpleasant only 7% of the time, while women found it unpleasant 27% of the time.

But that's just the teaser. According to Leonhardt,

This intriguing -- if unsettling -- finding is part of a larger story: there appears to be a growing happiness gap between men and women.

Two new research papers, using very different methods, have both come to this conclusion.

The way he tells us about this "growing happiness gap" is a lovely example of scientific research as moral fable. And his story is also an especially clear case of a key method in this transformation: turning small differences in group distributions into categorical statements about group properties.

Here's the first piece of research, as Leonhardt summarizes it:

Betsey Stevenson and Justin Wolfers, economists at the University of Pennsylvania (and a couple), have looked at the traditional happiness data, in which people are simply asked how satisfied they are with their overall lives. In the early 1970s, women reported being slightly happier than men. Today, the two have switched places.

Stevenson and Wolfers report their results in a paper, available as a preprint on their web site, titled "The Paradox of Declining Female Happiness". Here's Figure 1 from their paper, presenting data from the General Social Survey:

It's not obvious, by the method of ocular trauma ("what strikes the eye"), that the sex differences in this data are anything but random noise. But the authors' "ordered probit regression analyses" produce "implied estimates of the gender happiness gap" which allow them to assert that "At the start of the sample women reported higher levels of subjective well-being than did men, however by 2006 this earlier gap had reversed and women's subjective well-being in recent years is lower than that of men".

I'm not about to argue with an ordered probit -- the penalty for that is 15 to 30 months in the slammer, as I recall. (And I'm in the middle of teaching Generalized Linear Models at the moment, so it would be hypocritical of me to bad-mouth one of them.) But in fact, as their Table 1 indicates, the ordered probit analysis found that the "Gender happiness gap" was not statistically significant, either in 1972 or in 2006, even at the 0.10 level. The significant effect was the "Difference in Time Trends".

My point is that these effects, whatever they are, are quite small, requiring clever statistical analysis over very large amounts of data to be seen at all. The researchers themselves describe their inferred distributions this way:

Comparing the 2006 medians with the distribution for men in 1972, we see that the median woman in 2006 is as happy as a man at the 48.8th percentile in 1972 [...], while the median man in 2006 is as happy as the man at the 50.7th percentile in 1972.

1.9 percentile points is not much of a gap, if you ask me. I'd call it more of a crack, or maybe just a wide pencil line.

The Stevenson and Wolfers paper looks at some other data, and has many interesting things to say -- the main message is expressed in their abstract this way:

By most objective measures the lives of women in the United States have improved over the past 35 years, yet we show that measures of subjective well-being indicate that women's happiness has declined both absolutely and relative to male happiness. [...] Our findings raise provocative questions about the contribution of the women's movement to women's welfare and about the legitimacy of using subjective well-being to assess broad social changes.

But I'd like to pause at this point to consider the rhetorical effect of the report of these results in the New York Times. Leonhardt's article came up in a couple of conversations that I was involved in today. In each case I showed Figure 1 (above). People found it puzzlingly inconsistent with the message that they had taken away from their encounter with the newspaper.

Most people think in essentialist and non-statistical terms, as if all the members of a category were uniform copies of an invariant prototype. I suspect that most journalists think this way too, but in any case, they certainly write as if they do.

Here, we start with a study that found *no* statistically significant difference in male vs. female group happiness at either end of a time series, even though the data came from a large survey, whose size of 1,500 respondents in 1972 rose to 4,500 respondents by 2006. Looking across all 34 years, the researchers were able to find a statistically significant difference in overall male vs. female trends. The magnitude of this effect was cumulatively quite small (though doubtless important from the perspective of the philosophy of economics).

But what people take away from the journalistic description of this study is that women used to be happier than men, and now men are happier than women -- and they think of this as a fact about all men and all women. In fact, we're talking about effects whose size is such that perhaps the happiest half of the population, on an optimistic reading of a complex statistical reconstruction, contains a couple of percent more of one sex than the other! When I show readers of the NYT article the graph of the data that underlies this study, they're flabbergasted.

OK, how about that other study? We're told that it's "even starker":

Mr. Krueger, analyzing time-use studies over the last four decades, has found an even starker pattern. Since the 1960s, men have gradually cut back on activities they find unpleasant. They now work less and relax more.

Over the same span, women have replaced housework with paid work -- and, as a result, are spending almost as much time doing things they don't enjoy as in the past. Forty years ago, a typical woman spent about 23 hours a week in an activity considered unpleasant, or 40 more minutes than a typical man. Today, with men working less, the gap is 90 minutes.

Unfortunately, the paper with the details "will be published in the Brookings Papers on Economic Activity", and I haven't been able to find a preprint. So we'll have to imagine what we'll find when the paper is out, and instead of being told about "a typical woman", we can look in detail at the distribution by sex of time spent per week in activities "considered unpleasant", and how those distributions have changed over the past 40 years.

I can tell you what I expect.

The statement in Leonhardt's NYT article, which we can bet is chosen from the available numbers to make the point as "starkly" as possible, means that over a period of 40 years, women's average self-report of participation in unpleasant activities has increased by about 50 minutes per week, or about 7 minutes per day, relative to men's. This is about 3.6% of the average time reported for such activities. The cited gap of 90 minutes a week between women and men is about 6.5% of the overall average.
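The percentages above can be reproduced from the three figures quoted in the article. Here is a minimal sketch in Python, assuming (as the text does) that the 23 hours a week reported for a typical woman stands in for the overall average; the function and variable names are illustrative only.

```python
# Figures quoted in the NYT article (see the passage above).
HOURS_UNPLEASANT_PER_WEEK = 23   # "about 23 hours a week"
GAP_THEN_MINUTES = 40            # weekly female-male gap forty years ago
GAP_NOW_MINUTES = 90             # weekly female-male gap today


def gap_summary(base_hours, gap_then, gap_now):
    """Express the change in the weekly gap, and the gap itself,
    as percentages of the baseline weekly time (assumed: base_hours)."""
    base_minutes = base_hours * 60
    change = gap_now - gap_then
    return {
        "change_min_per_week": change,
        "change_pct_of_base": 100 * change / base_minutes,
        "gap_pct_of_base": 100 * gap_now / base_minutes,
    }


summary = gap_summary(HOURS_UNPLEASANT_PER_WEEK,
                      GAP_THEN_MINUTES, GAP_NOW_MINUTES)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Swapping in a different baseline (say, an average over both sexes, once the paper is out) changes the percentages only slightly, since the gap is small relative to 23 hours.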

I don't know what the standard deviations of these reported unpleasant-activity times are, but we can guesstimate them based on Krueger, Alan B. and David Schkade. "The Reliability of Subjective Well-Being Measures", January 2007, which analyzed

the test-retest reliability of two measures of subjective well-being: a standard life satisfaction question and affective experience measures derived from the Day Reconstruction Method (DRM).

What they found was that people were pretty inconsistent in reporting their affective state:

We analyzed the persistence of various subjective well-being questions over a two-week period. We found that both overall life satisfaction measures and affective experience measures derived from the DRM exhibited test-retest correlations in the range of .50-.70. While these figures are lower than the reliability ratios typically found for education, income and many other common micro economic variables, they are probably sufficiently high to support much of the research that is currently being undertaken on subjective well-being, particularly in cases where group means are being compared (e.g. rich vs poor, employed vs unemployed) and the benefits of statistical aggregation apply.

With test-retest correlations of 0.6 or so, we'd expect multiple tests of the same individual to show a large variance for "time spent per week in activities considered unpleasant", and for group (i.e. male or female) variances to be even larger. It wouldn't be surprising, I think, to learn that the pooled standard deviation of "time spent per week in activities considered unpleasant" is something like 50% of the mean, or around 11.5 of the 23 average unpleasant hours. If so, then the effect size of the cited 90-minute gap might be something like 1.5/11.5 = 0.13.
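To make the guesswork explicit, here is a small Python sketch of that effect-size calculation. The 50% figure for the pooled standard deviation is the guess made above, not a published value, and the helper name `effect_size` is just for illustration.

```python
def effect_size(mean_difference, pooled_sd):
    """Standardized effect size: the difference between group means,
    measured in units of the pooled standard deviation."""
    if pooled_sd <= 0:
        raise ValueError("pooled_sd must be positive")
    return mean_difference / pooled_sd


MEAN_HOURS = 23.0            # average weekly "unpleasant" hours (from the article)
ASSUMED_SD_FRACTION = 0.5    # guess: pooled SD is about 50% of the mean
GAP_HOURS = 1.5              # the cited 90-minute weekly gap

pooled_sd = ASSUMED_SD_FRACTION * MEAN_HOURS   # 11.5 hours
d = effect_size(GAP_HOURS, pooled_sd)
print(f"pooled SD = {pooled_sd} hours, effect size = {d:.2f}")

# How sensitive is the result to the guessed SD?
for fraction in (0.25, 0.5, 0.75):
    print(fraction, round(effect_size(GAP_HOURS, fraction * MEAN_HOURS), 2))
```

Even if the true standard deviation were only a quarter of the mean, the effect size would still be about a quarter of a standard deviation.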

If you're not familiar with the concept of "effect size", you can read about it here -- an old Language Log post where it turned out that the effect size of the difference in talkativeness between males and females, measured in words per conversational side, was 0.128, which corresponds to a pair of word-count distributions that looked like this:

Again, readers of the New York Times article take away the impression that each woman is spending an hour and a half more per week in "activities considered unpleasant" than each man is. But whatever the real "happiness gap" in this study turns out to be, it's likely that the between-group effect size was very small. And one way to quantify this is to ask what the odds are that a randomly selected woman reports more time per week spent in unpleasant activities than a randomly selected man does. If the between-group effect size is really as small as 0.13, then the random woman will log more misery-time than the random man about 54 times out of 100. This is a difference in the same general range as we saw in the first study.
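The step from an effect size to "54 times out of 100" can be sketched as follows. This assumes something the eventual paper may not bear out: that both groups' reported times are roughly normally distributed with the same standard deviation. Under that assumption, the chance that a random member of the higher group outscores a random member of the lower group is the normal cumulative probability of d divided by the square root of 2. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_superiority(d):
    """P(random draw from group A > random draw from group B), assuming
    two normal distributions with equal SD whose means differ by d SDs.
    The difference of the two draws has SD sqrt(2), hence d / sqrt(2)."""
    return normal_cdf(d / math.sqrt(2.0))


p = probability_of_superiority(0.13)
print(f"about {round(100 * p)} times out of 100")
```

With no difference between the groups (d = 0) the function returns exactly 0.5, the coin-flip baseline against which the 54 should be judged.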

[I apologize for the crudely hypothetical analysis of this research -- but I'm not the one who generated a widely-discussed NYT article on a study's conclusions before publication of any of the details on which those conclusions are based.]

OK, so imagine coming into a door labeled "the room of unhappy people". You enter, and find yourself in a hall with between 51 and 54 women, and between 46 and 49 men. Do you think that you could decide which sex predominated, without lining everyone up and doing an explicit count?

Now imagine that you walk through two such rooms, where the first one is around 51-to-49 female, and the second is around 54-to-46 female. Do you think that you would notice the direction of difference in the sex ratios, without another pair of line-ups?

More to the point, do you think that you could spin differences like these into today's second-most-emailed NYT story?

If your answer is "yes", then you may have a future as a science writer. (Or, perhaps, as an economist...)

[To forestall objections from the well-informed readers who occasionally take me to task for treating survey answers as if they were unbiased indicators of people's true internal states and behavioral dispositions, let me stress that I'm just going along with the assumptions of the research under discussion. The group differences in these studies are small ones, whether they're really differences in overall emotional state and in affective reactions to life's experiences, or differences in the mapping from internal emotional states to ways of answering survey questions about feelings and in estimating percentages of time spent in various ways. Whatever they are, these small differences between group distributions have now been transformed, in the public's mind, into facts about all the individual group members.]

[Update -- more here.]

Posted by Mark Liberman at 10:11 PM

What's so funny about endangered languages?

There seems to be a perfect media storm gathering around K. David Harrison and his efforts to bring attention to endangered language research (see here, and follow the link to the comments). Though it doesn't (yet) appear on his press page, David was interviewed by Stephen Colbert last night on The Colbert Report -- follow the link and see the video. (You may still be able to catch the re-run tonight.)

The show was introduced like so:

Tonight -- half the world's languages have gone extinct. I tell them: "good riddance" and "adios".

C'mon, it's a little funny. Especially from a man who changed his name so that it's pronounced with a "soft t".

[ Comments? ]

Posted by Eric Bakovic at 09:00 PM

Hyphen migration, cartoon style

Games you can play with hyphens, from the webcomic xkcd:

(Hat tip to Martin Smith.)

Posted by Arnold Zwicky at 04:44 PM

For all who lack hacking

Fev at Headsup: the Blog caught a great case of cross-dialectal misunderstanding in the Charlotte (North Carolina) Observer ("Tin Ear award of the week", 9/24/2007). The story (since corrected) was about some boy scouts missing on a hike in the mountains:

"We think it's most likely that they realized it was late and they bedded down for the night," said Charity Sharp, of the Cruso Volunteer Fire Department in southern Haywood County. "They were prepared. They knew what they were hacking into. The scout leader is familiar with the area and knew what kind of terrain they were hacking."

Fev's observation:

...as a near-30-year resident of the fair state in question, HEADSUP-L is inclined to suggest that Ms. Sharp said "hiking." H-I-K-I-N-G. Given that they were going on a hike, not trying to steal the Pentagon's launch codes, I mean.

This seems to be some counter-evidence to the claim that "the Southern drawl has expanded to the point where, arguably, more than half of all Americans now glide their diphthongs and hush their Rs like modern-day Rhett Butlers". At least the diphthong part seems not yet to have expanded to include all of the Charlotte Observer's newsroom.

As Fev says,

If you're going to be the Foremost Newspaper of your state, you need to know how its people talk.

Posted by Mark Liberman at 08:05 AM

Stealth variation

Commenting on Dick Margulis's comment on "'Be done' again", an anonymous reader writes:

Some months ago, I accidentally stumbled upon a grammatical difference on this point with a friend and fellow linguistics Ph.D. student. I said to her (by IM), "I think I'm done the exercise." She was shocked that I found that grammatical - it was unquestionably blindingly bad for her. I, on the other hand, couldn't figure out why she was asking if I really found it grammatical. I couldn't see anything about it that looked even slightly questionable. Our initial investigations into who found it good and who didn't seemed to suggest that Canadians (perhaps particularly Western Canadians - I'm from British Columbia myself) tended to find it good and Americans didn't, but then we came across someone from New Jersey who used it - and your "side note" adds more Americans who like it.

One thing that's interesting to me about this is that people on each side of the usage seem to be completely unaware that it could be different. I was amazed that there were people who couldn't say things like "I'm done my homework", and I've found similar reactions from other people who find it good; my fellow student was amazed that anyone could say things like that, and I've seen similar reactions from other people who find it bad. At least with the people I've talked to so far, there doesn't seem to be a lot of "well, I've heard people who say it, but I can't" or "I say it, but I know people who don't like it." I don't know how typical this is of points of grammatical variation.

My impression: it's typical except when it isn't.

Perhaps such stealth variation is commoner when the variants are more-or-less randomly distributed, without salient connections to geography, gender, class or ethnicity -- but that's mere speculation on my part. I dimly recall some research on this topic, but maybe someone once just told me that there ought to be some research on this topic. If your memory is better, tell me and I'll add the information here.

Posted by Mark Liberman at 07:53 AM

¿Que horas son, mi corazón?

In the tradition of Eugene Ionesco's The Bald Soprano and other works of Effle Art, we bring you Cedric the Entertainer's ¿Que Hora Es?:

After viewing part 2, aspiring impresarios may want to look over the little skit embedded in the appendices to Richard Burton's First Footsteps in East Africa ("Vintage Effle", 12/18/2003), or scan other Language Log Effle posts for inspiration.

Posted by Mark Liberman at 07:32 AM

September 25, 2007

Warmthiness

In the beginning, there was truthiness. Now, there is warmthiness:

Lillian Ross's memoir of The New Yorker editor William Shawn was titled Here but Not Here; Laura Bush's presence alongside her husband could be called There but Not There. Through some strange optical illusion or Jedi mind trick she manages to recede into the foreground or project into the background--it's hard to decide which. Either way, she hasn't been supplying the warmthiness that every presidency and reality-TV series requires and desires as a sweetener.

-- from James Wolcott's "The Simple Life: White House Edition", in this month's Vanity Fair (emphasis added).

But Google reveals that Wolcott was beaten to the word-coinage punch. From most to least recent:

Warmthiness is that property of color in a website that makes you feel warm and comfortable. (link; 3/15/07)
Leather jacket warmthiness. (link; 11/19/06)
unthought fuzzery warmthiness (link; 8/30/06)
I just Paypal'ed you $9000--now mod my Autocom into sweet beaten warmthiness!! (link; 8/20/06)

Follow the Language Log trail of truthiness! In particular, see these posts for other extensions of the "Colbert suffix" -iness. (There's also an extensive Wikipedia page not to be missed.)

[ Comments? ]

Posted by Eric Bakovic at 09:40 PM

Hyphenation in the news

These have been busy days on the hyphenation front. First, on Sunday we got an intersection of hyphens with taboo avoidance. Then yesterday was National Punctuation Day here in the U.S., which the NPR news show The Bryant Park Project celebrated by airing a brief interview about the many hyphens vanishing from the Shorter Oxford.

As for taboo avoidance, here's Grand Forks (North Dakota) newspaper reporter Larry Lubenow interviewing jazz musician Louis Armstrong in Grand Forks in 1957, two weeks after nine children were barred from Central High School in Little Rock (Arkansas), as recounted by David Margolick in the op-ed piece "The Day Louis Armstrong Made Noise"(NYT Week in Review, 9/23/07, p. 12):

Mr. Lubenow stuck initially to his editor's script, asking Mr. Armstrong to name his favorite musician. (Bing Crosby, it turned out.) But soon he brought up Little Rock, and he could not believe what he heard. "It's getting almost so bad a colored man hasn't got any country," a furious Mr. Armstrong told him. President Eisenhower, he charged, was "two faced," and had "no guts." For Governor Faubus, he used a double-barreled hyphenated expletive, utterly unfit for print. The two settled on something safer: "uneducated plow boy." The euphemism, Mr. Lubenow says, was far more his than Mr. Armstrong's.

I imagine that the expletive in question was mother-fucker (though mother-fuckin(g) is another possibility). But I was surprised by the "hyphenated". Maybe things were different 50 years ago, but these days the hyphenated spellings are clearly the least common of the three variants (solid, hyphenated, separated). The solid variant motherfucker leads the pack, and this is the heading for the Wikipedia page and for the (recent) OED entry (though the cites in the entry are of all three types), then comes the separated variant mother fucker, and the hyphenated variant trails; similarly for motherfuckin(g) etc.

While I was looking at such things, I checked out the noun gang-bang. This time Wikipedia and the OED diverge: the Wikipedia page has solid gangbang as its heading, but lists separated gang bang as an alternative, while the OED entry has hyphenated gang-bang as its heading (though again the cites in the entry are of all three types). Google shows the same preferences for gangbang as for motherfucker: solid first, then separated, then hyphenated.

[A digression. Hyphenated spellings for such compound words do have a virtue across the board: they indicate visually that these compounds are, structurally, both one word and two; that is, they are words with words as parts. This simple fact is concealed in the solid and separated spellings. That would be a matter of little consequence if our linguistic ideology didn't take written language as primary, as representing the "true" language. But it does, and so people understand motherfucker to be one word and mother fucker to be a sequence of two words (so does the word counter on your word processor; mine also treats the hyphenated spellings as a single word, and I believe that's generally the case). A question about spelling then turns into a question of whether compounds are "really" one word or two, and passions can be aroused. Sigh.]

[Amendment 9/26/07: Ah, Topher Cooper notes that the euphemism eventually used by Lubenow, "uneducated plow boy", suggests that the original was shit-kicker rather than mother-fucker. As so often happens with taboo avoidance, it can be hard to figure out what the disguised expression was. In any case, the discussion of mother-fucker above is probably beside the point, though still interesting on its own.

But shit-kicker is interesting too. There's a class of (well-studied) compounds in English of the form N + f(V), where f(V) is the present participle, past participle, or agentive version of a verb V, and N is understood as a non-subject argument (most often, the direct object) of V. Transparent compounds of the first two types (bicycle-riding, cockroach-infested) require the hyphenated spelling, while fixed expressions of these types allow or require the solid spelling (babysitting, homemade). Transparent agentives are normally spelled separated in some circumstances (a coffee drinker), hyphenated in others (my coffee-drinker friend), while fixed agentives can be either hyphenated or solid: shit-kicker or shitkicker (both are attested).]

On to the NPR show. The Bryant Park Project, put together live in (surprise!) New York City from 7 to 9 a.m., is at the end of a "pilot" stage. Yesterday's show was Pilot #28; next Monday it debuts as a regular show. You can listen to the show on their site.

The bit about hyphens is a short interview with Grammar Vandal Kate Somerville, who was reported on here when she pasted a comma into a Reebok ad that said: RUN EASY BOSTON. She's generally a stickler for what she sees as correctness -- in her blog yesterday, she notes with horror that during the interview she ended a sentence with a preposition -- though she's fairly easy-going on the hyphen question. But she's a demon about other punctuation marks; today's blog has a reproduction of an album cover with the legend

LYLE LOVETT
AND HIS LARGE BAND
IT'S NOT BIG IT'S LARGE

with the comment "As if country music didn't have enough grammatical errors already." It's all grammar, as we often say around here.

I suppose I have no right to complain. I was invited to do this interview, but declined, because I'm really no kind of expert on hyphens and because the show airs (live, remember) from 4 to 6 a.m. Pacific time.

I really mean it when I say I'm no kind of expert on hyphens. Maybe it was a mistake to have posted about the Shorter Oxford hyphen massacre. People now seem to be taking me to be Dr. Hyphen. Why, even our own Barbara Partee has chosen to invoke me in an aside on her preference for shut-out ball over the more popular shutout ball.

I am not Dr. Hyphen. Though there must be some suitable candidates for the title.

By the way, on the radio show, one of the hosts remarked that the Shorter Oxford was shrinking because of the removed hyphens (and others have said similar things). Well, yes, but not by roughly 16,000 characters. The only actual shrinking is for the hyphenated spellings that were replaced by solid spellings. For the hyphenated spellings that were replaced by separated spellings, a hyphen is replaced by a space, and there is no saving (and there might even be a small increase, depending on the width of the hyphens and the spaces in question).
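The bookkeeping is easy to sketch in Python. The word pairs below are illustrative examples chosen for this sketch, not a list taken from the dictionary, and the function name is made up; the count is in characters, so it ignores the question of how wide a hyphen or a space is in print.

```python
def net_character_change(replacements):
    """Total change in character count over (old, new) spelling pairs.
    Negative means the text got shorter."""
    return sum(len(new) - len(old) for old, new in replacements)


# Hypothetical sample: two hyphenated-to-solid changes,
# two hyphenated-to-separated changes.
sample = [
    ("bumble-bee", "bumblebee"),    # hyphen dropped: one character saved
    ("pigeon-hole", "pigeonhole"),  # hyphen dropped: one character saved
    ("ice-cream", "ice cream"),     # hyphen becomes a space: no saving
    ("pot-belly", "pot belly"),     # hyphen becomes a space: no saving
]

print(net_character_change(sample))
```

Only the first two pairs contribute to the total, which is the point: the saving equals the number of solid replacements, not the number of removed hyphens.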

Posted by Arnold Zwicky at 08:48 PM

Trying to avoid the appearance of evil

I have no idea why but when I was asked to be one of the speakers at my high school graduation ceremony in 1948 I chose to talk about the plight of American Indians living on reservations. To my knowledge, I had never actually seen an Indian in my whole life. After some perfunctory library research, I focused my talk on the spread of tuberculosis on reservations in the West and finally pulled off the speech somehow. I had no idea back then that one day I'd be living in a state that has more reservations than anywhere else in the US.

Fast forward to 1996. During my last year living in Washington DC, I got a call from a lawyer for the Washington Redskins football team. The team was being sued because of its name and wanted me to try to spin something in their favor. The way the request was couched may have been enough for me to say that I wasn't interested. I declined it, but for another reason. I knew that I was about to move to Montana, where I thought it would be difficult, or at least embarrassing, to admit that I had helped defend a team name that apparently was hated by Native Americans. As it turned out, another linguist agreed to work with the team and he had the unfortunate experience of discovering that one of our own Language Loggers, Geoff Nunberg, was his linguistic counterpart as expert on behalf of the plaintiffs. Language Log has been on top of this subject since then; see (here) and (here) for example.

Fast forward to the present. I grew up near Cleveland and have been a fan of the Cleveland Indians major league baseball team from the first time I saw them play when I was 8 years old. For years I wore a baseball cap with the cartoon Chief Wahoo Indians logo on it. But one of the first things I did when I moved to Montana was to get rid of that cap. I figured it would be insulting to local minorities here. Wrong. Much to my surprise, I've seen Native Americans out here wearing that same cap. Based on my experience, they seem to prefer to be called Indians, not Native Americans. Or at least they don't seem to care which word is used. But I still feel that the Chief Wahoo logo is inappropriate. I've replaced my cap with one that has a "C" on it and I feel much better as I try to follow St. Paul's advice to the Thessalonians: "avoid the very appearance of evil."

This offensive name thing causes me to marvel at some of the identity designation issues we all live with. For example, back in the sixties I was doing research on what we then called "Negro dialect." After only a few months, I was corrected and told to call it "Black English." Even later I was corrected and told to say "African American English." No matter which term I used, I kept getting in trouble for falling behind some invisible curve that outlined the dimensions of current preference. I didn't seem to follow St. Paul's advice very well. Matters were further confused by Washington DC blacks, especially the older ones, who kept telling me that they preferred to be called "negroes." They explained that "black" had been a derogatory term for them ever since they heard the Stephen Foster songs and read the Little Black Sambo children's story. It was really hard to figure out how to avoid the very appearance of evil in this.

So what to do? I've learned that it's difficult for non-minority researchers who study minorities to keep from being wrong, offensive, insulting, or racist-sounding. I've also learned that even though I'm a Washington Redskins football fan, to me "Redskins" sounds like an offensive and pejorative name, whether or not it was once used by Indians themselves (as one of the above links shows). I'm also a Cleveland Indians baseball fan, but I'm a bit less bothered by this name mostly, I suppose, because the Indians out here don't seem to be concerned about its use. On the other hand, if that team had been called the Cleveland Redskins or the Cleveland Red Savages or the Cleveland Scalpers, I'd be as bothered as the Native Americans might be.

Clear heads and good intentions should always prepare us to avoid the appearance of evil. Recently people in the nearby town of Ronan, Montana, smack in the middle of the Salish Kootenai reservation, have been arguing over the names of their high school boys sports teams, the "Chiefs," and their girls sports teams, the "Maidens." I'm not sure what to think about these names but the wise thing to do would be to defer to the local tribes. "Chief" hardly seems offensive, since it signifies a CEO, head honcho, or boss. But "Maidens"? Hmm. At least it's better than "Squaws," a name that clearly wouldn't come close to avoiding the appearance of evil these days.

Posted by Roger Shuy at 05:32 PM

"Be done" again

Yesterday I commented on a joke about a North Carolina teacher who cites the dialect form "I'll be done drove there by 3:00", and asks for the "correct" future perfect. In standard written American English, that would actually be "I will have driven there by 3:00"; but (according to the joke) a kid in the class suggests the form "I'll be done drive".

I wanted to check my outsider's intuition that "I'll be done drove" is a possible form, but limited by geography and class rather than being a general Southern States thing; and that "I'll be done drive" is a sort of morphological mash-up, only possible as a joke or a speech error. I cited some responses from Texans suggesting both geographical and class associations. In this morning's mail, Jake Voorhees explains the situation lucidly from the point of view of someone situated further to the east:

"I'll be done drive" would be ludicrous, even amongst the uneducated of the North Georgia Appalachian region. I live in North Georgia and come from a long line of Georgia lower class, and have never heard anything of the sort. More likely would be a future perfect instance, with the person saying, "I'll have done drove." "I'll be done drove" is easily conceivable, but even that is a rarer usage generally found in clans with a tradition of never going far into high school. Being relatively newly graduated from high school, with fresh, painful memories of how mixed up tenses become in the average English class, I have little doubt that "I'll be done drive" is a result of a student with a background in the traditional dialect experiencing momentary confusion on the proper form.

Or (perhaps more likely) it was a fictional construction by the jokesmith, imagining what such a student might have said.

For those who know (or are willing to learn) what perfective and resultative mean, Bill Labov added some fascinating linguistic analysis, suggesting that (at least in African-American Vernacular English) "be done" is not exactly a future perfect, but something more flexible, which "can be used to refer to inevitable consequence in past, present or future", and "is also free of any reference to a specific location in time in relation to the moment of speaking".

Yes, that is the future perfect, but there's a complication. In most future perfect uses, there's another (sometimes implicit) clause that marks the first of two successive events in the future, and the clause with "be done" or "will have" is attached to the first. But AAVE has evolved recently to permit attachment to the second, creating a resultative rather than perfective use.

It seems to be a matter of pure tense == time relations, not perfective aspect, even though "done" is involved. If anything, the sense of "inevitable consequence" in the resultative smacks of mood rather than aspect.

This is pretty well laid out in the attached section from my 1998 paper on "Co-existent systems" and in some more recent ones too. I'm also attaching a handout I found laying around from a course I taught in 89. I've got a fair amount of other data from my students' observations in the 1990s and 2000s in LING160.

On the "I'll be done drove" story, it's worth noting that there is considerable variation in the form of the participle. I think it's interesting that some non-linguists see the relation between "be done" and "will have" (you do get "I'll be done. . .").

I recommend the lucid account in the cited section of a book chapter, which is taken from "Co-existent Systems in African American English", pages 110-153 in S. Mufwene, J. Rickford, J. Baugh and G. Bailey (eds.), The Structure of African-American English, 1998.

Here's Bill's 1989 class handout:

L160S89.H04 Feb 2, 1989
Introduction to Sociolinguistics W. Labov, T. Morton

2. Some uses of be done.

From Baugh 1983:

(9) We be done washed all the cars by the time JoJo gets back with the cigarettes [said at a church-sponsored car wash].
(10) They be done spent my money before I even get a look at it.
(11) I'll be done killed that motherfucker if he tries to lay a hand on my kid again

From Dayton 1984:

(12) By the time he finish, we be done paid him so much I could direct. [BF 25].
(13) I don't want no silver dollars in my possession because I be done dropped them in the machines. [BF 25]
(14) My ice cream's gonna be done melted by the time we get there. [BF 25]
(15) So they can be done ate their lunch by the time they get there [vacation Bible school] [BF 30].
(16) I should be done lost 70 pounds by the time we get there [BF 25]
(17) We coulda be done wrote it. [BF 25]
(18) A: Where's Willy?
B: He be d—ne left.[BM 36]
(19) That washer oughta be d—ne cut off.[watching television, talking about washing machine downstairs, BF 70].
(20) I'm gonna be done hafta went back and finished in eight years [BF 30's].
(21) The readin' of the scriptures, all that's gonna be done done. [BF 40's]
(22) If you love your enemy, they be done ate you alive in this society. [BF 36]
(23) He [a nephew] knows best not to talk back to me 'cause I be done slapped the little knock kneed thing upside the head. [BF 19]
(24) Don't do that 'cause you be done messed up your clothes! [to cousins 4,5,6 running up and down the steps, BF 17]
(25) [to a dog barking] Get outta my way or I'll be done slid you in the face! [BF 26]

[Update -- a "side note" from Dick Margulis:

A couple of weeks ago on COPYEDITING-L, Kyle McCaskill posted the following, which she has given me permission to forward to you...
"... Up until today I had never heard this usage from anyone but my husband: "I am done this book," meaning, "I have finished reading this book." He's from North Carolina, so I thought it was colloquial southern phrasing. But today I was stopped in my tracks by the same use from a New England colleague: "I am done this round of checking."
"Discussions of "done" versus "finished" aside, I might say "I am done with this round" or "I am done checking," but "I am done this [noun]" gives me hives. Has anyone else encountered it?"

A discussion ensued in which the following were contributed by other observers:
- "I'm finally done the laundry" in Kanata, Ontario
- "I am done work for today" or "You can leave when you are done dinner" in Pennsylvania German-influenced English

I'm surprised to see some of these (e.g. "I'm done work for today") raise an eyebrow, much less cause hives. I'm afraid I may have raised some rashes over the years without even knowing it (no jokes there in the back).

And Cathy Prasad writes from Houston:

As someone who grew up at least on the fringes of the South (Texas, West Virginia), I would probably not blink an eye if someone said "I'll be done drove", but I would stop to ask, "Who's driving you? Is something wrong with your truck?". I would think the standard English equivalent would be "I will have been driven".

]

Posted by Mark Liberman at 06:55 AM

Turn right on Roxburgh

Three footnotes to my very syntactically oriented remarks on Edinburgh street names. The first is that I don't want to leave anyone with the impression that I think Edinburgh is different in kind from all other cities in its profusion of classname distinctions (Street, Place, Road, Lane, Avenue, etc.). It may be a bit more luxuriant and systematic in Edinburgh, but many British cities show a much higher reliance on classname distinctions than is common in California.

The second note is that I don't want to leave anyone with the impression that in devising a well-formed street name you can just randomly toss in two or more classnames picking them at random and combining them anyhow. This is not the case. There is some semantics here. To cite just one example, a lane with a name like Buccleuch Place Lane will be a narrow lane that goes behind the grand 18th-century houses of Buccleuch Place, originally serving to afford access to the backs of the houses so that tradesmen could arrive there for deliveries, repairs, and so on (in those days the plumber never came in by the front door). Thistle Street North West Lane is the lane in the carefully planned New Town (new because it was not started until the 1700s) that goes round the back of the buildings on the north side of the western portion of Thistle Street. There's a logic and a sense to it. Even with the internal structure of proper names, there is both a syntax and a semantics.

I said I had three footnotes. The third is a bit more embarrassing, and concerns my own cognitive shortcomings. The fact is that an analytical understanding of the syntax of a system is not the same as a good common-sense ability to work with it and use it in your daily life. On Friday I had an appointment to attend a briefing about grant applications here at the University of Edinburgh, at the Edinburgh Research and Innovation offices, 1-7 Roxburgh Street. I hurried along Nicolson Street, turned right onto Drummond Street down to where the old school buildings became visible, and saw the Roxburgh sign on the right and started making inquiries of the people in the University buildings at numbers 1-7. No flicker of recognition there. I was beginning to be in danger of being late for the session. Finally a servitor [you may not know that word, but they use it here] told me that this was Roxburgh Place. Despite everything I had said about the information-theoretic aspects of classname significance, I had behaved like a Californian (which is basically what I am, after so many years in Santa Cruz) and turned right on Roxburgh, not even looking at the classname clearly displayed. Roxburgh Street was (Scots, please forgive the Americanism) one block further along Drummond.

(I mean Drummond Street. That's near the university in the EH8 district. There is also a Drummond Place, but it happens in this case to have no connection with Drummond Street; it is in a different part of town, EH3, over near where I live.)

I know, intellectually, that the system is a minefield if you concentrate on forenames and neglect classnames, and yet I cannot make myself behave as if I truly knew it at a deep level. The analytical understanding is there, but it hasn't sunk into the habitual behavior layer. As many of you will recognize, it is often like that in the early stages of language learning.

Posted by Geoffrey K. Pullum at 04:19 AM

September 24, 2007

"I'll be done drive": ungrammatical in any dialect?

Patrik Jonsson's recent article "The Southern Drawl: Is It Spreading?", ABC News, 9/22/2007, starts this way:

True story: A North Carolina teacher gave an example to his class of a statement by the school's football coach: "I'll be done drove there by 3 o'clock." Now, the teacher said, give the correct future perfect tense of that sentence. A boy's hand shot up. "I'll be done drive," he said proudly.

As a Yankee, I can believe that in some parts of the American South, some people say things like "I('ll) be done drove". I recall this Muddy Waters lyric:

Well, now it ain't no use to you rambling, when your baby don't want you to ramble around
Yes, now it ain't no use to you rambling, when your baby don't want you to ramble around
Well keep on rambling, she be done drove on out of this town

An internet search turns up this bit of oral history:

Oh, when they'd be throwing out those hymns, them old, good hymns, when you'd be down at the bottom of the hill trying to get that dust off your feet to get your shoes on, because you'd be done walked there barefoot.

And this quotation from a web forum:

Plus I don't know if you hear the little digs, but from a female point of view, I be done dragged her little ass off to the ladies' and we would have had a talk ...

But I'm also pretty sure that "be done drove/walked/dragged/etc." is limited to certain regions and classes. Here's the reaction of a woman born and raised near Dallas:

Gosh, no, it sounds like something a hick would say. <snort>

And to a woman from Wichita Falls, the whole thing sounds so outlandish that she interpreted "done" as a full verb form meaning "finished", rather than as a marker of perfect aspect:

I can state with confidence that there are no hicks in Wichita Falls who would say such a strange and un-English thing. In fact, I'll go further and say that "I'll be done..." would not be uttered in the city limits of Wichita Falls by someone who was born there. "I'll be finished..." is OK, and "I'll be through..." is OK, but I actually thought that "I'll be done..." was a *Yankee* thing.

Hah! In accordance with my view that this is a Yankee thing, you will note that the story is set in *North* Carolina.

To which the Dallasite responded:

Yeah, and I swear I can hear this in one of those strange Maine accents...

And I haven't found anyone yet, from any part of this great nation of ours, who can interpret "I'll be done drive" as anything other than the punch line of a joke. If you disagree, please let me know.

[Update: more here.]

Posted by Mark Liberman at 08:33 PM

Monks and civilians

An odd (to me) use of the word civilian has been widespread in the news coverage of the recent demonstrations in Myanmar. Thus "Anti-government protest in Rangoon", BBC, 9/24/2007:

Many thousands of monks and civilians are now marching through the streets of Rangoon in what appears to be the biggest demonstration yet.

And "Myanmar Protesters Estimated at 100,000", ABC News, 9/24/2007:

Buddhist monks, accompanied by civilians, march on a street in a protest against the military government in Yangon, Myanmar, Sunday, Sept. 23, 2007.

And "'100,000 join Saffron Revolution' in Burma", Times Online, 9/24/2007:

Onlookers cheered and shouted support as between 10,000 and 20,000 monks in maroon robes with saffron sashes marched on routes through Rangoon, the country's largest city.
Civilians joining the marches swelled the number of demonstrators to as many as 100,000, according to some estimates.

And also "Thousands Join Monks in Myanmar Protests", NYT, 9/24/2007:

In the country’s largest city, Yangon, the Buddhist monks who have led the protests for the past week were outnumbered by civilians, including prominent political dissidents and well-known cultural figures.

None of these passages requires the interpretation that monks and civilians are disjoint classes, but that seems to me to be the natural reading. In contrast, the Economist asked "As more monks and laymen join protests in Myanmar, what will the junta do?", using the term that I would have thought more appropriate for non-monks. Perhaps the other sources have trouble with the -men part of laymen?

Anyhow, after my experience with bagatelle, I wondered whether I'd been missing something else about the English language, all these years. The answer turns out to be "yes", as it usually is when I turn to the OED. But what I've been missing was not an extended use of civilian to mean "a person not in religious orders" (which is obvious enough, and probably not new, though I'm still surprised to see it used so widely in the press). No, it turns out that there are three other things about civilian that I didn't know.

The OED starts its entry with two curious senses previously unknown to me:

1. One who makes or has made the Civil Law (chiefly as distinguished originally from the Canon Law, and later from the Common Law) the object of his study: a practitioner, doctor, professor, or student of Civil Law, a writer or authority on the Civil Law.
2. Theol. ‘One who, despising the righteousness of Christ, did yet follow after a certain civil righteousness, a justitia civilis of his own’ (Trench).

(I suppose that sense 2. might be somehow relevant to the Myanmar protests, except that I don't imagine that either the Buddhist monks or the Buddhist non-monks have any particular opinion of the righteousness of Christ.) Then comes the meaning that I know, though oddly restricted in sex:

3. A non-military man or official.

This is supplied with a surprising literal or original meaning, also new to me:

a. orig. (More fully Indian Civilian): One of the covenanted European servants of the East India Company, not in military employ. Now, a member of the Indian Civil Service of the Crown.

[I presume that the word "now" in this passage is a candidate for editing in some future version...]

And then finally the normal contemporary use:

b. generally (esp. in military parlance): One who does not professionally belong to the Army or the Navy; a non-military person.

But there's nothing there about non-monks, though the analogy is obvious enough.

[Update -- Laura Pettelle writes:

I have noticed this usage for years, while I was an undergraduate student in theology and later while I was in seminary, where everyone is clearly familiar with the word "laymen."
My intuition, which may be completely incorrect, is that the use of "civilians" in theological circles is twofold. First, these days it seems like you hear "laymen" a lot more in reference to scientific laymen than theological laymen and people's minds seem to jump to that meaning. Second, with the declericalization of theology, there are plenty of laymen who are important theologians, and sometimes "lay" vs. "clergy" doesn't convey the right distinction; I heard "civilian" a lot in that context, when the "in group" wasn't ministers but theological professionals (academic, ministerial, and otherwise) and the "out group" referred to as civilians was everybody else.
One other thing that might matter is that "lay" or "laymen" has been used as an insult and as a demarcator in some particularly ugly church governance battles (in a variety of denominations) and in some settings I think people avoid it because (consciously or unconsciously) they don't want to reference those very negative uses.
I don't think the "men" in laymen is really the problem, since layfolk, lay people, and laity are all perfectly acceptable and common constructions.
I always love to see a language quirk I've wondered about get a shoutout (and explanation!) on LLog. It's nice when something you've wondered about anyway appears before your very eyes on your morning RSS feed.

I've heard and seen civilians used (sometimes jokingly) to mean "outsiders" in a number of non-military situations. But this is always from the perspective of insiders looking out -- which the reporters writing about the Myanmar protests surely are not.]

[David Eddyshaw writes:

I vaguely recall reading that the Roman Christian use of 'paganus', literally 'rustic' of course, was borrowed from Roman military jargon, in which it meant 'civilian'. My trusty Classical Latin dictionary (Lewis) gives 'civilian' as a meaning of 'paganus', in Juvenal and Tacitus.
'Heathen' is a calque of 'paganus' IIRC too.
I don't think Buddhists are quite so keen on describing their community with military metaphors though ...

It never occurred to me that "pagan" meant "hick", but of course it obviously did, etymologically speaking anyhow. And it's interesting that it got that way due to the Christians adopting the military term for outsiders.]

[Geraint Jennings writes:

In the monks v civilians reporting context, the image that was conveyed to me was of monks in robes and others in civvies (i.e. civilian dress). For me it was a visual thing implying contrast in costume - uniform v non-uniform.
Back (way back) in the days when I was at school, being out of uniform for visits, trips etc was being "in civvies", so that perhaps influences my reaction.

Yes, the monks' robes (as well as their orderly ranks in marching) are obviously part of the analogy. ]

[Update 9/25/2007 -- Barbara Zimmer observes that "By today (Sept 25) the phrase has morphed into "monks and supporters", "monks and other protesters", "monks and many others", etc in print and on the radio..... " ]

Posted by Mark Liberman at 09:25 AM

The late Richard Hogg

Professor Richard M. Hogg, distinguished historian of English and contributing co-editor of the multi-volume Cambridge History of the English Language, died recently of a heart attack at the age of 63. I first met him when I was an undergraduate and he was a junior lecturer, and I liked him enormously (hey, everyone did). I saw far too little of him over the subsequent decades, while he worked at the University of Manchester and I at the University of California. I was looking forward to seeing him again now that we once again both reside in the UK. It was a shock to learn that it is now too late. There is an excellent obituary in today's Guardian, written by Richard's colleague Nigel Vincent, who paints a clear picture of this warm and delightful human being. Richard's death deprives us of the second volume of a major grammar of Old English (almost complete when he died) and a history of English dialectology (reportedly three-quarters complete).

Posted by Geoffrey K. Pullum at 08:09 AM

Mukasey weighs in on clear writing and light beer

Newspapers have been running profiles of Judge Michael B. Mukasey, President Bush's nominee to succeed Alberto Gonzales as Attorney General, and he is revealed to have a number of surprising qualities, at least compared to some of Bush's past choices for Cabinet positions. In the Washington Post, a federal public defender describes Mukasey as "very sharp, very focused," adding, "It was interesting to argue before him because he was interested in ideas and language." The New York Times divulges that he has a framed photograph of George Orwell in his chambers. "He is a particular idol of mine for his clear writing and complete disdain for cant," Mukasey explained. "I try to recognize when some spongy abstraction is trying to cover up an excuse for thought or analysis."

Now, not everyone at Language Log Plaza is a fan of Orwell's writings on language-related matters. Geoff Pullum, for one, has vilified Orwell's famous 1946 essay, "Politics and the English Language," agreeing with Stanley Fish's assessment of it as "turgid, self-righteous and philosophically hopeless." But regardless of how you feel about Orwell as an icon of linguistic clarity, it's refreshing to have a prospective Attorney General who is "interested in ideas and language." In this regard, the most telling detail in these profiles of Mukasey is a bit of Latin wordplay buried in a judicial footnote, concerning the taste of competing brands of light beer.

In the 1992 case (Coors Brewing Co. v. Anheuser-Busch Co. 802 F.Supp. 965), Coors argued that an Anheuser-Busch ad campaign for Natural Light beer made unfair and misleading comparisons to the brewing process of Coors Light. In his U.S. District Court opinion, Mukasey rejected the complaint from Coors, but in a footnote he refused to take a stand on the effect of pasteurization on the taste of beer:

The parties have disputed vigorously whether the taste of beer, unlike the taste of milk, is adversely affected by pasteurization. Coors says it is; Anheuser-Busch say it isn't. They have also disputed, in this forum and elsewhere, other features of their products and their advertising, defendants having gone so far as to accuse Coors of "ad hominem attacks on Natural Light." (Def. Mem. at 13) However, de gustibus cerevesiae non scit lex.

The New York Times glosses Mukasey's Latin phrase as, "The law takes no account of taste in weak beer," while the Washington Post translates it more literally: "Concerning the taste of beer law knows nothing." What the papers don't mention is that Mukasey was simultaneously playing on two well-known Latin expressions:

De gustibus non est disputandum.
"There is no accounting for tastes."
(Lit.: "Concerning taste there is no disputing.")
De minimis non curat lex.
"The law does not care about trifling matters."

The latter expression is the basis for the legal principle known as de minimis, used by courts to avoid passing judgment on matters deemed unworthy of judicial attention. Mukasey managed to dismiss the pasteurization question as de minimis, while at the same time making a clever parallel to the de gustibus sentiment of "to each his own" — surely a wise tactic when it comes to assessing the relative merits of light beer.

(An etymological note: Latin cerevisia is the root of Spanish cerveza, Portuguese cerveja, and Catalan cervesa, all via Old French cervoise. It is a medieval Latin word of Gaulish origin and honors Ceres, goddess of the harvest. Gaulish was a Celtic language, which presumably explains the connection to another cognate, Welsh cwrw 'beer, ale.' Most of the other western European languages have inherited terms like English beer and German bier, from a Germanic root that probably derives from Latin bibere 'to drink.')

[Update #1: Conrad Roth emails to question my assumption that Welsh cwrw shares the Celtic origin of Latin cerevisia. According to an online dictionary, the derivation is as follows:

ETYMOLOGY: Welsh cwrw < cwrwf < *cwryf < *cwrf < British *korm
From the same British root: Cornish korev (= beer), Breton koref (= beer)
From the same Celtic root: Irish coirm (= beer; drinking party; concert)
Cf Latin cremor (= broth, thick juice), also used as a technical term in English (Webster 1913: cremor = cream; a substance resembling cream; yeast; scum); Greek kourmi (Harry Thurston Peck, Harpers Dictionary of Classical Antiquities (1898); entry for Perseus: "The beer or barley-wine of Crete was known as korma or kourmi.").

The University of Wales' "Lexicon of the Celtic World" (see PDF link for "Examples from the Celtic Core Vocabulary") agrees with this, tracing cwrw back to Proto-Celtic *kurmi- (and PIE *kor-m). That is evidently distinct from the etymon for Ceres, cereal, etc. See also this entry in "An etymological lexicon of Proto-Celtic."]

[Update #2: Mischa Hooker writes:

I'd suggest, however, that cerevisia *is* likely related to the Welsh cwrw, since it's apparently a borrowing from Gaulish. The connection of cerevisia with Ceres is what I would question. [Note the apparent pattern m > v as argued by L. H. Gray, "Mutation in Gaulish," Language 20.4 (1944) pp. 227-8 (mutation no. 17 in the list), citing this set of cognates among some others.]

Serves me right for wading into the muddy waters of Celtic etymology!]

[Update #3: Beer historian Martyn Cornell further discounts the connection between cerevisia and Ceres:

This is Latin folk etymology, based on the Romans' idea that all barbarian words had to be derived from Latin originals. The name of the goddess Ceres has nothing at all to do with the word cerevisia (how would it?), which comes from a variant of the Gaulish word curmi, with e-vocalisation and the same change of m to v that happened with other Celtic words, such as the names of the British kingdoms Dumnonia (Devon) and Demetia (Dyfed). Curmi seems to be linked etymologically with the same IE root that gave the Latin word cremo, to burn or boil, in the same way that so many other brewing words have roots in words meaning bubbling or boiling.
The Welsh version of the Common Celtic word curmi underwent the same m-to-v change, to become coref (IIRC) in medieval Welsh, losing the f (single f being pronounced v in Welsh, of course) to turn into cwrw. In Cornish and Breton the f stayed, giving coref in the former and coreff in the latter. The old Irish cognate was coirm, later replaced by lionn, which originally just meant "drink". Cerevesia (or cervisia, or other variants) isn't a medieval Latin word either — it's found from at least the 1st century AD. And modern French still has the word cervoise, with the specialised meaning "unhopped ale", as against biere, the name of the hopped drink — a distinction that has been, sadly, lost in English these past three or four centuries. ]

Posted by Benjamin Zimmer at 06:08 AM

The ends to the means

Kevin Millwood, Texas Rangers winning pitcher, who had just pitched 7 innings of shut-out ball, is quoted as follows:

"I've been working on stuff all year and felt like I've been improving," Millwood said. "Today was kind of the ends to the means, I guess, to be able to go out and put everything I'm working on to use and come out on a good note."
- from Texas 3, Baltimore 0: Recap: By Stephen Hawkins, AP September 23, 2007

[Hyphen note for Arnold: I write "7 innings of shut-out ball", but "innings of shut-out ball" gets only 1980 Google hits vs. 32000 for "innings of shutout ball".]

A Google search on "ends to the means" claims about 21,600 hits. On the first page, all but one or two make normal sense:

- sacrificing the ends to the means
- adjusting the ends to the means at hand
- scale down the ends to the [available, I suppose] means
- deal with the relation of the goals to the techniques, or the ends to the means, on each side
- shift your focus from the ends to the means

This next one seems "intermediate".
... it also deals with the idea that relationship is not the ends to the means; it is a means (a very important means) to an end.
This one apparently starts from the phrase "means to an end" and uses the reversed phrase "ends to the means" for contrast, a way to say that the relationship is not an end but a means. But the reversed phrase "ends to the means" is not an independently normal phrase: the noun "end" doesn't normally take a "to"-phrase complement. Means are means "to" some goal, but ends are the goals themselves: "To what end did he do that?"

One other example from the first page of Google results seems non-standard:
- time and time again we go through life searching for the ends to the means we try and after all ...
But this is part of some unpunctuated and unparsable 'poetic' writing, so I don't want to count it for too much and can't begin to guess what's going on in that writer's head.

But all those quotes, the normal ones and the questionable ones, show that the established collocation "means to an end" makes "ends to the means" more likely to occur than it would be otherwise. And I suppose the quoted pitcher somehow produced a blend between that collocation and a sense of "end" as "end result" -- "the end result of all those efforts" then became "the ends to the means".

And note how he even ended up with plural "ends", which isn't common either for "end" as 'purpose' or "end" in 'end result'. Oh, wait, I just checked Google again: "to what ends" has 72k hits, and "to what end" has 558k. So plural "ends" is not so terribly uncommon, and putting ends and means together presumably helps to boost the plural use.

So in the end, although I don't think I could use that expression the way that pitcher did, I'm happy to defend it and it was fun to work out the means by which he might have come up with it.

Posted by Barbara Partee at 02:09 AM

September 23, 2007

On the fringes of snowclonia

Snowclones are language patterns with open slots which are in some sense formulaic, but as we've noted over the years here on Language Log, there are all sorts of language patterns like this: syntactic constructions, idioms, clichés, catchphrases, riddle and joke forms, poetic forms, and more. People also make playful allusions to idioms, clichés, quotations, and titles, varying parts of the models for effect. So there are all these things that aren't snowclones, and some classic cases that are -- and some more cases on the fringes. My snowclone omnibuses (here and here) are compendia of candidate (putative or potential) snowclones, things that people have suggested to me might be snowclones, not things I'm certifying are snowclones; each case has to be looked at on its own. (Unfortunately, the putative cases pile up faster than I can deal with them.) Today I'll look at some cases that have come to my attention recently.

Call Your Office: I argued a while back that "the wonderful world of X" is just a cliché with an open slot.

Now Ben Zimmer has blogged on "X, call your office", which I at first took to be something similar. But Ben pointed out that the figure probably has its source in a catchphrase ("Judge Crater, call your office", itself based on earlier literal uses) -- which would make it an instance of the playful allusion type rather than the cliché type -- and his discussion suggested that the figure as a whole contributes some meaning (in my terms, that X is absent from the scene but is relevant to the matter at hand), though some of the occurrences might lack this meaning and be mere playful allusions.

Unsafe In Any Child's Garden: In earlier postings on candidates for snowclonehood, I argued several times that the candidates were just playful allusions to fixed expressions of one kind or another -- for example, here on "X-back mountain" (based on Brokeback Mountain) and here on "X eye for the Y guy" (based on Queer Eye for the Straight Guy). In both these cases, the formulas contribute no meaning of their own, while the clearest examples of snowclones ("X is the new Y", for instance) do.

Then back in August, Chris Phipps asked me about "unsafe P any X", as in the title of a posting by Mark Liberman: "Journalists' questions: unsafe in any mood?". This, I replied, was just a playful allusion to Ralph Nader's Unsafe at Any Speed, and pretty much a one-shot deal at that: there's no significant collection of variations on the title, a collection that would suggest that there's a pattern available for general use here. Similarly in my use of the title "A child's garden of languages" for a recent posting; the title takes off on Robert Louis Stevenson's A Child's Garden of Verses.

Over the years I've posted several times about occasions on which all sorts of language play are prominent (in science writing, in teaser headings on the covers of porn magazines). I intend to post further on these "ludic locales", of which there are many, but my current point is that among the types of language play to be found in them are playful allusions, lots of them, and no one should want to label most of these as snowclones, since virtually any idiom, cliché, quotation, or title can serve as the basis for an allusion. Many of them combine some other feature of language play with the playful allusion, as in these two examples, where there's some phonological play: "Ground Control to My Imam" (feature title in Harper's, November 2006), alluding to David Bowie's "Ground Control to Major Tom"; and "Take the Money and Rue" (NYT editorial, 9/12/07), alluding to Woody Allen's Take the Money and Run.

Closer to Snowclone Central: Closer in is "on a scale from one to X", as discussed here by Mark Liberman a little while ago. Mark, cautiously, rated the figure as a 5 on a scale from one to snowclone, though others might accord it a higher rating.

Meanwhile, I just noticed "from X's lips/mouth to God's ear" in a posting of Geoff Pullum's: "From [Stanley] Fish's mouth to God's ear." Substantial number of hits, but it's not at all clear what people are using the figure to convey: there are some occurrences of "from your lips to God's ear" that seem to convey nothing more than that God hears everything you say, but in most occurrences of the figure something more complex is going on.

And over on ADS-L I recently started a discussion of "Who are you and what have you done with X?", which I'd contemplated using in a recent posting that mentioned my granddaughter's alarm at being confronted by her mother speaking German: "Who are you and what have you done with my mother?" The figure is canonically used in situations where the speaker is confronting someone who appears to be X but observes that this person lacks some property or properties historically characteristic of X; think Invasion of the Body Snatchers. Things are a bit tricky, though, because there are perfectly straightforward uses of such expressions, as sequences of ordinary questions. In any case, everybody seems to think that the figure originated in a specific quotation (not in IBS, so far as I can tell), but so far no one has a good candidate. And there are lots of instances out there.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:54 PM

Bagatelling around

In the September 6 issue of Nature, a verb caught me up short (Philippe Claeys and Steven Goderis, "Solar System: Lethal billiards"):

A huge collision in the asteroid belt 160 million years ago sent fragments bagatelling around the inner Solar System. One piece might have caused the mass extinction that wiped out the dinosaurs 65 million years ago.

The only use I ever see for bagatelle is "a mere bagatelle", with the occasional reference to Beethoven's bagatelles. But the fragments of this cosmic collision seem to have been anything but trifles or fun bits of light music. No self-respecting copy editor would pass "sent fragments baubling around the inner Solar System", full as this may be of poetic possibilities. I suppose that "fragments gavotting around" might make it, but a musical bagatelle is not really a dance, as far as I know. Was an editor at Nature asleep at the switch? Unlikely. So I looked it up.

The OED gives the first sense of bagatelle as "A trifle, a thing of no value or importance", and sense 1.b. as "A piece of verse or music in a light style". But then comes

2. A game played on a table having a semi-circular end at which are nine holes. The balls used are struck from the opposite end of the board with a cue. The name is sometimes applied to a modified form of billiards known also as semi-billiards.

So apparently for some people, bagatelling is roughly the same as caroming.

By the way, it looks like the game derived from the trifle rather than the other way around -- the earliest citation for bagatelle-the-game is 1819, while bagatelle-the-trifle goes back to 1645. The OED gives the etymology as

[a. F. bagatelle, ad. It. bagatella, a dim. form which Diez attaches to Parmesan bagata a little property, prob. from baga: see BAGGAGE. With bagatello, cf. -ADO suffix 2. Formerly quite naturalized in sense 1, now scarcely so; sense 2 is purely Eng. in origin and use.]

This would help explain why, as an American, I couldn't make any sense of asteroid fragments bagatelling around the solar system.

In looking at patterns of usage in news writing, I didn't find any other examples of things (great or small) bagatelling around. However, I did come across another unexpected development. Some writers -- apparently mostly British Commonwealth, though the sample is small -- use the collocation "mere bagatelle" as if it were a mass noun, with no article:

Australia: At the same time, another was hanging on as if life itself depended on it, saying four election victories and 11 years in office was mere bagatelle.
U.K.: On an executive car, this premium would be mere bagatelle.
U.K.: Ken Bates was looking towards a person for whom a couple of hundred million is mere bagatelle and in Mr Abramovich he appears to have found such a guy.
U.K.: However, this is mere bagatelle compared with what the pension funds hold in equities.

Perhaps this is due to thinking of the billiards-game ("this is child's play") rather than the bauble ("this is only a trifle") ? Or maybe it's a generalization from the commonly-anarthrous use in headlines? Anyhow, I'm used to seeing this cliché with "a" in front of it, like this:

U.S.: If you think there is a gas crunch now, marked by the largest oil price spike in a generation, it will be a bagatelle when China and India bring a couple of billion more people on to their highways...
U.K.: As we report today, this is likely to make his current £400,000 a year stipend - enough in itself to buy 134,000 Wetherspoon breakfasts - seem a mere bagatelle.
N.Z.: We are not going to let a mere bagatelle like that sour our relationship going into the future.

Wow, two "new" uses of the same word in one day.

[Update -- Rick Sprague writes to report another (though related) sense:

Growing up in central New York in the 1950s, I remember having one of those little novelties you use to keep kids busy on car trips. It was named "Bagatelle" and consisted of a cardboard disc with eight or so dimples in it, and an equal number of tiny metal balls, with a clear plastic cover sealing them in. The goal, of course, was to coax one ball into each dimple without dislodging the others.
I don't think I ever encountered the word bagatelle anywhere else, and until your LL post didn't realize it had a meaning at all, but now it's obvious that the name came from the "trifle" meaning. And yet, if you've ever tried to work such a puzzle while bouncing down country roads, the "carom" meaning is also very relevant. So to me, the two meanings just naturally blend, and "fragments bagatelling around the inner Solar System" evokes a nice space-themed pinball machine image in my mind.

I would have guessed that the balls-in-dimples game was named as a sort of palm-top billiards analogue. Either way, Lindsay Marshall sent in a link to a high-class British wooden version -- well, really more of a primitive pinball machine, as far as I can tell -- "true to Jaques original design, as supplied to Queen Victoria". Lindsay's comment:

Let me point you to this other version of bagatelle (which is the one most familiar to me, indeed I have a board like this in my front room) and in this game the marbles really do bagatelle around because of all the pins.

If such a game was really supplied to Queen Victoria, under the name of "bagatelle", it's uncharacteristic for the OED to have missed this sense for so long. But Thomas Thurman sent a note suggesting that these quasi-pinball versions are to be considered merely automated versions of the game played with a cue stick:

I should possibly note (as a British English speaker) that although the game "bagatelle" is the primary meaning of the word in my head and I'd be unlikely to produce it in the other sense (though I'd recognise it), I've never heard the word used as a verb, so the Nature usage still seems odd to me.
(Incidentally, most modern bagatelle sets have the striker/cue spring-loaded, so you just have to draw it back for it to hit the balls. This reduces the skill required somewhat. Often they are also made of plastic and use ball bearings instead of wood with little wooden balls.)

]

[Douglas Sundseth writes:

The machine you and your correspondents are describing as a "Bagatelle" sounds very much like the popular Japanese gambling diversion known as "Pachinko". The latter term is quite familiar to me, but I've not heard the former usage previously. A cursory search turned up this page, which seems to imply that Pachinko is a derivative of Bagatelle.
Finally, a search on "pachinkoing around" returned this fragment:
"Granted, I didn't go on the month-long herd'em along pachinkoing-around tour of China."
which seems to be closely related to the usage of bagatelle that you report.

Well, {"pinballing around"} gets 1,200 Google hits -- the metaphor works, once you know what the word means.]

Posted by Mark Liberman at 08:13 AM

September 22, 2007

Automatic hyphenation

Having just touched on issues of hyphenation, I'm reminded that I should do a follow-up (note: usually this is follow-up for me, but sometimes followup; so sue me) on automatic hyphenation, a topic that I posted on at the end of June. I relayed some mis-hyphenations resulting from early attempts to eliminate proofreaders in favor of hyphenation programs: kneep-ants, co-aches, and, in a manual for a program that was supposed to make proofreaders obsolete, pro-ofreaders. Now it turns out that there's a name for such things.

I also commented:

In any case, pro-ofreaders were clearly not obsolete then. Nor are they now. Though brute-force methods -- really really big dictionaries with possible hyphenations specified -- can improve things considerably, and undoubtedly have.

But it turns out that there's an excellent hyphenation program that abstains from simple brute force.

First, the term for mis-hyphenations. This is the wonderful mishy-phens, devised by Donna Richoux and reported on the newsgroup alt.usage.english. In 2004 Richoux posted an entertaining list of all the mishy-phens she'd collected since 2000. (Hat tips to Ben Zimmer, who posted on the topic on ADS-L in 2004; and to Aaron Dinkin.) Meanwhile, Mark Mandel wrote to say that in 2000 he wrote a song, "Editors' Waltz", with examples of all sorts of things that can go wrong in manuscripts, including the mishy-phen moong-low from the NYT that year.

Then, the excellent hyphenation program. Chris Lance and Jed Davis both wrote me about the TeX algorithm that was developed as a Stanford Ph.D. project by Frank Liang in 1983, under the direction of TeXman Donald Knuth. (This is embarrassing, because Knuth is of course a colleague of mine at Stanford. In my defense, I'm not a computer scientist, nor even a TeX person.) The TeX hyphenation dictionary contains 4447 patterns that the algorithm uses, a much smaller number than the huge number of entire words that were used in developing it, so it's far from a brute-force scheme. Both Lance and Davis report that the performance of the algorithm is very good.
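To make the contrast with brute force concrete, here is a minimal Python sketch of pattern-based hyphenation in the style of Liang's algorithm. It is not TeX's implementation: the function names are hypothetical, the nine patterns are a small illustrative set that happens to cover the word "hyphenation" (not the full 4447-pattern dictionary), and the minimum fragment lengths of 2 and 3 letters are assumed defaults. Digits in a pattern score the gaps between letters; the highest score at each gap wins, and an odd score permits a break.

```python
import re

# Illustrative patterns only (an assumption, not the full TeX pattern file).
# Digits mark inter-letter gaps: odd values allow a break, even values forbid one.
RAW_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and per-gap scores."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)
        else:
            pos += 1
    return letters, scores


PATTERNS = dict(parse_pattern(p) for p in RAW_PATTERNS)


def hyphenate(word, patterns=PATTERNS, left_min=2, right_min=3):
    """Return the pieces of `word` between permitted break points."""
    work = "." + word.lower() + "."      # dots mark the word boundaries
    points = [0] * (len(work) + 1)       # points[j] scores the gap before work[j]
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            scores = patterns.get(work[start:end])
            if scores:
                for offset, score in enumerate(scores):
                    points[start + offset] = max(points[start + offset], score)
    pieces, last = [], 0
    # A break after word[:p] corresponds to points[p + 1] because of the leading dot.
    for p in range(left_min, len(word) - right_min + 1):
        if points[p + 1] % 2 == 1:
            pieces.append(word[last:p])
            last = p
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation")))  # hy-phen-ation
```

The point of the design is that a few short patterns generalize across many words, so the table stays small; a word that matches no pattern is simply left unbroken.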

There's still plenty of work for proofreaders, of course.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:31 PM

Vanishing hyphens

With the appearance of the 6th edition of the Shorter Oxford English Dictionary -- see the statement by editor Angus Stevenson on 9/13/07 -- the media have been abuzz (not a-buzz) about the disappearance of about 16,000 hyphens in this dictionary. This many words that were hyphenated in the previous edition are written either solid (bumblebee, chickpea) or separated/split (fig leaf, pot belly) in this one. Reactions are, predictably, all over the map: it's about time, ho-hum, the language is going to hell. The last response seems to have been prompted mostly by OUP's statement that the elimination of hyphens was in part motivated by the practices of e-mailing (that's my usage, and I'm sticking to it) and text messaging, forms of communication widely seen as contributing to the decline of the language.

A few framings of the story in the media:

Small object of grammatical desire (BBC News, 9/20/07)

Hyphens are vanishing. Blame e-mail. Sorry, Email. (Wall Street Journal Online, 9/20/07)

Thousands of hyphens perish as English marches on (Reuters, 9/21/07)

Hyphen falls victim to the email society (Telegraph, 9/21/07)

and in discussion forums:

Demise of the hyphen (Desktop Publishing Forum)

Bye bye hyphen (bit-tech.net Forums)

Every so often I get e-mail from Language Log readers who want to know what the "right" punctuation (solid, hyphenated, separated) of specific compound words is. I'm not able to give useful advice on these matters. I know what my own preferences are -- I'm generally sparing with hyphens, but find that they sometimes work for clarity -- and I know that my practice is sometimes inconsistent, a fact that doesn't particularly bother me. As I pointed out here a while back, not all inconsistency is worth worrying about -- especially when the conflicting styles of different writers and manuals accustom readers to more than one variant, as is surely the case with the three schemes of punctuation at issue here. So I'm of the ho-hum school.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:09 PM

Ask Language Log: On a scale from one to snowclone

Justin Fetterman writes:

I've noticed what might be a new snowclone cropping up. As in this article, where, right before the picture, we have the phrase "a scale of one to crazy". Recently, a friend of mine used the same construction to form "on a scale of one to super hot".

"On a scale from one to __" is certainly out there, e.g.

All in all, on a scale from one to weird, it was about a twelve.
On a scale from one to Stupid, I give it a Stupid+.
What I’m trying to say, I guess, is on a scale from one to nerd, we’re off the charts.
on a scale from one to super fucking-riffic you guys are super fucking-riffic!
i think english lit to 1500 may kill me because on a scale of one to boring that class is probably a 9.
On a scale of one to excellent, I'd say a 7.
on a scale of one to creepy this dream was off the charts.
BabelBabe, on a scale of one to insane, I'd give you a three.
On a scale of one to evil, we give that idea four-and-a-half Kissingers!
Okay, so until next time, on a scale from one to awesome, I'm super great.

And of course the psychologists and sociologists have been preparing the way for years, with their incessant questions "On a scale from one to three/five/six/ten/100/etc."

The recent cause of generalization beyond the numerals seems to be the Reunion Show's song "On a scale from One to Awesome (You're Pretty Great)".

But some obvious instantiations of this pattern are not to be found yet on the web, like "on a scale from one to totally __", or "on a scale from one to wonderful". (I'm surprised that the last one didn't even make it into a Cole Porter lyric.)

So on a scale from one to snowclone, I'd give this pattern about a five.

[Update -- Aaron "Dr. Whom" Dinkin writes:

The Reunion Show song "On a Scale from One to Awesome (You're Pretty Great)" actually takes its title from a line in an early Strong Bad e-mail: see here. Strong Bad ends the cartoon by saying "Okay, so until next time, on a scale from one to awesome, I'm super great." The Strong Bad e-mail appears to predate the Reunion Show album by at least a year.

Yes, it's sbemail #3, which dates from 2001. And the Reunion Show released Kill your Television, which included the "On a scale of ..." cut, in the fall of 2003.

Ethan Glasser-Camp points out an instance of this pattern in the cartoon strip Force Monkeys from 1/25/2003, Superheroics and You. ]

Posted by Mark Liberman at 09:58 AM

The Crockus and the Bassoon

The folks over at Metafilter have picked up on the emerging science of crockusology, and the usual lively discussion has ensued. One of the commenters asked an especially insightful question: "Is the crockus related in any way to Shatner's Bassoon?" In fact, they're practically next-door neighbors:

Shatner's Bassoon is the region of dorsolateral prefrontal cortex responsible for time perception. It's known to be four times larger in boys than in girls, which is why boys are so much more likely to suffer from ADHD, and also why they're so much more at risk for the side-effects of the new European boom-rave drugs in the cranabolic amphetamoid family, such as dimesmeric antiphosphate, known on the streets of London as "cake". (More background is available here.)

By a curious historical coincidence, Dr. William Shatner, the eponymous discoverer of Shatner's Bassoon, and Dr. Alfred Crockus, the eponymous discoverer of the Crockus, are colleagues at the Boston Medical University Hospital. It's striking that two men who have contributed so much to our understanding of educational neuroscience should both be on the staff of an otherwise little-known institution. (Rumor has it that staffers for David Brooks have been sighted in the BMUH cafeteria, deep in discussion with Drs. Shatner and Crockus, doing background research for the next installment of Brooks' series of pieces on the political philosophy of cognitive neuroscience.)

If you're new to this discussion, you should read, in order:

"How big is your crockus?", 9/17/2007
"High crockalorum", 9/18/2007
"Dr. Alfred Crockus and Crosley Shelvador, MD", 9/19/2007
"Crosley Shelvador comes in from the cold", 9/20/2007
"Dr. Crockus in Central New Jersey", 9/21/2007

Additional background can be found in "David Brooks, Neuroendocrinologist" (9/17/2006).

[Another Mefi commenter points out that Dan Hodgins, the popular avatar of crockusology, will be lecturing in the Chicago area on October 3 and 4.]

[To avoid any possible confusion, perhaps I should add that Shatner's Bassoon and dimesmeric antiphosphate are satirical fictions. Dr. Alfred Crockus, the region of the brain known as the Crockus, and Boston Medical University Hospital are also apparently fictions, though the motivation and history of their invention remain obscure. (Satire, fraud and credulous crackpottery have been suggested; my favorite hypothesis remains the bar bet theory.)

Dan Hodgins, however, is not a fiction, but a real person who really does travel around giving presentations to groups of educators and interested parents. Some of these presentations apparently really do feature the Crockus-related pictures and claims discussed here and here. And the email quotations from Dan Hodgins that I've given in various messages were cut and pasted, verbatim and complete, from messages that came from whoever answered my queries to the email address given for him here.]

[Update -- some further news in the search for Dr. Crockus. The Massachusetts Board of Registration in Medicine has no one named "Crockus" licensed in medicine or osteopathy.]

Posted by Mark Liberman at 08:10 AM

Punctuate this

How right Arnold Zwicky is to refuse to follow the dumb punctuation-sequencing rule that American publishers and copy editors all insist on (occasionally to the point of hypercorrection): the rule that small punctuation marks (the comma and the period but not the question mark or the exclamation mark) should be shifted to the left of any closing quotation mark that they fall next to and would logically be to the right of. Take a look at this nasty, from the Pears & McGuinness translation of Wittgenstein's Tractatus Logico-Philosophicus, as published in England by Routledge and Kegan Paul (1974 impression, printed in Suffolk by Richard Clay Ltd):

Logic is prior to experience—that something is so. It is prior to the question ‘How?’, not prior to the question ‘What?’

I just happened to notice it this morning while cleaning the apartment (there are Wittgenstein books all over our coffee table right now, because Barbara is going to teach a course on Wittgenstein at the University of Edinburgh next semester). The sentence looks unfinished to me: it doesn't have a final period like it should. But at least the comma is outside the quotation marks around How? as it logically should be. Things get much worse when American publishers reprint such passages.

Here's how that second sentence looks when quoted on page 2 of Richard M. McDonough's book The Argument of the "Tractatus" (the title of which is its own little punctuation conundrum that I'll deal with later):

It is prior to the question ‘How?,’ not prior to the question ‘What?’

But is How?, the question, or is How? the question, the comma being extraneous to it? Surely the comma logically belongs in the structure of the main sentence, not inside the quotation marks where a one-word question (How?) is quoted and used in the main sentence as if it were a noun phrase. Punctuated this way, the sentence looks as if it has neither a comma before the phrase beginning not nor a final period. Ugly.

Here's how I would really like to see the sentence punctuated, if I had been doing the editing myself:

It is prior to the question ‘How?’, not prior to the question ‘What?’.

That makes it clear that the structure of the sentence is It is prior to X, not prior to Y. — it has both a comma and a period in it.

At this point, if you are half the scholar I expect you to be (you are a Language Log reader after all; you do care about such things), you will be wondering if we couldn't go back to the original German and see how things were done there. And of course we can: the Routledge edition is bilingual, German on the left and English on the right, in parallel. But we learn nothing: what Wittgenstein did in German was to coerce the words for how and what into being nouns. He gives them definite articles and capital letters (as German nouns have to have). So we get:

Sie ist vor dem Wie, nicht vor dem Was.

Literally, "It is before the how, not before the what." (You may also be wondering what the hell he means by this statement. But that is not my province. For that you take Barbara's seminar, where you will have a chance to discuss such profoundly difficult matters at length.)

One loose end remains: the correct title of McDonough's book, published by the State University of New York Press. On the cover we see this, in blue:

THE ARGUMENT OF THE "TRACTATUS"
Its Relevance To Contemporary Theories of
Logic, Language, Mind, and Philosophical Truth

Inside on the title page we see this:

The Argument of the Tractatus
Its relevance to contemporary theories of
logic, language, mind, and philosophical truth

And on the reverse of the title page in the Library of Congress publication data we see a third version:

McDonough, Richard M., 1950—
The argument of the Tractatus

So are there quotation marks inside the title, or not? I think we have to draw a distinction between the structure of titles and the typographical realization thereof (notice the differential use of Gratuitous Capitalization of Significant Words, which independently supports this conclusion). The main title has the form the argument of X, where in this case the X is the title of another book. We have to make a decision about how to typeset the whole thing. The designer of the cover made one decision, and the designer of the title page made a different one, and both are different from the Library of Congress cataloguers. Bibliographers will probably make a fourth choice, namely this:

McDonough, Richard M. (1986) The argument of the Tractatus. Albany, NY: State University of New York Press.

In that version, the title is italicized, and the embedded title is identified by dropping the italics: when you're already in italics you signal the effect of italicization by switching back to plain roman. (There are other books that have to be dealt with in this way; one example is Howard Lasnik's book entitled either Syntactic Structures Revisited (if you're not in italics but you want to italicize the title) or Syntactic Structures Revisited (if you are in italics so you want the title not to be). Personally I find such book titles very typographically annoying. But that's just me.)

Posted by Geoffrey K. Pullum at 07:22 AM

A Couple of Corrections

The National Geographic News article Languages Racing to Extinction in 5 Global "Hotspots" about the Enduring Voices Project discussed by Eric a couple days ago contains a couple of errors worth commenting upon. First, the Pacific Northwest region listed as one of the five "hotspots" is not "the U. S. Pacific Northwest" but the Pacific Northwest region of North America, which includes much of British Columbia.

Another error appears in the assertion that

In the last 500 years, an estimated half of the world's languages, from Etruscan to Tasmanian, have become extinct.

which implies that Etruscan became extinct in the last 500 years. Actually, Etruscan died out long before that, around 100 C.E. The last person known to have been able to read Etruscan was the Roman emperor Claudius, who died in 54 C.E. He reported that he compiled his Etruscan-Latin dictionary (regrettably lost) by interviewing some of the small number of elderly rural people who could still speak Etruscan. He is, I believe, both the first and the last monarch known to have conducted linguistic fieldwork. We don't know exactly when the last speaker of Etruscan died, but it was on its last legs in the first half of the first century.

The rate of language loss has accelerated as communication and travel have become more rapid and efficient, but the phenomenon is far from new.

[For some other critiques see Jane Simpson's post as well as Claire Bowern's post at Anggarrgoon to which Eric referred.]

Posted by Bill Poser at 03:47 AM

September 21, 2007

A child's garden of languages

My granddaughter Opal (who's three and a half) has been working for a while on a theory of what different languages are. The latest episode, as reported this morning by her mother Elizabeth on the blog Elizabeth and her husband Paul maintain:

Opal is about to take Chinese classes. Paul advised her to say "Ni hao ma" to her teacher. "I can't," said Opal. "Why not?" "I can't say 'ni hao ma'." Our laughter alerted her to the fact that there was a problem with this statement. "I mean, I can't say 'ni hao ma' in Chinese. How do you say 'ni hao ma' in Chinese?" I dunno. "Hello", maybe? We were unable to convince her it was already in Chinese.

Elizabeth told me this story at breakfast today, at which point it occurred to us to ask Opal what language we were speaking. Elizabeth offered several possibilities, all of them rejected until she got to English, which Opal accepted enthusiastically, adding that "English is plain words".

[Amendment 9/22/07: Several readers have pointed out that Paul didn't have the Mandarin quite right (as I should have noticed): "ni hao ma" (with the interrogative marker "ma") is a question, literally 'Are you good/ok?', conveying 'How are you (doing)?', while "ni hao" is literally 'You (are) good', serving as a conventional greeting, roughly 'Hello'. The latter is what Opal would have wanted to greet her teacher.]

Opal is about to take French as well as Mandarin. At her day care center, there are kids bilingual in English and several other languages, and most of the teachers are at least bilingual. Her mother is fluent in French, and Opal's been to Germany (where she learned one word of German, "Hauptbahnhof" -- useful to her because a main train station will offer both ice cream and books). So she's been exposed to lots of languages (and to quite a range of varieties of English, from Australian to Irish). For a while, she identified all languages other than English as Spanish; I'm not sure what her current take on these things is.

[Addendum 9/22/07: Elizabeth notes that Opal understands a number of expressions in a variety of languages -- but apparently she thinks things like "On y va!" and "¡Vámonos!" and "Bleibst bei mir" are just English idioms that don't relate to other things in the language. Meanwhile, she does pretty well at shifting back and forth between Australian lexical items (to her father) and American ones (to her mother).]

At one point on that trip to Germany, Opal awoke from napping in her mother's arms to find Elizabeth negotiating with a desk clerk in German. Opal shrieked, demanded to be let down, and ran to the door, trying to get out of the pension. Elizabeth asked what was going on, and Opal explained that she had to go outside to find her momma and daddy. Apparently, she thought (for a little while, anyway) that Elizabeth had been replaced by a German-speaking impostor.

Now we'll see how the kid handles French and Mandarin as well as plain words.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:36 PM

Dr. Crockus in Central New Jersey

If you're behind on your Crockusology, the backstory is here, here, here and here. Today we have this note from Laura Ahearn:

I just wanted to drop you a line to let you know that Dan Hodgins repeated his claim about the crockus in a speech he gave last night here in central New Jersey. The director of my daughter's preschool attended the talk and reported that Hodgins said almost exactly what he had told you via e-mail: that the crockus had been discovered only three years ago by Dr. Alfred Crockus from Boston Medical University Hospital. The director of my daughter's preschool, who has been following the story on your blog, asked Hodgins for a list of references, which he reportedly promised to send her. I will let you know if the list contains any references to work by the esteemed Dr. Crockus.

Posted by Mark Liberman at 01:54 PM

Punctuational hypercorrection

It's the tiniest of things, but it still caught my eye: the punctuation at the end of this paragraph from Gail Collins's op-ed piece "McCain's Midnight Ride" (NYT of 9/20/07):

Here's the great thing about playing the role of Cassandra. We're not supposed to hold the four years of lost lives, international chaos and missed chances against John McCain because he always knew it was going badly. He said it on "Meet the Press!"

I did confirm my recollection that the name of the show is "Meet the Press", not "Meet the Press!".

How did that exclamation point get into the name of the show? By punctuational hypercorrection: someone involved in the production of the column -- Collins herself or, more likely, a copyeditor -- overapplied the rule of punctuation (followed in most American publications, including the NYT) that says that certain punctuation marks should go inside closing quotation marks, even when they weren't in the quoted material. This style rule applies only to periods and commas, however; other marks go inside closing quotation marks only if they were in the quoted material. That last sentence should go: He said it on "Meet the Press"!
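The style rule described above is mechanical enough to write down. What follows is a rough Python sketch, not any publisher's actual procedure: the function name is hypothetical, and it assumes straight double quotes and that a quotation mark followed directly by a period or comma is always a closing one. It moves only periods and commas inside the quotation mark and leaves every other mark where it was, which is exactly the restriction the hypercorrection ignores.

```python
import re


def american_style(text):
    """Move a period or comma that directly follows a double quote to just inside it.

    Assumption: such a quote is a closing quote. Other marks (! ? ; :) are untouched.
    """
    return re.sub(r'"([.,])', r'\1"', text)


print(american_style('The show is called "Meet the Press".'))
print(american_style('He said it on "Meet the Press"!'))
```

The first sentence comes out with the period inside the quotation marks; the second is returned unchanged, because the exclamation point was never part of the quoted title.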

(Two remarks: (a) the hypercorrect punctuation was in the print edition and was still in the on-line edition this morning; and (b) you will note that I don't follow the rule myself.)

[Addendum: several readers have suggested that the exclamation point doesn't originate outside the title "Meet the Press" (which is my understanding of Collins's intentions), but was inserted into the title by someone who, believing that imperative sentences should end in exclamation points, altered the name of the program to make it fit this "rule". Now, in some languages, the conventions of punctuation generally call for exclamation points in imperatives, but English is not such a language. In English, an exclamation point conveys urgency ("Stop now!") or enthusiasm (the album "Meet the Beatles!"), and a great many imperatives lack them, as in the movie titles (chosen from a great many such) "Meet Dr. Christian" (1939), "Meet John Doe" (1941), "Meet Me in St. Louis" (1944), "Meet Miss Bobby Socks" (1944), "Meet the Navy" (1946), "Meet Me Tonight" (1952), and "Meet the Fockers" (2005), all with imperative "meet". I'd hope that writers and editors for the NYT would not be ignorant of these facts about English punctuation, and also that they would not alter significant aspects of the form of titles (this is a case where faithfulness should win over well-formedness).]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 10:19 AM

Unreal

Dan Rather is suing CBS over the way they treated him in the aftermath of the 2004 fontgate affair. (In case you've forgotten that bizarre episode, I'll remind you that CBS, under Rather's leadership as anchor, featured obviously-forged documents purporting to show that George W. Bush had shirked his National Guard duties during the Vietnam War.) And yesterday, the Huffington Post featured a piece by Mary Mapes, one of the CBS producers who was disgraced along with Rather ("Courage for Dan Rather", 9/20/2007).

Mapes' piece is extraordinary -- and I don't mean that in a good way -- for two reasons.

First, she seems still to be claiming that the forged documents were real, or perhaps copies of real documents, or at least not proved to be faked:

We reported that since these documents were copies, not originals, they could not be fully authenticated, at least not in the legal sense. They could not be subjected to tests to determine the age of the paper or the ink. We did get corroboration on the content and support from a couple of longtime document analysts saying they saw nothing indicating that the memos were not real.

On this point, I'd like to suggest that you go read Geoff Pullum's 9/15/2004 post "Typography, truth, and politics", and (if you have the time and care about this question) some of the rest of the Language Log commentary from the period. I'm suggesting this not because Geoff -- for all his diversely excellent qualities -- produced the definitive assemblage of evidence on this point, but because of the second extraordinary feature of Mapes' screed, namely her view that the only people who raised questions about the memos' authenticity were members of "the conservative blogosphere, particularly the extremists among them":

Instantly, the far right blogosphere bully boys pronounced themselves experts on document analysis, and began attacking the form and font in the memos. They screamed objections that ultimately proved to have no basis in fact. But they captured the argument. They dominated the discussion by churning out gigabytes of mind-numbing internet dissertations about the typeface in the memos, focusing on the curl at the end of the "a," the dip on the top of the "t," the spacing, the superscript, which typewriters were used in the military in 1972.

It was a deceptive approach, and it worked.

Now, you didn't have to be an expert on document analysis to follow the (straightforward, convincing and indeed incontrovertible) argument that certain of the crucial documents were crude forgeries. The person who made this argument in its most complete and convincing form, Dr. Joseph Newcomer, explained that "I am not a fan of George Bush. But I am even less a fan of attempts to commit fraud". And the people who were convinced, and said so, were not all "far right blogosphere bully boys". Geoff Pullum can be a bit rough on careless purveyors of bad grammatical and stylistic advice, but he's no thug; and I believe that he considers himself politically left of center; and he began his post on the subject this way:

The documents that CBS, Dan Rather, and 60 Minutes presented as 1972 memos from the Texas Air National Guard, with their putative revelations that George W. Bush tried to wriggle out of his obligations, are crude forgeries. The evidence for this claim is basically linguistic. There are weaker points about style (a military officer writing a memo to file with "CYA" as the subject?) and abbreviatory arcana (OETR for OER), but the strong evidence has to do with technical topics often discussed on Language Log and fairly close to the business of many modern linguists: things like character sets, typographical details, and word processing technology. Enough so, anyway, that the story does merit a brief but rather serious discussion here, and a comment at the end.

According to Google, the phrase {"reality-based"} occurs 1,150 times on the Huffington Post site, often echoing the political left's positive identification with a negative characterization attributed to a Bush staffer:

The aide said that guys like me were "in what we call the reality-based community," which he defined as people who "believe that solutions emerge from your judicious study of discernible reality." ... "That's not the way the world really works anymore," he continued. "We're an empire now, and when we act, we create our own reality. And while you're studying that reality—judiciously, as you will—we'll act again, creating other new realities, which you can study too, and that's how things will sort out. We're history's actors . . . and you, all of you, will be left to just study what we do."

I'm proud to consider myself a member of the "reality-based community". As a result, I'm reasonably convinced that George W. Bush used family connections to "jump the line" into the Texas National Guard in order to avoid service in Vietnam; and the evidence that I've seen indicates that he did as little actual National Guard service as the law allowed, and maybe less. Having been drafted and sent to Vietnam myself, I resent this a bit, especially in the context of what W now has to say about that war. But Mary Mapes' attempt to rehabilitate those forged documents is not based in any kind of reality that I understand.

[Note added by Geoff Pullum: Allow me to say just this about Mary Mapes's apparent implication that I belong to a gang of "far right blogosphere bully boys" — "keyboard assault artists who saw themselves as avenging angels of the right" — and thus should be dismissed as not competent to analyze documents or review the work of those who did.

It should not be necessary here for the issue of place on the political spectrum to come up at all. I have both Republican friends and Democrat friends who think George W. Bush is quite simply the worst American president they have ever known or heard about. He is loathed by conservatives and leftists alike. In the UK my Tory friends, my Labour-voting friends, and my Liberal-Democrat inclined parents all think the same. Just about everybody I know thinks it would be excellent news if evidence were found that would reveal flagrant failure to perform military duty on the part of GWB, because it might either weaken him politically or hasten his departure from the political scene. But for heaven's sake, we can't let truth be confused with wishful thinking when it comes to evidence. We can't let a forged memo, clearly faked in the early 2000s using Microsoft Word, be passed off as a genuine memo from 1972 typed on a military typewriter. Objecting to that is not right wing!

Ms Mapes is just not even looking at the evidence that I briefly and somewhat reluctantly reviewed in 2004 — still desperately trying to justify herself and Dan Rather and the whole production team, who were simply duped by a clumsy forger. It is truly amazing that even now, three years later, Rather and Mapes are trying to justify their stupidity and dismiss the thoroughly vindicated analyses offered by their many critics.

Grow up, people. You humiliated yourselves on national TV by accepting documents that could be spotted as forgeries as soon as they were released in facsimile. You were had. You were patsies, you were careless, and you caused enormous damage to the reputation of CBS. You ruined the case for GWB's military irresponsibility and mendacity (in very much the same way that Mark Fuhrman wrecked the murder case against O. J. Simpson). You messed up. Deal with it.]

Posted by Mark Liberman at 10:03 AM

Wherein I take the bait

Coturnix at Blog Around the Clock has the interesting job of "Online Community Manager" at PLoS-ONE (Public Library of Science). "My job", he tells us, "is to try to motivate you to comment on the papers there". Doing his job yesterday with respect to a recently-published PLoS paper by Gang Li et al., "Accelerated FoxP2 Evolution in Echolocating Bats", he wrote

Looking forward to further responses by other blogs, hopefully Afarensis, John Hawks and Language Log?

I don't have much time this morning, so I'll start by pointing people to our earlier posts on the subject, and echo the suggestion that people read the excellent and quite accessible new PLoS paper -- which everyone can do, because PLoS is Open Access! Here's the new paper's abstract:

FOXP2 is a transcription factor implicated in the development and neural control of orofacial coordination, particularly with respect to vocalisation. Observations that orthologues show almost no variation across vertebrates yet differ by two amino acids between humans and chimpanzees have led to speculation that recent evolutionary changes might relate to the emergence of language. Echolocating bats face especially challenging sensorimotor demands, using vocal signals for orientation and often for prey capture. To determine whether mutations in the FoxP2 gene could be associated with echolocation, we sequenced FoxP2 from echolocating and non-echolocating bats as well as a range of other mammal species. We found that contrary to previous reports, FoxP2 is not highly conserved across all nonhuman mammals but is extremely diverse in echolocating bats. We detected divergent selection (a change in selective pressure) at FoxP2 between bats with contrasting sonar systems, suggesting the intriguing possibility of a role for FoxP2 in the evolution and development of echolocation. We speculate that observed accelerated evolution of FoxP2 in bats supports a previously proposed function in sensorimotor coordination.

I'll also reproduce their cool phylogenetic tree (actually it's more of a phylogenetic thicket, or perhaps a phylogenetic tumbleweed) showing that whatever use humans have made of our little FOXP2 innovation, it's small change relative to the massive innovations by the bats in the same gene:

Figure 1. Radial phylogenetic tree showing relative rates of non-synonymous evolution among 35 eutherian mammals, including 13 bats.
Bat species are given as italicised binomial names. Branch lengths based on maximum-likelihood estimates of non-synonymous substitutions along 1995 bp of the FoxP2 gene are superimposed onto a cladogram based on published trees. Bat lineages are coloured to show the echolocating Yinpterochiroptera (blue) that mostly possess high duty constant frequency (CF) calls with at least partial Doppler shift compensation, the Yangochiroptera (orange) that mostly possess low duty cycle calls, as well as the absence of laryngeal echolocation in Yinpterochiroptera fruit bats (violet).

And I'll also put in another plug for FABLE ("Fast Automated Biomedical Literature Extraction"), which is the result of an NSF-funded collaboration between computational linguists at Penn and biomedical researchers at Children's Hospital. It's fast, it's free, and it'll let you find all the original research articles that mention FOXP2 under any of its synonyms (cag repeat protein 44; cagh44; dkfzp686h1726; forkhead box p2; forkhead box protein p2; foxp2; spch1; tnrc10; trinucleotide repeat-containing gene 10; trinucleotide repeat-containing gene 10 protein), or search for other genes often mentioned in association with FOXP2 (it finds 99 of them, starting with PRM2, FOXP1, FOXP3, IL2, SHC2, AUTS1), and so on.

And finally, I'd like to head off a mis-interpretation. Gang Li et al. write that FOXP2 is "implicated in the development and neural control of orofacial coordination, particularly with respect to vocalisation". They don't intend the implication that FOXP2 should be identified as (something like) "the gene for orofacial sensorimotor coordination" (even though that would be a step up from calling it "the language gene" or "the grammar gene" -- see Christine Kenneally, "First Language Gene Found", Wired News, 2001). They don't intend this implication because they know very well that many other genes are also "implicated in the development and neural control of orofacial coordination", and that FOXP2 is a transcription factor that affects the expression of many other genes (or better, is part of many other genetic networks) with many other functions as well.

Thus Weiguo Shu et al., "Foxp2 and Foxp1 cooperatively regulate lung and esophagus development", Development 134: 1991-2000, 2007:

The airways of the lung develop through a reiterative process of branching morphogenesis that gives rise to the intricate and extensive surface area required for postnatal respiration. The forkhead transcription factors Foxp2 and Foxp1 are expressed in multiple foregut-derived tissues including the lung and intestine. In this report, we show that loss of Foxp2 in mouse leads to defective postnatal lung alveolarization, contributing to postnatal lethality. Using in vitro and in vivo assays, we show that T1alpha, a lung alveolar epithelial type 1 cell-restricted gene crucial for lung development and function, is a direct target of Foxp2 and Foxp1. [...] These data identify Foxp2 and Foxp1 as crucial regulators of lung and esophageal development, underscoring the necessity of these transcription factors in the development of anterior foregut-derived tissues and demonstrating functional cooperativity between members of the Foxp1/2/4 family in tissues where they are co-expressed.

There's obviously some interesting connection between FOXP2 and vocalization, as you can learn by reading Stephanie A. White et al., "Singing Mice, Songbirds, and More: Models for FOXP2 Function and Dysfunction in Human Speech and Language", The Journal of Neuroscience, 26(41):10376-10379, 2006. But let's not get carried away.

Post-finally, I'll end by expressing some puzzlement about what seems to be a careless mis-statement in the Blog Around The Clock post that I'm responding to. It starts this way:

Earlier studies have indicated that a gene called FOXP2, possibly involved in brain development, is extremely conserved in vertebrates, except for two notable mutations in humans. This finding suggested that this gene may in some way be involved in the evolution of language, and was thus dubbed by the popular press "the language gene". See, for instance, this and this for some recent research on the geographic variation of this gene (and related genes) and its relation to types of languages humans use (e.g., tonal vs. non-tonal).

The "geographical variation" business refers to the work by Dediu and Ladd that was previously discussed on Language Log here and here. They correlated the population frequency of adaptive haplogroups of the two genes, ASPM and Microcephalin, with the geographical distribution of lexical tone. (Well, actually, they correlated the geographic distribution of 1,000 genetic variants with the geographic distribution of 26 linguistic traits, but that was only to provide a background distribution for evaluating the correlation of Microcephalin and ASPM with lexical tone.)

Now, neither Microcephalin nor ASPM is another name for FOXP2. They're not even close. The UCSC Gene Browser (accessed via FABLE's LitTrack feature) identifies ASPM as chr1:195,319,996-195,386,439 (i.e. positions 195,319,996 to 195,386,439 on chromosome 1); Microcephalin as MCPH1 at chr8:6,251,528-6,509,561; and FOXP2 as chr7:113,513,617-114,157,642.
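For anyone who wants to check the "not even close" claim mechanically, here is a minimal sketch that parses the three coordinate strings quoted above and confirms that each gene sits on a different chromosome. The helper name `parse_locus` and the `LOCI` table are my own illustrative inventions, not part of the UCSC browser or of FABLE, and the span lengths it prints assume that both endpoints are counted.

```python
import re

# Coordinates exactly as quoted in the post ("chrN:start-end").
LOCI = {
    "ASPM": "chr1:195,319,996-195,386,439",
    "MCPH1": "chr8:6,251,528-6,509,561",
    "FOXP2": "chr7:113,513,617-114,157,642",
}


def parse_locus(text):
    """Split 'chrN:start-end' into (chromosome, start, end) with integer positions."""
    match = re.fullmatch(r"(chr\w+):([\d,]+)-([\d,]+)", text)
    if match is None:
        raise ValueError(f"unrecognized locus: {text!r}")
    chrom, start, end = match.groups()
    # Strip the thousands separators before converting to integers.
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


parsed = {gene: parse_locus(text) for gene, text in LOCI.items()}

for gene, (chrom, start, end) in parsed.items():
    # Span length assumes an inclusive range (an assumption, see above).
    print(f"{gene}: {chrom}, {end - start + 1:,} bp")

# Three genes, three distinct chromosomes.
chromosomes = {chrom for chrom, _, _ in parsed.values()}
print("distinct chromosomes:", len(chromosomes))
```

Since the three loci land on chromosomes 1, 8 and 7, no amount of squinting at synonyms will turn ASPM or Microcephalin into FOXP2.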

What was Coturnix thinking? I'd hate to think that he was just stirring the pot with random evocative references, as less reputable PR operations are wont to do.

[Update -- Bora Zivkovic, aka Coturnix, wrote (in part):

Thank you for the excellent response. I have to admit that I put my post together in great haste (thus my mistake in it, but I was hoping to start a discussion) [...]

Translation: He was stirring the pot with random evocative references -- but it was in a good cause ;-)! ]

Posted by Mark Liberman at 06:16 AM

The prehistory of emoticons

There's been a fair amount of press coverage this week for the 25th anniversary of a momentous event in the history of online communication. On September 19, 1982 at 11:44 a.m., Scott Fahlman posted this electronic message to a computer science bulletin board at his home institution, Carnegie Mellon University:

19-Sep-82 11:44 Scott E Fahlman :-)
From: Scott E Fahlman <Fahlman at Cmu-20c>

I propose that the following character sequence for joke markers:

:-)

Read it sideways. Actually, it is probably more economical to mark
things that are NOT jokes, given current trends. For this, use

:-(

Yes, it's the first smiley face (together with a frowny face), now classified as an emoticon. The message was long considered lost, until it was recovered from backup tapes five years ago. At the time of the recovery, ZDNet UK reported that "the date 19 September, 1982, is now likely to join the lexicon of other significant dates in the information revolution." And sure enough, the fanfare for the 25th anniversary has been ample. Despite the significance of Fahlman's world-changing combination of three punctuation marks, there were actually a few predecessors in earlier decades, proto-emoticons if you will.

First of all, Fahlman should properly be credited for devising the earliest known ASCII smiley, since in the 1970s users of the PLATO message board (among the first of its kind) concocted a whole range of smileys by overstriking characters. But emoticon-like symbols also turned up from time to time before the age of online communication. Urban legend debunker Barbara Mikkelson of Snopes.com recently found just such a forerunner in the May 1967 issue of Reader's Digest:

Many people write letters with strong expression in them, but my Aunt Ev is the only person I know who can write a facial expression. Aunt Ev's expression is a symbol that looks like this: —) It represents her tongue stuck in her cheek. Here's the way she used it in her last letter: "Your Cousin Vernie is a natural blonde again —) Will Wamsley is the new superintendent over at the factory. Marge Pinkleman says they tried to get her husband to take the job —) but he told them he couldn't accept less that $12,000 a year —) "
(Reader's Digest, May 1967, p. 160, citing Ralph Reppert of Baltimore's Sunday Sun)

"Granted, the 'tongue stuck in cheek' glyph is a bit different than the smiley face in that it is meant to be read square on; that is, looked at directly," Mikkelson writes. "However, lack of head tilt requirement or not, it is indeed an emoticon in the sense that keyboard symbols were used to create a representation of the sender's face for the purpose of conveying a better sense of how she meant her words to be taken." (Wikipedia reports the claim that a similar tongue-in-cheek symbol was used in April 1979 by Kevin MacKenzie on MsgGroup, an early bulletin board.)

But if we're going to extend the definition of emoticon to any expressive use of typographical symbols "to create a representation of the sender's face," then Ralph Reppert's Aunt Ev is hardly the only pre-computer precursor. As previously noted here (and, indeed, on Scott Fahlman's own site), Vladimir Nabokov made the following comment in an interview with Alden Whitman of the New York Times in April 1969:

Q: How do you rank yourself among writers (living) and of the immediate past?
Nabokov: I often think there should exist a special typographical sign for a smile – some sort of concave mark, a supine round bracket, which I would now like to trace in reply to your question.

A "supine round bracket" would look something like this: . Like Aunt Ev's tongue-in-cheek symbol, it's meant to be viewed directly and not in the sideways orientation of modern emoticons. Unfortunately, the "supine round bracket" doesn't match any symbol in standard typography, at least without manipulation. (Now there's a Unicode character that fits the bill, U+23DD or 'bottom parenthesis,' though it's not supported in all fonts. [John Wells emails to point out that the more appropriate symbol is U+2323 'smile,' which goes along with U+2322 'frown.'])
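For the curious, the code points mentioned above can be looked up with Python's standard `unicodedata` module. This is only a small sketch: the helper names `describe` and `is_ascii` are made up for illustration, and whether the characters actually render depends on your font, as noted.

```python
import unicodedata

# The three code points mentioned in the paragraph above.
CANDIDATES = ["\u23DD", "\u2323", "\u2322"]


def describe(ch):
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def is_ascii(text):
    """True if every character falls in the 7-bit ASCII range."""
    return all(ord(c) < 128 for c in text)


for ch in CANDIDATES:
    print(describe(ch), ch)

# Fahlman's smiley needs nothing beyond ASCII; the smile character does.
print(is_ascii(":-)"), is_ascii("\u2323"))
```

The contrast in the last line is part of why the sideways smiley caught on: it uses only characters that every keyboard and terminal of 1982 already had.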

As it happens, Ambrose Bierce had pretty much the same idea way back in 1887, and unlike Nabokov he actually put it on the printed page instead of just talking about it. In an essay entitled "For Brevity and Clarity," Bierce proposes a number of reforms for the English language in his usual sardonic style. Here's the relevant passage as it appears in The Collected Works of Ambrose Bierce, Vol. XI: Antepenultimata (1912), pp. 386-7, courtesy of Google Books:

Before seeing the Google Books page image, I had thought that Bierce's suggested punctuation looked like this: \___/. That's how it appears in a footnote to Andrew Graham's online essay, "Forked Tongue: The Language of Serpent in the Enlarged Devil's Dictionary of Ambrose Bierce," as well as the Wikipedia entry on emoticons. It's interesting to discover that the parenthesis-as-smile representation actually goes back 120 years. (In Ambrose Bierce's Civilians and Soldiers in Context: A Critical Study, Donald T. Blume dates this essay to September 25, 1887, but the version published in the 1912 collection may have been subsequently revised.)

So as we celebrate Fahlman's baptismal smiley, let's also pay homage to Bierce's "snigger point or note of cachinnation." (The OED defines cachinnation as "loud or immoderate laughter" — for the kids that would translate to LOL or perhaps ROTFLMAO.) Nestled in an obscure corner of Bierce's collected works, the snigger point probably had no lasting impact on future typographical innovations, but it's still noteworthy that his ideas would be revisited independently in coming decades. Bierce's cynical deployment of the snigger point also prefigures various proposals for sarcasm marks (previously discussed here). Then again, this call for sarcastic punctuation was itself couched in sarcasm — as Donald Blume explains, Bierce was snidely recommending the snigger point to writers he disapproved of. What do you suppose a finely tuned wit like Bierce would make of the current efflorescence of less-than-clever emoticons? :-(

Posted by Benjamin Zimmer at 12:00 AM

September 20, 2007

Yes, they're still dying

Earlier this year I announced the publication of When Languages Die by K. David Harrison. The NYT have finally caught up with the non-profit group led by Harrison and Gregory D. S. Anderson: the Living Tongues Institute for Endangered Languages, in an article by John Noble Wilford published yesterday called "Languages Die, but Not Their Last Words". (The most recent related piece in the NYT appears to be Say No More (nudge nudge, wink wink?), a Magazine piece by Jack Hitt published way back in Feb. 2004.)

There's a related article and video on National Geographic News, published two days ago: "Last Speaker of "Extinct" Language Found". ~~Anderson is identified as Harrison (and vice-versa) in the video, but otherwise worth a look.~~ [Update: Just bumped into Roger Shuy at the water cooler, and he tells me this misidentification has been fixed.]

Speaking of National Geographic: be sure to check out the Language Hotspots page (linked from yesterday's NYT article).

[Update: Ben Zimmer writes to inform us about the critical discussion over at Anggarrgoon.]

Hat tip to Roger Levy and Alex Del Giudice for sending the links.

Posted by Eric Bakovic at 11:52 AM

For lovers of crosswords and new words

Over on OUPblog, the blog for Oxford University Press (where I also hang my hat), today's launching of the sixth edition of the Shorter Oxford English Dictionary is being celebrated cruciverbally. The creators of Jonesin’ Crosswords, whose puzzles appear in alternative weeklies around the country, have contributed a crossword in which the long entries were selected from among the 2,500 words and phrases new to the sixth edition. Check it out. (You can read more about the Shorter in my OUPblog column here and here.)

And if you're looking for something a little more out there, lexically speaking, check out the crossword that Francis Heaney put together as a special puzzle for contestants at this year's American Crossword Puzzle Tournament (PDF, Across Lite). That puzzle features "recently coined words or archaic words waiting to be repopularized," championed by OUP's Chief Consulting Editor for American Dictionaries, Erin McKean (as part of a forthcoming Discovery Channel documentary, "The Joy of Lex"). Maybe a few of those words will be appearing in the seventh edition of the Shorter!

Posted by Benjamin Zimmer at 11:41 AM

Mistakes

I hope that Bertrand Russell, as he hoists a pint with Kurt Gödel and Alan Turing in some celestial pub, will enjoy discussing whether (the message on) this sign is a member of the set of all quotations that are true of themselves:

Victor Mair, who sent this in, explains:

At the top it says: LI4ZHI4 MING2YAN2 "famous words of encouragement" (lit., encouragement famous words).

A direct, literal translation of the Chinese translation would be: "Deriving / drawing / extracting a lesson from (one's) mistakes is an extremely important part of education."

The translation is acceptable and has no obvious typographical error as in the English, so the English mistake must have been genuine and unintentional.

The attribution simply says YING1GUO2 ZHE2XUE2JIA1 ("English philosopher") LUO2SU4. B.

[Photo by David Moser]

Posted by Mark Liberman at 10:35 AM

Crosley Shelvador comes in from the cold

I seem to have been excessively indirect in my remarks on "Dr. Alfred Crockus and Crosley Shelvador, M.D." This happens a lot, as certain people have helpfully pointed out to me over the years, though this failing is balanced (if not excused) by the frequent occasions when they find me to be excessively explicit. In any case, I would have saved some readers the time that they've spent showing that Peconic County Community College has probably never existed, that Shelvador is not a plausible Brazilian surname, etc., if I had directly stated that Crosley Shelvador was the first Refrigerator-American ever to have had an article published in a major refereed journal.

In fact, I understated Crosley's accomplishments, since he also authored a 2001 book notice in Language (a review of "Cracking the Codes" by Richard Parkinson), and more. However, he fell far short of the brilliant career that some of us anticipated for him. As Larry Horn explained on the ADS-L mailing list (Re: obsolescene, 28 Feb 2005):

The Crosley Shelvador...ah yes, I remember it well. When I was a (non-post-doc) post-doc at MIT in 1971-72, the old fridge/ice box for graduate student use in one of the corridors of the linguistics quarters in the late Building 20 (can't recall if it was the D wing or E wing) was a Crosley Shelvador, and some of the graduate students (this was the era of Lasnik, Fiengo, Wasow, Prince, et al.) decided that this would be our "Bourbaki", so that squibs would be submitted as authored by Crosley Shelvador, acknowledgments in papers would express gratitude to Crosley Shelvador, and so on. Can't recall (this was 33 years ago, and memories of even important events of this kind do tend to fade over time, as Maurice Chevalier reminded us) how far we progressed with this scam, or what became of the eponymous Crosley himself.

Meanwhile, back in the hunt for Dr. Alfred Crockus, discoverer of the brain region that bears his name ("the detailed section of the brain, a part of the frontal lope"), Tracy Walsh has supported the bar-bet theory with a clue from Cassell's Dictionary of Slang:

crocus (metallorum) n. (also croacus, crockus, crokus) 1 [late 18C] (orig. milit.) a doctor, a surgeon, esp. a quack. 2 [mid-late 19C] a beggar who poses as a doctor. [? pun on croak us (though CROAK v.², to die or kill is first recorded slightly later), but OED suggests 'the Latinized surname of Dr. Helkiah Crooke, author of a Description of the Body of Man, 1615, Instruments of Chirurgery, 1631, etc. ...' The quack implication suggests a further pun on hocus-pocus. Note fairground use, crocus, a doctor, a herbalist, a miracle-worker; market use, crocus, a fair-weather trader who works only during the spring or summer (f. the flower). Metallorum, lit. 'of metals', plays on crocus metallorum or crocus antimonii, which are more or less impure oxysulphides of antimony, obtained by calcination]

Several readers also suggested an etymological or perhaps intellectual connection to another entry on the same page of Cassell's:

crock of shit n. (also bucket of shit, crock of bullshit, load of shit) [1940s+] complete nonsense, a lying statement, anything useless or unpleasant [SE crock, a pot + SHIT n.³]

I prefer to think that Dr. Crockus might be a well-loved Crock Pot. The brain region known as the "Crockus", though certainly complete nonsense, is anything but useless, since it "supports the Corpus Callosum" and explains why girls are better with details and thus well suited to secretarial or clerical work. And the crock-pot hypothesis also explains the good doctor's reclusive behavior. You may find this hard to believe, in an age in which several house plants have been elected to national office, but in the academic world, there is still a lot of prejudice against household appliances.

[Update -- Dr. Vaughan Bell at Mind Hacks has begun making nominations for the Dr. Alfred Crockus Award for the Misuse of Neuroscience.]

Posted by Mark Liberman at 07:36 AM

September 19, 2007

Dr. Alfred Crockus and Crosley Shelvador, M.D.

This all started on Monday, with an email query from Heidi W. She had attended a lecture by Dan Hodgins, an "internationally known presenter" who told the assembled teachers in her large urban school district that "Girls see the details of experiences", while "Boys Brains see the whole but not the details", because the girls' "Crockus is Four times larger than boys". Heidi questioned the validity of Hodgins' brain-lore, and she was especially puzzled about what that "Crockus" might be.

Some initial poking around turned up no prior art in crockusometry ("How big is your crockus", 9/17/2007; "High crockalorum", 9/18/2007). So I emailed Dan Hodgins himself, who responded that "The Crockus was actually just recently named by Dr. Alfred Crockus". This deepened the mystery, in a way, because I couldn't find any publications by Alfred Crockus on Google Scholar, nor any web presence for Dr. Crockus more generally. So I wrote to Dan Hodgins again:

Thanks for the quick reply!

But who is Dr. Alfred Crockus? I can't find him via Google or Wikipedia.

And again, he was kind enough to reply quickly:

Crockus works for Boston Medical University Hospital.
Good luck
Dan

This didn't help me locate the eponymous doctor, and so I asked:

Do you mean Boston Medical Center, which was formed about ten years ago by the merger of Boston City Hospital (BCH) and Boston University Medical Center Hospital (BUMCH)? I checked their directory search and couldn't find anyone named Crockus:
http://www.bmc.org/physref/index.asp?ty=basicphys

As far as I know, the only other university hospital in Boston is the Harvard/MGH complex, and he doesn't seem to be there either. Perhaps he's retired?

Can you point me to an article or a book where Dr. Crockus describes his findings?

I haven't gotten an answer yet.

However, I too know what it is like to owe an intellectual debt to someone whose web presence is so ephemeral that others might be excused for doubting his very existence.

When I was in graduate school at M.I.T., I shared a group office in Building 20, down the hall from the Tech Model Railroad Club, with a large and motley collection of fellow students and others. You can consult the list of dissertations completed between 1972 and 1975 for a list -- but it's only a partial one, because some of our office-mates never got to the point of finishing a degree, and others were never enrolled in the program to start with, but just hung out in Building 20 for reasons of their own. During this period, one of the most reliable and influential inhabitants was Crosley Shelvador -- he seemed to be there 24/7, tirelessly interacting with whoever else happened to be in the office.

Older than the rest of us, Crosley never offered any information about his background. Mark Aronoff once told me about a story to the effect that Crosley Shelvador (or perhaps another individual of the same name) was an Anglo-Brazilian doctor who had made important (though amateur) contributions to Amazonian linguistics. But I suspect that this is one of those rumors that starts as someone's "for instance" speculation, perhaps in this case based only on the evocative resonances of Crosley's name.

In any case, I think it's fair to say that no one had a larger influence on my brief but intense graduate-school career than Crosley Shelvador did. Many of my best ideas were supported if not created by interactions with him. But when I now search Google Scholar for {Crosley Shelvador}, the only evidence that I can find of his existence as an intellectual is one single work: a review of Language and the History of Thought, by Nancy Struever, and The Search for the Perfect Language, by Umberto Eco, in Language 72(4), Dec. 1996, pp. 852-856. A search for {Crosley Shelvador} on Google Images turns up a number of pictures of what may be relatives, but none of them look exactly like Crosley as I remember him.

When I taught at M.I.T. in 1978, Crosley was still hanging around, but after that, I lost track of him, and I have no idea where he is today. The 1996 Language book review gives his address as Department of Cultural Studies, Peconic County Community College, Shelter Island, NY -- but that institution seems to have vanished from the pages of time, leaving a trace only in a list of "common misspellings or interpretations" at a dodgy-looking college loan site. Perhaps Mark Aronoff, who was the editor of Language in 1996, knows what happened to Crosley Shelvador and to PCCC.

And who knows, if I ever locate Crosley again, maybe Alfred Crockus will be there too.

[Meanwhile, in other crockusological news, Deen Skolnick Weisberg pointed out to me that there's an article in press at Cognition -- David P. McCabe and Alan D. Castel, "Seeing is believing: The effect of brain images on judgments of scientific reasoning", whose abstract reads as follows:

Brain images are believed to have a particularly persuasive influence on the public perception of research on cognition. Three experiments are reported showing that presenting brain images with articles summarizing cognitive neuroscience research resulted in higher ratings of scientific reasoning for arguments made in those articles, as compared to articles accompanied by bar graphs, a topographical map of brain activation, or no image. These data lend support to the notion that part of the fascination, and the credibility, of brain imaging research lies in the persuasive power of the actual brain images themselves. We argue that brain images are influential because they provide a physical basis for abstract cognitive processes, appealing to people’s affinity for reductionistic explanations of cognitive phenomena.

I believe that this provides additional experimental support for the wager that I jocularly attributed to Prof. Hodgins.]

[More here.]

Posted by Mark Liberman at 09:53 PM

Interviewing hits prime time

We all know that the interview is a standard way for academic researchers to gather data, especially in the social sciences. This approach is used by psychologists, anthropologists, sociologists, educators (mostly for testing purposes), historians, and by that quaint group of linguists who hold that naturally occurring language is a pretty good source for us to learn about the way people talk. Even police departments interview suspects, although hardly for academic purposes. In this age of handbooks on virtually every topic you can think of, Sage has published still one more, a tome called The Handbook of Interview Research (okay, I admit it, I have a chapter in it). And, of course, we also watch interviews all the time on television and read them in magazines and newspapers. But who would have guessed that the little old, researchy, harmless interview would ever make it as a prime-time television program?

Never fear, a brand new program called OCI, which stands for On Campus Interview, is said to begin airing soon on your local television station. It's alleged to be one of those reality-based shows in which sexy looking (hey, this is television) representatives of law firms jet around the country interviewing prospective law school grads for positions in top law firms, allegedly trying to find out whether they'll be a good fit for those companies. Gives us a chance to see a variety of law school types, from nerds to slick super-salespersons, if you're interested in that sort of thing. I haven't seen the program, of course, but it sounds a little like a Donald Trump take-off, doesn't it?

Okay, so the article is a put-on. But in the current television climate, it's possible that a show like this COULD happen, couldn't it?

Posted by Roger Shuy at 06:10 PM

Post like a pirate!

We interrupt our pursuit of Dr. Alfred Crockus and his brain region to remind you that this is Talk Like a Pirate day, and to enact our annual ceremonial repost of the Corsair Ergonomic Keyboard for Pirates:

Poke around in our treasure trove of pirate lore and you'll find some gems mixed in with the trash. For example, this excellent instruction video on How to Talk Like a Pirate:

If I had time, I'd tell you about Dr. Crockus's pioneering work in thomboracic surgery, but right now I need to get back to business.

Meanwhile, over in Old Blighty, it seems to be "Talk like a toff as a result of brain injury" month. Just a few days ago, it was a Czech motorcycle racer who allegedly started speaking like a BBC newsreader after someone ran over his head. Now it's a Yorkshire boy:

His mum Ruth, from York, said: "He just kept on surprising doctors.

"He survived the operation and the most amazing thing is that he came out of surgery with a completely different accent.
"He went in with a York accent and came out all posh. He no longer had short 'a' and 'u' vowel sounds, they were all long."

You can find some information and links about Foreign Accent Syndrome in my post on the Czech bike racer.

Posted by Mark Liberman at 06:55 AM

September 18, 2007

Fictional Linguists

A few weeks ago I got phone calls from two different companies that produce tape recordings of books for people who themselves are either unable to read or who just like to hear novels read to them. I was a bit mystified about why they called, since the only thing the two callers wanted to know was how to pronounce my name (for those of you who don't know and may be curious, it's pronounced "shy").

I've been cited in academic publications but never before in a piece of fiction, at least as far as I know. Kathy Reichs' new novel, Bones to Ashes (Scribner, 2007) is her latest book in which fictional forensic anthropologist Temperance Brennan gets involved in a gripping story of crime and lust, eventually solved by forensic science. Reichs knows this field well, because she's a forensic anthropologist herself and holds a position in the anthropology department at the University of North Carolina, Charlotte.

So why is she citing me in her novel? Reichs first mentions forensic linguistics in chapter 23, referring to the Unabomber case (foreshadowing?), then she drops the topic for a while until she introduces her old college friend, Rob Potter. This is a fictional name based on her real-life acquaintance, Rob Leonard, a forensic linguist at Hofstra University. Apparently Reichs got the novel's forensic linguistic twist from him. The fictional anthropologist asks the fictional linguist to analyze two sets of poems, which he does in chapter 34. I won't tell you what he finds, since that might spoil the novel for you, but in the process of explaining his findings, fictional Rob tells her about a somewhat different case that his mentor, Roger Shuy, once worked on. Here's the non-fictional part--real me, real Rob, and real report of a case. His analysis, of course, convinces the forensic anthropologist, who had gathered lots of other evidence on her own, that the poems were written by the same person--her childhood friend who had gone missing many years earlier.

Arnold Zwicky once posted about fiction containing fictional linguists here and Heidi Harley posted about actors playing linguists in movies here but I don't think actual linguists were mentioned in them.

I'm not much of a novel reader so I won't try to pass judgment on the quality of this one. But it was kinda nice to see forensic linguistics play a role in it. And maybe people who listen to the talking book will finally stop calling me "shoe-we," "shay," or "shoe." Or confusing my name with that of Norman Hsu, the disgraced political fund raiser, now residing in a Colorado prison.

Posted by Roger Shuy at 11:52 AM

High Crockalorum

Yesterday, Heidi W. sent in a note about the presentation that Dan Hodgins made to an Early Childhood Education group in her large urban school district ("How big is your crockus?", 9/17/2007). Much of his talk was the standard stuff of current pop neuro-indoctrinology ("science has revealed that boys' brains are made of snips and snails and puppy-dog tails"), but one bit stuck out: the idea that a brain structure called "the crockus" is four times larger in girls than in boys.

Now, I'm used to seeing the pop neuro-indoctrinologists misrepresenting or even inventing numbers, exaggerating differences, and generally misusing science to shore up weak arguments about social policy. But this is the first time, as far as I know, that someone has actually made up a whole new brain area.

In fact, the idea that someone would actually fabricate "the crockus" as a neuro-anatomical neologism was so unexpected, so far beyond the boundaries of my imagination, that I misinterpreted Heidi's note. I thought "crockus" was her attempt to render some odd pronunciation that Hodgins used in his presentation.

But she quickly corrected me, and sent in a scan of (parts of) his handout, which was the take-away paper form of his PowerPoint slides. Here's the crucial slide:

His next slide tells us that (because of their smaller Crockus) "boys see the whole but not the details".

He illustrated the role of Crockus by showing the audience a lateral view of the brain -- Heidi sent a scan of the handout version:

And she did a bit more research:

The drawing of the brain is not labeled on the handout, and it wasn't labeled in Hodgins' PowerPoint presentation, but the drawing indicates that it's from BrainConnection.com and I just located it here.

Hodgins referred to the small royal blue area, which is labeled "pars opercularis" on the web site PPslide, and he said that's the size of the crockus in males, and he referred to the motor cortex (somewhat lighter shade of blue) and said that's the size of the crockus in females.

This is truly strange stuff. I feel like I'm in a magic realist novel that's slipped slightly out of editorial control.

But I have a theory. Maybe back in 2003, Prof. Hodgins was talking with some of his drinking buddies, and the conversation went something like this:

Hodgins: Those education professionals, they're so worried about sex differences and so wowed by neuroscience, you can show 'em a picture of the brain and tell 'em any crazy thing about how brain scans show boys are different from girls. And they not only believe it, they pay you. Fly you out, nice hotel, per diem, the works.

Drinking Buddy #1: OK, they're desperate and they're credulous, but you got to make it plausible.

Hodgins: Nah, you take a random brain picture and any stupid made-up words and numbers, they'll swallow it like ice cream.

Drinking Buddy #2: Come on. You got to do a little reading, anyhow, you can't just tell them, oh, the "crockus" is four times bigger in girls and that's how come they're like better with details while us guys deal with the big picture.

Hodgins: Sure I could, absolutely.

Drinking Buddy #1: $500 says you're full of it.

Hodgins: You're on.

This theory is surely false. Maybe Prof. Hodgins just mis-remembered pars opercularis as crockus. Or ....

No, I give up. I'll just spoil the joke, in my usual heavy-handed way, by citing what's actually known about sex differences among school-age children in the brain region that Hodgins seems to have been talking about.

Let's take a look at R.E. Blanton et al., "Gender differences in the left inferior frontal gyrus", Neuroimage, 22(2):626-636, 2004.

This study examined frontal lobe subregions in 46 normal children and adolescents (25 females, mean age: 11.08, SD: 3.07; and 21 males, mean age: 10.76, SD: 2.61) to assess the effects of age and gender on volumetric measures as well as hemispheric asymmetries.

Here's their Fig. 8 "Gender Differences"

which they explain as follows:

Total intracranial volume, total gray matter, and total white matter were significantly larger in boys compared to girls (F = 11.63, P = 0.001; F = 16.17, P < 0.001; F = 9.59, P = 0.004, respectively). In addition, total left gray matter volume was found to be significantly larger in boys (F = 7.58, P = 0.009).

After covarying for total intracranial volume, male subjects were found to have more gray matter in the left IFG (F = 9.14, P = 0.004), a trend for a significantly larger left IFG relative to girls (F = 6.38, P = 0.016), and a trend for increased asymmetry in total right white matter as compared to total left (F = 6.24, P = 0.017).

Let me point out here, again, that the distributions are heavily overlapped, and that most of the subjects are in the overlapped region. The sex differences are interesting and worth further study, but it's nuts to prescribe on the basis of such evidence that our educational systems need to treat boys and girls as if they were completely separate species.

[Update -- Since the pars opercularis and the pars triangularis together form Broca's area, several readers have suggested that perhaps it was Hodgins who misheard "Broca's" as "Crockus". This occurred to me as well -- and also to Heidi W. -- but we were all wrong. I wrote to Dan Hodgins, asking

Someone who recently heard you speak asked if I could explain your reference to a part of the brain called the "Crockus", which is four times larger in girls than in boys, and is apparently involved in explaining differences in attention to detail.

I couldn't find any information on the web about a part of the brain by that name. Can you help?

Prof. Hodgins was kind enough to reply very quickly:

Thanks for asking....The Crockus was actually just recently named by Dr. Alfred Crockus. It is the detailed section of the brain, a part of the frontal lope. It is the detailed section of the brain. You are right, it is four times larger in females then males from birth. This part of the brain supports the Corpus Callosum (the part of the brain that connects the right and left hemisphere. The larger the crockus the more details are percieved by the two sides of the brain.

For boys, usually they only view and analyze the whole picture, not the sum of its details. Girls brains are wired to look at the details first, which then leads them to the whole picture.

Look at the work by Moir.

This deepens the mystery, I think, because I can't find any likely-looking Alfred Crockus via Google Scholar or Wikipedia or even general web search. I think that the "Moir" he's referring to is the co-author of Anne Moir and David Jessel, Brainsex, 1992. But Amazon offers its "Search Inside" feature for that work, and a search for "Crockus" in it comes up empty.

I've written back to Prof. Hodgins for more information.]

Posted by Mark Liberman at 06:17 AM

Lowest of any of the others

I'm used to seeing comparatives like "taller than anybody on his team" meaning taller than anybody *else* on his team, and have imagined (off the top of my cuff) that we automatically accommodate a restriction on the domain, though I don't know if there's been any real work on how this works, whether the same phenomenon is found cross-linguistically, etc. I probably commit this 'error' myself sometimes -- the logician in me hates it, but the non-prescriptivist in me reminds the logician-in-me that it's quite benign, since it would never be misunderstood, so why fuss? Just file it away as an interesting curiosity that I hope someone someday will work on (or already has? I'd be curious to know) and explain how/why it happens so naturally.

But now I've found a superlative construction with the 'opposite' 'error': "I really like the look (and feel!) of my ifrogz case. It has the lowest profile of any of the other cases I have used." -- from a testimonial in the metrobagz part of the ifrogz site, http://ifrogz.com/products/metro-bagz/.

I wouldn't know how to account for it unless it could be called hypercorrection (I somehow doubt it), and am wondering whether the different domain requirements of the comparative and superlative constructions are just very commonly confused (the way "3 times as big as" vs "3 times bigger than" are distinguished only by pedants or for small fractions like "50% as big as" vs "50% bigger than").

This time we have to EXPAND the domain to put the item being compared back IN, since the superlative construction needs to involve a set that includes the item in question. My inchoate hypothesis for the comparative examples was that when comparing an individual and a whole set, it's natural to invoke some sort of implicit disjoint reference idea, so that even when we say "[taller than] everybody in his class", we mean "everybody except him". But this example really messes up any such idea, since "any of the other cases I have used" requires MORE words, and we have to add the ifrogz case back in to provide a suitable domain for the superlative. So it can't be anything like an accommodation of 'disjoint reference'.

Although this construction seems unrelated to negation, it somehow feels a lot like all the "can hardly underestimate his importance" examples we've been spotting lately.

I was also unsure whether "any" is normal in superlatives, thinking it may usually be restricted to comparatives. But when I asked around, Larry Horn assured me that it occurs very readily in superlatives. Larry wrote:

-----

NPIs in general are fine in superlatives--"the lowest profile I've ever seen" is impeccable, as is "the toughest problem I have yet encountered", and "the lowest profile of any of them" is OK for me as well. And my intuitions aren't unique; just checking a few of the likely suspects, I find e.g.

"most of any": 404,000 google hits

"best of any": 242,000 google hits

"biggest of any": 9,900 google hits

-----

In any case, maybe "has the lowest profile of any of the other cases" could be analyzed as a blend of comparative and superlative: "has a lower profile than any (of the) other X" + "the lowest profile of any X" (Larry's formulation).

Larry agrees that it's an interesting phenomenon, and that it's more problematic than the restriction-accommodation case of "taller than anyone in his family", which he sometimes uses variants of for translations in intro. semantics.

Google peculiarities: When I tried to get a rough Google comparison of "biggest * of any of the other" vs. "biggest * of any of the", I actually seemed to get a much bigger number for the first, though it should be a subset of the second. I got 106,000,000 for the first and just 12,800 for the second! But then with some help from Kai von Fintel and David Beaver, it was discovered that Google behaves very strangely with some ungrammatical strings. Closer inspection of the return from the search that seemed to give 106,000,000 hits shows that it returns only 3 pages of results, with the number 106,000,000 at the top of pages 1 and 2, but the number 21 on page 3, and in fact it only returned 21 hits.

David sleuthed out the phenomenon; here's his report.

***********

Unfortunately, the numbers given as results of google searches have become less meaningful over the last few years rather than improving in any sense relevant to us. The numbers google gives in response to a query are not counts of the number of pages with the given string. Rather, they are estimates based on a formula that, so far as I know, is not public. For simple searches, the estimate is presumably based on a calculation of the probability of the page having all the search terms based on the number of pages in the google caches for each of the component terms. But once you start doing string searches, this sort of approach becomes very unreliable.

I assume that the oddity of the result for "biggest * of any of the other" occurs because Google doesn't have any smart way to calculate the likelihood of strings for which the number of responses appears too large to simply count them. That is, I guess the algorithm works by first putting some bounds on the likely number of hits based on e.g. how rapidly various google network nodes appear to be sending responses, and if that number is sufficiently small, then google uses some fairly accurate algorithm for estimating the total, like counting every single response. But if there appear to be loads of responses, then the algorithm makes an estimate based on, well, who knows what. In the case at hand (and similarly for "smallest * of any of the other", "largest * of any of the other"), the estimate assumes some distributional properties that just don't hold for semantically or syntactically anomalous strings. Then, as you start going through the hits, Google is forced to self-correct as soon as you force it to actually enumerate all the results.

Hmm. So, if I'm right, then Barbara has stumbled on a rather interesting test for grammatical anomaly (though only relative to Google's bizarre assumptions about normality). Let's try another case: "* who thinks that is happy". This one has a pretty damn ordinary set of words in it, but suffers from an unfortunate case of a missing subject. Here Google initially estimates 10,900 results. But then it rapidly revises down to 16. Try a that-trace violation: "who * said that is happy", first 1,130,000, then 22, none of which are actually that-trace violations (and all of which are only produced because Google's interpretation of the * operator is insane, a point we've made on Languagelog before). Similarly, "* give a man a man a *", which has too many arguments for "give", starts off at 4,260, but then drops to 54.

So. Umm. There you go.

David

Posted by Barbara Partee at 04:19 AM

September 17, 2007

Survived by his trainer

In case you missed them, there were a couple of updates added to my report of the death of Alex the African Grey parrot. I also have a couple more comments of my own, so I'm consolidating all these into this post.

First, the updates: a Language Log reader wrote to inform us about an interesting discrepancy between Alex's final words as reported in the NYT story that I cited and this Boston Globe story:

NYT: Even up through last week, Alex was working with Dr. Pepperberg on compound words and hard-to-pronounce words. As she put him into his cage for the night last Thursday, Alex looked at her and said: "You be good, see you tomorrow. I love you."

Boston Globe: Pepperberg said she and Alex went through their good-night routine, in which she told him it was time to go in the cage and said: "You be good. I love you. I’ll see you tomorrow." To which Alex said, "You'll be in tomorrow."

I think it would be no more or less impressive for Alex to have predicted Dr. Pepperberg's return in the morning than for him to have expressed his loving feelings for her. But would the prediction-of-return have generated anything like the following editorial musings on the expression-of-love? (Pointed out by the same LL reader.)

These are bottomless questions, of course. For us, language is everything because we know ourselves in it. Alex's final words were: "I love you."
There is no doubt that Alex had a keen awareness of the situations in which that sentence is appropriate -- that is, at the end of a message at the end of the day. But to say whether Alex loved the human who taught him, we'd have to know if he had a separate conceptual grasp of what love is, which is different from understanding the context in which the word occurs. By any performative standard -- knowing how to use the word properly -- Alex loved Dr. Pepperberg.

Now my own updates. The NYT story I originally cited was "Alex, a Parrot Who Had a Way With Words, Dies" (by Benedict Carey, published Sept. 10). As often happens, this story was lightly edited and republished the next day with a new title significantly referencing the expression-of-love: "Brainy Parrot Dies, Emotive to the End". Among the edits are these:

"demonstrated off", which I figured must have been an editing error in my original post, has been changed to "demonstrated". (In the same sentence, "programs on the BBC and PBS" has been changed to "programs on PBS and the BBC" for some reason -- America first?)
A sentence originally ending "... scientists had little expectation that any bird could learn to communicate with humans" now ends "... scientists had little expectation that any bird could learn to communicate with humans, as opposed to just mimicking words and sounds".

In yesterday's Week in Review, George Johnson has a piece provocatively titled "Alex Wanted a Cracker, but Did He Want One?" There you can see a YouTube video showing off (or demonstrating off, if you prefer) some of Alex's purported skills with language and numbers. (Johnson also correctly identifies the PBS show hosted by Alan Alda that Alex appeared on: it's "Scientific American Frontiers", not "Look Who's Talking" as Carey had reported.)

As of this writing, the NYT sidebar indicates that Johnson's is the 8th most e-mailed NYT piece -- but note that the italics are missing from the title, unfortunately disguising what I find to be the most interesting aspect of the piece.

This most interesting aspect is the nature of consciousness, and whether Dr. Pepperberg's research with Alex (and other African Grey parrots) has anything to tell us about it. This is touched on in the two paragraphs that sandwich a longer passage about Alex's purported understanding of the complex mathematical concept of "zero":

[Dr. Pepperberg] is quick to concede the impossibility of proving that the bird was actually verbalizing its internal deliberations. Only Alex knew for sure.

[...]

In a well-known essay, "What Is it Like to Be a Bat?" the philosopher Thomas Nagel speculated about the elusiveness of subjectivity. What was it like to be Alex that last night in his cage? We'll never know whether there really was a mind in there -- slogging its way from the absence of a cork-nut to the absence of Alex, grasping at the zeroness of death.

Nagel's article is well worth reading. The original is available from JSTOR, but for those without JSTOR access there are several alternatives -- see this or this, the short summary and discussion here, or if you prefer PDF to HTML, try this. I happen to have read the article myself this summer as reprinted in Hofstadter & Dennett's The Mind's I: Fantasies and Reflections on Self & Soul (which I also highly recommend to anyone interested in the nature of consciousness).

A final comment. At Dr. Pepperberg's Alex Foundation website, the news page has the following announcement:

Your kind attention please.

We have had a many orders placed recently and it may be a few days to get everything back up to speed. Some of the shipping costs are incorrect we were working on the issue, before the events of Sept 7th.

What struck me here was not the mistake in the last sentence ("...shipping costs are incorrect we were working..."), no doubt brought about by a foundation in chaos after the death of its primary subject. (I've been taken to task for pointing out these kinds of mistakes before.) I'm more interested in the very last noun phrase: "the events of Sept 7th". This is a reference to Alex's death and the ensuing media attention, of course: Alex was found dead on Friday the 7th (though note that the foundation's official press release notes the date of death as Thursday the 6th). When I read or hear the phrasing "the events of [month date]", I invariably think of the terrorist attacks of September 11, 2001 -- and so of course it doesn't help here that the month is September.

In a way, Google agrees with me. Searching for the strings {"the events of month"} returns many, many more hits when month = sept(ember) than for any other month -- in fact, about 3.5 times all other months put together. And not surprisingly, paging through the sept(ember) results reveals almost nothing but references to the terrorist attacks.

the events of ...	jan(uary)	feb(ruary)	mar(ch)	apr(il)	may	jun(e)
ghits	39,500	31,930	45,070	57,604	85,400	51,108
the events of ...	jul(y)	aug(ust)	sept(ember)	oct(ober)	nov(ember)	dec(ember)
ghits	56,302	55,130	1,985,000	55,070	42,202	46,970

Interestingly, the other months all fall in a rather tight range, with the mean and the median both around 51K ... interesting, but I've already invested too much time today (indeed, this month) writing about a dead parrot.
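For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the two claims from the counts in the table above. The dictionary keys and variable names are just illustrative labels, and the counts are only Google's estimates as reported in the table, so this checks the sums and nothing more.

```python
from statistics import mean, median

# Google hit counts for "the events of <month>", copied from the table above.
ghits = {
    "jan": 39_500, "feb": 31_930, "mar": 45_070, "apr": 57_604,
    "may": 85_400, "jun": 51_108, "jul": 56_302, "aug": 55_130,
    "sept": 1_985_000, "oct": 55_070, "nov": 42_202, "dec": 46_970,
}

# Counts for every month except September.
others = [count for month, count in ghits.items() if month != "sept"]

# September's count relative to all the other months put together.
ratio = ghits["sept"] / sum(others)

print(f"sept / all other months: {ratio:.2f}")
print(f"mean of other months:    {mean(others):,.1f}")
print(f"median of other months:  {median(others):,}")
```

The ratio comes out at just over 3.5, and the mean and median of the other eleven months are roughly 51.5K and 51.1K, which matches the figures quoted above.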


Posted by Eric Bakovic at 03:27 PM

(an)arthrous abbreviations

The Economist Style Guide (2005), p. 7, advises us:

the : not needed before pronounceable abbreviations like NATO, UNESCO

Rachel Cristy unearthed this in a search through usage manuals for instances of Omit Needless Words (ONW) and Include All Necessary Words (IANW) advice. This one looks like an ONW case, but my first reaction to it was that no advice was necessary: things like "The NATO is an international organization" seemed to me to be just ungrammatical, and unlikely to occur with any frequency; for me, acronyms like NATO are obligatorily anarthrous (an-arthr-ous, lacking an article). [Reminder: acronyms and initialisms are both abbreviations made up of initial letters of words in some expression. But an acronym is pronounced like an ordinary word, while an initialism is pronounced as a sequence of letter names. The Economist's advice is about acronyms.] But, yes, there's variation out there. As Geoff Pullum noted here recently in connection with another set of proper names, there are some generalizations about arthrousness, but also many exceptions, and there is variation from speaker to speaker (and, in fact, for a single speaker on different occasions).

[Terminological note: following the Cambridge Grammar of the English Language, Geoff uses the technical terms weak and strong rather than my arthrous and anarthrous, respectively. The intended image is that arthrous proper names can't stand on their own; they're weak and need an article, while anarthrous names require no such support. Unfortunately, I can see a rationale for using the terms weak and strong in exactly the reverse fashion: arthrous names come with an article and so have strength "built in", while anarthrous names are weak because they're missing an element. So rather than trying to remember which metaphor CGEL had in mind, I've opted for technical terms that, it seems to me, can't be confused. I'm also just fond of these terms.]

First, it's not hard to find examples where the definite article in a full proper name (like the North Atlantic Treaty Organization) is preserved in the corresponding acronym; very often a writer goes back and forth between the arthrous and anarthrous variants, as on this website, which begins:

Why is NATO wrong?

The NATO is treated as beyond moral judgment: for most politicians in Europe, it simply exists, like gravity. For them, the only issues are: who should join, and where should it intervene? Nevertheless, the NATO has no moral basis: its existence and its fundamental purpose are wrong - let alone its interventions

Perhaps the writer intended the arthrous variants to be read out in full and the anarthrous ones to be pronounced as single words, but that seems unlikely to be true of all the cases you can find.

Be that as it may, FOR ME the following principle is (I think) exceptionless:

The Acronym Principle: Acronyms are anarthrous (even when the full names they abbreviate are arthrous).

This covers NASA, FEMA, MOMA, Unicef, NOAA and other acronyms whose full forms are arthrous. It covers at least some hybrid abbreviations, like SFMOMA (part initialism, part acronym), and covers in general "coerced" acronyms, where vowels are inserted to make strings of letters (especially long strings of letters) pronounceable, like NOGLSTP, pronounced like "nogglestup" and standing for "The National Organization of Gay and Lesbian Scientists and Technical Professionals" (yes, I know, not a catchy name, but at least it's full of information).

[Style note: throughout this posting, I'll cite abbreviations without periods in them. I understand that most style sheets call for periods in some of them, but my personal preference is for this very spare style.]

On to initialisms (abbreviations that are read as sequences of letter names). Here the large generalization is just the opposite of the Acronym Principle:

The Initialism Principle: In general, initialisms are arthrous if their full forms are (and, of course, anarthrous otherwise).

Hang on: there are plenty of exceptions, but this is the overarching generalization.

Some examples: the FBI, the CIA, the NSA, the GAO, the SHC (the Stanford Humanities Center), the EU, the LSA, the ADS, the AAUP, the AARP, the NAACP, the NSF, the NIH, the NEH, the NEA (the National Endowment for the Arts, the National Education Association). I've given a fair number of examples to convince you that there's a real phenomenon here, hoping that you can multiply the examples with others of your own. (For the Acronym Principle, this is no problem.)

A digression. Before I go on, I want to confront an idea that some people have advanced to me about the facts so far: that the anarthrous abbreviations lack an article because they're seen as holistic proper names (like John Smith), while the arthrous abbreviations are seen AS abbreviations, and so preserve aspects of the corresponding full forms. (Of course, the expressions we're looking at are all both proper names and abbreviations.)

I'm inclined to think that the pure-proper-name idea is an illusion created by the forms: no the = pure proper name, the = mere abbreviation.

To put the question in a larger context, let's look at some expressions that aren't abbreviations. Note the uniform arthrousness of proper names with the head common noun river (the Mississippi River, the River Nile) vs. the uniform anarthrousness of proper names with the head common noun lake (Searsville Lake, Lake Washington). Similarly, arthrous building (the Brill Building) vs. anarthrous hall (Carnegie Hall), though there is some American/British variation in the second case (note: the Royal Albert Hall).

There's a real system here (though one exquisitely dependent on the particular common noun that serves as head in the proper-noun constructions). I can't see any non-circular way of viewing this as a matter of conceptualizing things in different ways; it's just convention.

In fact, it makes sense for both versions to be possible. Proper names have (contextually) unique reference, and uniqueness is one of the two circumstances in which referring expressions are semantically definite. [Further digression: givenness (in context) is the other, and scholars differ as to whether there are two kinds of definiteness here, or only one (and if only one, whether one of the circumstances is fundamental, or whether they are both manifestations of a single more general meaning), and as to whether languages (or varieties) can differ as to the status of the two circumstances. But in the case at hand, uniqueness is what's at issue, and we can put off these deeper questions.]

So much for SEMANTIC definiteness. What we're looking at now is the question of how definiteness is marked syntactically and morphologically. There are two schemes available, and each of them has a rationale:

Economy: If the referent is unique in context, use no syntactic or morphological mark of definiteness, because it's unnecessary. (Omit Needless Words!)

Clarity: If the referent is unique in context, use a syntactic or morphological mark of definiteness to indicate this fact.

(These principles apply to languages in general. English has only a syntactic marker of definiteness, the article the, though other languages (including a number in the Indo-European language family, as well as many outside it) have affixes marking definiteness, either instead of or in addition to a syntactic mark, and syntactic marking other than via an article -- by word order, for instance -- is also possible.)

The competition between economy and clarity, as abstract principles, comes up all the time. See, for example, my discussion (in my posting on at about) of economical (implicit) vs. clear (explicit) marking of relations -- there with reference to bare NP adverbials vs. P-marked adverbials. Both principles are valid, but they can't be satisfied simultaneously; instead, the competition is negotiated through a system of conventions for specific cases, with one principle holding sway in some cases, the other in others.

So it is with proper names in English. For personal names, English almost entirely opts for economy: Arnold (Zwicky), not the Arnold (Zwicky); yes, I know about the Donald. (Other languages insist on clarity -- definite marking across the board -- or have definite marking for personal names in some contexts and lack it in others.) In other spheres, English is much more variable: there are conventions, of several different sorts, and exceptions to those, and variation, both within speakers and between speakers. Recall the river/lake and building/hall cases, consider the examples that Geoff Pullum gave in his posting --

There are some generalizations, but also many exceptions. Cities, boroughs, and regions are usually strong (like Amsterdam or New York or North Africa or Antarctica) but a few are weak (like the Hague or the Bronx or the Maghreb or the Antarctic). And remarkably, to a rough approximation at least, numerical freeway names are weak proper names in Southern California ("Get on the 55") but strong proper names in Northern California ("Take 17 South").

and check out the somewhat longer treatment in CGEL (pp. 517-8); and be prepared for more variation in the material to come. (But bear in mind that these discussions are only samplings of the phenomena, not complete inventories. A full treatment of definite marking in English proper names, including a survey of the variation, would fill a book.)

Digression over. We're now ready to get back to initialisms. For initialisms, English generally goes for clarity: the Initialism Principle.

But there are exceptions, and there's variation. Though it's almost invariably the BBC (and not just BBC), it's also, on this side of the Atlantic, almost invariably NBC, ABC, and CBS (not the NBC etc.). For me, it's mostly the OED, but I've occasionally written OED instead; meanwhile, AHD is, for me, almost always anarthrous (possibly because it's so often paired with NOAD, which is anarthrous because I read it as an acronym). You can also find anarthrous occurrences of some American government agency names (NSF, NIH, NEH, DOD), though these names are usually arthrous.

A striking GENERAL exception to the Initialism Principle is the naming of educational institutions:

The Educational Principle: In general, initialisms naming educational institutions are anarthrous.

So we get: MIT, OSU, UCLA, UCSD, RPI, etc. I say that I have a Ph.D. from the Massachusetts Institute of Technology, not Massachusetts Institute of Technology (the full name is arthrous), but I say that I have a Ph.D. from MIT, not the MIT (the initialism is anarthrous).

But, as usual, there is variation. The Educational Principle is pretty firm for me, but it's clear that it doesn't work for everybody:

The MIT is going to change its curriculum structure that was famous for teaching Scheme in introductory courses. (link)

As for the large educational institution in Columbus, Ohio, usually known familiarly as OSU, it famously tries to insist on a the in its full name (The Ohio State University), a quirk that is sometimes carried over to its initialistic version, as the OSU (or The OSU):

Department of Electrical Engineering, The Ohio State University ... More recently, the Center of Intelligent Transportation Research (CITR) at the OSU is ... (link)

NASA support draws upon the Byrd Polar Research Centre of the Ohio State University .... of the data acquisition plan while the OSU is responsible for, ... (link)

(You can find similar instances of the OSU referring to Oregon State. And some other unexpected arthrousness for other institutions.)

And, in fact, the (no doubt originally snarky) orthographic variant tOSU (or tosu or Tosu) -- an acronym pronounced /tosu/ -- has grown up to represent the arthrous variant:

Michael Floyd commits to tOSU (link)

Why is Ohio State referred to as tosu by some people, mostly Michigan fans and others who don't like Ohio St? (link)

Note that because the Educational Principle applies specifically to names of educational institutions, there are minimal contrasts in arthrousness for initialisms: when such initialisms stand for other things that have arthrous full names, they are almost uniformly arthrous:

the MIT for the metal/insulator transition, the Millvale Industrial Theater, the Management Improvement Team (of the USA Freedom Corps), etc.

the OSU for the Overseas Singaporean Unit, the Operation Support Unit (Denton County, Texas, Sheriff's Office), the Oxygen Servicing Unit (McNaughton Dynamics UK), etc.

Syntactic footnote. All of the preceding was about proper names standing on their own or serving as arguments; some of these names are normally anarthrous, some normally arthrous. But other syntactic constructions can impose their own requirements. As a result, it's easy to find instances of normally anarthrous names preceded by the, and also instances of normally arthrous names without preceding the, but these aren't relevant to the classification of proper names with respect to arthrousness.

So: acronyms (like NATO) are normally anarthrous, as are initialisms referring to educational institutions (like MIT). Abbreviated proper names can serve as prenominal modifiers -- NATO support 'support from/by NATO', MIT buildings 'buildings at/of MIT' -- and the resulting expressions can have preceding determiners, which means we can get things like the NATO support and the MIT buildings, which have the structure
[ the [ NAME HEAD ] ],
not the structure
[ [ the NAME ] HEAD ].
That is, they do not contain the arthrous proper names the NATO and the MIT; NATO and MIT are anarthrous here as elsewhere.

In fact, if the head noun is a proper name, the resulting expression is a (more complex) proper name, which may itself require a definite article: the NATO Secretary General, the MIT Media Lab. But once again, the article belongs to the outer layer of structure, and NATO and MIT are again anarthrous, despite the preceding the.

In the other direction, though initialisms are generally arthrous, an abbreviated proper name serving as a prenominal modifier is obligatorily bare, so we get things like A local startup has gotten CIA funding (with CIA funding 'funding by/from the CIA'), not A local startup has gotten the CIA funding. The condition on prenominal modifiers trumps the arthrousness of initialisms.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:35 PM

How big is your crockus?

Yesterday afternoon, Heidi W. wrote to say:

Thought you might be interested in knowing that Dan Hodgins is the brain-sex guru who's been making the rounds in Early Childhood Education (ECE). Earlier this year, my (very large urban) school district invited him to speak on, "What About Those Boys?", and I just learned that they asked him to come back this fall to speak about discipline. A Google search indicated that he's been speaking at a lot of regional and national ECE conferences.

Ms. W. provided a link to copies of Prof. Hodgins' handouts, from a site where he gave a talk in 2003, and observed that

It seems that, around here, I'm in the minority in feeling very alarmed by his claims, which are supposedly supported by brain research. Red flags went up for me when, very early in his presentation, he showed a drawing of the brain and claimed that, "Girls see the details of experiences... Boys see the whole but not the details".

On that drawing of the brain, the motor strip was highlighted, as well as a much smaller area just anterior to it, which he called the "crockus". Ostensibly, he was supporting the 'girls seeing details' assertion when he claimed, "The crockus is four times larger than in boys" (as if the highlighted motor strip was "the crockus" in girls). I've searched extensively but have been unable to find any information about an area of the brain called the crockus --are you familiar with this term?

I'm puzzled too. A "smaller area just anterior to the motor strip" that shares some sounds with "crockus" would be Broca's area, which is functionally implicated in speech -- but Heidi is "ABD in Ed Psyc, with a cognate in Neuropsyc", and so she's unlikely to have mis-heard "Broca's" as "crockus". [Update: she informs me that "the word crockus was spelled out as such in the handout that Hodgins gave to us, and I was quoting directly from that handout in my message to you. It was definitely not Broca's area to which he was referring."] In any case, I've never heard that Broca's area is larger in girls, much less four times larger. In fact, I don't think there are any human brain areas at all where a reputable claim for a factor-of-four size difference by sex has ever been made. (If you think you know what brain area Dan Hodgins meant by "crockus", please tell me...)

In the Mott Community College Focus, I found two articles by Hodgins ("Male and Female Differences" and "Classroom Climate that support Male and Female Differences"). They say nothing about the "crockus" or any other brain structures differing in size by a factor of four, but they do contain a number of specific factual claims about brain sex. For example, Hodgins tells us that

In most cases, female brains mature earlier than males. An example is in the myelination of the brain. One of the last steps in the brain’s growth to adulthood occurs as the nerves that spiral around the shaft of other nerves of the brain, like vines around a tree, are coated. This coating is myelin, which allows electrical impulses to travel down a nerve fast and efficiently. Myelination continues in all brains into the early twenties, but in young women it is complete earlier than in young men, almost twelve – eighteen months earlier

There's some suggestion of confusion about the nature of myelin here, in the talk about "the nerves that spiral around the shaft of other nerves", but never mind that. Do female brains really mature earlier than males, specifically with respect to myelination?

Hodgins doesn't cite any sources, but as far as I can tell, his assertion is contradicted by the published literature -- and this is an area where there's been a lot of recent publication, because MRI's ability to separate gray and white matter has made in vivo studies possible. In the first place, it seems that myelination continues in humans well beyond "the early twenties". According to Elizabeth R. Sowell et al., "Mapping cortical change across the human life span", Nature Neuroscience 6:309-315, 2003,

Post mortem studies have shown that myelination continues into a person's 30s and perhaps beyond, which could explain the increase in cerebral white matter observed into adulthood in the volumetric imaging studies. [...]

The increase in white matter volume peaked at age 43 and declined thereafter.

Specific information about human sex differences in the development of cerebral gray matter and white matter volume was presented in M.D. De Bellis et al., "Sex Differences in Brain Maturation during Childhood and Adolescence", Cerebral Cortex 11(6): 552-557, 2001. They studied 61 males and 57 females ranging in age from 6.9 to 17 years.

[F]indings from cross-sectional studies suggest that cerebral white matter (WM) volume and the area of the corpus callosum (CC), the main interhemispheric commissure, increase significantly from childhood through late adolescence. Recent results from longitudinal MRI studies of healthy children and adolescents have confirmed these age-related linear increases in cerebral WM and CC area. These observations may reflect in vivo evidence of age-related progressive events such as axonal growth and myelination. [...]

We investigated the relationship between age, sex and cerebral [gray matter] GM and WM volumes and CC area using high-resolution MRI volumetric analyses in a large community sample of healthy, age-matched and sociodemographically similar male and female children and adolescents. We specifically investigated age-related sex differences in human brain maturational processes (age-related changes in cerebral GM and WM volumes and CC area). [...]

Overall,

Males had larger intracranial and cerebral volumes than females by 11 and 12%, respectively. These effects remained after correction for height. Cerebral GM and WM volumes and CC areas did not differ between gender groups after adjustment for cerebral volumes.

What about sex differences in maturation rate, i.e. age-by-sex interactions?

The sex by age interaction term was significant for cerebral GM and WM volumes and CC area. The slopes of these changes significantly differed between male and female subjects. Thus girls showed significant developmental changes with age but at a slower rate than boys. Specifically, males had an ~19.1% reduction in GM volume between 6 and 18 years of age compared with a 4.7% reduction in females. On the other hand, males had a 45.1% increase in WM and a 58.5% increase in CC area compared with 17.1 and 27.4% increases, respectively, in females.

Here are the plots of the relevant age-by-sex results:

Figure 1. Scatterplots of cerebral volume (A), cerebral GM (B) and cerebral WM (C) volumes and CC areas (D) by age and sex in healthy male (n = 61) (solid lines, individual points = Y) and female (n = 57) (dashed lines, individual points = X) children and adolescents. Cerebral GM and WM volume and CC area means were adjusted for cerebral volume.

In other words, the age-related effects in brain development (as measured in this study) went in the same direction in 7-to-17-year-olds of both sexes, but at a faster rate in boys. (And as usual, the distributions for males and females are so highly overlapped that it seems irresponsible to draw any general conclusions about "classroom climates" for sex-specific education.)

I have no idea whether all this means anything about learning styles or appropriate educational practice; but it does suggest that Prof. Hodgins (who gives no references) is talking through his crockus.

Heidi W.'s note discussed various of Hodgins' other claims, such as the one about boys' brains going into "pause states", so that boys (as opposed to girls) need frequent "pause breakers" such as "spinning, shouting and jumping". I've come to the end of my breakfast hour, so most of that stuff will have to wait for another time. But I can't resist a quick note about another of Hodgins' brain-sex assertions, one that is more specifically related to speech and language:

Another structural difference, and perhaps the most striking, is the corpus callosum, the bundle of nerves that connects emotion and cognition. In females, it is up to 20% larger than in males, giving females better decision making and sensory processing skills. All learning must connect emotion and cognition. Because of this difference in size, females have better verbal abilities and rely heavily on verbal communication; males tend to rely heavily on nonverbal communication and are less likely to verbalize feelings. The current research suggests that sixty-seven per cent of males throughout their life are visual learners. This learning style has immense ramifications in our present culture, which relies so heavily on talk, conversation, words.

In the first place, what the corpus callosum connects is the right and left hemispheres of the cerebral cortex, not "emotion and cognition". In normal humans of both sexes, both the right and left sides of the brain are involved in both cognitive and emotional processing. As for the relative size of the corpus callosum in males and females, we just saw De Bellis et al. assert that in their sample, "CC areas did not differ between gender groups after adjustment for cerebral volumes". (Before adjustment for overall brain volume, they found an average male CC volume of 8.04 cc vs. 7.69 cc for the females in their study.)

A few weeks ago ("And why 'without sauce'?", 8/19/2007), I quoted from K.M. Bishop and D. Wahlsten, "Sex Differences in the Human Corpus Callosum: Myth or Reality?", Neuroscience & Biobehavioral Reviews, 21(5): 581-601, 1997:

It has been claimed that the human corpus callosum shows sex differences, and in particular that the splenium (the posterior portion) is larger in women than in men. Data collected before 1910 from cadavers indicate that, on average, males have larger brains than females and that the average size of their corpus callosum is larger. A meta-analysis of 49 studies published since 1980 reveals no significant sex difference in the size or shape of the splenium of the corpus callosum, whether or not an appropriate adjustment is made for brain size using analysis of covariance or linear regression. It is argued that a simple ratio of corpus callosum size to whole brain size is not an appropriate way to analyse the data and can create a false impression of a sex difference in the corpus callosum. The recent studies, most of which used magnetic resonance imaging (MRI), confirm the earlier findings of larger average brain size and overall corpus callosum size for males. The widespread belief that women have a larger splenium than men and consequently think differently is untenable. Causes of and means to avoid such a false impression in future research are discussed.

Heidi ended her note this way:

So the ECE folks here (of which I'm one) saw this dog and pony show by Hodgins and they just ate it up (I had to walk out). I found your Language Log when I was trying to locate info about [some other research that Hodgins referred to]. Any thoughts?

I think it's disturbing that such easily-refuted nonsense is so easily and widely accepted. We've got serious educational problems to solve, and sex differences -- whether genetic or cultural -- are clearly part of the picture. But surely no one's interests are served by promoting solutions on the basis of pseudo-scientific bafflegab.

Those males (and females) among us whose "brains go into pause after too much talking", as Prof. Hodgins puts it, might feel the need for a comic-strip break at this point. This morning's Dilbert will fit right in:

Prof. Hodgins' work certainly renews my appreciation for science. In fact, it makes me want to jump, spin and shout!

[Note: Besides decoding that "crockus", here's another puzzle that Language Log readers may be able to help me with. In Prof. Hodgins' article on "Male and Female Differences" he writes:

This information is an accumulation of research, study and observation over the last ten years that I had the opportunity to conduct with Dr. Gerison and Dr. Stake from the University of Southern California.

I couldn't find anyone named "Gerison" or "Stake" in the current USC online directory. Nor could I turn up any plausible "Gerison" in a Google search (and "stake" occurs too often as an ordinary noun to be a useful search term, and {author:gerison author:stake} comes up empty in Google Scholar, and I'm out of time and here, in that order). If you think you know who these researchers are, please tell me. ]

[Update -- readers have provided many suggestions and clues. They range from plausible if unsatisfactory (e.g. there is a Dr. Jayne E. Stake who works on "issues related to the self-concept and empowerment of women", but she's at the University of Missouri, not USC, and she hasn't done neuroscience research) to, well, less serious ("Gerison Stake" is an anagram for "searing tokes"). So far, Drs. Gerison and Stake remain as mysterious as the elusive Crockus.]

Posted by Mark Liberman at 06:53 AM

Snowclone collectors, call your offices

Eagle-eyed reader Josh Kamensky points out a phrasal formula that has popped up in the headings of three separate Language Log posts:

"Fenimore Cooper, call your office" (Geoff Nunberg, Oct. 7, 2003)
"Vaslav Tchitcherine, call your office" (Mark Liberman, Oct. 19, 2006)
"George M. Cohan, call your office" (Geoff Nunberg, July 5, 2006)

Geoff clearly enjoys this trope, having sent similar requests in other contexts to Deborah Tannen and William Safire. (He even exhorted me to call my office in a May 25, 2007 post, though sad to say I never did.) Having spotted this pattern, Kamensky wonders if this is properly categorized as a snowclone (or perhaps merely a catchphrase or cliché), and he also inquires after the source of the expression. I can take a stab at the second question, which may help in answering the first question.

Young whippersnappers may not remember an age before beepers, pagers, or cellphones. (Of course, in the march of technology, beepers and pagers are now almost forgotten.) There once was a time when a doctor or other professional could only be paged in a public place by means of an announcement over a public address system. The paged person would then presumably scurry to the nearest pay phone (remember those?). Sporting events were prime locales for these announcements. Take this moment from Nelson Algren's 1942 novel Never Come Morning, taking place between rounds in a Chicago boxing match featuring the protagonist Bruno "Lefty Biceps" Bicek:

The announcer climbed over the ropes and spoke into the amplifier.
"Is Dr. Morris Pechter in the house? Dr. Pechter, please call your office. Is Dr. Pechter here?"
"Is that fer him 'r me, Case?" Bruno wondered.
"Not fer neither. You ain't hurt. Fer some croaker in the house is all."

Baseball games were another typical venue for the "call your office" announcement. Here are mentions in reports on World Series games in 1945 and 1957:

Just when the Detroit Tigers belted Chicago's Hank Borowy from the mound, the public address system blared: "Coroner A.L. Brodie, call your office." (Washington Post, Oct. 8, 1945, p. 11)

There's no free publicity for doctors here at the stadium [in Milwaukee]. If they are expecting calls, they are issued numbers, and are paged over the loud-speakers as, "Dr. 123, please call your office." (Los Angeles Times, Oct. 7, 1957, p. 6)

It's hard to know when such announcements became common, since they only get referred to in newspaper articles when they're somehow noteworthy. With the advent of large-venue PA systems (and coin-operated public telephones) around the country, it's probably safe to say that the "call your office" line could be heard on a regular basis by 1930. That was the year when Joseph Crater, a New York Supreme Court judge, mysteriously disappeared after entering a Manhattan taxi. The case attracted an enormous amount of attention, but Crater was never found. A 1979 Washington Post article recounts the effect of Crater's disappearance on the national consciousness:

Within mere months of his disappearance he had become part of the national folklore, the subject of scavenger hunts and night club routines — "Judge Crater, call your office." The phrase "to pull a Crater" entered the idiom. (Washington Post, Aug. 5, 1979 — via Barry Popik)

Other secondary sources back up the popularity of this punchline in the months and years following Crater's disappearance (Arthur S. Koykka's Project Remember says "vaudeville routines of the 1930s often included the line"). Unfortunately, I haven't been able to find any contemporaneous accounts. The line first starts popping up in the newspaper databases quite suddenly in 1966, when it was revived as a popular bit of graffiti. Syndicated columnists picked up on it right away:

And this inscription in the gentlemen's room of a lower Manhattan saloon ain't bad either: "Judge Crater — Call Your Office." (Column by Norton Mockridge, Delaware County [Penn.] Daily Times, Sep. 21, 1966, p. 7)

A few days ago, some wag among the many state legislative leaders assembled in conference at a Washington hotel affixed the following note to the lobby bulletin board: "Judge Crater ... Please call your office." ("NEA Washington Notebook," Hagerstown [Md.] Daily Mail, Nov. 30, 1966, p. 6)

Kids no longer chalk up the same old words on walls and subway posters. That's considered very, very old hat these days — and decidedly unsophisticated. Instead, you can find startling inscriptions like the following: ... Judge Crater, please phone your office. ... The handwriting, concludes Warren Boroson, who copied down the above graffiti, is obviously on the wall. (Bennett Cerf's "Try And Stop Me," Playground Daily News, Fort Walton Beach, Fla., Dec. 22, 1966, p. 8)

The American Psychiatric association heard a learned paper based on scribblings on the walls of bars, wash rooms, bus stations, etc. Examples: "Judge Crater — Please Call Your Office Immediately," [etc.] (Column by George Morgenstern, Chicago Tribune, Dec. 31, 1966, p. 1)

It's a bit surprising that this would emerge as a widespread graffito 36 years after Judge Crater disappeared, but he was apparently still remembered as "the most missingest man in America." (He would eventually be eclipsed by D.B. Cooper in 1971 and Jimmy Hoffa in 1975.) In any case, the "X, call your office" line was subsequently applied to various public figures whose whereabouts were unknown, at least temporarily. For instance, it came in handy when Chicago Alderman Fred Hubbard disappeared in May 1971 with $100,000 in federal funds (he was later tracked down and convicted of embezzlement):

A restaurant on Touhy Avenue in Skokie sports this sign: Ald. Fred Hubbard, Call Your Office! (Chicago Tribune, June 18, 1971, p. 18)

Ald. Hubbard: Call your office! (Chicago Defender, Feb. 29, 1972, p. 4, headline)

And when Wilt Chamberlain was brought on as a coach for the San Diego Conquistadors of the American Basketball Association but then didn't show up for two games, the Mar. 1, 1974 Los Angeles Times headline predictably read, "Wilt, Call Your Office."

With the expression firmly entrenched, "X, call your office" began to be extended to other humorous uses, not simply as a joke about someone "pulling a Crater." X could be a literary or historical figure whose presence was rhetorically requested by the writer. (Similar devices include "Paging Dr. Freud...") Conservative columnist George Will took advantage of this trope rather too often. A search on his columns (in the Washington Post and Newsweek, and syndicated elsewhere) finds an enormous number of examples, beginning with "George Orwell, call your office" in a Mar. 8, 1976 column. Here is a list of the other figures that Will called upon from the late '70s to the late '80s:

Henry Adams (June 10, 1979), Pericles (Oct. 21, 1982), Thomas Jefferson (Jan. 2, 1984), Ring Lardner (Oct. 9, 1984 and again on Apr. 14, 1986), Peter Pan (Feb. 3, 1986), Karl Marx (Dec. 28, 1986), Stephen Gould (June 25, 1987), Abraham Lincoln (Jan. 4, 1988), Jesse Jackson (Apr. 17, 1988), Edward Gibbon (May 9, 1988), Jefferson and Madison (Dec. 15, 1988), Immanuel Kant (Feb. 27, 1989), and Cotton Mather (May 8, 1989). (Whew!)

Finally, someone took Will to task for his repetitiveness. Reviewing a collection of his columns in the Washington Times (Nov. 5, 1990), Florence King wrote:

Otherwise, in an age that is seeing the certain collapse of English, the miracle is that George Will has only one stylistic tic: "E.T., phone home, your accountant says you have sold 15 million videocassettes." "Immanuel Kant, call your office. Washington cannot keep track of all its categorical imperatives." George Will, phone home. The Graffiti Protection Society has surrounded your house.

That review must have chastened Will to some extent, since he put the kibosh on the "call your office" line for several years. He returned to his old favorite "Orwell, call your office" as the headline for a Newsweek column on Feb. 3, 1997 and since then has used the formula quite sparingly ("Freud, call your office," Aug. 10, 2003; "Inspector Clouseau, call your office," Dec. 18, 2006).

So, is "X, call your office" properly considered a snowclone? When the question was brought up at the Language Log water cooler, Adam Albright chimed in:

I have a feeling that it doesn't really count as a snowclone in the sense of "generalization of a specific expression as it enters popular parlance", since it was truly a very common functional formula that has gotten stranded in restricted contexts nowadays.

I personally take a more latitudinarian approach to snowclones. First, I don't think we always need to pinpoint a "specific expression" that is the source of a generalizable template. For instance, the standup comedian Richard Lewis takes credit for popularizing the snowclone "the X from hell" in the '80s, even though we can't trace a single baptismal use of this expression. And in the case of "X, call your office," we actually do have an originating source for the current cliché in "Judge Crater, call your office," from which all subsequent jocular and rhetorical usage apparently flows. The sturdiness of the Crater joke outlasted the original functionality of the expression and has displayed true snowcloniness by spawning countless imitators.

Of course, the template can now be used by younger generations of writers with no firm memory of Crater or the public paging routines that provided the cultural context for the punchline/graffito. As we've seen before, successful snowclones very often take on a life of their own, detached and decontextualized from whatever provenance we might be able to document. But uncovering the hidden histories of such snowclones (now made possible by digitized databases of books, magazines, and newspapers) can certainly be fascinating in its own right, even if the quarry remains as slippery and elusive as Judge Crater, the most missingest man.

Posted by Benjamin Zimmer at 01:03 AM

September 16, 2007

W-type pronouns, dragon heartstring and speech act theory

A draft manuscript recently appeared in my inbox. I was so struck by its potential implications that I requested permission to circulate it to a wider audience--that is, you, dear Language Log readers--and I am pleased to report that the author agreed! You will therefore get to see this groundbreaking research in its early stages, years before the rest of the world reads a mangled three-paragraph description of it in the BBC Science (or possibly Arts) feed. Just one more of the many perks of your yearly subscription to LL.

In this ms--How To Do Things With Words And Wands: The Pragmatics Of Casting Spells-- author Molly Diesing investigates the deictic devices and speech-act properties of successful spellcasting, based on the corpus of spells and descriptions of spellcasting events which has recently become available through the efforts of J.K. Rowling.

The research deals with the syntactic and semantic conditions on the expression of spell targets, including cases of explicit mention, deictic wand pointing, noun incorporation, and complete object drop. It also considers whether spells themselves are imperatives or performatives, and, if the latter, what happens when you violate their felicity conditions.

Dr. Diesing will develop this investigation in collaboration with Sally McConnell-Ginet, and their results will certainly be of interest to the entire Wizarding and linguistics world. I predict that many future Ravenclaw term papers will take their work as a leaping-off point. Wuggles (non-linguists) beware, however! This is heady territory. Take along a linguist companion, or at least a good encyclopedia article on pronominal reference and another on speech act theory. LL cannot be held responsible for the consequences if you don't!

Comments?

Posted by Heidi Harley at 08:56 PM

A new technique in language teaching

Mark's receipt today of an invitation to Colorectal Congress 2007 reminded me that I received a nice conference invitation myself recently, somewhat more relevant than his. It turns out that a new approach to second language learning has been developed, and if I could make it over to Turkey, I could find out all about the new method way before it takes the linguistic world by storm. Here's the invitation:

Dear David Beaver

Your presence shall only be a true honour to experience, may you think yourself kindly invited to the program at the [hotel name redacted], dated [date redacted], by [name redacted], the founder of [organization name redacted], as to how to get into language usage through its own parameters; a claim of a new technique in language teaching.

Well, I for one would not dispute the claim of originality. So well does the method work that there is no need whatsoever to have your language proofread by a native speaker.

Unfortunately, the invitation was sent to the wrong address, and I only received it some 6 months after the conference had taken place, so I'll never know how to get into language usage through its own parameters. Hrumph. Most unfair. No chance at all for a weekend of Istanbul. But maybe the Colorectal Congress still has some openings?

Posted by David Beaver at 05:54 PM

An Act of Decency in Time of War

I've been reading G.P.V. and Helen B. Akrigg's British Columbia Chronicle: 1778-1846. It isn't linguistics, but I encountered something I think worthy of note. Discussing the end of Captain George Vancouver's voyage of exploration in 1795, they note:

Entering the Atlantic, he had a special anxiety, fear of being intercepted by one of the French warships on the prowl for British ships. H.M.S. Discovery, lightly armed as a survey ship rather than a man-of-war, and worn out by four and a half years of voyaging, could not hope to defend herself successfully against a French cruiser. Vancouver's great voyage could end all too possibly with him and his crews rotting in a French prison. At St. Helena he received very welcome news: the French National Assembly, recognizing the contribution to human knowledge rendered by the Vancouver expedition, had instructed its warships not to molest the Discovery and the Chatham on their homeward journey. [pp. 99-100]

Posted by Bill Poser at 03:06 PM

Abject lesson

I don't think that a reader named Stephen was making fun of me when he wrote

I really love your blog ... it has given me more things to gripe about.

But I'm not sure. Stephen prolonged the suspense by explaining

I really hate it when people say "I would of ..." and the like. I, tend to read sentences with inappropriately placed commas with the commas pronounced out loud - that, is my biggest gripe.

And then he really made me wonder by asking this:

I'd like your opinion on the concept of "object lessons" - my understanding is that the more correct form is "abject", but "object" is used more commonly - and is now acceptable - because "abject" is a word not known by those who don't read, i.e. the majority.

As incorrections go, this one is pretty far out there. The OED has

object lesson n. (a) (now chiefly hist.) a lesson in which a pupil's examination of a material object forms the basis for instruction; (b) fig. a striking practical example of a principle or ideal.

with citations back to 1831.

Searching Literature Online, I find a number of hits for "object lesson". Thus George Eliot wrote in Middlemarch:

He found the family group, dogs and cats included, under the great apple-tree in the orchard. It was a festival with Mrs Garth, for her eldest son, Christy, her peculiar joy and pride, had come home for a short holiday---Christy, who held it the most desirable thing in the world to be a tutor, to study all literatures and be a regenerate Porson, and who was an incorporate criticism on poor Fred, a sort of object-lesson given to him by the educational mother.

And Thomas Hardy wrote in Jude the Obscure:

Then they tried to laugh, and went on debating in whispers the object-lesson before them.

The string "abject lesson" doesn't occur in the OED, nor does it occur in Literature Online, nor do I recall ever having seen it before. However, it gets 10,400 Google hits, and UsingEnglish.com says that in Indian English

An abject lesson serves as a warning to others. (In some varieties of English 'object lesson' is used.)

Searching Google News for {"abject lesson"} this morning turns up four hits. One is from India, two are from South Africa -- but one is in a piece by Alexander Laman, "The most commercial of music festivals", The New Statesman, 8/21/2007:

Iggy Pop jumping around, throwing himself into the crowd and causing havoc on stage is now de rigeur for a performance, but the intensity and verve with which Tim Booth, managing to look sinister even while clad in a skirt, fronted a reformed James was an abject lesson in not every band reunion coming over as a cynical cash-in.

The string {"object lesson"} gets 124 Google News hits, supporting my belief that "abject lesson" is an eggcorn (though perhaps one that is becoming dominant in some regional Englishes). I'm less sure about whether Stephen was seriously hoping to be able to gripe about people who haven't converted this idiom yet from object to abject.

In fact, this morning's mailbag is full of mysteries. For example, was someone trying to send me a message by putting me on the mailing list for Colorectal Congress 2007 ("It is an honour for us to welcome the worldwide opinion leaders in Rectal Cancer to St. Gallen")?

[Update -- several readers have written in to tell me that they too are abject lessoneers. Thus Emily Lilly:

I am an avid reader of Language Log (which I always refer to as "the Language Log" in conversation--somehow, I'm not sure that a namer can just declare a proper noun as strong versus weak, but that's not why I'm writing). My husband and I browse through it every day--we met in the linguistics graduate department at UMD-College Park.
Anyway, I am 32, and was raised in variety of places (first Illinois, then NJ, then a 2.5-year stint in the Netherlands, back to NJ, then college in MA). I have always said and written the phrase as "abject lesson". I'm not sure where or when I picked it up (probably not in Illinois--I left at the age of 7!), but there it is. Perhaps a TV or comics-section character used it during my childhood, or some other such media vehicle that was aimed more at my age group than at the previous generations. It would be worth looking into the age breakdown of the folks who say "object" vs. the folks who say "abject".
I was quite surprised to find that I was using the wrong word! I'm usually pretty good at getting these sorts of things right. (Okay, so I'm totally anal retentive about this sort of thing, which is why my dad, who has just written his first book, is going to give me the last draft so that I can edit for grammar, spelling and punctuation. Other family members with other talents and interests are getting it ahead of me...)

]

Posted by Mark Liberman at 12:10 PM

Notes from the ESL trauma unit

There's a great story making the rounds. According to Joe Fay ("Czech falls off motorbike, wakes up with British accent", The Register, 9/14/2007):

A Czech speedway racer discovered his inner British toff after another rider ran over his head.

Non-English speaker Matej Kus, 18, took the spill during a race in the UK. Paramedics were stunned when he came round and asked where he was – in perfect English. It soon became apparent that Kus had lost his memory, forgetting he was a Czech bike racer, and presumably thinking he was an accent coach at the BBC.

Team manager Peter Waite told Ananova that Kus sounded “like a newsreader”. The biker's foray into the world of received pronunciation was shortlived, however. As soon as his memory returned, two days later, his command of English evaporated.

Speaking through an interpreter, Kus said: "There must be some English deep in my head…Hopefully I can pick some English up so I'll be able to speak it without someone having to hit me over the head." [...]

Doctors put the baffling linguistic transformation down to Foreign Accent Syndrome, a rare condition where a blow to the head – or a stroke – damages the parts of the brain that control speech.

In 2004 a Bristol woman woke up speaking French and thinking she was living in Paris. She was subsequently diagnosed with Susac’s syndrome. As she explained to the Daily Mail last year, "It might sound funny to others, but suddenly thinking you are French is terrifying."

The headline writers have had a lot of fun with this one -- The Register's sub-head was "Bohemian rhapsodises in cut-glass English". Ananova's story ran as "Czech out the new lingo".

Matej repeated for the Daily Mail the standard folk-scientific story about this sort of thing:

After flying home to the Czech Republic to recover, he said - through an interpreter - that he remembered nothing of the accident or of the following two days.

Yesterday he added: "It's unbelievable that I was speaking English like that, especially without an accent. "Hopefully I can pick English up over the winter for the start of next season so I'll be able to speak it without someone having to hit me over the head first.

"There must be plenty of the English language in my subconscious so hopefully I'll be able to pick it up quickly next time."

I'll keep an open mind about this -- perhaps during the two days that Matej allegedly spoke perfect BBC-newsreader English, someone thought to tape some of it. But pending some evidence more credible than reported interview statements by the team manager, I agree with the doctors who put this down to Foreign Accent Syndrome, in which a head injury causes someone to talk funny, and casual observers dream up various specific but fanciful interpretations. The Wikipedia article notes an FAS case that was documented on the air in 2006 (by the BBC):

Another case of foreign accent syndrome occurred to Linda Walker, a 60 year old woman from the Newcastle area. After a stroke, her normal Geordie accent was transformed and has been variously described as resembling a Jamaican, as well as a French Canadian, and a Slovak accent. She was interviewed by BBC News 24 [1] and appeared on the Richard and Judy show in the UK in July 2006 to speak of her ordeal.

I'll also note that this is a case of life imitating the funny papers -- see "Memetic mutation and traumatic release", 8/30/2007.

This traumatic xenoglossy stuff is also more peripherally connected to the phenomenon of "speaking in tongues", and to the cases of alleged recovery of past-life languages through hypnotic regression (see S. Thomason, "Do you remember your previous life's language in your present incarnation", American Speech, 59:340-350, 1984).

[So far, I haven't found any links to careful studies of cases where a head injury is said to have created the ability to speak a language that the victim did not previously know. But there are plenty of careful studies of FAS, for instance these:

Sheila Blumstein et al., "On the nature of foreign accent syndrome: a case study", Brain and Language, 31(2):215-44, 1987. The abstract:

A detailed acoustic analysis was conducted of the speech production of a single patient presenting with the foreign accent syndrome subsequent to a left-hemisphere stroke in the subcortical white matter of the pre-rolandic and post-rolandic gyri at the level of the body of the lateral ventricle. It was the object of this research to determine those changes which contribute to the perception of a “foreign accent.” A number of acoustic parameters were investigated, including features of consonant production relating to voice, place, and manner of articulation, vowel production relating to vowel quality and duration, and speech melody relating to fundamental frequency. The results indicated that many attributes which might have contributed to the foreign quality of the patient's speech were similar to those of normal English speakers. However, a number of critical elements involving consonant and vowel production and intonation were impaired. It was hypothesized that the acoustically anomalous features are linked to a common underlying deficit relating to speech prosody. It is suggested that the normal listener categorizes this speech pattern as a foreign accent because the anomalous speech characteristics, while not a part of the English phonetic inventory, reflect stereotypical features which are a part of the universal phonetic properties found in natural language.

Kathleen M. Kurowski et al., "The foreign accent syndrome: a reconsideration", Brain and Language, 54(1):1-25, 1996. The abstract:

This study compared the post-CVA speech of a patient presenting with the foreign accent syndrome (FAS) to both a premorbid baseline for that patient and to similarly analyzed data from an earlier reported case of FAS. The object of this research was to provide quantitative acoustic data to determine whether: (1) the constellation of phonetic features associated with FAS is the same across patients and (2) a common neural mechanism underlies FAS. Acoustic parameters investigated included features of consonant production (voicing, place and manner of articulation), vowel production (formant frequency and duration), and prosody. Results supported the characterization of FAS patients as having a “generic” foreign accent and the hypothesis that FAS deficits are qualitatively different from that of Broca's aphasia. However, comparison of this case with recent studies revealed the extent to which the constellation of phonetic features may vary among FAS patients, challenging the notion that a general prosodic disturbance is the sole underlying mechanism in FAS.

Inger Moen, "Foreign accent syndrome: A review of contemporary explanations", Aphasiology 14(1):5-15, 2000. The abstract:

This paper presents an overview of the cases of the so-called foreign accent syndrome (FAS) which have appeared in the literature during the last ten to fifteen years and discusses the explanations that have been offered to account for the anomalous phonetic/phonological features of the patients speech. Explanations for the underlying nature of the production disorder in FAS have been given in terms of phonetic setting, in terms of mechanisms for the control of speech motor behaviour, in terms of cognitive processing and in terms of phonological theory. FAS can be seen as an apraxic condition where the ability to control and coordinate the various laryngeal and supralaryngeal features of speech has been damaged. Recent developments in phonological theory, models where the distinction between a phonetic and a phonological level of analysis is less clear cut than in most models, offer interesting perspectives on the description and analysis of FAS.

Lila Guterman, "When Speech Goes Strange", Chronicle of Higher Education, 11/15/2002

Jo Verhoeven and Peter Marien, "Prosody and Foreign Accent Syndrome: a Comparison of Pre- and Post-stroke Speech", Speech Prosody 2004.

Diane Garst et al., "Foreign Accent Syndrome", The ASHA Leader, 11(10):10-11, 2006.

]

Posted by Mark Liberman at 08:19 AM

Language Log is strong

A small point, while I think of it, at the risk of seeming a tiny bit pedantic, concerning how to make reference to Language Log. You may have noticed, from other websites or our occasional direct quotations from them, that there are many people who write things like "I really enjoy the Language Log". To take a random example, this page from the website of the radio program Here and Now says The "Language Log" is an online hub where linguists trade thoughts on all aspects of language. And another site said (and we really are flattered and grateful): the website of record for die-hard language buffs is the Language Log, acknowledging in the following sentence: The Language Log, I admit, is not for the faint of heart (see it here). Many thanks for the praise; but for the non-faint of heart, it's "Language Log", not "the Language Log". If I may use the terminological distinction drawn in The Cambridge Grammar of the English Language (pp. 517ff, recently mentioned here), Language Log is a strong proper name, not a weak one.

What I mean by that is simply that syntactically it falls into line with Arizona rather than with (the) Azores; it is like Kansas City rather than (the) Old City (in Jerusalem): it takes no definite article. What you're enjoying is Language Log. Thank you.

Oh, and by the way, if you're wondering whether this is an entirely prescriptive judgment: no, it isn't really. True, I did just recommend using the correct name for our site, so I am making a point about what your future writing behavior should be, if you want to fall into line with the usual practice among those who know the correct name of our site. But the generalization I'm making reference to here is entirely descriptive. It's not like the Language Log has become widely used all over the place and I'm some sort of atavistic reactionary trying to deny that the language is the way it is. One hundred percent of the references to Language Log by the people who actually write for Language Log say Language Log. None of us call the site the Language Log. And what made us arbiters of good taste? Well, we created Language Log, and coined its name. We coined it as a strong proper name. The sporadic use of the Language Log by others is a sign of imperfect learning. That's a descriptive fact. At least at the moment it is. Things could be different in fifty years. But right now, I'm telling you that saying the Language Log is like saying the Iraq: it is just a mistake.

Being anti-prescriptivist doesn't mean refusing to admit that there can ever be such a thing as a linguistic mistake. It means being interested in what's a mistake and what isn't, rather than bull-headedly sticking with ideas of correctness that cannot possibly be correct.

I grant you that this is a rather subtle point. I have seen discussions in the past that have convinced me that many people cannot see any middle ground between two extremes: for them, it's "everything is correct" versus "nothing is relevant". That is, either (they think) there are no rules or standards at all and everything is as grammatical as everything else, or else the rules are the rules no matter what, and it doesn't matter one whit what the educated usage of native speakers and writers tells us about the language. I regard both extremes as utter whacked-out idiocy. Of course some sequences of words are correct Standard English and others are not. But of course that doesn't mean something can be ungrammatical in Standard English despite the fact that all educated speakers take it to be grammatical and normal usage.

In studying English (or any language) one can make the mistake of following usage too closely, and fail to distinguish sporadic errors of speech from systematic patterns of syntax; but one can also make the mistake of not following it closely enough, maintaining a quasi-superstitious belief in rules that don't really state generalizations about the structure of the language at all. Determining whether you are straying toward one of these methodological errors or the other is a matter that calls for deep reflection, close attention to large bodies of data, careful statement of tentatively proposed rules, and a constant willingness to reconsider and revise proposals about what an accurate grammar for the language should say. It's not easy at all. It's the business that defines descriptive English grammar, and linguistics more generally. It's what those of us who write for Language Log actually do for a living. I am not in any way trying to suggest to you that it's straightforward or easy to draw the distinctions in the right places, or to describe languages correctly. But I know I've got this little piece of English right.

Posted by Geoffrey K. Pullum at 07:34 AM

Weird tradename umlaut

Weirdest use of fake diacritical marks on letters in brand names that I have seen recently: I bought a couple of food items at a Dublin Airport store run by Wrights of Howth and took away a bag saying:

Wrights öf Howth

I just have no idea what they think those umlaut dots on top of the "o" are supposed to be doing. Most references to the long-established Dublin fish company in question display no umlaut. I would have expected an apostrophe (Wright's), but they apparently never use that. Just an inexplicable umlaut — at the Dublin Airport store and on its bags, but apparently not most places on the web where they are mentioned. I guess my worry is that some advertising agency was actually paid money to develop branding ideas and came up with this one in the hopes that it would get them publicity... like a mention on Language Log. Hmm. I guess it worked, didn't it? However, you will not be seeing any nonsensical typographical re-branding here on L@ngu@ge Lög.™

[Update: These fake umlauts are of course called "heavy metal umlauts". Wikipedia offers you more than you could possibly want to know about them in this article. Arnold Zwicky briefly discussed them on Language Log here, and Mark Liberman added a note about other fanciful uses of diacritics like the macron here. My surprise was not occasioned by seeing yet another brand using a heavy metal umlaut, but at seeing the device used by a reputable Irish fishmonger.]

Posted by Geoffrey K. Pullum at 06:09 AM

September 15, 2007

"We all speak different and it's kind of cool"

Just a tidbit of good news. Daily Hampshire Gazette, September 3, 2007, carries an article about the UMass married students' housing at North Village, titled "University village gathers young from around the globe", which features the multilingual, multicultural aspects of the community, which includes lots of international students, many of them with kids. The fourth paragraph quotes 8-year-old Sabrina Maketa from the Democratic Republic of Congo.

"My friends live from all different countries and they're teaching me words," Sabrina said. "I didn't understand people when I first got here, but I'm learning that we all speak different and it's kind of cool."

I don't think you can read the article without being a subscriber; here's the Google cache version.

Posted by Barbara Partee at 11:41 PM

Where do people get this stuff?

A couple days ago I got the catalog for the University of Northern British Columbia's new Continuing Studies offerings, which means typical community college fare. Scattered throughout are little "Did You Know?" boxes. The one in the Languages section asserts that "China has more English speakers than the United States", which I doubt. It is certainly not true if native speakers are meant, and I very much doubt that there are 300 million fluent non-native speakers either. Even if the standard is the ability to carry on a less than fluent conversation I am doubtful. I suspect that the figure they are using is that of the number of people who have studied English.

The really strange item, though, is the claim that: "The language most closely related to English is Flemish.", which is indisputably false. What exactly one means by "Flemish" is actually problematic as the term is used in a number of ways, not all of which describe a meaningful linguistic unit. All of these uses, however, describe one or more varieties of Dutch, and there is no classification of the Germanic languages on which English is closer to Flemish than to other varieties of Dutch.

The subgrouping of the West Germanic languages is somewhat controversial. One common view is that West Germanic contains four subgroups: High German, Low German, Low Franconian, and Anglo-Frisian, where Low Franconian contains Dutch (including Flemish), and as its name suggests, Anglo-Frisian contains English, along with Scots (English) and Frisian. If one takes an expansive view of "English" and includes Scots English as a variety of English, the closest relative of English is Frisian. The subgrouping given in the Ethnologue is a bit different in that it has Frisian as a sister of English (along with High German and Low Franconian-Low Saxon), but again, Flemish is not in any sense the closest relative of English.
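
As a rough sketch only, the "one common view" just described can be written down as a small tree, and the closeness of two languages read off from how many subgroup nodes they share. The structure below encodes nothing beyond what the paragraph says; the function names are made up for illustration, and High German and Low German are left as unexpanded leaves because the paragraph does not list their members.

```python
# One common view of West Germanic subgrouping, as described in the text.
# Nested dicts are subgroups; an empty dict marks a node left unexpanded.
WEST_GERMANIC = {
    "High German": {},
    "Low German": {},
    "Low Franconian": {"Dutch (including Flemish)": {}},
    "Anglo-Frisian": {"English": {}, "Scots": {}, "Frisian": {}},
}


def path_to(tree, target, prefix=()):
    """Return the tuple of node names from the root down to target, or None."""
    for name, subtree in tree.items():
        here = prefix + (name,)
        if name == target:
            return here
        found = path_to(subtree, target, here)
        if found is not None:
            return found
    return None


def shared_depth(tree, a, b):
    """Count the subgroup nodes that two languages have in common."""
    path_a, path_b = path_to(tree, a), path_to(tree, b)
    depth = 0
    # Compare ancestors only, so drop the final element (the language itself).
    for x, y in zip(path_a[:-1], path_b[:-1]):
        if x != y:
            break
        depth += 1
    return depth


print(shared_depth(WEST_GERMANIC, "English", "Frisian"))
print(shared_depth(WEST_GERMANIC, "English", "Dutch (including Flemish)"))
```

On this encoding English shares the Anglo-Frisian node with Frisian and shares no subgroup at all with Dutch, Flemish included, which is the point at issue.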

What I wonder is, where do they get such factoids? Obviously not from the Ethnologue or Wikipedia. I haven't checked all the encyclopaedias, but in my experience they usually follow the Anglo-Frisian view, and I've never seen any source that put English and Flemish together. UNBC has no linguistics faculty or even Germanicists, so it isn't likely it came from faculty. Is there some source of misinformation that I don't know about? Did they somehow confuse Flemish and Frisian?

Footnote: Actually, the term Frisian is somewhat problematic too. What is usually meant by "Frisian" consists of West Frisian (in the Netherlands) and North Frisian (in Germany) together with Saterfriesisch in Germany. "East Frisian" (in Germany), on the other hand, is considered by some to belong with Low Saxon-Low Franconian and not to be closely related to the other Frisians.

Posted by Bill Poser at 07:46 PM

Evs

Omri Ceren writes from LaLa Land, with an update to my 8/3/2007 post on abbreviated forms of whatever:

I thought you should know that at least as far as Los Angeles is concerned, "whatevs" is so six months ago. "Evs" is the vernacular now. Example: "He was like 'baby I didn't know she was your roommate' and I was like 'evs'."

The entry for evs in the unreliable-but-interesting Urban Dictionary gives popularization credit to Australian rocker Toby Rand, from (the reality TV show) Rockstar Supernova. His Live space does have a considerable density of evs, which he spells EVS, generally with two or more exclamation points. The exclamation points are unexpected, it seems to me, and so are his contexts of use, which seem to express some sort of strange down-under enthusiasm in place of the expected hip resignation:

Oh, and one more thing, EVS!!!

The performance show was great this week. I felt everyone stepped up…I thought my version of “Layla” was really good, however, it proved not good enough to keep me out of the bottom 3. Although, I’m happy my term was broadcast on TV…EVS!!

All-caps EVS seems to be the show's official spelling, although others seem to have a better command of what the expression means -- thus in another contestant's Live space:

Going shoe shopping later for my next performance this week. Got the outfit, just need the hooves. I’m really happy about my song this week. It’s my mom’s favorite song & it brings back so many childhood memories. I only wish that she knew I was here & that she could hear me sing this one. EVS…

"Evs" might be the hot interjection in L.A. these days, but it's not easy to find real examples on line. Most of the 3,950,000 hits for {evs} seem to be abbreviations for "electric vehicles" or "European Voluntary Service" or "Eastern Vascular Society", etc. The search string {"was like evs"} turns up only this single excellent example, from November of 2006:

this halloween was like the best ever!!!!!!!!!!!!!!! omg we saw a certain person we wanted to see and we went to this awesome house that like had a haunted house u had to walk through to get candy. this really hot kid like popped out at me!!! i was slightly embarrassed but it was halloween so i was like evs about it.

and {"were like evs"} turns up this (on a myspace page from Sydney, Australia, which suggests that Toby Rand's usage is idiosyncratic rather than regional):

i had LE wierdest drunken dream last night.. we were at some party and i was talken to julian and then somehow i heard that you and dan were in a room hooking up.. and brianna goes, yeah i made it happen, im so proud.. and im all like wtf-ing all over the joint... and then i was like what about matt? and you were like.. evs bro evs... LIKE WTF KINDA DREAM IS THAT???? and yeah then i woke up in a very wtf mood
wtf? stfu. gtfo. pwned.. LAWL

Posted by Mark Liberman at 10:10 AM

Thum

Yesterday on Ask MetaFilter, Pedro Alcocer (writing as "Inigo Jones") asked about his apparently-idiosyncratic way of pronouncing the word them when it's stressed. He gave an example of contrastive substitution: "Don't give it to me, give it to THEM." For most English speakers, that them would rhyme with hem -- but for Pedro, it rhymes with hum.

Unreduced them is historically in John Wells' lexical set #2, the DRESS set, which includes words like step, ebb, hem. But Pedro made a different choice when he first learned English -- he put it into lexical set #5, the STRUT set, which includes words like cup, rub, hum. (Pedro describes his pronunciation as rhyming with bum.)

Different English dialects pronounce these (sets of) vowels differently, but for most of us, we're talking about [ðɛm] vs. [ðʌm] in the International Phonetic Alphabet, approximated in most of the discussion on Ask Metafilter as "them" vs. "thum".

So far, Pedro hasn't found any fellow thummers.

For "the curious and thum-unbelieving", he provided a recording, which I've made more accessible here:

Here's how he describes his background:

Born in Latin America, moved to US (New Jersey) at age 2. Lived in New Jersey until age 7. Moved to Miami, FL and lived there until college.

There is a Miami accent of English (spoken by native English speakers who grew up in Miami, not necessarily immigrants), that I and other Miamians can identify, but as far as I can tell and others have told me, I don't have it. Generally, I think I have a fairly standard American accent ("how they talk on TV").

Pedro has clearly assigned them to a lexical set in an idiosyncratic way, but his mistake was a reasonable one with plenty of precedents.

It's a sensible mistake because them is nearly always unstressed (pronouns and other anaphors are usually unstressed), and therefore pronounced with a reduced vowel, some form of schwa, for which the closest stressed-vowel equivalent would be the vowel of the STRUT set.

As for the precedents, most Americans have adopted several changes of the same type, as John Wells pointed out to me in email. We have stressed of rhyming with love, although historically of had the vowel of LOT, and still does for most Brits -- that's a member of lexical set #4 moving to lexical set #5. And many of us pronounce because to rhyme with buzz rather than with laws -- that's a member of lexical set #13 moving to lexical set #5.
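
For readers who like to see the bookkeeping, here is a minimal sketch of the three reassignments as a lookup table. The set numbers and rhymes come from the discussion above; the name THOUGHT for set #13 is supplied from Wells' standard numbering rather than from the post, and the variable and function names are purely illustrative.

```python
# Wells lexical set numbers mentioned in the post. The label for #13
# (THOUGHT) is taken from Wells' standard list, not from the post itself.
LEXICAL_SETS = {2: "DRESS", 4: "LOT", 5: "STRUT", 13: "THOUGHT"}

# (word, historical set, set of the stressed form after the change, rhyme)
SHIFTS = [
    ("them", 2, 5, "hum"),       # Pedro's idiosyncratic assignment
    ("of", 4, 5, "love"),        # most Americans
    ("because", 13, 5, "buzz"),  # many Americans
]


def describe(word, old, new, rhyme):
    """Render one reassignment as a readable line."""
    return f"{word}: {LEXICAL_SETS[old]} -> {LEXICAL_SETS[new]} (rhymes with {rhyme})"


for shift in SHIFTS:
    print(describe(*shift))
```

All three words land in the same set, STRUT, which is what one would expect if the stressed form is being rebuilt from a schwa.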

So despite the lack of thummers among those who answered Pedro on Ask Metafilter, I'll bet that there are some of you out there. All things considered, I'm surprised that this development is apparently so rare.

[Note -- I originally identified the Ask Metafilter author only as "Inigo Jones", but changed the name throughout the post in response to this request:

I'm writing to identify myself as Inigo Jones, the thummer you recently wrote about on Language Log. My name is Pedro Alcocer and I'm a linguistics PhD student at the University of Maryland. I was hoping you could replace instances of Inigo Jones with my name on your entry, as this would gain me much cred among my peers.
Thanks for your analysis. I've been mocked for my thumming, and I hope other thummers come out of the woodwork because of your entry so we can form a support group or something.

]

Posted by Mark Liberman at 08:13 AM

Family Values in Biology and Linguistics

In my post on The Arachnid Threat, I compared the cooperation of the twelve families of spiders involved to cooperation between human beings and Lar Gibbons. The basis for this comparison was the fact that human beings and Lar Gibbons, like the spiders, belong to different families within the same order. Without knowing exactly which species of spiders were involved (not reported in the news item), distance on the traditional family tree was the only measure available to me.

Over at Evolgen, RPM points out that this is not a good distance measure because mammalian taxa are much less diverse than arthropod taxa. Using estimates of the time depth of separation of the various taxa involved, he concludes that a more apt comparison would be cooperation between human beings and marsupials, such as kangaroos, koalas, opossums, and Tasmanian devils.

This improved estimate makes the arachnid threat even greater, but it also raises an interesting point about linguistic and biological taxonomy. In the traditional Linnean classification system, the nodes are assigned categories according to the following hierarchy:

     Kingdom
          Phylum
               Class
                    Order
                         Family
                              Genus
                                   Species
                                        Variety

(The mnemonic that I learned for this many years ago is: Kangaroo Pouches Can Offer Fuzzy Gorillas Stomach Vibrations. There are many others.)

The intention was that the members of one taxon at a given level would be about as diverse as the members of another taxon at the same level. Given two species, it was possible to argue about whether they were two species of the same genus, or of different genera within the same family, or of different families within the same order, and so forth. The point is, a Linnean classification consists not only of a tree, which by itself contains only information about the sequence of divergences, but also node labels that indicate the degree of separation, which, in evolutionary terms, is naturally interpreted as time depth.
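
To make the role of the node labels concrete, here is a small sketch that finds the lowest rank at which two classifications still agree. The rank list is the hierarchy given above, minus Variety; the taxon names for humans and lar gibbons are the standard textbook ones and are not taken from the original post, and the function name is invented for this example.

```python
# Ranks from the Linnaean hierarchy listed above, most general first.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]

# Standard classifications (supplied for illustration, not from the post).
HUMAN = ["Animalia", "Chordata", "Mammalia", "Primates",
         "Hominidae", "Homo", "Homo sapiens"]
LAR_GIBBON = ["Animalia", "Chordata", "Mammalia", "Primates",
              "Hylobatidae", "Hylobates", "Hylobates lar"]


def lowest_shared_rank(a, b):
    """Return the most specific rank at which two classifications agree."""
    shared = None
    for rank, taxon_a, taxon_b in zip(RANKS, a, b):
        if taxon_a != taxon_b:
            break
        shared = rank
    return shared


print(lowest_shared_rank(HUMAN, LAR_GIBBON))
```

The result is only a label, "Order" in this case. Nothing in the classification itself says how many years of separation that label stands for, which is the weakness discussed next.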

Until very recently, biologists had no reliable way to measure time depth. Now that they can, by comparing DNA sequences, they know that some taxa have much greater time depth than others of the same category. Nonetheless, they continue to use what we now know to be a flawed taxonomic system. The "taxonomic bias" to which RPM says I fell victim is actually a fault of biologists, who continue to use the meaningless Linnean categories. Why did the biologist who studied the Lake Tawakoni spiders report that they belonged to twelve different families rather than something more meaningful, such as "species separated by as much as so many millions of years"?

This is one respect in which linguistic classification is superior to biological classification. Linguistic classifications, like biological classifications, contain trees, but there are no categories like kingdom and phylum. (Linguistic classifications are actually forests, since it is not established that all languages are related.) The only category distinction that you will find in mainstream genetic classifications of languages is the distinction between language and dialect, roughly comparable to the biological distinction between species and variety, and although we use it for some purposes, the great majority of linguists will tell you that it has no real meaning, or that if it does have any meaning, it is social, not truly linguistic.

Hierarchies of categories like that used in biology have been proposed in linguistics, but they aren't in general use. Indeed, use of terms from these proposals, such as phylum and stock, is a good indicator that you are dealing with crank work. Generally speaking, a family is a group of languages whose relationship is solidly established, whereas a phylum or a stock is a group of languages for whose relationship solid evidence is lacking.

I find it ironic that biologists, who now have much better means than linguists for subgrouping and determining time depth, continue to make use of a flawed taxonomic system which linguists have been smart enough never to adopt.

[Update: I have been pointed to this site, which is written by the scientists involved and contains details of the species as well as photographs.]

Posted by Bill Poser at 02:51 AM

September 14, 2007

Elmore's adverbs

Yesterday I objected to Terrence Rafferty's description of Elmore Leonard's writing style as "no gassy speeches, just behavior in all its unaccountable variety"; and I even did a tiny stylometric experiment to support my impression that his characters -- the heroes as well as the villains -- are in fact unusually talky ("The laconic hero: now 54.9% talk", 9/13/2007).

It turns out that I could instead have simply made an argument ad auctoritatem. Andrew Brown from Helmintholog sent in a link to his post "St. Elmore's Fire" (10/8/2005), in which he quoted this passage from B.R. Myers' review of The Hot Kid in the November 2005 Atlantic ("The Prisoner of Cool"):

Though pioneered a century ago by the English dandy Ronald Firbank, and then popularized by a man whose first name was Evelyn, the technique of letting conversation carry a story is regarded in America as the tough guy’s way to write a novel, and Leonard makes no secret of his pride in it. Unfortunately, it compels him (as it did Firbank and Waugh) to stick to talkative characters. This excludes the true professionals on both sides of the law, leaving us with small-time cops and ex-cons who rarely keep quiet long enough to seem cool. They’re street-smart for sure, but although the recurring interjection “The fuck’m I doing here?” certainly puts Sartre in a nutshell, no one seems to think about anything, at least not anything interesting.

Andrew wrote that "[Myers'] whole essay is an example of what book reviewing ought to be", and it is.

That doesn't mean that Myers is right, though. He thinks that Leonard's western novels are better than his crime novels because they're more morally serious ("Back then he was still immune to the silly idea that it's unrealistic to pit a very good person against a very bad one, so even in a short novel like Hombre (1961) the conflict seems thrillingly epic in scope") and more willing to probe below the surface ("addressing the reader directly, getting into his characters' heads, and engaging in other things he now dismisses as 'hooptedoodle'").

That's pretty much the opposite of Rafferty's claim that it's hard to make movies from the westerns because Leonard never gives us any clues about the characters' motivation. But my memory of the books suggests that Rafferty and Myers are both wrong -- Leonard's westerns and his crime novels seem pretty consistent to me in tone and style.

I think that Myers' impression of Leonard's writing may have been subtly warped by what he wrote about how to write. Myers observes that

Leonard has explained his craft as a matter of avoiding adverbs and imagery, using only the word "said" to carry dialogue, and doing everything else possible to make himself "invisible." [...]

In recent years Leonard has begun describing his style in the imperative. "Try to leave out the part that readers tend to skip," he wrote in a famous New York Times article in 2001, "[like] thick paragraphs of prose you can see have too many words in them … I'll bet you don't skip dialogue." Just thinking of the prigs who will squawk at this aesthetic makes one want to cheer it. A moment later one realizes that what Leonard is in effect advocating—and indeed, what he writes—are novels in which the characters spend most of the reader's time talking.

Myers' complaints about this aesthetic are partly a matter of taste, not fact:

The screenwriter Richard Price has said that a Leonard line "looks great on the page, but when somebody is saying it, you feel like you have to stand up and say, 'Author! Author! Perfect ear!'" This is because the dialogue often serves no other purpose than to show off that ear. Besides, people may talk like that in real life but they don't hear like that; "use to," for example, draws more attention to the aural surface of speech than anyone would normally give it. There is a reason why the best novelists worry no more about authentic dialogue than is necessary to avoid outright stiltedness.

But consider this:

One might also ask why Leonard can't get that damned smile out of his voice even when he's describing a frightened girl with nowhere to go but a brothel. The answer lies in the tough-guy aesthetic he has spent too long cooping up his talent in: it just isn't man enough to handle any real drama.

At one point in the novel we sense what we are missing.

Joe Young picked up his gun and went around to open the cash register. Taking out bills he said to the woman, "Where you keep the whiskey money?"

She said, "In there," despair in her voice.

There is still enough of the western writer in Leonard for him not to have struck that last line, though it violates his current style in letter and spirit. We aren't supposed to notice how it jars with that famously light-handed approach to violence; nor, I suppose, are we to notice that "despair in her voice" is hardly less of an adverb than "despairingly" would have been. But there is more of 1930s Oklahoma in those four words than in all the novel's historical detail. What good is an aesthetic, if it has to be cheated into accommodating the things that count?

In his western novel The Bounty Hunters, which I just re-read, the background is the genocide of Indians on both sides of the U.S.-Mexico border in the 1870s. The "bounty" in question is paid by the Mexican government for Apache scalps -- men, women or children -- and part of the plot is that a gang of American outlaws has started killing Mexican villagers, blaming the killings on Indian raids, and selling the villagers' scalps for the bounty to a corrupt lieutenant of the Rurales. But all this is just as much backgrounded in The Bounty Hunters as Depression-era Oklahoma is in The Hot Kid.

And as for that phrase "despair in her voice", this syntactic pattern is typical of Leonard's style in all genres. I wrote about this small stylistic hypocrisy in an earlier post ("Avoiding rape and adverbs", 2/25/2004):

Honesty compels me to point out that Leonard uses lots of adverbs. The first few paragraphs of Cat Chaser have fast, neatly, freshly, once, half, already, almost, directly, there, and today, not counting the large number of adverbial PPs.

I should also point out that Leonard often uses appositives and adverbial PPs in quotative tags:

...Moran said, just as dry.
...Nolen Tyner said, smiling a little, ...
...the woman said, with an edge but only the hint of an accent.
...Virgil said, spacing the words.
...Mr. Perez said, with his soft accent.
...Ryan said, still wanting to be sure.
...Rafi said, his expression still grave.

"Blah blah, Rafi said, his expression still grave" is stylistically different from "blah blah, Rafi said gravely", but it doesn't seem to me that the writer is intruding any less in these quotative-tag appositives than in quotative-tag adverbs.

In that post, I pointed out that Leonard's rate of adjective use is actually on the high side, relative to general prose norms. I didn't count his quotative adverbials and appositives, but my guess would be that he uses them at a rate comparable to what you'd find in some romance novels.

Myers' point about Leonard's sidelong glances at big emotional and moral issues is a serious one. (If you don't like this kind of moral indirection, though, I think you need to blame its sources, which include Twain and Hemingway as well as Hammett and Bogart.) But when he supports this point with the implicit claim that Leonard's western novels are richer in quotative adverbials than his crime novels are, I wish, not for the first time, that we lived in a culture where humanists were used to testing their empirical generalizations against ascertainable fact.

[Note: I originally followed Andrew Brown's assumption that B.R. Myers was female -- but a couple of readers have written in to inform me that he isn't, so I've adjusted the pronouns appropriately in the post above.]

Posted by Mark Liberman at 06:10 AM

September 13, 2007

Evil multilingualism

On the American Dialect Society's mailing list this morning, Dennis Preston quoted from a column in the State News, Michigan State's student newspaper, today:

Leftists have tried fervently since the 1960s to subvert American culture by promoting cultural heresies, which really only amount to a form of subversion. These cultural heresies include but are not limited to: Radical feminism, sexual deviancy, multilingualism and atheism.

Since the days of Joseph McCarthy, the list of threats to America from the left has evolved some. Communism is no longer high on the list, while feminism has joined homosexuality and atheism as a major ogre, along with abortion and same-sex marriage (cultural heresies that didn't make conservative columnist Nate Sherman's short list of four, though they're surely in his top ten). But now we see that multilingualism is subversion, major subversion.

Godless, queer, feminazi multilinguals! They're everywhere!

Why, my partner Jacques was quadrilingual. Oh.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 02:44 PM

The Arachnid Threat

There's something weird going on. As we have often remarked, the press, especially the BBC, is all too ready to pick up on questionable stories about animal communication, yet there is a serious story to which they aren't paying much attention. I refer to the giant spider web at Lake Tawakoni State Park in Texas. The web covers hundreds of square meters. Not only was it built by hundreds of spiders, who normally build isolated webs and eat each other if they get too close, but entomologist Allen Dean reports that they belong to twelve different families! We're talking massive inter-species communication here folks, and not particularly closely related species either. It is comparable to communication and collaboration between human beings (family Hominidae) and Lar Gibbons (family Hylobatidae).

Now I don't know what the size of their vocabulary is or whether they have recursion, but the fact is, they are communicating, and speaking for arachnophobes everywhere, I don't like it. If they keep this up, pretty soon it won't just be little spiders in North Texas building a big web, it will be armies of giant tarantulas coming to get us. Think of the bugs in Starship Troopers. We need a massive, international research effort into spider communication right now, so that we can put a stop to this menace. I don't know why the BBC and other journalists haven't raised the alarm. They're probably in league with the spiders.

Posted by Bill Poser at 02:43 PM

Scalar implicature in the funny papers

Recently, there's been some experimental work on the psychological development of scalar implicature. Thus Anna Papafragou and Julien Musolino ("Scalar implicatures: experiments at the semantics-pragmatics interface", Cognition 86, 253-282, 2003) worked with (the Greek version of) sentences like "Some/all of the horses jumped over the fence", "Two/three of the horses jumped over the fence", or "The girl started/finished making the puzzle", and found that:

[S]ubjects were presented with contexts which satisfied the semantic content of the stronger (i.e. more informative) terms on each scale (i.e. all, three and finish) but were described using the weaker terms of the scales (i.e. some, two, start). We found that, while adults overwhelmingly rejected these infelicitous descriptions, children almost never did so.

For their five-year-old subjects, they also found that "Children also differed from adults in that their rejection rate on the numerical scale was reliably higher than on the two other scales" -- but they didn't test 15-year-olds engaged in financial negotiations with their parents.

Papafragou and Musolino went on to do a second experiment to "test the hypothesis that children’s apparent inability to derive scalar implicatures may be due to the nature of the task and in particular children’s inability to infer the goals of the experimenter". They conclude that "children’s sensitivity to scalar implicature greatly improves once they are made aware of the goals of the task and provided with contexts which more readily invite the kinds of pragmatic inferences under investigation".

Today's cartoon suggests that adolescents' sensitivity to scalar implicature may decline when their own goals are strongly in conflict with the pragmatic inferences in question. In fact, I suspect that this is true for all of us.

[ Mark Logan writes:

Your post on the Zits comic reminded me of an episode from my grad school days. A friend who was somewhat frazzled from her math courses (and thoroughly indoctrinated in mathspeak where one has to follow very strict entailment) went to a hardware store in search of some kind of bolt. She brought an example, and said "I need five of these." The clerk said "I checked, and we have three of them." My friend responded, "Okay, but do you have five of them?" This supposedly went on for several rounds to the increasing exasperation of both sides. I still find it hard to believe that the clerk didn't say "only three" at some early stage, but whatever. In the same vein of mathematicians being reluctant to follow everyday implicature, this also reminds me that I want to make a cartoon featuring a mathematician ordering two distinct hotdogs.

]

Posted by Mark Liberman at 06:57 AM

The laconic hero: now 54.9% talk

In Terrence Rafferty's review of the recent remake of 3:10 to Yuma ("Elmore Leonard's Men of Few Words, in a Few Words", NYT 9/2/2007), he echoes the stereotypical idea that real men -- or at least real male heroes -- don't talk much:

[I]t's next to impossible for screenwriters, directors and actors to feel entirely comfortable with motivation as sparse as Mr. Leonard supplies, especially for his most heroic characters, and especially in his super-laconic, man's-gotta-do-what-a-man's-gotta-do westerns.

Rafferty admits that the villains in Leonard's crime novels might be a little gabbier:

Mr. Leonard tends to like it chilly, though: no warming sentiment, no gassy speeches, just behavior in all its unaccountable variety. When the market for western fiction dried up in the early '60s, and he began to write the eccentric contemporary crime novels that have since enlarged his reputation, the characters got chattier, but it's mostly the villains.

It's true that Elmore Leonard once wrote westerns and now writes crime novels, but my memory and intuition tell me that all the other assertions and implications of these passages are false.

In particular, it's dialogue more than anything else that carries Leonard's narratives along. I'd guess that his books -- both westerns and crime novels -- contain a larger fraction of reported talk than the books of most other writers do. (A lot of this talk is about motivation, too, but more of that another time.) I also believe that his good guys talk at least as much as his bad guys do, and often more -- though of course individual heroes and villains have their own individual styles.

Being a positivistic kind of guy, at least in small things, I decided to do a Breakfast Experiment™ in order to see whether my beliefs about the role of talk in Leonard's works are likely to be right.

Unfortunately, my Elmore Leonard collection is all in paper rather than digital form, so I need to count things by hand. In order to be able to finish the experiment during my breakfast hour, I decided to count printed lines in (my copies of) a few chapters of a couple of books, comparing the total line count to the number of lines that contain some directly-quoted dialogue.

As a baseline, I checked out the first two chapters of P.D. James' Devices and Desires. No one could accuse Ms. James of neglecting sentiment and motivation, or of featuring low-verbal leading characters: her detective Adam Dalgliesh is a (fictionally) published poet. The two chapters that I scanned contain 315 lines, of which 56 are direct quotation in whole or in part. By this crude measure, these two chapters are 17.8% talk.

The first two chapters of Elmore Leonard's The Bounty Hunters (a western) contain 731 lines, of which 401 are direct quotation in whole or in part. Thus these two chapters are 54.9% talk.

The first two chapters of Elmore Leonard's City Primeval (a crime novel) contain 605 lines, of which 283 are direct quotation in whole or in part. That's 46.8% talk. I won't claim that this is a significantly smaller proportion than in the western novel, but it's surely not more. If Leonard's crime-novel characters are chattier than his western-novel characters, it's not evident in this case.
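The arithmetic behind these three figures is easy to re-run. Here is a minimal sketch in Python, using only the line counts reported above; the function and variable names (`talk_percentage`, `talk_report`, `SAMPLES`) are made up for illustration, and the hand-counting of lines is of course not something the code can reproduce.

```python
# Line counts as reported in the post:
# title -> (total printed lines, lines containing some direct quotation)
SAMPLES = {
    "Devices and Desires": (315, 56),
    "The Bounty Hunters": (731, 401),
    "City Primeval": (605, 283),
}


def talk_percentage(total_lines, quoted_lines):
    """Percentage of printed lines that contain some directly quoted dialogue."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * quoted_lines / total_lines


def talk_report(samples):
    """Map each title to its talk percentage, rounded to one decimal place."""
    return {
        title: round(talk_percentage(total, quoted), 1)
        for title, (total, quoted) in samples.items()
    }


if __name__ == "__main__":
    for title, pct in talk_report(SAMPLES).items():
        print(f"{title}: {pct}% talk")
```

This is the same crude measure as in the text: a line counts as talk if any part of it is quoted, so it says nothing about how many words on the line are dialogue.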

What about the claim that it's especially the crime-novel bad guys that get talky? Well, the first chapter of City Primeval features a bad guy, Clement Mansell, whose utterances supply 11.1% of the chapter's lines. The second chapter features a good guy, Raymond Cruz, whose utterances supply 35% of the chapter's lines. (The rest of the quoted material in those two chapters comes from victims and other incidental characters.) Again, this is not a large enough sample to allow us to be confident that Leonard's heroes are chattier than his villains, but that's the direction of the effect.

This confirms my impressions strongly enough that I'm willing to offer Mr. Rafferty -- or anyone else -- a modest wager. I claim that the characters in Elmore Leonard's crime novels are not significantly chattier than the characters in his western novels, and that in both genres, the heroes are at least as chatty as the villains. I also claim that the proportion of dialogue in Leonard's novels, taken as a whole, is on the high end of the distribution for recent popular writers. I bet that if we counted up the amounts and proportions of talk in a larger sample of books, the numbers would continue to confirm this. The stakes? Up to you; but at least, if you can show me that I'm wrong, I'll buy you breakfast.

OK, what about Elmore Leonard's allegedly sparse man's-gotta-do-what-a-man's-gotta-do approach to motivation? This is harder to quantify, and I've come to the end of my breakfast hour. For now I'll limit myself to noting that in the four chapters that I scanned, just about everything -- all the dialogue and all the action -- is directly relevant to the characters' motivation in the conflicts that will follow. And a surprisingly large amount of the talk is explicitly about motivation.

Here's a bit of dialogue from City Primeval. A reporter is interviewing the hero, detective Raymond Cruz:

"That's it -- you're trying to look older, aren't you? The big mustache, conservative navy-blue suit -- but know how you come off?"

"How?"

"Like someone posing in an old tintype photo, old-timey."

Raymond leaned on the table, interested. "No kidding, that's what you see?"

"Like you're trying to look like young Wyatt Earp," the girl from the News said, watching him closely. "You relate to that, don't you? The no-bullshit Old West lawman."

"Well," Raymond said, "you know where Holy Trinity is? South of here, not far from Tiger Stadium? That's where I grew up. We played cowboys and Indians over on Belle Isle, shot at each other with B-B guns. I was born in McAllen, Texas, but I don't remember much about living there."

This stuff about laconic male heroes is the other side of the stubborn stereotype about gabby women. And curiously, it seems that both ideas are just as false in fiction as in fact.

[Update -- an anonymous educator writes:

I love the language log, even if I can only understand 2/3 of it. And I remain a fan of David Foster Wallace - in fact, you guys made him more likeable by humanizing his brilliance to some extent.
Anyhow, I'm writing about Elmore Leonard. Anyone familiar with his work knows that it is driven by dialogue. That's what makes him Elmore Leonard. It's strange that the NYT editors would let those claims stand, assuming they had even a passing knowledge of Mr. Leonard's work.

It's a puzzle. I've speculated that journalists are more concerned about whether "facts" are morally instructive than whether they're true, but I have no evidence for this hypothesis beyond its power to account for journalists' behavior in all its unaccountable variety. The author of Heads Up: The Blog, who seems to be a sort of social scientist working undercover as a newspaper editor, put it more charitably in an email to me last year: "For better or worse, that's a function of journalism; it transmits cultural norms and empirical data in roughly equal proportions."

Of course, I agree with Anonymous Educator that the role of dialogue in Elmore Leonard's writing ought to be a cultural commonplace. But as with our fellow citizens' inadequate knowledge of the Simpsons, this is apparently an area where we teachers need to work harder. ]

Posted by Mark Liberman at 06:53 AM

September 12, 2007

Fifty ways to lose your lover: Edinburgh's street names

In a street name like Sesame Street, let us (just for clarity) refer to the more distinctive part (Sesame) as the forename, and the classificatory word part (Street) as the classname. Clearly, Smith Street and Smith Avenue are distinct proper names; but in the USA, or at least very definitely in California (people tell me it is not nearly so true of cities like New York or Baltimore or Denver), it is common for street classnames to be thought of as conveying virtually no information. In the American English of Northern California it is standard for streets to be referenced simply by forename. This is true even where it is significantly misleading: The sign on Junipero Serra Boulevard near Page Mill Road in Palo Alto, California, adjacent to the university and separate post town of Stanford, simply says Stanford, but it does not mean "This is Stanford"; it means "This side street is called Stanford Avenue". And of course intersections are denoted by coordinations of forenames (think of Hollywood and Vine in Hollywood, or Florence and Normandie in south Los Angeles). Well, in Edinburgh, the city to which I moved a few days ago, things are very different. Nobody leaves off the classname here. It would be a ticket to confusion and madness. Let me explain.

Nobody in Edinburgh could possibly say, "We live on Inverleith." Edinburgh has an Inverleith Avenue, an Inverleith Avenue South, an Inverleith Gardens, an Inverleith Green, an Inverleith Place, an Inverleith Place Lane, an Inverleith Row, an Inverleith Terrace, and an Inverleith Terrace Lane, all on the same page of my street atlas yet all quite different and not necessarily adjacent or intersecting.

Did your hasty notes on a beer coaster say that you promised last night to pick up your hot date this Friday at a flat on Craigmount? You will find there are many ways to lose your lover: you might be looking for Craigmount Approach, Craigmount Avenue, Craigmount Avenue North, Craigmount Bank, Craigmount Bank West, Craigmount Brae, Craigmount Crescent, Craigmount Court, Craigmount Drive, Craigmount Gardens, Craigmount Green, Craigmount Green North, Craigmount Hill, Craigmount Loan, Craigmount Park, Craigmount Place, Craigmount Terrace, Craigmount View, or perhaps Craigmount Way. Lots of luck with arriving on time for that date. Next time get a phone number too.

As Andrew Durdin has pointed out to me, Buckstone is a particularly promiscuous forename. Street atlases list all of these:

Buckstone Avenue	Buckstone Bank	Buckstone Circle	Buckstone Close
Buckstone Court	Buckstone Crescent	Buckstone Crook	Buckstone Dell
Buckstone Drive	Buckstone Gardens	Buckstone Gate	Buckstone Green
Buckstone Grove	Buckstone Hill	Buckstone Lea	Buckstone Loan
Buckstone Loan East	Buckstone Neuk	Buckstone Place	Buckstone Rise
Buckstone Road	Buckstone Row	Buckstone Shaw	Buckstone Terrace
Buckstone View	Buckstone Way	Buckstone Wood	Buckstone Wynd

That's 28 different Buckstone streets. And for good measure there is also one that has no classname, just an attributive premodifier of the forename: High Buckstone.

There are actually way over fifty ways to lose your lover in this city's streets. We get the first fifty from the fact that a cursory glance through a street index reveals all of the following words to be quite common as street classnames in Edinburgh:

Approach	Crescent	Green	Parade	Square
Arcade	Crest	Hill	Park	Street
Avenue	Dean	Lane	Passage	Terrace
Bank	Dell	Lea	Path	View
Boulevard	Drive	Loan	Place	Villas
Circle	End	Mains	Promenade	Walk
Circus	Entry	Market	Quadrant	Way
Close	Gait	Mews	Rise	Wood
Cottages	Gardens	Mount	Road	Wynd
Court	Glebe	Neuk	Row	Yard

But in fact, for any chosen forename, we can make many more than fifty well-formed street names.

First, there are additional choices of classname that I did not include in the above top 50: I left out Brae, Buildings, Causeway, Crook, Crosscauseway, Grove, Junction, Pend, Port, Rigg, Stile, Syke, Vale, and various others.

Second, there are a large number of more complex street names combining a forename with two classnames: Argyle Park Terrace, Buccleuch Place Lane, Douglas Gardens Mews, Hermitage Park Green, Logie Green Gardens, Maitland Park Road, Merchiston Bank Gardens, Moredun Park View, Mortonhall Park Bank, Muir Wood Crescent, Niddrie Mains Terrace, Primrose Bank Road, Queen's Park Court, Regent Terrace Mews, and so on.

Third, there are many streets known simply by a forename (Abbeyhill, Blandfield, Croft-an-Righ, Damside, Esplanade, Fairbrae, Galachlawshot, Haymarket, etc.), and a few streets known simply by the definite article plus a classname with no forename at all (like The Mound or The Crescent).

Fourth, for any combination of forename and classname(s), any one of eight compass point names (North, North East, East, South East, South, South West, West, or North West) can be either prefixed (as in West Gyle Crescent) or suffixed (as in Craigmount Green North), so we can multiply the total so far by sixteen.

And to tell you the truth, it's more than that, because of at least one other place that a compass point can intervene: Edinburgh has a Thistle Street North East Lane and a Northumberland Street North West Lane.
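The compass-point step of this counting argument can be sketched in a few lines of Python. Everything here is illustrative: the function names are invented, the three classnames are just a sample from the table above, and the output is a list of well-formed candidate names, not of streets that actually exist. The sketch also ignores the double-classname patterns and the mid-name compass points just mentioned.

```python
COMPASS_POINTS = [
    "North", "North East", "East", "South East",
    "South", "South West", "West", "North West",
]


def compass_variants(base_name):
    """The 16 names made by prefixing or suffixing each of the 8 compass points."""
    prefixed = [f"{point} {base_name}" for point in COMPASS_POINTS]
    suffixed = [f"{base_name} {point}" for point in COMPASS_POINTS]
    return prefixed + suffixed


def candidate_names(forename, classnames):
    """Every plain forename + classname pair, plus its 16 compass variants."""
    names = []
    for classname in classnames:
        base = f"{forename} {classname}"
        names.append(base)                     # e.g. "Craigmount Green"
        names.extend(compass_variants(base))   # e.g. "Craigmount Green North"
    return names


if __name__ == "__main__":
    sample = candidate_names("Craigmount", ["Avenue", "Green", "Terrace"])
    print(len(sample), "candidate names from 3 classnames")
```

Each base name yields sixteen compass-modified names, which is the factor of sixteen in the text; if the unmodified name is counted as well, that makes seventeen candidates per base, so fifty classnames alone would give 850 well-formed names for a single forename.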

It is a jungle of thoroughfare nomenclature out there, as was first pointed out to me by my colleague Bob Ladd when I got the classname of his address wrong. We're going over to his house tonight for dinner. Should be easy to find. He lives over on Leamington, near Craigmount.

Thanks to John Cowan, Matthew Rankine, Dan Asimov, Jen Wood, Alexis Grant, and various other readers for corrections and additions that I have silently incorporated into the above since it was first posted.

Posted by Geoffrey K. Pullum at 01:19 PM

Connard de flic

According to Emil Steiner in the Washington Post (OFF/beat, 9/10/2007):

Perhaps inspired by McDonald's assault on the Oxford English Dictionary, a French police union is suing one of its country's most respected dictionaries over its definition of "police." The Unsa-police union has asked for a court order to force Le Petit Robert to remove a slang reference from its 2008 edition that defines their profession as "connard de flic" (bloody pig). In solidarity, a second union is calling for a boycott of the dictionary and Interior Minister Michele Alliot-Marie has disapproved of the editors decision saying she "deplored" the use of that parlance, popular among immigrant youth. But if those words of disapproval aren't enough, perhaps France's best solution would be to simply confiscate and burn all copies containing that virulent reference? Not only would that eradicate the word, but also give those who once used it a new found respect for their government's proactive understanding.

If you've ever read a dictionary, then that clicking sound is probably the needle of your BS detector bumping up against its limiting peg.

And in fact, a quick scan of news.google.fr for "connard" turns up a very different story. As explained in the Nouvel Observateur, for example,

Pour illustrer le terme "rebeu", le dictionnaire cite une phrase de l'auteur de polars Jean-Claude Izzo: "T'es un pauvre petit rebeu qu'un connard de flic fait chier, c'est ça !". Alliance considère qu'il s'agit là d'un "outrage fait à la police nationale et aux policiers".

To exemplify the term "rebeu", the dictionary cites a phrase by the crime-fiction author Jean-Claude Izzo: "You're a poor little <rebeu> that a <connard de flic> pisses off, that's it!" The [main union of the national police] believes that this constitutes an "outrage to the national police and to policemen".

So the controversy has got nothing whatever to do with the "definition of 'police'" -- the contested item is an example sentence used in the entry for a completely different term, rebeu, which itself is double-verlan for Arab (Arab → beur → rebeu). (No one seems to be complaining about the "pauvre petit rebeu" part, oddly enough.)

Misrepresentation of lexicographical practice aside, this brings up the puzzling question of judging offensiveness across languages and cultures. As far as I know, flic is informal but in itself entirely inoffensive, about at the level of English cop, and not at all like pig. And as for connard, it was the French Word of the Day over at About.com in February of 2005, where it was defined as

un connard
(familiar) - idiot, jerk, schmuck
C'est un vrai connard ! - He's a real jerk!
Related: une connarde / une connasse (familiar) - bitch, cow

This is about what I had inferred, both in terms of meaning and in terms of degree of offensiveness, from hearing and seeing the word used. And idiot, jerk and schmuck are not words of praise, but they're not taboo words either, and the quote from Izzo seems pretty clearly to refer to a particular policeman, not to the police in general.

On that basis, it's hard to see why the police union got upset, though perhaps they were upset already and just looking for a hook to hang it on. Then again, maybe I'm wrong about how (in)offensive connard might be. The Dictionnaire de l'Académie Française has no entry for it, but there is one for con:

(3)*III. CON, CONNE n. XIIe siècle. Du latin cunnus, « sexe de la femme ».
1. N. m. Vulg. Organe sexuel de la femme. 2. N. Fig. et très vulg. Personne sottement passive, imbécile, idiote, par comparaison dépréciative, héritée de la tradition latine, avec l'activité virile. Quel con ! Quelle conne ! Il s'est conduit comme un con. Faire le con. Une histoire à la con, particulièrement stupide. Expr. d'argot militaire. Mort aux cons ! Adjt. Il est trop con. Elle est vraiment con, ou conne. Expr. Con comme la lune. Bien que cet emploi figuré apparaisse dans les correspondances littéraires dès le XIXe siècle et que l'usage parlé s'en soit fort répandu, ne doit être employé que dans une intention de vulgarité appuyée.

The usage warning is in boldface in the original: "Although this figurative meaning appears in literary correspondence as early as the 19th century, and the spoken use is widespread, [this word] should not be used without a specific intention of vulgarity."

Sebastien Fontenelle at Vive le Feu suggests another set of English counterparts for "connard de X", as a way of translating into French a phrase from the life of Johnny Rotten. And there's some useful discussion of translation options for "connard de flic" by Billy Beccles on the snopes.com message board -- he settles on "twat of a cop".

This seems to me to put too much emphasis on the etymological meaning of connard, but anyhow, I can't imagine a police union in the U.S. getting bent out of shape over a dictionary entry for a slang term for an ethnic minority X, that quoted a piece of detective-novel dialogue along the lines of "you're just a poor little X, pissed off by some twat of a cop" (and much less, "some jerk of a cop", which might be closer in meaning and tone). The Committee for X-American Understanding, now, that would be a different matter.

For lagniappe, consider that the feminine form connasse, which I gather remains quite a bit more offensive than connard in France, has turned into a common slang ethnonym for Cajuns, viewed positively at least by some. The Wikipedia entry for coon-ass says:

Socioeconomic factors appear to influence how Cajuns are likely to view the term: working-class Cajuns tend to regard the word "coonass" as a badge of ethnic pride; whereas middle- and upper-class Cajuns are more likely to regard the term as insulting or degrading, even when used by fellow Cajuns in reference to themselves.

Despite an effort by Cajun activists to stamp out the term, it can be found on T-shirts, hats, and bumperstickers throughout Acadiana, the 22-parish Cajun homeland in south Louisiana.

[Or maybe not -- John Cowan observes that

Research has since disproved Domengeaux's "conasse" etymology. Indeed, photographic evidence shows that Cajuns themselves used the term prior to the time in which "conasse" allegedly morphed into "coonass."[1] As a result, the origin of "coonass" remains uncertain.
[1] Shane K. Bernard, The Cajuns: Americanization of a People (Jackson: University Press of Mississippi, 2003), pp. 96-97.

Bernard shows a 1943 picture of a C-47 transport plane nicknamed the Cajun Coonass, which does refute the theory that the term was invented by continental French speakers to refer to Cajun G.I.s after D-Day. But if the etymology involves con(n)asse, it seems more plausible that the development took place in Louisiana among the Cajuns themselves, and the 1943 picture doesn't affect that possibility one way or the other. Still, it's clear that the origin of coonass is at best uncertain.]

[John O'Toole wrote to correct my translation of the French idiom "faire chier", literally "cause to defecate", which I rendered as "scares shitless", whereas it should have been "pisses off". Obviously, I need to improve my understanding of the cross-cultural associations of excessive vs. inadequate solid vs. liquid bodily wastes. This wouldn't be the first time that high-school French let me down.]

[Alex Price offers some additional information about the tone of the idioms involved:

I enjoyed your blog entry on the "connard de flic" controversy. My only demurral would be that in my opinion "connard" is quite a bit stronger than you suggest. In fact, I would rate it more offensive than "con," and Le Petit Robert agrees with me: "con" is just labeled "Familier," and in most contexts just means "idiot" or "jerk"; but "connard" is labeled "Vulgaire et méprisant." The contempt that it conveys is what makes it a much nastier word than "con." The best American English translation that I can think of is "asshole," which can also be very strong. In an American context, "twat" has always sounded a little comical to me, and so is not as good a translation, despite the referential correspondence with the French.
Also, "faire chier" doesn't quite mean "piss off." Le Robert gives "embêter" and "ennuyer" as synonyms, which is right. For me the expression "Ah, tu fais chier!" is about equivalent to "You're a pain in the ass!" To annoy or to bother someone is not quite the same as to piss them off, although one often leads to the other!

]

Posted by Mark Liberman at 08:18 AM

September 11, 2007

What's it all about?

Some time ago, Mark Liberman came across a peeve about at about in expressions like at about 10:30, and countered that there was nothing wrong with it -- it means 'at approximately', which is neither incoherent nor redundant -- and is attested in the writing of eminent authors over the centuries. Garner's Modern American Usage finds no fault with it, nor does Merriam-Webster's Dictionary of English Usage. Yet, as MWDEU notes, a long list of manuals condemn the usage -- a fact that itself calls out for some explanation. But first, some words from Zippy the Pinhead about about:

(MWDEU covers most of the territory I'm about to discuss, citing especially Bergen and Cornelia Evans, A Dictionary of Contemporary American Usage (1957), which -- unlike most of the other manuals -- is right on target.)

The use of about in at about is approximative, roughly as in the last panel of the Zippy strip. The uses in the first two panels are ordinary uses of about as a preposition (P) taking NP objects -- what I'll refer to as an OPERATOR use of Ps. (The third panel has hard-to-classify idiomatic uses.) But in the last panel, about is functioning as an ADVERBIAL, in this case modifying predicate adjectives. A number of Ps -- about, around, under, over, for instance -- have adverbial uses, often more than one kind of adverbial use; all four of the Ps I just listed can serve as modifiers of numerical or time expressions of one kind or another, as in:

(1) About/Around/Under/Over ten people came to the party.

(2) We'll leave in/for/after about/around ten minutes.

(3) We'll stay here until about/around 10 o'clock.

(Different Ps have somewhat different syntax.)

Now, if you fail to appreciate the difference between operator Ps and adverbial Ps, and take all Ps to be operators, you'll think that at 10:30 has one operator P, conveying a (relatively) exact time, and that about 10:30 has another operator P, conveying an approximate time. As a result, you'll find at about 10:30 to be incoherent, as the griper Mark Liberman cited did.

The mistake in the reasoning here almost surely stems from a failure to distinguish syntactic CATEGORIES (like P) from syntactic FUNCTIONS (like operator and adverbial). We've commented here many times on the confusions and misunderstandings that result from not distinguishing category and function. To appreciate the syntax of English possessives (like Mary's in Mary's father), for example, you need to recognize that they are NPs (and not adjectives or adjective phrases), but NPs functioning as determiners (a type of noun modifier, distinct from adjectivals) rather than as arguments (subjects, direct objects, objects of prepositions, etc.). In fact, the category/function distinction plays an important role in the next chapter of the at about story.

The most common critique of at about seems to be that it's redundant (rather than incoherent); the at should be omitted because it's redundant. You get to this conclusion in five steps:

(a) the observation that the at in at about is omissible, in the sense that the versions with and without the at are both grammatical and don't differ significantly in meaning;

(b) the claim that, in general, elements that are omissible in this sense are redundant, meaning that they repeat information;

(c) the claim that redundancy, in this sense, is a bad thing; hence

(d) the conclusion that omissible elements should (or must) be omitted;

(e) in particular, the at of at about should (or must) be omitted.

(The careful reader will have noticed that step (d) is a special case of the famous Omit Needless Words principle.)

I would reject both claims (b) and (c) as general principles, but it's (b) I want to focus on here, because thinking about at about and similar expressions in terms of the superficial notion of omissibility in (a) and (b) leads people away from asking about the syntax of the expressions at issue. Let's do that now. I'll start with something reasonably simple, at 10:30 in

(4a) We met at 10:30.

This is pretty clearly a PP, with head P at and object NP 10:30, the whole thing functioning as an adverbial (which is what PPs mostly do). The only complexity is the nature of the object 10:30. Time Ps take objects denoting, among other things, locations in time (at / before / after noon), and when they do, the objects take adverbial rather than adjectival modifiers -- nearly, approximately, exactly, almost -- the same types of adverbials that modify numerical expressions (nearly / approximately / exactly / almost ten people). (There are many complexities here about which combinations of P, adverbial, and object occur.)

On to at about 10:30, as in

(4b) We met at about 10:30.

This is just like at 10:30 in (4a) -- a PP functioning as an adverbial (of location in time) -- except that the object about 10:30 of at contains the adverbial-P modifier about. It's entirely parallel to We met at approximately 10:30.

Now, finally, about 10:30, as in

(4c) We met about 10:30.

This is something of a surprise; about 10:30 is, according to everything I've said so far, a NP, just like about 10:30 in (4b), but here it's functioning on its own, without an operator P, as an adverbial (of location in time). It's what we call in the syntax trade a BARE NP ADVERBIAL; its category is NP, but its function is adverbial. In English, most NPs cannot serve as bare NP adverbials (exactly 10:30, for example, cannot), but some can, and a few of these alternate with P-marked variants, as in

(5a) We met Sunday. [bare]
(5b) We met on Sunday. [P-marked]

About 10:30 (and about in combination with many other time expressions) is like Sunday in allowing a bare variant.

There's a huge literature on bare NP adverbials in English; it's a complex topic, with lots of fascinating wrinkles. But for my purposes here, it's enough to point out that what's notable about at about vs. plain about is not that there's a P-marked variant, but that there's a bare variant; "omissibility" is the special case.

[Addendum 9/12/07: Eli Morris-Heft reports that (4c) is completely unacceptable to him, which is to say that about 10:30 and the like are not available for him as bare NP adverbials in VP-final position, or perhaps in general. There's plenty of variability in the sets of bare NP adverbials that individual speakers have; this might be just another data point. Clearly, a great many speakers have no problem with bare NP adverbials in about -- or otherwise manuals wouldn't be recommending them as replacements for P-marked adverbials.]

And there's no redundancy in the P-marked variant. The P makes explicit the relationship between the time denoted by its NP object and the time of the situation denoted by the clause the PP modifies. In the bare variant, in contrast, this relationship is merely implicit; in a sense, the bare variant is underinformative (rather than the P-marked variant being redundant). The relationship between the P-marked and bare variants is then parallel to many other cases of explicit vs. implicit marking -- for example, (explicit) that-marked complement clauses (I realize that pigs can't fly) vs. (implicit) unmarked complement clauses (I realize pigs can't fly).

What's gone wrong in so many advice manuals is that they've focused on omissibility (in sequences of Ps, in particular), treating this mechanically as a test for redundancy, without an appreciation of the syntax and semantics involved.

Here's one treatment of at about from this literature -- in Roy Copperud's American Usage and Style (1980), where on p. 301, in a subsection of the entry on "piled-up prepositions", we are told:

Single preposition also sometimes superfluous:

- of: omit in "A low temperature (of) near 45 degrees"
- from: omit in "received (from) two to four inches of snow"
- at about: omit at from "(at) about 9:00"

(I've left in the of and from cases as extra entertainment for the reader.)

Note, first, that the usage in question is implicitly referred to higher-level principles -- piling up prepositions is, in general, a bad thing, and omissible material should be omitted (because it's "superfluous") -- though in fact the actual advice consists of a list of very specific cases, which don't hang together. No one actually proposes that sequences of Ps are in general a bad thing (instead, particular sequences are proscribed), and no one actually insists that omissible material should always be omitted (instead, omission is prescribed in certain specific constructions). You could spend hours collecting examples of perfectly impeccable sequences of Ps, of several different kinds (Sandy took the box from under the table; Terry walked out of the house; etc.). And to insist on omission wherever possible would be to insist, among other things, that explicit marking should never be used when implicit marking is available; in particular, bare adverbials would have to be used instead of P-marked adverbials whenever both are available: (5b) out, (5a) in.

Second, though the principles appealed to in the manuals are over-general, sometimes absurdly so, the actual recommendations are often over-particularistic, focusing on a few cases while disregarding other entirely parallel ones. What Copperud and other handbooks say about at about should carry over almost entirely to at around, though the manuals don't mention around. The same holds of adverbial about in the object of Ps other than at which are "omissible" on occasion, as in the following pairs:

(6) On about Sunday, things will get worse. / About Sunday, things will get worse.

(7) In about June, things will get worse. / About June, things will get worse.

(8) I waited for about two hours. / I waited about two hours.

Third, these very particularistic recommendations are mostly framed in terms of linear strings -- the sequence X Y is to be avoided when one of them is omissible -- though they should really be framed in terms of structures or constructions, and (more important) the notion of "omissible" isn't made explicit. In the case of at about, some occurrences are totally irrelevant: the person I yelled at about the failures, where at and about belong to different, parallel, constituents. Some fail one or the other clause in the definition of "omissible", even though the structures are more or less of the right sort: I aimed at about ten targets, where the version without at is not grammatical (unless you're someone who can aim targets at things), because the at of aim at expresses goal rather than location; At about the corner, I fell down and At about these rates, indebtedness will decline in ten years, where the at is not omissible even though it expresses (metaphorical) location; and I yelled at about ten kids, where the version without at is grammatical, but is not even approximately a paraphrase of the version with at (goal rather than location again).

The problem is that the manuals give you generalizations -- high-level ones (Omit Needless Words) or specific ones (omit at in at about) -- and one or more instances of the generalizations, but nothing to indicate the limits of the generalizations. In effect, they're saying

Follow this advice, unless that would be wrong.

and they're obliging the reader to try to divine their intentions from the examples they give. Not very helpful.

Sometimes, I'd guess, the usage advisers are simply unaware of the complexities in their advice or don't understand the details of the constructions they're talking about. Other times, I'd imagine, they're aware that more needs to be said but shrink from introducing technicalities, or (relying on the conceptual apparatus of "traditional grammar") they just don't have adequate vocabulary to convey those technicalities. After all, how many people know about bare NP adverbials?

As I've said before, these nuggets of advice all start from specific events: somebody noticed a class of examples, judged them to be in some way imperfect, and then formulated a principle to appeal to in proscribing them for others -- a process that promotes both overspecificity (focusing only on the motivating examples) and overgeneralization (leaping to abstract characterizations of the perceived problem). These proposals aren't seen as hypotheses about how the language works, but just as bits of advice about what to avoid in your language. Then, in some cases, these ideas disseminate, as ideas do in communities.

Which brings me to my fourth point: as I've also said before, the advice manuals, unsurprisingly, tend to share opinions and attitudes. Through their influence and the influence of editors and teachers, some usages get picked out for special opprobrium, out of all proportion to their significance in the larger scheme of things (why should anyone care about the saving of the little word at in at about?), sometimes without reference to the practice of "better authors" (recall that at about has a long and distinguished history), and often without connection to very similar usages (at about gets bad press, at around and for about escape notice). Ordinary people develop a prejudice -- a pet peeve -- against these usages, complete with viscerally unpleasant responses to them. There are fashions in pet peeves, linguistic pet peeves included, as in other things.

Finally, a note about how I got interested in at about in the first place. One of my current projects is to investigate instances of Omit Needless Words and Include All Necessary Words advice in the manuals. (I'm now thinking of it as the OI! project: Omit!/Include! Oi.) ONW and IANW figure in usage advice in two different ways. Sometimes it seems pretty clear that a usage is deprecated for social reasons -- because of the people who use it or the contexts in which it's used. The primary objection is that the usage is non-standard, specific to some social group or region, innovative (or perceived to be so), informal, or mostly spoken rather than written, and this objection is bolstered by a SECONDARY appeal to some general principle, for instance ONW or IANW. (The reasoning here goes both ways: in a widespread piece of language ideology, the standard, general, established, formal, written language is taken to be intrinsically good, so that variants that are non-standard, restricted, innovative, informal, or spoken are EXPECTED to be intrinsically defective in some way: sloppy, vague, redundant, illogical, etc.)

Examples: the various sorts of "intrusive" of -- notably, in off of and other combinations of prepositions with of (It fell off of the table), and in exceptional degree modification (That's too narrow of a topic for a paper) -- are judged by many to be non-standard, or at least innovative, informal, or spoken. ONW is then appealed to as a backing for the advice that these usages are to be avoided. Meanwhile, the determiner a couple without of (A couple people complained rather than A couple of people complained) is judged by some to be non-standard, or at least innovative, regional, informal, or spoken. IANW is then appealed to as a backing for the advice that this use is to be avoided.

Sometimes, however, appeals to general principles (like ONW and IANW) lack any evident social basis; these are PRIMARY appeals. The objection to at about as a violation of ONW seems to be of this sort, as does the objection to the then of if ... then as a violation of ONW. (I hope to post again soon on the then case.)

My hypothesis in the OI! project is that secondary appeals to ONW and (especially) IANW considerably outnumber primary appeals. To investigate this hypothesis, I've had an intern, Rachel Cristy, inventorying OI! appeals in a collection of manuals; it was Rachel who pointed me to at about as an example of a primary appeal to ONW that is, or at least was, surprisingly popular (perhaps as a result of a fashion in peeves). (My thanks to the office of the Vice Provost for Undergraduate Education at Stanford for funding Rachel's internship.) Meanwhile, I code and tabulate.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:04 PM

So "smoke" is weather after all

Back in August I commented on Google's weather report, noting that it referred to Montana's forest fire smoke as "fog." Among other things, I wondered about the inventory of weather terms and whether "smoke" really qualifies as a category of fog. Or whether it's really weather at all, for that matter. Sim Alberson, a meteorologist, now writes to clarify that there is indeed an international standard for weather terminology put out by the World Meteorological Organization, part of the United Nations. Be sure to check the link. It's very pretty.

Sure enough, "smoke" is included, along with 40 other different types of what Sim refers to as "frozen precipitation." You'll find "smoke" in box number 4 on the chart of accepted terms and symbols. So I guess it's official. "Smoke," "dust," (see boxes 6, 7, 8, 9, 30, 31, 32, 33, 34, and 35) and "sand" (see boxes 9, 30, 31, 32, 33, 34, and 35) are considered weather, at least by the WMO and the UN.

Now why meteorologists call this "frozen precipitation" remains unclear, at least to me. Maybe Sim can help explain this.

Update: John Maline informs me that the government also uses "smoke/haze" as one of its categories.

Posted by Roger Shuy at 11:59 AM

Lexical choice today

Posted by Mark Liberman at 09:09 AM

Linguistic Advice in the Lavatory: Speaking Mandarin is a great convenience for everyone

[Guest post by Victor Mair]

This is a 1950s-public-service-ad style placard that appears above the urinals in all the men's bathrooms at Capital Normal University in Beijing. Similar displays appear in all the women's bathrooms as well.

The particular circumstances of this slogan call for further analysis. When we look more closely at the entire message, a number of interesting aspects emerge.

First of all, the handsome young man is enjoining everyone to speak Mandarin **in Beijing**. This must mean that a lot of people at this university and elsewhere in Beijing (much less outside of Beijing, which is supposedly the epicenter of Mandarin usage in China!) do not speak Mandarin to each other.

The campaign against multilingualism is underscored by the fact that the spokesman pictured here might be a translator, which would not be necessary if everyone in China spoke the same language, viz., Putonghua (Modern Standard Mandarin [MSM]), the designated national tongue. Even if he's not a simultaneous translator with the headset of his profession, one wonders why an operator, an announcer, or whatever he's supposed to be, is pictured making this particular gesture and wearing that type of headset.

A further irony is that the administration of the University felt it necessary to post this slogan both in English and in Mandarin, which raises the very real questions of HANZI literacy and the emerging role of English as a rising lingua franca of convenience (as it is in the world's other most populous country, India).

Most amazing of all, however, is an amusing, yet subtle and perhaps subconscious, pun in the Chinese slogan:

说好	普通话
SHUO1HAO3	PU3TONG1HUA4
speak well	Mandarin

方便	你	我	他
FANG1BIAN4	NI3	WO3	TA1
convenient	you	me	him
	(i.e. everybody)

The grammar of the second line makes "convenient," which is normally a stative adjective, into a causative verb: "cause / make convenience [for]."

Now, FANGBIAN is a translation of Sanskrit upāya, that is to say, "skillful means," "skill-in-means," which implies teaching at a suitable level for or with devices appropriate to one's auditor. In medieval Chinese Buddhist monasteries, FANGBIAN became a euphemism used by monks who wanted to go to the toilet, but it soon spread to the general populace. Still today, in colloquial parlance, FANGBIAN can mean "go to the toilet," and DA4BIAN4 ("great convenience") indicates defecation, while XIAO3BIAN4 ("lesser convenience") signifies urination.

All in all, this is a remarkable sign that is posted at Capital Normal University, a sign that raises a host of questions, many of which cannot be fully answered without knowing the minds of those who thought that the message was vital enough to be posted above all the urinals and stalls in the WCs at one of Beijing's most important universities.

(Photograph courtesy of David Moser)

Posted by Mark Liberman at 06:28 AM

September 10, 2007

This parrot is no more!

Fans of talking parrots will be saddened by the news that Alex, "a parrot who had a way with words", died last week. Alex was a parrot in Dr. Irene Pepperberg's lab (as noted in this very early post), and according to the obituary-like piece in the NYT, "Alex showed surprising facility. For example, when shown a blue paper triangle, he could tell an experimenter what color the paper was, what shape it was, and -- after touching it -- what it was made of." This sort of claim is balanced in the same NYT piece by the following:

Other scientists, while praising the research, cautioned against characterizing Alex's abilities as human. The parrot learned to communicate in basic expressions -- but it did not show the sort of logic and ability to generalize that children acquire at an early age, they said. "There's no evidence of recursive logic, and without that you can't work with digital numbers or more complex human grammar," said David Premack, a professor emeritus of psychology at the University of Pennsylvania. (Link to Premack's homepage added.)

Of possible further interest to Language Log readers is the following error in the NYT piece, obviously introduced by an editor's word substitution:

[Alex] demonstrated off some of his skills on nature shows, including programs on the BBC and PBS.

That's obviously supposed to be showed off, and my guess is that the editor either thought it was too colloquial-sounding or perhaps didn't like the repetition with "nature shows" a few words later. But instead of replacing all of showed off with demonstrated, the off was left behind, resulting in the ungrammatical demonstrated off (at least it's ungrammatical to me; if it's not to you, I'd love to hear about it.)

The NYT piece ends with the following touching account of Dr. Pepperberg's final interaction with Alex.

Even up through last week, Alex was working with Dr. Pepperberg on compound words and hard-to-pronounce words. As she put him into his cage for the night last Thursday, Alex looked at her and said: "You be good, see you tomorrow. I love you."

He was found dead in his cage the next morning, and was determined to have died late Thursday night.

It's certainly not unimpressive that Alex had apparently learned to associate the quoted string of words above with being put back in his cage for the night. Call me callous, but I can't help but think that Alex's last words would have been very different if Dr. Pepperberg and her associates had taken to saying "see you later, bird-brain" to Alex every night.

[ The title of this post is brought to you by the Judean People's Front and the People's Front of Judea. ]

[Update -- a reader sent this note:

For your files on journalists and quotations. Did one of these papers confuse the person with the parrot?
"Death of gifted parrot stuns scientists", Boston Globe, 9/11/2007:
Pepperberg said she and Alex went through their good-night routine, in which she told him it was time to go in the cage and said: "You be good. I love you. I'll see you tomorrow." To which Alex said, "You'll be in tomorrow."
New York Times: (as quoted above):
Even up through last week, Alex was working with Dr. Pepperberg on compound words and hard-to-pronounce words. As she put him into his cage for the night last Thursday, Alex looked at her and said: "You be good, see you tomorrow. I love you."

It's the synoptic problem all over again... ]

[And the plot thickens, as the same reader points out. There is no correction in the NYT, but an editorial by Verlyn Klinkenborg ("Alex the Parrot") muses on the alleged fact that Alex's last words were "I love you".

For us, language is everything because we know ourselves in it. Alex's final words were: "I love you."
There is no doubt that Alex had a keen awareness of the situations in which that sentence is appropriate -- that is, at the end of a message at the end of the day. But to say whether Alex loved the human who taught him, we'd have to know if he had a separate conceptual grasp of what love is, which is different from understanding the context in which the word occurs. By any performative standard -- knowing how to use the word properly -- Alex loved Dr. Pepperberg.

The interesting thing about this is that it's much clearer that Alex loved Irene Pepperberg than it is what Alex's last words, if any, actually were.]

Posted by Eric Bakovic at 05:33 PM

Morning mailbag

I'm way behind in dealing with LL-related email -- apologies to those of you that I haven't answered. A few especially interesting links came in yesterday, and I don't have time to do them justice, so I'll just dump them here with a couple of brief comments.

From Pat Schwieterman:

America's finest news source seems to have picked up on "moist" aversion. The latest copy of The Onion (9/6/07) has the following brief "correction":

Last week, we promised never to say "moist" again. That was incorrect. We are going to say "moist" two more times in this issue, and after we say "moist" those two times, we might say "moist" again in the future. If we feel it, we'll say it. The Onion apologizes for your discomfort.

I wasn't able to find the passage in a quick search of the online version, but it's at the bottom of page 10.

This might be a lift from Language Log, but with The Onion it's hard to say -- they too are marvelously in tune with the zeitgeist.

I couldn't find The Onion's "moist" note on the web either, but I did stumble on Zach Caldwell's "Bro, You're a God Among Bros", which has nothing to do with word aversion, but is the densest presentation of punning neologisms that I've seen in a long time. Actually, they're not exactly neologisms, and I'm not sure what the word for them is -- I'm talking about things like

You are the king of all bros. Brotankhamen. You are the Ayatollah Bromeini. You are Broseidon, lord of the brocean.

and

I've long admired your absolute broficiency in all things bro-related, and the way you've always carried yourself in a brofessional manner. I consider you a brole model.

This is a mode of speech that I associate (perhaps unfairly) with young male underachievers -- another example would be the sequence in Wayne's World where Dana Carvey (as Garth) says things like "She's such a babe, if she were president she'd be Babraham Lincoln". The idea seems to be that a certain word is raised to such a level of mental activation that it takes over the rest of the vocabulary through substitution for phonetically-similar syllables in semantically-associated words and phrases. (If you think you know what this rhetorical style is called, tell me.)

[Update -- Andy "the Androidinator" Hollandbeck writes:

The style you attribute to Garth of Wayne's World was taken to a whole new level with the introduction of Rob Schneider's character Richard "The Richmeister" Laymer. You can find a transcript of one of his sketches here.
If there isn't already a name for this annoying habit, I suggest "Laymerism," which has a certain resonance, considering that each new "nickname" is lamer than the last.

There's certainly a connection, but the Richard Laymer link suggests that this trope is only used for creating nicknames, which seems slightly off the mark.

Greg Stasiewicz suggests that there might be a connection to Smurf Language.

And Seth Kleinerman brings another source into the discussion:

I wonder whether you know Ludacris's seminal 2000 work, "Ho," from his album called, um, "Incognegro." In it, he uses a very similar strategy to the one under discussion in serenading the listener or some third party with claims that she is, in fact, a ho.
"Reach up in tha sky for tha hozone laya" is one lyric. But altogether the effect is diminished in print; maybe the 30-second preview clip on iTunes might illustrate the point better.

]

From Don Blaheta:

In this week's Straight Dope, Cecil Adams does a lovely job debunking the myth that Americans' vocabulary has shrunk 60% in sixty years: "Does the American student have less vocabulary today than in days gone by", 9/7/2007.

Once again, the journalists fail to do their jobs (this time in a flub dating back to 1984), and the slack gets picked up by columnists and bloggers, although the damage is already done, sigh.

It's indeed an excellent piece, as usual from Cecil. But just to keep the numbers straight, I think he identifies the origin of the flub as an item in Harper's Index from 1990, which referenced studies from 1945 and 1984.

(Harper's Index might be responsible for introducing more dubious factoids into American culture than any other source, despite (or perhaps because of) their practice of footnoting sources.)

For more background on how to lie with numbers about vocabulary usage, see "Britain's scientists risk becoming hypocritical laughing-stocks, research reveals", "Vicky Pollard's revenge", and "An apology to our readers".

Dick Oehrle, an old friend who's now Chief Linguist at Cataphora, pointed me to a Red Herring article from 8/24/2007 that I somehow missed, "DC Madam Bets on Valley Firm":

Accused of running a high-priced prostitution ring that catered to the rich and powerful in Washington, Deborah Jeane Palfrey, dubbed the “D.C. Madam,” earlier this year threatened to name names. But political elites breathed a sigh of relief when a television reporter reviewed a list of her clients and, save for a couple of exceptions, declared the names to be of little interest.

Now the woman at the center of Washington’s most titillating scandal in years is hoping to bolster her defense by hiring a small Silicon Valley search and data analysis company that she hopes will be able to mine a treasure trove of phone records, Congressional papers and other documents to draw up a much longer list of her clients.
That could put Cataphora, a relatively unknown Redwood City, California-based company, at the center of one of the most closely-watched scandals in Washington in recent memory.

“It could be the cornerstone of our defense,” Ms. Palfrey's attorney, Montgomery Blair Sibley, said of Cataphora’s work.

Privately-held Cataphora will analyze thousands of pages of Ms. Palfrey’s old telephone records—which do not have names attached to them—to names and numbers subpoenaed from telephone companies. Ms. Palfrey has said she knew most of her customers by first names or aliases only. And Mr. Sibley said her telephone records are the only evidence she has not yet destroyed.

Ms. Palfrey was charged in March with running a prostitution ring but she claims that her firm, Pamela Martin and Associates, was a legitimate escort service that catered to the erotic fantasies of up to 15,000 customers—but did not provide sexual services. Mr. Sibley said that without new names provided by Cataphora’s analysis, it could be difficult to call witnesses to support his client’s position.

“This is the best way (for me) to put people on the stand to say, ‘I was just getting a massage or sniffing underwear’ or whatever,” he said.

I'm sure that they're all looking forward to it.

(In fact, Dick was in town for a visit, and told me about the article in person.)

From time to time, a student asks me "but what can I do with a degree in linguistics?" Now I can give them an even more diverse list of career options.

Posted by Mark Liberman at 08:08 AM

September 09, 2007

Detained, not arrested

You know your legal theories are in trouble when a sports-oriented comic strip starts making fun of them.

The backstory here is that Tank McNamara, a former NFL lineman working as a sports broadcaster, has run afoul of the league's Code of Personal Conduct in a case of Road Rudeness:

The (fanciful) premise is that the league has "pseudo-police" who keep watch on players, even former players, using illegal surveillance techniques:

And (again fancifully) they don't arrest you, they "detain" you. Which, we've learned in the past six years, means no habeas corpus or other judicial recourse, plus possible waterboarding and the like.

The whole analogy between the NFL's Code of Personal Conduct, which the commissioner can use to suspend players more or less at will, and the various legal theories under which the Bush administration has claimed the right to carry out domestic surveillance without a warrant and to detain citizens indefinitely without charges, is more than a little strained. But then, so are those legal theories.

Posted by Mark Liberman at 07:34 AM

Pavarotti and the crack to chaos

The death of Luciano Pavarotti has brought out some of the hallucinatory anatomy, physiology and acoustics that singers and their teachers use to describe what they do. But it also brought a note from a reader with an interesting question about the emotional content of musical intervals and scales.

Daniel J. Wakin's "High C: The Note That Makes Us Weep", NYT, 9/9/2007, contains a few choice examples of vocal metaphors and allusions:

“The reason it’s so exciting to people is, it’s based on the human cry,” said Maitland Peters, chairman of the voice department at the Manhattan School of Music. “It’s instinctual. It’s like a baby. You’re pulled into it.” When a tenor sings a ringing high C, it seems, “there’s nothing in his way,” Mr. Peters said. [...]

Mr. Peters ... said the chest voice, the strongest source of sound, and the head voice, where the sound vibrates in the head’s cavities, must be perfectly balanced. The base of the tongue, the jaw, the larynx must all lie in just the right position, unrestricted by tension.

Mr. Pavarotti once described the feeling this way: “Excited and happy, but with a strong undercurrent of fear. The moment I actually hit the note, I almost lose consciousness. A physical, animal sensation seizes me. Then I regain control.”

The tenor Juan Diego Flórez, also acclaimed for his top, will sing “Fille” at the Met in April. He said in an interview on Thursday that he imagines a keyboard in his head, and reaches for the note there.

“You think very high,” he said. “You give a lot of space in your throat.”

But what led a reader to write me about this article was not the vocal-technique metaphors, but the echo of Baroque Affektenlehre in this remark about pitch and key:

The pitch, in itself, has a satisfying quality. The key of C major, after all, is a stable, cheerful, happy key, the one with no sharps or flats.

Lane Greene wrote:

As a musician I'm a bit struck by the bit about C major being the happiest of all keys, with no sharps or flats. No individual note, sharp or flat, should be any happier than any other, right? Any major key has the same interval between its notes - A-flat major should be every bit as happy as C major... or is there something I don't know about flats and sharps?

There's the classic line in Spinal Tap where Nigel describes D-minor as "the saddest of all keys". It's ridiculous because anyone knows D-minor shouldn't be any sadder than G-minor. Or not?

Well, not, at least historically. As the article on "Key" in Grove's explains:

Keys are often said to possess characteristics associated with various extra-musical emotional states. While there has never been a consensus on these associations, the material basis for these attributions was at one time quite real: because of inequalities in actual temperament, each mode acquired a unique intonation and thus its own distinctive ‘tone’, and the sense that each mode had its own musical characteristics was strong enough to persist even in circumstances in which equal temperament was abstractly assumed.

To see where this comes from, consider the circle of fifths, which leads us conceptually through the 12 chromatic keys of recent western music:

A perfect fifth corresponds to a pitch ratio of 3 to 2, and successive musical intervals correspond to multiplication of ratios, so that 12 perfect fifths makes a ratio of 3/2 raised to the 12th power, or about 129.7463.

This is unfortunate, because the assumption built into the circle is that 12 fifths equals 7 octaves, bringing us back to the same pitch class we started from, 7 octaves up. But the interval of an octave is a ratio of 2 to 1, and 2 to the 7th power is 128.
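The arithmetic in these two paragraphs is easy to check with exact rational numbers. Here is a minimal sketch in Python; the variable names are mine, chosen purely for illustration:

```python
from fractions import Fraction

# A perfect fifth is 3:2 and an octave is 2:1; stacking intervals multiplies ratios.
twelve_fifths = Fraction(3, 2) ** 12   # exact value of twelve stacked fifths
seven_octaves = Fraction(2, 1) ** 7    # exact value of seven stacked octaves

# The ratio by which twelve fifths overshoot seven octaves.
overshoot = twelve_fifths / seven_octaves

print(float(twelve_fifths))            # about 129.7463
print(seven_octaves)                   # 128
print(overshoot, float(overshoot))     # a ratio slightly greater than 1
```

Using Fraction rather than floating point keeps the two products exact, so the mismatch cannot be blamed on rounding.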

This divergence between 12 fifths and 7 octaves -- ((3/2)^12)/(2^7) ≅ 1.013643 -- has been known for more than two millennia as the Pythagorean comma. It's one of three crucial flaws in the mathematical fabric of reality that apparently formed part of the esoteric lore of the Pythagoreans. According to Thomas McEvilley, "The Shape of Ancient Thought: Comparative Studies in Greek and Indian Philosophies",

It requires a leap of horizon to understand the intensity with which such things mattered to ancient thinkers. ... The issue which made [the Pythagorean comma] so pressingly important was nothing less than the question .. whether reality is mathematical or not.

When Pythagoras discovered (or learned) the so-called Pythagorean Theorem, ... it is said that he hastened to sacrifice oxen. He felt that he had touched on a power center in the mathematical fabric of the universe. ...

The Pythagorean Theorem is the threshold to the discovery of irrational numbers and incommensurable lengths -- a discovery which Hellenists attribute to Hippasus of Tarentum, a renegade Pythagorean whom, according to one account, Pythagoras pushed off a boat for revealing to outsiders the tragic secret of the Pythagorean Theorem, which was irrationality or incommensurability. ... The discovery that the side and diagonal of a square will always be incommensurable produced an ideological convulsion in the Pythagorean order comparable to the shock conveyed by the discovery of the Precession or the Pythagorean comma. ... Like the Precessional drift and the Pythagorean comma, this apparent crack or gap in the mathematical fabric of the universe seemed ominous, as if such cracks lead through the membrane of order to chaos. They deny that the universe is orderly and hence that it is cognizable, and thereby remove credibility from all human thought. The Precession threatens the calendar and all that depends on it, and through the Pythagorean comma, as through a crack to chaos, the plethora of untuned sounds that could disrupt the harmony of the universe flows in.

The notion that small-integer musical intervals also play an important role in the melody of speech recurs stubbornly. Sometimes the idea is that different emotional or attitudinal states are associated with different musical intervals -- I posted last year about some Dutch research that claimed to find that sad people speak in minor keys ("Poem in the key of what", 10/29/2006). Another recurrent idea is that different languages have different characteristic intervals or scales -- R.A. Hall once argued that Sir Edward Elgar never became popular outside of the U.K. because he favored intervals that are peculiarly common in British speech ("Elgar and the intonation of British English", Gramophone 31(6), 1953). (Actually, in fairness to Hall, he claimed only that Elgar's frequent use of melodic leaps echoed the typically wide pitch range of British speech.)

In evaluating these ideas, I think we can safely say that if Pythagoras had based his cult on the role of number theory in human speech, the problems that led to Hippasus of Tarentum's unfortunate maritime accident would never have arisen. If there is any secret knowledge here, it would have to be a clever method for finding small integers in the intervals of speech, in the face of the straightforward observation that linguistic tone and intonation (appear to?) involve glissandi among freely gradient pitch values.

Returning to that embarrassing Pythagorean comma, to see what it means for the scales in different keys, consider what a properly-tuned (i.e. "just") diatonic scale is like. The scale degrees correspond to small-integer ratios of pitches, as indicated in the table below.

	Interval name	Just interval	Just interval relative to 1
C	unison	1:1	1
D	(major) second	9:8	1.125
E	(major) third	5:4	1.250
F	fourth	4:3	1.333...
G	fifth	3:2	1.500
A	sixth	5:3	1.666...
B	major seventh	15:8	1.875
C	octave	2:1	2

Within the diatonic scale, some of the internal intervals work out exactly -- thus F to C is (2/1)/(4/3) = 3/2, just as it should be, and G to D is also (18/8)/(3/2) = 3/2.

But there are worrisome symptoms already here. If we go up by a perfect fifth from D to A, for example, we would get (3/2)*(9/8) = 27/16 relative to C, which is not the 5:3 ratio of a major sixth. Turning it around the other way, the interval between justly-tuned D and justly-tuned A (in the key of C) is (5/3)/(9/8) = (5/3)*(8/9) = 40/27, which is by no means 3/2. We'll see in a minute what this means for other intervals, if we move to the key of D major without re-tuning.

If we build on these intervals to fill in the rest of the justly-tuned chromatic scale we get something like this:

	Interval name	Just interval
C	unison	1/1
C#	minor second	16/15
D	major second	9/8
D#	minor third	6/5
E	major third	5/4
F	fourth	4/3
F#	diminished fifth	7/5
G	fifth	3/2
G#	minor sixth	8/5
A	major sixth	5/3
A#	minor seventh	16/9
B	major seventh	15/8
C	octave	2/1

But now consider the derived ratios in (for example) the key of D major, compared to C major:

	Interval name	Just interval	Just interval relative to 1		Derived interval	Derived interval relative to 1
C	unison	1:1	1	D	(9/8)/(9/8) = 1:1	1
D	(major) second	9:8	1.125	E	(5/4)/(9/8) = 10:9	1.111...
E	(major) third	5:4	1.250	F#	(7/5)/(9/8) = 56:45	1.244...
F	fourth	4:3	1.333...	G	(3/2)/(9/8) = 4:3	1.333...
G	fifth	3:2	1.500	A	(5/3)/(9/8) = 40:27	1.481..
A	sixth	5:3	1.666...	B	(15/8)/(9/8) = 5:3	1.666...
B	major seventh	15:8	1.875	C#	(2*16/15)/(9/8) = 256:135	1.896...
C	octave	2:1	2	D	(2*9/8)/(9/8) = 2:1	2

Four of the seven scale intervals are different!
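The comparison can be reproduced mechanically. The following Python sketch takes the just chromatic ratios from the table above and derives the scale-degree intervals of C major and D major without re-tuning. The helper name scale_intervals and the octave-wrapping rule are my own illustrative choices, not part of any standard library:

```python
from fractions import Fraction as F

# Just chromatic scale relative to C, copied from the table above.
JUST = {
    "C": F(1), "C#": F(16, 15), "D": F(9, 8), "D#": F(6, 5),
    "E": F(5, 4), "F": F(4, 3), "F#": F(7, 5), "G": F(3, 2),
    "G#": F(8, 5), "A": F(5, 3), "A#": F(16, 9), "B": F(15, 8),
}

C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]
D_MAJOR = ["D", "E", "F#", "G", "A", "B", "C#"]

def scale_intervals(notes):
    """Interval of each scale degree above the tonic, without re-tuning."""
    tonic = JUST[notes[0]]
    intervals = []
    for name in notes:
        ratio = JUST[name] / tonic
        if ratio < 1:      # the note lies in the next octave up
            ratio *= 2
        intervals.append(ratio)
    return intervals

c_intervals = scale_intervals(C_MAJOR)
d_intervals = scale_intervals(D_MAJOR)

# Collect the scale degrees whose interval differs between the two keys.
mismatches = [
    (degree + 1, str(c), str(d))
    for degree, (c, d) in enumerate(zip(c_intervals, d_intervals))
    if c != d
]
for row in mismatches:
    print(row)
```

Each printed row gives the scale degree, its interval in C major, and the interval that the same degree gets in D major; the degrees that do not appear (the fourth and the sixth) are the ones that survive the change of key.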

On this approach, each of the twelve chromatic keys will contain different internal intervals.

Violinists and singers can re-tune the intervals when they modulate, but keyboard players and players of fretted string instruments (like viols) are stuck, and wind players are limited in what they can do to change the pitches they play.

Because some of the keys that result from this problem are not just different, but unpleasantly sour-sounding in some of their crucial internal intervals, various schemes have been developed over the centuries to "temper" the tuning of the different chromatic scale degrees, so as to spread the problem across different key signatures to some extent. The most consistent method for doing this is "equal temperament", in which all twelve semitones are set to exactly the twelfth root of 2, a ratio of approximately 1.0595:1. Thus a tempered fifth is not 1.5:1, but rather 2^(7/12) ≅ 1.498; a tempered major third is not 1.25:1, but 2^(4/12) ≅ 1.260.
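To see how far the tempered intervals drift from the just ones, here is a small Python sketch. The function name tempered and the particular list of intervals compared are illustrative assumptions of mine:

```python
# One equal-tempered semitone is the twelfth root of 2.
SEMITONE = 2 ** (1 / 12)

def tempered(semitones):
    """Frequency ratio of an interval spanning the given number of semitones."""
    return 2 ** (semitones / 12)

# (name, size in semitones, just ratio)
comparisons = [
    ("major third", 4, 5 / 4),
    ("fourth", 5, 4 / 3),
    ("fifth", 7, 3 / 2),
    ("octave", 12, 2 / 1),
]

for name, semitones, just in comparisons:
    print(f"{name:12s} tempered {tempered(semitones):.4f}  just {just:.4f}")
```

The fifth comes out very slightly flat and the major third noticeably sharp, while the octave is exact; that is the compromise equal temperament makes.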

In equal-tempered tuning, the internal intervals of all keys are exactly the same. But equal temperament didn't become the norm until the 19th century -- before that, other systems of temperament were used, in which the internal intervals in different keys in fact were different.

It's probably because of this that the Baroque era's "Theory of the affects" (Affektenlehre) included a component based on choice of key. According to Grove's definition, this was

In its German form, a term first employed extensively by German musicologists, beginning with Kretzschmar, Goldschmidt and Schering, to describe in Baroque music an aesthetic concept originally derived from Greek and Latin doctrines of rhetoric and oratory. Just as, according to ancient writers such as Aristotle, Cicero and Quintilian, orators employed the rhetorical means to control and direct the emotions of their audiences, so, in the language of classical rhetoric manuals and also Baroque music treatises, must the speaker (i.e. the composer) move the ‘affects’ (i.e. emotions) of the listener. It was from this rhetorical terminology that music theorists, beginning in the late 16th century, but especially during the 17th and 18th centuries, borrowed the terminology along with many other analogies between rhetoric and music. The affects, then, were rationalized emotional states or passions. After 1600 composers generally sought to express in their vocal music such affects as were related to the texts, for example sadness, anger, hate, joy, love and jealousy. During the 17th and early 18th centuries this meant that most compositions (or, in the case of longer works, individual sections or movements) expressed only a single affect. Composers in general sought a rational unity that was imposed on all the elements of a work by its affect. No single ‘theory’ of the affects was, however, established by the theorists of the Baroque period. But beginning with Mersenne and Kircher in the mid-17th century, many theorists, among them Werckmeister, Printz, Mattheson, Marpurg, Scheibe and Quantz, gave over large parts of their treatises to categorizing and describing types of affect as well as the affective connotations of scales, dance movements, rhythms, instruments, forms and styles.

Even after the adoption of equal temperament, there remain several possible reasons for associating moods and attitudes with keys. From the point of view of performers, different keys can lie very differently on their instrument, and therefore feel different to play. And from the point of view of some listeners, those with perfect pitch and synesthesia, different keys may also evoke very different associations. Even if all the intervals are tempered and therefore mathematically identical in every key, the pitches themselves are different.

[Jonathan Knibb writes:

This isn't an especially original or deep observation, but as a coda to your excellent post on key and temperament it would be worth mentioning that, whether or not there is any auditory reality to the perceived affective differences between keys, a widespread belief in such differences can become a self-fulfilling prophecy, as composers choose keys they feel appropriate to the nature of their music. Of course, this could happen also without any such belief, as long as a sufficient proportion of influential works appear to support certain associations. To take the quoted example of D minor, many of Mozart's works in that key (the opening of Don Giovanni, the piano concerto in that key, the Requiem, etc.) share a certain affective quality, difficult to define in words but distinct from his use of say G minor, and it would be surprising if that fact did not influence its later use, explicitly or subconsciously, whatever Mozart's own feelings may have been.

]

[Ray Girvan writes:

Another major mover in forming these "pitch = particular emotion" conventions was John Curwen, who popularised the Sol-fa system as a mnemonic to teach music to singers not up to reading a score.
Curwen stated (I don't know on what grounds) mental effects for each note of the scale: "Doh, the strong or firm tone; ray, rousing and hopeful; me, steady and calm; fah, desolate or awe-inspiring; soh, grand or bright tone; lah, sad or weeping tone; te, piercing or sensitive tone".
The Sol-fa system was immensely influential, and so many singers must have been told this as fact that I could well believe, as Jonathan Knibb says, that it could have turned into self-fulfilling prophecy.

]

[Randy Alexander writes:

Even in equal temperament, keys can have different characteristics simply because one key is higher or lower than another key. The dominant to tonic relationship in G is far away from the same dominant to tonic relationship in C. A melody sung in one key has notes that are easy, hard, resonant, not so resonant, etc. The head voice vs. chest voice mix will be different for every note, giving each key a markedly unique quality. Even on a piano, keys that are relatively higher are perceived as hollow, and relatively lower keys are muddy. For composers, choice of key has a strong relationship with what the music is actually doing. Simply because of range, some things sound great in one key but ridiculous in another.

True, but people who feel that different keys have different affects seem to feel that way about (say) B major vs. C major, where the range difference is small.]

Posted by Mark Liberman at 07:31 AM

September 08, 2007

Whom was that masked man?

Rhymes With Orange takes on social class and pronoun case:

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:45 PM

Language is a virus

To add to Mark Liberman's reflections on language as a virus, with bows to William S. Burroughs and Laurie Anderson... here are two of Tom Tomorrow's explorations into the subject, one from 2001 and the other from 2005:

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:13 PM

Types of truth

Last week, I commented on Rahm Emanuel's clever remark that "Alberto Gonzales is the first attorney general who thought the truth, the whole truth and nothing but the truth were three different things". I observed that the formal semantics and pragmatics of the traditional oath are subtle things, and asked for a logical translation of the joke ("Political semantics quiz", 8/28/2007).

The answers were surprisingly diverse -- summarizing crudely, some people think that the three sub-phrases are just lawyerish repetition for emphasis, while others think that they reference three different relations between statements and the world, or perhaps three different relations among speakers, statements and the world.

Mr. Emanuel's quip seems to rely on the redundancy theory being true, or at least being the view of all previous holders of Mr. Gonzales' office. I thought that this implication was probably wrong, although the joke is still funny.

Email has piled up faster than I can post it or answer it, as Barbara Partee has recruited several other interesting people into a lively discussion. I've posted a couple of the more interesting messages below.

Jerry Hobbs wrote:

Construct the following matrix:

	said	not said
true	A	B
false	C	D

To tell the truth: A > 0
To tell the whole truth: B = 0
To tell nothing but the truth: C = 0

(The first line depends on how you interpret "the" in "the truth". A better way of saying "A > 0" would be "Make sure there is some truth in what you are saying." Actually, I think "Tell the truth." is just a summary statement for what is elaborated in the next two statements; so it would really mean "B = 0 & C = 0".)
Since A/(A+B) is recall, promising to tell the whole truth is promising 100% recall. Since A/(A+C) is precision, promising to tell nothing but the truth is promising 100% precision.
I've always thought a good answer to that standard TV lawyer question, "Please answer the question yes or no." would be "I'm under oath to tell the whole truth. The whole truth is more complicated than just a yes or a no." But fortunately I've never been in a position to use this argument.
In fact, I think the Supreme Court has ruled that a witness is not required to tell the whole truth if he or she is not asked the right question.
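Jerry's matrix can be turned into a small calculation. The Python sketch below assumes, purely for illustration, that we could count the statements falling into cells A, B and C; the function name and the sample counts are invented:

```python
def recall_and_precision(a, b, c):
    """a: true and said; b: true but not said; c: false but said."""
    recall = a / (a + b)       # share of the truth that was told
    precision = a / (a + c)    # share of what was told that is true
    return recall, precision

# A hypothetical witness who says 8 true things, omits 2, and adds 1 falsehood.
recall, precision = recall_and_precision(a=8, b=2, c=1)

whole_truth = recall == 1.0            # holds only when B = 0
nothing_but_truth = precision == 1.0   # holds only when C = 0

print(recall, precision, whole_truth, nothing_but_truth)
```

On this reading, the oath demands a score of 1.0 on both measures, and a witness who says nothing false but leaves things out still fails the "whole truth" clause.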

I think that Jerry's initial picture of the semantics is similar to what I implied by suggesting that truth is like pie ("Do you swear to eat the pie, the whole pie and nothing but the pie?"). But he concludes that the first clause is really "just a summary statement for what is elaborated in the next two", whereas Tim Finin thought that clause #1 and clause #3 are synonymous, while clause #2 means something different. And I bet that Jerry meant truth to be constrained by relevance, since his analysis would otherwise require rather lengthy testimony in answer to any question (the number-theory segment alone would empty the courtroom and exhaust the witness...).

Larry Solan commented on the relevant SCOTUS decision, and more generally on the subtle philosophy of language involved in legal discussions of perjury:

The perjury cases are very interesting. The lead case is called Bronston v. United States, decided by the Supreme Court in 1973. Bronston had filed for bankruptcy. He was questioned under oath as follows:

Q. Do you have any bank accounts in Swiss banks, Mr. Bronston?

A. No, sir.

Q. Have you ever?

A. The company had an account there for about six months, in Zurich.

Q. Have you any nominees who have bank accounts in Swiss banks?

A. No, sir.

Q. Have you ever?

A. No, sir.

It turns out that he also had had a Swiss bank account in the past. He was convicted of perjury, but the Supreme Court reversed 9 - 0, on the theory that he didn't say anything literally false and that it was up to the questioning lawyer to pursue the truth. So far, implicature loses.
But there is a footnote in the opinion. It gives the following hypothetical situation, which it agrees does constitute perjury:

"The District Court gave the following example as an illustration only: If it is material to ascertain how many times a person has entered a store on a given day and that person responds to such a question by saying five times when in fact he knows that he entered the store 50 times that day, that person may be guilty of perjury even though it is technically true that he entered the store five times."

The Court comments: "it is very doubtful that an answer which, in response to a specific quantitative inquiry, baldly understates a numerical fact can be described as even "technically true." Whether an answer is true must be determined with reference to the question it purports to answer, not in isolation. An unresponsive answer is unique in this respect because its unresponsiveness by definition prevents its truthfulness from being tested in the context of the question -- unless there is to be speculation as to what the unresponsive answer "implies."
Furthermore, there is a more recent case by a court of appeals which held someone guilty of perjury for testifying truthfully when the questioner had misstated the question. The witness answered truthfully to the question as put (it contained a wrong date), but the witness clearly knew what the questioner had meant and purposely took advantage of the mistake to attempt to create a false impression. So the law seems to care about implicature just when further inquiry is not likely to undo the perlocutionary effect of deceit.
Peter Tiersma and I write about this in our book, Speaking of Crime, focusing largely on the Clinton scandal. I think it shows that even though lawyers don't know the vocabulary, they have a pretty good intuitive sense of the relevant linguistic concepts. As for the morality of cases like Bronston, that's a different matter.

I take all this to confirm my suspicion that Alberto Gonzales was by no means the first Attorney General to think that the truth, the whole truth and nothing but the truth are three different things. But he may have been uniquely ineffective at exemplifying the distinction in testimony to Congress.

[Update -- Bob Ray writes:

I was comforted by the following statement in your post:
"In fact, I think the Supreme Court has ruled that a witness is not required to tell the whole truth if he or she is not asked the right question."
Many years ago, I was an expert witness on spelling errors in a highly publicized kidnapping case. The word "approuch" appeared in the ransom note and in a letter written by one of the defendants years earlier. This spelling error was one of three key pieces of evidence against the two defendants.
I told the defense lawyers when they asked me to testify that I could think of a number of reasons why "approuch" is a reasonable way to misspell the word "approach," but that if I was asked if it was a "common" misspelling of the word, I would have to say I didn't think so. This was ante-Google and all I had to go on were compiled lists of common spelling errors.
When I was on the stand, the defense attorney asked me if it was a reasonable misspelling and I gave a number of reasons why I thought it was. The prosecutor asked me nothing at all. The defendants were acquitted.
I think I can rest easy about having been guilty of perjury, but I still wonder if I violated my oath to tell "the whole truth." In all these years, I've never seen the word spelled that way in the wild but it's also a comfort that Google shows over 29,000 hits for it (and 366 million for "approach").

]

Posted by Mark Liberman at 09:37 AM

White goods all over the world

[Note: This post is unusual in having been modified and expanded numerous times over its first day or two.]

As usual, language turns out to be a much more complex and interesting phenomenon than it first appears, and etymology (word or phrase origins) is one of the most difficult aspects of it. Even though I avoid advertising my email address (it filters out the spam robots and the stupid people, so I end up only getting mail from clever people like you), I have received mail from all over the world about the phrase white goods, which was the subject of my recent quiz question. The picture is cloudy, and neither the history nor the geography of the phrase is clear.

The answer to the quiz question, to begin with, is that white goods means large household appliances — refrigerators, washing machines, dryers, dishwashers, cookers — of the sort that used to be very commonly supplied with a white enamel finish as the only color option. The other meanings I simply made up — except that Wikipedia reports the use of the phrase white goods in American English to mean linen goods like sheets, and identifies the appliance meaning as mainly British. [I missed one additional meaning that the phrase has in another context, pointed out to me by Jonathan Lundell in the USA: in the alcoholic beverage trade, the spirits vodka, gin, tequila, and rum are called white goods, while bourbon, scotch, and other whiskies are called brown goods.] But Language Log readers have also been offering opinions, experiences, and added information about the household appliances meaning. I report some below.

Andrew White in Australia says the phrase has been in widespread use in Australia since at least the 1970's, because he remembers ads featuring it when he was a child, and his mother actually had to explain to him that not all white goods had to be white. (Notice, that means the phrase is an idiom: even with perfect knowledge of the meaning of the parts, you cannot figure out the meaning of the whole in a principled way by using general facts about English syntactic structure.)
On the other hand, Rory Turnbull in Scotland says that in 21 years of residence in Scotland he has never encountered the phrase there. Nonetheless, it was in Scotland just a few days ago that I saw the phrase in the fine print of the lease on the flat that is my new home. Rory may simply have missed it by not being involved with the sort of trades where white goods are mentioned in bulk.
Jussi Piitulainen in Finland couldn't guess the meaning (again, that's the hallmark of an idiom, of course), but notes that the answer is revealed in Collins' excellent COBUILD dictionary.
Patrick Heenan in Canada offers evidence that the phrase was current before I emigrated from Britain in 1980. He spent the summer of 1979 working for a department store in Canterbury, England, helping to deliver white goods to customers' homes, and distinctly remembers both that the term was new to him then, and that he and his co-workers were instructed not to use it when talking to customers, who would not understand because — and this is the surprise — it was a specialized term from American English (or so they thought).
Steve Jones in Sri Lanka [no, sorry, he's in Saudi Arabia; he gets about a bit] points out that people are regularly wrong about dialect origins, and we need corpus evidence as a corrective. He doesn't think white goods is limited to British English or necessarily came from the UK, having found a Time magazine example from 17th Jan 1950 (White goods (stoves, refrigerators, washing machines) should be painted), and three more from the 1950s and 1960s, only then the word disappears from view, and the next hit in Time is from 2002 (in Asia, Europe and the U.S., which produce "white goods" from washing machines to microwaves). To cap it all, he says he first encountered the phrase in Spanish translation.
And talking of translations of the idiom into other languages, Cihan Baran, from Turkey, but currently at Stanford University in California, provides the remarkable piece of information that exactly the same usage is found in Turkish.
What's more, Tako Schotanus says witgoed in Dutch has the same sense.
Leslie Decker says there is a direct equivalent in Czech as well: bílé zboží.
Nick James tells me that the Danish equivalent hvidevarer (lit. white wares/goods) is very common in Denmark (distinguishing between hard white goods, which are appliances, and soft white goods, which are linens etc., according to Jens Bjernemose).
José San Martin reports a very similar use of branca in Brazilian Portuguese to distinguish "white" product lines (linha branca) from "brown" ones.
Steven Tripp in Japan advises that Jim Breen's online Japanese-English dictionary translates "white goods" to a Japanese phrase, 白物, literally "white things", meaning large household appliances. So at this stage we have at least eight languages worldwide that have corresponding idioms with the same meaning. This is a global meme.
Tim McKenzie says that in New Zealand they say whiteware.
Thomas Williams, in the UK, searched the Oxford English Dictionary and found examples from The Economist in 1960 (where "so-called" precedes the phrase, and it is in scare quotes), and from the consumer magazine Which? in 1976 talking about "what the trade calls white goods", and from The Times in 1981 using it apparently without any indication that it's anything but a common, everyday term. (Confusing the picture a little, it has an example from American writer Laura Ingalls Wilder in 1943, but that one, "Busily working with the white goods, Ma and Laura discussed Laura's dresses", probably refers to linens and such.) The example from Which? is particularly interesting because it also has the phrase "brown goods", which is used in the trade to refer to electronic devices such as TVs and radios. It's apparently still in use, though much less than white goods, possibly because although white goods are still mostly white, the days when radio and TV cases were often made of a ghastly brown Bakelite or similar dark brown fake wood veneer are long past. Tako Schotanus says "brown goods" also has a Dutch counterpart.
Steve Maguire tells me that, by 2000 at any rate, staff at Sears (largest appliance retailer in the USA) were using the phrase "white goods", so perhaps it was there in American English all along and I just happened never to come in contact with it during my many years of residency.

What do we learn from this, class? Anyone? Yes, you at the back. That's right. We learn that language is complex and surprising; that idioms do not have either predictable meanings or meanings that are constant across the Anglophone world, though they can spread into different languages; that identifying phrase origins is very difficult; and that the origin of the phrase white goods probably lies somewhere in the 1930s or 1940s but we don't even know which side of the Atlantic it originated on. That's all for today. Don't forget there will be another quiz next week. Have a great weekend.

Posted by Geoffrey K. Pullum at 08:55 AM

More on culture as disease

A few days ago, at the end of a post about how viral became a good thing (as in "viral marketing"), I added a link and comment from Cosma Shalizi, who noted the "fairly long history of the idea-infection analogy, and the idea-evolution analogy". I speculated that seeing the spread of disliked ideas as contagion probably goes back to classical times, and wondered when the analogy first starts to apply to "positively-evaluated information or attitudes or groups". Cosma's response:

"I guess that it wouldn't be a surprise to find that some third-century Romans saw Christianity as a plague": how about a second century Roman, namely Pliny the Younger? Here he is, writing as governor of the province of Bithynia, to the emperor Trajan in 110:

In fact, this contagious superstition is not confined to the cities only, but has spread its infection among the neighbouring villages and country.

This is the translation in Project Gutenberg, letter 96 in book 10 of his letters. The Perseus Project gives the Latin text as

Neque civitates tantum, sed vicos etiam atque agros superstitionis istius contagio pervagata est

which confirms "contagious superstition" as Pliny's thought, and not the translator's. Clearly, Pliny does not approve.

On the other hand, here is Walter Bagehot in what seems to be the first selectionist account of cultural and social evolution, Physics and Politics (1872):

The same patronage of favoured forms, and persecution of disliked forms, are the main causes too, I believe, which change national character. Some one attractive type catches the eye, so to speak, of the nation, or a part of the nation, as servants catch the gait of their masters, or as mobile girls come home speaking the special words and acting the little gestures of each family whom they may have been visiting. I do not know if many of my readers happen to have read Father Newman's celebrated sermon, 'Personal Influence the Means of Propagating the Truth;' if not, I strongly recommend them to do so. They will there see the opinion of a great practical leader of men, of one who has led very many where they little thought of going, as to the mode in which they are to be led; and what he says, put shortly and simply, and taken out of his delicate language, is but this--that men are guided by TYPE, not by argument; that some winning instance must be set up before them, or the sermon will be vain, and the doctrine will not spread. I do not want to illustrate this matter from religious history, for I should be led far from my purpose, and after all I can but teach the commonplace that it is the life of teachers which is CATCHING, not their tenets. And again, in political matters, how quickly a leading statesman can change the tone of the community! We are most of us earnest with Mr. Gladstone; we were most of us NOT so earnest in the time of Lord Palmerston. The change is what every one feels, though no one can define it. Each predominant mind calls out a corresponding sentiment in the country: most feel it a little. Those who feel it much express it much; those who feel it excessively express it excessively; those who dissent are silent, or unheard.

(From Gutenberg e-text. The passage is on pp. 66-67 of my 1956 Beacon Press paperback edition.) Here what is "catching" (emphasis in the original) is a good thing, or at least potentially so, depending on content. He also speaks of "infectious belief" (but negatively, of superstitions), and "the infection of imitation" (neutrally). Your post led me to finally read Bagehot, a mere nine years after I picked up my copy of his book at a library sale... As always when reading the most respectable Victorians, the casual racism -- the apparent self-evidence of the casual racism -- is quite shocking: "To offer the Bengalese a free constitution, and to expect them to work one, would be the maximum of human folly" (p. 132). And this is in the context of arguing that "There then must be something else besides Aryan descent which is necessary to fit men for discussion and train them for liberty"! This probably calls for a post of its own.

[Above is a note from Cosma Shalizi.]

Posted by Mark Liberman at 07:50 AM

September 07, 2007

Guess the meaning: a British English quiz question

If you're a speaker of American English with no experience of life in the United Kingdom, here's a quiz question (this is not for you if you live in Britain; to you it will seem absurdly easy).

Choose the correct meaning for the phrase white goods from the following list of potential meanings:

Goods of any sort that are white in color — flour, paper towels, lilies, emulsion paint, toothpaste, ermine fur, milk, eggs, refined sugar, button mushrooms, etc.
Goods that carry no duty and can thus be freely imported and carried through customs without officials needing to be in any way concerned with them.
Garments typically or traditionally made with undyed white cotton, such as plain dress shirts, underwear, tennis shorts, cricket clothes, and so on.
Goods that are fully legal, in the sense of being properly imported with duties properly paid rather than being part of the so-called "black economy".
Office paper, letter envelopes, and similar white paper office supplies.
Household appliances such as washing machines or refrigerators that are often painted white.
Linen household goods such as sheets, pillow cases, and towels.
Goods of a sort determined by market research to be primarily of interest to customers of European rather than African or Asian origins.
Goods deemed by government regulatory agencies to be (unlike an increasing number of toys and other products from the People's Republic of China) free of harmful features and fully fit for sale to the general public.
Milk, buttermilk, yoghurt, and other non-cheese liquid dairy products.

I tell you frankly, after being away from Britain for 25 years, I would not have been able to answer this question.

I've come across the phrase a couple of times since taking up residence in Britain a little over a month ago, and one of those occurrences was in a legal context, so it's not slang, it's a real part of the language now. And as far as I can report, in 1980 it was not.

I'll leave a few days for you to do your own guessing and to test monolingual American friends, relatives, and colleagues to see what percentage of them can guess the correct meaning (they are not allowed to do any Googling or Wikipediery before answering, of course). Then I'll supply the correct answer in an update below. My guess is that there will be quite a range of answers, showing that I was not alone in being unable to guess the meaning this phrase has now taken on in British English.

[Update: The answer has now been given in a subsequent post.]

Posted by Geoffrey K. Pullum at 05:30 PM

Dangling Modifiers at Jeopardy

In the first game of the final round of the 2007 Jeopardy College Championship, one of the $2000 Double Jeopardy clues was:

You may find "awk" for "awkward" next to this type of ambiguous modifier, like "shake before using"

The intended response was: "What is 'dangling'?". Apparently the Jeopardy folks think that "shake before using" contains a dangling modifier. It doesn't, for the simple reason that it contains no modifiers of nouns at all. The only modifier is the clause "before using", which may be treated as an adverbial modifier of "shake", and this does not dangle since the constituent it modifies is overt. [I've modified slightly my original statement that there are no modifiers at all here in response to a comment by Arnold Zwicky. I don't think that such clauses are considered modifiers in school grammar, though I could be wrong about that.]

Here's a definition of dangling modifier by Frances Peck from the web site of the Writing Centre at the University of Ottawa:

A dangling modifier is usually a phrase or an elliptical clause -- a dependent clause whose subject and verb are implied rather than expressed -- that functions as an adjective but does not modify any specific word in the sentence, or (worse) modifies the wrong word.

It gives the example:

Raised in Nova Scotia, it is natural to miss the smell of the sea.

Here, the phrase "raised in Nova Scotia" is intended to modify the implicit experiencer of the main clause. That is, this sentence is intended to mean the same thing as:

It is natural for a person raised in Nova Scotia to miss the smell of the sea.

where "raised in Nova Scotia" modifies "person".

As I am a specialist neither in English nor in school grammar, just to be sure that the term "dangling modifier" is not used differently from the way in which I understand it, I checked with an authentic expert, my colleague Geoff Pullum. The Jeopardy folks ought perhaps to do the same.

Fortunately, this error did not affect the game as none of the three contestants, Craig Boge, Christine Kennedy, and Cliff Galiher, even attempted this clue. It is possible that this is because they know nothing at all about grammar, but an optimistic interpretation is that they were perplexed by the clue precisely because they do know what a dangling modifier is.

Posted by Bill Poser at 01:31 PM

Naming Opportunities

Larry Gonick is upset about the Privatization of Everything:

If a bound morpheme like "Mc" can become a form of intellectual property, then I suppose not even a function word is safe -- though in the current international legal order, it's only the naming of products and services and so on that can be constrained, not everyday use.

And actually, in my opinion, the most striking threat to individual linguistic freedoms comes not from privatization of culture, but from its nationalization. There are serious proposals to create new collective property rights, such as the UNESCO/WIPO proposal for sui generis folklore rights ("The Algonquian Morpheme Auction", 3/3/2004). It's clear that some well-intentioned people believe that (the authoritative representatives of) at least some cultures should (and perhaps already in effect do) have the power to regulate the use of "their" languages, including requiring payment and forbidding uses that they don't like ("Language as property", 11/24/2006; "Mapuche is ours, not yours", 11/24/2006; "Should the 'owners' of a language be permitted to forbid its use to criticize them?", 12/13/2006).

If you don't think that this could be a serious issue, please think for a few minutes about the potential interactions between well-meaning and naive international intellectuals on one hand, and corrupt and rapacious local bureaucracies on the other. (Or, alternatively, corrupt and rapacious international conglomerates working with well-meaning and naive local authorities, or any of the other logical combinations of global and local naiveté and rapacity -- take your pick...) Then read stories like this one for an indication that the problem is not purely hypothetical.

Kerim Friedman at Savage Minds offers some additional links and discussion here.

[Hat tip: John Lawler]

[Update -- Piotr Orbis Proszynski writes:

Reading the recent post on "Naming Opportunities" led me to your speech on "UNESCO/WIPO proposal for sui generis folklore rights" (in "The Algonquian Morpheme Auction", 3/3/2004).
In the interest of weeding out an etymological fallacy: while the word "polka" does come from the word for "a Polish woman", as a dance it originates in Bohemian folklore. Even if performed by a (typically immigrant Polish) Chicago-style band, hypothetical folklore royalty cheques would go to the Czechs!
I am not sure how the WIPO proposal would deal with the "polska" dance, popular only in Scandinavia but stemming from the Polish court; however the French-named "polonaise" is all Polish! Of course before Chopin, it used to be a lot less solemn -- I wonder if his classical polonaises would face penal sanctions for perverting Polish cultural heritage for his own profit.

I can see that the WIPO Folkloric Property Court will not only provide full-time employment for thousands of lawyers and hundreds of judges, it will also offer plenty of expert-witness gravy to the world's historians, anthropologists, musicologists and so on. If such a thing actually came to pass, it would be one of history's greatest transfers of wealth from the creativity of ordinary people to the bank accounts of bureaucrats, professionals, intellectuals and clerks. Fortunately (or unfortunately, depending on your perspective), I think that it's unlikely to happen. ]

Posted by Mark Liberman at 07:45 AM

September 06, 2007

The "fiction gap": empathy, prestige, or what?

According to Eric Weiner, "Why Women Read More than Men", NPR.org, 9/5/2007,

A couple of years ago, British author Ian McEwan conducted an admittedly unscientific experiment. He and his son waded into the lunch-time crowds at a London park and began handing out free books. Within a few minutes, they had given away 30 novels.

Nearly all of the takers were women, who were "eager and grateful" for the freebies while the men "frowned in suspicion, or distaste." The inevitable conclusion, wrote McEwan in The Guardian newspaper: "When women stop reading, the novel will be dead."

You can read that whole story here: Ian McEwan, "Hello, would you like a free book?", The Guardian, 9/20/2005. According to McEwan's report, he gave away the 30 novels to the "lunchtime office crowds picnicking on the grass" near his house in London, and only one man "was tempted", so that 97% of the books went to women. Eric Weiner gives a more general market statistic that is almost as striking:

When it comes to fiction, the gender gap is at its widest. Men account for only 20 percent of the fiction market, according to surveys conducted in the U.S., Canada and Britain.

I haven't been able to track down any of those surveys -- if you know a reference, please tell me. I did find Reading at Risk: a Survey of Literary Reading in America, NEA Research Division Report #46, June 2004, which says that in 2002, 55.1% of American women over the age of 18 read "literary" works, as opposed to 37.6% of men. (If they read equal amounts, that would give women 59% of the market; but presumably the women read more.)
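The 59% in that parenthesis can be checked with a short sketch. It assumes equal numbers of adult men and women and equal amounts of reading per reader; both are simplifications, and the function name `market_share` is just an illustrative label, not anything from the NEA report.

```python
def market_share(rate_a, rate_b):
    """Share of all readers who belong to group A.

    Assumes the two groups are the same size and that each reader
    reads the same amount (both are simplifying assumptions).
    """
    return rate_a / (rate_a + rate_b)


women_rate = 0.551  # NEA 2004: women reading "literary" works in 2002
men_rate = 0.376    # NEA 2004: men reading "literary" works in 2002

share = market_share(women_rate, men_rate)
print(f"Women's share under equal reading: {share:.1%}")
```

Getting from there to an 80% share would require women readers to read a good deal more per person than men readers, which is the point of the "presumably" above.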

We'll come back to some other facts from this report later on; for now, let's go forward with Weiner:

Theories attempting to explain the "fiction gap" abound. Cognitive psychologists have found that women are more empathetic than men, and possess a greater emotional range -- traits that make fiction more appealing to them.

This is all true, but it's not clear that it's enough to explain what's going on. Simon Baron-Cohen and others have devised what they call an "Empathizing Quotient", and tested many men and women for it. On the basis of the data presented in a paper of theirs that I discussed at length last year ("Stereotypes and facts", 9/24/2006), the distribution of EQs for men and women in the general population should look something like this (mean 41.8, sd 11.2 for men; mean 47.2, sd 10.2 for women):

I guess you could make something up about non-linear amplification of the effect, but it's not obvious that this difference should translate into 80% of the fiction market for women.
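To put a rough number on how much those two distributions overlap, here is a small sketch using only the means and standard deviations quoted above. It assumes both distributions are exactly gaussian and independent, and it pools the two standard deviations with equal weight; the variable names are mine, and none of this comes from the paper itself.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


men_mean, men_sd = 41.8, 11.2
women_mean, women_sd = 47.2, 10.2

# Standardized difference (Cohen's d), pooling the two SDs with equal weight.
pooled_sd = math.sqrt((men_sd ** 2 + women_sd ** 2) / 2)
cohens_d = (women_mean - men_mean) / pooled_sd

# Chance that a randomly chosen woman scores higher than a randomly
# chosen man: the difference of two independent gaussians is gaussian.
diff_sd = math.sqrt(men_sd ** 2 + women_sd ** 2)
p_woman_higher = normal_cdf((women_mean - men_mean) / diff_sd)

print(f"Cohen's d: {cohens_d:.2f}")
print(f"P(woman > man): {p_woman_higher:.2f}")
```

Under those assumptions the difference is about half a standard deviation, and a random woman outscores a random man roughly 64% of the time, which is a long way from 80%.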

But let's go on, because we're about to meet an old friend:

Some experts see the genesis of the "fiction gap" in early childhood. At a young age, girls can sit still for much longer periods of time than boys, says Louann Brizendine, author of The Female Brain.

"Girls have an easier time with reading or written work, and it's not a stretch to extrapolate [that] to adult life," Brizendine says. Indeed, adult women talk more in social settings and use more words than men, she says.

Dr. Brizendine, apparently not any more impressed by recent contrary evidence than by earlier contrary meta-analyses, continues to show that she was a worthy winner of the Goropius Becanus award for 2006. For lagniappe, she turned Weiner on to mirror neurons:

Another theory focuses on "mirror neurons." Located behind the eyebrows, these neurons are activated both when we initiate actions and when we watch those same actions in others. Mirror neurons explain why we recoil when seeing others in pain, or salivate when we see other people eating a gourmet meal. Neuroscientists believe that mirror neurons hold the biological key to empathy.

(Mirror neurons are actually located both in the inferior frontal gyrus and in the inferior parietal lobule -- which is "behind the eyebrows" only in the sense that the whole brain is -- and according to some theories of their function, this pairing of posterior and anterior brain regions is essential to their function in abstracting over perception and action. But anyhow...)

The research is still in its early stages, but some studies have found that women have more sensitive mirror neurons than men. That might explain why women are drawn to works of fiction, which by definition require the reader to empathize with characters.

I've only found one study, Ya-Wei Cheng, et al., "Gender differences in the human mirror system: a magnetoencephalography study", NeuroReport, July 31, 2006, 17:11. They studied 10 males and 10 females (all Taiwanese) between 20 and 32 years old.

Neuromagnetic mu (∼20 Hz) oscillations were recorded over the right primary motor cortex, which reflect the mirror neuron activity, in 10 female and 10 male participants while they observed the videotaped hand actions and moving dot. In accordance with previous studies, all participants had mu suppression during the observation of hand action, indicating activation of primary motor cortex.

As you'd expect for a paper published in NeuroReport (which is a sort of quick sketch-pad for neuroscience results that are interesting but not yet ready for prime time), the gender-related effects are not very strong:

When the normalized suppression of the maximal 20 Hz post-stimulus rebound was quantified, women had the mean ± SEM as 11.6±5.1% in the hand and as 6.3 ± 5.7% in the dot. For men, the mean ± SEM was 5.6 ± 3.2% and 8.8 ± 2.8%, respectively. The statistical results did not show a major effect in the gender itself (F_1,18 = 0.767, P = 0.393), but in the condition itself (F_1,18 = 4.521, P = 0.048) and their interaction (F_1,18 = 9.331, P = 0.007).

Converting SEM ("standard error of the mean") to standard deviations, and plotting gaussians with the cited values, we get something like this:

The men had a bit less mu suppression for the hand compared to the dot, while the women had a bit more, and the interaction was statistically significant -- but it's not an enormous effect. (The biggest and most obvious trend, actually, seems to be that the women were more variable. Oh, and please note that this is not really a picture of their data distribution -- it's a display of gaussian distributions based on the means and standard errors that they cite. Their actual data is probably quite a bit messier.)
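For anyone who wants to redo the conversion, the standard relation is SD = SEM × √n. The sketch below applies it to the four cited values, assuming n = 10 in each gender group (the sample sizes given above); the dictionary layout and names are only for illustration.

```python
import math

N_PER_GROUP = 10  # 10 women and 10 men, as reported in the study

# (mean, SEM) of normalized suppression, in percent, as cited above.
cited = {
    ("women", "hand"): (11.6, 5.1),
    ("women", "dot"): (6.3, 5.7),
    ("men", "hand"): (5.6, 3.2),
    ("men", "dot"): (8.8, 2.8),
}


def sem_to_sd(sem, n):
    """Recover the sample standard deviation from the standard error."""
    return sem * math.sqrt(n)


sds = {key: sem_to_sd(sem, N_PER_GROUP) for key, (_, sem) in cited.items()}

for (gender, condition), sd in sds.items():
    mean = cited[(gender, condition)][0]
    print(f"{gender:5s} {condition:4s} mean={mean:5.1f}%  sd={sd:5.1f}%")
```

The implied standard deviations are larger than every one of the means, which is another way of seeing how much the groups overlap.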

Cheng et al. note some problems with their study -- including the fact that the moving hand was a male one -- to be addressed in further work:

The significant interaction between the conditions and the genders may in part result from the different strategy of the participants (women versus men) during the observation of moving dot. Men might treat this stimulus as an 'object' to trigger the canonical neurons of the premotor cortex, but women did not. One MEG study by Hari et al. [5] reported that the viewing of a moving dot did suppress the ∼20 Hz poststimulus rebounds, but the suppression was weaker than that observed during action viewing. Here it was found that such suppression was modulated by the participant's gender. Moreover, the present findings might reflect partially the opposite-gender response, that is, female participants responded stronger to the displayed male hand, although the equivalent conjectural rate between the genders was controlled to ensure the male hand rendered androgynous. To address this issue, we need a further study with the displayed female hand.

These results are less striking, I think, than the much more direct measures of empathy reported by Baron-Cohen's group -- this seems to be another example where the neuroscience, though interesting and suggestive from a scientific point of view, actually adds nothing but empty intellectual prestige-symbols to the larger argument (which here is about the role of empathy in explaining the "fiction gap").

Overall, I'm not impressed by the strength of the argument from empathy (and the mirror-neuron stuff doesn't increase the strength of the argument at all, it just adds logically-irrelevant pizzazz). What else might be going on?

Well, let's take a look at some of the other data from the 2004 NEA study. The differences among racial and ethnic groups were as large as, or larger than, the differences between males and females:

Do we think that "White Americans" are more empathetic than "African Americans", who in turn are more empathic than "Hispanic Americans"? I certainly don't -- those differences must have some other explanation. (And it's not hard to come up with several hypotheses.)

What about the huge effects of education? Again, do we think that people with more education are more empathetic? And what about the effects of income, which are smaller than the effects of education, but (at the extremes) are as large as the effects of sex?

I'd like to suggest looking in a different direction to explain at least some of these effects. Most sociolinguistic variables pattern in exactly the same way with respect to education, socio-economic status, and sex. (You can read all about one example, with graphs and tables, in a Language Log post from a few years ago: "The internet pilgrim's guide to g-dropping", 5/10/2004).

The fact is, reading "literary works" is a cultural trait associated with prestige groups in our society. In most cases, traits associated with education, prestige and formality are also found more strongly in women than in men, other things being equal.

It's uncertain, not to say controversial, why this is; and the effect (though sometimes large) is probably not strong enough to explain an 80% market share (but maybe that figure is exaggerated, I don't know). All the same, I bet that (whatever accounts for) the sex-prestige interaction has as much to do with the fiction gap as those mirror neurons do.

[In both cases, the amount of explanation might well be "not much" -- but this is an interesting example of the amplifying effect of ideology. Someone who was disposed to do so -- as I certainly am not -- could make up a story about how this shows us, once again, that women are insecure social climbers obsessed with displays of status markers. That story would be at least as well supported by the "scientific" evidence as the story about how women are motivated to read fiction because, as Brizendine told Weiner, "Reading requires ... the ability to 'feel into' the characters. That is something women are both more interested in and also better at than men." ]

[Helen DeWitt writes:

(Hand to brow)
I have a friend, Lawrence Powers, who once demonstrated the circumstances in which men, given the chance, will talk your ear off -- we met at an art exhibition in Kreuzberg, he happened to mention that he was obsessed with soundtracks of video games (Americans play Japanese games, hence prog rock soundtracks, playing on non-hackable consoles, Europeans play European games, disco soundtracks, on hackable computers, hence demo parties, hence open-source movement, to condense ruthlessly), WOW, I said, and the rest was history, or rather a 5 hour discourse punctuated by the occasional WOW... Anyway, the discourse was definitely WOW-worthy, because Lawrence was talking also about generations of game-players, in the game world, he explained, a generation is about 6 years, he has 2 brothers, an older and a younger, but they belong to different generations, they played different games...
While he was talking I was thinking -- yeah, but I'll bet this is only really true of boys, there are girls who play games, sure, but I don't think they fall into generations according to the games they play, I don't think computer games define the sensibility of a cohort of girls...
The medium through which people prefer to engage with fiction may well be indicative of /something/. But if one group of people likes novels, and another group likes computer games, and a third likes films, taking one particular form of fiction without looking at the others tells us nothing about liking for fiction, let alone empathy. (It's not clear why an engagement with fiction in textual form should show a greater degree of empathy than an engagement with fiction in other forms.)
As far as McEwan's experiment goes, anyway, the observer may have affected the observations. I used to live in East London, spent a lot of time reading in pubs; I used to get into conversations with men who'd left school at 16 to go into the Army. As it turned out, they'd read a fairly wide range of fiction -- Nabokov, Welsh, Barnes and Mailer were some names that were mentioned. If a blonde American were to walk up to strange men in a British park and offer them a choice of novels she might get more takers than a 50-something-year-old British man. McEwan's slapdash approach to experiment design, of course, confirms all our worst fears about the scientific woollymindedness of both men and women in the arts.

Well, I wish I could say that men and women in the sciences were uniformly clear-headed on such topics. McEwan's failure to control for the sex of the book-distributors is exactly mirrored by the failure of the mirror-neuron experimenter to control for the sex of the imaged hand -- though they did worry about it, perhaps because a reviewer brought it up. And in the more general domain of over-interpretation, (some of) the scientists are likely to cross the finish line well in front.

The 2004 NEA report finds a large relative decline in male "literary" reading during the past 10 years, roughly as computers and digital gaming and the internet have spread. I like the idea of computer/video games as empathy projected through contemporary male culture. Even if the empathy sometimes comes disguised as giblets...]

[Russell Borogove writes:

Regarding the experiment of handing out free novels, besides controlling for the sex of the book distributors, I'd be very interested to know what books, exactly, were being given out (Tom Clancy, Stephen King, Christopher Moore, Barbara Kingsolver?), and what happens if you try and give away something non-book but relatively gender-neutral -- calendars of landscape photography, for instance.

Me too. But mostly, I'd like to track down that 80%-market-share figure. You could almost get it from another widely-quoted statistic -- that romance novels amount to 53% of fiction sales (I quote the figure from memory, but I think it's as I've read it.) If essentially all the romances were bought by women, and the rest of the fiction were divided half-and-half, you'd have about 77% market share, which would round up to 80%.
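That back-of-the-envelope sum is easy to vary. The sketch below is only an illustration: the 53% is the from-memory figure just mentioned, the all-romances-to-women and half-and-half splits are the assumptions stated in the paragraph, and the function and parameter names are hypothetical.

```python
def women_market_share(romance_share,
                       women_frac_romance=1.0,
                       women_frac_other=0.5):
    """Women's share of fiction sales under simple split assumptions.

    romance_share: fraction of fiction sales that are romances.
    women_frac_romance: fraction of romance sales going to women.
    women_frac_other: fraction of all other fiction going to women.
    """
    other_share = 1.0 - romance_share
    return (romance_share * women_frac_romance
            + other_share * women_frac_other)


share = women_market_share(0.53)
print(f"Women's share of the fiction market: {share:.1%}")
```

With the stated assumptions this prints 76.5%, the "about 77%" above; nudging `women_frac_other` up from 0.5 shows how little it takes to reach 80%.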

]

[Ann Bartow at Feminist Law Professors has a discussion of Weiner and related articles. ]

Posted by Mark Liberman at 08:53 AM

September 05, 2007

[obscene gerund]

Though I vowed to take a vacation from taboo avoidance, here's one that's just too delicious for a linguist: "obscene gerund" as an ostentatious replacement for the modifier fucking (or fuckin'). In a Doonesbury cartoon from 1999, and in a more recent Jerk City cartoon:

(Hat tip to Paul Blankenau.)

[Correction 9/7/07: The Trudeau cartoon above was dated 1999, but the panel is considerably older than that, as correspondents Qnavry Pheevr and Nathan Simpson have pointed out to me. The panel seems to have first appeared in 1985 and was included in Trudeau's 1986 collection That's Doctor Sinatra, You Little Bimbo.]

You can google up a fair number of other instances.

But, but... I have to object to the gerund part of obscene gerund, as a reference to fucking (or fuckin') in expressions like your fucking boss. I've written here before about grammatical concepts and terminology in the world of English Ving, so I'll give just the most basic explanation.

Background: except for a few defective verbs, every verb in English has a form in -ing (with a variant -in') with a great many uses. The form gets various labels in the scholarly literature on English: among others, "present participle" (or some abbreviated tag, like "PRP"), "gerund participle", "-ing form", "form N". (The first is the most common label; the second is the term in The Cambridge Grammar of the English Language; the third is the term in the big Quirk et al. grammar, A Comprehensive Grammar of the English Language; and the last is my own preferred label.)

Most of the uses of form N can be classified as one of three types: verbal, adjectival, or nominal. (The examples in the following discussion are merely representative of a much larger collection of cases; this is not an inventory of all the constructions that form N can appear in. And, to keep things short, I've swept many complexities under the rug.)

In verbal uses, the form serves as the head of a (non-finite) clause (exclamatory Him having a hat on!) or as the head of a VP complement to a V (progressive Kim was amusing the children by juggling watermelons, with amusing the children by juggling watermelons serving as complement to a form of BE; aspectual It started fiercely snowing, with fiercely snowing serving as complement to a form of the aspectual verb START).

In adjectival uses, the form serves as the head of an adnominal phrase, modifying a N (people not having a hat on, with not having a hat on modifying people, much as a restrictive relative clause like who do not have a hat on does).

In nominal uses, the form serves as the head of an argument phrase, just like an ordinary NP (Kim's juggling watermelons so skillfully entertained the children, with Kim's juggling watermelons so skillfully serving as subject; Kim's skillful juggling of watermelons astonished us, with Kim's skillful juggling of watermelons serving as subject).

I'd prefer to use the terminology above -- verbal vs. adjectival vs. nominal -- for a rough (and incomplete and pretheoretical) taxonomy of the uses of form N, but unfortunately there are other terms, participle and gerund, deriving ultimately from grammatical terminology for Latin, that have long been used for this purpose, and they aren't very satisfactory. Part of the problem is that in this tradition the same terminology ends up being used for labeling morphological forms and for labeling classes of syntactic constructions, but that's not my concern in this posting.

Here's the immediate problem: in this tradition, participle is often defined as a 'verbal adjective' (or 'verb used as an adjective'), gerund as a 'verbal noun' (or 'verb used as a noun'), which would provide alternatives to my labels adjectival and nominal above, but nothing corresponding to verbal. In some handbooks of English grammar, this gap is filled by extending the term participle to the verbal uses, while maintaining the characterization of participles as verbal adjectives. That's just wrong, because the verbal uses have no adjectival properties at all. Some sources just stipulate that participle covers both adjectival and verbal uses. And, of course, some use participle for all uses of form N.

But there's one fixed point in this terminological morass: so far as I know, if a source uses the term gerund at all, it's restricted to nominal uses. Which brings us back to obscene gerund. The fucking in your fucking boss is in no way nominal; instead, it's adjectival, located in NPs between determiners (like your) and the head N, in with ordinary adjectives. The expletive (god)damn has a similar distribution, as of course do alternatives to fucking like freaking, frigging, etc.

Actually, expletive fucking has a wider distribution. It functions as an "A-al", with adverbial as well as adjectival uses. (Adverbs and adjectives share a number of properties, so that it makes sense to posit a larger class A comprising both of them.) It modifies not only nouns, but also A's (adjectives, as in That's fuckin' huge, and adverbs, as in You did that fuckin' fast) and verb phrases (You need to fuckin' stop that). So expletive fucking doesn't quite fit into the taxonomy above; it's one of the complexities I alluded to above. But what's special about it is that it's an A-al use rather than just an adjectival use of form N; it's not nominal and shouldn't be labeled a gerund.

(By the way, I don't really understand what's going on in the Jerk City cartoon.)

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:50 PM

Tiny experiments

Given my fondness for Breakfast Experiments™, I was intrigued to see that Jon Bentley (of Programming Pearls fame) is giving a talk tomorrow afternoon, in the Murray Hill Mathematical and Algorithmic Sciences Colloquium series, on the topic "Tiny Experiments for Computing and Life":

Computing experiments come in all sizes. A jumbo testbed for the Traveling Salesman Problem, for instance, can take years to build, and additional years can be spent designing and running insightful experiments. This talk concentrates on tiny computing experiments that can be conducted in a few minutes. Such experiments include parameter estimation, hypothesis testing, determining functional forms, and conducting "horse races". This talk also describes how tiny "Math, Science and Engineering" can be done in one's head or on the back of the proverbial envelope, and shows how to apply it to computing problems and problems in everyday life.

Unfortunately I'm scheduled to teach here in Philly at the same time, so I'll have to use my imagination.

Posted by Mark Liberman at 09:47 AM

Parsing Miss Upton

While the world was laughing at Miss Teen South Carolina's dysfluent answer to an ill-posed question, Lukas Biewald and Brendan O'Connor at Powerset got serious about it. They fed a transcript of Lauren Caitlin Upton's response into their version of the XLE parser, with this result:

Powerset's goal is to use such analytic techniques to improve web search, and one of the questions about this idea has always been how well the analysis copes with material that's fragmentary or carelessly written or non-canonical in some other way. So they were proud to announce that their system was able to use the parser's output to answer a question about Miss Upton's answer:

Some nitpickers may complain that the answer is incomplete -- what about South Africa and the Iraq and like such as? Still, an impressive achievement.

I found it striking that their analysis of Miss Upton's remarks involved such deeply right-branched embedding. That's because their grammar treats strings of fragments that it can't analyze further in terms of structures generated by rewrite rules of the form

FRAGMENTS → X FRAGMENTS

As a result, a string of apparently disconnected babble -- say, Vicki Pollard's classic string of discourse markers -- will look something like this

rather than like this:

This is a sensible-enough way to approach the problem. From a computational point of view, uniformly right- (or left-) branching structures are easily handled by finite-state methods; and from a psychological point of view, right-branching structures have the advantage over left-branching structures that you don't need to decide how deep you're going to go before you start talking.
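To make the contrast concrete, here is a small sketch of what a rule of the form FRAGMENTS → X FRAGMENTS does to a string of unanalyzed chunks. It is not Powerset's or XLE's code: the function names and the nested-tuple tree representation are invented for illustration, and the rule is assumed to bottom out in a single X.

```python
def right_branching(fragments):
    """Nest fragments as FRAGMENTS -> X FRAGMENTS (assumed base case: a lone X)."""
    if not fragments:
        raise ValueError("need at least one fragment")
    head, *rest = fragments
    if not rest:
        return ("FRAGMENTS", head)
    return ("FRAGMENTS", head, right_branching(rest))


def flat(fragments):
    """The alternative analysis: one node with all the fragments as sisters."""
    return ("FRAGMENTS", *fragments)


def depth(tree):
    """Count the levels of FRAGMENTS nodes in a tree built by the functions above."""
    children = [c for c in tree[1:] if isinstance(c, tuple)]
    return 1 + max((depth(c) for c in children), default=0)


words = ["yeah", "but", "no", "but", "yeah", "but"]
nested = right_branching(words)
print(nested)
print(depth(nested), depth(flat(words)))
```

Under these assumptions the nested analysis gains one level of embedding per fragment, while the flat one stays at a single level however long the string gets.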

But a set of interesting psychological issues lurk behind those formal choices. When people produce (or understand) such strings of fragments, is there a sense in which they're processing them in layers, as we assume they're doing in dealing with phrases like "Kim doesn't like stale fruitcake", or "Leslie's uncle's dog was barking"? Are strings of discourse markers (including maybe the communicatively-meaningful disfluencies that you can read about in Michael Erard's Um) different from false starts in this respect?

[OK, I know that the elements of Vicki Pollard's conversational opening are actually grouped as (yeah but) (no but) etc., as indicated both by the prosody of her performance and (I think) by the interpretation of the content. All the more reason to wonder about the constituent structure of disfluencies... And I've always thought that the most amusing part of the "yeah but no but yeah but" business was the suspicion that there's a way of construing it as a recursive stack of discourse markers, rather than as a series of false starts -- if only I had enough short-term memory to grasp it. ]

Posted by Mark Liberman at 06:45 AM

The Real Labor Day is May Day

Geoff Nunberg points out that Labor Day in the United States has been bleached almost completely of its association with labor unions. That was the intention. Labor Day is celebrated at the beginning of September only in the United States and a few other countries such as Canada. Just about everywhere else, Labor Day is May Day, May 1st. May Day is the real labor day. A September Labor Day was adopted in the United States in order to avoid the association of May Day with socialism, the more militant wing of the labor movement, and internationalism. American Labor Day started out as a watered-down version of May Day. That it has all but lost its association with labor unions reflects not only the decline in influence of unions but also the intention of the US government back in 1894.

Posted by Bill Poser at 01:45 AM

September 04, 2007

Viral marketing is a language from inner space

Yesterday I promised to write something about how viral turned into a good thing. In the first place, I need to confess that it's not entirely true. When I search Google News this morning for the word viral, the top 20 topics include 12 about dengue fever and hepatitis C and meningitis and viral gastroenteritis and so on, and only 8 about things like these:

A viral campaign that spiralled through social networking site Facebook has forced HSBC into a humiliating U-turn over its decision to scrap interest-free overdrafts for university graduates.
Viral evangelism seems to be working because Firefox continues to gain market share worldwide.
"B2B companies weren't in early because they presumed the kinds of things that went viral weren't mature and respectable enough to be appropriate for business customers."
Perlico, Ireland’s leading alternative provider of phone and Internet services has announced that its viral marketing campaign has met with immense success.

Still, by this crude count, viral has become 40% good (if you think that a method for spreading information about politics and products is good, as the writers of those sentences clearly do). This is a natural development, but there may have been a bit of impetus from William S. Burroughs by way of Laurie Anderson, and perhaps a second push from Richard Dawkins and other memeticists.

The term virus originally meant "venom", but by 1870 or so it was used to mean "agent of infectious disease"; a "filterable virus" was an infectious agent small enough to pass through filters that trap bacteria; this term was shortened in common use to virus, and the structure and function of viruses were gradually clarified during the middle of the 20th century.

The OED's earliest citation for the adjectival form viral is from 1948:

1948 Diagnostic Procedures for Virus & Rickettsial Diseases (Amer. Public Health Assoc.) 15 Viral agents belonging to the psittacosis group.

More recently, viral has undergone the same sort of change that long ago added to infectious the meaning that the OED glosses as "Of actions, emotions, etc.: Having the quality of spreading from one to another; ‘catching’, contagious", with citations from 1611 onwards:

a1611 BEAUM. & FL. Maid's Trag. I. i, She carries with her an infectious grief, That strikes all her beholders. 1700 DRYDEN Palamon & Arc. II. 313 Through the bright quire th' infectious virtue ran. All dropt their tears. 1828 WHATELY Rhet. in Encycl. Metrop. 300/1 Almost every one is aware of the infectious nature of any emotion excited in a large assembly. 1866 G. MACDONALD Ann. Q. Neighb. xi. (1878) 200 How hearty and infectious his laughter!

We wouldn't talk about "viral laughter" (at least I wouldn't, though Google finds 27 pages on which someone thought differently), but starting in the late 1980s, marketing types began to use viral to talk about the spread of information rather than disease:

Chiefly Marketing. Of, designating, or involving the rapid spread of information (esp. about a product or service) amongst customers by word of mouth, e-mail, etc. to go viral: to propagate in such a manner; to (be) spread widely and rapidly.

1989 PC User (Nexis) 27 Sept. 31 The staff almost unanimously voted with their feet as long waiting lists developed for use of the Macintoshes... ‘It's viral marketing. You get one or two in and they spread throughout the company.’

Why coopt viral rather than extend plain old infectious -- or contagious, which has been used in a similar way since 1660 or so? If you're going to take over another word, why not bacterial or fungal or microbial?

I speculate that the use of terms like viral marketing in the late 1980s may have been influenced by Laurie Anderson's popular 1986 performance piece "Language is a virus", in which the phrase "Language is a virus from outer space", attributed to William S. Burroughs, is projected behind her.

I haven't been able to find that particular phrase in Burroughs' works -- if you know where it's from, please tell me -- but the virus metaphor was one that he used often. The earliest use that turns up in a quick web search is in his 1959 novel Naked Lunch, where the character Dr. Benway says (p. 112 in the 2004 Grove Press "restored edition"):

Democracy is cancerous, and bureaus are its cancer. A bureau takes root anywhere in the state, turns malignant like the Narcotic Bureau, and grows and grows, always reproducing more of its own kind, until it chokes the host if not controlled or excised. Bureaus cannot live without a host, being true parasitic organizations. (A cooperative on the other hand can live without the state. That is the road to follow. The building up of independent units to meet needs of the people who participate in the functioning of the unit. A bureau operates on the opposite principle of inventing needs to justify its existence.) Bureaucracy is wrong as a cancer, a turning away from the human evolutionary direction of infinite potentials and differentiation and independent spontaneous action to the complete parasitism of a virus.

(Benway is not necessarily speaking for Burroughs here -- but someday, someone should track this attitude from Burroughs and the Beats to Reagan and the Republicans.)

What inspired Ms. Anderson more directly was probably the virus theme in Burroughs' Nova trilogy: for example, passages like this one in his 1967 novel The Ticket that Exploded:

The "Other Half" is the word. The "Other Half" is an organism. Word is an organism. The presence of the "Other Half" a separate organism attached to your nervous system on an air line of words can now be demonstrated experimentally. One of the most common "hallucinations" of subjects during sense withdrawal is the feeling of another body sprawled through the subject's body at an angel . . yes quite an angle it is the "Other Half" worked quite some years on a symbiotic basis. From symbiosis to parasitism is a short step. The word is now a virus. The flu virus may once have been a healthy lung cell. It is now a parasitic organism that invades and damages the lungs. The word may once have been a healthy neural cell. It is now a parasitic organism that invades and damages the central nervous system. Modern man has lost the option of silence. Try halting your sub-vocal speech. Try to achieve even ten seconds of inner silence. You will encounter a resisting organism that forces you to talk. That organism is the word. In the beginning was the word. In the beginning of what exactly? The earliest artifacts date back about ten thousand years give a little take a little and "recorded" -- (or prerecorded) history about seven thousand years. The human race is said to have been on set for 500,000 years. That leaves 490,000 years unaccounted for. Modern man has advanced from the stone ax to nuclear weapons in ten thousand years. This may well have happened before. Mr Brion Gysin suggests that a nuclear disaster in what is now the Gobi desert wiped out all traces of a civilization that made such a disaster possible. Perhaps their nuclear weapons did not operate on the same principle as the ones we have now. Perhaps they had no contact with the word organism. Perhaps the word itself is recent about ten thousand years old. What we call history is the history of the word. In the beginning of that history was the word.
[...]
"The Venusian invasion was known as 'Operation Other Half,' that is, a parasitic invasion of the sexual area taking advantage, as all invasion plans must, of an already existing fucked-up situation [...]

And so on, including related stuff in the (even more incoherent) follow-on novel Nova Express.

A collection of Burroughs' works published in 1998 was called Word Virus, and includes (pp. 311-312) a selection from 'Electronic Revolution' (1970-71) telling us that

A far-reaching biologic weapon can be forged from a new language. In fact such a language already exists. It exists as Chinese, a total language closer to the multi-level structure of experience, with a script derived from hieroglyphs, more closely related to the objects and areas described. The equanimity of the Chinese is undoubtedly derived from their language being structured for greater sanity. I notice the Chinese, wherever they are, retain the written and spoken language, while other immigrant peoples will lose their language in two generations. The aim of this project is to build a language in which certain falsifications inherent in all existing Western languages will be made incapable of formulation. [...]

I have frequently spoken of word and image as viruses or as acting as viruses, and this is not an allegorical comparison. It will be seen that the falsifications in syllabic Western languages are in point of fact actual virus mechanisms.

I don't know enough about Burroughs to guess whether the "Venusian" fantasy and the "Other Half" image are somehow connected to his eccentric and extreme sexual politics, as exemplified in these passages from The Job, 1968, p. 116 and p. 122:

Q: How do you feel about women?

A: In the words of one of a great misogynist's [sic] Mr Jones, in Conrad's Victory: "Women are a perfect curse." I think they were a basic mistake, and the whole dualistic universe evolved from this error. Women are no longer essential to reproduction as this article indicates: ["Oxford Scientists Reproduce Frogs From Single Cells" , By Walter Sullivan, NYT] ...

Q: What do you think of American women?

A: I think they're possibly one of the worst expressions of the female sex because they've been allowed to go further. This whole worship of women that flourished in the Old South, and in frontier days, when there weren't many, is still basic in American life; and the whole southern worship of women and white supremacy is still the policy of America. They lost the Civil War, but their policies still dominate America. It's a matriarchal, white supremacist country. There seems to be a very definite link between matriarchy and white supremacy.

Back in the world of reason, the idea of using the term viral for certain marketing strategies -- those that rely on peer-to-peer transmission rather than mass media -- may also have been seeded by Richard Dawkins. Specifically, in his 1976 book The Selfish Gene, he tried to reconstruct cultural transmission and development as "memetics", on the analogy of evolution by descent with modification in genetics.

Although Burroughs was out there first, with a similar idea about words and ideas and organization as viruses, all tangled up in the rest of his crazy theories about sex, drugs and society, I don't think there's any reason to think that Dawkins was influenced by him. In fact, the idea of cultural evolution as descent with modification predates Darwin, in the form of 19th-century theories about the historical development of languages, which Darwin explicitly cites in The Origin of Species as a model for his own theories of biological evolution. And Burroughs' idea about "the word" as a viral infection is an example of cross-species transmission, not tree-structured evolution.

[ Cosma Shalizi sends a link to a passage from André Siegfried's Germs and Ideas: Routes of Epidemics and Ideologies (1965; translation of Itinéraires de Contagions: Epidémies et idéologies, 1960, American title Routes of Contagion): Part Four, "The Spreading of Ideas and Propaganda," Chapter 7, "Conditions under which ideas spread, and factors determining the choice of route."

Cosma comments:

There is actually a fairly long history of the idea-infection analogy, and the idea-evolution analogy, among scientists. I once thought of writing a paper about it, but abandoned it to actually make some progress in graduate school. I could dig up some of my notes, if you're interested.

I think the idea of the language virus coming from outer space is indeed in Nova Express, though the formula "language is a virus from outer space" may be Anderson's.

In the graphic novel series The Invisibles (Grant Morrison et al.), urban, technological civilization is a virus from outer space; this is slightly more plausible because you do, after all, need advanced technology to have a space program and go off to infect another world. Given the other material in the books, I'm pretty sure this idea is derived from Burroughs.

This history is implicit in the 17th-century extension of infectious and contagious to refer to emotions, attitudes and ideas. I guess that it wouldn't be a surprise to find that some third-century Romans saw Christianity as a plague, or that the rest of the world saw Islam that way during its phase of expansion.

One thing I wonder: it's pretty obvious to compare the spread of some disliked ideology or religion or social group to the spread of a disease. When does this analogy start to apply to positively-evaluated information or attitudes or groups? ]

Posted by Mark Liberman at 05:30 AM

September 03, 2007

Come to think of it, "Arbor Day" doesn't make a whole lot of sense, either

Faces of labor, read the header on the feature that took up the left-hand two-thirds of the front page of today's San Francisco Chronicle, over a picture of a 74-year-old railway worker. "On this Labor Day," the copy read, "we take a look at some workers who do their jobs behind the scenes, quietly helping the world go around, as a thank-you to them and to others who play an unsung role in our community." The runover had photos of four more workers: the assistant director of the San Francisco Opera, a recently hired clerk at a local bookstore, an engineer at a local radio station, and a doula from Marin. Probably one or two of them were union members, but the Chron made no mention of that, or of unions at all, and the l of labor in that header was lower-cased, in what turned out to be the word's only appearance in the text other than in the name of the holiday itself.

Nowadays, it seems, the "Labor" of Labor Day has been stripped of any semantic association with the movement that initiated the holiday back in 1882, when the Central Labor Union in New York City proposed a street parade to demonstrate "the strength and esprit de corps of the trade and labor organizations," followed by "a picnic or festival in some grove." The labor movement may be "the folks who brought you the weekend," but we manage to enjoy this one without a word for its sponsor.

Posted by Geoff Nunberg at 08:21 PM

GendergapGirl

According to Elizabeth Jensen, "A New Heroine's Fighting Words", 9/2/2007, NYT:

THERE’S a new superhero on the block this fall, and she might just have the strength (or as she would most likely say, the “fortitude”) to render a big vocabulary cool among schoolchildren.

The weapon of choice for PBS’s new “WordGirl” is words: the more expressive, the better. When the fifth-grader Becky Botsford dons her red cape and spits out mouthfuls like “preposterous” and “bicker” and “cumbersome,” her enemies — from the often-tongue-tied Chuck the Evil Sandwich-Making Guy (whose name is a chance for WordGirl to define “absurd”) to the Butcher, who mangles words while hoarding meat — capitulate.

In search of a viral-marketing boost, PBS has posted nine clips on YouTube. (Topic for another post: how "viral" became a good word...)

Considering the cast of writers and actors that Jensen describes below, I found the WordGirl clips a bit disappointing:

“WordGirl” draws its writers not from the ranks of children’s television but from places like the satirical newspaper The Onion and Fox’s twisted adult cartoon series “Family Guy.” The voice of the narrator, Chris Parnell, will be better known to adults from “Saturday Night Live.”

Many of the actors who voice the characters come from the improv comedy world, and riffing was encouraged during taping sessions; occasional crackups audible on the soundtrack are infectious.
The cast includes Jeffrey Tambor (“Arrested Development”) as Mr. Big; Fred Stoller (“Everybody Loves Raymond”) as Chuck the Evil Sandwich-Making Guy; Tom Kenny (best known as the voice of SpongeBob SquarePants on Nickelodeon) as Dr. Two-Brains; Ryan Raddatz as Todd Ming, better known as Scoops, Becky’s fellow reporter at the school newspaper; and, as the heroine, Dannah Feinglass, a former cast member of “Mad TV” on Fox.

The quality of the jokes aside, it seems that WordGirl teaches vocabulary mainly by correcting other people's word choices, or using (and condescending to explain) a rare and formal word when a commoner and less pretentious one would have worked just as well. This reinforces the idea that knowledge serves mainly to one-up or impress other people. It also reinforces the idea that intellectuals are snarky and obnoxious.

I'm not the only one who picked up on this aspect of the show. Rob Owen ("Tuned In: The good word on kids' shows", 9/2/2007, Pittsburgh Post-Gazette) wrote:

I have particular affection for this series because it provided laughs to me and others throughout July's TV critics press tour. Whenever a questioner or the person answering a question mangled the English language during a press conference, my buddy and fellow critic would lean over and say, "We need WordGirl!"

Given that pre-adolescents are traditionally in search of ways to feel superior to adults -- not that they ever need to look very far -- perhaps exactly what the nation needs these days is a role model to inspire more fifth graders to become snarky and obnoxious by using big words and correcting the usage of others. I think the idea, though, is just to use the snark as a way to sugar-coat the pill of vocabulary enrichment, not to turn out a whole generation of David Foster Wallaces.

But children's shows are inevitably about symbols of identity as well as about transmission of literal content. (I don't think it's an accident that the generation who grew up with the multicolored fur of the Sesame Street puppets took to dyeing their hair orange and green and purple in college.) And from the point of view of identity symbols, the role of gender and class in WordGirl also deserves a comment.

WordGirl is presented as the champion of English as a literary language. Her super-enemies are almost all adult males, including especially The Butcher, complete with deep voice, five-o'clock shadow, male pattern baldness, and stereotypical working-class accent. (The Butcher's superpower apparently involves forcing opponents to eat too much meat, but that's a topic for another post.) So I wonder: how long will it take, after the pilot airs at 4:30 this afternoon, before the first schoolboy with an interest in reading and writing is nicknamed "WordGirl" by his classmates? My bet is on the morning recess at school tomorrow, in those areas where the schools are already in session.

Maybe the show's creators have a plan for dealing with this issue, I don't know. But on the face of it, the clips posted to YouTube suggest that this show could have a large and negative impact on the educational gender gap.

I've been vocally skeptical about the genetic and neurological basis that some have claimed for this problem; and it remains unclear whether the problem is that boys are doing worse, or that girls are doing better. But the gap is real, and the way to improve things may not be to associate vocabulary improvement with a 5th-grade female superhero correcting the English of adult male villains.

The show's regular slot will be 3:30 p.m. on Fridays and 10 a.m. on Sundays.

[If it's not clear to you why WordGirl is in any way problematic, try to imagine the reaction to a show where MathBoy defends the world against villains like Ms. BadAd, a PR consultant who doesn't understand percentages, and CheckoutGirl, who is too dim to make change correctly.

Apparently a female villain, "Lady Redundant Woman", is due to appear in a WordGirl episode next spring -- though her fault, I imagine, will be to use too many words. And according to the Wikipedia entry, two of the 30 WordGirl shorts (shown on PBS Kids GO! last fall) featured the villain Granny May, although neither of these appears to be on YouTube -- and it's not clear that Granny has any vocabulary problems. The WordGirl web site identifies her as "a mean, grumpy criminal ... [who] plays the role of a feeble, kindly, hard-of-hearing grandma in order to deceive the city and rob everyone blind".

Perhaps at some point we'll learn that Y chromosomes are also to be found on the planet Lexicon, where WordGirl comes from, and that not everyone who misuses or misunderstands the vocabulary of Standard Written English is male. But as presented in the YouTube clips, the show seems designed to persuade boys that words are for girls.]

[Daniel Tobias writes:

Back in the '70s, on the PBS kids' show "The Electric Company", they had a male alphabet-based superhero, Letterman. (I don't think his first name was David.)

Right -- though Letterman's opponent was Spell Binder, voiced by Zero Mostel, who was not presented as an illiterate working-class adult female. The types of conflict involved are indicated by this Wikipedia quote:

In the first part, Spell Binder sneaks into Letterman's home in order to exact revenge for being foiled, time and time again. He spies upon Letterman, who happens to be packing away all of his letters into a trunk as he prepares to go on vacation. Spell Binder changes the "trunk" into "junk," then proceeds to shrink Letterman down to six inches tall by changing "junk" into "shrunk." However, Letterman then changes "shrunk" into "hunk," which restores his college-football-player physique. He then proceeds to bend Spell Binder's wand, rendering it useless. In the second part, Spell Binder has been placed behind bars. The narrator of the episode declares, "This looks like the end of a fiend," which inspires Spell Binder to use his bent wand as a letter "r" to change "fiend" into "friend." A strange-looking monster appears (as the narrator exclaims, "I didn't know he had any friends!") and bends the bars of Spell Binder's prison cell, allowing him to slip out and escape.

]

[Idiotgrrl, commenting at Suzette Haden Elgin's Ozarque weblog, wrote:

Dennis the Menace's Margaret. Lucy Van Pelt. That's the image these people are setting up, and considering the background of these writers, I can only think they're doing it on purpose. How long before we're back to Philip Wylie's sexist screed "Generation of Vipers"?

Yes, exactly. On the other hand, lizthefair commented:

I remember wanting to use exactly the right word to express myself, and being told I should use "easier" ones so people would know what I was talking about--never mind that I would no longer really be saying what I meant. If this show in any way makes it "cool" or even just "acceptable" for kids to use the words they know than I'll be happy.

We can hope that the show will have that good effect for many kids -- and even better, will teach a lot of vocabulary. But I'm afraid that for some of the kids who need it the most, the effect may be to amplify the feeling they already have, that they're not (and shouldn't be) among the Margarets and Lucys of the world. ]

Posted by Mark Liberman at 08:17 AM

September 02, 2007

My Baby Cs

Fun for the (U.S.) holiday weekend, from Wendy Nather on the Friends of Elizabeth Zwicky mailing list: an alternative version of the alphabet song that she's been teaching her three-year-old:

Baby CD effigy
Hijack Elmo entropy
Curious SUV
www dot xyz
Now I know my Baby Cs
Next time won't you sing with me?

There might be a problem when the kid gets to preschool.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 01:11 PM

Immanuel Kantzendine

Steven Gross sent in this quotation from Kant's Lectures on Ethics (p. 202 of the Cambridge Translation):

"Just as practical taciturnity is an excess on the one side, so is loquacity on the other. The first is a male shortcoming, the second a female one. Some writer has said that women are talkative, because the upbringing of infants is entrusted to their care, and that by reason of their chattiness they soon teach children to talk, since they are able to keep babbling to them all day long; among men, however, it would take the children much longer to speak."

This much is consistent with the gospel according to (folks like) Louann Brizendine. Kant continues:

"Taciturnity is an odious habit. We are irritated by people who say nothing. They betray a sort of pride. Loquacity in men breeds contempt, and is unbecoming to their strength."

Posted by Mark Liberman at 07:02 AM

September 01, 2007

Prosody and anaphora (again)

From Ken Belson, "At I.B.M., a Vacation Anytime, Or Maybe No Vacation at All", NYT 8/31/07, p. 1:

"If leadership never takes time off, people will be skeptical whether they can," said Kim Stattner of Hewitt Associates, a human resources consultant. "There is a potential for a domino effect."

On first reading, I took they to refer to the company's leadership: never taking time off suggests that they aren't in fact able to do so, that they're compelled to work. On reflection, and taking into account the following allusion to a domino effect, it became clear to me that Stattner was saying that employees ("people") would be reluctant to take time off. My first reading comes from taking the pronoun they to have the ordinary prosody of anaphoric pronouns: unaccented. What Stattner actually said, however, surely had a contrastive accent on they, a prosody that's not represented in any way in the Times report.

There are ways to represent this prosody. For example, in a Zits cartoon I reproduced here on May 12, a contrastive accent is indicated by bold-faced italics:

Jeremy's mother: I trust Hector. Hector is a Good Boy. ["Good Boy" is in italic script, suggesting yet another special prosody]

Hector objects to Jeremy: I don't call her names.

But the Times (like newspapers in general) is very sparing indeed with special fonts within the body of stories or editorial pieces (a number of readers have suggested to me that this allows for material to be sent electronically as plain ASCII text). A while back, I posted on another NYT piece where a special font would have been useful to represent contrastive accent:

Reducing unintended pregnancy is the key -- half of pregnancies are unintended, and 4 in 10 of them end in abortion.

To get the intended interpretation here, them must be read with contrastive accent, which is not represented in the text as printed. In this case, these or those would have done the trick, but in the vacation story the writer was pretty much stuck with the words Stattner uttered; the only good alternative is to shift from direct quotation to a less direct representation of Stattner's words -- something like:

Kim Stattner of Hewitt Associates, a human resources consultant, observed that if leadership never takes time off, people will be skeptical whether they themselves can.

Probably the writer didn't see the problem here: as in the abortion story, the writer no doubt heard the words in his head with contrastive accent, and didn't see that what was on the page could very easily be read otherwise.

[Addendum 9/2/07: A correspondent writes to say that the -s on "takes" indicates that "they" cannot refer to the leadership. As we've pointed out several times here on Language Log, the facts of usage are that a great many speakers allow "they" to be anaphoric to a collective noun -- that is, to refer to the members of an entity introduced into the discourse via a collective noun like "leadership" -- and that this possibility is available even when, as here, the collective noun has singular number agreement. For such speakers (I am one, and Mark Liberman and Geoff Pullum are others), the unintended interpretation, with apparently unaccented "they" referring to the leadership, is easily and immediately available.]

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 12:07 PM

Zolf bar baad

Today's NYT has a piece on Mohsen Namjoo (Nazila Fathi, "Iran's Dylan on the Lute, With Songs of Sly Protest", 9/1/2007). Searching on YouTube for Namjoo turns up this lovely song, a setting of a 14th-century ghazal by Hafez, as the first hit:

In this case, the "sly protest" appears to be mainly that the embattled actress Zahra Amir Ebrahimi is featured in the accompanying video. (But perhaps Hafez is intrinsically problematic in today's Iran; I don't know.)

When the video came out back in March, the Persian poem and an English translation were posted at TehranAvenue. I don't know any Persian, but in this case, I'll bet that the translation loses even more of the original than usual. An interlinear version doesn't seem to be available -- if someone can help me to create one, let me know, and I'll post it here when we're done.

[One of the YouTube commenters suggests that Namjoo is more like Vladimir Vysotsky than like Bob Dylan, and after listening to a few of his songs, this seems right to me.]

Hafez was one of the most important and influential Persian writers, and also played an interesting role in the history of Western linguistics and of English literature. William Jones, later known as the author of the Indo-European hypothesis, published in 1771 a Persian Grammar that includes many quotations from Persian poetry, especially that of Hafez:

I shall in this manner quote a few Persian couplets, as examples of the principle rules in this grammar: such quotations will give some variety to a subject naturally barren and unpleasant; will serve as a specimen of the oriental style; and will be more easily retained in the memory than rules delivered in mere prose.

He begins the grammar (pp. 10-12) by quoting an "ode" of Hafez, in full and in Persian, and ends it with a paraphrase of the poem (p. 135) and a (separate) verse translation in English. (Unfortunately, this is not the same poem that Mohsen Namjoo sings in the video above.)

Here's how the Persian version of the first couplet is printed in Jones' grammar -- as the illustration of the section "Of Vowels"!

Here's Wikipedia's Unicode version of the same couplet (I think...):

اگر آن ترك شيرازى بدست‌آرد دل ما را
به خال هندويش بخشم سمرقند و بخارا را

and here's Jones' transliteration:

Egher ân turki Shirázi bedest âred dili mára
Bekháli hindúish bakhshem Samarcandu Bokhárára.

Jones' paraphrase:

If that lovely maid of Shiraz would accept my heart, I would give for the mole on her cheek the cities of Samarcand and Bokhara.

Here's his version in verse. It might have been written by Coleridge, who was born the year after Jones' Persian Grammar was published.

Sweet maid, if thou wouldst charm my sight,
And bid these arms thy neck infold,
That rosy cheek, that lily hand
Would give thy poet more delight
Than all Bocára's vaunted gold,
Than all the gems of Samarcand.

According to the Wikipedia article on Hafez,

In one famous tale, "a tradition too pretty to be trusted" says a noted historian, the famed conqueror Timur the Lame angrily summoned Hafez to him to give him an explanation for [this couplet...]

With Samarkand being Timur's capital and Bokhara his kingdom's finest city, "With the blows of my lustrous sword," Timur complained, "I have subjugated most of the habitable globe...to embellish Samarkand and Bokhara, the seats of my government; and you, miserable wretch, would sell them for the black mole of a Turk of Shiraz!" Hafez, so the tale goes, bowed deeply and replied "Alas, O Prince, it is this prodigality which is the cause of the misery in which you find me".

So surprised and pleased was Timur with this response that he dismissed Hafez with handsome gifts.

An interesting essay by Dick Davis, cited back in 2004 by Language Hat: "On Not Translating Hafez."

Posted by Mark Liberman at 05:58 AM

AC	anterior commissure
VP	tip of vocal process
AnAC	angle of bilateral vocal folds at AC
GWP	glottic width at vocal process level
LEG	length of entire glottis
LAG	length of anterior glottis
LPG	length of posterior glottis
LMF	length of membranous vocal fold

	Male	Female	Ratio M/F
AnAC in degrees	16	25
LMF in mm	15.4	9.8	1.57
GWP in mm	4.3	4.2	1.02
LAG in mm	15.1	9.5	1.59
LPG in mm	9.5	6.8	1.40
LEG in mm	24.5	16.3	1.50
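
As a quick check on the "Ratio M/F" column, here is a minimal sketch that recomputes each ratio from the male and female values in the table and rounds to two decimal places. The dictionary and function names are illustrative assumptions, not anything from the source; the AnAC row is left out because the table gives no ratio for it.

```python
# Male and female values (in mm) copied from the table above.
# The names used here are hypothetical, chosen only for this sketch.
measurements_mm = {
    "LMF": (15.4, 9.8),
    "GWP": (4.3, 4.2),
    "LAG": (15.1, 9.5),
    "LPG": (9.5, 6.8),
    "LEG": (24.5, 16.3),
}


def male_female_ratio(male, female):
    """Return the male/female ratio rounded to two decimal places."""
    return round(male / female, 2)


ratios = {
    name: male_female_ratio(male, female)
    for name, (male, female) in measurements_mm.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Under these assumptions, the recomputed values agree with the ratios printed in the table.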