Last week, according to a widely reproduced AP story, New York Governor George Pataki vetoed a bill that would have required the use of politically correct terminology in laws, regulations, and charters when referring to people with disabilities. Pataki said that he vetoed the bill because it established "vague and subjective" standards and observed that not only do preferences change over time but that people with the same disability disagree as to what terminology they prefer.
I'm afraid that I have to side with Governor Pataki on this one. The bill isn't about avoiding obviously and egregiously offensive terminology like gimp and crip. To my knowledge not a single New York law or regulation uses such terms. According to its sponsor, Harvey Weisenberg, the bill would have required the disabled to be called instead people with disabilities. Hunh? I see no relevant difference in meaning between the two. The main difference is that expressions like people with disabilities are longer and in many contexts will be more awkward and less readable. Yet another impediment to clarity and readability of laws and regulations is something we really do not need.
The bill also reflects the assumption of a form of the Sapir-Whorf hypothesis. Assemblyman Weisenberg is quoted as saying:
By using the correct language in legislation, New York state lawmakers can make a positive impact on how people with disabilities are perceived by society.
I doubt it. It could be true, but I find it striking that this and other similar ideas are put forward by serious people without a shred of evidence.
I have no idea what disabled people, and people with disabilities, think about this. Their thoughts on the matter, if any, are not mentioned in any of the news articles and do not turn up on casual Googling, but I know from my own experience that well-meaning people seem to come up with nonexistent distinctions that mean nothing to those they are trying to benefit. An acquaintance once explained to me that he thought that one should never say that someone is a Jew, always that someone is Jewish. He thought that He is a Jew. is somehow offensive, while He is Jewish. is not. I don't know of any linguistic or grammatical principles from which this would follow, nor of even a smidgen of evidence that anyone else shares his perception. I certainly don't, and I am a Jew. That's how I put it. If anything, I prefer He is a Jew. to He is Jewish because I associate the latter usage, rightly or wrongly, with people who wrongly consider being Jewish to be purely a matter of religion, like being a Baptist or a Hindu.
According to a CNN story about New Orleans mayor Ray Nagin's frustration over lack of coordination,
"There is way too many fricking ... cooks in the kitchen," Nagin said in a phone interview with WAPT-TV in Jackson, Mississippi, fuming over what he said were scuttled plans to plug a 200-yard breach near the 17th Street Canal, allowing Lake Pontchartrain to spill into the central business district.
Arnold Zwicky, who told me about this by email, wondered about three editorial differences between this textual version and what he remembered from hearing the clip on CNN TV news, which he rendered as
There's way too many frickin' -- excuse me -- cooks in the kitchen.
CNN has the clip on their website (if the link doesn't work, try going through the story linked above), so I was able to verify that Arnold's memory is exact. I've extracted just the cited phrase here.
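The three differences, obviously, are "There is" in place of "There's", "fricking" in place of "frickin'", and an ellipsis in place of "-- excuse me --".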
With respect to the first and second points, Arnold was curious about whether his memory was wrong, or perhaps the Mayor was using informal language in a formal register. But Arnold's memory was exact, and I suppose that the partial formalization of the quote comes from a CNN copy editor, or an assumption by the writer that these are the right ways to render such things.
Opinions differ about how to transcribe informal speech. I tend to agree with the practice of writing -ing rather than -in', as Mark Twain did, though "fricking" in particular seems kind of silly. I feel that removing contractions is a bad idea, since in conversational English the lack of contraction in cases like this often indicates either some sort of emphasis or some extra dose of formality. In this case, the un-contraction is especially odd, since the singular "is" is itself non-standard. Still, Mark Twain himself did this, as in his maxim "The trouble ain't that there is too many fools, but that the lightning ain't distributed right".
Eliding the excuse serves to make a punchier quote, but a bigger issue is the extent to which the quote itself was set up by the interviewer. Here's the whole context:
Ray Nagin: Uh I'm a very impatient person, I would love to see those uh resources come a lot quicker, and I would love to see some of the chiefs that keep showing up down here to kind of stay away for a minute and let us get to- these implementations uh phases adequately done.
Interviewer: Are- are there too many cooks in the kitchen, is that what I hear you saying, Mr. Mayor?
Ray Nagin: Absolutely, in my opinion, there's way too many frickin' -- excuse me -- cooks in the kitchen, we had this implementation plan going, they should have done these uh sandbagging operations first thing this morning, and it didn't get done and I- quite frankly I'm very upset about it.
This is the sort of ritual exchange that Rasheed Wallace lampooned here. It serves its purpose -- here we are, writing about Mayor Nagin's remarks, which took on a force that they would have lacked if the quote were just "I would love to see some of the chiefs that keep showing up down here to kind of stay away for a minute" or something similar.
Nagin apparently had good reason to be upset, and the reporter helped him to express his anger in a way that got people's attention:
The National Weather Service reported a breach along the Industrial Canal levee at Tennessee Street, in southeast New Orleans, on Monday. Local reports later said the levee was overtopped, not breached, but the Corps of Engineers reported it Tuesday afternoon as having been breached.
But Nagin said a repair attempt was supposed to have been made Tuesday.
According to the mayor, Black Hawk helicopters were scheduled to pick up and drop massive 3,000-pound sandbags in the 17th Street Canal breach, but were diverted on rescue missions. Nagin said neglecting to fix the problem has set the city behind by at least a month.
"I had laid out like an eight-week to ten-week timeline where we could get the city back in semblance of order. It's probably been pushed back another four weeks as a result of this," Nagin said.
"That four weeks is going to stop all commerce in the city of New Orleans. It also impacts the nation, because no domestic oil production will happen in southeast Louisiana."
So it's easy to sympathize with what the reporter did: the conventions of the genre force him to make the point by asking the mayor a leading question, rather than expressing an opinion in his own voice.
[Update: Arnold Zwicky emailed
small additional subtlety on "there's" vs. "there is": "there is" + NPpl is indeed nonstandard (and somewhat more common in the south and south midlands than elsewhere, i believe -- i'm away from my sources on this today), but "there's" + NPpl should really be characterized, in current english, as merely informal/colloquial, rather than nonstandard. millions of people (like me) who wouldn't use "there is two people at the door" are entirely happy with "there's two people at the door". so the two versions differ not only in emphasis and/or formality, but also (for many of us) in standardness.
]
Yesterday's Washington Post featured a column by Ruth Marcus entitled "Powerpoint: Killer App?" It begins with this very provocative first paragraph:
Did PowerPoint make the space shuttle crash? Could it doom another mission? Preposterous as this may sound, the ubiquitous Microsoft "presentation software" has twice been singled out for special criticism by task forces reviewing the space shuttle disaster.
The rest of the column is, as columns like these often are, equal parts funny and disturbing -- and each in several ways. I'm one of those sad folks who use Microsoft products like PowerPoint out of some ill-defined sense of necessity, and I'm always down for some Microsoft (product) bashing. However, I won't tolerate the gratuitous bashing of second-grade students' writing abilities nor that of those students' teachers' abilities to instruct them in writing.
I'm referring to this paragraph from the column, with emphasis added:
The most disturbing development in the world of PowerPoint is its migration to the schools -- like sex and drugs, at earlier and earlier ages. Now we have second-graders being tutored in PowerPoint. No matter that students who compose at the keyboard already spend more energy perfecting their fonts than polishing their sentences -- PowerPoint dispenses with the need to write any sentences at all. Perhaps the politicians who are so worked up about the ill effects of violent video games should turn their attention to PowerPoint instead.
Almost certainly, Marcus makes this claim in the absence of (a) any qualitative evidence of how "tutoring in PowerPoint" proceeds in second grade (was this vulnerable age group used simply for rhetorical effect?) or (b) any quantitative evidence of the ratio of time/"energy" that students spend "perfecting their fonts" vs. "polishing their sentences". Reading this criticism-founded-on-PTA-anecdote of both students and teachers has the bitter aftertaste of poor research on important issues -- I'm not saying the claims are false, only that I'm certain that they haven't been shown to be true in any significant way and that I don't see what good they do for any kids.
Though I do agree about the video games bit at the end of the above-quoted paragraph, at least when it comes to politicians. Leave the real work to the psychologists.
[Link to the column courtesy of Paul de Lacy.]
[
Update: John Lawler writes to tell me how well-worth reading Edward Tufte's piece "The Cognitive Style of PowerPoint" (cited by Marcus) is. I haven't spent the $7 to order it yet, but Marcus also cites a freely available story by Tufte that appeared in Wired Magazine in 2003, "Powerpoint is Evil" ("Power corrupts. PowerPoint corrupts absolutely."), where Tufte mentions the use of PowerPoint in "elementary school":
Particularly disturbing is the adoption of the PowerPoint cognitive style in our schools. Rather than learning to write a report using sentences, children are being taught how to formulate client pitches and infomercials. Elementary school PowerPoint exercises (as seen in teacher guides and in student work posted on the Internet) typically consist of 10 to 20 words and a piece of clip art on each slide in a presentation of three to six slides -- a total of perhaps 80 words (15 seconds of silent reading) for a week of work. Students would be better off if the schools simply closed down on those days and everyone went to the Exploratorium or wrote an illustrated essay explaining something.
I don't think I can accuse Tufte of not thoroughly researching this, but I would like to see more evidence (and will therefore probably fork over the $7) for the references here to "teacher guides" and "student work posted on the Internet" -- this hardly seems like cause for this level of alarm, or for the conclusion that PowerPoint tutelage is somehow replacing "learning to write a report using sentences".
(Sidenote: Tufte's Wired story is accompanied by a somewhat different point of view on PowerPoint by David Byrne.)
Plus: Language Log's own Geoff Nunberg writes to remind me of his 1999 piece "Slides Rule" from Fortune Magazine, which also appeared in The Way We Talk Now, pp. 213-215.
]
Last spring, Jean-Noël Jeanneney warned us about "cette inquiétude lancinante du n'importe quoi, de la dispersion du savoir en poudre" ("this throbbing anxiety for anything and everything, for scattering knowledge like dust"). Well, according to The Onion, here comes the vacuum cleaner: Google Purge.
"Our users want the world to be as simple, clean, and accessible as the Google home page itself," said Google CEO Eric Schmidt at a press conference held in their corporate offices. "Soon, it will be."
As John Battelle explains:
"Thanks to Google Purge, you'll never have to worry that your search has missed some obscure book, because that book will no longer exist. And the same goes for movies, art, and music."
As a phonetician, I'm especially excited about Google Sound:
"Book burning is just the beginning," said Google co-founder Larry Page. "This fall, we'll unveil Google Sound, which will record and index all the noise on Earth. Is your baby sleeping soundly? Does your high-school sweetheart still talk about you? Google will have the answers."
Page added: "And thanks to Google Purge, anything our global microphone network can't pick up will be silenced by noise-cancellation machines in low-Earth orbit."
Finally, speech and language scientists will be able to do away with old-fashioned sampling methods, and rely instead on statistics calculated from the entire domain of phenomena under investigation! In fact, scholars and scientists of all types will be able to complete their transformation from field and lab-bench investigations to purely digital research:
Although Google executives are keeping many details about Google Purge under wraps, some analysts speculate that the categories of information Google will eventually index or destroy include handwritten correspondence, buried fossils, and private thoughts and feelings.
The company's new directive may explain its recent acquisition of Celera Genomics, the company that mapped the human genome, and its buildup of a vast army of laser-equipped robots.
I guess this is what Jean-Claude Juncker and other European politicians were talking about when they warned of "virulent attacks" on European culture, fearing that "Google's ambitious plans could result in important European literary works missing out and being lost to future generations". Did Jacques Chirac slip them classified reports from a DGSE mole in Mountain View?
I feel honor bound to warn readers that the Onion is a satirical publication, and this post is a joke... However, I do think that there is a serious point to be made here. And it's not that there is a paranoid strain in European intellectual culture, or that Google's servers are the leading edge of The Matrix.
For me, the lesson is a narrower one, directed at publishers in general, and scientific and scholarly publications in particular. There is growing evidence that Open Access increases impact. In my opinion, this effect is certain to increase, asymptotically approaching the point where publications that are not indexed and accessible on line will effectively cease to exist. No one will have to purge them -- they will have purged themselves.
[Onion link via Kerim Friedman]
So say Cornelia Dean and Andrew C. Revkin in today's NYT. After contributing to one of the disaster relief organizations, you might distract yourself by taking a look at John Cowan's page of Essentialist Explanations. This is essentially a list of 736 sentences of the form <Language X> is essentially <Language Y> <produced under conditions Z>. Some are funny, some are silly, some are mildly offensive, some are nearly true.
A sample:
English is essentially Norse as spoken by a gang of French thugs.
English is essentially the works of Joyce with the hard bits taken out.
Swedish is essentially Norwegian spoken by Finns.
Danish is essentially Norwegian, only you drop out all the consonants, skip all the vowels and then mispronounce the rest.
Spanish is essentially Italian spoken by Arabs.
Francophones are essentially Germans speaking the bad Latin they were taught by Gauls.
French is essentially an attempt by the Dutch to speak a Romance language.
French is essentially a language that elides everything that doesn't get out of the way fast enough, and nasalises everything else.
Russian is essentially Punjabi that fell off the wagon. Contrariwise, Punjabi is essentially Russian with better spices.
Modern Greek is essentially Classical Greek as spoken by Venetians.
Mandarin is essentially Chinese as spoken by Mongols.
Here's the solution to yesterday's encoding puzzle. If you look at the HTML metadata, the page claims to be in ISO-8859-1 (aka Latin-1), an ASCII extension in which things like accented characters occupy codepoints above the ASCII range, while still remaining in a single byte. The claim, though technically true, is misleading. All of the characters are ASCII characters. That is, not a single byte on that page has a value greater than 0x7F. Technically, you can call that ISO-8859-1, since it is consistent with it, but really the page is in the ASCII subset of ISO-8859-1.
Inspection of the page source reveals that the accented letters are each represented by a sequence of two HTML decimal numeric character entities. For example, é (e with acute accent) is not represented by the single byte with value 0xE9, as it would be in ISO-8859-1. Rather, it is represented by a sequence of twelve bytes: &#195;&#169;. &#195; is an HTML representation for Ã (upper case a with tilde); &#169; is an HTML representation for © (copyright symbol). That's why the word représente comes out as reprÃ©sente on your terminal. (Don't anybody write in to say that this is the usual spelling used by dyslexic speakers of North African French when text-messaging after they've had a few drinks or something like that. Writing Unix man pages is a serious, indeed sacred, matter. Learned authors have compared the interpretation of Unix man pages to the study of the Talmud.)
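What do I mean by saying that &#195; is an HTML representation of Ã and that &#169; is an HTML representation of ©? In HTML, a character may be represented in as many as four ways: as itself (©), as a named character entity (&copy;), as a decimal numeric character entity (&#169;), or as a hexadecimal numeric character entity (&#xA9;).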
So, how did é end up represented as &#195;&#169;? Well, &#195;&#169; is an ASCII-fied representation of a sequence of two bytes whose numerical values are 0xC3 (aka 195) and 0xA9 (aka 169). Notice how the use of decimal numeric character entities obscures things. It just happens that 0xC3 0xA9 is the UTF-8 encoding of UTF-32 0xE9. In their pure and ethereal form, Unicode codepoints are all 32 bits, or 4 bytes. For various reasons (discussed previously on Language Log and in more detail here) the preferred form for exchange of Unicode-encoded text is UTF-8, in which every character outside the ASCII range is encoded as two or more bytes.
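For readers who want to check the byte values for themselves, here is a minimal Python sketch. The variable names are mine, chosen for illustration; the codec names are the ones Python's standard library uses. It encodes the single codepoint U+00E9 three ways and shows that the UTF-8 form is the two bytes 195 and 169.

```python
# Illustrative only: the variable names are hypothetical, not from the man pages.
E_ACUTE = "\u00e9"  # é, Unicode codepoint U+00E9

latin1 = E_ACUTE.encode("iso-8859-1")  # one byte: 0xE9
utf8 = E_ACUTE.encode("utf-8")         # two bytes: 0xC3 0xA9
utf32 = E_ACUTE.encode("utf-32-be")    # four bytes, big-endian, no byte-order mark

print(latin1.hex(), utf8.hex(), utf32.hex())  # e9 c3a9 000000e9
print(list(utf8))                             # [195, 169]
```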
To pull all this together, the garbled man pages are what you would get if you started off with a page in UTF-8 and, mistakenly thinking that it was in ISO-8859-1, ran it through an HTML-izer that converted anything outside the ASCII range to numeric character entities.
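The whole chain of events can be reproduced in a few lines of Python. This is a sketch of the process as I have reconstructed it, not the actual software that produced those pages, and the function names garble and ungarble are made up for the occasion. The second function runs the steps backwards, which is roughly what a repair script would have to do.

```python
import re


def garble(text: str) -> str:
    """Hypothetical reconstruction: UTF-8 bytes misread as ISO-8859-1, then entity-ized."""
    misread = text.encode("utf-8").decode("iso-8859-1")
    # Replace every character outside the ASCII range with a decimal entity.
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in misread)


def ungarble(page: str) -> str:
    """Undo garble(): entities back to bytes, then decode those bytes as UTF-8."""
    # Turn each decimal entity back into the character with that codepoint.
    as_latin1 = re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), page)
    return as_latin1.encode("iso-8859-1").decode("utf-8")


print(garble("représente"))  # repr&#195;&#169;sente
print(ungarble("repr&#195;&#169;sente"))  # représente
```

The reversal deliberately uses a plain regular expression rather than a general HTML unescaper, since the numbers here stand for raw byte values, not for characters.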
The reason for using an HTML-izer is that some software, such as the software that runs this blog, cannot handle bytes whose high bit is set. If you enter such a byte into a Language Log entry, it looks fine when you enter it, but you will find the post truncated immediately before the first such byte. So if you want to use non-ASCII characters with confidence in web pages, it is wise to convert them all to character entities. I have written a couple of programs that do this myself.
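A conversion of this kind takes very little code. The following Python sketch is not one of the programs mentioned above; it is just one way to do the job, and the function name to_entities is hypothetical. It relies on the standard xmlcharrefreplace error handler, which replaces each character that cannot be encoded as ASCII with its decimal numeric character entity.

```python
def to_entities(text: str) -> str:
    """Return text with every non-ASCII character replaced by a decimal entity."""
    # 'xmlcharrefreplace' substitutes &#NNN; for each character ASCII cannot encode.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_entities("représente"))  # repr&#233;sente
```

Because the input here is a properly decoded string, é comes out as the single entity &#233; rather than the two-entity sequence seen on the garbled pages.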
Several of our readers figured this out: Diane Bruce responded in the wee hours of last night not long after I posted the puzzle. The others are Aaron Elkiss and John O'Neill.
Recently I looked something up in the GNU/Linux manual pages at http://maconlinux.net/linux-man-pages/fr/strtol.3.html, which are in French, and couldn't get them to display correctly. Most of the text came out fine, but accented letters, and generally anything outside the ASCII range, came out garbled. At first I thought that the browser might be displaying the page using the wrong encoding, but changing encodings didn't solve the problem. The Spanish manual pages at http://maconlinux.net/linux-man-pages/es/strtol.3.html exhibit the same problem.
[Note: when I checked these URLs just now, I got a server error. If it is still acting up, here's a link to the Google cache of the French page.]
Although I couldn't get these pages to display correctly, short of writing a little script to transform them before letting the browser at them, after a few minutes I figured out what had happened to them. The explanation is perfectly straightforward. For now, I'm going to leave the solution as an exercise for the ling-technically inclined reader. I'll post it tomorrow.
[Guest post by Benjamin Zimmer] Linguistic persnicketiness is certainly not restricted to any particular political ideology. But prescriptivist gripes are sometimes grounded in a conservative distaste for loosey-goosey moral relativism and the like. Here are two defenders of language conventions hailing from the political right: one a comic-strip character and one the current Supreme Court nominee.
The first example is Bruce Tinsley's comic strip Mallard Fillmore, marketed as a conservative answer to such left-leaning fare as Doonesbury and The Boondocks. Tinsley describes his protagonist (who bears a striking resemblance to Daffy Duck) as "a seasoned, rumpled ex-newspaper reporter" who "thinks we average, hardworking Americans need a break instead of a lecture." On Sunday, however, Mallard apparently thought we average, hardworking Americans needed a lecture after all (albeit in rhyme): a punctuation rant in the manner of Lynne Truss.
Mallard must have been reading Truss's best-seller, Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation, since he echoes her impassioned plea, "How much more abuse must the apostrophe endure?" But Truss never states a black-and-white rule that apostrophes are only used "when you leave a letter out, or if you want to show possession." There are plenty of exceptions, such as the use of apostrophes in the pluralization of letters (e.g., "mind your p's and q's") or the pluralization of words used citationally (e.g., "the's and a's are reduced before consonants").
Mallard's examples of rampant apostrophization not surprisingly include the much-maligned greengrocer's apostrophe ("fresh apple's"), along with extraneous apostrophes in decade names ("the 80's") and pluralized family names ("the Smith's"). But as "a seasoned, rumpled ex-newspaper reporter," Mallard should know that the jury's still out on apostrophizing decade names. The New York Times house style, for instance, keeps the apostrophe in for names of decades; a search on the Times archive finds five examples of "the 80's" in Sunday's paper alone. On the other hand, given his usual tirades against the liberal domination of the media, Mallard might simply take this as an indication that the bastion of the MSM is too weak-kneed and morally relativistic to enforce proper rules of punctuation.
Rush, the little boy in the strip, warns Mallard that he's "turning into one of those grumpy old grammar cranks." (Linguists have grown accustomed to seeing "grammar" used as a catch-all term to encompass any number of perceived linguistic conventions, from punctuation to usage to pronunciation, depending on the pet peeves of the writer.) As it happens, the apostrophe-abusing New York Times had an article on Monday about a grumpy young grammar crank in the 80s (or the 80's), one who would later in life be nominated to become a Supreme Court justice. Under the headline "In Re Grammar, Roberts's Stance Is Crystal Clear," Anne Kornblut approaches the recently released Reagan-era memos of John Roberts with an eye towards his "tendencies as a grammarian." Roberts, we learn, "frequently peppered notes and documents with minor syntax corrections even when the basic legal arguments were sound." Some of his corrections really did have to do with grammar or syntax, such as his insistence on maintaining consistent parallel structure and pronominal reference. Other cases noted by Kornblut simply indicate a pickiness regarding word choice, such as Roberts's preference for voluntarism over volunteerism, ensuring over insuring, and multilateral over plurilateral. (We would need to see the context of these memos to know what beefs Roberts might have had with the offending words.)
Roberts also took issue with the phrasing of Neil Armstrong's famous line when setting foot on the moon (which was to be quoted by his boss, White House counsel Fred F. Fielding, in remarks at a Kennedy Space Center picnic):
"It is my recollection," Mr. Roberts wrote, "that he actually said 'one small step for a man, one giant leap for mankind,' but the 'a' was somewhat garbled in transmission. Without the 'a,' the phrase makes no sense."
Roberts is right that the phrase makes no sense without the "a", but he should take it up with Armstrong himself. The line wasn't actually "garbled in transmission" — Armstrong flubbed it, as the Snopes urban legends website has documented.
Kornblut writes, "If Judge Roberts is confirmed, and his word-consciousness follows him to the court, it will put him in the upper tier of justices who have put a premium on the English language." It's difficult to tell from the analysis of the memos if Roberts's "word-consciousness" will rise above the level of mere curmudgeonliness. But one auspicious omen appears in the graphic sidebar accompanying the article. In a memo from 1983, Roberts complains about how newspaper columnists focused on Ronald Reagan's memorable use of the word keister:
"Frankly, I've had it up to my keister with newspaper columns about an expression fairly common to those of us reared in the Midwest. I have drafted a reply." He concluded: "It is interesting how familiarity with slang phrases often varies among different parts of our country. In this case, excuse the bad pun, but I suppose it may depend on where one was reared."
Roberts makes a rather obvious dialectological point, but it's one that is frequently lost on self-appointed guardians of good "grammar". I take this as a hopeful sign that Roberts is no strict constructionist when it comes to linguistic variation.
[Update (by Mark Liberman): Bez Thomas, among others, reminded us of a classic apostrophe rant in cartoon form, from the left end of the political spectrum:
Both from the right and from the left, (some of) these defenses of linguistic norms are notable for their moral and emotional fervor. As reported in the NYT, Judge Roberts's linguistic strictures are rather even-tempered in comparison.]
I've been catching up on Language Log and various other things. Mark's post about silly things people say about linguistics reminded me of a visit I had from a student some years ago when I was teaching at the University of Northern British Columbia. She came to discuss with me the topic on which she wanted to write her term paper for someone else's course, namely her idea that the Gitksan-Witsuwit'en are the Lost Tribes of Israel.
I began by objecting to the idea that the Gitksan-Witsuwit'en could be the Lost Tribes of anywhere, on the grounds that they aren't a unitary group at all. The Gitksan speak a Tsimshianic language, closely related to Nisga'a, whereas the Witsuwit'en speak an entirely different Athabaskan language, whose closest relative is Carrier, which I have mentioned here from time to time. Their languages are no more similar to each other than English and Navajo. The reason that the term Gitksan-Witsuwit'en exists is that the two were for a time allied for political and legal purposes in the form of the Office of the Gitksan-Witsuwit'en Hereditary Chiefs. This is the organization behind Delgamuukw v. British Columbia, the lawsuit that ultimately led the Supreme Court of Canada, in 1997, to rule that aboriginal title still exists in British Columbia as a burden on the title of the Crown. The fact that these two quite different groups formed an alliance no more means that they shared a common history than does the fact that Turkey and Germany were allied in the First World War. We do not draw from this fact the inference that there is a Turco-Germanic people.
In addition to pointing out, with no evident impact, the fact that there is no such tribe as the Gitksan-Witsuwit'en, I enquired as to what precisely the evidence was, in her view, that the Gitksan-Witsuwit'en were the Lost Tribes of Israel. I was pretty certain that they were not mentioned in the Bible. Was there some other evidence she had in mind?
She immediately demanded to know whether I believed in the Bible. I responded that my view of the truth of the Bible was irrelevant since the Bible had nothing to say about the matter. We went back and forth on this briefly. Then she stalked off, convinced that I was yet another unbeliever whose denial of the truth of the Bible led him to reject her hypothesis about the Gitksan-Witsuwit'en. There's no point in arguing with some people.
I just got the following email message:
Dear user of babel.ling.upenn.edu, mail system administrator of babel.ling.upenn.edu would like to inform you that, We have found that your account was used to send a huge amount of spam messages during the recent week. Most likely your computer had been infected and now contains a trojan proxy server. We recommend you to follow our instruction in order to keep your computer safe. Best regards, babel.ling.upenn.edu user support team.
It is accompanied by a zip file putatively containing the instructions that I am supposed to follow. I imagine that it actually contains a virus, though I'm not going to go to the trouble of finding out. (This is the one downside to running GNU/Linux - if I actually want to try out a virus I have to go find a machine running Microsoft Windows. I feel so left out...)
Anyhow, any native speaker of English will detect a number of errors in the above message, some of them errors or deviations from standard written usage of the sort that a native speaker is not likely to make at all, or even a non-native speaker who has been here long enough to be working as a system administrator. There's the use of a comma at the end of the salutation in place of a colon, the failure to start the first sentence on a new line, the failure to capitalize the first letter of the first word of the new sentence, and the omission of the before mail. Then there is the use of a comma rather than a colon before something set off like a quotation or list entry and the incorrect treatment of a subordinate clause as such. A native speaker would not say recent week instead of past week, or had been infected instead of has been infected. The construction We recommend you to follow... is not English.
Such a plethora of errors should alert just about anyone that the message is a fake. Are the scammers so foolish or ignorant that they don't realize this? It probably wouldn't be too hard to get someone to polish their prose. Or are enough computer users too dense to realize that messages like this are fake that the scammers don't bother?
Now joining the heavy metal umlaut is, apparently, the modish macron. To the right is a picture of the awning sign of a local hair place, VŌG. I walk past it frequently, wondering who's supposed to be attracted by evocations of Vogon style, but I didn't realize it was part of a trend. Recently, Phillip Jennings wrote in with news of "a new downtown Minneapolis salon named all-caps-something-or-other BLŪ", and also a magazine called "Modern HŌM". I can't find any web presence for either of these, but I'll take Phillip's word for it.
If you know of any other examples, send them along. Extra points for cases that don't involve back vowels or capital letters.
This usage apparently imitates the conventions of pronunciation fields in (some) American dictionaries, rather than drawing on the sort of diacritical associations involved in the heavy metal umlaut, or the more general allure of foreign branding. This may be related to Qwest's belief that badly faked dictionary pronunciations are authoritative. However, I imagine that the real motivation is the difficulty (both legal and psychological) of establishing a brand around common words like vogue or home.
Unfortunately, the modish macron doesn't help our campaign to promote the IPA through popular culture.
[Update: Jesse Sheidlower points out PŪR, and mentions
"another one I'm thinking of, that I can't quite place, that so irritates me that I deliberately mispronounce it because I feel so manipulated by the macron".
Marilyn Tarnowski points out the Sprint WordTraveler FŌNCARD.
Eric Bakovic writes that
I used to laugh at a commercial from the (early? mid?) '80s for a shampoo called FOHO ('For Oily Hair Only'), but a quick google search fails to confirm my possibly wrong memory that both Os had macrons over them. But this doesn't really disconfirm my memory either; the product's been (predictably) discontinued, and the few hits I got only had ASCII-text examples, no images of the labels or anything like that. I did discover that it used to be a Gillette product, but that's about it.
Aaron Dinkin writes that
I seem to remember that there was a brand of juice box called "Boku" - macrons over the O and the U, and it was pronounced "beaucoup".
More information about BŌKŪ can be found here.
And Chris Waigl gets extra points for reminding me of her 2/8/2005 post on IPA and exoticism, which includes the example of séxūal, with a lower-case u macron.]
[Update #2: David Low was the first of several readers to point out that in Episode 9F22 of The Simpsons, Sideshow Bob is shown with LUV and HĀT on his knuckles (like other characters on that show, he has just three fingers plus a thumb on each hand). This seems less like a "modish" macron and more like a creative way to update an old movie reference (picture here) for consistency with a cartoon anatomical convention.]
[Update #3: David Doherty also gets extra points for the lower-case o with macron in the Seattle nightspot TōST.]
[Update #4: Ed Keer at Watch Me Sleep pointed out that the board game Hūsker Dū uses macrons, which the rock band Hüsker Dü changed into heavy metal umlauts; according to the wikipedia entry, "The name of the game is spelled with macrons to emulate Scandinavian letters with macrons over them (even if macrons are only used in hand-written text)", and the game was originally published that way in Sweden in the 1950s, so if there's any connection to the new VŌG for macrons, it can only be because of some childhood experience of today's marketeers.]
[Update #5: Rebekka Puderbaugh mailed in a link to Zōe's Flax & Soy products.]
[And reported by Andrew Malcovsky, the PAYDĀTA company in Vermont...]
[And here's another example, the Riō mp3 player, submitted by Kilian Hekhuis.]
[And another: Cepacol, submitted by Mark Wayne.]
In the midst of the disaster, some people are still worried about usage and pronunciation:
When this is over, someone please tell Tucker Carlson and the other national newscasters that "St. Louis" is in Missouri, and we call our town "Bay St. Louis". Hope they don't try to pronounce Pascagoula, Gautier, or Delisle.
Here's hoping that Tucker Carlson's misrendering of toponymic shibboleths is the worst damage they suffer.
I first saw the new antihero last year on a waitperson's chest (slogan: "Cute but psycho"), but I didn't know her name then, or even that she was a nameable phenomenon. A few days ago, courtesy of a junior-schooler excited about her new t-shirt (slogan: "not listening") I learned that it's "Happy Bunny". Wikipedia shows Happy Bunny in the (literal) mug shot linked on the right.
But actually, it's not Happy Bunny, it's It's Happy Bunny, even in subject position: "Does It's Happy Bunny dislike Boys?? Of course not. It's Happy Bunny dislikes everybody." Likewise after which, as in the numerous "which IT'S HAPPY BUNNY are you?" quizzes.
Joanne Jacobs reports that
Some blunt-spoken Happy Bunny messages, including "You're ugly and that's sad" and "It's cute how stupid you are," wouldn't make the cut at Highland Park High School.
"We consider that harassment, and we just don't allow it," Principal Jack Lorenz said.
Though the target demographic is very different, this reminds me of the BOFH phenomenon -- both BOFH and IHB involve openly flaunting well known but traditionally covert hostility.
IHB slogans like "I think I gave you crabs" hint that the original It's Happy Bunny target might have been a bit older and more cynical than the group that has actually responded. And indeed this article quotes IHB's inventor, Jim Benton, confirming this:
When Benton originated It's Happy Bunny, he expected the products bearing his artwork -- including a handful containing anti-boy phrases -- to appeal to young women ages 16 to 26. "It actually turned out to be much broader in appeal than we thought," he says. In the Bay Area, for instance, It's Happy Bunny can be found in shopping malls at Claire's, a nationwide retail chain that targets its accessories to girls ages 7 to 12.
IHB's role in validating adolescent female hostility is none of our business here. Instead, I want to make a linguistic point -- phrasal names like It's Happy Bunny introduce into English, in a small way, the phrasal names that are dominant features of many other cultures and languages. The most common source for this kind of thing in English has been bands whose names are sentences like They Might Be Giants or Frankie Goes to Hollywood. The ease with which such phrasal names enter general use seems to show that the difference in this respect between English and (for example) Yoruba is more a matter of general cultural choice than of linguistic structure.
[As far as I know, IHB consumers are all female, but there seems to be some uncertainty about the gender of the bunny itself.]
"Refreshingly simplistic," was how a VH1 reviewer described a new CD by some artist whose name I didn't recognize. I couldn't jot it down, since I was wheezing on a treadmill at the time, but a Google search turns up 425 instances of the phrase, with results that are variously comical and bizarre. A Web design company boasts that its work is "stylish and refreshingly simplistic." SunMex Vacations tells vacationers that "most of Mazatlan remains refreshingly simplistic." And an Amazon.com customer review of Nelson Goodman's classic Fact, Fiction, and Forecast says, "The way that Goodman perceives our inductive system is unique and refreshingly simplistic." The press isn't immune, either -- the phrase is rare there, but it turns up in a 1996 article in Billboard and a 1998 article in The Independent.
A familiar sort of malaprop, but there's a bit more going on here.
That analysis of simplistic as merely a fancy synonym for simple seems to be implicit in the word oversimplistic. If you accept Merriam-Webster's definition of simplistic as "oversimple," then oversimplistic would be a pleonasm. Yet the word gets more than 14,000 Google hits and appears in 178 stories in Nexis major newspapers (the earliest cite I've found is from a 1970 story in The New York Times, but this would probably be easy to antedate). In fact Merriam-Webster's gives oversimplistic as a run-in in the entry for the prefix over-, and while the OED doesn't list oversimplistic as a word, the editors actually use it in their definition for nothing-but-ism: "An oversimplistic approach to the explanation of a phenomenon, which excludes complicating factors; reductionism."
You could argue, of course, that the over- of oversimplistic is chiefly an intensifier, the way it is in items like overbrutal, overfacile, overfussy, and overhasty, in all of which the root itself carries an implication of excess. But the existence of phrases like "refreshingly simplistic" shows that for some people, at least, simplistic itself has acquired a purely positive meaning. My guess is that this development is helped along by an analogy with simplicity. Someone looking for an adjectival version of "refreshing simplicity" (6290 Google hits) might be drawn to "refreshingly simplistic," particularly given the effective absence of the intermediate forms simplism and simplist that words ending in -istic tend to imply. (The words actually exist, but are rare and recondite.)
No, I don't want to disown Jakob and Wilhelm Grimm, the first of whom is something of a hero of historical linguistics. I want to disown the movie The Brothers Grimm, and I'm doing this on behalf of linguists everywhere.
What the movie has in common with the real world is: two brothers named Grimm, early-19th-century Germans who were involved with fairy tales. As far as I can tell, that's it. Imagine a Life of Noam in which, through the miracle of miniaturization, the heroic Chomsky (played by Brad Pitt in a revealing latex bodysuit) takes a band of brawling adventurers into the deepest recesses of the human brain, to recover bits of the language organ for sale through his start-up company -- a sort of cerebral 21st-century Fantastic Voyage. Appalling.
In any case, not a movie to put on the recommended viewing list for students in your intro linguistics classes.
[Update, 8/30/05: Correspondents have now suggested two alternative scenarios. First, from Andrew Malcovsky, on his blog, a proposal that sticks much more closely to the historical facts than Terry Gilliam did, yielding something that might be entitled The True Adventures of Will and Jake.
And then from Tim Fitzgerald, who finds my Brothers Grimm/Chomsky comparison unfair (since "these two men have been chosen for their role in storytelling"), a counterproposal for "a much more apt comparison":
... 200 years from now a movie (or whatever form of mass entertainment they may use) on Spielberg's harrowing attempt to fight off dinosaurs from the Temple of Doom with the help of his loving extra-terrestrial friend.
Ah, California Spielberg and the E-Temple of Doom. Please, don't write to tell me that Steven Spielberg was born in Cincinnati, Ohio. I know that. But he belongs, truly belongs, to California].
zwicky at-sign csli period stanford period edu
No more inveigling in California courts, according to a story by AP legal affairs writer David Kravets that appeared on 8/25/05 in the San Francisco Chronicle:
When California jurors sit on kidnapping cases, judges will no longer be required to explain that the perpetrator had to "inveigle" his victim.
Instead, as part of an eight-year effort to simplify jury instructions, the judge may say it like it is -- "enticed" his victim.
The new guidelines also revise the characterizations of (among others) "reasonable doubt" and "mitigation" and, in a move objected to by many prosecutors, have them referred to as "prosecutors" rather than as "the people". Though the changes are modest (intentionally so, according to lawyer-linguist Peter Tiersma, who helped craft them), some judges maintain that they "dumb down the justice process", an accusation that would be hard to make stick on the basis of the examples Kravets provides; "entice" for "inveigle", for instance, is scarcely a giant step away from judicial clarity and towards street speech.
A Google web search gives ca. 74,800 hits for "inveigle" in its various forms, vs. ca. 4,740,000 for "entice" in its various forms, so "inveigle" seems to be enormously less frequent -- less familiar -- than "entice". The disparity is much greater than this, though, since a huge number of the "inveigle" hits are mentions rather than uses -- they're from discussions of the meaning of "inveigle", including as a legal term -- and many more are uses in specifically legal contexts. Not that "inveigle" lacks ordinary-language uses; consider "He had slyly inveigled her up to his flat / To view his collection of stamps" (Flanders and Swann, "Have Some Madeira, M'Dear") and many everyday occurrences like these:
... as per usual, was one poor SOB trying to inveigle shoppers into buying ... (Leah Garchik, "The In Crowd" Column, San Francisco Chronicle, April 14)
inveigle yourself into the homes and wineries of a few big names whose egos ... (link)
Still, "entice" is probably a small improvement on "inveigle".
The changes seem to be mostly in vocabulary. For instance, the old version defines "mitigation" as
any fact, condition or event which does not constitute a justification or excuse for the crime in question, but may be considered as an extenuating circumstance in determining the appropriateness of the death penalty
which is now
any fact, condition, or event that makes the death penalty less appropriate as a punishment, even though it does not legally justify an excuse for the crime
This maintains the two-clause syntax, with coordination replaced by subordination, and it reverses the order of the proviso (about not justifying the crime) and the main part of the definition (about allowing certain factors to be taken into account), in favor of putting the main part first, which is surely an improvement. It also reduces the nominalization quotient a bit, by replacing "justification" by "justify" and "appropriateness" by "appropriate". And it replaces the restrictive relativizer "which" by "that", which could be seen either as a move towards informal English or as a move towards prescriptively standard English, depending on who you read.
But mostly what it tries to do is unpack the meaning of the term of art "extenuating circumstance".
Another change tries to unpack "innocent misrecollection", also a term of art (ca. 447 Google webhits, all of them apparently in legal contexts), via replacing
Innocent misrecollection is not uncommon.
by
People sometimes honestly forget things or make mistakes about what they remember.
More side-by-side comparisons in Kravets's article.
zwicky at-sign csli period stanford period edu
So says the NYT code of ethics, as well it should. For the past couple of months, I've been muttering about sloppy if not dishonest quoting practices in print media, including at the NYT. There's a particularly striking example in an 8/18/2005 article by Joel Brinkley and Steven R. Weisman, based on an interview with Condi Rice, which ran under the headline "Rice Urges Israel and Palestinians to Sustain Momentum".
The NYT article starts like this:
Secretary of State Condoleezza Rice on Wednesday offered sympathy for the Israeli settlers who are being removed from their homes in Gaza but also made it clear that she expected Israel and the Palestinians to take further steps in short order toward the creation of a Palestinian state.
"Everyone empathizes with what the Israelis are facing," Ms. Rice said in an interview. But she added, "It cannot be Gaza only."
The transcript of the interview was posted by the U.S. State Department web site under the title "Interview With The New York Times", dated August 17, 2005. In that transcript, the only occurrence of the string "empathize" is this one:
I know, in having talked to them and watched how hard and I think everybody empathizes with what every Israeli has to be feeling and with people uprooting from homes that they have been in for a generation and the difficulty and the pain that that causes.
And the only place where "Gaza only" occurs is here:
The other thing is, just to close off this question, the question has been put repeatedly to the Israelis and to us that it cannot be Gaza only and everybody says no, it cannot be Gaza only.
In between those two sentences are more than 1,300 words and 20 conversational turns.
Taking the first "quote" first, and ignoring the problem of yanking a phrase out of context, we've got an "approximate quotation" by anyone's standards (NYT quote first, State Department transcript second):
NYT: Everyone empathizes with what the Israelis are facing
Transcript: ... and I think everybody empathizes with what every Israeli has to be feeling and ...
On the construal most favorable to the NYT -- scoring only the fragment from "everybody" to "feeling", and giving maximum credit for substitutions instead of insertions and deletions -- we have 5 substitutions and 2 deletions relative to 10 original words, for a word error rate of 70%. The meaning is similar, but that makes it a paraphrase rather than a quote.
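For readers who want to check the arithmetic, here is a minimal sketch of the usual word-error-rate calculation: the minimum number of word substitutions, insertions and deletions needed to turn the reference into the quoted version, divided by the number of reference words. The function name and the choice to ignore capitalization are my own assumptions for illustration; the two strings are the fragments scored above.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: compare case-insensitively and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # dist[i][j] = fewest edits needed to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete every remaining reference word
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert every remaining hypothesis word

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
            )
    return dist[-1][-1] / len(ref)


transcript = "everybody empathizes with what every Israeli has to be feeling"
nyt_quote = "Everyone empathizes with what the Israelis are facing"

print(word_error_rate(transcript, nyt_quote))  # 0.7
```

Only three words ("empathizes with what") line up, so the cheapest alignment is the one described above: 5 substitutions and 2 deletions against 10 reference words.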
In the case of the second quoted fragment, which Secretary Rice is said to have "added", there are three obvious problems. First, it's wrong to take a clause out of an indirect quotation and pretend that it's direct speech. If you say "everybody tells me that X", I can't quote you as asserting X -- you might well go on to add "but I don't believe it for a minute". In this case, Rice does seem to include herself among the "everybody" who says that "it cannot be Gaza only", but that brings us to the second problem: she goes on to explain what she (at least) means by "not Gaza only", and it's not very much. Specifically,
There is, after all, even a link to the West Bank and the four settlements that are going to be dismantled in the West Bank. Everybody, I believe, understands that what we're trying to do is to create momentum toward reenergizing the roadmap and through that momentum toward the eventual establishment of a Palestinian state.
And finally, the linking phrase "but she added" seems to me to be the most dishonest thing of all. The meaning of add in question is something like "to say or write further", with the implication that the addition is in immediate rhetorical contiguity with what is added to. The use in Brinkley and Weisman's third sentence carries the clear implication that Rice chose to extend her remarks about empathy for the Gaza evacuation with a contrasting reminder of the need for further Israeli territorial concessions.
Now, that's the NYT's editorial line, and it might be the right line to take, but it's not really what Rice said. Not only was her "addition" yanked out of indirect speech attributed to others, not only was it hedged immediately by a reference to the four West Bank settlements already being evacuated and a vague commitment to "momentum towards reenergizing the roadmap", but most important, it was in response to a different question, roughly eight minutes later, following 9 other intervening questions and answers.
I surmise that Brinkley and Weisman (or their editor) wrote the lede based on what they wanted to project as Rice's intent, and then looked through their notes on the interview for an illustrative quote. Not finding one, they stitched something together out of widely-separated fragments taken out of context. Somehow it's more surprising to see this done to the U.S. Secretary of State than to the San Antonio Spurs' scoring leader. But whether the speaker is Tim Duncan or Condi Rice, we should be able to believe that words in quotation marks in newspaper stories are an accurate reflection of what was said, and give a fair impression of what was meant.
This is not just my own opinion. I've previously cited the NYT's own code of ethics on quotations:
Readers should be able to assume that every word between quotation marks is what the speaker or writer said. The Times does not "clean up" quotations. If a subject’s grammar or taste is unsuitable, quotation marks should be removed and the awkward passage paraphrased. Unless the writer has detailed notes or a recording, it is usually wise to paraphrase long comments, since they may turn up worded differently on television or in other publications. "Approximate" quotations can undermine readers’ trust in The Times.
The writer should, of course, omit extraneous syllables like "um" and may judiciously delete false starts. If any further omission is necessary, close the quotation, insert new attribution and begin another quotation. (The Times does adjust spelling, punctuation, capitalization and abbreviations within a quotation for consistent style.) Detailed guidance is in the stylebook entry headed "quotations." In every case, writer and editor must both be satisfied that the intent of the subject has been preserved.
Assuming that the State Department's transcript is accurate, the Brinkley and Weisman article seems to be a clear violation of both the letter and the spirit of this policy. Unfortunately, such violations are the norm rather than the exception, not only at the NYT but in print media in general.
I'm used to noticing new things in old books, usually descriptions or situations or emotions that went past me before but now catch my attention for some reason. Last night I was surprised to learn a new word from a book that I've read at least once before: Ross Macdonald's Black Money, originally published in 1965.
Lew Archer has tracked Leo and Kitty Ketchel from LA to a mansion in "Santa Teresa", Macdonald's alias for Santa Barbara. Kitty is speaking. Lew, as usual, is thinking.
"Leo made a lifetime of enemies. If they knew he was helpless, his life wouldn't be worth that." She snapped her fingers. "Neither would mine. Why do you think we're hiding out in the tules here?"
To her, I thought, the tules meant any place that wasn't on the Chicago-Vegas-Hollywood axis.
[p. 189, 1990 Warner paperback edition]
When I hit that passage, I had absolutely no memory of ever having seen or heard the word tule, in that book or anywhere else.
The OED says that tule refers to
Either of two species of bulrush (Scirpus lacustris var. occidentalis, and S. Tatora) abundant in low lands along riversides in California; hence, a thicket of this, or a flat tract of land in which it grows.
and gives citations back to 1837
1837 P. L. EDWARDS Jrnl. 20 July (1932) 26 Driving her along the margin of a bulrush or Tule pond she turned about.
1845 J. C. FRÉMONT Rep. Exploring Expedition 252 They..live principally on acorns and roots of the tulé, of which also their huts are made.
1850 W. R. RYAN Personal Adv. Upper & Lower Calif. I. 298 The Indians of the party were despatched to hunt up the banks of the river for toolies.
The etymology is given as
[ad. Aztec tullin, the final n being dropped by the Spaniards as in Guatemala, Jalapa, etc.]
and the pronunciation is as suggested by the alternative spelling toolies.
The AHD entry explains further that
Low, swampy land is tules or tule land in the parlance of northern California. When the Spanish colonized Mexico and Central America, they borrowed from the native inhabitants the Nahuatl word tollin, “bulrush.” The English-speaking settlers of the West in turn borrowed the Spanish word tule to refer to certain varieties of bulrushes native to California. Eventually the meaning of the word was extended to the marshy land where the bulrushes grew.
Merriam-Webster's Unabridged has similar information, as does Encarta, which adds that "to be in deep tules" is a Hispanic expression meaning "to be in trouble with the law".
The OED has toolies, glossed as "Backwoods; remote or thinly populated regions.", with citations back to 1961 -- but curiously, flags it as a Canadian regional term rather than a Californian one:
1961 R. P. HOBSON Rancher takes Wife i. 22 We're plenty far back in the toolies at Batnuni.
Kenneth Millar (who wrote as Ross Macdonald) was born in Los Gatos but educated in Canada, for what that's worth.
Among the dictionaries I checked, none besides the OED gives tules, under any spelling, the meaning that's apparent in the Black Money passage. And glancing through the first hundred Google hits for {"in the tules"} didn't turn up any similar figurative uses, except that Bret Harte's short story In the Tules does make an implicit pun on Ultima Thule. However, the hits for {"in the toolies"} are a different matter:
Ok proof I've lived in the toolies just a tad too long, as I find that amusing.
There was a sense that we were out in the provinces, in the toolies.
At the time, this stretch of the old Route 66 was still "out in the toolies."
Please picture me and two tiny little kids in a very small stone house WAYYY out in the toolies.
You may find that it sometimes have you stopping so far out in the toolies that no hotels/campgrounds are anywhere nearby.
And so on. This seems to be a case of a word in fairly common use that is spelled one way when it's meant literally, and a different way in a figurative meaning. I wonder if it was Millar's choice to spell it "tules" in Black Money, or the idea of a copy editor at Knopf?
[Update: several readers have pointed me to a lovely page about the natural history of tule marshes of California's central valley, which also cites "out in the tules" as an equivalent to "out in the sticks". This page also mentions the "tule fogs", which several correspondents including Arnold Zwicky have described to me as their strongest association with the word.]
OK, this is "Language Log", not "Complaining about Editorial Standards at the New Yorker Log", so I was going to let it pass. But several readers have written to point out something strange in the little Mountweazels item that I linked to yesterday:
Anne Soukhanov, the U.S. General Editor of Encarta Webster’s, was the first to weigh in. “Ess-kwa-val-ee-ohnce—I want to pronounce it in the French manner—is your culprit,” she said.
It's the status of the made-up word esquivalience that's at issue, and Tom Rossen's reaction was the most pungent:
Kwa she talkin' 'bout, Willis? If that's what Microsoft's finest think is the French pronunciation of "qui", I'm at a loss for mots!
"Ess-kwa-val-ee-ohnce" is indeed a strange notion of how to pronounce esquivalience "in the French manner", but I don't think that it's safe to attribute the idea to Soukhanov. The pages of the New Yorker are by no means bereft of linguistic carelessness -- we've documented hallucinations about pronunciation and a preposterous transcription error, among other things, and the Soukhanov quote's chain of transmission is unclear. Henry Alford writes that "The six words and their definitions were e-mailed to nine lexicographical authorities", which suggests that the responses might have come by email as well; but then he uses the tag "she said", not "she wrote" or "she e-mailed", so maybe he talked with Soukhanov on the phone. If her answer was spoken, then the lamely fake representation of pronunciation is entirely Alford's. And if Soukhanov answered by email, that part of the quote might have been edited, either by Alford or by someone else at the New Yorker. This is the familiar problem of attributional abduction.
But even if Soukhanov provided the pronunciation as printed -- which I doubt -- it seems to me that the magazine is at fault. Depicting a respected senior lexicographer as ignorant of French pronunciation is a distraction from the light-hearted point of the piece. The spirit of Miss Gould is fading further.
"It's like tagging and releasing giant turtles", says Erin McKean. Read all about it in Henry Alford's Talk of the Town piece on lexicographic honeypots (though they are not identified by that name, which comes from the computer security area). I note that Alford says that esquivalience "has since been spotted on Dictionary.com, which cites Webster’s New Millennium as its source", but it's not there now.
Whether or not you're interested in the content of the on-going debate in Cognition about approaches to language evolution, you might find it interesting to contemplate its schedule. Here's a summary of the time line:
Hauser, Chomsky and Fitch (HCF): published in Science November 22, 2002.
Pinker and Jackendoff (PJ): Received 16 January 2004; accepted 31 August 2004. Available online 19 January 2005. Published March 2005.
Fitch, Hauser and Chomsky (FHC): Received 5 November 2004; accepted 15 February 2005. Available online 19 August 2005. Not published yet.
Jackendoff and Pinker (JP): posted on J's website March 23 2005. No information yet available from Cognition.
So Pinker and Jackendoff took a year or so to decide to respond to HCF and to send in their critique PJ, which arrived at Cognition roughly 14 months after HCF was published. It took 7.5 months for it to be accepted, and then another 4.5 months for it to be put on line, and another 2 months to appear in paper form. The response FHC to PJ was sent in 2 months after PJ's acceptance, 2.5 months before PJ appeared on line, and 4.5 months before PJ appeared in print; FHC was accepted 3 months after submission, was published on line after an additional 6 months, and has not yet appeared in print. Meanwhile, the response JP to FHC was completed and put on line (and thus I assume was sent to Cognition) about 1 month after FHC was accepted, and five months ago; apparently Cognition has accepted it, but it has not yet appeared on the Cognition web site.
(I'm not singling Cognition out for criticism — this is a typical sort of schedule for such sequences.)
This conversational tempo is reminiscent of 18th century correspondence between Europe and North America, or Europe and India, when a message could take as much as six months to reach its destination. If we start the conversational clock at the point where PJ was accepted by Cognition, and entirely ignore the timing and distribution of actual print media, we get the following sums for the conversational sequence PJ/FHC/JP:
Thinking and writing: 3 months
Review: 10.5 months + unknown time for JP, still not known to be accepted -- say 13.5 months?
Waiting for publication on line after acceptance: 10.5 months + unknown time for JP -- say 11.5 months?
The sums are uncertain because the time periods involved are not entirely disjoint (so that only 19 months in total have elapsed since PJ was received by Cognition), but it still seems likely that the mechanics of the system have slowed this conversation down at least as much as sending the manuscripts by square-rigger across the oceans would have done. Shouldn't there be a way to carry out scientific discussion that's a bit brisker? Certainly one good candidate for elimination is the 11.5 months or so clocked in this case by waiting for accepted articles to appear on line.
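The intervals behind these sums can be checked with a few lines of date arithmetic. This is only a rough sketch: the dates are the ones listed in the timeline above, the variable names are mine, and converting days to months with an average month of 30.44 days is an assumption made for illustration. JP's review and waiting times are unknown, so they are left out.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # assumed average month length, for rough conversion only


def months_between(start, end):
    """Approximate number of months between two dates."""
    return (end - start).days / DAYS_PER_MONTH


# Dates as given in the timeline above.
pj_received = date(2004, 1, 16)
pj_accepted = date(2004, 8, 31)
pj_online = date(2005, 1, 19)
fhc_received = date(2004, 11, 5)
fhc_accepted = date(2005, 2, 15)
fhc_online = date(2005, 8, 19)
jp_posted = date(2005, 3, 23)

# Thinking and writing: PJ accepted -> FHC sent in, FHC accepted -> JP posted.
writing = (months_between(pj_accepted, fhc_received)
           + months_between(fhc_accepted, jp_posted))

# Review: submission -> acceptance, for PJ and FHC only.
review = (months_between(pj_received, pj_accepted)
          + months_between(fhc_received, fhc_accepted))

# Waiting: acceptance -> on-line publication, for PJ and FHC only.
waiting = (months_between(pj_accepted, pj_online)
           + months_between(fhc_accepted, fhc_online))

# Total elapsed time since PJ reached the journal.
elapsed = months_between(pj_received, fhc_online)

for label, value in [("writing", writing), ("review", review),
                     ("waiting", waiting), ("elapsed", elapsed)]:
    print(f"{label}: {value:.1f} months")
```

Each result lands within about half a month of the rounded figures used above, which is as much precision as this kind of estimate deserves.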
Somewhat more tentatively, I'd like to raise the question of whether the review process is always worth its cost in time. As is normal and probably inevitable in the refereed literature, this particular back-and-forth includes a number of factually doubtful statements and presuppositions. I'll cite just one example: in PJ we read that
HCF do discuss the ability to learn linearly ordered recursive phrase structure. In a clever experiment, Fitch and Hauser (2004) showed that unlike humans, tamarins cannot learn the simple recursive language AnBn (all sequences consisting of n instances of the symbol A followed by n instances of the symbol B; such a language can be generated by the recursive rule S→A(S)B).
and in FHC, we read the response that
The inability of cotton-top tamarins to master a phrase-structure grammar (Fitch & Hauser, 2004) is of interest in this discussion primarily as a demonstration of an empirical technique for asking linguistically relevant questions of a nonlinguistic animal.
The reader will naturally conclude from this that Fitch and Hauser (2004) actually did establish something about the abilities of tamarins and humans to learn the language AnBn, whereas the sad fact is that this conclusion was a serious over-interpretation of a rather limited experiment, and seems to be incompatible with later research.
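For readers unfamiliar with the notation in the PJ passage, here is a minimal sketch of what the language AnBn and the recursive rule S→A(S)B amount to. The function names are hypothetical, and the code says nothing about how the experiment itself was run; it only illustrates the formal language under discussion.

```python
def generate(n):
    """Build the string with n A's followed by n B's via the rule S -> A (S) B."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return "AB"  # the optional inner S is omitted
    return "A" + generate(n - 1) + "B"  # the inner S is expanded again


def is_anbn(s):
    """True if s consists of n A's followed by exactly n B's, for some n >= 1."""
    n = len(s) // 2
    return n >= 1 and s == "A" * n + "B" * n


print(generate(3))      # AAABBB
print(is_anbn("AABB"))  # True
print(is_anbn("ABAB"))  # False
```

The contrast between "AABB" and "ABAB" is the point: both have the same number of each symbol, but only the first matches the nested pattern.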
In addition to such (inevitable) mistakes, the programmatic nature of this exchange results in an unusually large fraction of statements of opinion, where the role and value of the review process is especially unclear. I'll also point out that the review process, though regarded as a sacred ritual by our academic culture, is a relatively recent development. I recall reading that when Albert Einstein moved to the U.S. in the 1930s, and first submitted an article to an American journal, he was shocked and offended to learn that it was being sent out for review. This was not because he thought himself in particular above such things, but rather because he had never encountered the practice before, so that his first reaction was that he was being singled out as an untrustworthy source. (Memory says, perhaps falsely, that I read this in Abraham Pais' wonderful biography of Einstein, Subtle is the Lord, which I don't have at hand.)
I'm a conservative sort of person, though not nearly as conservative as most academics are about their culture, so I'm not about to propose that we scrap the existing journal system. As Churchill is said to have said about democracy, it's the worst possible system, except for all the others.
But all the same, among the emerging technologies of networked text archives, links, indices and so on, there is a wide range of other possible solutions to the problems of scientific and scholarly communications that refereed journals have evolved to solve. And as a result, I'll predict that within 50 years, scientists and scholars will use a very different set of methods for communicating and discussing their research results, and the existing system of scientific and scholarly journals will survive only in a vestigial form, analogous to the caps and gowns that academics once wore all the time, and now put on only for ceremonial occasions.
[Update: Jay Cummings writes
In the upcoming Physics Today (September 2005), the story of Einstein's objection to being reviewed is told. The reviewer (with some trepidation, because after all, he knew this was Einstein) suggested a correction. Einstein refused the correction, but he turned out to be wrong.
I haven't seen the article -- I'll look forward to learning the details.]
On August 19, 2005, the journal Cognition posted on line a 19,000-word article by Tecumseh Fitch, Marc Hauser and Noam Chomsky, entitled "The evolution of the language faculty: Clarifications and implications" (free version here), referencing an additional 6,000-word appendix "The Minimalist Program". This is the third turn in a (so far) four-turn, three-year debate with Steve Pinker and Ray Jackendoff.
Chris at Mixing Memory has posted on FHC 2005, asking especially for help in decoding Chomsky's Minimalist appendix. I'll limit myself to observing that it's entirely "inside baseball": seven pages of text that mention no linguistic facts and no specific languages, nor any simulations, formulae, or empirical generalizations. Aside from a very general and abstract account of Chomsky's view of the goals of his research, the only topic is who said what when, sometimes with a very abstract explanation of why. It's an odd document -- I can't think of anything at all comparable from a major figure in a scientific or scholarly field, except perhaps some controversies over precedence (which is not an issue here). I agree with the judgment of Jacques Mehler, the editor of Cognition, who asked for it to be cut; and it seems to me that it's a distraction for outsiders (including most of the normal readership of Cognition) to try to understand it.
However, the larger discussion of language evolution has many points of general interest, which we've touched on in this blog from time to time, and will again. So as a public service, here's a quick overview, with links, of the Chomsky/Fitch/Hauser vs. Jackendoff/Pinker story so far:
Step 1 (HCF, 2002): Marc Hauser, Noam Chomsky, and Tecumseh Fitch wrote an article in Science entitled "The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?" (Vol 298, Issue 5598, 1569-1579, 22 November 2002). A free version is available here.
Step 2 (PJ, 2004): Steven Pinker and Ray Jackendoff responded with an article in Cognition entitled "The faculty of language: what's special about it?" (Volume 95, Issue 2, March 2005, Pages 201-236 -- free version here).
Step 3 (FHC, 2005): Fitch, Hauser and Chomsky have responded, with an article due out in Cognition entitled "The evolution of the language faculty: Clarifications and implications" (free version here). The abstract refers to an "online appendix" where "we detail the deep inaccuracies in their characterization of [the Minimalist Program]". The appendix does not seem to be linked anywhere in the online paper, but it is on line here, with the authors ordered as "N. Chomsky, M.D. Hauser and W.T. Fitch", entitled "Appendix. The Minimalist Program."
Step 4 (JP, 2005): Jackendoff and Pinker will respond to the response, in an article entitled "The Nature of the Language Faculty and its Implications for Evolution of Language" (listed as "in press" at Cognition, but not yet available on line -- free version of 3/25/2005 here).
If you want a quick overview of what the conversation is about, without reading all 57,440 words so far expended by all sides, here are the abstracts, again with links to the full versions:
Step 1 (2002): Marc Hauser, Noam Chomsky, and Tecumseh Fitch wrote an article in Science entitled "The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?" (Vol 298, Issue 5598, 1569-1579, 22 November 2002). A free version is available here. The abstract:
We argue that an understanding of the faculty of language requires substantial interdisciplinary cooperation. We suggest how current developments in linguistics can be profitably wedded to work in evolutionary biology, anthropology, psychology, and neuroscience. We submit that a distinction should be made between the faculty of language in the broad sense (FLB) and in the narrow sense (FLN). FLB includes a sensory-motor system, a conceptual-intentional system, and the computational mechanisms for recursion, providing the capacity to generate an infinite range of expressions from a finite set of elements. We hypothesize that FLN only includes recursion and is the only uniquely human component of the faculty of language. We further argue that FLN may have evolved for reasons other than language, hence comparative studies might look for evidence of such computations outside of the domain of communication (for example, number, navigation, and social relations).
Step 2 (2004): Steven Pinker and Ray Jackendoff responded with an article in Cognition entitled "The faculty of language: what's special about it?" (Volume 95, Issue 2, March 2005, Pages 201-236 -- free version here). The abstract:
We examine the question of which aspects of language are uniquely human and uniquely linguistic in light of recent suggestions by Hauser, Chomsky, and Fitch that the only such aspect is syntactic recursion, the rest of language being either specific to humans but not to language (e.g. words and concepts) or not specific to humans (e.g. speech perception). We find the hypothesis problematic. It ignores the many aspects of grammar that are not recursive, such as phonology, morphology, case, agreement, and many properties of words. It is inconsistent with the anatomy and neural control of the human vocal tract. And it is weakened by experiments suggesting that speech perception cannot be reduced to primate audition, that word learning cannot be reduced to fact learning, and that at least one gene involved in speech and language was evolutionarily selected in the human lineage but is not specific to recursion. The recursion-only claim, we suggest, is motivated by Chomsky's recent approach to syntax, the Minimalist Program, which de-emphasizes the same aspects of language. The approach, however, is sufficiently problematic that it cannot be used to support claims about evolution. We contest related arguments that language is not an adaptation, namely that it is “perfect,” non-redundant, unusable in any partial form, and badly designed for communication. The hypothesis that language is a complex adaptation for communication which evolved piecemeal avoids all these problems.
Step 3 (2005): Fitch, Hauser and Chomsky have responded, with an article due out in Cognition entitled "The evolution of the language faculty: Clarifications and implications" (free version here). The abstract:
In this response to Pinker and Jackendoff's critique, we extend our previous framework for discussion of language evolution, clarifying certain distinctions and elaborating on a number of points. In the first half of the paper, we reiterate that profitable research into the biology and evolution of language requires fractionation of “language” into component mechanisms and interfaces, a non-trivial endeavor whose results are unlikely to map onto traditional disciplinary boundaries. Our terminological distinction between FLN and FLB is intended to help clarify misunderstandings and aid interdisciplinary rapprochement. By blurring this distinction, Pinker and Jackendoff mischaracterize our hypothesis 3 which concerns only FLN, not “language” as a whole. Many of their arguments and examples are thus irrelevant to this hypothesis. Their critique of the minimalist program is for the most part equally irrelevant, because very few of the arguments in our original paper were tied to this program; in an online appendix we detail the deep inaccuracies in their characterization of this program. Concerning evolution, we believe that Pinker and Jackendoff's emphasis on the past adaptive history of the language faculty is misplaced. Such questions are unlikely to be resolved empirically due to a lack of relevant data, and invite speculation rather than research. Preoccupation with the issue has retarded progress in the field by diverting research away from empirical questions, many of which can be addressed with comparative data. Moreover, offering an adaptive hypothesis as an alternative to our hypothesis concerning mechanisms is a logical error, as questions of function are independent of those concerning mechanism. The second half of our paper consists of a detailed response to the specific data discussed by Pinker and Jackendoff. 
Although many of their examples are irrelevant to our original paper and arguments, we find several areas of substantive disagreement that could be resolved by future empirical research. We conclude that progress in understanding the evolution of language will require much more empirical research, grounded in modern comparative biology, more interdisciplinary collaboration, and much less of the adaptive storytelling and phylogenetic speculation that has traditionally characterized the field.
Step 4 (JP, 2005): Jackendoff and Pinker will respond to the response, in an article entitled "The Nature of the Language Faculty and its Implications for Evolution of Language" (listed as "in press" at Cognition, but not yet available on line -- free version of 3/25/2005 here). The abstract:
In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the “narrow language faculty”) consists only of recursion, and that this part cannot be considered an adaptation to communication. We argue that their characterization of the narrow language faculty is problematic for many reasons, including its dichotomization of cognitive capacities into those that are utterly unique and those that are identical to nonlinguistic or nonhuman capacities, omitting capacities that may have been substantially modified during human evolution. We also question their dichotomy of the current utility versus original function of a trait, which omits traits that are adaptations for current use, and their dichotomy of humans and animals, which conflates similarity due to common function and similarity due to inheritance from a recent common ancestor. We show that recursion, though absent from other animals’ communications systems, is found in visual cognition, hence cannot be the sole evolutionary development that granted language to humans. Finally, we note that despite Fitch et al.’s denial, their view of language evolution is tied to Chomsky’s conception of language itself, which identifies combinatorial productivity with a core of “narrow syntax.” An alternative conception, in which combinatoriality is spread across words and constructions, has both empirical advantages and greater evolutionary plausibility.
Theologians can debate whether it counts technically as bearing false witness against the AP, but in this passage from his show yesterday (8/24/2005), Pat Robertson certainly seems to be straying from the intent of the ninth commandment. The relevant portion of the transcript:
Thor Halvorssen: | Essentially, Hugo Chavez has turned Venezuela into a dictatorship. Now, I think that it's very important to also note that your comments were about assassination. The person- I think that alternative is lowering to his level- |
Pat Robertson: | Uh I didn't say "assassination", I said our- our special forces should quote "take him out", and "take him out" can be a l- a number of things including kidnapping, there are a number of ways to take out a dictator from power besides killing him. Uh I was misinterpreted by the AP, but that happens all the time. |
What Pat actually said on 8/22/2005 was:
Thanks, Dale. If you look back just a few years, there was a popular coup that overthrew him and what did the United States State Department do about it? Virtually nothing. And as a result, within about forty eight hours that coup was broken, Chavez was back in power; but we had a chance to move in; he has destroyed the Venezuelan economy, and he's going to make that a launching pad for communist infiltration and- and uh muslim extremism all over the continent. You know, I don't know about this doctrine of assassination, but if he thinks we're trying to assassinate him, I think that we really ought to go ahead and do it, it's a whole lot cheaper than starting a war, and uh uh I don't think any oil shipments will stop. But this man is a terrific danger, and the United- this is in our sphere of influence, and we can't let this happen. We have the Monroe Doctrine, we have other doctrines that we have announced, and uh without question, this is a dangerous uh enemy to our south controlling a huge pool of oil, that could hurt us very badly. We have the ability to take him out, and I think the time has come that we exercise that ability. We don't need another two hundred billion dollar war uh to get rid of one you know, strong-arm dictator. It's a whole lot easier to have some of the covert operatives do the job and then get it over with. Kristi? [emphasis added]
It's hard to listen to this passage and understand "take him out" to mean anything other than "kill him". It's true that "to take someone out" can mean to escort them on a date, and that a football block can take someone out of a play. Googling "took him out of the" gives us completions like "the chariot", "the town", "the school", "the picture", "the fraternity", "the crate" and so on. However, the obvious meaning in this context was "assassinate".
And in any case, the first quote given in the 8/22/2005 AP story was not the "take him out" business, but this:
"You know, I don't know about this doctrine of assassination, but if he thinks we're trying to assassinate him, I think that we really ought to go ahead and do it," Robertson said. "It's a whole lot cheaper than starting a war ... and I don't think any oil shipments will stop."
Aside from cleaning up a few uhs, the AP quote is completely accurate, and gives the lie to Robertson's assertion that "I didn't say 'assassination'".
It's especially curious that Robertson makes this feeble attempt to deny his own words, given that a few minutes earlier in the same (8/24/2005) show he sketches a justification for assassination, based on the ethical theory and practice of Dietrich Bonhoeffer:
Ladies and gentlemen, you can see Dale's full story on cbn.com, and uh a number of people have expressed their opinions about Chavez, including a man named Thor Halvorssen. He's a Venezuelan and president of the Human Rights Foundation, and he's going to join us in just a minute from New York, but before I get to that, I want to tell you about a statement of uh the great Dietrich Bonhoeffer, who suffered under Adolf Hitler, and wondered what would be the case of a wicked dictator like Hitler, how would Christians react to that. And uh Dietrich Bonhoeffer is reported to have said "if you see a car going out of control, and heading toward a group of people, do you try to stop the car or ((do)) you console the victims after it hits them?" And he said after weighing the moral consequences of that, he determined it would be better to stop the car and therefore he allied himself with those who were attempting to assassinate Adolf Hitler, and to take this monster off the world stage. That by the way cost uh this brave soldier of the cross his life, because one did not speak out against Adolf Hitler.
You can read more about Dietrich Bonhoeffer here and here. According to the second link, Bonhoeffer was indeed hanged, not for speaking out against Hitler, but for participating in a plot to assassinate him:
Bonhoeffer's role in the conspiracy was one of courier and diplomat to the British government on behalf of the resistance, since Allied support was essential to stopping the war. Between trips abroad for the resistance, Bonhoeffer stayed at Ettal, a Benedictine monastery outside of Munich, where he worked on his book, Ethics, from 1940 until his arrest in 1943. Bonhoeffer, in effect, was formulating the ethical basis for when the performance of certain extreme actions, such as political assassination, were required of a morally responsible person, while at the same time attempting to overthrow the Third Reich in what everyone expected to be a very bloody coup d'etat. This combination of action and thought surely qualifies as one of the more unique moments in intellectual history.
The car story seems to come from a memoir by G. Leibholz, reproduced in the beginning of The Cost of Discipleship, where he writes of Bonhoeffer (in the English translation)
As he used to say: it is not only my task to look after the victims of madmen who drive a motorcar in a crowded street, but to do all in my power to stop their driving at all.
Shortly after Robertson's Bonhoeffer passage, when Halvorssen objects to the assassination idea, Robertson doesn't deny it (link):
Thor Halvorssen: | Now I- I did wanted to mention, Pat, that the report filed by Dale Hurd yesterday, that aired on CBN, is without question one of the most accurate um reports that have appeared in the media; certainly a lot more accurate and uh alarming than what has appeared in the mainstream media; and for that I would like to commend you. Uh by the same token, I would categorically like to say that I- I disagree with you on uh in terms of the solution to some of these issues and I do not think that assassination is- is a- the route that any country should take in the case of Chavez. |
Pat Robertson: | Well I appreciate that; what would you do to stop him? |
So it's all the stranger that he breaks in later to claim that he didn't say what his own archive clearly documents him as saying. I can't imagine Bonhoeffer attempting to deny his own words in the way that Robertson has done.
[Update: as Timothy Noah points out in Slate, Robertson also appeals to the example of Bonhoeffer in the text of his official apology. In fact, Robertson's statement says that calling for assassination was wrong:
Is it right to call for assassination? No, and I apologize for that statement. I spoke in frustration that we should accommodate the man who thinks the U.S. is out to kill him.
but never that performing an assassination would be wrong; and the extended discussion of Bonhoeffer that follows makes Robertson's views on that matter clear enough. This is a non-apology of a different kind than those that Geoff Pullum dissected earlier: it does indeed have the grammatical form of an apology. It's rather like saying "I apologize for calling you a liar. I spoke in frustration, because I was upset about all the times you said things that aren't true." ]
My friend Caroline Henton lent me her copy of another book with an unspeakable title. In this case it is not a matter of modesty, as when NPR refuses to read a title out loud because it contains an Anglo-Saxon term for excrement, but rather that there is strictly no possible out-loud reading: it is a book whose orthographic title has no phonetic counterpart (like the two films I have mentioned elsewhere). The author is Sterling Johnson, an ESL teacher and lecturer from Pacific Grove, California, and the title is Watch Your F*cking Language: How to Swear Effectively, Explained in Explicit Detail and Enhanced by Numerous Examples Taken from Everyday Life (New York: Thomas Dunne Books). The asterisk is there even in the Library of Congress cataloguing data. Incidentally, I am not recommending the book. It only just manages to get to page 100 by dint of much wasted space and lots of large gaps between paragraphs, and in my opinion it s*cks. At least, I did not s*cc*mb to its charms. Caroline will get her copy back *ns*llied and relatively *nth*mbed. It's not the first book with an asterisk in the title; it's not even the first by Sterling Johnson, who has an earlier book entitled English as a Second F*cking Language.
Amid all the fuss about Pat Robertson's assassination suggestion, no one seems to have picked up on what I thought was the oddest part of his outburst, namely the reference to the Monroe Doctrine.
I transcribed the entire passage, from the video in the 700 Club archive. The format is that of a news program. We're at the end of a canned segment on Venezuela, and about to start a segment on some events in Iraq. We switch to Robertson behind the anchor desk, and he says:
Thanks, Dale. If you look back just a few years, there was a popular coup that overthrew him and what did the United States State Department do about it? Virtually nothing. And as a result, within about forty eight hours that coup was broken, Chavez was back in power; but we had a chance to move in; he has destroyed the Venezuelan economy, and he's going to make that a launching pad for communist infiltration and- and uh muslim extremism all over the continent. You know, I don't know about this doctrine of assassination, but if he thinks we're trying to assassinate him, I think that we really ought to go ahead and do it, it's a whole lot cheaper than starting a war, and uh uh I don't think any oil shipments will stop. But this man is a terrific danger, and the United- this is in our sphere of influence, and we can't let this happen. We have the Monroe Doctrine, we have other doctrines that we have announced, and uh without question, this is a dangerous uh enemy to our south controlling a huge pool of oil, that could hurt us very badly. We have the ability to take him out, and I think the time has come that we exercise that ability. We don't need another two hundred billion dollar war uh to get rid of one you know, strong-arm dictator. It's a whole lot easier to have some of the covert operatives do the job and then get it over with. Kristi?
I expect that Mr. Robertson is old enough to have learned in school what the Monroe Doctrine is. Does he really think that Hugo Chavez represents a case of European intervention?
For a change, the reproductions of Robertson's quotes in the media are pretty accurate (thus Laurie Goodstein's NYT story quotes 38 words that entirely agree with my transcript, punctuation choices aside), but there are a few oddities. For example, the story on the Bloomberg wire replaces "have" with "let", introduces an ungrammatical "to", and deletes "then" in one of Pat's phrases:
Bloomberg: It's a whole lot easier to let some of the covert operatives to do the job and get it over with.
Robertson: It's a whole lot easier to have some of the covert operatives do the job and then get it over with.
And the Knight Ridder story deletes "of the", changes "covert" to the ungrammatical "cover", makes "operatives" singular, and also elides the "then":
Knight Ridder: It's a whole lot easier to have some cover operative do the job and get it over with.
Robertson: It's a whole lot easier to have some of the covert operatives do the job and then get it over with.
The Bloomberg version of the phrase has 4 errors in 21 words, for a word error rate of 19%; Knight Ridder has 5 errors in 21 words, for a W.E.R. of 24%. This is better than you often see -- and the rest of the reported quotes from Robertson were generally even closer to what he actually said. But really, is there any excuse for not getting it completely right in this case, where the reporters were presumably not basing their quotes on notes from a live presentation, but were transcribing from the same archival recording that I used?
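For anyone who wants to check this sort of arithmetic, here is a minimal sketch of a word-error-rate calculation. It assumes the usual definition: the smallest number of word substitutions, insertions and deletions needed to turn the reference into the reported version, divided by the number of words in the reference. The function and variable names are mine, not taken from any scoring toolkit. Under this strict minimum-edit alignment the Knight Ridder phrase comes out at 5 edits in 21 words (24%), and the Bloomberg phrase at 3 edits (one substitution, one insertion, one deletion), or 14%; a different way of tallying the changes can give a different count.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions and deletions between two word lists."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis prefix
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance over words, divided by the length of the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)


ROBERTSON = ("It's a whole lot easier to have some of the covert operatives "
             "do the job and then get it over with")
BLOOMBERG = ("It's a whole lot easier to let some of the covert operatives "
             "to do the job and get it over with")
KNIGHT_RIDDER = ("It's a whole lot easier to have some cover operative "
                 "do the job and get it over with")

if __name__ == "__main__":
    for name, text in [("Bloomberg", BLOOMBERG), ("Knight Ridder", KNIGHT_RIDDER)]:
        print(name, f"{word_error_rate(ROBERTSON, text):.0%}")
```

Splitting on whitespace is a simplification; a real scoring tool would also normalize case and punctuation before aligning.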
Worse, each transcription error introduced a solecism: "...let some of the covert operatives to do the job..."; "have some cover operative do the job". Shouldn't an editor have noticed this, and asked someone to spend a few minutes to check whether a highly verbal media personality like Robertson really said it that way?
This sort of carelessness with elementary facts, which seems to be the norm rather than the exception in newspapers today, cuts the ground out from under arguments about the value of editors.
[Update: as several readers have suggested, Mr. Robertson's reference to "other doctrines that we have announced" probably was a swipe in the general direction of the (Theodore) Roosevelt corollary to the Monroe Doctrine. However, Calvin Coolidge and Herbert Hoover explicitly repudiated the Roosevelt Corollary in 1928 and 1930, as did FDR in 1934 and others since. ]
In today's WSJ, Philip Howard has a review (subscribers only) of the new edition of "Webster's New World College Dictionary". Howard is a writer at the London Times, and he takes the opportunity to meditate on differences between British and American varieties of English, sprinkling his review with gracious little transatlantic compliments that are so forced as to seem almost like insults:
"It may be painful for a Little Englander to admit, but Webster leads Oxford in priority, in the same way that the U.S. leads the U.K. in technology, fashion, and the thousand other variables that make up modern living."
Another dribble of soft soap:
"We can conclude (e.g., from rhymes) that the pronunciation of Shakespeare was closest to that of a Boston Brahmin."
But the BBC told us not long ago that original-accent Shakespearean English is "completely intelligible if you happen to come from North Carolina".
However, I don't think we should trouble ourselves examining Howard's scholarship too deeply. He ends his review with some remarks on the virtues of printed books over on-line text, and the benefits of editing, which he demonstrates by telling a funny story about the bad things that happen when something is published without being edited. Oops, make that a funny story about the bad things that happen when something is subjected to editing....
I trust books from a reputable house to have been edited: I don't trust anything on the Internet. At the London Times we have a correspondent called Brian Cosker, the economics head of a group of English schools, who writes to us from Baldock in Hertfordshire. A copy editor in a hurry ran his letter through Spellcheck and Mr. Cosker appeared in print as "Drain Coaster from Padlock." What Mr. Cosker thought of that, you can be sure, will never make it into Webster's.
This is one of those stories that sounds good over a pint, but seems increasingly implausible if you think about it seriously. I tried searching the Times archive, which is unaware of any articles authored by "Drain Coaster" since the start of the archive in 1985. I rather doubt that an editor would have run "Spellcheck" over submitted copy at the Times, in a hurry or otherwise, before 1985. So either Mr. Howard is embroidering, or the on-line version has been corrected.
In any case, his logic is odd. He trusts books from reputable houses, because they are edited; but he doesn't trust "anything on the internet", because a Times editor once turned Brian Cosker from Baldock into Drain Coaster from Padlock. Should we conclude that the Times is not "a reputable house"? Surely not -- rather, it seems that Mr. Howard is angling for the prestigious Michael Gorman Prize for Pleistocene Punditry, and lost the thread of his argument while trying to maximize the number of pokes at computers and the internet he could fit into the few dozen words available to him.
Philip Howard's normal beat seems to be the Modern Manners column, which makes sense, since logic and historical accuracy are less relevant to advice about etiquette than they are in other areas of modern journalism.
An article by Edward Wyatt in today's NYT calibrates Paul Anderson's new 1,360-page, four-pound-nine-ounce novel "Hunger's Brides" as weighing as much as 2.5 copies of "The Da Vinci Code". Next, maybe someone will figure out what all the pages laid end-to-end would add up to in smoots.
Back in June, I puzzled over a particular example of a snowclone that I heard in a movie and on the radio: "that's why they call it acting". Several responses came in almost immediately, some saying basically the same thing as (or agreeing with) Mark's analysis: the dictionary definition of acting offers appropriate multiple senses of the word to render the example not so remarkable after all.
I'm not here to argue against this completely reasonable, polysemy-based analysis of the "... acting" example, but I do still wonder -- as do some of my correspondents -- if that's what the people who used the example were thinking when they decided to use it. My doubts are primarily fueled by other examples of the "that's why they call it x" snowclone that I've come across, including the "that's why they call it money" example that I noted at the end of my original post. I think a strict polysemy analysis of any of these examples (if one is even possible) is far more of a stretch than it is for the "... acting" example.
What I mean by "strict polysemy" here is critical, because I'll contrast it with "loose polysemy" in a moment. The way I'm defining it, the word substituting for x in the "that's why they call it x" snowclone is "strictly polysemous" if it has a generally agreed-upon set of senses (as defined, say, by a dictionary), at least two of which can be convincingly argued to be invoked and compared/contrasted by the snowclone. (I do realize that "generally agreed-upon" and "convincingly" are major points of weakness in this definition, but I'll go on for lack of a better way to put it.) For example, money has at least the following seven senses:
1. A medium that can be exchanged for goods and services and is used as a measure of their values on the market, including among its forms a commodity such as gold, an officially issued coin or note, or a deposit in a checking account or other readily liquifiable account.
2. The official currency, coins, and negotiable paper notes issued by a government.
3. Assets and property considered in terms of monetary value; wealth.
4a. Pecuniary profit or loss. b. One's salary; pay.
5. An amount of cash or credit: raised the money for the new playground.
6. Sums of money, especially of a specified nature. Often used in the plural.
7. A wealthy person, family, or group.
Unlike Mark's analysis of the "... acting" example, I just don't see how any two of the senses above can be contrasted to explain the "... money" example, so money is not strictly polysemous in the sense that I've defined to be relevant to this post (though I'm looking forward to the flood of correspondence I'm likely to get on this conclusion).
What I'd like to suggest is a weaker, "loose polysemy" analysis of the "that's why they call it x" snowclone: two relevant senses are coerced for x, even when the two senses can't be matched up (by the listener) with generally agreed-upon senses of x. In other words, what makes the "that's why they call it money" example interesting is the fact that you are forced to imagine what money might mean other than the obvious sense in 1. above -- you might even go through the other senses in 2. through 7. in your head, find that none does the trick, and arrive at the (I think intended) interpretation that the speaker is obsessed with money. (The "... acting" example was also interesting to me in this vague sort of way, at least until Mark and others showed me that relevant senses are available.)
Here are a few more examples I've been collecting, all of which have the same basic loose-polysemy flavor (to me) as the "... money" example.
From the pilot episode of Monk:
Adrian Monk (Tony Shalhoub): "How long have you and Warren been married?"
Miranda St. Claire (Gail O'Grady): "Five years."
Monk: "Must be tough -- he's so busy, and now he's running for mayor. I would think that would be kind of stressful."
St. Claire: "You've been married, right?"
Monk: "Yes, I have."
St. Claire: "Then I don't have to tell you: every marriage is stressful. That's why they call it marriage."
marriage:
1a. The legal union of a man and woman as husband and wife. b. The state of being married; wedlock. c. A common-law marriage. d. A union between two persons having the customary but usually not the legal force of marriage: a same-sex marriage.
2. A wedding.
3. A close union.
4. Games The combination of the king and queen of the same suit, as in pinochle.
From 3rd World Bomb Squad (warning: graphic/tasteless/not-for-the-faint-of-heart), "an apparently real-life video clip" (forwarded to me by Neil Whitman, who heard about it elsewhere) with accompanying commentary (insert [sic] where appropriate):
Frame 1: Let me get this straight. You find a briefcase abandoned in a third world country and you think it might be a bomb.
Frame 2: What should you do?
Frame 3: (A) Open it and find out what's inside.
Frame 4: (B) Allow bystanders to look over your shoulder and crowd around
Frame 5: (C) Open it yourself without any protective equipment while being assisted by another officer equally unprotected, [pause] all while other officers are present who at least have body armor on.
Frame 6: (D) All of the above [pause] What do you think Third World Police Officer picked......
[This is followed by a 35-second video clip of a group of men crouched around a briefcase, which explodes, apparently killing some and injuring others.]
Frame 7: Thats why they're called 3rd world countries.
(Speaking of Neil Whitman: back in December, he discussed another kind of "that's why they call it x" example.)
One more example: I agree with Bridget at Ilani Ilani that the Elton John / Bernie Taupin lyric "I guess that's why they call it the blues" doesn't make much sense: the only sense being talked about in the song, as far as I can tell, is "feeling blue" -- sure, there's the salient sense of "blues music", but ...
And another, by way of Ben Zimmer (added 10/7/2005):
In Tuesday's episode of "Veronica Mars", Veronica is investigating a man's death, and she brings together his grieving daughter (Jessie) and his mistress (Carla) for the first time. The dialogue goes:
Carla: You look just like your picture.
Jessie (bitterly): That's why they call them *pictures*.
Here, the sense seems to be "Duh, the whole purpose of a picture is to resemble the person pictured." So the snowclone works to underscore the tautological obviousness of Carla's opening pleasantry, which Jessie explicitly rejects. (Jessie later warms up to Carla, of course.)
In yesterday's NYT magazine, Peter Maas has an article called "The Breaking Point", which features the concerns of Matthew Simmons about Saudi oil reserves, and puts Simmons' report of Saudi "fuzzy logic" to important rhetorical use:
Two years ago, Simmons went to Saudi Arabia on a government tour for business executives. The group was presented with the usual dog-and-pony show, but instead of being impressed, as most visitors tend to be, with the size and expertise of the Saudi oil industry, Simmons became perplexed. As he recalls in his somewhat heretical new book, ''Twilight in the Desert: The Coming Saudi Oil Shock and the World Economy,'' a senior manager at Aramco told the visitors that ''fuzzy logic'' would be used to estimate the amount of oil that could be recovered. Simmons had never heard of fuzzy logic. What could be fuzzy about an oil reservoir? He suspected that Aramco, despite its promises of endless supplies, might in fact not know how much oil remained to be recovered.
We can deduce from Simmons' ignorance of "fuzzy logic" that he hasn't bought a rice cooker recently. Not that buying a fuzzy logic rice cooker, or riding in a fuzzy logic elevator or a fuzzy logic subway train, would offer much insight into what the term really means. Of course, he could have checked with Google or looked at the Wikipedia entry.
On the other hand, looking the term up might not have helped. According to Simmons' book, when he asked the Aramco manager "what fuzzy logic precisely [means]", he got the standard sort of answer describing the work of Lotfi Zadeh on the logic of statements that are not crisply true or false, but are true to some intermediate degree. Thus the various Saudi oil fields, he was told, are neither exactly young and vigorous, nor old and played out, but somewhere in between. Simmons was not impressed by this answer, and writes that "hearing the Aramco manager's comment was one of the little events that tipped my thinking about the Saudi Arabian Oil Miracle towards skepticism". In fact, I suspect that the "fuzzy logic" presentation was based on relatively sensible methods (though I have no idea whether Simmons' skepticism about Saudi oil projections is justified on other grounds or not).
This SF Chronicle story "Rice goes digital cooked the fuzzy logic way" gives a similar sort of formulation:
Fuzzy logic recognizes more than simple true and false values; it sees degrees of truthfulness, for example, in the statement, "There is a 25 percent chance of rain today." Fuzzy logic deals with complex real systems. The Japanese learned exactly how well it worked when they used fuzzy logic to operate subway cars, which then ran and stopped more smoothly than when they were human-operated or automated. Fuzzy logic balanced out the complex components of acceleration, deceleration and braking.
Rice cooks in basically four stages: It stands in water, it boils, it absorbs (the "steamed stage") and then it rests. Heat is accelerated or decelerated for each stage and in different ways for each variety of rice.
This also is likely to leave a logical reader somewhat puzzled. Why are the complexities of subway car operation, or the four stages of rice cooking, improved by an approach that treats propositions as (say) 25% true?
I learned about Zadeh's fuzzy logic when I was a graduate student, back in the paleolithic era, but despite the intrinsic interest of the idea, there didn't seem to be any really impressive results or really useful applications. When I first heard about "fuzzy logic" control systems (during the neolithic age, about 20 years ago -- before Google or Wikipedia), I was puzzled. What exactly does the degree of truth of statements have to do with algorithms for controlling trains or elevators? When I asked this question after a dog-and-pony show at a Japanese research lab in the mid-1980s, I got answers like those that Simmons and the SF Chronicle got, repeating what I already knew about fuzzy logic, without adding anything convincing about the application to control theory. It sounded to me like technological double-talk. I was sure that the engineers were doing something relevant to control in complicated situations, but the "fuzzy logic" label seemed like a flack's evocative slogan for a variety of different technologies that didn't seem to have anything much to do with logic, fuzzy or otherwise.
A friend with a background in chemical engineering set me straight. His explanation went something like this: Standard control systems are linear. That means that controllable outputs (heating, accelerating, braking, whatever) are calculated as a linear function of available inputs (time series of temperature, velocity, and so on). Linearity makes it easy to design such systems with specified performance characteristics, to guarantee that the system is stable and won't go off into wild oscillations, and so on. However, the underlying mechanisms may be highly non-linear, and therefore the optimal coefficient choices for a linear control system may be quite different in different regions of a system's space of operating parameters. One possible solution is to use different sets of control coefficients for different ranges of input parameters. However, the transition from one control regime to another may not be a smooth one, and a system might even hover at the boundary for a while, switching back and forth. So the "fuzzy control" idea is to interpolate among the recipes for action given by different linear control systems. If the measured input variables put us halfway between the center of state A and the center of state B, then we should use output parameters that are halfway between state A's recipe and state B's recipe. If we're 2/3 of the way from A to B, then we mix 1/3 of A's recipe with 2/3 of B's; and so on.
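For readers who think better in code, here is a minimal sketch of that interpolation idea as my friend described it. It is not how any actual rice cooker or subway controller is implemented; the function names, the two regimes, and all the numbers are hypothetical, chosen only to make the "mix 1/3 of A's recipe with 2/3 of B's" arithmetic concrete.

```python
def linear_controller(gain, offset):
    """Return a controller whose output is a linear function of its input."""
    def control(x):
        return gain * x + offset
    return control


def blend_weight(x, center_a, center_b):
    """Fraction of the way from regime A's center to regime B's, clamped to [0, 1]."""
    t = (x - center_a) / (center_b - center_a)
    return max(0.0, min(1.0, t))


def fuzzy_control(x, center_a, controller_a, center_b, controller_b):
    """Mix the two regimes' recipes in proportion to where the input sits."""
    w = blend_weight(x, center_a, center_b)
    return (1.0 - w) * controller_a(x) + w * controller_b(x)


# Hypothetical regimes: A is centered at input 20, B at input 80.
low = linear_controller(gain=0.5, offset=10.0)
high = linear_controller(gain=-0.2, offset=40.0)

# Halfway between the centers: an even mix of the two recipes.
halfway = fuzzy_control(50.0, 20.0, low, 80.0, high)

# Two thirds of the way from A to B: 1/3 of A's recipe plus 2/3 of B's.
two_thirds = fuzzy_control(60.0, 20.0, low, 80.0, high)
```

Because the weight changes continuously with the input, the output never jumps when the system crosses from one regime to the other, which is the whole point of the exercise.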
In the case of the four stages of rice cooking, I suppose that a fuzzy logic controller is able to treat the process as a series of fuzzy or gradient transitions rather than a series of hard, stepwise transitions. I suspect that Simmons' Aramco executive was trying to present research that used a vaguely analogous method to fit a smoothed piecewise linear model to data about oil recovery as a function of various independent variables, including oil field "age". In both cases, the fuzzy approach might well be appropriate, under whatever name (though here's an alternative story about heating control -- and I have to say that I'm still quite happy with my old non-fuzzy thrift shop rice cooker...).
If you've shopped for a rice cooker recently, you'll have seen the addition of yet another buzzword: some cookers are not just "fuzzy", they're "neuro fuzzy". That term has a "what is this appliance doing to my brain" vibe that may not appeal to Americans -- I notice that our malls are not yet flooded with neuro fuzzy microwaves, for example. And indeed even plain fuzzy is by no means an entirely positive word. When George Bush famously accused Al Gore of "disparaging my [tax] plan with all this Washington fuzzy math", it was not a warm fuzzy moment.
But if you want to understand what "neuro fuzzy" means, you can read about it here. And there is a whole fuzzy world out there, as these links can help you discover. Though you might want to read this semi-skeptical review first.
[Update: Fernando Pereira emailed
Petroleum geologists have been pioneers on pretty sophisticated spatiotemporal estimation and smoothing techniques, for instance kriging (aka Gaussian process regression for statisticians). There are tight connections between GP regression and spline smoothing (via the theory of reproducing kernel Hilbert spaces). Either the Saudis are not hiring the best petroleum geologists, or they are being deliberately obfuscating with marketroid talk. I can't think of any situation in which fuzzy ideas (pun intended) would be preferable to Bayesian statistics for inference.
Well, if the "fuzzy logic" stuff in this case was for marketing purposes, it clearly had the opposite of the desired effect on at least one of its targets.]
[Update #2: Mike Albaugh emailed links to an interesting review article by Daniel Abramowitch, with an associated set of slides. ]
Well, John Dryden and the Duke of Buckingham are still leading the words as turds parade in the general category, but the U.S. Military Academy's "annual yearbook" the Howitzer has dethroned T.S. Eliot and Ezra Pound in the race for the earlier reference in writing to the specific term B.S. and its relatives. Ben Zimmer pointed out by email that HDAS cites B.S. from the Howitzer in 1900, in a volume not yet available digitally. Ben also cites a glossary of West Point Slang ("Published for the benefit of our struggling relatives and others who try to read our letters"), from the 1905 issue of the Howitzer and available from the U.S. Military Academy Digital Library, with these entries:
B-essy -- An adjective used to describe a person addicted to the use of superfluous or flowery language.
B.S. -- British science: the English language. Superfluous talk.
Big Green B. S. -- Popular name for Williams' "Composition and Rhetoric."
Little Green B. S. -- Abbot's "How to Write Clearly."
Red B. S. -- Meiklejohn's "English Language."
The gloss for "B.S." is especially nice.
Ben also made the general observation that
Taboo restrictions on "shit", "crap", etc. certainly limit historical investigations into the "language as excrement" metaphor. But it's interesting to note that various nonsensical terms for nonsense have been *interpreted* to refer to excrement in some euphemistic fashion. So, for instance, "horsefeathers" is widely believed to be a euphemism for "horseshit", though there is no solid evidence for that derivation [1], [2]. Similarly, "poppycock" is often reported to be an Anglicization of a Dutch word, pappekak, meaning 'soft dung'. But no such word has been found in Dutch dictionaries, and the etymological conjecture was put forth by Webster's New International 2nd Edition (1934) more than eighty years after the earliest known usage of "poppycock" [3]. Then there's "bushwa", which HDAS says is probably from French bourgeois, though it is now taken to be a euphemism for "bullshit".
It's readers like Ben who have enabled us to establish the position of Language Log in the highly competitive field of B.S. scholarship.
He certainly didn't scoop T.S. Eliot on bullshit, but he might still have been the first poet to use the excretion of bodily wastes as a metaphor for the deprecated expression of ideas. Dryden was definitely the one who invented the idea that preposition-stranding is wrong, but it's odd to think that someone could have invented the Language is Excrement metaphor. This connection seems so natural that I thought it must be as old as the concepts are, like the relation between increasing and rising. (Or is there a culture where your age goes down as you get older?) However, a bit of poking around in the excremental vocabulary of the classical languages failed to discover any examples of crap words used in a figurative sense to describe the expression of false or foolish or shoddy ideas. And a search of LION turned up a poem from the late 17th century, entitled "A Familiar Epistle to Mr. Julian, Secretary to the Muses", and variously attributed to Dryden and to George Villiers, Duke of Buckingham, in which this metaphor is proclaimed:
1 Thou common-shore of this Poetick Town,
2 Where all our Excrements of Wit are thrown:
3 For Sonnet, Satire, Baudry, Blasphemy,
4 Are empty'd and disburden'd all on thee.
5 The cholerick Wight untrussing in a Rage,
6 Finds thee, and leaves his Load upon thy Page.
LION says that "The attribution of this poem is questionable", and does not date its composition, but Villiers lived 1628-1687, and Dryden 1631-1700. Pending new claims (and I'll be surprised not to get some), I'll take this to be the earliest documentation of the Language is Excrement meme, of which the term bullshit is a later instance.
Another candidate from the same LION search is an extraordinary piece of poetic vituperation by John Oldham, entitled "Upon the Author of a Play call'd Sodom", and dated 1680. He certainly makes many comparisons between deprecated writing and various noxious materials:
22 Vile Sot! who clapt with Poetry art sick,
23 And void'st Corruption, like a Shanker'd Prick.
24 Like Ulcers, thy impostum'd Addle Brains,
25 Drop out in Matter, which thy Paper stains:
26 Whence nauseous Rhymes, by filthy Births proceed,
27 As Maggots, in some T---rd, ingendring breed.
and he ends with his own excremental comparison, where however the crucial analogy seems to be between a deprecated work and pieces of toilet paper:
47 Or (if I may ordain a Fate more fit)
48 For such foul, nasty, Excrements of Wit,
49 May they condemn'd to th'publick Jakes, be lent,
50 For me I'd fear the Piles, in vengeance sent
51 Shou'd I with them prophane my Fundament)
52 There bugger wiping Porters, when they shite,
53 And so thy Book it self, turn Sodomite.
Yesterday the AP wire ran a story by Erin Texeira with the lede
What do you call a minority that is becoming the majority? News that Texas is the fourth state in which non-Hispanic whites make up less than 50 percent of residents has renewed discussion about whether the term 'minority' has outlived its usefulness; critics include both liberals and conservatives.
Texeira quotes Roderick J. Harrison, identified as "a demographer with the Joint Center for Political and Economic Studies", as presenting a bit of etymology that surprised me:
"The word's origins are that these are populations that once had the status of minors before the law," Harrison said.
It's certainly true that minority can mean "The period of a person's life prior to attaining full age", and that this usage is very old, with OED citations back to 1493.
However, the OED sets up a separate sense for
3. a. A group or subdivision whose views or actions distinguish it from the main body of people; (originally spec.) a party voting together against a majority in a deliberative assembly or electoral body.
with citations from 1716:
1716 J. ADDISON Freeholder No. 9 p.11 The Parliament of Great Britain, against whom you bring a stale accusation which has been used by every minority in the memory of man.
1736 R. AINSWORTH Thes. Linguæ Latinæ, Minority (lesser number).
1765 L. STERNE Life Tristram Shandy VIII. xix. 66 To prevent your honours of the Majority and Minority from tearing the very flesh off your bones in contestation.
1790 E. BURKE Refl. Revol. in France (ed. 2) 186 In a democracy, the majority of the citizens is capable of exercising the most cruel oppressions upon the minority.
These uses don't seem to be derived from the legal term minor (which is attested from 1552), but instead seem to be transparently related to the ordinary word minor meaning "lesser" or "relatively small" (attested from 1230 or so), as applied to the counting of heads in a political contest.
The OED then treats the sense
3.b. A small group of people differing from the rest of the community in ethnic origin, religion, language, etc.; (now sometimes more generally) any identifiable subgroup within a society, esp. one perceived as suffering from discrimination or from relative lack of status or power.
as an extension of this "political minority" sense, with citations from 1837:
1837 U. S. Mag. & Democratic Rev. Oct. 3 Though we go for the republican principle of the supremacy of the will of the majority, we acknowledge, in general, a strong sympathy with minorities, and consider that their rights have a high moral claim on the respect and justice of majorities.
1855 N. Amer. Rev. Jan. 171 The nucleus afforded by a vast and unappropriated country for the establishment and growth of political and religious minorities transplanted from ancient states and hierarchies.
1888 S. MOORE tr. Marx & Engels Manifesto Communist Party i. 11 All previous movements were movements of minorities or in the interests of minorities.
1917 Times 28 Dec. 8/1 According to the declarations of..the quadruple alliance, protection of the right of minorities forms an essential component part of the constitutional right of peoples to self-determination.
The point of Texeira's article is that the term minority is being overtaken by demographic events. Harrison's argument (if he was quoted correctly, which is always a gamble) seems to be that it's appropriate to go on using the term, even if the groups so named become collectively the numerical majority, because it referred originally not to demographic statistics, but to the legal status of being a child before the law. But none of these groups have such a legal status today, so why would this etymology be relevant, even if it were true?
It's common enough for the literal sense of a word to evaporate in favor of what started as connotation. If that's happening to minority, so be it -- we can go on using it, if we want to, without making up factually and logically dubious excuses.
Reading through a fairly positive NYT review of the new movie The 40 Year-Old Virgin, I found out that it co-stars Catherine Keener. I had one of those tip-of-the-tongue-type reactions where I recognized the name but was having difficulty matching it with a face, so I IMDB'd -- and found that Keener also co-starred in the recent movie The Interpreter (mentioned at least a couple times here on Language Log). I also found, much to my surprise and amusement, that the convention of putting articles (a, the) at the end of a movie (or book, etc.) title for alphabetizing purposes has a funny result in French (and, I assume, other languages that are like French in relevant respects).
Quick background, for those who may not be (so) familiar: in French, the article corresponding to English the is le (masculine) or la (feminine), both of which lose their vowel (which is replaced orthographically by an apostrophe) when they precede a vowel-initial word: le livre 'the book' but l'homme 'the man' (the initial 'h' of homme is silent); la table 'the table' but l'église 'the church'. (This rule applies in pretty much the same way with several other function words, such as de 'of', je 'I', etc.)
The rule is strictly based on the sound that the immediately following word starts with; for example, 'the big man' is le grand homme or l'homme grand, depending on where you place the consonant-initial adjective. There's no rule for how to pronounce/write one of the relevant function words when it appears phrase-finally -- these words never appear in such contexts under natural circumstances -- but all signs point to the prevocalic form being the special case and the preconsonantal form being the elsewhere, default case.
(Update, added immediately after posting: this may depend on the variety of French you speak, as devoted Language Log readers may know. For at least some Canadian French speakers, for example, a sentence may end in a preposition such as de; some European French speakers, on the other hand, accept stranded prepositions except de and à.)
Except, of course, in the case of this convention of putting articles at the end of a title: the rule, at least orthographically, appears to be to use the form of the article that is used when the convention is not in force: The Interpreter is L'Interprète, so Interpreter, The is Interprète, L'. (Figure, go.)
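The convention is mechanical enough to sketch in a few lines of code. What follows is only a guess at the kind of rule a catalog might apply, not IMDB's actual procedure: the function name and the list of articles are my own assumptions, and it deliberately covers only le, la, and the elided l' (written with a plain ASCII apostrophe).

```python
# Hypothetical list of title-initial articles; "l'" is the elided form of le/la.
ARTICLES = ("le ", "la ", "l'")


def article_last(title):
    """Move a leading French definite article to the end of the title.

    The article is copied over in whatever form it had in the original title,
    so an elided "L'" stays elided even though nothing follows it any more.
    """
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            head = title[:len(article)].strip()
            rest = title[len(article):]
            return f"{rest}, {head}"
    return title


# Titles with no leading article pass through unchanged.
examples = [article_last(t) for t in ("L'Interprète", "Le Livre", "Amélie")]
```

Note that the sketch never asks what sound follows the article in its new position, which is exactly why the stranded L' comes out looking so odd.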
(Continuation of update: And I doubt that those Canadian French speakers would either pronounce or write d' when the fronted object of the stranded preposition is vowel-initial ...)
(Cross-posted, mutatis mutandis, on phonoloblog.)
On Sunday (8/14/05), the New York Times Magazine's Ethicist, Randy Cohen, took on some ethical issues in publishing, in response to a translator who had discovered that an article she was translating (from Hungarian into English) "was copied in large part from a lexicon published in 1929" and asked whether she should report her discovery to her employer ("a major American research institution"). Yes, says Cohen, not surprisingly. But then he goes on to enunciate a duty to correct errors in the language of texts -- a position that strikes me as well-intentioned but potentially troublesome in practice.
Cohen begins by observing that if the translator doesn't report the copying, probably no one will. And she has a duty to:
When it comes to ordinary civilians, both law and ethics impose only a limited duty to report wrongdoing... But you are not an ordinary civilian; you are part of a scholarly community, and different contexts entail different obligations. Intellectual integrity can be maintained only if members of your community report transgressions. Without this self-policing, the field cannot sustain its own values.
So far, we're in familiar territory. Now come the language issues:
You also have a duty to your employer. Everyone in the publishing process should report a solecism that would otherwise go undetected--a misspelling, a grammatical error. Similarly, all should report a serious ethical transgression. To keep silent would undermine the project on which you are employed.
There are two duties here, one apparently more weighty than the other: to report serious ethical transgressions and to report solecisms in language. Perhaps the second duty falls short of being an ethical imperative, but it is still a significant responsibility, according to Cohen. Cohen might want to think about how to frame this responsibility, since "solecism" covers a lot of territory -- and just how much depends on who you read. The authorities are by no means on the same page here, so to speak.
No one's denying that writers fall into error. There are typos, "cutnpaste errors" (in which parts of two different formulations survive the editing process), inadvertently omitted words, ill-chosen words, and much more. If you're part of the publishing enterprise and these come by your eyes, you should of course report them. But then there's a large gray (or grey) area, which includes matters on which there are house styles, different styles for different houses, and also "usages people keep telling you are wrong but which are actually standard in English" (in the words of Paul Brians, on his non-errors page). Brians's page is a place to start, but for serious, detailed advice you'll need to consult MWDEU.
What you don't want to do is start reporting all those things that some manual or other says are solecisms: people used as the plural of person, over used as a quantifier meaning 'more than', once used as a subordinator meaning 'after, when, as soon as', restrictive relative which, and on and on. That will only make you a pest to your colleagues and employers, and a monkey wrench in the works of the publishing process.
zwicky at-sign csli period stanford period edu
From Des von Bladet, two (implicitly) related items. First, a link to a story at BBC News telling us that
Former Spice Girl Victoria Beckham has admitted she has never read a book in her life - despite having apparently written her own 528-page autobiography.
and then a quote from the introduction to the 165-page Routledge textbook Knowledge and the social sciences: theory, method, practice:
When it snows I just see snow. If I think hard about it I might see sleet. But an Inuit living in Northern Canada, whose language includes over a dozen words for snow, will see a much more nuanced snowstorm than I ever can.
Des explains:
The claim is utterly false, of course. (See the title essay of G. Pullum's The Great Eskimow Vocabulary Hoax for an entertaining discussion, or his remarks here.)
Also, the Inuit are the Greenlandic bunch and the Canananadians aren't keen on the name; their habitat is technically an Arctic desert, so they don't get as much snö as all that, and the "than I ever can" is just plain icky - learn, you couldn't?
But most exasperating of all is that this necessarily unsourced factoid stars in the introduction to the block on "knowledge and knowing" where we get to be all epistemological for once. It'll star in my essay, too, I think.
With an appropriate modification of Des' jokey spelling, you can find Geoff's book here.
The second (and most recent) edition of the Routledge text, published in 2000, can be searched at amazon.com, and the quote Des cites is really there. A bit of the prior context:
Three key elements of the social construction of knowledge are explored in this book
Language is a social phenomenon and no description or explanation of the world can be created without recourse to it. But the language we inherit shapes what it is we see in the world and what we cannot see, what we know and what we cannot know.
And then on to the subtle cognitions of the frozen north. Some of what comes next:
Institutions are equally important in shaping the content and standing of knowledge systems. At one extreme, the dominance and public legitimacy of knowledge systems has been backed up and underscored by the use of force, terror and censorship. But even in the context of more diverse, open and plural societies, institutions exert powerful effects.
And then:
Which brings us to power. The production, dissemination and legitimization of knowledge requires access to and use of resources economic, political and cultural and, as the examples above suggest, these resources are rarely equally distributed.
The author of this introduction is David Goldblatt. I wonder if the stuff about Institutions and Power in this textbook is any better researched than the stuff about Language seems to be. Des, who is apparently reading the book, may tell us. It's certainly a lot easier to write a textbook if you can just kind of make up stuff that sounds plausible.
Posh Spice is straightforward about her scholarship:
The 31-year-old wife of England captain David Beckham told a Spanish magazine she does not have time to read.
I wish that academics who make things up about language were equally forthright.
"Spin Bunny" hears segue, thinks it's a neologism based on a metaphorical use of segway, blogs about it:
Played buzzword bingo in meetings? We thought so. PR is such a sad industry that such pasttimes are required in order to keep your mind on the job.
Yet buzzwords have sunk lower than we ever thought possible. Overheard in a recent fluffette meeting was (albeit by a client) the use of the word Segway as a verb. As in "Oh I guess now's the time to just Segway into that issue a little".
Bunny is not the first to perceive segue this way: Izzy's Guide to Starting and Running an Underground Paper sensibly suggests:
Before you craft the first sentence in a paragraph, ask yourself, "What the hell is my point for this paragraph?" Think about your thesis and what you're trying to argue. The first few sentences must support or relate to the strong stand you've already made on an issue.... The rest of the paragraph should be spent arguing this one point. Don't segway into another point. You want coherent writing, not chaos.
And at bookrags.com ("Writing and Studying Skills and Tips"), as part of "How to Write a Five Paragraph Essay", we have the nominal form:
The Introduction consists of an opening line. This opening line can be a generalization about life that pertains to your topic. It can also be a quotation. Another segway into the introduction is to start it with a little anecdote (or story). By "breaking the ice" so to speak with the reader, you are luring him or her into the rest of your essay, making it accessible and intriguing.
Not all uses are in advice to writers:
The second X-Files movie may or may not reveal straight undisputeable facts, but the last episode was still a great closer and segway into the movie, which I sure as hell can't wait for.
Overall, this little book offers much as a solid segway into intro Perl programming for bioinformatics.
As it is, this can be a good segway into an art lesson you have planned.
Someone asked me recently, "What's up with the 'jay Is' thing anyway?" (well, that's not entirely true, but makes a nice segway into a new blog entry)...
Gee, you spout a boatload of nonsense and gibberish about God knows what, then segway into something about your hard drive going bye bye because you picked up a virus?
As usual with eggcorns, this is a perfectly sensible metaphor, and wouldn't raise any problems if it weren't blocked by an existing usage.
[Update: Ben Zimmer points out that "segway" as an oddball spelling for segue pre-dates the naming of the scooter:
Just read your LgLog post on "segway". Though your first example is clearly an eggcorn, the other examples may simply be spelling errors, with no semantic reinterpretation (though the prominence of the Segway brand may have popularized the error). One can find the "segway" spelling in the Usenet archive all the way back to 1985:
http://groups.google.com/group/net.sf-lovers/msg/011176c532f8f296
Note: The June issue of JSF is a segway into the July issue and is therefore more enjoyable if you know the characters.
Also, "segway" has appeared as a conscious misspelling long before the introduction of the Segway scooter. Larry Monroe has hosted an Austin-based radio and TV show called "Segway City" since 1977:
http://www.larrymonroe.com/program/SC.html
http://www.kut.org/site/PageServer?pagename=mus_larrymonroe
]
[Update #2: Rich Baldwin had another theory...
An anecdote relating to your segue post, which you may find funny.
When I was very much younger, and having never understood it the few times I saw it in print, I thought the word segue was actually spelled segway. Further, I was sure that it was a borrowing from pig-latin, like ixnay. But I could never figure out what a "weseg" was; I kept expecting to find a well used phrase from somewhere in the worlds of stage and screen describing scene changes that started with "w" and had a "seg" in it, but I always came up empty handed.
Boy was I surprised when I learned the correct pronunciation of "seh-gooey"!
]
[Another anecdote by email from Ella at Cherrier:
My boyfriend produces an internet radio show, and the description for it in the directories used to read that it had "a million different segways going all over the place". He never really understood why this image put me into paroxysms of laughter - but eventually my excess of hilarity shamed him into changing it, more's the pity.
And Neal Goldfarb sent in citations showing that the spellings {segueway} and even {sequeway} are fairly common as well. ]
In my posting on Dr. Language and I, I pointed out two seductive effects of selective attention: the Recency Illusion (if you've noticed something only recently, you believe that it in fact originated recently) and the Frequency Illusion (once you notice a phenomenon, you believe that it happens a whole lot). The point here is that your impressions are unreliable; you need to find out what the facts are.
Now my colleagues Elizabeth Traugott and Isa Buchstaller have pointed out that when people lament, "Those kids today!", they're likely to be victims not only of the Recency Illusion ("today") but also of a related illusion that I'll call the Adolescent Illusion, the consequence of selective attention paid to the language of adolescents ("those kids") by adults. This illusion is a special case of a much broader effect, in which people pay attention selectively to members of groups they don't see themselves as belonging to and so locate phenomena as characteristics of these groups: an Out-group Illusion.
There are many familiar examples. Ask people about retro-not ("I think that's a smart idea -- not!") and lots of them will tell you it's both recent and characteristic of teen speech. As Larry Horn has observed (in, for example, his 1992 paper "The said and the unsaid", in Ohio State University Working Papers in Linguistics 40.163-92), neither of these impressions is really accurate.
Teenagers are likely to be blamed for most things that (some) people find reprehensible in language. This is not an entirely unreasonable view, since a great many linguistic changes do seem to originate in adolescent language. But, of course, you have to figure out whether the phenomenon you're looking at actually is one of these changes in the early stages of progress. Sometimes it is, sometimes it isn't.
More generally, people sometimes are exquisitely sensitive to some linguistic feature in groups they don't belong to, while missing it almost totally within their group. My current favorite example of the Out-group Illusion is a contribution to a Linguist List discussion of "double be" last year (issue 15.535, 2/9/04). Jill Murray, writing from Australia, joins the conversation:
Just as I was reading this posting I had a phonecall from an Irish speaker who used the construction twice in a five minute conversation. It is not a feature of Australian English and I had never heard it before. Both were "The thing is, is that ..."
Pat McConvell, who had been posting and writing about the phenomenon for over 15 years, then chimed in (issue 15.560, 2/12/04) to flatly contradict Murray's subjective impressions: most of his examples were from Australian speech, and he collected new examples "virtually every day" from Australian-born colleagues, on the radio in Australia, etc. Murray was detecting the feature only when it came from people whose speech she was likely to judge as unusual, exotic, marked.
Like I said, you just have to go and find out. I no longer trust my own subjective impressions, or those of other linguists, no matter how reputable. The OED, for instance, sometimes gives judgments about how frequent certain uses were in particular periods (many of them James Murray's, from well over a hundred years ago), as do reference works like Tauno Mustanoja's Middle English Syntax, but those are impressions based on experience with unsystematic samples, and they simply aren't reliable. There's lots of work to do.
zwicky at-sign csli period stanford period edu
In my recent posting on linguistic modesty at Henry Holt, I reviewed some episodes of primness in the New York Times, which caused reader Matthew Hutson to point out one that I had somehow missed: Michael Brick's "Longing for a Cuss-Free Zone", in the Fashion and Style section of 7/31/05. Brick fastidiously avoids even the indirect f-bomb (a vivid version of f-word) in favor of the still more indirect word-bomb (and its abbreviated version bomb), used as both noun and verb in his piece.
Brick glosses word-bomb, which he admits is a "clunky construction", at arm's length:
I've made ["word-bomb"] up as a stand-in for a well-known hyphenated term that refers to an actual profanity. In use for at least a decade, the original hyphenated term (which begins with the first letter of the profanity and ends with "bomb") gives a knowing wink to the actual profanity's paradoxical place as a taboo in wide circulation.
And the second-level indirection (avoiding the contaminated letter f) doesn't come with a knowing wink? Yeah, sure. What's next? The word-b, which sidesteps the now possibly contaminated word bomb? Obzo, the rot-13 version of bomb? (Shpx would clearly be too racy.)
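For readers who haven't met rot-13: it rotates each letter 13 places through the alphabet, so applying it twice returns the original text. A minimal sketch in Python follows; the helper name `rot13` and the table-building approach are illustrative choices, not anything from Brick's piece or the Times stylebook.

```python
import string

# Build a translation table that rotates each letter 13 places,
# preserving case and leaving non-letters (such as hyphens) untouched.
LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase
ROT13 = str.maketrans(
    LOWER + UPPER,
    LOWER[13:] + LOWER[:13] + UPPER[13:] + UPPER[:13],
)

def rot13(text: str) -> str:
    """Return text with every ASCII letter rotated 13 places."""
    return text.translate(ROT13)

print(rot13("Bomb"))         # Obzo
print(rot13(rot13("Bomb")))  # Bomb (rot-13 is its own inverse)
```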
Apparently the whole exercise really is designed to keep the NYT from winking at its readers (don't you just hate it when newspapers do that?):
... very rarely does the paper print those obvious, winking, letter-word stand-ins. As the Times's two-page stylebook entry on obscenity says, "An article should not seem to be saying, 'Look, I want to use this word but they won't let me.'"
So what, kids, does word-bomb say? "I'd never use this word in polite company, and can barely bring myself to allude to it, even very obliquely"? Well, aren't you fastidious!
As icing on the cake, there's a letter on 8/7/05 objecting to Brick's verbing of the noun (word-)bomb, as in: "Outside office buildings smokers bomb their bosses"! A. Scott Falk writes, "This inconsistent use of a misguided neologism strikes me as a greater affront to the English language in polite society than any familiar four-letter word could ever be." You read it here: verbing is so evil, such a symptom of the breakdown of society, that it's even worse than fuck. So: no fuckin' verbing! And no fuckin' winking, either! And wipe that smirk off your face!
zwicky at-sign csli period stanford period edu
I've learned my lesson about misquotations in the New York Times, so I'm taking this one from today's Maureen Dowd column with a grain of salt (and added emphasis):
Pressed about how he could ride his bike while refusing to see a grieving mom of a dead soldier who's camped outside his ranch, he added: "So I'm mindful of what goes on around me. On the other hand, I'm also mindful that I've got a life to live and will do so."
This is a nice example of do so anaphora in which the referent is not exactly syntactically present in the discourse (do so = "live my life / live the life I've got"). These types of examples (and even more striking ones) have been studied extensively by my colleague Andy Kehler in collaboration with Gregory Ward. Some of their joint papers are available in PDF format from Greg's publications page; their most recent statement, "Constraints on Ellipsis and Event Reference", appeared in The Handbook of Pragmatics (Blackwell, 2004).
The source of the quote in Dowd's column is, of course, George W. Bush. As usual (IMHO), Dowd puts it well, but I think Ann Telnaes puts it better:
In discussing Jim Holt's New Yorker piece on bullshit, I neglected to mention what Holt has to say about the etymology:
The word “bull,” used to characterize discourse, is of uncertain origin. One venerable conjecture was that it began as a contemptuous reference to papal edicts known as bulls (from the bulla, or seal, appended to the document). Another linked it to the famously nonsensical Obadiah Bull, an Irish lawyer in London during the reign of Henry VII. It was only in the twentieth century that the use of “bull” to mean pretentious, deceitful, jejune language became semantically attached to the male of the bovine species—or, more particularly, to the excrement therefrom. Today, it is generally, albeit erroneously, thought to have arisen as a euphemistic shortening of “bullshit,” a term that came into currency, dictionaries tell us, around 1915.
Holt apparently got his information from the OED, which is much more dismissive of those "venerable conjectures" than his description might lead a reader to believe:
No foundation appears for the guess that the word originated in ‘a contemptuous allusion to papal edicts’, nor for the assertion of the ‘British Apollo’ (No. 22. 1708) that ‘it became a Proverb from the repeated Blunders of one Obadiah Bull, a Lawyer of London, who liv'd in the Reign of K. Henry the Seventh’.
Though the OED admits the word bull in the sense "Trivial, insincere, or untruthful talk or writing; nonsense" is "of unknown origin", it directs our attention to
OF. boul, boule, bole fraud, deceit, trickery; mod.Icel. bull ‘nonsense’; also ME. bull BUL ‘falsehood’, and BULL v.3, to befool, mock, cheat.
all of which seem sounder references than those that Holt chooses to quote.
Holt also neglects to tell us that the OED's two earliest citations for bullshit are from Wyndham Lewis and E.E. Cummings:
c1915 WYNDHAM LEWIS Let. (1963) 66 Eliot has sent me Bullshit and the Ballad for Big Louise. They are excellent bits of scholarly ribaldry.
1928 E. E. CUMMINGS Enormous Room vii. 194 When we asked him once what he thought about the war, he replied, ‘I t'ink lotta bullsh--t.’
But the first citation is self-debunking, since it refers to the title of something written earlier! And indeed, according to this paper (Loretta Johnson, "T.S. Eliot's Bawdy Verse: Lulu, Bolo and More Ties", Journal of Modern Literature 27.1 (2003) 14-25), the letter in question was sent to Ezra Pound and refers to some poems that Eliot had written earlier:
On February 2, 1915, Lewis wrote to Pound, "Eliot has sent me Bullshit & the Ballad for Big Louise." He mistitles "The Triumph of Bullshit" and "Ballade pour la grosse Lulu." "They are excellent bits of scholarly ribaldry ... I am longing to print them in Blast; but stick to my naif determination to have no 'Words Ending in -Uck, -Unt and -Ugger.'"
Johnson explains that:
"The Triumph of Bullshit" and "Ballade pour la grosse Lulu" address the vagaries of publishing and the mediocrity of the press. ... In three octaves and a final quatrain, the narrator thumbs his nose at the "Ladies" who are reading his work and determining its fate. In the first three stanzas he addresses the "Ladies, on whom my attentions have waited," "Ladies, who find my intentions ridiculous," and "Ladies who think me unduly vociferous." Then each stanza ends with "For Christ's sake stick it up your ass." The abababab, cdcdcdcd, and efefefef rhymes are unconventional, linking "waited" with "alembicated," "constipated," and "imitated." And "small" rhymes partially with "galamatias" and "ass." The final stanza refers to the word "bullshit" in the title. "It" shall triumph when "with silver foot" they step in it, "among the Theories scattered on the grass." "And then for Christ's sake," the narrator adds, "stick them up your ass."
Apparently Eliot's Bullshit was originally ungendered:
The first version of "Triumph," written or transcribed probably in 1910, addresses the "Critics" instead of the "Ladies." When it first was written, Eliot was not in print, except for poems in the Smith Record and the Harvard Advocate. In 1914, he wrote to Aiken stating he was writing and enclosed his "war poem," entitled "UP BOYS AND AT 'EM," which ends: "But the cabin boy was sav'd alive/ And bugger'd, in the sphincter." Eliot, perhaps amused by the idea of offending sensitive female taste, joked about publishing the Notebook, naming it "Inventions of the March Hare." He wrote he could give a few lectures and become a "sentimental Tommy," punning on his name and alluding to poetry readings at Harold Monro's Poetry Bookshop in London and the popularity of J.M. Barrie's Sentimental Tommy (1896). Critical of the "Ladies" who influence popular taste, Eliot yearned to be published, in part to impress his father. According to Valerie Eliot, when they parted for the last time at the end of his 1915 visit, Eliot was convinced that his father thought him a failure." Publication might reverse that problem.
When Eliot changed "Critics" to "Ladies" in 1916, he changed the meaning significantly. Ricks suggests that Eliot may have felt "at the mercy" of several women, including his wife. Other "Ladies" could have been Dora Marsden and Harriet Weaver of The New Freewoman, editors from whom Pound, in contest with Amy Lowell, tried as early as 1913 to wrest some editorial control. Pound was also working on Harriet Monroe to publish Eliot's poetry. After his premature discontent and following the instrumental encouragement of Pound, Eliot began to publish. Monroe published "The Love Song of J. Alfred Prufrock" in the 1915 issue of Poetry. She also accepted "The Boston Evening Transcript," "Cousin Nancy," and "Aunt Helen" for the October 1915 issue.
I guess this must be the 1916 version of The Triumph of Bullshit:
Ladies, on whom my attentions have waited
If you consider my merits are small
Etiolated, alembicated,
Orotund, tasteless, fantastical,
Monotonous, crotchety, constipated,
Impotent galamatias
Affected, possibly imitated,
For Christ's sake stick it up your ass.

Ladies, who find my intentions ridiculous
Awkward insipid and horribly gauche
Pompous, pretentious, ineptly meticulous
Dull as the heart of an unbaked brioche
Floundering versicles feebly versiculous
Often attenuate, frequently crass
Attempts at emotions that turn isiculous,
For Christ's sake stick it up your ass.

Ladies who think me unduly vociferous
Amiable cabotin making a noise
That people may cry out "this stuff is too stiff for us" -
Ingenuous child with a box of new toys
Toy lions carnivorous, cannons fumiferous
Engines vaporous - all this will pass;
Quite innocent - "he only wants to make shiver us."
For Christ's sake stick it up your ass.

And when thyself with silver foot shalt pass
Among the Theories scattered on the grass
Take up my good intentions with the rest
And then for Christ's sake stick them up your ass.
I haven't been able to find the 1910 version.
But anyhow, I don't believe that T.S. Eliot really invented bullshit in 1910. He could hardly have aimed to shock the "ladies" by naming his little poem "The Triumph of Bullshit" if the term had not already been a commonplace vulgarity.
[Update: Steve from Language Hat emailed
To complete the modernist trifecta, Ezra Pound used it in 1914 in a letter to Joyce:
"I enclose a prize sample of bull shit."
(That's the first clearly metaphorical use cited in HDAS; they include a couple of references to the actual excrement of the bull from much earlier.)
"HDAS" is the Oxford Historical Dictionary of American Slang.
And Uche Ogbuji at Copia has some thoughts about Eliot's Triumph:
Horrid genius. Eliot attaches several senses to "ladies", including (and this is the sense that does find best concord with the poem), the society matrons who influenced popular, and hence critical, taste. But Eliot is also a bit of a coward here. ...
... when it's time for brave, open sally, Eliot prefers weak targets.
Or at least targets that he can treat as weak.
Anyhow, it's -- poetic justice? -- to find Pound, Eliot and Joyce all lexicographically implicated in the origins of bullshit. ]
One of the staff assignments here at Language Log Plaza is to keep track of instances of conspicuous linguistic modesty in the media. It all started with a little rant by Geoff Pullum about how NPR managed to broadcast an entire talk show about Harry Frankfurt's book On Bullshit without a single mention of the title. Meanwhile, the very modest New York Times refers to the book as On Bull _ _ _ _. (It's been on the NYT Book Review's nonfiction best seller list for 20 weeks now, so the issue comes up at least once a week.)
A while back, I noted the way the Book Review coped with a double-whammy, the title of Nick Flynn's memoir Another Bullshit Night in Suck City: Another Bull _ _ _ _ Night ... (avoiding shit with one kind of ellipsis mark and Suck City with another).
And now, under the imprint of Henry Holt and Company (the Metropolitan Books division), Guy Deutscher goes to such lengths to avoid the naughty word fart that he has to supply clues to its identity.
The word comes up on p. 85 of Deutscher's entertaining and informative The Unfolding of Language, in a discussion of Grimm's law and the doublets it gives rise to in modern English, like pater(nal) and father:
Sometimes the siblings have gone such separate ways that upon meeting up they would hardly give each other a second glance. This is the case with the borrowed part(ridge) and the native ****. (The Greeks, who are the ultimate source of the loanword partridge, presumably gave it this name because of the loud whirring sound it makes when suddenly flushed out.)
We have the Grimm's law context, which suggests that the averted word begins with the Germanic counterpart to p, that is, f -- and, of course, that it ends with the Germanic counterpart to t, that is, in Deutscher's transcription th (as in tooth). That would suggest something like farth. The allusion to loud whirring sounds is, I figure, supposed to bring the reader to a similar-sounding naughty English word that has something to do with sound. That's probably enough to lead you to the word fart.
But why such indirection? Even fuck doesn't usually get all four of its letters ellipted. Surely f**t would have been sufficiently prim, and if the editors were worried that readers might think of foot first, why then f*rt would have worked. The story about whirring partridges would still be necessary, to account for the semantic relationship, but the whole business would have been easier on the reader.
[Added 8/21/05: Deutscher has written to ask: "... and what about the joy of discovery - is that worth nothing?" Ah, the "****" and the whirring partridges were meant as a little puzzle for the reader, but I didn't see that.]
Me, I would have gone for fart, flat out. It's a bit on the vulgar side, but we're not in the prim pages of the NYT here, and we're all adults. (Not that I would warn children away from Deutscher's book, but I suspect that few children would make it through an extended discussion of triliteral roots in Semitic, or of the laryngeal hypothesis and the discovery of Hittite. Great stuff for teenagers, though.) Anyway, sometimes a little vulgarity is just the ticket, as in one of John Mortimer's essays in Where There's a Will, p. 141:
Some of the best things in life, works that are a pleasure to be handed on to the generations to come, have vulgarity and sentimentality in spades... Indeed it's impossible to read through, say, the novels of Virginia Woolf without longing for a touch, a mere hint of vulgarity or sentimentality, a tear-jerking scene perhaps, or even a joke about a fart.
[Added 8/21/05: A correspondent going by the name Yarrow has written to observe that Virginia Woolf was not above a certain coarseness on occasion. Yarrow points to the beginning of Orlando:
He--for there could be no doubt of his sex, though the fashion of the time did something to disguise it--was in the act of slicing at the head of a Moor which swung from the rafters. It was the colour of an old football, and more or less the shape of one, save for the sunken cheeks and a strand or two of coarse, dry hair, like the hair on a cocoanut.
The "no doubt of his sex" is indirect, but the description of the old head is vividly earthy.]
zwicky at-sign csli period stanford period edu
According to Jim Holt's discussion in the New Yorker of Gerald Cohen's paper "Deeper into Bullshit", Cohen independently discovered what I have always privately called Labov's Test:
...how could one prove ... that a given statement is hopelessly unclear, and hence bullshit? One proposed test is to add a “not” to the statement and see if that makes any difference to its plausibility. If it doesn't, that statement is bullshit.
I described and exemplified Labov's Test in one of my first Language Log posts, though I attributed it to an anonymous "colleague" rather than to its author, Bill Labov. Since I was relying on my memory of a dinner-table anecdote from many years ago, I wasn't sure that I had the story exactly right. And Bill is a person who is very serious about quantitative validation of his empirical claims, so I thought he might be uneasy about being cited as author of an informal experiment without a large enough N or a double-blind design. And I was writing about whether or not there is any signal in Jacques Derrida's noise, so I felt constrained by contrast to be careful and exact.
But finding the truth depends as much on open presentation and discussion as on private research and reasoning; and this is a blog, not a refereed journal; and Labov's Test is a worthwhile rule of thumb, whose (co-)invention Bill deserves credit for. So I'm outing him as the author. I suppose we could call it the Labov-Cohen test. Some might object that Cohen actually wrote about it, while Labov's contribution was only through the oral tradition, but basic methodological innovations of this kind are often introduced by what scholars call "personal communication".
Anyhow, I have another purpose in bringing this up, which is to criticize Holt for carelessness in interpreting this useful test. (I'm not sure whether to bring Cohen into this or not, since I haven't yet read his paper “Deeper into Bullshit,” in Sarah Buss, Ed., Contours of Agency: Essays on Themes from Harry Frankfurt.)
A key thing about the test, missing from Holt's discussion, was front-and-center in Bill Labov's original presentation, as I recalled and described it:
This ... reminds me of a parlor game that a colleague of mine claims to have played, back in the day when it was easier to find academics who took Derrida seriously.
My colleague would open one of Derrida's works to a random page, pick a random sentence, write it down, and then (above or below it) write a variant in which positive and negative were interchanged, or a word or phrase was replaced with one of opposite meaning. He would then challenge the assembled Derrida partisans to guess which was the original and which was the variant. The point was that Derrida's admirers are generally unable to distinguish his pronouncements from their opposites at better than chance level, suggesting that the content is a sophisticated form of white noise. On this view, as Wolfgang Pauli once said of someone else, Derrida is "not even wrong."
The point is that this is a test of communication from author to audience, not a test of the author's meaningfulness in itself. And it is framed as a behavioral test, not as a test of the author's intentions with respect to the relation between text and truth, or any other aspect of the author's state of mind. Labov's test could fail as easily because the audience is ignorant as because the text is nonsense.
For example, a 1999 JHEP paper by Seiberg and Whitten contains one of these two sentences, differing only in the introduction of "not" in the second one:
(1) ... at nonzero B (unless B is anti-self-dual) a configuration of a threebrane and a separated −1-brane is BPS, so an instanton on the threebrane cannot shrink to a point and escape.
(2) ... at nonzero B (unless B is anti-self-dual) a configuration of a threebrane and a separated −1-brane is not BPS, so an instanton on the threebrane cannot shrink to a point and escape.
Both sentences seem equally plausible to me. However, I don't take this as evidence that string theory is bullshit, but rather as evidence that I don't understand its mathematics. If I claimed to understand the mathematics of string theory as discussed in this paper, but was unable to pass Labov's test with respect to a set of pairs of sentences like this one, you'd be justified in concluding that I was bullshitting about understanding the paper, but not that Seiberg and Whitten were bullshitting in writing it. (By the way, the second sentence is the original one; and BPS is short for Bogomol'nyi, Prasad and Sommerfield, and is discussed at greater length here, if that helps...)
(My interpretation of) Labov's claim about Derrida and similar writers is that all of his readers will fail the test (statistically speaking) all the time. If this were true, then we could conclude that everyone who claims to have understood Derrida (for example) is a bullshitter, or at least is in some sense deluded. This universal obscurity would certainly raise the suspicion that there was no suitable object of understanding available, for instance because the work is simply (or rather, complexly) nonsense.
I suspect that (my memory of) Bill's experimental hypothesis is often true, in the sense that in a controlled experiment, the partisans of such "theory" would fail to distinguish its statements from their negation at greater than chance level, in a large proportion of replications across statements and subjects. Of course, the ostentatious obscurity of such work suggests that its practitioners might be pleased rather than distressed by this result. And we'd need another test to distinguish this case from the similar results obtained for any sufficiently difficult mathematics. However, the point of my original post was that the test would not always be negative. Sometimes, Derrida was just wrong.
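The phrase "at greater than chance level" can be made concrete with an exact binomial calculation. The following is a minimal sketch, not a procedure that Labov or Cohen specified: it assumes each trial is a forced choice between an original sentence and its altered variant, so that a pure guesser is right with probability 0.5. The function name and the example counts are hypothetical.

```python
from math import comb

def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: the probability of getting at
    least `correct` answers right out of `trials` by guessing alone,
    when each guess succeeds with probability `chance`."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical results, for illustration only.
print(p_value_at_least(11, 20))  # barely above half: consistent with guessing
print(p_value_at_least(18, 20))  # far above half: hard to attribute to guessing
```

A large p-value means the readers' score is what guessing would produce. As argued above, that outcome alone cannot say whether the fault lies with the text or with the audience.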
Peter Suber sent in a usage that struck him as odd, from an article by John Blossom:
Usage statistics are the lifeblood of this exercise and they can illuminate a collection's importance to some degree, but what happens to content once it's away from the bounds of centralized statistics? It gets referenced in citation links, onpassed in emails and generally works its way into the infrastructure of an organization. [emphasis added]
I speculated by return email that Blossom's past role as VP of Outsell, Inc. might have given him a taste for words made from a preposition and a verb written solid. Peter's response:
I read enough stuff from Outsell and John Blossom to think that this isn't a house style at Outsell. But I like the idea that it's a house joke and that just this once it slipped through the copy editor (or throughslipped the copy editor).
Google finds 338 instances of {onpassed} and 186 of {onpassing}. A number are from South Asian sources, but some appear to come from native speakers of English -- often from Australia or New Zealand:
While tracking back through the snail trail left by my Nike email, a friend onpassed a story in The Financial Times where Doug Miller, of the Toronto-based consultancy, Environics warned ...
Unfortunately, those myths have been onpassed to current generations.
The major payments are for public housing (onpassed by the Budget to the Department of Housing - a Non Budget Sector agency) and roads.
The onpassing of these receipts to the ACT Government’s Central Financing Unit...
The first category relates to payments for onpassing to other bodies and individuals.
And at least two are from other articles by John Blossom, so if it's a joke that throughslipped the copy editor, (s)he upmessed more than once:
Though their consumer orientation would probably keep them out of the EContent 100 directly, expect companies such as Shared Media Licensing, Inc., creators of the Weed rights management system enabling content to be monetized as it's onpassed, and new companies such as SNOCAP to represent the beginning of a new wave of rights management capabilities that enable both traditional publishers and individuals and institutions to find profits from content in ways that they had never considered before.
Think long and hard about how the digital objects that you distribute can have life in the hands of your users beyond the first glance and as they get onpassed from one person to another.
There are plenty of precedents, such as bypass and uplift and overflow, and apparently onpass is well established for some people. But it's news to some others, one of whom asks:
(Q. I'm onpassing to you an email...)
A. Is onpassing sort of like outgassing?
No, I think that outgassing is more like upskirting... While outgassing is not related to a prepositional phrase of the form "out the gas", it could be related to a phrase like "let out the gas", just as upskirting could be related to a phrase "look up the skirt".
[Update: Drew Smith writes:
I did a quick LexisNexis Academic search and found 92 probable uses of "onpass", "onpassed", or "onpassing" in major papers dating back as early as the April 23, 1982 issue of The Financial Times of London, which had this by Peter Montagnon on page 25 in Section II: "Although the borrowing vehicle is Morgan's offshore subsidiary J. P. Morgan International Finance N.V. proceeds of the note will be onpassed to Morgan Guaranty Trust in the form of a subordinated capital note."
Thanks to Drew for uplooking this and alongsending it! I just outchecked Google News, and found just one example, from a South African source: "He says the Protector avoids dealing with this core issue -- Imvume's onpassing of R11m of taxpayers' money to the ANC -- by 'a neat yet outrageous ... " A search of the current indices of the NYT and the WaPo didn't upturn anything.
Seriously, it's clear that onpass is an established usage, though it's one that some people (including Peter Suber and me) have managed to miss up to now. I note that many of the citations, from the web as well as this journalistic example, seem to come from the vocabulary of finance-speak, so perhaps it's become established there and is now outleaking into more general use.
In response to my post "What happened to the 1940s?", Michel Vuijlsteke sent in the results of some experiments of his own on the relative frequency of 20th-century decade names in Dutch, French and German. His conclusions (which he qualifies as "preliminary and unscientific"):
- Dutch, French or German: no one writes of the 1940s much on the web, as you pointed out.
- Dutch speakers don't care much for the 1990s
- German speakers on the other hand *love* the 1990s (fall of Berlin Wall / reunification perhaps?)
- There's something fishy going on with Google's French pages: it contradicts all of the other trends in all of the other languages
Ah, Google and the French: wheels within wheels.
Anyhow, Michel sent graphs for Dutch, French and German (not reproduced here), and here are his counts for Dutch:
Search engine | de jaren 10 | de jaren 20 | de jaren 30 | de jaren 40 | de jaren 50 | de jaren 60 | de jaren 70 | de jaren 80 | de jaren 90 |
Google | 895 | 31900 | 56100 | 29200 | 122000 | 192000 | 228000 | 295000 | 162000 |
Yahoo | 6430 | 103000 | 175000 | 81300 | 374000 | 538000 | 651000 | 750000 | 513000 |
MSN | 2842 | 26255 | 41826 | 19876 | 74246 | 104467 | 136817 | 144827 | 119179 |
for French:
Search engine | les années 10 | les années 20 | les années 30 | les années 40 | les années 50 | les années 60 | les années 70 | les années 80 | les années 90
Google | 6330 | 143000 | 269000 | 124000 | 712000 | 948000 | 654000 | 730000 | 822000
Yahoo | 19100 | 395000 | 684000 | 316000 | 1E+06 | 2E+06 | 2E+06 | 3E+06 | 2E+06
MSN | 12445 | 76693 | 141856 | 61837 | 219592 | 326127 | 397380 | 516601 | 367362
and for German:
Search engine | 1910er | 1920er | 1930er | 1940er | 1950er | 1960er | 1970er | 1980er | 1990er
Google | 77900 | 219000 | 263000 | 139000 | 341000 | 454000 | 496000 | 510000 | 692000
Yahoo | 92300 | 443000 | 437000 | 243000 | 701000 | 891000 | 980000 | 1E+06 | 1E+06
MSN | 6075 | 51201 | 54000 | 19345 | 77429 | 101193 | 107841 | 112712 | 142733
[Update: Trevor at Kalebeul emailed:
My lunch companion suspects that the 1940s score so low because people often refer to them as "the war years." The yuppie years were neither so dramatic nor so focussed, so "the 1980s" tends to be preferred.
Yes, this was also my hypothesis.
In addition, I believe that "the war years" (essentially the first half of the decade) are felt to be very different from "the postwar years" (the second half of the decade and onward), so that there is a smaller tendency to refer to the decade as a whole under any name.
When people talk about "the 1960s" they really mean "1965-1969" or so, in most cases, but the 1960-1965 period has no real identity of its own, so "the 1960s" is used.
Trevor offered another suggestion as well:
Another more creative excuse for under-represented 40s would be that, because of paper shortages, they were not very good at documenting themselves, thus depriving us of primary sources. My grandad wrote many of his letters from the front on toilet paper, and the censors fortunately understood the concept of added value.
Though I'm no archivist, my guess is that the WWII years are nevertheless pretty well documented, and they were certainly full of events for people to write about in retrospect.]
[Update #2: Andrew Gray emailed with the plausible suggestion that the smaller number of references to the teens is essentially morphological in character:
...in English, we don't really ever say "the tens" (though we will say "the nineteen-tens"), because it seems verbally clumsy, and it's quite possible that this will spill over into writing about the decade. If you never *say* a phrase, you're less likely to use it in writing, in my experience. As an aside, try saying it to yourself - does it evoke any mental images? It draws a blank for me - "the 1910s" is a phrase with very few connotations, and so is probably less likely to be used idiomatically.
I went and did a little searching. I've taken Google numbers, as you did, for every decade back to 1510, so we can compare five centuries, and in all cases the numbers drop off sharply in the --10s decade. I missed out the first decade of each century, "1700s" and the like, since these usually refer to the century as well as the decade and so are pretty skewed.
http://www.generalist.org.uk/decades.png
The numbers on the Y-axis are "percentage of hits for that century which were this decade"; I normalised it to this to allow comparison across centuries, as the absolute numbers steadily shrank over time.
]
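Andrew's normalisation is easy to reproduce. Here is a minimal sketch in Python; it is not his actual script, and the function name `percent_of_century` is made up for illustration. Since his raw counts aren't given, the input is the English-language Google column (in millions) from "What happened to the 1940s?", with the first decade of the century left out as he describes.

```python
def percent_of_century(counts):
    """Express each decade's count as a percentage of the century's total."""
    total = sum(counts.values())
    return {decade: 100.0 * n / total for decade, n in counts.items()}


# Google counts in millions, 1910s through 1990s (the "1900s" are omitted,
# since that string usually refers to the whole century).
google_20th = {
    "1910s": 0.801, "1920s": 6.38, "1930s": 9.19,
    "1940s": 6.53, "1950s": 13.7, "1960s": 18.8,
    "1970s": 21.3, "1980s": 23.9, "1990s": 27.4,
}

shares = percent_of_century(google_20th)
for decade, share in shares.items():
    print(f"{decade}: {share:5.1f}%")
```

Because each century's shares sum to 100, centuries with very different absolute counts can be drawn on the same axis.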
In response to a call from Ben Goldacre at Bad Science, the folks at Cosmic Variance are swapping "silly talk about science" anecdotes, and Brian Weatherson is collecting "silly talk about philosophy" tales.
Some of the best stories come from forced conversations -- on a plane, or getting a haircut. For example, one from a comment at Cosmic Variance:
Woman on plane: “So, what do you do?”
me: “I’m an astronomer.”
woman: “That must be fun. But… what’s left to do? I mean, we already know the names of all the stars!”
and one from a comment at TAR:
The setting: Prof. Garrett, on the plane, sitting next to a middle-aged woman.
She asks, "So, what do you do?"
Prof. Garrett, "I'm a philosopher."
"Oh! What are some of your sayings?"
Here's my contribution to the genre:
Person at party: "Someone told me that you know how to interpret spectrograms. That's so interesting! Could you teach me?"
Phonetician: "Well, sure, it's not hard to learn the basic techniques."
Person at party: "That would be so exciting! I've always been sensitive to communications from the spirit world, and with the help of scientific instruments, I can only imagine..."
("spectre-grams", get it?)
My own feeling about these situations is that they present a wonderful opportunity to emulate Ali G, but in reverse, so to speak. Any self-respecting philosopher ought to be prepared with some gnomic sayings that can bear several interpretations, at least some of them scandalous. An astronomer might point out, deadpan, that with the fall of communism, all the stars, comets and asteroids named by the Russians are up for grabs again, with the rights going for big bucks on the international cosmology auction circuit. And spectrographic interpretation of the voices of the dead is a piece of cake, actually, but the real research challenge is to analyze the voices of those who haven't been born.
Anyhow, if you have any good silly-talk-about-linguistics stories, send them to me and I'll add them to this post.
[Well, several people sent in the inevitable "you're a linguist? so how many languages do you speak?" question. So far only Eric Bakovic has supplied a good answer: "Both of them."
Joshua Guenter offers the following: "Once, at a party, upon telling a person that I was studying Linguistics, I got the reply 'Oh, so William Safire must be the bigwig in your field, right?'"
And a literary silly-talk from Carrie Shanafelt:
Once, when I lived in Cleveland, I went to my favorite diner in the middle of the night to read Boswell and drink coffee. A woman at a nearby table yelled, "Miss! What the hell you readin'? That thing's biggern the goddamn Bible!" I said it was the Life of Samuel Johnson. She nodded knowingly, smiled, and said, "I loved him in Pulp Fiction."
]
[TStT has a lovely silly-talk anecdote about a mistaken (Western) folk tale about Japanese -- no, Tokyo is not Kyoto backwards -- see his post for the details. My favorite part of the story is not about linguistics or about silly talk, however, but about etiquette and self-presentation:
"At this point in the conversation, I was presented with a dilemma. In social situations, I don't like to act like that guy—you know, the guy who has to be right all the time and rubs your face in it? (Note that I say I don't want to "act like" him, because I am in fact that guy, but I try to keep a lid on it.)"
Words of wisdom for us all.]
[Emily Bender sent in three examples. One is the other commonest comment on being a linguist:
Aside from "How many languages do you speak?" the other one I get all the time is "I better watch what I say around you!"
I've never been able to come up with a better answer to this one than "Good plan!" or "Glad to hear it!" (Maybe "I promise to be merciful"?) Emily didn't help with this, but she did provide an excellent generic answer to questions about clothing text in foreign languages:
I have a t-shirt with the name of the university I studied at in Japan (Touhouku Daigaku), in kanji. When I wear it, someone invariably asks me "What does your shirt say." I usually answer "It says 'Ask me what my shirt says'." ... and people usually buy it!
Unfortunately I don't have any such shirts. And Emily also supplied a more personal story:
When I graduated from college, my family suddenly decided to try to understand what I had been studying. When I tried to explain Linguistics to my great-grandmother, she concluded that I was going to be a judge. The chain of reasoning apparently went like this:
My great-granddaughter is studying Linguistics. That's about languages. She can speak lots of languages. Where do they need people who can speak lots of languages? In the courts! But my great-granddaughter is going to be *important*. ... She's going to be a judge!
So, better watch what you say. It all makes sense now...]
[Jesse Sheidlower offers some other come-backs to "I'll have to watch what I say in front of you":
This is by far the most common question I get as well after identifying myself as a dictionary editor. My stock response is usually "That's OK, I don't give tickets," with a smile.
I have long wanted to use an answer that someone suggested in a Miss Manners column to a similar question: "Thank you, but I am perfectly capable of forming a low opinion of you on entirely different grounds." But I just don't have the guts.
]
[From Robert G. Lee:
As a Certified ASL Interpreter, my favorite is this (all too common) exchange:
Person: So what do you do for a living?
Me: Among other things, I am an American Sign Language interpreter.
Person: Wow! So you know Braille!
Me: {sigh}
]
[This one is from Ella:
I get mostly a lot of puzzled looks when people hear that I'm a linguist (even more difficult to explain that I'm a phonetician working in computational linguistics for an IT company but I don't know how to code. I've taken lately to telling people that I'm a taxidermist just to put them at ease). But the oddest question I ever had about linguistics was - 'so if I learn Russian will I be a good chess player?'
]
Posted by Mark Liberman at 12:09 AM
According to David Remnick's retelling of New Yorker lore,
Six decades ago, not long after being hired by Harold Ross as a copy editor at The New Yorker, a shy young woman, an Oberlin graduate, set to work on a manuscript by James Thurber and soon came across the word “raunchy.” She had never heard of the word and thought it was a mistake. “Raunchy” became “paunchy.” Thurber’s displeasure was such that the young woman barely escaped firing.
But Eleanor Gould Packard, who died in February, would surely have modified "parauque" to "pauraque" before it made it into print for Steve at Language Hat to catch. I expect that she would also have doubted whether the head of the Family Research Council asked that every lion tongue would be cast down.
There are several different ways to refer to decades -- "the seventies", "the 1970s", and so on. Of these, the textual form YYY0s is the most unambiguous -- a bit of web searching for patterns like {1990s} should convince you that nearly all uses refer to ten-year spans of time rather than to model numbers or the like.
And looking through the counts for the decades of the 20th century shows a main effect of recency: counts decline more or less linearly as the dates move backwards. But there are some divergences from this trend. In particular, the 1940s (and less clearly, the 1910s) are under-represented.
Here the effect is shown graphically:
In order to compare the counts from Google, Yahoo and MSN, I've expressed each search engine's results in terms of ratios to its average for the nine decades cited. The actual counts that I got are given in the table below, in millions. Note that MSN gives exact-seeming counts, while Google and Yahoo give approximations.
Decade | Google | Yahoo | MSN
1990s | 27.4 | 50.1 | 4.524152
1980s | 23.9 | 48.4 | 5.078620
1970s | 21.3 | 41.4 | 4.502463
1960s | 18.8 | 36.7 | 4.219955
1950s | 13.7 | 28.7 | 3.249855
1940s | 6.53 | 15.0 | 1.908465
1930s | 9.19 | 20.8 | 2.609433
1920s | 6.38 | 15.6 | 2.000209
1910s | 0.801 | 2.04 | 0.301935
The three search engines disagree about the status of the 1990s, but the 1940s and the 1910s are apparently below the trend line in all three counts. Is this because the 1940s and the 1910s were dominated by WW II and WW I respectively? I'm not sure.
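For anyone who wants to check the normalization, here is a minimal sketch in Python using the counts from the table. The ratio-to-average step follows the description above; the "trend line" is an assumption on my part here, an ordinary least-squares line through the nine normalized points, which may not match how the graph was drawn. The helper names are made up for illustration.

```python
decades = ["1910s", "1920s", "1930s", "1940s", "1950s",
           "1960s", "1970s", "1980s", "1990s"]

# Counts in millions, oldest decade first.
counts = {
    "Google": [0.801, 6.38, 9.19, 6.53, 13.7, 18.8, 21.3, 23.9, 27.4],
    "Yahoo": [2.04, 15.6, 20.8, 15.0, 28.7, 36.7, 41.4, 48.4, 50.1],
    "MSN": [0.301935, 2.000209, 2.609433, 1.908465, 3.249855,
            4.219955, 4.502463, 5.078620, 4.524152],
}


def ratios_to_mean(values):
    """Divide each count by the engine's average over all decades."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


def linear_fit(ys):
    """Least-squares line y = intercept + slope * x for x = 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Residual = normalized count minus the fitted line; negative means "below trend".
residuals = {}
for engine, values in counts.items():
    ratios = ratios_to_mean(values)
    slope, intercept = linear_fit(ratios)
    residuals[engine] = {
        decade: ratio - (intercept + slope * i)
        for i, (decade, ratio) in enumerate(zip(decades, ratios))
    }

for engine, by_decade in residuals.items():
    below = [d for d, r in by_decade.items() if r < 0]
    print(engine, "below trend:", below)
```

Dividing by each engine's own average puts the three on a common scale, so that Yahoo's larger index doesn't swamp the comparison.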
I stumbled on this particular oddity because I was setting up to write something about an interesting recent paper by Anatol Stefanowitsch entitled "The function of metaphor" (International Journal of Corpus Linguistics, Volume 10, Number 2, 2005, pp. 161-198). You can't read it, unless you have a subscription or are willing to pay the extraordinary sum of $37.17 -- a dollar a page plus change -- because, alas, the IJCL is not open access. If you could read the article, you'd find some interesting ideas about using collocational frequencies to explore the functions of metaphorical language. Specifically, Stefanowitsch contrasts what he calls "cognitive" and "stylistic" theories about the nature of metaphor. I thought I'd try to explain the basic issues for people who might not otherwise come across the paper, and especially to describe the methods used, which could be applied much more widely. So I started out to reproduce and extend a couple of Stefanowitsch's test cases, which deal with the distribution of metaphorical (e.g. "the dawn of <time period X>") and literal (e.g. "the beginning of <time period X>") expressions that are more-or-less referentially equivalent.
Stefanowitsch's idea is that we ought to find clues to the nature of the choice between metaphorical and literal expressions by looking at the words that tend to be more closely associated with each of them. He calls these collocational associates "collexemes". Thus with respect to "dawn of" vs. "beginning of", he writes that
...the events and time spans referred to by the collexemes of the literal pattern ... are much shorter and much more clearly delineated than those referred to by the distinctive collexemes of the metaphorical expression ...
I'll save the details (of S's paper and my reactions to it) for another post or two. But as a teaser, here's a bit of the data that I collected for looking at collocations relevant to metaphorical and literal time-period references. S used the British National Corpus, which is only about 100 million words; using the 5-10 trillion words on the web, we can examine some particular semantic fields in much more detail than he did.
For example, we can see that the 21st century is about 35 to 55 times more dawnish than the 18th. This makes metaphorical sense, I guess, but not because the 18th century is either shorter or more clearly delineated:
Time period | (Google) "the dawn of the _" | (Google) "the beginning of the _" | (Google) Ratio | (Yahoo) "the dawn of the _" | (Yahoo) "the beginning of the _" | (Yahoo) Ratio | (MSN) "the dawn of the _" | (MSN) "the beginning of the _" | (MSN) Ratio
21st century | 64,900 | 129,000 | 2.0 | 206,600 | 390,000 | 1.9 | 44,177 | 91,834 | 2.1
18th century | 466 | 33,600 | 72.1 | 1,280 | 114,000 | 89.1 | 252 | 29,322 | 116.4
As another example, the 1960s seem to have been two or three times as dawnish as the 1980s:
Time period | (Google) "the dawn of the _" | (Google) "the beginning of the _" | (Google) Ratio | (Yahoo) "the dawn of the _" | (Yahoo) "the beginning of the _" | (Yahoo) Ratio | (MSN) "the dawn of the _" | (MSN) "the beginning of the _" | (MSN) Ratio
1980s | 831 | 60,500 | 72.8 | 2,530 | 135,000 | 53.4 | 762 | 44,519 | 58.4
1960s | 983 | 19,500 | 19.8 | 2,410 | 48,100 | 20.0 | 464 | 12,300 | 26.5
Again this makes post hoc sense, but again the key is not the concreteness of the time period referenced.
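The arithmetic behind the Ratio columns is just the "beginning" count divided by the "dawn" count, so a lower ratio means a more dawnish period. Here is a minimal Python sketch using the counts from the two tables; the function names are made up for illustration, and this is only one plausible way to set up the comparison.

```python
# (dawn hits, beginning hits) per period and search engine, from the tables.
hits = {
    "21st century": {"Google": (64_900, 129_000), "Yahoo": (206_600, 390_000), "MSN": (44_177, 91_834)},
    "18th century": {"Google": (466, 33_600), "Yahoo": (1_280, 114_000), "MSN": (252, 29_322)},
    "1980s": {"Google": (831, 60_500), "Yahoo": (2_530, 135_000), "MSN": (762, 44_519)},
    "1960s": {"Google": (983, 19_500), "Yahoo": (2_410, 48_100), "MSN": (464, 12_300)},
}


def literal_to_metaphor_ratio(period, engine):
    """Hits for 'the beginning of the X' per hit for 'the dawn of the X'."""
    dawn, beginning = hits[period][engine]
    return beginning / dawn


def times_more_dawnish(period_a, period_b, engine):
    """How many times more dawnish period_a is than period_b (lower ratio = more dawnish)."""
    return (literal_to_metaphor_ratio(period_b, engine)
            / literal_to_metaphor_ratio(period_a, engine))


for engine in ("Google", "Yahoo", "MSN"):
    century = times_more_dawnish("21st century", "18th century", engine)
    decade = times_more_dawnish("1960s", "1980s", engine)
    print(f"{engine}: 21st vs 18th century {century:.1f}x, 1960s vs 1980s {decade:.1f}x")
```

Taking a ratio of ratios cancels out each engine's overall index size, which is why the three engines can be compared directly even though their raw counts differ by a factor of several.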
[Update: Rob Malouf writes to draw my attention to Pollmann T. and R.H. Baayen, "Computing Historical Consciousness. A Quantitative Inquiry into the Presence of the Past in Newspaper Texts", Computers and the Humanities, Volume 35, Number 3, August 2001, pp. 237-253(17). From the abstract, it looks relevant and interesting:
In this paper, some electronically gathered data are presented and analyzed about the presence of the past in newspaper texts. In ten large text corpora of six different languages, all dates in the form of years between 1930 and 1990 were counted. For six of these corpora this was done for all the years between 1200 and 1993. Depicting these frequencies on the timeline, we find an underlying regularly declining curve, deviations at regular places and culturally determined peaks at irregular points. These three phenomena are analyzed.
Mathematically spoken, all the underlying curves have the same form. Whether a newspaper gives much or little attention to the past, the distribution of this attention over time turns out to be inversely proportional to the distance between past and present. It is shown that this distribution is largely independent of the total number of years in a corpus, the culture in which it is published, the language and the date of origin of the corpus. The phenomenon is explained as a kind of forgetting: the larger the distance between past and present, the more difficult it is to connect something of the past to an item in the present day. A more detailed analysis of the data shows a breakpoint in the frequency vs. distance from the publication date of the texts. References to events older than approximately 50 years are the result of a forgetting process that is distinctively different from the forgetting speed of more recent events.
Pandel's classification of the dimensions of historical consciousness is used to answer the question how these investigations elucidate the historical consciousness of the cultures in which the newspapers are written and read.
Unfortunately, this is not an open-access journal, and the Penn library doesn't have an electronic subscription to it, nor do I, and I'm not yet curious enough to pay $40 plus tax for a sixteen-page article relevant to a blog post -- nor even to make a special trip to the library stacks.]
For a moment, it looked like a scandalous story of desertion and plundering by American forces in Iraq. The headline (p. 1 of the New York Times, 8/13/05) read:
G.I.'s Deployed in Iraq Desert
With Lots of American Stuff
Oh, the noun desert, not the verb desert.
Headline writing is full of perils.
zwicky at-sign csli period stanford period edu
Well, I figured this would happen pretty soon. Al Feuer's 8/12/2005 NYT story on the Air America financing scandal, running under the headline "Bronx Boys Club's Finances Investigated", had a correction appended on 8/13/2005:
Correction: Aug. 13, 2005, Saturday:
An article yesterday about state and city investigations of a loan made by a Bronx social service agency to the liberal radio network Air America quoted incorrectly from comments made on the air by Al Franken, the host of an Air America program. Referring to Evan M. Cohen, a former official of the network whom Mr. Franken accused of having engineered the loan, from the Gloria Wise Boys and Girls Club, Mr. Franken said: "I don't know why they did it, and I don't know where the money went. I don't know if it was used for operations, which I imagine it was. I think he was robbing Peter to pay Paul." (He did not say: "I don't know why he did it. I don't know where the money went. I don't know if it was used for operations. I think he was borrowing from Peter to pay Paul.")
The reason for the correction? Bloggers compared the original quote to the NYT version, and complained.
Here's the crucial bit of the original NYT article:
“I don’t know why he did it,” Mr. Franken said, according to a transcript of the broadcast made by the Department of Investigation. “I don’t know where the money went. I don’t know if it was used for operations. I think he was borrowing from Peter to pay Paul.”
Here's what Brian Maloney at The Radio Equalizer transcribed:
I don’t know why they did it, and I don’t know where the money went, I don’t know if it was used for operations, which I imagine it was. I think he was robbing Peter to pay Paul.
I went to the audio linked at Brainster's blog, and did my own transcription, which agrees in all relevant details with Maloney's. Here's an aligned comparison between the NYT version and the truth:
NYT: I don't know why he did it / I don't know where the money went
Actual: I don't know why they did it and I don't know where the money went
NYT: I don't know if it was used for operations
Actual: I don't know if it was used uh for r- operations which I imagine it was
NYT: I think he was borrowing from Peter to pay Paul
Actual: I think he was robbing Peter to pay Paul
By my count, leaving out the disfluencies, that's 38 words in the genuine quote. To get the NYT version, we need to remove 8 words and add 3, for a word error rate of 11/38 = 28.9%. We can give them a bit more credit, since they split the quote into two pieces, and not charge them for the missing "and" at the break. That would make it 10/37 = 27% W.E.R.
But any way you count it, that might actually be better than the norm for quotation accuracy at the NYT! See this 7/30/2005 Language Log post, "'Quotations' with a word error rate of 40-60%" for documentation.
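The remove-and-add count can be checked mechanically. Here is a minimal Python sketch, assuming the two versions are compared word by word with the disfluencies left out: the words the two versions share, in order, are found with a longest-common-subsequence computation, and everything else is counted as removed or added. The function names are made up for illustration. Note that this counts a substitution as one removal plus one addition; the usual Levenshtein-based word error rate counts a substitution once and so comes out lower.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def remove_add_error_rate(reference, hypothesis):
    """Return (words removed, words added, error rate relative to the reference)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    common = lcs_length(ref, hyp)
    removed = len(ref) - common   # reference words missing from the hypothesis
    added = len(hyp) - common     # hypothesis words not in the reference
    return removed, added, (removed + added) / len(ref)


actual = ("I don't know why they did it and I don't know where the money went "
          "I don't know if it was used for operations which I imagine it was "
          "I think he was robbing Peter to pay Paul")
nyt = ("I don't know why he did it I don't know where the money went "
       "I don't know if it was used for operations "
       "I think he was borrowing from Peter to pay Paul")

removed, added, rate = remove_add_error_rate(actual, nyt)
print(removed, added, f"{rate:.1%}")  # prints: 8 3 28.9%
```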
So Michelle Malkin may be right that
The omission of those five little words ["which I imagine it was"] matters because Al Franken's actual statement suggests that the money was in fact stolen from poor kids to pay Air America's bills--a speculation that the Times attributes to "conservative-leaning blogs," but not to the Times' favorite liberal talk show host who said it himself.
And there might be some truth to other speculations that the switch of "they" to "he" was politically motivated (one rotten apple, not a barrelful), and likewise softening "robbing" to "borrowing from".
But then again, maybe it was just the print media's astonishingly cavalier standards for quotation accuracy. Sometimes it doesn't matter, but this time it bit them.
When digital recordings of the original source are available on the web, you can count on someone checking the accuracy of cited quotations, especially when there's a question of bias. Why not take a few seconds to get the quote right?
Some other relevant Language Log posts:
"Quotations" with a word error rate of 40-60% and more (7/30/2005)
Linguists beware (7/9/2005)
Quotes from journalistic sources: unsafe at any speed (7/9/2005)
More comments on quotes (7/1/2005)
Bringing journalism into the 21st century (6/30/2005)
Down with journalists! (6/27/2005)
Ritual questions, ritual answers (6/25/2005)
Ipsissima vox Rasheedi (6/25/2005)
What did Rasheed say? (6/23/2005)
Newspapers and wire services are certainly welcome additions to the world's information economy, but these media, valuable as they are, can never be fully accepted as sources of information until they put into place some reasonable standards of editorial oversight and some workable mechanisms for detecting and correcting errors.
Um, that's a joke, sort of, but its conclusion is all too true. A lovely example was recently documented by Steve Outing at Poynteronline (and others too numerous to mention -- I got it via Peter Suber).
It seems that on Monday, Reuters ran a story claiming that
Wikipedia, the Web encyclopaedia written and edited by Internet users from all over the world, plans to impose stricter editorial rules to prevent vandalism of its content, founder Jimmy Wales was quoted as saying Friday.
In an interview with German daily Sueddeutsche Zeitung, Wales, who launched Wikipedia with partner Larry Sanger in 2001, said it needed to find a balance between protecting information from abuse and providing open access to improve entries. ...Restricting access to entries particularly susceptible to unwanted attention could be one way of preventing [abuse], he said.
Wales has been at a meeting of those behind the successful free encyclopaedia in Frankfurt, which lasts until Monday.
He said that setting up a form of "commission" might be one way of deciding which entries could be "frozen" in perpetuity.
But apparently Wales said no such thing:
"The interesting thing is that the media simply made up the story about us permanently locking some pages. It's just not true. ... There is absolutely no truth at all to the story. None, zero. It is a complete and total fabrication from start to finish."
How did this canard get started? According to Outing,
Wales says the problem appears to be in the translation. He was in Germany recently and was interviewed by dozens of reporters, including from the Sueddeutsche Zeitung. He thinks the SZ reporter may have misinterpreted his comments. Then Reuters apparently translated his comments in German back to English, and his meaning got turned into something he didn't say.
What did he really say? According to Wales' explanation on slashdot,
"I spoke to one journalist about our longstanding discussions of how to create a 'stable version' or 'Wikipedia 1.0.' This would not involve substantial changes to how we do our usual work, but rather a new process for identifying our best work."
So, sloppy quoting at SZ, sloppy reporting and no quote checking at Reuters -- no biggie, this could happen even to a blogger. In fact, it did, because many bloggers credulously picked up the Reuters story. The real problem is lack of any interest at all in corrections, according to Wales (as quoted by Outing):
"The story seems to have legs, even though we've contacted Reuters and every other outlet to try to get a correction, no one seems to care at all. ... No response. We're important enough to write about, but not important enough for them to listen to at all."
As Peter Suber writes:
It's obvious but I'll say it anyway. An error like this would not have lasted 10 minutes on Wikipedia.
I just checked Google News, and found 564 hits for Wikipedia. I checked the first 20, and found about a dozen versions of the Reuters story, but no corrections. On Technorati, out of the first ten hits for "Wikipedia Reuters", three were corrections roughly equivalent in content to the listing above. On Blogdigger, five of the first ten set the record straight.
The journal Information and Computation has announced that
for one year, effective immediately, online access to all journal issues back to 1995 will be available without charge. This includes unrestricted downloading of articles in pdf format.
Retrieval traffic during the open access period will be considered as future subscription policies are formulated.
Journal articles may be obtained on Elsevier's Sciencedirect at http://www.sciencedirect.com/science/journal/08905401
This is obviously not the first journal to adopt an open access policy, experimentally or permanently, but is it the first one published by Elsevier to do it via Elsevier's site?
According to Peter Suber's page of Open Access lists, editors at several Elsevier journals have declared independence over the years (though not all of these have gone all the way to open access):
Elsevier is by no means the only publisher to have been put into the role of King George by rebellious journal editors, but I've quoted these examples from Suber's list to indicate one of the sources of pressure that may have led to a change in policy at Elsevier specifically.
Several other recent conversions to open access are documented on Suber's Open Access News blog, such as the Netherlands Journal of Medicine, which was once an Elsevier publication.
Suber's blog also quotes from a recent article by Robert Kiley, Head of Systems Strategy for the Wellcome Trust, announcing that
...from 1 October 2005, all new grant recipients will be required to deposit in PMC [PubMed Central -- myl], or a UK equivalent, any papers arising from Trust-funded research. This condition will be extended to all existing grant holders from October 2006. All papers deposited with PMC will be made freely available to the public, via the Web, within 6 months of the official date of final publication.
Kiley goes on to write that
Ultimately, for the benefits of open access to be fully realised, we need to win over the hearts and minds of those who actually do the research and write the papers – the scientists and researchers. For this group, the key drive behind publishing is a desire for their research to be read and cited. To misquote President Clinton ‘it’s about impact, stupid’. Fortunately for advocates of open access, research is starting to show that open-access articles were cited between 50–300% more often than non-open access articles from the same journal and year....The developments announced by the Wellcome Trust over the past couple of months – coupled with the public access initiatives at the US National Institutes of Health and the recent announcement from Research Councils UK (RCUK) in support of open access – all suggest that we are witnessing a sea-change in the way research findings will be disseminated and made accessible in the future.
Are there any language-related journals that have moved in this direction? DOAJ lists 37 open-access journals in the subject area of linguistics, but these don't include the titles that I for one would like to see there.
Related Language Log posts:
Costs and business models in scientific research publishing (5/11/2004)
More on scientific and scholarly publishing (6/14/2004)
A small rant (6/30/2004)
The status quo just can't stand (7/31/2004)
Abusive publisher of the month (8/25/2004)
Open access again (8/31/2004)
Some open access advice for Michael Silverstein (9/1/2004)
Prickly paradigms under a bushel? (9/3/2004)
Pamphleteering, old and new (9/3/2004)
You couldn't have a starker contrast (9/17/2004)
Scirus (12/1/2004)
Prairie dog talk (12/8/2004)
Blogs disgoogled? (2/23/2005)
Raising standards -- by lowering them (3/7/2005)
[Update 8/15/2005: Vincent Arnaud points out by email that there is a larger list of open-access journals, "maintained by Jan Szczepanski, a librarian at Sweden's Goteborg University", with a copy posted by Peter Suber here. This list includes more than 170 OA language-related journals (search for "LINGVISTIK"). However, I'll repeat my observation that most of the titles I'd like to see there are missing.]
Today's news includes this article about the "Jerk-o-Meter", described as a device which can tell you whether the person you're talking to on the phone, or you yourself, is paying attention. It analyzes speech for features that reflect "activity" and "stress". The same technique provides better-than-chance predictions of the outcome of speed dates.
How useful a device this is is not so clear. Anmol Madan suggests that it might help to improve relationships by preventing arguments. Maybe, but I'm not so sure. Is receiving confirmation that the person you're talking to is not interested really going to improve the relationship? And note that it doesn't predict the outcome of speed dates in any useful way: its "prediction" is based on the conversation during the date, so it doesn't save you any time or angst. It's kind of like the situation in weather prediction about twenty years ago, where they could predict the weather three days in advance but the computer model took three days to run, only in this case the problem won't disappear with faster computation.
Another suggested use is that:
it might assist telephone sales and marketing efforts
I thought technology was supposed to improve our lives. As an antidote, I recommend the National Do Not Call Registry.
If you read the paper that underlies the press reports, it turns out that there is something interesting here.
we propose that minute-long averages of audio features often used to measure affect (e.g. variation in pitch, intensity, etc.) taken together with conversational interaction features (turn-taking, interrupting, making sounds that indicate agreement like 'uh-huh') are more closely related to social signaling theory rather than to an individual's affect.
In other words, these features aren't uncontrollable subconscious cues to the speaker's mental state but controllable aspects of communication.
Some time ago we had a discussion of whether speakers of minority languages have much interest in the localization of computer software. I just encountered an interesting datapoint while checking out Freshmeat, a catalogue of Unix and cross-platform software whose front page contains announcements of new releases. One program on today's front page is version 0.0.16 of the GNU Generic Security Service Library. Here's the blurb:
Generic Security Service (GSS) is an implementation of the Generic Security Service API (GSSAPI). It is used by network applications to provide security services, such as authenticating SMTP/IMAP, via the GSSAPI SASL mechanism. It consists of a library and a manual, and a Kerberos 5 mechanism that supports mutual authentication and the DES and 3DES ciphers.
The first of the changes announced today is:
A Kinyarwanda translation has been added.
I don't know who did the Kinyarwanda translation or why, but somebody evidently sees a need for a Kinyarwanda translation of a fairly technical piece of software that will be used only by programmers.
The stimulus? A journal article about functional brain imaging of men listening to variously-hacked men's and women's voices.
The response? Worldwide resonant evocation of sexual stereotypes, congruent and contradictory alike.
Some headlines: "Er, you what, luv?" -- "Man Leaves Wife, Realizes Six Hours Later" -- "Female Voices are Easier to Hear" -- "What We Have is Failure to Communicate" -- "Men do Have Trouble Hearing Women" -- "Why Imaginary Voices are Male" -- "It's official! Listening to women pays off" -- "Men do have trouble hearing women, scientists find".
The blogospheric reactions are just as creative: "I can't hear you, honey...you're just too difficult to listen to" -- "What to tell your wife when you didn't hear her" -- "Men who are accused of never listening by women now have an excuse -- women's voices are more difficult for men to listen to than other men's, a report said" -- "I've been waiting for this for a long time. I'm often accused of 'selective hearing' in which certain statements just disappear from my consciousness - often statements made by Mrs. HolyCoast. It usually occurs when I'm multi-tasking, such as watching TV or blogging while listening to my better half..." -- "Science explains patriarchal monotheism!" ...
So I went and read the journal article: Dilraj S. Sokhi, Michael D. Hunter, Iain D. Wilkinson and Peter W.R. Woodruff, "Male and female voices activate distinct regions in the male brain", In Press, NeuroImage. I'm deeply puzzled by some of the research that paper describes -- if Sokhi et al. really did what they seem to be saying they did, I don't see how the results can be interpreted at all -- but I'm pretty sure that the experiment doesn't mean most of the things that people are saying it does. Maybe it doesn't mean any of them.
Here's what they did. They recorded 12 male and 12 female speakers reading some "emotionally neutral" sentences, "balanced in being directed to three main cortical modalities: vision ('look in the newspaper'), auditory ('listen to the music') and motor ('open the kitchen door')". As expected, the average pitch (F0) was different for the two groups -- "112.01 ± 8.11 Hz for male speakers and 204.68 ± 19.31 Hz for female speakers". They took from the literature the observation that there is a "gender-ambiguous" F0 range in the region of 135 to 181 Hz., where the typical "tessitura" of male and female speakers overlaps, and so they scaled each phrase in four steps from its original speed to a speeded-up or slowed-down version whose average pitch would be at 135 Hz. (for female speech) or 181 Hz. (for male speech).
If that sounds like a strange thing to do, it is. But here's what they say:
We calculated the difference between a speaker's F0 and the ‘target F0’ which, defined by the GAR F0 (see above), was 181 Hz for male speakers and 135 Hz for female speakers. We then derived speaker-specific scalar factors (SFqs) to pitch-scale a speaker's F0 in four equidistant steps (q = 1 to 4) to the ‘target F0’ without preserving Fn.
When they say "without preserving Fn", what they (seem to) mean is that they didn't do any fancy processing to change the pitch (F0) without changing the vocal-tract resonances (F1, F2, F3 etc.); instead they just increased or decreased the overall playback speed in proportion to the needed F0 changes. And the maximum amount of change was considerable -- an average male recording would have been sped up to as much as 181/112 = 162% of the original rate, while an average female recording would have been slowed down to as much as 135/205 = 66% of the original rate. (It's possible that I've misunderstood this, but I don't see any other way to interpret what they say...)
In fact they didn't use quite this much shifting, because they did perceptual tests to find the amount of shift that would produce "gender-ambiguous" stimuli, "defined by reaching the 50% mark .. for accuracy in reporting the gender of a given set of voices", and this was achieved by shifting a selected subset of stimuli to "159.13 ± 5.52 Hz for male speakers and 156.83 ± 4.09 Hz for female speakers, where the F0s for the corresponding selected natural stimuli were ... ‘male gender-apparent’ = 107.55 ± 6.46 Hz and ‘female gender-apparent’ = 211.77 ± 14.07 Hz."
So they speeded up the male recordings, on average, by a ratio of 159.13/107.55 = 1.48, and they slowed down the female recordings by an average ratio of 156.83/211.77 = 0.74. Still quite a big shift -- I'd expect these stimuli to be species-ambiguous, not just gender-ambiguous. Also, note that the duration of the phrases will be changed by the inverse of these factors, so the female phrases slowed down so as to be sexually ambiguous will be roughly twice as long as the male phrases speeded up so as to be sexually ambiguous.
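The arithmetic behind those ratios, and behind the "roughly twice as long" estimate, can be checked with a few lines of Python. This is only a sketch of my reading of the paper: it assumes the pitch change was made by plain resampling, so that duration scales with the inverse of the playback rate, and it assumes that the male and female originals of a phrase took about the same time to say. The function name `playback_rate` is mine, not the authors'.

```python
def playback_rate(original_f0_hz: float, target_f0_hz: float) -> float:
    """Speed factor needed to move the average F0 by simple resampling."""
    return target_f0_hz / original_f0_hz


# Mean F0 values (Hz) quoted above for the selected stimuli.
male_rate = playback_rate(107.55, 159.13)    # male recordings are sped up
female_rate = playback_rate(211.77, 156.83)  # female recordings are slowed down

# Under plain resampling, duration scales with the inverse of the rate.
male_duration = 1.0 / male_rate
female_duration = 1.0 / female_rate

# Length of a slowed-down female phrase relative to a sped-up male phrase,
# assuming (hypothetically) that the two originals had equal duration.
relative_length = female_duration / male_duration

print(f"male rate {male_rate:.2f}, female rate {female_rate:.2f}")
print(f"female/male duration ratio {relative_length:.2f}")
```

The two rates round to the 1.48 and 0.74 given above, and the duration ratio comes out very close to 2.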
Why did they do this? Well, they say that the shifted recordings "were selected for the fMRI experiment as these stimuli were matched for F0, thus removing the confound of simple pitch effects during perception of gender from heard speech". The basic idea is a sound one, since they want to be able to claim that they're seeing the effects of perceived speaker sex, not just the effects of higher versus lower pitched voices. But this is a strange way to go about it, since the shifted stimuli (according to their perceptual experiments) were selected so as to be identified as male and female about equally often! (There are indeed other cues to sex in the voice besides F0, as the authors mention, but they've specifically selected the artificial stimuli so that sex judgments are roughly equal...). Logically, I would have expected them to choose naturally-occurring male-perceived voices and female-perceived voices with F0 in an overlapping range, but they didn't try to do this. And the rate-shifting manipulation that they did (apparently) use not only doesn't preserve perceived sex, it introduces some other non-sex-linked acoustic factors (like duration differences) that seem just as problematic as the F0 difference it eliminated. They could have used pitch-shifting technology to change the pitch without changing the duration or the vocal-tract resonances, but they didn't; again, I'm not sure why.
In any case, they've got four classes of stimuli:
The original samples of the sentences recorded from each speaker, with the original F0, together with the set of new stimuli of frequency F0(g-amb) gave 96 stimuli falling into four categories: ‘male gender-apparent’ (unaltered in pitch), ‘male gender-ambiguous’ (pitch-scaled and ‘gender-ambiguous’), ‘female gender-apparent’ and ‘female gender-ambiguous’ stimuli.
They played these 96 stimuli to 12 male subjects. It's not clear why they only studied males -- at least I couldn't find any reason for this. Maybe they're planning to look at female subjects in a different study. But 12 subjects is not a big fMRI experiment, so I'm not clear why they didn't look at both sexes. (And as you'll see, having female subjects would make a big difference in interpreting the results...)
Anyhow, the key thing in such functional imaging studies is that you can't just look at one condition. You need to compare the distribution of cerebral blood flow when subjects are doing X to the distribution when they are doing Y, or some more complicated sort of comparison of a similar kind. This is roughly for the same reason that in studying the effect of a drug on a disease, you can't just give it to some patients and see how many get well; you need to compare the results for a matched set of patients who didn't get the drug. In this experiment, they defined their comparisons as follows:
Thus when they say (in their press release) that "when a man hears a female voice" such-and-such a region of his brain is activated, what they mean is that the specified region is (among the regions where) the two conditions specified in (i) are met: first, 'female gender-apparent' recordings create significantly more activation than 'male gender-apparent' recordings, and second, 'female gender-ambiguous' recordings yield significantly greater activation than 'male gender-ambiguous' recordings.
But there are some other descriptions you could give of that set of conditions. For example, you could say that these are the brain regions that respond more to higher-pitched speech than to lower-pitched speech; and for speech in a medium pitch range, respond more to recordings that have been slowed down to reach that level than to recordings that have been speeded up to reach that level. Or perhaps, respond more to phrases that are longer in duration than to phrases that are shorter in duration. This last is not a trivial issue, especially since the subjects were listening to the stimuli against the background of scanner noise, which is roughly like being in a boiler factory inside one of the boilers. (It's possible to arrange the scanning acquisition so that audio stimuli are played in silent intervals, but that was not done in this experiment). So higher-pitch or longer-duration stimuli will probably be more acoustically salient, especially in this very noisy environment, and therefore might show increased auditory activation, quite apart from any sexuality judgments. And lower-pitch or shorter-duration stimuli will be harder to hear, and therefore might engage some additional attention-focusing mechanisms, again apart from any sexuality judgments.
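To make the logic of the conjunction concrete, here is a toy Python sketch. It is not the authors' analysis, which works on statistical maps rather than single numbers; the activation values, the condition labels and the function name `conjoint_contrast` are all invented for illustration. The point is only that the same pair of comparisons passes whether you label the conditions by speaker sex or by the acoustic properties of the stimuli.

```python
from typing import Dict, Sequence, Tuple


def conjoint_contrast(
    activation: Dict[str, float],
    pairs: Sequence[Tuple[str, str]],
    threshold: float = 0.0,
) -> bool:
    """True if the first condition of every pair beats the second by more than threshold."""
    return all(activation[a] - activation[b] > threshold for a, b in pairs)


# The two comparisons that together make up "female vs. male".
FEMALE_VS_MALE = [
    ("female_apparent", "male_apparent"),
    ("female_ambiguous", "male_ambiguous"),
]

# The same four conditions, described by what was done to the recordings.
ACOUSTIC_LABELS = {
    "female_apparent": "higher pitch, original rate",
    "male_apparent": "lower pitch, original rate",
    "female_ambiguous": "medium pitch, slowed down (longer)",
    "male_ambiguous": "medium pitch, speeded up (shorter)",
}

# Invented activation levels for one hypothetical brain region.
region = {
    "female_apparent": 1.4,
    "male_apparent": 0.9,
    "female_ambiguous": 1.2,
    "male_ambiguous": 0.8,
}

if conjoint_contrast(region, FEMALE_VS_MALE):
    for stronger, weaker in FEMALE_VS_MALE:
        print(f"{ACOUSTIC_LABELS[stronger]}  >  {ACOUSTIC_LABELS[weaker]}")
```

Nothing in the comparison itself says which of the two labelings is the right one; that would have to come from the design of the stimuli.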
Whatever the reasons, their results were these:
Conjoint contrast | Brain region |
“Female vs. male” | Right anterior superior temporal gyrus |
“Male vs. female” | Right precuneus |
“‘Gender-apparent’ vs. ‘gender-ambiguous’” | Posterior superior temporal plane contiguous with inferior parietal lobule |
“‘Gender-ambiguous’ vs. ‘gender-apparent’” | Right anterior cingulate gyrus |
So as I said, I'm really puzzled about how to think about what these results mean. Whatever is going on, though, there's nothing in their results to stand behind statements like "[t]he female voice is actually more complex than the male voice, due to differences in the size and shape of the vocal cords and larynx between women and men", as the Sheffield press release asserts.
And the same press release says that "when a man hears a female voice the auditory section of his brain is activated, which analyses the different sounds in order to 'read' the voice and determine the auditory face" -- are we supposed to conclude that males hear male voices in a way that by-passes the auditory cortex? Well, they go on to say that "[w]hen men hear a male voice the part of the brain that processes the information is towards the back of the brain and is colloquially known as the 'mind's eye'. This is the part of the brain where people compare their experiences to themselves, so the man is comparing his own voice to the new voice to determine gender."
But even if their conjoint contrast (ii) is really male-vs-female and not lower-pitch-and-shorter-phrases-vs.-higher-pitch-and-longer-phrases (and similarly for the other three contrasts), the results are still not about males-hearing-sex-identified-voices. They're (at best) about males-hearing-males after you subtract out everything this has in common with males-hearing-females; and males-hearing-females after you subtract out everything this condition has in common with males-hearing-males. And because they don't have any data on females-hearing-males vs. females-hearing-females (or females-hearing-lower-pitch-and-shorter-phrases, and so on), interpretations in terms of "people comparing their experiences to themselves" are at best highly speculative.
Unless I'm missing something, it seems to me that the increased STG activation in their condition (i) -- which they explain as males hearing females in areas adjacent to the auditory cortex -- might just as well be explained as subjects responding to acoustically more salient stimuli (higher pitch or longer duration) with more activation in acoustically-specialized areas of the brain. As for the increased precuneus activation in their condition (ii) -- which they explain as males responding to males by self-comparison in "the mind's eye" -- the precuneus (a structure in the parietal lobe) has been implicated in all sorts of things, from representation of the visual periphery to motor imagery of finger movement, with some stuff about attention along the way, so that I'd think you might just as plausibly explain this effect in terms of subjects attending more closely to acoustically less salient stimuli in a noisy environment, while thinking harder (or for a longer time) about which button to press to register the perceived sex of each stimulus.
The journal article starts out with some statistics about auditory verbal hallucinations in schizophrenia -- "The voices of AVHs are perceived as male 71% (and female 23%) of the time irrespective of the patient’s gender. The characteristics of the voices of AVHs are also commonly middle-aged, external to the person, right-lateralised, ‘‘BBC newsreader’’ accent in quality and derogatory in content (Nayani and David, 1996)." This is interesting, but I'm not convinced that the fMRI findings help us to understand this, especially the middle-aged, BBC newsreader, derogatory parts, which are properties totally orthogonal to anything in the experiments.
And as for the rorschach-blot reactions in the popular press and the blogs, about how this explains why men have a hard time paying attention to women, or why women's speech is more valuable, or why men and women often fail to communicate... Well, what's responsible for these responses is not the STG or the precuneus, it's the limbic system. When people have strong and complex feelings about a topic, research results become a screen for them to project their preconceptions onto.
A few days ago, the mail brought me a copy of The Chimwiini Lexicon Exemplified, by Charles W. Kisseberth and Mohammad Imam Abasheikh, which is no. 45 in the Asian and African Lexicon series published by the Research Institute for Languages and Cultures of Asia and Africa, of the Tokyo University of Foreign Studies.
The book came in a cardboard package without any stamps or any other indication of a specific amount of postage having been paid. The upper right corner of the package is blank, but in the top center there is a circular stamping like a postal cancellation, which (after being de-circled) reads:
BUREAU DE POSTE MUSASHIFUCHU TAXE PERÇUE
I didn't know that percevoir can mean "to receive (payment)" as well as "to perceive" or "to comprehend". Taxe perçue is a charming expression, as if it's enough for the Japanese postal authorities that TUFS has perceived its financial obligation in this matter. Or perhaps, since the perceiving agent is unspecified, it's only someone in the Musashifuchu tax office who perceived it? In any case, despite Shintaro Ishihara's insults, the Japanese government is apparently still using French to let the rest of the world know that adequate tax-perception has occurred. And the fact that the tax was perceived by someone in Japan, and thus noted in French, was also enough to persuade the U.S. Post Office to deliver the book to my office in Philadelphia.
Chimwiini is a Somali dialect of Swahili. Ethnologue specifies
Region: The Mwini live in Baraawe (Brava), Lower Shabeelle, and were scattered in cities and towns of southern Somalia. Most have fled to Kenya because of the civil war. The Bajun live in Kismaayo District and the neighboring coast.
Dialects: Mwini (Mwiini, Chimwiini, Af-Chimwiini, Barwaani, Bravanese), Bajuni (Kibajuni, Bajun, Af-Bajuun, Mbalazi, Chimbalazi).
Comments: Reported to have come centuries ago from Zanzibar. Mwini: artisans (leather goods); Bajun: fishermen.
According to Kisseberth and Abasheikh's Preface,
Chimwiini is a dialect of Kiswahili which has, for some centuries, been spoken in the town of Mwiini (generally known as Brava or Barawa) in southern Somalia. Brava was at one time not the only location in Somalia where forms of Kiswahili were spoken. Historical evidence shows that some centuries ago, Kiswahili was spoken at least as far north as Mogadisho. The Somali language eventually displaced Kiswahili in Somalia except for Brava. The people of Brava (numbering roughly 10,000 in the early 1970's, according to MIA's estimate) somehow resisted the Somali language hegemony. Civil war and the political chaos in Somalia in the first part of the 1990's have apparently led to the dispersal of the population of Brava, with many people currently refugees in Kenya or further afield. The present outlook for the language's continued existence looks bleak indeed.
While Chimwiini is a dialect of Kiswahili, its differences from Kiswahili in phonology (especially in the prosodic features of length and accent), morphology, and lexicon (due in large part to the significant influence of Somali) warrant detailed study of all aspects of its structure.
The preface also explains what the authors mean by "exemplified":
This volume attempts (a) to document the lexicon of Chimwiini and (b) to exemplify the morphological, phrasal and sentential patterns of the language as fully as possible given the limitations of our research. ... The examples include single words, phrases and sentences. From the point of view of a purely lexical study, the examples are often redundant (i.e. do not provide new information about the meaning or use of the item in question). They do, however, serve the purpose of richly documenting a little studied, endangered language.
Although the authors have been working on this project on and off since 1973 (among many other activities for both of them), and although one of the authors is a native speaker of Chimwiini, the work often admits to uncertainty, in some cases about fairly basic things. For example, one of the exemplifications given for ma-haba "love, affection" is
wa'ishiize pamooyi ka mapeenḏo na mahabbá they lived together in love and affection (phon. This item was recorded with gemination, but the precise status of gemination in the language is not easy to determine -- is it entirely stylistic? is it a combination of both stylistics and lexicon? in the case of borrowed words, what is the relevance of gemination in the source language to its treatment in Mw.?)
I think this frankness about scholarly uncertainty is refreshing and praiseworthy.
Most of the uncertainties are more local, like about the relationship of words to possible cognates in Kiswahili or Somali. There are also quite a few entries whose gloss is "[unfortunately we did not obtain a gloss for this item]", and notes like this one, for a word glossed "at a hotel":
(Note: Doubtless the basic form of this noun exists in Mw., but we only recorded the locative form and thus now cannot be certain about what the correct vowel quantity is in the basic form.)
The work is full of helpful little grammatical notes, as in the entry for iḻa "defect", whose exemplification includes
[numba yaa wé/ nthukiingilá/ híiwi/ iḻáye] [prov.] the house that you/ have not entered/ you cannot know/ its defects (notice that a negative relative verb does not end in o but rather a)
And sometimes the grammatical notes are not so little. My favorite part is the discussion, scattered throughout the work, of the interaction of lexical, morphological and phrasal factors in determining which syllables are accented. Describing the development of the authors' ideas about this aspect of the language, the preface says:
Lexical items are characterized by penultimate accent in the unmarked case, but there are morphosyntactic factors that trigger ultimate accent. The principles governing vowel length and accent are critically dependent on the parsing of sentences into "prosodic phrases" [=PP]. Whether a vowel can be long depends on its position in the PP; whether a vowel is accented or not depends on its position in the PP. ...
In principle, we would have liked to record each and every example in what we might call a narrow transcription. That is, we would have liked to indicate whether a given vowel was long or short, accented or not, and how each example is organized into prosodic phrases. Many of the examples in this book are in fact given in such a narrow transcription. These examples can be recognized as follows: there is a left bracket ("[") at the beginning of the example and a right bracket ("]") at the end; the right edge of all phrases except the last is marked by a slash ("/"); short vowels are written with a single symbol (e.g. a) while long vowels are written double (e.g. aa); and accented vowels are written with an acute accent mark over them (e.g. á) while unaccented vowels have no accent mark.
While we would have liked to always give a narrow transcription, this has not been possible. Unfortunately, at the time when the data was collected, we did not fully understand the accentual system. While we made an attempt to accurately transcribe the vowel length facts of every example we collected (and believe that our observations in that regard are generally accurate), we could not mark the accent fully. ...
Recently, we have achieved a much better understanding of accent, and armed with that understanding, it is possible to re-examine material that was tape-recorded and assign such material a narrow transcription. It is also possible to return to many examples that we collected (but did not tape record) and assign them an accentual structure and a PP-phrasing that is undoubtedly correct. But there are various reasons why this is not always possible ...
So they use three other kinds of transcription: what they call a broad transcription, which indicates vowel length and accent to the extent that they are sure of them, and prosodic phrasing to the extent that it is determined by that information; what they call a phrasing-free transcription, which indicates some vowel length and accent position but does not attempt to mark any prosodic phrase boundaries; and what they call a prosody-free transcription, which makes no attempt to mark vowel length, accent or prosodic phrasing. Prosody-free transcriptions are required in cases where they were given examples in written form by others, or where examples are known to them only from song recordings (from which vowel length and accent can't reliably be determined in this language).
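The narrow-transcription conventions quoted above (brackets around the example, a slash at the right edge of each non-final phrase, doubled letters for long vowels, an acute accent for accented vowels) are regular enough to read mechanically. Here is a minimal Python sketch, applied to the proverb quoted earlier. It is my own illustration, not a tool from the book: the function names are invented, and the vowel inventory `aeiou` is an assumption about the orthography.

```python
import unicodedata

ACUTE = "\u0301"       # combining acute accent
VOWELS = set("aeiou")  # assumed vowel letters of the orthography


def base_and_accent(ch: str):
    """Return (base letter, has_acute) for one character of the transcription."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0].lower(), ACUTE in decomposed


def parse_narrow(example: str):
    """Split a bracketed narrow transcription into phrases and count its marks."""
    text = unicodedata.normalize("NFC", example.strip())
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("a narrow transcription is enclosed in [ ... ]")
    result = []
    for phrase in (p.strip() for p in text[1:-1].split("/")):
        chars = [base_and_accent(c) for c in phrase]
        accented = sum(1 for base, acute in chars if acute and base in VOWELS)
        # A long vowel is written double, so count adjacent identical vowel letters.
        long_vowels = sum(
            1
            for (b1, _), (b2, _) in zip(chars, chars[1:])
            if b1 == b2 and b1 in VOWELS
        )
        result.append({"phrase": phrase, "accented": accented, "long": long_vowels})
    return result


proverb = "[numba yaa wé/ nthukiingilá/ híiwi/ iḻáye]"
for entry in parse_narrow(proverb):
    print(entry)
```

For this example the sketch finds four prosodic phrases, each with one accented vowel, and a long vowel in each of the first three. A broad, phrasing-free or prosody-free transcription would give it less to find, which is exactly the distinction the authors are drawing.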
Anyone who has worked on an undocumented language or dialect will be familiar with this kind of situation. In fact, any honest observer who has worked on even the most extensively documented speech communities will recognize the sort of thing that they are writing about. For example, in the work recently sketched here on the pronunciation of the and a in English, there are some transcriptional uncertainties that are quite similar to the sorts of uncertainty that Kisseberth and Abasheikh discuss so frankly with respect to the Chimwiini examples. I'll pick this up again in another post.
[Update: Steve from Language Hat checked the OED, as I neglected to do, and discovered that English used to have the same sense of "take into possession" for perceive. This makes perfect sense, since the Latin root meant "to take"...:
II. To take into possession. Cf. L. percipere, F. percevoir, in lit. sense, from L. capere to take.
8. trans. To receive (rents, profits, dues, etc.).
1382 WYCLIF Tobit xiv. 15 Al the eritage of the hous of Raguel he perceyvede [Vulg. percepit]. 1472-3 Rolls of Parlt. VI. 4/2 Every of the seid men Archers, to have and perceyve vid. by the day oonly. 1512 Knaresb. Wills (Surtees) I. 4, I will that my forsaid doghters have and persaive all the revenieuse. 1596 BACON Max. & Use Com. Law I. xx. (1636) 73. 1625 Concession to Sir F. Crane in Rymer Fædera XVIII. 60 To have, houlde, perceive, receive and take the said annuitie or yeerely pension of two thousand pounds.
 b. in gen. sense: To receive, get, obtain. Obs.
1482 Monk of Evesham (Arb.) 75 Gretely merueylde why he yat was so honeste of leuyng..had not yette perceiuyd fully reste and ioye. 1540-54 CROKE Ps. (Percy Soc.) 19 Full spedely let me obteyne Thy socoure, and perceyue the same. 1591 SHAKES. Two Gent. I. i. 144 Pro. Why? could'st thou perceiue so much from her? Sp. Sir, I could perceiue nothing at all from her; No, not so much as a ducket for deliuering your letter. 1748 J. NORTON Redeemed Captive (1870) 22 Mrs. Smeed was as wet.. but through the good providence of God, she never perceived any harm by it.
]
Prescriptive grammarians routinely disparage innovative usages as introducing ambiguities: speaker-oriented hopefully, logical rather than temporal since and while, and on and on. Non-standard usages, like multiple negation, are sometimes attacked on the same grounds. Yet everyday language (even in conservative and standard varieties) is jam-packed with ambiguity, not all of it easily resolved in context. We end up having to ask whether someone meant 'spicy hot' or 'hot in temperature', 'funny-ha-ha' or 'funny-peculiar', 'just now' or 'just-only', etc.
Non-standard varieties not infrequently have usages that help to disambiguate; the choice in AAVE between a tensed copula ("They are sick"), the zero copula ("They sick"), and invariant be ("They be sick") is a famous case in point. This morning the New York Times (8/10/05, p. A15) provided another example, having to do with the ambiguity of have 'own, possess' vs. 'have on/with one'.
The example comes in Michael Winerip's "On Education" column, "Essays in Search of Happy Endings", about teachers and students in the dysfunctional setting of Locke High School in Los Angeles:
They were supposed to do a half-hour of silent reading and write about it, but only a handful brought books. The rest... were allowed to write an essay on why it's important to bring your book. "If I write, 'I ain't got it; that's why I don't got it,' is that worth points?" asked one of three boys who taunted the young teacher the entire two hours.
I've boldfaced the relevant bit, in which the 'own, possess' sense is conveyed by negation with ain't, while the 'have on/with one' sense is conveyed by negation with don't. The student could have said, "If I write, 'I don't have it; that's why I don't have it'...", but that would have been just baffling. The student could have said, "If I write, 'I don't own it; that's why I don't have it with me'... ", and that would have more or less worked (though own isn't quite the right verb here, since students don't usually buy their books, but have them issued to them). What the student did say was both clear and succinct (brevity is also a much-touted virtue), though seriously non-standard.
zwicky at-sign csli period stanford period edu
Can you guess who delivered this speech, and when?
"The [Supreme] Court has been acting not as a judicial body, but as a policy-making body. ... The Court in addition to the proper use of its judicial functions, has improperly set itself up as a third house of the Congress, a super-legislature, as one of the justices has called it, reading into the Constitution words and implications which are not there, and which were never intended to be there.
We have, therefore, reached the point as a nation where we must take action to save the Constitution from the Court and the Court from itself. ... We want a Supreme Court which will do justice under the Constitution and not over it.
I want - as all Americans want - an independent judiciary as proposed by the framers of the Constitution. That means a Supreme Court that will enforce the constitution as written, that will refuse to amend the constitution by the arbitrary exercise of judicial power. ... I will appoint Justices who will act as Justices and not as legislators
During the last half century the balance of power between the three great branches of the Federal Government, has been tipped out of balance by the Courts, in direct contradiction of the high purposes of the framers of the Constitution. It is my purpose to restore that balance."
Was this Ronald Reagan explaining why he nominated Robert Bork? Was it George W. Bush during the 2004 campaign, describing his philosophy on judicial appointments? Or was it Rick Santorum, explaining why he's decided to run for president in 2008?
No, it was Franklin Delano Roosevelt, in his ninth Fireside Chat, "On the Reorganization of the Judiciary", delivered on March 9, 1937.
A (rather errorful) transcript is here; a streaming Real Audio version is here, and a downloadable mp3 here.
My corrected transcript is here. I've fixed a variety of omissions, insertions and substitutions; divided the speech into breath-group-sized phrases; and noted the pronunciation of the indefinite article "a", with reduced forms ("uh", IPA [ə]) in blue and unreduced forms ("ay", IPA [ej]) in red.
This being Language Log, the pronunciation was what motivated me to listen to this speech. It's another data point in the on-going saga of article unreduction. In this 34-minute speech, FDR almost exactly splits the difference -- 41 of his a's are reduced and 40 unreduced.
If you look over the transcript and listen to the audio, I think you'll find that it's not trivial to predict where unreduction will strike. It's clearly not a marker of disfluency, but it doesn't always seem to be a phonetic hi-liter either. For example, FDR makes a contrast in which the first a is unreduced while the second one is reduced:
The Court has been acting not as a judicial body, but as a policy-making body. (audio link)
In this case, there's no pause after the unreduced a, or anywhere else in the phrase "a judicial body", but he does pause in "a policy-making ___ body".
A little later he gives two phrases in apposition where the first has a reduced a and the second an unreduced one:
The Court in addition to the proper use of its judicial functions
has improperly set itself up as a third house of the Congress
a super-legislature,
as one of the justices has called it (audio link)
In this case, he pauses in the middle of the phrase with the unreduced a: "a super ___ legislature".
And a bit later, there's a list where the first two instances of a in "a Chief Justice" are reduced while the third one is unreduced:
President Taft appointed five members
and named a Chief Justice;
President Wilson, three;
President Harding, four,
including a Chief Justice;
President Coolidge, one;
President Hoover, three,
including a Chief Justice. (audio link)
Go figure. Anyhow, Chris Waigl and I are still gathering data on this phenomenon -- you may have noted some interesting pronunciations of the in these FDR audio clips as well -- and you'll hear more from us on this over time. I'll admit, though, that I've posted about this speech because I thought the content was an interesting counterpoint to the current debate over judicial philosophy. I learned in high school about Roosevelt's attempt to add six justices to the Supreme Court, in order to overcome judicial resistance to his legislative agenda. I didn't know that he used this "courts should not legislate" rhetoric, though of course it makes perfect sense.
Once again Doonesbury has Zonker's surfing master talking in what is supposed to be Yoda's syntax, and as Mark noted here a couple of months ago, Garry Trudeau has no real idea of what the syntactic characteristics are meant to be. At one point the master says "Never thought the day would I see." This is meant to mean "I never thought I would see the day [when beach access for surfers at Malibu was finally restored]." He has the direct object of see in the subordinate clause fronted, and also subject-auxiliary inversion in that clause (the auxiliary would is positioned before the subject I), but in the main clause the verb precedes its complement and the subject is missing... Even Yoda would be surprised at this syntax, I think. People who try to write Yodic seem to imagine that if you just sling the words around a bit in random directions, that will count. The real Yoda from the Star Wars scripts has somewhat more in the way of syntactic regularities: the basic sequencing principle for clause constituents is Complement-Subject-Verb (which if applied here would yield "See this day I would, never [I] thought", or if applied only to the subordinate clause, "Never [I] thought see this day I would"). However, Mark found that extending the data under consideration (he looked at all of Yoda's utterances in episode 3) made the syntactic situation less clear rather than clearer. In real natural languages, looking at a larger quantity of data generally makes it clearer what the grammatical principles are. If that wasn't so, linguistics as a field would not exist.
I'm not the only Language Log contributor who's a fan of Rob Balder's wonderful PartiallyClips, a cartoon strip made by the very simple device of adding speech balloons to an uncopyrighted clip art picture repeated three times. This recent strip is about language and logic.
Rob's male character is wrong about paradoxes (looking up the right definition is left as an exercise for the reader); but he is right that sometimes the effect an utterer's speech activity has on the context can affect the possible truth values of a statement. "I am now moving my lips", which contains four bilabial consonants, is another example: the moment you say it, it becomes true.
Not logically true, though; contingently true in virtue of a property that the context of utterance picks up because of what you are doing. It might be called (if you want a term for it) a contextual tautology.
I swear, I'm not one of those people who thinks that Western Civilization is entering its Last Days. At least, not in general. I've defended modern students, writers and others from Camille Paglia's charge that "interest in and patience with long, complex books and poems have alarmingly diminished not only among college students but college faculty in the U.S.". I've defended email and cellphone usage against shoddy pseudoscientific indictments. But there are a few areas where I'll agree that civilization has indeed been overwhelmed. In particular, when it comes to elementary usage of linguistic terminology, intellectuals have joined the general population in untroubled ignorance, and even the sacred groves of academe have been clear cut, strip mined and used as a landfill.
Here at Language Log, we've noted example after example of this. Arnold Zwicky documented another one yesterday, and it's a doozy. William Howarth, writing about Rachel Carson in The American Scholar, mis-identified progressive verbs ("eels were leaving the marshes") as "passive gerunds". The walls of the city have fallen, and some Visigothic looter, swilling cognac, complains to his companions that the scotch tastes funny. But this is not a kid sounding off on his blog, or even a journalist mis-using terminology that he doesn't care to check: William Howarth is a professor of English at Princeton University, and The American Scholar is the "literary and intellectual quarterly" of the Phi Beta Kappa society.
What to do? We could simply abandon the terminology to the vagaries of current usage. For example, we could admit that the word passive just doesn't mean "passive" any more, but has developed two new senses, one for phrases whose subject is not an agent, and another for phrases involving a form of "to be" anywhere in the vicinity. Then we would have to make up new terms, like "perispectual verboid", to replace the old ones.
But I'm not ready to give up yet. Another option is suggested by Geoff Pullum's recent post on his sighting (well, hearing) of an aux-initial clause with complex subject, and Arnold Zwicky's old post about his search for first-mention possessive antecedents. Both pieces resonate with the joy of a birdwatcher adding a rare species to his Life List.
So what we need is a new social phenomenon: verbwatching. There would be books ("The Verbs of North America", "Geoff Pullum's Field Guide to Predicative Adjuncts", "Preterites of the Central Brazos Valley"), web sites, clubs, field trips, videos, ...
Well, it could happen. Seriously, although there are excellent books on English Grammar, I don't know any that entirely solve the access problems sketched in this overview of field guides, so perhaps there is a niche for a Field Guide to English Grammar. And someday, the editors of intellectual periodicals will have learned enough from their field trips to correct a Princeton professor who submits a piece identifying robins as warblers.
[Update: Linda Seebach emails
Heh. Try Googling "passive tense."
The presence of some form of "to be" isn't even necessary. I used to be on a listserv for writers, all professionals and many highly regarded -- the person who organized it, Jon Franklin, has won two Pulitzers -- and he, as well as another list member who taught journalism, were both convinced that "she looked sad" was a passive sentence.
Well, that comes under the "subject is not an agent" heading, I guess.
For other examples, see this earlier post and those it links to.]
Victor Mair sent me some interesting observations about the slogan for the 2008 Beijing Olympics. The English version is "One World, One Dream", while the Chinese version is "tong2 yi1 ge shi4jie4, tong2 yi1 ge meng4xiang3" in pinyin, or 同一个世界 , 同一个梦想 in simplified characters, or 同一個世界 , 同一個夢想 in traditional characters.
Victor has interesting things to say about the source of the slogan (it was devised in English and then translated into Mandarin), the slogan's division into words, the history of the words, and so on. But early in his note he makes a quantitative comparison
Mandarin | 10 syllables, 8 words | 75 pen(cil) strokes (traditional) / 58 (simplified)
English | 4 syllables, 4 words | approximately 25 pen(cil) strokes
and asks (what I take to be) a rhetorical question about it:
In cybernetic / IT terms, which is more economical? This is NOT even taking into account that there are only 26 letters of the alphabet to deal with, in contrast to at least 26,000 characters that have to be separately considered when determining memory size.
Now, to a first approximation, I reckon that the cost of text storage is now zero -- a compressed copy of all the text I've ever written is roughly the size of one high resolution digital photograph -- and so the answer to Victor's question may not matter very much, since a mere factor of two or three hardly matters in a situation where pictures, audio and video consume many orders of magnitude more storage than text does. However, I was still curious about the facts of the matter in a larger sample than just this one slogan. So I turned to the LDC's catalog of Chinese/English parallel text.
One of our offerings in that area is a body of United Nations documents. There are 7,070 pairs of documents. I believe that in most cases, the documents were written in English and then translated into Chinese. These are essentially plain text documents -- no mark-up -- and they are not compressed. The Chinese is GB encoded. The total byte counts are:
Chinese: 54,640,469
English: 123,301,197
Now, the fact that English puts spaces between words, while Chinese does not, accounts for some of this difference. But in any case, the direction of the difference in bytes is the opposite of Victor's counts of syllables, words, strokes etc., and the magnitude of the difference is a factor of about 2.26.
In the Olympic slogan, the Chinese version is 11 characters, including the comma. Even encoded as 2 bytes per character, that's only 22 bytes. The English version is 20 characters -- at one byte per character, that's 20 bytes. This suggests that the slogan is not typical of other material.
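As a rough check on these counts, here is a minimal Python sketch that measures the slogan in both languages and recomputes the UN ratio. It assumes the Chinese comma is the full-width "，" and uses Python's `gb2312` codec as a stand-in for the unspecified "GB" encoding; the function and variable names are mine, for illustration only.

```python
# Slogan texts; the full-width comma in the Chinese version is an assumption.
english = "One World, One Dream"
chinese = "同一个世界，同一个梦想"


def byte_sizes(text, encodings):
    """Return {encoding: number of bytes} for the given text."""
    return {enc: len(text.encode(enc)) for enc in encodings}


english_sizes = byte_sizes(english, ["ascii", "utf-8"])
chinese_sizes = byte_sizes(chinese, ["gb2312", "utf-8"])

# Total byte counts for the UN parallel text, as quoted above.
un_chinese_bytes = 54_640_469
un_english_bytes = 123_301_197
un_ratio = un_english_bytes / un_chinese_bytes

print(len(english), english_sizes)
print(len(chinese), chinese_sizes)
print(round(un_ratio, 2))
```

Note that the comparison depends on the encoding: in UTF-8 each of these Chinese characters takes three bytes rather than two, so the same slogan would come to 33 bytes.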
Another database on which we can make comparisons is some material from Hong Kong. There are three subcorpora: the "Hansards", which are the parliamentary records; the legal code; and an archive of news stories. This data includes some formatting information, but it's largely the same in both languages. This table shows the disk usage in megabytes for the various subcorpora:
Subcorpus | Chinese | English | English/Chinese Ratio
Hansards | 158.454 | 270.472 | 1.76
Laws | 50.094 | 68.796 | 1.37
News | 78.898 | 117.890 | 1.49
I'm not sure why the ratios vary so much, nor why they're all lower than the UN ratio (perhaps because these were written in Chinese and translated to English?), but they certainly all favor Chinese texts as being smaller than the corresponding English texts.
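The ratios can be recomputed directly from the listed sizes. This is a small sketch with names of my own choosing. Run on the figures as given, it reproduces 1.37 and 1.49 for Laws and News, but yields about 1.71 for the Hansards rather than the listed 1.76, so one of the Hansards figures may have been mistyped somewhere along the way. I can't tell which.

```python
# Disk usage in megabytes as (Chinese, English), copied from the table above.
sizes_mb = {
    "Hansards": (158.454, 270.472),
    "Laws": (50.094, 68.796),
    "News": (78.898, 117.890),
}


def english_to_chinese_ratio(chinese_mb, english_mb):
    """English size divided by Chinese size."""
    return english_mb / chinese_mb


ratios = {
    name: round(english_to_chinese_ratio(chinese_mb, english_mb), 2)
    for name, (chinese_mb, english_mb) in sizes_mb.items()
}

for name, ratio in ratios.items():
    print(name, ratio)
```

Either way, the conclusion is unchanged: every ratio is well above 1.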
Several people have written in to ask about the size relationship once the files have been compressed. I wondered too, but didn't have time earlier to check. The results of a couple of experiments suggest that compression reduces but does not eliminate the discrepancy in size. For example, the Hong Kong News corpus, put into a tar archive and compressed with gzip, is 33,535,939 bytes in English, and 29,291,135 bytes in Chinese, for a ratio of 1.14. This is smaller than 1.49, but it's not 1.
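For anyone who wants to try this on their own parallel text, here is a hedged sketch of the comparison. It is not the procedure I used: it compresses in-memory byte strings with Python's `gzip.compress` rather than building a tar archive, and the helper names are hypothetical.

```python
import gzip


def gzipped_size(data: bytes) -> int:
    """Size in bytes of the gzip-compressed form of data."""
    return len(gzip.compress(data))


def size_ratio(english: bytes, chinese: bytes, compressed: bool = False) -> float:
    """English/Chinese size ratio, raw or after gzip compression."""
    if compressed:
        return gzipped_size(english) / gzipped_size(chinese)
    return len(english) / len(chinese)


# Ratio for the gzipped Hong Kong News archives, from the byte counts quoted above.
news_gzip_ratio = 33_535_939 / 29_291_135
print(round(news_gzip_ratio, 2))
```

To use it, read each side of a parallel corpus as bytes and call `size_ratio` once with `compressed=False` and once with `compressed=True`.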
[Update: Xiaoyi Ma observes that the LDC parallel Chinese/English corpora in general amount to some 218M English words and 370M Chinese characters, or about 1.7 Chinese characters per English word. In terms of byte count, he gets the following English/Chinese ratios:
Corpus | text | gzipped
FBIS | 2.27 | 1.41
Sinorama | 1.95 | 1.19
UN | 1.96 | 1.24
All FBIS and Sinorama text was translated Chinese to English, while 90% of the UN data was translated English to Chinese. The ratios again are variable, but clearly show that Chinese texts are smaller than the corresponding English texts, with the difference shrinking but not disappearing under compression.]
The future certainly isn't what it used to be. That's true in general, but I'm talking about English verbal morphology here. In response to my post on future on, Darryl McAdams emailed to point out that there's been a development of fixing to into finna, parallel to the development of going to into gonna (and want to into wanna). He observes that {"I'm finna"} has some 5530 ghits, and turns up examples like this Kanye West lyric:
I wanna tell the whole world about a friend of mine
This little light of mine and I'm finna let it shine
I'm finna take yall back to them better times
I'm finna talk about my mama if yall don't mind
For an example with a subject that's not a pronoun, and with the copula deleted, there's this from Missy Elliott:
Missy finna spit this simply raw
Misdemeanor always make MC's feel small
Stick you on the table with a plastic cup
Say grace, then eat ya ass up
Darryl adds that "[a] friend learned it in middle school (Apollo, in Hollywood, FL)".
This one was completely new to me, since fixing to isn't part of my dialect. However, I do use a contracted form of trying to that might be put into IPA as [ˈtɹɐj.nə], and seems to be represented in conventional English orthography as "tryna". This one comes up in recent song lyrics too:
You tryna wear my shoes
You tryna wear my clothes
You tryna be like me,
I'm tryna be like you bro,
What I'm really tryna say
You got to keep it all real
but I can testify that it's been a normal part of American English pronunciation for a long time, even though I don't recall ever having seen it written before I (just now) looked for it on line. I wonder why "gonna" and "wanna" have been standard non-standard orthography for so long, while "tryna" has lagged? Is it because the contraction is newer -- you couldn't prove that by me, I've used all of them from the cradle -- or because "tryna" is just orthographically weirder?
My flight from Philly to Denver was a couple of hours late, I missed my connection to San Jose, and I'm waiting for the next flight. The good part is I had plenty of time to read Elmore Leonard's Mr. Paradise, which I bought to read on the plane. On page 332 (of the 2004 Harper paperback), I noticed a way of marking immediate future time, in the family of gonna, gone, I'ma, I'monna and so on, that's new to me. Well, really it's not, in the sense that I even blogged about it before, but I didn't understand what I wrote at the time.
The characters are Frank Delsa, acting lieutenant of Squad Seven, Homicide Section, Detroit Police Department, and Orlando Holmes, a drug dealer who killed three of his suppliers and cut one of them up with a chain saw. Delsa speaks first.
"You know who put the stuff on you?"
"Somebody close to me, his girlfriend's punk-ass brother. Is how it goes. But listen, I'm on tell you something, I was scared."
Google finds a discussion of "I'm on tell you" in an amazon.com reader review of Donna Tartt's "The Little Friend":
Tartt has written a novel with all of Faulkner's insights about the South in clear, enjoyable prose. She adds the element of likeable characters and believable women, both black and white. She has captured the language of the white "redneck" class: "on" is exactly how we say "going to," "I'm on tell you one more time."
Leonard's Orlando Holmes is an African-American living in Detroit. I've heard the form that this orthography represents, I think, though my own dialect's version of it is I'monna. At least I think they're equivalent. But Carrie Shanafelt characterizes I'monna as a "deep Southernism".
(And I should have seen the connection to "on", given Carrie's observation that "I'monna go run" is "more often 'I'monngo run'" -- but I didn't unpack the run-together orthography and the doubled 'n'.)
Now, I'monna is not exclusively southern, since I use it, and I was born and raised in rural eastern Connecticut. However, I'll take Carrie's observation as evidence that the form is used in the south, wherever else it may show up. But then what's the difference between "I'monna tell you" and "I'm on tell you"?
Is this just a variant pronunciation, an instance of the sporadic loss of unstressed vowels in some (I think mostly rural) southern dialects? That's what Carrie's remark suggests.
Or have some speakers re-analyzed the form as a version of the spatial preposition on? That would be a sensible thing to do, but if it happened very often, I'd expect to see more hits for spellings like { "I'm on tell you"}.
In my spare time, I've been continuing to chip away at the the-and-a reduction problem that Chris Waigl and I took up a little while ago. The question at hand is, when (and why) do the and a appear with unreduced vowels? One interesting answer is given in the psycholinguistic literature; but (I think) it's wrong, or at least it's incomplete. In fact, we've already seen some examples of phenomena that this theory doesn't cover, and in this post, I'll give some others. For extra bloggy relevance, I'll take some of the examples from an interview with Glenn Reynolds, the Instapundit.
Here's the background of the problem, which you can skip if it's old hat to you. (Even if it's new to you, you might want to skip to the examples and come back to this list later.)
1. In standard dialects, English the and a are pronounced as IPA [ði] and [ej] -- sometimes symbolized orthographically as "thee" and "ay" -- when they are used in citation forms ("the word 'the' is spelled tee aitch ee") or when they are contrastively stressed ("it's *A* factor, but not *THE* factor").
2. In fluent speech, when followed by a word starting with a consonant, both words are usually pronounced with a schwa-like reduced vowel, [ðə] and [ə] in IPA, sometimes symbolized in conventional spelling as "thuh" and "uh".
3. When fluently followed by a vowel, the is usually pronounced with a higher vowel, roughly the same as in the second syllable of slithy. In most American dialects, this is the same vowel quality as in a stressed monosyllable such as fee, and is sometimes symbolized in conventional spelling as "thee" ([ði] in IPA). In some British dialects, the vowel is somewhat lower, more like the vowel in fin or this.
4. In all dialects, when a is fluently followed by a vowel-initial word, the form "an" is normally substituted.
5. Some fraction of the's and a's, followed phonetically by a consonant, show the forms [ði] and [ej] instead of the forms [ðə] and [ə].
6. A fraction of pre-vocalic the's and a's show the schwa-voweled forms [ðə] and [ə] instead of [ði] and [ej].
7. When followed fluently by a "filled pause" (the various sounds usually written as "uh" or "um" or "ah"), the pronunciations [ði] and [ej] are usually but not always used. Note that this is expected for the (because the pause sounds are vocalic) but not for a.
8. When followed by a disfluent pause, all of the various alternative forms occur, though [ði] and [ej] are fairly common.
The question, again: why are the "full" forms [ði] and [ej] (thee and ay) sometimes used where the standard rule would say that the "reduced" forms [ðə] and [ə] (thuh and uh) should appear?
For the case of the, Jean Fox Tree & Herb Clark gave an answer in their 1997 Cognition paper "Pronouncing 'the' as 'thee' to signal problems in speaking" (Cognition 62 (1997) 151–167). The answer is telegraphed by the title; as David Beaver summarized it in an earlier Language Log post, people "use the full form when they can't figure how to say whatever the hell they want to say next". It would make sense to extend the same model to the pronunciation of a as well.
But we've seen several examples already that don't seem to fit that mould. In one post, for instance, I cited FDR's use of unreduced a five times in his famous "infamy" speech, along with one use of unreduced the, all six cases in fluent performance of a prepared speech, with no signs of compositional or reading difficulty. And in another post, I cited a single non-reduced a in voice-over by George Vecsey, which struck me as expressing a rhetorical underlining of the following noun phrase, not any compositional problem.
A real solution to this sort of problem requires careful compilation and statistical analysis of many examples from many sources, with audio as well as transcriptions available. However, there's some initial value in looking at anecdotal clips, if only to get a sense of the range of phenomena to be counted, and the aspects of the performances that might be relevant.
The next few examples come from Chris Lydon's 2003 interview with Glenn Reynolds. In the portion of Reynolds' speech that I've transcribed, 7 of 97 phonetically preconsonantal the's are unreduced ([ði] rather than [ðə]), and 5 of 74 phonetically preconsonantal a's are unreduced ([ej] rather than [ə]). These rates are fairly typical of what we've seen in other cases. The point here is not the rate of unreduction, but its context.
Chris Lydon opened the interview with this long-winded question:
Let me just say, you know when I was in school, my idea of a god of journalism was Walter Lippmann, he had lunch at the Metropolitan club every day, talked to big shots, and then well sometimes talked to them at home next to the National Cathedral there in Washington and ((then)) he turned out these beautifully phrased short essays for American newspapers twice a week, distilling the mood and the mind of Washington, and of course shaping it. Uh today, the Walter Lippmann is a University of Tennessee law professor with a thing about guitars and Mazda sports cars, uh who's reading hundreds, maybe thousands of web sites all the time, and cuing the rest of the world to where the good stuff is. I want you to tell me how the world created this monster "Instapundit".
Glenn Reynolds' answer began:
uh monster's probably the[1] right word. [audio link]
um
I am uh hardly in the[2] Walter Lippmann category, uh about all I can say is that my rate of fire exceeds his [audio link]
um
but uh but that's about all.
Professor Reynolds' first "the" is pronounced [ðə], as we expect before a consonant; but the second one is [ði], despite the fact that it's produced in fluent sequence with the following consonant-initial word "Walter". Furthermore, "Walter Lippmann" is hardly new information, since the full form of the name was used twice in Lydon's question, just a few seconds before, and it's not very credible that Reynolds is having trouble remembering it. Nevertheless, Reynolds seems to want to emphasize it a bit, and he accomplishes this in part by the non-reduction of the preceding "the". While he's speaking deliberately overall -- his verbal "rate of fire" in the interview as a whole is rather slow -- I don't hear (or see) any evidence in the prosody of any sort of phrasal juncture before "Walter".
Another example, about 13 minutes in, occurs when Reynolds is talking about future "horizontal models" of journalism:
and I suspect we will see that sort of thing grow, as the[1] software gets better and as the[2] network gets larger. [audio link]
Here both of the the's are unreduced, without any indication that Reynolds is having any trouble fetching the words "software" and "network".
In the other direction, there are several examples in the interview of Reynolds' using reduced (though elongated) articles in front of quite long pauses-for-thought, for example at about 2:50 of the recording:
uh though I'm not sure that the
tone is all that different
than it would have been if I had a couple of hundred readers
because for me the experience is the same. [audio link]
There's a 980 msec. pause between "the" and "tone", and the vowel of "the" is 340 msec. long, but "the" is still pronounced [ðə]. The word "tone" is new to the interview, and Reynolds appears to be giving himself a "think-pause" before choosing it, but the prepausal article is still produced with a schwa.
That's not to say that disfluency and uncertainty are never relevant. However, they are neither necessary nor sufficient conditions for unreduction. We already know that uncertainty (real or feigned) is not an essential ingredient in unreduction, because of citation-form and contrastive pronunciations. What we're adding here is the idea that there's a species of article-unreduction that is mainly about vocal underlining of the following word or phrase. (There's clearly another form of article-unreduction as well, a sort of reading pronunciation that can occur even in quite fluent reading from some speakers, especially those that are less well educated.)
Sometimes when Reynolds uses unreduced articles, it does indeed seem plausibly to be linked to uncertainty about what to say next. Here's an example from about 7:20 of the interview:
uh in fact I got an email just today about that, I had linked to the[1] blog of a[2] military guy in Iraq
uh named L T Smash, that's not his real name, I
actually know his real name, but
but he blogs anonymously [audio link]
Here both the "the" and the "a" are unreduced; there are no silent pauses or overt disfluencies, but Reynolds slows down as he thinks about how to describe Lt. Smash and his blog, perhaps inhibited by the problem of internally swapping the pseudonym for the true name.
Switching away from Reynolds for a moment, here's another example of emphatic unreduction of a, from NASA's 7/29/2005 Mission Status Briefing. Phil Engelauf is answering a question from the AP's Marcia Dunn, about 17 minutes into the briefing. I've divided (this small piece of) his answer into breath groups:
There has been some discussion about whether or not we might send the crew
to uh take a close look at or remove one of those gap fillers that's protruding
uh that is a[1] very very preliminary discussion at this point, ((it-)) we've been sort of asked to uh [audio link]
take a look at what the impact of doing that would be
uh I don't think that there's a consensus that that's required yet
it's really just a-[2] a preliminary "what if" discussion [audio link]
Case 1 is "a" pronounced [ej] without any pause or pseudopause and without any indication of disfluency or uncertainty. Nor is the following word technical or rare or hard-to-understand -- it's just plain old very, somewhat emphasized. In my opinion, this is basically the same phenomenon as the unreduced a in George Vecsey's comment that Lance Armstrong "goes out as a great champion with a clean record".
Case 2 is "a" followed by a short pause and a repetition. Despite the disfluency and the speaker's clear momentary uncertainty about how to go forward, "a" is pronounced [ə] here.
And for another interesting bit of anecdotal phonetics, this time from Britspeak, here's another example that I heard this morning as I was writing this post (from the BBC Newshour 8/3/2005 12:00 GMT edition, about 48 minutes into the hour). Former Ford president Sir Nick Scheele is being interviewed:
BBC: | ... the American car companies are terminally uncompetitive, aren't they? |
Scheele: | ah th- they have a huge problem there is no question that the health care cost problem allied to declining profitability is causing a major squeeze -- however I think to say that this is terminal is m- a vast exaggeration. [audio clip] |
In this case, "a major squeeze" has [ej] (or really in this case more like [e]), while "a huge problem" and "a vast exaggeration" have [ə]. The three phrases "huge problem", "major squeeze" and "vast exaggeration" are all reasonable candidates for being underlined, while the only one of them near a disfluency is "vast exaggeration". So the emphatic unreduction theory can't claim any sort of clean sweep here; but the think-pause theory doesn't help at all. In fact, these few data points might make you think that Sir Nick has some sort of vowel harmony thing going on...
In a later post, I'll take a critical look at the details of the Fox Tree & Clark paper. In particular, I'll look at their finding that
"About 20% of the time, speakers continue after THIY without further disruption, apparently able to repair the problem in time. But about 80% of the time they deal with the problem by pausing, repeating the article, repairing what they were about to say, or abandoning their original plans altogether"
which was based on counts made from the transcriptions in a British speech corpus for which audio was not available to them, but seems quantitatively very far away from the numbers that we've been seeing in material for which we have the audio. It's hard to tell whether this is because of dialect differences or because of some sort of transcription bias.
Mean media metaphor of the month: Jack Shafer's judgment on Judge Richard Posner's essay "Bad News":
Maybe Posner should stop composing his essays with a paint roller and switch to a Sanford Uniball Micro.
Courtesy aside, Shafer's criticisms are reasonable ones: Posner's piece links broad-brush conventional wisdom about lowered barriers to entry with mostly-unsupported assertions about increased sensationalism and polarization. However, Shafer ends his critique with an astonishing and gratuitous piece of quantitative idiocy, which significantly undermines his whole "let's draw rational conclusions from documented facts" stance.
First, let's set the stage. Here's Judge Posner's conclusion:
Thus the increase in competition in the news market that has been brought about by lower costs of communication (in the broadest sense) has resulted in more variety, more polarization, more sensationalism, more healthy skepticism and, in sum, a better matching of supply to demand. But increased competition has not produced a public more oriented toward public issues, more motivated and competent to engage in genuine self-government, because these are not the goods that most people are seeking from the news media. They are seeking entertainment, confirmation, reinforcement, emotional satisfaction; and what consumers want, a competitive market supplies, no more, no less. Journalists express dismay that bottom-line pressures are reducing the quality of news coverage. What this actually means is that when competition is intense, providers of a service are forced to give the consumer what he or she wants, not what they, as proud professionals, think the consumer should want, or more bluntly, what they want.
This is a plausible story, but as Shafer observes
The authentic media maven understands that newspapers have been "dying" since the advent of radio in the 1920s, with the number of titles dwindling steadily with the rise of every new media (television, cable, the Web) and their share of the audience shrinking.
(A linguistic aside: note that media, like data, is now firmly singular in general usage...)
Shafer persuaded me that Posner's essay combined fuzzy thinking with factual carelessness. But Shafer's take-down makes an astonishing claim in its conclusion, a quantitative assertion that a few seconds of common-sense reasoning will show to be several orders of magnitude off.
Posner reveals the sort of rigor he applied to this piece of hackwork in his conclusion, where he notes that a survey by the National Opinion Research Center recorded the public's confidence in the press declining from 85 percent in 1973 to 59 percent in 2002 "with most of the decline occurring since 1991." He writes:
So it seems there are special factors eroding trust in the news industry. One is that the blogs have exposed errors by the mainstream media that might otherwise have gone undiscovered or received less publicity. Another is that competition by the blogs, as well as by the other new media, has pushed the established media to get their stories out faster, which has placed pressure on them to cut corners.
How could blogs have played any role in eroding public trust by 2002 when almost nobody in the mainstream had heard of them? The press loves to seize on new trends, especially techno-trends, but the word "blogs" doesn't appear in a Nexis search of all U.S. newspaper and wire stories until 2000, when it was mentioned in 22 stories. In 2001, the word appeared in 67 stories. In 2002, the concluding year of the survey cited by Posner, it appeared in 359 stories. That's too few by a factor of about 100,000 to have had an impact on the public's view of the press.
Does Shafer really mean that for blogs to have an impact on the public's view of the press, the word blogs would have to appear in about 359*100,000 = 35.9 million newspaper and wire stories within a calendar year?
The version of Lexis-Nexis that I have access to won't give me a response if the size of the set returned is greater than 1,000. So as a proxy, I tried single-month searches, with the results as follows. All searches were done on Lexis-Nexis Academic, in the category of "General News", source "Major Papers", search terms "blogs" in "Full Text".
Year | March | April | May | June | Sum March-June | Full year | Shafer's counts
2001 | 1 | 10 | 7 | 4 | 22 | 50 | 67
2002 | 8 | 19 | 15 | 24 | 66 | 205 | 359
2003 | 78 | 57 | 58 | 66 | 259 | 883 |
2004 | 119 | 129 | 201 | 170 | 619 | ? |
2005 | 524 | 539 | 619 | 673 | 2,355 | |
(I guess that Shafer has access to a "media pro" version of Lexis/Nexis that indexes a somewhat larger set of sources -- but his counts are within a factor of 2 of mine, and the exaggeration we're talking about involves a factor of 1,000 or so.)
Shafer's basic point against Posner is obviously correct. To attribute to the influence of blogs something that happened over the period 1991-2002 is preposterous. But in his excess of indignation, Shafer does something that Posner doesn't -- he pulls a specific number out of nowhere that is roughly three orders of magnitude too large. Here's a reprise of this bit of froth:
In 2002, the concluding year of the survey cited by Posner, it appeared in 359 stories. That's too few by a factor of about 100,000 to have had an impact on the public's view of the press.
Again, 359*100,000 = 35.9 million. My Lexis-Nexis count for blogs in the March-June period of 2002 is 66, and for the same period of 2005 it's 2,355. That's an increase by a factor of 35.7, which is way less than 100,000. It's 2,801 times less, to be precise.
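The arithmetic here is easy to check. The sketch below uses only the counts quoted in this post, with variable names that are mine. It also shows that the "2,801" comes from dividing by the rounded growth factor of 35.7; with the unrounded factor, the shortfall comes out nearer 2,803.

```python
# Shafer's figures for 2002, as quoted above.
shafer_2002_count = 359
shafer_factor = 100_000
implied_stories = shafer_2002_count * shafer_factor

# My Lexis-Nexis counts for "blogs", March-June of each year.
count_2002 = 66
count_2005 = 2_355
observed_growth = count_2005 / count_2002

# How far the observed growth falls short of a factor of 100,000.
shortfall_rounded = shafer_factor / round(observed_growth, 1)
shortfall_exact = shafer_factor / observed_growth

print(f"{implied_stories:,}")
print(round(observed_growth, 1))
print(int(shortfall_rounded), round(shortfall_exact))
```

The difference between 2,801 and 2,803 changes nothing in the argument, which turns on three orders of magnitude.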
One way to read this is that blogs are not yet having an impact on the public's view of the press, and won't do so until there are 36 million newspaper and newswire stories a year that include the wordform blogs. But surely this is not what Shafer means. If that's the criterion, there can't be many developments that actually do have any impact on the public's view of anything. I mean, it might have seemed like there were 36 million stories about Michael Jackson last year, but there weren't -- checking Lexis-Nexis for "Michael Jackson" in June of 2004 turns up a mere 319 stories...
No, I think Shafer just pulled a big number out of the air. It wasn't a number based on careful sociological studies of the impact of media on public opinion, and it wasn't even a number that Shafer bothered to evaluate for common-sense plausibility. It was just a big-ass number. So if I were Richard Posner, I'd offer to stop writing my essays with a paint roller if Jack Shafer agrees to stop doing arithmetic with his rear end.
[Update: a couple of readers have suggested that maybe Shafer meant that the number of stories in 2002 was too low by an additive increment of about 100,000, not a (multiplicative) factor of 100,000 -- 359+100,000, not 359*100,000. Frankly, I don't see any evidence that he gave the matter enough thought to distinguish those two cases. In any event, this would be contrary to the ordinary-language meaning of the word factor, e.g. "A quantity by which a stated quantity is multiplied or divided, so as to indicate an increase or decrease in a measurement", as the American Heritage Dictionary puts it. And if you're beating up on someone for sloppy thinking, careless writing and poor factual support, and you want to avoid charges of hypocrisy, this is not a good mistake to make.
Even an additive increment of 100,000 is probably hyperbole, since extrapolation from my 4-month Lexis-Nexis counts for 2005 suggests fewer than 10,000 stories in major newspapers containing the word "blogs" this year.]
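The two readings, and the extrapolation, can be sketched the same way. This is a rough check rather than a careful estimate: I'm assuming the four-month count in question is the 2,355 figure for March-June 2005, and that the same rate holds across all twelve months. The variable names are mine.

```python
stories_2002 = 359
increment = 100_000
count_mar_jun_2005 = 2_355   # assumed to be the four-month count meant here

# The two readings of "too few by a factor of about 100,000".
additive_reading = stories_2002 + increment         # 359 + 100,000
multiplicative_reading = stories_2002 * increment   # 359 * 100,000

# Naive extrapolation: scale four months up to twelve.
projected_2005 = count_mar_jun_2005 * 12 // 4

print(f"{additive_reading:,}")         # 100,359
print(f"{multiplicative_reading:,}")   # 35,900,000
print(f"{projected_2005:,}")           # 7,065
```

On these assumptions the projection for the year stays under 10,000, which is well short of even the additive reading.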
Get Fuzzy for 7/29/2005 illustrates creativity with quotations:
And the 8/01/2005 strip exemplifies "What is this 'snowclone' of which you speak?"
[links via Ben Zimmer]
Pass the hát.
Type twice for truth?
More arithmetic problems at Google
Questioning reality
Google recall (They stole his mind, now he wants it back.)
When things don't add up
Uh Oh...
Inspired by Scott Adams' favorite Language Log posts:
Well, those might be his favorite posts, if he reads Language Log...
That's what Karen G. Schneider at Free Range Librarian calls Michael Gorman's interview with Josh Sanburn of the Cox News Service. Gorman, you remember, is the new president of the American Library Association, who did so much to inspire Jean-Noël Jeanneney's campaign against
"that throbbing anxiety for anything and everything, scattering knowledge like dust", characteristic in his view of Google's project, "which the president of American libraries" -- Michael Gorman -- "has so persuasively and disturbingly denounced"
(as Le Monde put it). Now Gorman is (quoted as) rallying the troops to keep "The Education of Henry Adams" from being digitized:
"It's a kind of foolishness to say that just because you want to digitize the Oxford English Dictionary and the Yellow Pages, therefore you should be digitizing a biography of Henry Adams," he said.
News flash, Mr. Gorman: it's too late.
Meanwhile, Ms. Schneider has been wondering "Why am I not as famous as Stephanie Klein?", complaining (perhaps too politely for someone who aspires to notoriety) about the lack of "kicky phrases" and high-quality one-liners in the links we send her way. She explains:
O.k., maybe I do see why this blog has not led to fame, a New York Times article, or a book deal. But I can change, starting today!
First, let me adopt a more au courant writing style. No more biblish, no more tiresome polysyllabic nonsense, no more mundane middle-class mutterings. From now on, in the words of Ms. Klein, "Yeah, right. Okay. Whatever." No more talk of buying sports bras at Target (though mind you, I did finally settle on the two-for-$8 deal and I like these bras better than much more expensive over-the-shoulder-boulder-holders I have purchased in the past. See how casual I can be?). No more free verse. No more discussion about the American Library Association. And many more kicky phrases, such as "I love etymology almost as much as karaoke." (Why can't Language Log come up with one-liners like that?) Not to mention Klein's soliloquy to her date that made my toes curl with envy: "I just spent half a day telling you, communicating with you, saying things that were really hard for me to admit, and then, you apologize, say it won't happen again. Then, BAM! You pull a fcuking Emril on me."
Then--let's get to why people really read Klein's blog--there's the sex and the other lurid personal details (because it certainly isn't the writing, and is this what Barnard turns out these days?). Yes. As soon as Sandy comes home this afternoon I will ask for her permission to write about our sex life, past, present, future, and imagined. She is very supportive of my writing endeavors (oh dear; "endeavor" is not a very Klein sort of word) and I am sure she will agree that splashing our personal life onto this blog, where it will then have a digital half-life in perpetuity, is a reasonable exchange for my personal gain, particularly for a book that very important people will read for at least one season.
I know it awaits me: the celebrity, the book deal, the book jacket with the pink cover and the high heel and martini glass on it. It can be mine! I just have to--BAM!--change my tiresome ways.
Does it help for me to point out that a librarian ought to start with an advantage in reaching at least some segments of the American reading public, as documented in Dan Lester's scholarly study The Image of Librarians in Pornography? No, I thought not. Well, I'll work on those kicky one-liners.
That's the message at the top of the National Forensics League's home page. In response to my post on dramatic license at the Globe Theatre, Ryan Miller wrote that
...high school thespian competition on a national scale in the United States is under the auspices of the National Forensics League whose rules are the following:
1) Any amount of cuts can be made as long as the original word order is not changed.
2) Up to 10% of the production by time can be words or phrases not actually present in the original.
3) The above changes must be consistent with authorial intent.
Well, Colin Hurley's version of what Shakespeare wrote for Thersites wouldn't pass muster with the NFL, since the original order was changed. But Peter's performance on the cell phone would be fine, since the message received was exactly what its author intended...
The New Yorker has a well-deserved reputation for being carefully (if sometimes eccentrically) edited. As Tom Rossen pointed out to me today by email, however, something strange has happened on page 49 of the current issue. The scene is a gala dinner for Tom DeLay at the Capitol Hilton:
Finally, Tony Perkins, the head of the Family Research council, delivered a benediction. "Heavenly Father," he said, "we are here tonight to thank you for our leader, Tom DeLay. We thank you for him, and we want to pray for him and Christine," -- DeLay's wife. "We lift them up before you, and we ask that you put a shield around them. Father, we pray, your own word over them, that no weapon formed against them would prosper. Lord, that every lion tongue would be cast down. And we pray, Lord, that they will come out on the other side of this, servants more usable in your kingdom. [emphasis added]
[John Cassidy, "The Ringleader: How Grover Norquist keeps the conservative movement together", The New Yorker, August 1, 2005, p. 49]
I've got to assume that "lion tongue" is a slip of the ear for "lying tongue". The King James Version has 5 instances of "lying tongue", but none of "lion tongue". "Lying tongue" makes sense in the context, while "lion tongue" makes no sense at all. If there were any lions besetting Tom and Christine DeLay with their tongues over at the Capitol Hilton, John Cassidy didn't learn about them. At least he doesn't tell us, and you'd think that if he had, he would have.
Every tongue cast down is perhaps not the most coherent of images -- I see them draped over the landscape like Dali watches -- but extracting the tongues from every member of some relevant set of lions doesn't help. Google has 637 hits for {"lion tongue"}, but they seem to deal with the actual tongues of lions, which as I've said seem to be thin on the ground at the Capitol Hilton. In contrast, there are 22,100 hits for {"lying tongue"}, many of them in religious contexts similar to Perkins' benediction.
The error must have happened when Cassidy (or some underling) transcribed the benediction. There's no indication that Cassidy was given Perkins' prayer in writing, if a written form ever existed; and if the phrase had come in writing from Perkins, I imagine that Cassidy would either have silently corrected it or added a sic.
And then Cassidy's transcriptional eggcorn made it through the New Yorker's copy-editing process. Not to speak of the famous fact checkers. But I doubt that even the New Yorker fact-checks prayer, so maybe this is a case where theory checkers would have been more advisable: "Mr. Perkins, I'm a theory checker from the New Yorker, and we're trying to make sense of those lions whose tongues you asked to be cast down. Can you offer any coherent story about just where these beasts are, and what they have against the DeLays?"