The English language Goodle web page to which Mark referred is indeed puzzling, but the mystery is easily resolved if you know a bit about Korean language and culture. 온돌 [ondol] "warm stone" is the traditional Korean heating system, a kind of hypocaust. In its original form, a wood fire was built in a fireplace like the one shown in the photograph and used to heat a wide stone called a 구들장 [gudɯlʤaŋ] under the floor of each room. One virtue of this system is that the stone has a large heat capacity and so stores up heat from the fire and releases it gradually, making the temperature of the room insensitive to the state of the fire. A variant uses a number of smaller stones with a layer of mud over them. A still later version uses water in a network of pipes embedded in concrete, a system which I believe was a favorite of the American architect Eichler. This company manufactures what they promote as a new, improved, pre-fabricated heating system derived from the traditional 온돌. The term goodle is presumably an anglicization of 구들 [gudɯl], a synonym for 온돌. The company very likely chose this term in an attempt to play on the English word good as well.
So says the English intro at the L&J Corporation's web site. The description explains that the company has "realized Korean traditional heating system Ondol into the product inner GOODLE by succeeding merits and supplementing problems within its system", and that "although it is regarded as the best heating system which has excellent functions and needful merits in it, panel heating system with Ondol has not been sufficiently studied in comparison with research accomplishment in other systems".
Their page on the science of Goodle boasts eight sub-topics: "37.5 C Goodle, Health Goodle, Fast Goodle, Economical Goodle, Strong Goodle, No-defects Goodle, Clean Goodle and Cyber Goodle." I think that these are all aspects of the One True Goodle, rather than different Goodles, but I'm not sure, because after a few minutes of poking around on L&J's site, I'm ashamed to say that I haven't been able to figure out exactly what Goodle is. For once, looking Goodle up on Google left me none the wiser.
[via James Lileks, who mentions finding it by mistyping Google]
A couple of days ago, Laurie Goodstein reported in The New York Times books section on efforts by Christians to debunk 'The Da Vinci Code', and since then, the classicists have been piling on over at Classics-L. Particularly rough treatment is handed out by Elizabeth Vandiver and Jim O'Donnell.
Sample quotes: "a kind of Never-Never-Land of woman-friendly, tree-hugging values overthrown by the evil Constantine and his goons"; "[t]he conflation of a multitude of different cultures into 'the Ancients' drove me batty"; "It's shoddy, filled with characters who can hardly even be called cardboard, and extremely badly written"; "the supposedly brilliant main characters are annoyingly stupid"; "at least Graves, Jung, and Campbell could *write*"; "It has no redeeming merits whatsoever".
I guess if you got them alone over a drink, they'd tell you what they really think. I'm one of the approximately three people who still haven't read the book, and this thread doesn't make me want to run out and buy a copy.
From an online transcript of an interview with Jacob Weisberg and Ann Coulter, CNN. Aired July 3, 2003 - 20:37 ET:
WEISBERG: I think this cowboy rhetoric may have some cost. I think it certainly had cost going into the war when we were trying to gather support for going to war with Iraq. And because of Bush's unilateral stance and his hostility toward the Europeans, and his attitude that we didn't care what nobody else thought, I think we went to war with less support than we could have had. I don't think that makes sense. It's fine to strike a pose and say we want Osama bin Laden dead or alive, we still want him dead or alive. But I don't think it helps when you look at the bottom line ...
Of course, this might have been a mistake on the part of the transcriptionist...
BlogPulse scans about 750,000 weblogs for "Key Phrases, Key People, BlogBites, and Top Links", and displays the results on a day-by-day basis. The algorithms are said to look for "bursty" items rather than simply common ones. As I understand it, this means that "George Bush" (for example) shouldn't show up in "key people" unless the number of mentions of him increases significantly, relative to some estimate of the expected background level.
This is a demo project from the "Intelliseek Applied Research Center", which was set up after Intelliseek "brought on board key members of WhizBang! Labs, a Pittsburgh technology team specializing in natural language programming, text mining, data retrieval and other technologies." Other refugees from WhizBang!, an unfortunate casualty of the dot.com bust, include Fernando Pereira and Andrew McCallum.
Here's the picture that Intelliseek uses to convey their "technology vision":
It's got a lot in common with the vision of DARPA's current TIDES project ("Translingual Information Detection, Extraction and Summarization"). DARPA has been supporting research in related areas for several decades, and it's clearly about time for this investment to start paying off.
The technology in BlogPulse seems to work well in some areas, such as detection of personal names, which seems at least not to have many false positives. This is expected since named entity tagging is a pretty mature technology. I'm more impressed that their listing of "Key Phrases" seems to be picking up strings that really are English phrases, as opposed to word sequences that happen to occur more often than expected but cross-cut phrase boundaries. Phrase-finding with a low rate of false positives is not easy.
However, it looks like BlogPulse is not trying to connect alternate forms of names across documents. For example, yesterday's references to Elton John are listed as instances of the name "Sir Elton", and the references to Kofi Annan are listed as instances of "General Kofi Annan" (apparently because the algorithms truncated "Secretary-General Kofi Annan"). Elton John and Kofi Annan are famous enough that if BlogPulse were tracking entity mentions across documents, and doing a decent job of it, they should be getting these right. So I conclude that they've punted on this one -- and this is a big thing to leave out if you really want to turn unstructured text data into "intelligence". In my opinion, (what's sometimes called) cross-document entity tracking is a key problem for technologies of this general kind, maybe the key problem. That's not just because most users want to see the indexing done right -- it's also because if you make the connections accurately, you get a graph (of entity mentions across documents) that you can use for all kinds of other neat (and non-obvious) stuff.
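To make the idea of a cross-document mention graph concrete, here is a minimal sketch in Python. It is only an illustration: the alias table, the blog ids and the function names are all hypothetical, and a real system would have to learn the aliases rather than look them up in a hand-written dictionary. I have no idea how (or whether) BlogPulse would do any of this.

```python
from collections import defaultdict

# Hypothetical alias table (surface form -> canonical entity).
# A real system would have to infer these links, not hard-code them.
ALIASES = {
    "sir elton": "Elton John",
    "elton john": "Elton John",
    "general kofi annan": "Kofi Annan",
    "secretary-general kofi annan": "Kofi Annan",
    "kofi annan": "Kofi Annan",
}

def canonical(mention: str) -> str:
    """Map a surface mention to its canonical entity, or return it unchanged."""
    cleaned = mention.strip()
    return ALIASES.get(cleaned.lower(), cleaned)

def entity_graph(mentions):
    """Build a mapping entity -> set of document ids from (doc_id, mention) pairs."""
    graph = defaultdict(set)
    for doc_id, mention in mentions:
        graph[canonical(mention)].add(doc_id)
    return dict(graph)

# Made-up mentions, for illustration only.
mentions = [
    ("blog1", "Sir Elton"),
    ("blog2", "Elton John"),
    ("blog3", "General Kofi Annan"),
    ("blog1", "Kofi Annan"),
]
graph = entity_graph(mentions)
```

Once the variant forms are merged, two blogs that mention "Sir Elton" and "Elton John" end up attached to the same node, and that shared node is what makes the graph useful for anything beyond counting strings.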
I'm also not convinced that BlogPulse is doing a very good job of distinguishing random statistical blips in term frequency from significant trends. It's hard to judge this for "key people", since any name that occurs fairly often probably reflects discussions that are connected at least via the individual named, and without a fair amount of fussing with the data, it's hard for me to judge whether (say) Kofi Annan really was discussed significantly more often yesterday than usual.
However, for the "Key Phrases", it's a lot easier to make a judgment on this point, and my evaluation is that BlogPulse hasn't got it right yet. For example, "Key Phrase" #3 (of 40) for yesterday (4/29/2004) was "very good friend", and as far as I can tell from the list of "sample citations" given, none of them have anything to do with any of the others. I'll assume that "very good friend" usually occurs less than 19 times a day, but the fact that it came up 19 times yesterday (in the new entries on 750,000 blogs) seems to have been just a random statistical fluctuation, not any sort of leading indicator of warm feelings of fellowship sweeping through the blogosphere.
I feel the same way about many of yesterday's other "Key Phrases". Maybe the BlogPulse algorithm for estimating likelihood ratios needs a tune-up? Or maybe they forgot the Bonferroni correction or some appropriate approximation to it? This is likely a source of problems, since the number of tests implicitly done is quite large (perhaps as large as the count of all the N-grams in the day's blogtext, for 2<=N<=4), and so it won't be easy to steer between the Scylla of fantasy and the Charybdis of obliviousness.
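Here is a rough sketch of why the correction matters, in Python. Everything in it is an assumption made for illustration: I model the daily count of a phrase as Poisson, I invent a background rate of 10 per day for "very good friend", and I guess at a million implicit tests. None of this is known to be what BlogPulse actually does.

```python
import math

def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1).

    Adequate for a sketch; for very small tail probabilities the
    subtraction loses precision and a dedicated routine would be better.
    """
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)

def is_bursty(count: int, expected: float, n_tests: int, alpha: float = 0.05) -> bool:
    """Bonferroni: flag a phrase only if its tail probability is below alpha / n_tests."""
    return poisson_tail(count, expected) < alpha / n_tests

# Hypothetical numbers: 19 occurrences against an assumed background of 10 per day.
p = poisson_tail(19, 10.0)
flag_uncorrected = is_bursty(19, 10.0, n_tests=1)
flag_corrected = is_bursty(19, 10.0, n_tests=1_000_000)
```

Under these made-up assumptions the tail probability is roughly 0.007, which looks significant taken by itself but falls far short of the corrected threshold of 0.05 divided by a million. Bonferroni is conservative, so a stricter threshold like this errs toward the Charybdis side, which is exactly the trade-off described above.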
I'm not sure what to make of the BlogBites, which are "weblog entries from the Blogsphere which showcase the past day's burstiest themes." The site doesn't tell us what a "theme" is, algorithmically, and I can't say that their selection strikes me as getting at the essence of anything. I wouldn't be shocked to see the same list presented by some human as his or her idea of the most important posts of the day. But on the other hand, I also wouldn't be surprised to see the list emerging from a selection of first paragraphs at random from the day's scraping of blogtext.
One final comment: the limitation to day-by-day textual listing of Key X's is too bad. It would be nice to see graphs of mentions of Key X's over time -- weeks or months. Then you could really see the pulse of the blogs.
A newly graduated linguistics PhD was hit by a bus and tragically killed on the day her dissertation was turned in. Her soul arrived in heaven at the Pearly Gates to meet St. Peter.
"Welcome to the gates of Heaven," said St. Peter. "But let me just say that we have a bit of a problem here. You see, we've never actually had a linguist make it this far -- usually they have lived fairly dissolute lives (you wouldn't believe the things that went on at the 1974 Linguistic Institute), or published things with inaccurate glosses and mismatched brackets or uninterpreted formalisms of one sort or another, and it's clear enough that they're not really suitable candidates for the University of Heaven. But you were just starting out. We're not really sure what to do with you."
"Well, couldn't you just let me in?" said the young woman. "I've tried to be good."
"No, the procedure in these cases, to be scrupulously fair, is to let you experience each and then choose," said St. Peter. "You'll spend one day in Hell and one here in Heaven and then you'll make your decision about eternity."
And with that St. Peter made the necessary travel arrangements and the young scholar was whisked down to the gates of Hell.
She strolled in, naturally rather nervous, and found herself in a lushly vegetated and well-kept courtyard in which stood an elegant Italian fountain. Off the courtyard was a well-appointed seminar room with superb AV equipment, excellent built-in projectors, high-speed radio Internet connection, whiteboards with markers that actually worked, everything.
Down the hall was a very comfortable lounge with a reference library that despite its compact space had the latest edition of the OED; the luxury leatherbound edition of The Cambridge Grammar; every previous grammar she knew about any language; all of Frege's works in their first editions; an unexpurgated `director's cut' hand-sewn edition of The Logical Structure of Linguistic Theory dated 1954... and a subscription to just about every journal that could possibly be relevant to her field. All on open stacks in mint condition.
She began to meet the other linguists who were strolling the courtyard, chatting in the hall, reading in the library. Otto Jespersen was there, and was very nice to her. Edward Sapir, Leonard Bloomfield, and Bernard Bloch all praised her work warmly. She learned that the man in the loincloth meditating by the fountain out in the courtyard was Panini. Jim McCawley took her to a marvellous Chinese buffet for lunch; the salt and pepper prawns flash-broiled in hell fire were fantastic. Through the afternoon there were fascinating discussions on many different linguistic topics. Dinner in the faculty club was a feast of steak and lobster followed by crepes suzette cooked in flames at the table by a demon. Over coffee and brandy she had a brief chance to meet the Devil, who turned out to be a tall, handsome man with a voice rather like Peter Ladefoged's. When the time came for her to leave she was really quite reluctant. But it was time to sample Heaven.
Heaven turned out to be a rather sterile experience of standing around on clouds. It was mildly interesting to discover that she could play the harp (innately triggered abilities, she assumed). The cherubim and seraphim were gentle and polite, but their conversation revolved mainly around falling down before Him in adoration and singing praises unto His holy name, and she rapidly tired of it all. When her 24 hours were up and St. Peter came to ask her for her decision, it was not really very difficult.
"I never thought I'd say this," she said, "I mean, Heaven has been... nice... But I really think I had a better time in Hell. I mean the University of Hell is a better fit for my intellectual interests."
So St. Peter escorted her back. She arrived once more at the gates of Hell, and strolled back in confidently. But the pleasant courtyard was gone.
She was standing in a desolate, filthy, trash-strewn wasteland. The temperature was ninety and rising, and there was a whiff of brimstone in the air. She thought she heard distant howls of agony. The seminar room was a bare room with plaster falling off the walls in a half-derelict building. The library had some battered introductory texts and a few loose copies of Glossa with non-consecutive dates in the 1970s. She did see some linguists, but they were dressed in rags, and appeared to be picking up dead lizards and pieces of potentially edible garbage and putting them in sacks to make an evening meal. They looked at her with sad and bitter eyes, pausing from their gathering activities only to tell her that they thought her research was second-rate at best. One of them mentioned that in her absence she had been appointed to a committee. A tattered schedule on a wall said that her first class was at 7 a.m. the following morning.
When the Devil happened to pass by she cried out to him:
"I don't understand! What happened to the library and the Chinese lunch buffet and the faculty club and... What has happened? All the other linguists look miserable, and they seem to hate me. It's all... different!"
Lucifer grinned. He put an arm around her shoulders and laughed a deep, dark laugh. (He really did sound like Peter Ladefoged.) The dark horns high on his forehead, which she had scarcely noticed before, stood out against the glistening scarlet skin, and his arrow-tipped tail waved gently in satisfaction as he explained:
"But yesterday we were just interviewing you! Today you're a junior member of our faculty."
[Non-humorous note: To my surprise, the chairman of a distinguished Department of Linguistics (it shall be nameless) recently emailed a version of this joke to a new PhD graduate from my department after getting that graduate's acceptance in writing of an offer of a tenure-track faculty position. I guess I would have thought it was a bit too much on the cynical side for such a use. Luckily the new appointee had the robustness of spirit to find the joke hilarious, and showed it to me with twinkling eye. Perhaps the sender judged that the way to take the story was as a cautionary tale: a lesson to us all about how not to treat our junior colleagues in the academic profession.]
In the April 19/26 New Yorker, David Owen describes a meeting in Phoenix, AZ, between two writers for the Hallmark greeting-card company and about 20 members of the public. After one of the guests shares a personal greeting-card story, Owen reports this exchange between Hallmark editor Michelle Keller and the audience:
"That's wonderful," Keller said. "And for being the first brave soul to share a story we would like to present you with the Hallmark Blushing Bears." She held up a pair of white plush Teddy bears dressed in red outfits -- a popular item during this year's Valentine's Day card-buying season. Keller made the bears kiss by pressing their (magnetic) noses together, and a red light inside the female bear's cheeks glowed: a blush. When the other women saw this, they made a sound that is impossible to represent typographically but was approximately "Awwwwwwwwwwwwwwwww!" [emphasis added]
(That's 17 w's, if I've counted right).
It's odd to say that this sound is "impossible to represent typographically". In a sense, no sound can be represented typographically, except perhaps by printing all the numerical values of a digitally sampled waveform. However, Google finds 564 pages with one 'a' followed by 17 instances of 'w', and many of them seem to be instances of the same category of vocal display that Owen recorded, like this one (which comes up first for me):
Awwwwwwwwwwwwwwwww. Sounds like you need a hug *^_^*.
It's true that the number of w's is not standardized, but that just means that people have found many, many different ways to indicate this vocal display typographically -- and Owen is pretty far out on the statistical tail in the choice that he made:
# of w's | ghits
17 | 564
16 | 683
15 | 908
14 | 1,600
13 | 2,030
12 | 3,550
11 | 4,290
10 | 5,610
9 | 7,860
8 | 11,500
7 | 16,500
6 | 31,100
5 | 57,400
4 | 131,000
3 | 330,000
16: Awwwwwwwwwwwwwwww Soooo cute!
15: awwwwwwwwwwwwwww its soooooooo cute!!!!!
14: awwwwwwwwwwwwww... sweet!
13: awwwwwwwwwwwww those rabbits are so cute!!
12: awwwwwwwwwwww *wipes a tear away*
11: ... awwwwwwwwwww....sweet!!
10: Awwwwwwwwww, what a cutie.
9: awwwwwwwww, bless you look so cute when you was younger.
8: Awwwwwwww This picture is so cute, I can't stop laughing.
7: AWWWWWWW! This is my best friend Eric's puppy- "Bonus". He's just too cute not to share with the world.
6: AWwwwww!! This was just the sweetest story!! I
5: Awwwww. How CUTE!
4: Awwww...Sweet...
3: Awww... You have to watch the slideshow of Annabella on Megan's blog. This little cutey is adorable
And so on...
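For anyone who wants to check the counting, here is a small Python sketch. It uses a regular expression to pick out an "aw+" token and count its w's, then uses the hit counts reported above to see how far out on the tail 17 w's falls. The function names are mine, and the share is computed only over the spellings with 3 to 17 w's, since those are the only counts I collected. Longer spellings are simply left out.

```python
import re

# One "a" followed by three or more "w", as a whole word, ignoring case.
AW = re.compile(r"\ba(w{3,})\b", re.IGNORECASE)

def w_count(text: str):
    """Return the number of w's in the first aw+ token in text, or None."""
    match = AW.search(text)
    return len(match.group(1)) if match else None

# Google hit counts by number of w's, as reported in the table above.
GHITS = {
    17: 564, 16: 683, 15: 908, 14: 1600, 13: 2030, 12: 3550,
    11: 4290, 10: 5610, 9: 7860, 8: 11500, 7: 16500, 6: 31100,
    5: 57400, 4: 131000, 3: 330000,
}

def tail_share(n: int) -> float:
    """Share of all tabulated hits that use n or more w's."""
    total = sum(GHITS.values())
    tail = sum(hits for k, hits in GHITS.items() if k >= n)
    return tail / total

owen = "A" + "w" * 17 + "!"  # Owen's spelling, built programmatically
owen_w = w_count(owen)
owen_share = tail_share(17)
```

On these figures, spellings with 17 w's account for well under one percent of the tabulated hits, while the 3-w and 4-w spellings together make up the large majority.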
"Aw+" -- however we decide to spell it -- is just as well-defined a category of vocal display as "wonderful" or "Keller" is. But it's true that it's a different kind of thing. It doesn't refer to a person, place or thing, it doesn't denote a predicate that can be applied to different arguments, it doesn't represent the grammatical relationship of various other words in a phrase. Instead, it expresses a certain feeling. In that respect, it seems somewhat like the communicative displays of animals. As David Hume put it:
It is evident, that sympathy, or the communication of passions, takes place among animals, no less than among men. Fear, anger, courage, and other affections are frequently communicated from one animal to another, without their knowledge of that cause, which produced the original passion.
We also share with animals the ability to express our "affections" with different degrees of intensity. By choosing 17 w's, Owen is trying to suggest a pretty intense -- or at least prolonged -- form of the vocal display spelled "Aw+".
However, if we look a little further in Google's index, we can see that it's not quite right to say that Aw+ "expresses a feeling," since it can clearly be used sarcastically or insincerely:
Now please answer my questions or can you not even do that? I know what I wrote and I don't have to read it again. Did I hurt your feelings, awwwwww poor baby. ...You need to grow up.
Rather, Aw+ conventionally purports to express a certain feeling. Or something like that.
It's often claimed that animals act deceptively -- a standard example is a parent bird pretending to have a broken wing to lead a predator away from its nest -- but it remains controversial whether this is ever done with a real intention to cause another creature to have a false belief. And I don't know of any purported examples of what one might call "animal sarcasm", though it's logically possible. Thus just as a human might say "oh terrific, happy days are here again, tofu meatloaf for dinner!", similarly a dog might sarcastically wag its tail to express disapproval of being served kibble yet again. It doesn't seem likely.
There are several interesting questions about the human (American English?) vocal display in question. One is whether Aw+ is ambiguous, or whether there might be several different similar displays to be distinguished here, since one "sense" is an expression of pleasure in perceiving something cute (like most of the examples cited above), whereas another "sense" is an expression of sympathy for someone who is hurt:
Awwwwwwww, too bad, if ya want to cheer up, goto my HOMEPAGE!!!
AWWWWWWWW poor guy that is girl is mean!!!!!
awwwwwwww big hugssssss ~~~~ thats SUCKS when that happens
Both sets of examples represent a kind of nurturant maternal cooing, but the situations and the feelings seem quite different.
Finally, there's the question of how this vocal display varies across languages and cultures. I don't know the answer to this question, but I would guess that a similar vocal display is available to pretty much every human, but with somewhat different phonetic details.
Even in the case of American English, there's a phonetic question to which I don't know the answer. The sound that I associate with the typography Aw+ is a low-mid back rounded vowel usually indicated by IPA "open o" [ɔ], as in my pronunciation of the words in J.C. Wells' "lexical set" THOUGHT. But many Americans have merged the vowel in this set of words with the vowel in the lexical set LOT (so that caught and cot are said the same way), and pronounce all of them with an unrounded vowel that is something like IPA [ɐ] (or even fronter, to [a]). So do people who merge caught and cot pronounce Aw+ with [ɔ:]? or with [ɐ:] or [a:]? I think that they still use [ɔ:], but I'm not sure.
[Update 4/30/2004: Neal Whitman emails:
In my experience here in central Ohio, the answer to your question is no. In the intro lx class I taught last year, most students made no distinction between [a] and [open o], pronouncing both as [a], but when I asked them what they'd say in the presence of a cute little puppy or kitten, they produced the [open o] with no problem.
I suspect that this is the general situation.
Daniel Ezra Johnson emailed to point out that Google also reports sequences with "ah+" parallel to some of those with "aw+". He tried "ah+ baby", but as he points out, many of these use "ahh" etc. to represent a completely different vocal display. However, "ah+ how cute" works better. Thus we have
aww how cute | 4,580
ahh how cute | 368
with examples like
Ohh here's a picture of my goddaughter enjoying her first christmas, ahh how cute
As Daniel suggests in his note, this is as likely to be a different idea about the appropriate way to spell the sound [ɔ:] ("open o"), as a different idea about the right sound to make to express shared pleasure in perceiving something small and cute.]
[One other comment: Charles Darwin observed, collected and documented every fact that he could about every area that interested him. Yet his book "The Expression of the Emotions in Man and Animals" does not, as far as I can tell, mention the vocalization(s) we're discussing here. Perhaps this is because he paid much more attention in that book to facial expressions than to vocal displays -- I'm reluctant to believe that it could be because Aw+ was unknown in Victorian England, or in any of the other places where he would have had a chance to observe it.]
About a year ago, in March of 2003, the meaning of "thumbs up" in modern Iraq was discussed by Brendan Koerner in Slate. Koerner observes that a raised thumb is traditionally an obscene insult in the Middle East, and cites University of Kansas classicist Anthony Philip Corbeill as having concluded in 1997 that in ancient Rome "the thumbs up sign actually meant 'Kill him', basing his assertion on a study of hundreds of ancient artworks."
Michel de Montaigne wrote in 1575:
It was at Rome a signification of favor to depress and turn in the thumbs:
"Fautor utroque tuum laudabit pollice ludum:"
and of disfavor to elevate and thrust them outward:
"Converso pollice vulgi, Quemlibet occidunt populariter."
[Essays XII, "Of Thumbs", translated by Charles Cotton]
Neither Koerner nor Corbeill credits Montaigne; you might think that today's established journalists and prize-winning classics professors "have become unmoored from the mother ship of culture", as Camille Paglia puts it.
I don't agree with this. There's far too much culture out there for any of us to be in touch with all of it -- it's not so much a "mother ship", at this point, as a few dozen big fleets, along with thousands of more or less independent traders, raiders and pleasure craft. I'm sure that the time that Koerner and Corbeill didn't spend reading Montaigne was put to excellent use in some other way. If I ever read this passage in Montaigne myself, I've long since forgotten it, and I just stumbled on it this morning by a bit of random Google serendipity.
But just to help put us all back in touch with the Renaissance fleet, here's Montaigne's whole essay "Of Thumbs":
Tacitus reports, that among certain barbarian kings their manner was, when they would make a firm obligation, to join their right hands close to one another, and intertwist their thumbs; and when, by force of straining, the blood it appeared in the ends, they lightly pricked them with some sharp instrument, and mutually sucked them.
Physicians say, that the thumbs are the master fingers of the hand, and that their Latin etymology is derived from "pollere." The Greeks called them Anticheir, as who should say, another hand. And it seems that the Latins also sometimes take it in this sense for the whole hand;
"Sed nec vocibus excitata blandis, Molli pollice nec rogata, surgit."
It was at Rome a signification of favor to depress and turn in the thumbs:
"Fautor utroque tuum laudabit pollice ludum:"
and of disfavor to elevate and thrust them outward:
"Converso pollice vulgi, Quemlibet occidunt populariter."
The Romans exempted from war all such as were maimed in the thumbs, as having no more sufficient strength to hold their weapons. Augustus confiscated the estate of a Roman knight, who had maliciously cut off the thumbs of two young children he had, to excuse them from going into the armies: and before him, the senate, in the time of the Italic war, had condemned Caius Vatienus to perpetual imprisonment, and confiscated all his goods, for having purposely cut off the thumb of his left hand, to exempt himself from that expedition. Some one, I have forgotten who, having won a naval battle, cut off the thumbs of all his vanquished enemies, to render them incapable of fighting and of handling the oar. The Athenians also caused the thumbs of the Aeginatans to be cut off, to deprive them of the superiority in the art of navigation.
In Lacedaemon, pedagogues chastised their scholars by biting their thumb.
As a result of this little bit of happenstance, I read a few of Montaigne's other essays on the same site. At the risk of being parochial, I have to say that they remind me in style and tone much more of weblog entries than of the kind of "essay" that I was trained to write in school.
Of War-horses, or Destriers: I here have become a grammarian, I who never learned any language but by rote, and who do not yet know adjectives, conjunction, or ablative. I think I have read that the Romans had a sort of horses, by them called funales or dextrarios, which were either led horses, or horses laid on at several stages to be taken fresh upon occasion, and thence it is that we call our horses of service destriers; and our romances commonly use the phrase of adestrer for accompagner, to accompany. ...
Of Cannibals: ... I long had a man in my house that lived ten or twelve years in the New World, discovered in these latter days, and in that part of it where Villegaignon landed, which he called Antarctic France. This discovery of so vast a country seems to be of very great consideration. I cannot be sure, that hereafter there may not be another, so many wiser men than we having been deceived in this. I am afraid our eyes are bigger than our bellies, and that we have more curiosity than capacity; for we grasp at all, but catch nothing but wind. ...
Of Coaches: ... Will you ask me, whence comes the custom of blessing those who sneeze? we break wind three several ways; that which sallies from below is too filthy; that which breaks out from the mouth carries with it some reproach of having eaten too much; the third eruption is sneezing, which because it proceeds from the head, and is without offense, we give it this civil reception: do not laugh at this distinction; for they say 'tis Aristotle's. ...
People often observe that weblogs are an ephemeral form, and that as a result of progressively lowered barriers to publication, too much stuff is being written for anyone to keep track of. There's nothing new in that, for better or for worse.
[Note: I haven't read Corbeill's scholarly works and therefore can't really assert with any confidence that he doesn't cite Montaigne -- I'm relying only on the KU press release that I linked to, which of course he didn't write. But his scholarly work apparently relies on a new compilation of evidence from ancient pictures, a context in which it's not really relevant what some 16th-century French polymath did or didn't notice about classical writings on thumbs.]
There's decoding words, there's assimilating sentences, there's losing yourself in an exciting story. And then there's the kind of reading that you need to do when you want to figure out the answer to a question or assess a writer's position or contribution. Timothy Burke at Easily Distracted has given some detailed instructions on "How to Read in College" that I like a lot.
Burke is offering help to the student who has just noticed that "[p]rofessors assign more than you can possibly read in any normal fashion", and is trying to figure out how to cope. I think that Burke describes exactly the right solution -- "skim, skim, skim" -- and manages to make the description vivid and interesting. He illustrates the method with an example, worked out in detail, that makes sense even if you don't have access to the book he's talking about. And as he explains persuasively, intelligent skimming is a complex skill, not at all just a matter of looking at topic sentences or otherwise dipping randomly into a passing stream of text.
I disagree with one thing he says, though: "The first thing you should know about reading in college is that it bears little or no resemblance to the sort of reading you do for pleasure, or for your own edification." Let's leave aside the question of reading for pleasure -- that depends on personal choices about kinds of reading and kinds of pleasure. But someone long past college will still want to use intelligent skimming techniques to evaluate a proposed medical procedure, a current political controversy or a potential investment.
If your doctor suggests an operation to fuse vertebrae or replace a creaky hip joint, and you want to learn what the issues really are, you'll have much more to read than you can possibly assimilate in a linear way, and you'll have to cope with the fact that most of it is full of vocabulary and concepts that are unfamiliar. If you have a child diagnosed with reading disability or attention deficit disorder or autism, you'll be in the same situation. Making up your mind about bilingual education or global warming or the effects of cell phone radiation poses the same problems. Ditto for deciding how to invest your savings or where to go for your next vacation.
Depending on your interests and tastes, you'll ignore some of these problems or leave them to randomly-chosen experts or to other random influences. But if you can't assimilate lots of text quickly when you decide that you want to, you're at a real disadvantage in modern life. And the way to assimilate lots of text quickly has almost nothing to do with conventional "speed reading" techniques, and everything to do with the kind of process that Burke describes.
The kind of skimming that's appropriate for scientific text is a bit different -- figuring out which tables, figures and equations are really critical is an issue that Burke doesn't address, for example -- but the process is analogous.
I don't know if it makes sense to try to teach such skills directly. In my experience, schools -- and some aspects of ordinary life -- just pile up the readings, and the people who develop the right skills prosper, while the ones who don't, don't. If it works to give the kind of explicit instruction that Burke offers, it should be done much more widely.
[via Liz Ditz at I Speak of Dreams]
Claire at Anggarrgoon writes that
When I haven’t been revising my dissertation, I’ve been filling out a form for getting permission to do research on human subjects (aka going to North Australia over the northern summer to do some fieldwork).
I have all sorts of problems with this type of form. Of course, I see why they’re there, and it’s probably better on the whole that I do fill out one. My problem is not with people checking out my research design (I quite like having someone knowing what I'm up to).
BUT, the biggest problem I have is that I don’t view my “subjects” as “subjects” at all – the speakers I’ll be working with are collaborators in the project.
The terminology here is definitely a problem. One way to look at it is that it's better to treat people as "subjects" than "objects," but as Claire points out, "subject" is not at all the right term for the people that field linguists work with. She prefers "collaborator", but I have to point out that collaborate is a word with two senses, one of which subverts her intentions in a nasty way:
1. To work together, especially in a joint intellectual effort.
2. To cooperate treasonably, as with an enemy occupation force in one's country.
The term informant has similar problems:
1a. One that gives information. b. One who informs against others; an informer.
2. One who furnishes linguistic or cultural information to a researcher.
I usually use the term "language consultant." But Claire puts the consultancy relationship the other way around, and says that she "view[s] the role of a field linguist like me ... as a contractor or consultant, rather than as the head of an project dealing with human experimentation." Fair enough, and sometimes the group concerned actually hires the linguist, which makes this relationship explicit. However, the Institutional Review Board (IRB) will still require the linguist to fill out a human subjects form in this case, I think -- even if it is just an application for exemption. As Claire points out, there is something a bit odd here, since a management professor who consults for an outside company doesn't normally think to ask the local IRB for permission to interview the company executives as "human subjects".
Terminology and social relations aside, there are lots of issues about the interaction between language researchers and the "human subjects" review process managed by Institutional Review Boards at American universities. I'm happy to say that I've had generally excellent experiences with Penn's IRB, but I've also heard some horror stories about misunderstandings that have arisen elsewhere when an IRB that normally vets clinical research protocols comes up against a linguist or an anthropologist: "But you haven't listed all the specific questions that you plan to ask each subject, in the order that you'll ask them!" or "But you need to promise to destroy all recordings and transcripts after the study is completed!" Some of these stories may even be true.
For those who are interested, here is a summary of "Human Subjects Review for Language Documentation" that I wrote about four years ago. I believe that the main thing that has changed since then is that IRBs are more rigorous in insisting that everyone, including researchers in the social sciences and humanities, needs to go through the review process -- including, for example, people collecting oral histories, journalists and so on. As a result, it has generally become obligatory for even clearly "exempt" research to apply to the IRB to be officially declared exempt. Technically, a linguist who asks acquaintances for grammaticality judgments and publishes the results, without going through the IRB process, is probably in violation of the regulations. This is probably also true for someone who makes use of published corpus data. [Of course, IANAL or even an IRB member, and YMMV].
As Mark already mentioned, yesterday Russell Gray gave a talk about the work on subgrouping and dating that appeared in a paper in Nature on which I commented a while back. The talk and subsequent discussion clarified exactly what they are doing.
One thing that emerged is that I was right about how they are treating the characters. In biology, "characters" are the features that are used for classification. In a traditional morphologically-based classification, a character might be "has a backbone" or "has a nucleus". In a DNA sequence based classification, the characters typically take the form of "has such and such a nucleotide at such and such a position". In a linguistic classification, the characters have to do with what words particular languages have. I've said this somewhat awkwardly because there is more than one way to set up lexical characters.
When linguists set up sets of words for lexical comparison, whether for
classical subgrouping or for lexicostatistics, they are typically
arranged by glosses. That is, we list the form that each meaning
takes in the various languages. For instance, here is some data
for the word for "dog" in a few of the Indo-European languages:
Sanskrit | ʃvān |
Greek | kuōn |
German | hund |
Latin | kanis |
English | dag |
If we were to code "dog" as a single multistate character, we would have three
states, which we can call A, B, and C. The three states represent which
of the three cognate sets (two of which, in our example, have only one member)
represents the meaning "dog".
Sanskrit | A |
Greek | A |
German | A |
Latin | B |
English | C |
Gray and Atkinson did not code their data this way. Instead, they made all
of their characters binary. In order to do this with data that are
naturally multistate, they split each multistate character into a set of
binary characters, one per cognate set. If we recode our "dog" data
into binary characters as Gray and Atkinson did, we have
to create three characters, one for each cognate set.
Each character then represents whether that cognate set is represented
in a particular language. For instance, character A corresponds
to the question: "Does the language have a form cognate to Sanskrit
[ʃvān]?". A 1 means "yes"; a 0 means "no".
Language/Character | A | B | C |
Sanskrit | 1 | 0 | 0 |
Greek | 1 | 0 | 0 |
German | 1 | 0 | 0 |
Latin | 0 | 1 | 0 |
English | 0 | 0 | 1 |
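The recoding just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binarize` and the dictionary layout are invented here and are not taken from Gray and Atkinson's work, and the data are the toy "dog" example from the tables above, not their dataset.

```python
# Toy multistate coding of the meaning "dog": language -> cognate set label.
# The labels A, B, C follow the example table above; this is not the real dataset.
multistate = {
    "Sanskrit": "A",
    "Greek": "A",
    "German": "A",
    "Latin": "B",
    "English": "C",
}


def binarize(coding):
    """Split one multistate character into one binary character per cognate set.

    (Hypothetical helper, named for this illustration only.)
    """
    states = sorted(set(coding.values()))
    return {
        language: {state: int(coding[language] == state) for state in states}
        for language in coding
    }


binary = binarize(multistate)
for language, row in binary.items():
    print(language, row)
```

Each language ends up with a 1 for the cognate set it uses for "dog" and a 0 for every other set, which reproduces the binary table row by row.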
The use of binary characters raises one additional point.
Once the characters become "does a certain cognate set occur", it ceases
to be relevant whether the cognate in a particular language
preserves a particular meaning. For example, in the data above,
English is shown as having a completely different word for "dog" from
most of the other languages. However, English
does have a cognate to the form that occurs in Greek, Sanskrit, and German,
namely "hound". It is not listed as part of the data for "dog" because it
no longer means "dog" but instead denotes a particular kind of dog.
However, if we are just asking whether or not
this cognate set occurs in English, the answer is "yes", so we must revise
the table of character states:
Language/Character | A | B | C |
Sanskrit | 1 | 0 | 0 |
Greek | 1 | 0 | 0 |
German | 1 | 0 | 0 |
Latin | 0 | 1 | 0 |
English | 1 | 0 | 1 |
The dataset used by Gray and Atkinson in their Nature paper consists of a set of data created by Dyen et al. to which Gray and Atkinson added data for Hittite, Tocharian A, and Tocharian B. That dataset is organized by meaning, so it does not contain full cognate sets, only those cognates that retain their original meaning. "hound", for instance, would not be listed. That means that to convert multistate characters to binary characters properly, the original dataset has to be supplemented with cognates that differ in meaning. This of course does not affect the validity of the method.
The real significance of the use of binary characters is that the mathematical model that underlies the methods they use is based on the assumption that the characters are independent of each other. Whether an animal has a backbone is taken to be independent of whether or not it has a segmented body, and similarly what word a language has for "hand" is taken to be independent of what word it has for "fire". But when multistate characters are split into multiple binary characters in the manner described, the characters resulting from a split are not statistically independent. For the most part, languages have only a single word with a certain meaning and when a new word comes in, the old word disappears entirely rather than moving into a different meaning the way "hound" did in English. That means that in general, if we know that a language has a word belonging to one cognate set, we know that it does not have a word belonging to any of the others. Since we can predict the value of a character given information about the others, they are not statistically independent. The procedure that Gray and Atkinson used to create binary characters therefore violates the assumptions of the mathematical model.
This is a reason to be nervous about the validity of the results that they obtained, but it does not show that the results are wrong. Some violations render a model useless; others have insignificant effects. In this case, we don't know what the impact of the violation is. They are doing some experiments which they expect to provide information about the impact of the use of binary characters.
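The dependence among the split characters can be seen directly in the toy "dog" data. The following is a minimal sketch, not anything from Gray and Atkinson's analysis: the helper names `predict_from_others` and `all_predictable` are made up for the illustration, and the only assumption built in is the one stated above, that a language normally has exactly one cognate set per meaning.

```python
# Rows of the strictly meaning-based binary table (characters A, B, C).
strict = {
    "Sanskrit": (1, 0, 0),
    "Greek": (1, 0, 0),
    "German": (1, 0, 0),
    "Latin": (0, 1, 0),
    "English": (0, 0, 1),
}

# The revised table, in which English also counts as having set A ("hound").
revised = dict(strict, English=(1, 0, 1))


def predict_from_others(row, index):
    """Guess one binary character from the rest, assuming exactly one
    cognate set per meaning: it is 1 only if all the others are 0."""
    others = row[:index] + row[index + 1:]
    return int(sum(others) == 0)


def all_predictable(table):
    """True if every character value in every row is recovered by the guess."""
    return all(
        predict_from_others(row, i) == row[i]
        for row in table.values()
        for i in range(len(row))
    )


print(all_predictable(strict))   # every value is determined by the others
print(all_predictable(revised))  # the "hound" case breaks the simple rule
```

In the strict table every value can be read off from the other two characters, which is exactly what independent characters would not allow. The revised table shows the exception: once a displaced cognate like "hound" is counted, the rule no longer holds for English.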
In 1955, the conservative government of Melbourne decided to rename the local version of the traditional May 1 "Labour Day" parade. They settled on the theme "Let's get together and have fun!", which they identified as the meaning of the word Moomba in a local Aboriginal language, a word that they therefore adopted as the new name of their May 1 parade. According to this story at Laputan Logic, they were misled by their language consultants, and Moomba should actually be interpreted as an impolite comment on the whole re-branding process.
Yesterday Russell Gray visited Penn and gave a talk based on his much-discussed Nature article with Quentin Atkinson, "Language-tree divergence times support the Anatolian theory of Indo-European origin." (Nature, 426, 435-439). In the audience were Don Ringe and Tandy Warnow, whose reactions I cited in an earlier post, and a collection of linguists, biologists and computer scientists that included Bill Poser and me.
Gray's presentation explained a lot more about their methods than the (necessarily brief) Nature article did. Much of the additional material can be found in this draft chapter (though I understand that a newer version will soon be available).
There was a lively and sometimes heated discussion, both during and after the talk. I have other tasks today, so a detailed account will have to wait for later, but I'll give a few general impressions now. I'm sure that Bill will have some comments as well.
First, everything that I learned reinforced my earlier belief that this is serious and interesting work. Its methods and conclusions remain controversial but they are worthy of very close attention. This is also not a one-shot deal -- Gray is continuing experiments on the Indo-European issues, and has new work on Austronesian in progress.
Second, Gray and Atkinson draw different (and in fact roughly opposite) conclusions from Warnow and Ringe about the reliability of various phylogenetic inferences. As I noted earlier, Warnow and Ringe argue that we can often get good information about tree topology, but (in the present state of knowledge) can't expect any reliable information about times. In contrast, Gray and Atkinson argue that even when tree topology is very uncertain (and even if the history is substantially untreelike as well), it may still be possible to get fairly tight time estimates.
I'm not sure who is right about this. This is partly because I still don't know enough about the details of the models involved. But as far as I can tell from yesterday's discussions, even the folks who know a lot more are really in a similar state. It comes down to an argument about which simplifying assumptions to make, and what effects these assumptions will have on the conclusions that result. I'll go over some of this argument in more detail when time permits.
In thinking about the general problem, an analogy with physics may be helpful. If we assume that the sun, planets and other heavenly bodies are point masses in calculating their orbital dynamics, our model is obviously false to fact. But does this simplification invalidate our conclusions? Well, it might or might not, depending on what calculations we do and what conclusions we want to draw. Any model of orbital dynamics will be simplified -- and therefore false -- to one extent or another. The question is whether this matters with respect to some specific quantitative or qualitative prediction. Giving a correct answer to that question requires a mixture of detailed mathematical reasoning, relevant empirical testing and luck.
One of Russell Gray's slides made this point by quoting the well-known scientific proverb that "A model is a lie that leads us to the truth". I believe that this was originally adapted (by whom?) from a remark made by Picasso:
"We all know that art is not truth. Art is a lie that makes us realize truth, at least the truth that is given us to understand. The artist must know the manner whereby to convince others of the truthfulness of his lies." (The Arts, Picasso Speaks, 1923)
Yesterday Russell Gray made considerable headway in convincing me of the validity of his approach. His talk, and the discussion around it, clarified for me the nature of the simplifying assumptions that he's making, and the (empirical and logical) questions to be addressed in determining whether those simplifications invalidate his conclusions about the dating of Indo-European. He also convinced me that he's continuing a serious program of efforts to test the effects of his assumptions, and that he's serious about understanding and addressing objections. In other words, he's doing science.
So often we are shown how grammar changes over time with comparisons of BEOWULF with Chaucer with Shakespeare with Fitzgerald. But here and now in America, among young African Americans (and affiliated people brown, yellow and white), an interjection has evolved into a piece of grammar under our very noses.
I speak of YO! Time was that YO! was used as in YO! GET OFF OF THAT TABLE! But nowadays, YO! has floated to the ends of sentences and lost its shouting intonation, and has become what linguists would call a pragmatic marker. Listen to young blacks talking casually and savor sentences like THE PARTY WAS REALLY OFF THE HOOK, YO or TELL HIM HE CAN'T BE STEPPIN' TO YOU ALL THE TIME, YO.
For maximal clarity, OFF THE HOOK means roughly "superlatively fantastic" and STEP TO means to initiate a physical altercation.
In any case, we must understand that this is a brand new YO! In the first sentence above, "..HOOK, YO" is pronounced with the same melody as ICE CREAM in the sentence I WAS LOOKING FOR SOME ICE CREAM. The new YO! has no accent, in other words. It has become a little marker of emphasis, also carrying a hint of vernacular warmth, as if to connote that the party was marvelous in a way that the speaker and his interlocutors particularly cherish -- just the right songs, just the right people, just the right feel, yo.
This YO, then, is no longer a call, a shout. It is a word like EVEN when it is used in a similar way. THE SENATOR DIDN'T EVEN SHOW UP FOR THE VOTE, for example. This EVEN is hard to imagine in a newspaper headline because it's too personal, too viscerally judgmental. Only in a parody in THE ONION would EVEN be used like this in a headline. This is because EVEN, here, injects a note of the intimate, foreign to the effort to render newspaper prose maximally objective.
The new YO interests me in that it is not nearly as exotic as one might think. This is how what linguists term pragmatic markers have arisen in languages worldwide. And I have reason to think that less than a century ago, a similar process occurred in a different nonstandard American variety, Brooklynese.
I have been reading through anthologies of the marvelously surreal old comic strip KRAZY KAT by George Herriman lately. Anyone who has missed this minimalist masterpiece concerning an ambiguously gendered cat who craves for scraggly mouse Ignatz to throw bricks at his/her head as a substitute for sex should run, not walk, to their laptop to order a book from Amazon.
The strip ran from the teens into the forties, and Herriman had his characters speaking in a stylized amalgam of highfalutin, Ellis Island, and New York bridge-and-tunnel. Ignatz Mouse (or "Ignatz Mice" as Krazy called him) leans towards the latter.
But now and then, amidst his Jackie Gleason-esque speech style Ignatz comes up with something like: "Hmm -- so he was trifling with me, hey?"
Now, that HEY initially seems a little clumsy. Try saying that line out loud. We imagine HMM or EY rather than HEY. The HEY seems simply unnatural, neither elegant, nor "Yiddische," nor "slangy," but just odd. One encounters that use of HEY in various stone-age American comic strips and vaguely senses that, for example, the artists back then just didn't quite know how to write realistic dialogue.
But Herriman's attention to verbal nuance elsewhere makes it unlikely that these HEYs were just the result of the clumsiness of a pre-Tune-In-Drop-Out man's unfamiliarity with putting real speech on the page. And Ignatz "Mice"'s little verbal tic puts me in mind of something I once caught in an old radio show.
Before I LOVE LUCY, Lucille Ball starred in a radio sitcom called MY FAVORITE HUSBAND, which was in retrospect a kind of dress rehearsal for the television show that would take the nation by storm. Already she was the daffy wife always giving her hubby trouble, including donning costumes and playing parts.
In one episode of the show in the late forties, for reasons I won't bother readers by recounting, Lucy has to pose as a gum-popping gal from Brooklyn. Her characterization includes postposing HEY to every second sentence: WHY DON'T WE MEET DOWN AT THE STATION, HEY? IT WAS THE ONLY WAY I COULD FIND IT, HEY. Again, the HEY has no accent. It wasn't "DOWN AT THE STATION -- HEY!!!!" Instead, "..STATION, HEY" had the melody of "OVERCOAT."
I (born 1965, and having especial occasion to hear vernacular Brooklynese daily in 1986 and 1987) have never heard anyone use HEY in this way. But it is so peculiar that one assumes that the writers based it on some kind of reality. Between Ignatz and Lucy, I hypothesize that in America before about 1950, vernacular speech in, at least, New York City included a use of HEY as a pragmatic marker in a way quite similar to the way baggy-pants teens are today using YO!
I can't help but end this by noting that apparently, Herriman had a healthy dose of African ancestry, born as a creole in New Orleans. But that's just for fun, hey.
"NYC is the only US city where less than 50% of the households do not own a car". Found on a weblog, and added to our extensive collection of overnegations.
I'd like to add a few notes to Mark's post on the New York Times' decision to acknowledge the Armenian genocide. Genocide is defined in international law by the United Nations Convention on the Prevention and Punishment of the Crime of Genocide. One of the clearest and most convincing pieces of evidence of the Armenian genocide is the eye-witness account of Leslie A. Davis, who was American consul in Harpoot and reported on it to the State Department. It puts paid to Turkish claims that Turkey was merely suppressing a rebellion by the Armenians. Suppressing rebellions does not require the mass killing of women and children and the elderly.
It may seem surprising that Turkey continues to deny that the Armenian holocaust took place. No one now alive could bear any responsibility for it, and the evidence is so overwhelming that the denial has no effect other than to make Turkey look bad. Nor were all Turks at the time guilty of genocide. Indeed, some Turks acted heroically to save Armenians, as described here. I think that there are two reasons for Turkish intransigence. One is that the Armenian genocide is the original sin of the Turkish Republic. Although strictly speaking it took place under the Ottoman Empire, it was the tail end of the Empire, and the people responsible were the Young Turks who created the modern Turkish state. In many ways their accomplishment was remarkable. They succeeded in preventing Turkey from being colonized and created a modern, democratic, secular state. Turkey has had its problems, but it has been much freer, more democratic, and more successful at modernization than any other Muslim country. To take but one example, women have had the right to vote and to be elected to national office since 1934. It is understandably painful for those who are justly proud of this accomplishment to recognize that the founders of modern Turkey had blood on their hands.
The other reason that Turkey is unwilling to acknowledge the Armenian genocide is that the Turkish Republic is founded on ethnic nationalism. The Ottoman Empire was a multiethnic state in which the unifying force was Islam. The Turkish Republic, as a secular state, has Turkish ethnicity as its unifying force. The existence of other peoples with territorial claims is thus a threat to the ideological foundations of the Turkish Republic. Turkey has been fairly tolerant of minorities with no territorial claims. Anti-semitism, for example, has not been much of a problem in Turkey. But groups like the Armenians, the Greeks, and the Kurds, who have claims to Turkish territory, are problematic. The Armenian problem was largely resolved by genocide, and the Greek problem by exchanges of population with Greece, but the Kurds remain a major thorn in the side of Turkey. Until very recently, Turkey denied the very existence of the Kurds and their language. Kurds were referred to as "mountain turks". The use and teaching of Kurdish was banned. Turkey only recently relented on this in order to obtain entry into the European Union. Turkey still opposes the creation of a Kurdish state, even in Iraq, for fear of arousing Turkish Kurds.
It's important to remember these things because, perhaps, the next time, somebody in power will care and put a stop to it. Adolf Hitler said "Who now remembers the Armenians?" and went on to carry out his own genocide. But memory is not enough. It is also necessary that the people with the power to prevent genocide care enough to do it, and most of the time, they don't. When the Hutu began to exterminate the Tutsi in Rwanda in April of 1994, the international community did nothing. At the time, there was a small United Nations peacekeeping force in Rwanda under the command of Canadian Lieutenant General Roméo Dallaire. General Dallaire's expert opinion was that he could have stopped the massacre with only 5,000 troops. He appealed repeatedly for more troops and permission to stop the massacre but never received them. I remember watching his testimony before Parliament. I'd never seen a general cry before.
General Dallaire has described his experience in a moving book Shake Hands with the Devil. It is a damning indictment of the United Nations bureaucracy and of the governments of the countries most concerned: Belgium, France, and the United States, which blocked movements within the UN to stop the killing.
"N.Y.U. doesn't attract just smart students, it attracts smart, eclectic students," said Mr. Beckman, the university spokesman. "We had a film student who wanted to film a couple performing a live sex act in front of a class. We had students who set up a swimming pool in their dorm room. Now we have this fellow."
Mr. Beckman is being quoted in today's New York Times, and by "this fellow" he means Steve Stanzak, a.k.a. the "Bobst boy", who has been living for eight months in the basement of the Bobst Library at N.Y.U., due to financial problems, and documenting his experiences on LiveJournal and a web site homelessatnyu.com.
I'm also impressed by Stanzak's resourcefulness, but I wonder whether the entire N.Y.U. administration agrees with their spokesman's other definitions-by-example of "smart" and "eclectic". Limiting discussion to the business of setting up a swimming pool in a dorm room, and thinking about issues such as floor loading and water damage in rooms below, I would have guessed that a different S-word would be more appropriate. In my experience, it's rarely a good idea to put large amounts of water in unusual places. I don't think that any students in the dorm where I live have ever turned their room into a swimming pool. However, an army buddy of mine once tried to turn his rusty old van into a sort of traveling love palace by removing the rear seats and installing a second-hand water bed, and that experiment came to a mythically disastrous end.
Maybe they build dorms differently at New York University (which is never called "New York"; like Boston University/B.U., it is an example of an "X University" not known familiarly as "X"). Anyhow, NYU has given Stanzak a free room -- under the swimming pool? -- and invited him to come talk to them about a better financial aid package, so he'll have a chance to find out for himself.
An April 17 press release from The Armenian National Committee of New York cites an earlier news release from the International Association of Genocide Scholars to the effect that The New York Times has lifted its long-standing policy against the use of the term "Armenian Genocide".
The press release quotes a "revised guideline for journalists" as saying that "after careful study of scholarly definitions of 'genocide,' we have decided to accept the term in references to the Turks' mass destruction of Armenians in and around 1915", so that "the expression 'Armenian genocide' may be used freely and should not be qualified with phrasing like 'what Armenians call,' etc." The quote continues that "while we may of course report Turkish denials on those occasions when they are relevant, we should not couple them with the historians' findings, as if they had equal weight."
I haven't been able to find either the original or the revised NYT guidelines online, and the NYT has not discussed the matter in its own pages, as far as I can tell from the online search function.
It's amazing that this is still such a controversial issue. Here's a Reuters story, via the NYT, from December of 2003, about a similar judgment adopted by the Swiss Parliament over objections from the Swiss government and vague threats from the Turkish government:
Parliament adopted a resolution, 107 to 67, recognizing the killing of Armenians under the Ottoman Empire in World War I as genocide, defying the Swiss federal government and angering Turkey. Foreign Minister Micheline Calmy-Rey spoke against the resolution, and a spokeswoman said the government hoped the resolution would not strain Switzerland's relations with Turkey, which is deeply concerned about the issue. Turkey reacted swiftly, saying the Swiss assembly bore responsibility for any negative consequences its decision might cause.
Here's the Armenian National Institute's web site on the events of 1915-1923. Their FAQ is a good place to start. Here is the Wikipedia entry on the subject. As the revised NYT guidelines recommend, I'm not going to cite the denials as if they had equal weight, though the Wikipedia entry gives what I believe to be a fair account of the history and status of the controversy.
A minor footnote is the role of denial of the Armenian genocide in the history of spam, as discussed in this Wikipedia entry on Serdar Argic.
[via Blind Höna]
[Update: Gary Bass discusses this in the Talk of the Town section of the May 3 New Yorker.]
A couple of weeks ago, the learned Dr. Weevil answered a query about the difference between "Asymmetric information" and "Asymmetrical information" by explaining that the first is merely a trochaic tetrameter, while the second is a hipponactean. He indicates in a footnote that "[t]he hipponactean is named after the Greek poet Hipponax, the only one I know other than Hank Williams (Senior, of course) who writes of buckets of beer." In a later post, he quotes some (translated) verses, though not the one about beer.
According to the LION database, at least one other (English) poet has written a poem about buckets of beer. Well, he's Australian, and there's only one bucket of beer in the poem, which is mostly about bugs, homelessness and nostalgia, but I think that Hank would've liked the content if not the style.
Baking at night
Geoffrey Lehmann
You don't get bread these days
with blue and green beetle wings baked into it
and pink stains from some crimson bug.
On hot nights
the lights of the bakehouse drew
all the insects of Waugoola Shire,
and strolling past you could smell the dough.
But they've given up baking at night.
You don't see the fires of the bagmen
under the bridge by the river.
They're extinct too.
Mr Long sometimes humped his swag
for far-off places,
drinking methylated spirits, shadow boxing
and trying to kiss people.
I've tasted his johnny cakes,
flour mixed with salt and water on a fence post
and cooked on a sheet of galvanized iron,
zinc curling off around the dough.
Burned specks turned out to be mouse dung.
After his long tramp across One-Tree-Plain
with a 'cigarette swag'
Jim Long (Old Quizzer) dossed for some weeks
with a dozen other bagmen sprawled drunk
under the bridge at Darlington Point.
He got some meat scraps
and cooked soup for them all in a kerosene tin.
A bagman's three-day-old corpse
when it was noticed
was christened 'Hot and Juicy'.
The bagmen dug a hole by the side of the river,
a bucket of beer
was sent down from the Punt Hotel,
and Constable Brindle read the burial service.
You don't see many drunkards, wanderers
or blind people
(like Mrs Stinson---as children we loved
to see her holding her missal upside down
in church, poor woman).
There's no Cancer Joe for children to taunt.
If I wanted to join the bagmen by the river
under the weeping willows
I'd find no one there,
only the rumble of semi-trailers crossing the bridge,
the big headlights hurtling over.
We live in very moral times.
[from Spring Forest (1994)]
Following up on Eugene Volokh's discovery of Jacob Weisberg's fondness for "you know" as a filler, here are a couple of other examples of Weisberg using stigmatized features of the vernacular, in a passage whose structure is incoherent in the way that extemporaneous speech often is:
(link) We were sort of infuriated by that, for a couple of reasons. The main one is the idea that -- I mean, we take our integrity very seriously, and the idea that it's somehow corrupting for NPR to work on a show with journalists from Slate, we didn't understand why, just because Microsoft happens to own us, why we're impure in some way that they're not. [emphasis added]
As I wrote back on Jan. 3
You can make any public figure sound like a boob, if you record everything he says and set hundreds of hostile observers to combing the transcripts for disfluencies, malapropisms, word formation errors and examples of non-standard pronunciation or usage. It's even easier if the critics use anecdotes based on the perceptions and verbal memories of equally hostile listeners.
I think that the excessive focus on George Bush's alleged language problems, fostered by Jacob Weisberg's "Bushism" enterprise at Slate, is a very bad idea. It's a bad idea because it trades on regional and class prejudices -- and yes, I know that George W. Bush has roots in the New England aristocracy, but it seems that he's regarded as a linguistic traitor to his class, just as FDR was seen as an economic traitor to his class. The "Bushism" obsession is also a bad idea because it seizes on and amplifies the most trivial mis-statements, and thus helps push public figures to replace unscripted discussion with artificial exchanges of carefully-packaged sound bites. And finally, it's a bad idea because it exemplifies the urge to replace political discourse with Pavlovian conditioning.
Fernando Pereira at Fresh Tracks describes a discussion with his wife Ana about the relative frequency of the Portuguese words for "child" (criança/as vs. menino/a/os/as) in Portugal and in Brazil. The discussion was provoked by a passage in John McWhorter's book The Power of Babel, which Ana had been reading, and they explored the answer by reference to ratios of ghits.
It used to be that an unabridged dictionary and an encyclopedia would be kept accessible in middle-class homes, for settling questions of language or fact. Now the dictionary is likely to be an online one, and "the internet" is likely to be used for fact finding in place of the encyclopedia. I'm also seeing more and more cases of people using Google and similar search facilities to address usage questions by counting things. Of course, Fernando and Ana are hardly an ordinary couple in this respect.
If I understand Fernando's post, John turns out to be correct, more or less. The most interesting part is the apparent interaction among singular vs. plural and Portugal vs. Brasil (numbers are ratios of criança to menino):
Country | Singular | Plural |
Portugal | 2.16 | 5.83 |
Brasil | 1.04 | 1.86 |
Fernando offers an explanation in terms of the interaction between formality (greater in Portugal) and the specification of gender (which is required for forms of menino but not criança, and thus favors menino in informal uses, since one is likely to be talking about particular kids whose identities and genders are known). The idea seems plausible; it could be tested by examining a random sample of uses of each form. In this connection, it would be nice if Google (or a similar search engine) could be persuaded to return a random sample of the hits for a particular query, rather than the usual relevance-ordered list. Doing one's own pseudorandom sampling is not possible, since Google will not serve up results for starting points beyond 1,000 -- and the top 1,000 ghits are almost always a biased sample.
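To make the sampling idea concrete, here is a minimal Python sketch of the kind of test described above. It assumes, hypothetically, that the complete list of hits for a query were available as a plain list of text snippets, which is exactly what Google does not provide. The function names and the toy data are illustrative only and do not correspond to any real search API.

```python
import random

def sample_hits(hits, k, seed=None):
    """Return a uniform random sample of k hits (or all of them if there are fewer)."""
    rng = random.Random(seed)
    return rng.sample(hits, min(k, len(hits)))

def proportion(sample, is_formal):
    """Fraction of sampled hits that a judge (human or automatic) marks as formal."""
    if not sample:
        return 0.0
    return sum(1 for hit in sample if is_formal(hit)) / len(sample)

# Toy stand-in for a complete, unranked hit list (hypothetical data).
hits = [f"hit {i}" for i in range(10_000)]
sample = sample_hits(hits, 100, seed=0)
```

With a sample like this for each form, one could compare the proportion of formal contexts for criança and for menino directly, instead of relying on the relevance-ranked top 1,000.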
Anyhow, there is a partly analogous case in English with child/children and kid/kids: the former seems more formal and also more British, while the latter seems more informal and also more American. Of course, there is no gender marking involved in either word. In both the .com domain (probably mostly American) and the .co.uk domain (certainly mostly British), forms of child are commoner than forms of kid. However, the .com domain definitely has relatively more kid/kids (confirming that it is an Americanism). Moreover, the effect of singular vs. plural is opposite in the two domains (numbers are ratios of child to kid (singular) or children to kids (plural)). The notion that this is an effect of formality more than geography is supported by the fact that the .edu domain, which is almost all American, is even more strongly dominated by child/children than the .co.uk domain is:
Domain | Singular | Plural |
.co.uk | 3.43 | 1.93 |
.com | 1.35 | 1.45 |
.edu | 10.8 | 2.87 |
[Update: David Nash points out that because of Google's propensity to ignore apostrophes, the estimates of children/kids ratios are too low, since Google will lump kid's in with kids. (So will all too many writers, alas.) David suggests that counts for childs/child's might help balance things, but one would have to figure out how many of those are the name rather than the possessive form of child.]
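The singular/plural interaction in the two sets of figures above can be made explicit with a little arithmetic: divide each plural ratio by the corresponding singular ratio. The sketch below uses only the numbers reported above; the name `plural_shift` is my own label for this quotient, not an established measure.

```python
# Ratios copied from the figures above: (singular, plural).
ratios = {
    "Portugal": (2.16, 5.83),
    "Brasil": (1.04, 1.86),
    ".co.uk": (3.43, 1.93),
    ".com": (1.35, 1.45),
    ".edu": (10.8, 2.87),
}

def plural_shift(singular, plural):
    """How much the formal/informal ratio grows (>1) or shrinks (<1) in the plural."""
    return plural / singular

for name, (sg, pl) in ratios.items():
    print(f"{name:8s} plural/singular = {plural_shift(sg, pl):.2f}")
```

The quotient comes out above 1 for Portugal, Brasil and .com, and below 1 for .co.uk and .edu, which is the opposite-direction effect noted in the text.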
Ron Hogan recently asked me about an obscure phrase in Sean O'Hagan's Guardian review of a new book on Bob Dylan's lyrics by Christopher Ricks. As a result, I read the review all the way to the end, which I might not otherwise have done. I'm grateful, because the first sentence of the last paragraph is a gem:
The writing of this book was, I'm told, a labour of love and, as such, I am pained to point out how defeated I was by its ungainly style.
This is a wonderful example of what Saul Gorn used to call a "self-annihilating sentence". It's not as succinct as "Essentially is essentially meaningless", but it has real charm. Its finest feature, in my opinion, is its ungainly use of "as such" to mean "that being the case" or "therefore". A full appreciation of O'Hagan's achievement requires a bit of discussion -- so unless you are a fan of grammar or of irony or both, you may want to skip the rest of this post.
The standard gloss for "as such", as in Merriam-Webster, is "as intrinsically considered : in itself". Generalizing from "in itself" to a wider set of pronouns, this is the meaning when Kant writes (in translation) about "ideas as such", when medical researchers conclude that "Fungal spores as such do not cause nasal inflammation", or when a gardener writes that "I have nothing against cats as such but they do tend to use our garden as a toilet."
However, this gloss is incomplete as a picture of "as such" usage. We need to consider two modifications, one a sort of semantic bleaching, and the other a difference in the connections of as and such with the words and phrases around them.
In the first form of generalization, as such may become a sort of weasel wording, allowing the writer to avoid committing to a fully general statement ("I have nothing against the French as such"), or (as Ken Wilson wrote in The Columbia Guide to Standard American English) as such may be "used mainly for emphasis": "The story lines of Wagner's operas display surprisingly little narrative skill as such."
This bleached-out concessive or emphatic as such seems to be what Charles Bernstein meant to use in writing an article entitled "Against National Poetry Month As Such". Bernstein is against National Poetry Month not only as intrinsically considered and in itself, but also in its relationship to other aspects of contemporary American culture and indeed in every other way he can think of, and he proposes that it should be replaced by "National Anti-Poetry Month". In Bernstein's title, maybe "as such" is just verbal boldface, or maybe it's a way of telegraphing the idea that Bernstein opposes National Poetry Month but favors poetry, or maybe (since Bernstein is a poet who believes that "sense remote / Adduces worth") it's both at once.
In the second form of generalization, "as such" is not used to modify a preceding noun phrase, but rather is connected to a noun phrase occurring later; while at the same time, such refers anaphorically to a completely different phrase in an earlier clause. This structure can be understood by looking at a pair of sentences like these:
As an expert in the field, Dr. Gubser is frequently invited to speak at lectures, and members of the media often seek his opinions and analysis.
Dr. Gubser is an expert in the field. As such, he is frequently invited to speak at lectures, and members of the media often seek his opinions and analysis.
In the first example, "as an expert in the field" is connected to "Dr. Gubser", but precedes it. In the second example, "such" is an anaphoric reference to the preceding noun phrase "an expert in the field", and "as such" has exactly the same relationship to Dr. Gubser that the full as-clause did in the first sentence.
Usage mavens generally advise that such phrases ought to connect to the subject of the following clause, rather than to a noun phrase in some other position. On this view, the following version would be deprecated:
? As an expert in the field, members of the media often seek Dr. Gubser's opinions and analysis.
I generally agree with this advice (though I don't see anything wrong with sentences like "As a bonus, we get a Wide Area Network as well.") When a sentence-initial adjunct needs to connect to a specific noun phrase deep in the following material, it can be confusing. However, this advice is very widely ignored, and because it is often so unclear what the connections to the preceding and following material really are, "as such" has come to be used as a kind of generic bit of inter-sentential stitching, as this Columbia Journalism Review note complains.
My impression is that this linking usage of "as such" is becoming quite common -- if you search Google for strings of the form "as such X", for appropriate values of X, you'll find quite a few:
(link) A street retreat is a plunge into the unknown. As such, no one knows what will happen.
(link) everyone will get the email, and as such, everyone is welcome to join in
(link) The Code sets out general principles to guide employees in making ethical decisions, they cannot and are not intended to address every specific situation. As such, nothing in the Code prohibits or restricts Reed Elsevier from taking any disciplinary action on any matters pertaining to employee conduct...
This is a normal example of syntactic and semantic change in progress, and I'm certainly not about to say that these sentences are ungrammatical -- for those who have made the change. However, these sentences are certainly bad (or at least inconsiderate) writing, because they're puzzling to readers who still see as such as a fronted modifier containing an anaphor, and who will seek in vain for suitable preceding and following linkages.
Now we can see what a perfect little piece of found poetry it is, when O'Hagan uses as such as a linkless linker in the center of his ungainly complaint about Ricks' ungainly style:
The writing of this book was, I'm told, a labour of love and, as such, I am pained to point out how defeated I was by its ungainly style.
David Beaver appears to be suggesting at the end of Language Log's 800th post that I might really be a Finn, or at least might have an alternate Finnish identity. I wish! Finnish is the coolest language. And I did use it, impeccably, at least once. It felt so great.
While I was in Helsinki for over two weeks to be at the 2001 ESSLLI (it's an annual thing, this year in France), I developed a crush on Finnish. I tried to pick up as many snippets of information about Finnish as I could. I listened to every syllable I could overhear. I read to myself under my breath from every sign I passed. I was walking past a window near the university one day when I saw the word YLIOPISTOKIRJAKAUPPA, and I suddenly stopped and realized with a real thrill that I could understand it. All of it. It's in parts: yli "high", opi "knowledge", sto "location", kirja "book", kauppa "retailer": yli-opi-sto-kirja-kauppa "higher learning place book shop" -- it said University Bookstore!
That didn't define me as a speaker of Finnish (a language about which they tell the old tale that the devil was unable to learn it); that was just passive analytical competence. But it was a start. It gave me hope and confidence for my entry into active use of the language. And finally the opportunity came.
A week or two later my friend Polly and I were returning from a wonderful three-day trip we took to Russia by train from Helsinki (the same train to the Finland Station in St Petersburg by which Lenin traveled to Russia to start the revolution that renamed the city Leningrad). We needed to take a taxi to the small hotel where we would spend our last night in Finland before flying home to the States. And I decided that I was ready to tell the taxi driver where we wanted to go, in Finnish.
The Hotel Arthur (it sounded like Hoh-tel Arrh-toorrh, I had learned from overhearing a Finn mention it) was in a street called Vuorikatu (stressed on vu: you always stress the first syllable of everything in Finnish, whether that seems sensible or not; then the intonation just sort of drops off onto a low monotone as if you aren't very interested in the rest of the word). Katu means "street". Now, Finnish doesn't have a preposition meaning "in"; that isn't how things work. Instead you put the ending -lla on a noun that the preposition would go before if there were such a preposition. And what makes it a little trickier is that the ending changes the last consonant in the noun. The change that the t of -katu would have to undergo, I had learned, would turn it into a d. All I had to do was put all that together, and we wouldn't need to wimp out and rely on the taxi driver's ability to understand us naming the destination in what would essentially be English.
Polly and I slid into the back seat of the cab, and I leaned forward and said to the taxi driver in a carefully studied Finnish accent that I had practiced a few times in my head: Hotel Arthur, Vuorikadulla. And off the driver went. No chat; he had understood me perfectly well in the usual language of the city; he just assumed I was a Finnish speaker. I like to imagine that Polly was extraordinarily impressed with my linguistic skills, though she chose not to say anything. I, at least, felt supremely competent just to have fluently rendered the correct case ending and associated morphophonemic alternation and stress contour on an utterance in a language that the devil himself gave up on. Lucky Polly to have such a master of language, such a linguistic stud, as a travelling companion.
It still didn't define me as a Finn (though if David Beaver really thinks I might be one, I have apparently been mistaken for one twice). But it was more than a start. I didn't just take that utterance out of a phrasebook; I constructed it out of its parts; I phonetically honed the parts to fit them together properly; I did the stress pattern right. That's what command of a language consists in -- the ability to put an expression together to do the job you need to do, whenever the occasion arises. It felt great to do it even once. Some day I'm going back and learn some more. More of the coolest language in the world.
Those with a serious interest in the "neo-glottochronology" research by Forster and Toth, and more recently Gray and Atkinson (see Language Log posts here, here, and here), will want to read these papers:
Tandy Warnow, Steven Evans, Don Ringe and Luay Nakhleh, "Stochastic models of language evolution and an application to the Indo-European family of languages" (.pdf)
Steven Evans, Don Ringe and Tandy Warnow, "Inference of divergence times as a statistical inverse problem" (.pdf)
Russell Gray will be visiting Penn this week, and giving a talk on Tuesday, so watch this space for more information.
The second paper's conclusion is quoted below. It's interesting to see a case in which a statistician, a linguist and a computer scientist agree on the appropriateness of Rumsfeld's "unknown unknowns" saying, and like its formulation well enough to quote it.
Much of what we have said has focused on two issues: one is formulating appropriate stochastic models of character evolution (by formally stating the properties of the stochastic processes operating on linguistic characters), and the other is inferring evolutionary history from character data under stochastic models.
As noted before, under some conditions it may be possible to infer highly accurate estimations of the tree topology for a given set of languages. In these cases, the problem of dating internal nodes can then be formulated as: given the true tree topology, estimate the divergence times at each node in the tree. This approach is implicit in the recent analyses in (Gray & Atkinson, 2003; Forster & Toth, 2003), although they used different techniques to obtain estimates of the true tree for their datasets.
The problems with estimating dates on a fixed tree are still substantial. Firstly, dates do not make sense on unrooted trees, and so the tree must first be rooted, and this itself is an issue that presents quite significant difficulties. Secondly, if the tree is wrong, the estimate of the even the date of the root may have significant error. Thirdly, and most importantly perhaps, except in highly constrained cases, it simply may not be possible to estimate dates at nodes with any level of accuracy...
[...]
Therefore we propose that rather than attempting at this time to estimate times at internal nodes, it might be better for the historical linguistics community to seek to characterise evolutionary processes that operate on linguistic characters. Once we are able to work with good stochastic models that reflect this understanding of the evolutionary dynamics, we will be in a much better position to address the question of whether it is reasonable to try to estimate times at nodes. More generally, if we can formulate these models, then we will begin to understand what can be estimated with some level of accuracy and what seems beyond our reach. We will then have at least a rough idea of what we still don't know.
7. EPILOGUE
As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don't know
We don't know.
-- Donald Rumsfeld
U.S. Secretary of Defense
Posted by Mark Liberman at 10:19 PM
Here's another phrasal template ("snowclone"): searching Google for "have * will travel" turns up X = browser, children, spacesuit, OOPL, geocache, computer, rocket, transgenes, dog and some 215,000 others. Well, 215,000 pages containing such a sequence, anyhow. The origin of the phrase, of course, is the 1950s TV western Have Gun -- Will Travel, whose hero, Paladin, sported the business card shown on the right.
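For anyone who wants to hunt for this template in their own text collection rather than on Google, the wildcard search can be approximated with a regular expression. This is only a rough sketch: it assumes a single word in the X slot (Google's * can match longer sequences), and the names `SNOWCLONE` and `fillers` are invented for the illustration.

```python
import re

# "have X will travel", with X a single word and an optional comma, colon or
# hyphen between X and "will". The one-word restriction is a simplifying assumption.
SNOWCLONE = re.compile(
    r"\bhave\s+(\w+)(?:\s*[,:-]\s*|\s+)will\s+travel\b",
    re.IGNORECASE,
)

def fillers(text):
    """Return the lowercased X of every 'have X will travel' found in text."""
    return [m.group(1).lower() for m in SNOWCLONE.finditer(text)]
```

Counting the values returned by `fillers` over a large body of text would give a local analogue of the list of X's above.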
What brought this up was a piece by Sam Hughes in the most recent Penn Gazette, entitled "Have drill, will travel". It's about Doc Holliday, who got his DDS in 1872 from the Pennsylvania College of Dental Surgery, which later became part of Penn's School of Dental Medicine. I enjoyed the movie Tombstone, in which Val Kilmer played Doc, but I don't remember a role in it for Doc's girlfriend Big Nose Kate, née Mary Katherine Haroney, whose father was personal surgeon to the Emperor Maximilian. She met Holliday in 1878, in Fort Griffin, Texas, where in order to help him escape the consequences of knifing another gambler, she "set a fire in the hotel as a diversion, then used a pistol to persuade the reigning deputy to let him go".
The Holliday piece in the Gazette is a sidebar to an article entitled "Dentist of the Purple Sage", about western novelist Zane Grey (originally "Pearl Zane Gray"), who got his DDS from Penn in 1896. The title is a take-off on Grey's classic Riders of the Purple Sage. One interesting aspect of Hughes' article is that he starts it off with a reference to the 19th-century American practice of organizing academic life around inter-class brawls, one of which figured in Grey's autobiographical (non-western) novel The Young Pitcher. All this is by way of agreeing with Semantic Compositions that today's students definitely embody "a shift in cultural attitudes and academic training, both of which indicate a present-day emphasis on material acquisition over other goods", such as fighting skills.
Seriously, SC is only talking about changes since the 1960s, and he's interested in the balance between material goals and "metaphysical well-being", and he cites some persuasive evidence, and he hedges his discussion in many appropriate ways. So I shouldn't kid him about his left-handed defense of Camille Paglia.
As further support for the hypothesis of cultural changes among today's young people, I can't resist quoting from Hughes' description of Grey's initiation into the life of the mind at Penn:
It began when he attended an anatomy lecture in an amphitheater--presumably in the building now called Logan Hall--and made the mistake of sitting in a row traditionally reserved for upperclassmen. A big, blond, "husky-voiced sophomore" got to his feet and roared: "Watch me throw Freshie out!"
Freshie may have been scared, but he wasn't budging. When the sophomore tried to pull him away, Pearl [remember that 'Pearl' was Grey's original first name] gave him a violent shove that sent him backward over a row of seats into the midst of his classmates.
Here's how biographer Frank Gruber, drawing on Grey's unfinished, unpublished autobiography, put it: "Pandemonium broke out. The sophomores rose en masse to get to Pearl, and the freshmen spilled down from their heights to rescue their champion. The amphitheater became a scene of riot, and when it was over Pearl was stark naked, except for one sock. His clothing had been torn from him, including his shoes."
The Plain English Campaign people haven't taken Geoff Pullum's advice to get a life, but BBC News has organized its readers to create an all-cliché short story, perhaps as a sort of flooding therapy.
Our discussion of the problems of deciding how to pluralize Latin and Greek nouns in English (here, here, and here) leads me to point out that those who have difficulty with this can take some comfort from the fact that the Romans themselves did not always follow the Greek accurately. Steve of Language Hat referred me to a recent discussion of classical plurals in which Justin points out that octopi is not quite as ignorant as it sounds since the Romans themselves sometimes treated Greek words ending in πους [pus] as second declension. He gives the example of the related noun polypus, which should be third declension but occurs with second declension case forms. Interestingly, one of the citations in this Perseus entry is to a line of Plautus in which we find the accusative singular polypum, a second declension form (the third declension form would be polypoda). Plautus was very familiar with Greek - indeed, his plays contain many passages in Greek - so he was surely not ignorant of the Greek form. That even he would shift the declension shows that this must have been a common phenomenon.
Gildersleeve and Lodge's Latin Grammar has a discussion of the declension of Greek nouns at pp. 32-33. They say that many Greek nouns have mixed declension in Latin, with second declension forms used alongside third declension forms, as with polypus. However, they also say that this mixture is pretty much restricted to the singular; the Romans usually followed the Greek in the plural.
Gildersleeve and Lodge also point out that the Romans sometimes took the accusative of the Greek word to be the stem. For instance, Greek κρατήρ "punchbowl" is a third declension consonant stem noun with accusative singular κρατῆρα. It shows up in Latin both as a third declension masculine, nominative singular crater, genitive singular crateris, as it ought to if one follows the Greek closely, and as a first declension feminine, with nominative singular cratera, genitive singular craterae.
By the way, there's a handy summary of the basics of Latin plurals here.
Stephen Laniel was inspired by Bill Poser to write that
A little post about Latin plurals really makes me want to learn Latin. I’ve wanted to for a while, along with Ancient Greek. All in due time, I guess.
I certainly don't want to encourage delay, but I can offer an inspirational story about learning Greek later in life.
Around 1970, in his mid-60s, I.F. Stone retired from journalism, taught himself Greek, and began systematically reading the extant classical literature in search of "one last scoop." The result was his book The Trial of Socrates, published in 1988, shortly before Stone's own death.
I once met Stone, at lunch at the house of the publisher Ralph M. Ingersoll, who was the father of a school friend. This was (I think) in 1963, when I was 15. Ingersoll, who was contemplating retirement himself, asked Stone about his plans. Stone said that he had always wanted to read classical Greek literature in the original language, and that after retiring, he planned to learn Greek and indulge himself.
I remember feeling surprise that he wanted to do this, and skepticism that he would follow through on the idea. So I was surprised and impressed when his book came out a quarter of a century later, and I took special pleasure in reading it. I'm about the same age now that Stone was in 1963, so I can appreciate his achievement in a different way.
Here's an obituary by Ralph Nader, and a review from the right, both of which underline the appropriate irony of Stone's fascination with Socrates.
Correspondents have pointed out some further examples of Pseudo-Latin Plurals. The cases that I have previously mentioned involve incorrect choices of stem form and ending. The new examples involve attempts to make plurals of things that in Latin were not nouns.
Donald Davidson brings to our attention sub poenae, intended as the plural of sub poena. In legal English sub poena is used as a noun to refer to a court order for a witness to appear or to produce documents or other evidence. In the latter usage it is short for sub poena duces tecum "Under penalty you will bring with you". duces is the verb "you will bring". tecum is the combination of the pronoun te "you (singular)" and the preposition cum "with". sub poena is a prepositional phrase consisting of the preposition sub "under" and the noun poena "penalty". A prepositional phrase is not a noun and cannot be made plural by changing the ending of its noun to the plural. Indeed, the nominal part of this prepositional phrase is not in the nominative case. sub governs the ablative case. The way Latin is normally written, you can't tell, but the /a/ of sub poena is long whereas the /a/ of the nominative singular poena is short. You could of course put poena into the ablative plural, but sub poenis (with long /i/) would mean "under penalties". In short, sub poena is a noun only in English, not in Latin, so the only way to make it plural is the English way: sub poenas.
Claire Bowern of Anggarrgoon has encountered non sequituri, intended as the plural of non sequitur. In Latin, non sequitur is a sentence, not a noun. It consists of non "not" and the verb sequitur "it follows". As a sentence, it cannot be made plural by adding the nominative plural suffix for second declension nouns. As an English noun, it has the English plural non sequiturs. If we really wanted to make a Latin plural, we could, since in this case the verb could be made plural - non sequuntur would mean "they do not follow". That might be a little obscure.
[Update: Keith Ivey has pointed out another example of this type, ignoramus, which is sometimes given the plural ignorami, e.g. here. This would be correct if ignoramus were a second declension noun, but it isn't. ignoramus is a verb meaning "we are ignorant of". It became a noun through its use as the name of a character in the 1615 play by George Ruggle of the same name.]
[Update: John Kozak mentions seeing agendae, which I too have encountered from time to time. Google turned up plenty of examples, perhaps the most embarrassing of which is this list of information about meetings at the University of North Texas. The problem here is that agenda is the plural of agendum "something to be done"; there is no singular agenda for agendae to be the plural of. Of course, in English agenda is used as a singular noun, so there is no reason we shouldn't use the English plural agendas.]
[Update: Keith Ivey has pointed out a similar example: omnibus, which is sometimes given the plural omnibi, as here. omnibus is a noun, or, to be precise, an adjective used as a noun, but it is already plural and in the dative case. omnibus is the dative plural of omnis "all" and means "for all". It therefore cannot be further inflected as if it were a nominative singular noun.]
A word whose plural is particularly problematic is octopus. Here are the results of a Google search:
octopi | 34,300 |
octopuses | 33,800 |
octopods | 2,310 |
octopodes | 921 |
octopii | 601 |
The second most popular plural, octopuses, is one of the forms considered correct by dictionaries. It is not a classical form. It is the regular English plural. The correct classical plural is the next-to-least popular, octopodes. The reason is that octopus isn't really a Latin word; it's a Greek word that was borrowed into Latin. In Greek the nominative singular was ὀκτώπους [oktopus] and the plural was ὀκτώποδες [oktopodes]. Like many Greek loans into Latin, it is declined more-or-less as in Greek, as a third declension noun.
What about octopods? This is a scientific term derived by making an English plural from octopod, which is the bare stem of the Greek word, not its singular. A guess is that octopod is a backformation from the neuter plural octopoda, the name of the order containing octopuses.
There is one more form that we haven't discussed: octopussies, for which Google yields 411 hits. Not all of them are references to octopuses, but many are. This is evidently based on a folk etymology of octopus as containing puss "cat", from which the hypocoristic pussy is derived. The same folk etymology is the basis for the pun in the nom de guerre of the title character of the James Bond film Octopussy.
Ron Hogan of beatrice.com emailed an inquiry:
"Browsing this UK review of the latest book from Christopher Ricks, due to come out here in the US shortly, I came across "the phrase 'from the off,' which I've never seen before. As best I can gather, it appears to be a cricket reference, perhaps with "off" short for 'off stump'".
Sean O'Hagan is the author of the review, and here's the phrase in context:
From the off, Ricks dives headlong into Dylan's lyrics, putting all his faith in close readings of the texts, and the texts alone.
Well, I don't think I've ever seen the phrase either, though I have recent evidence that my memory is not always reliable for such things. However, the editors of the OED have encountered it, and they document the encounter, way down at the bottom of the entry for off, sense D. 5.:
colloq. The start of a race... Also (in extended use): the start (of anything), the beginning; departure; a signal to start or depart.
Judging from the citations given, it's a British sports metaphor -- and perhaps a fairly recent one -- but from horse racing rather than from cricket:
1946 Sporting Life 15 June 1/1 Some open betting saw Paper Weight favourite at the ‘off’.
1966 J. PORTER Sour Cream xiv. 180 It was too late. The students nearest to him..thought this was the off. They began to move forward.
1978 Lancashire Life Apr. 50 (caption) Tangle-wrangle: Stan Lyons waits on the slipway for the ‘off’, while helpers sort-out the lines from his harness.
1999 I. RANKIN Dead Souls xi. 68 Rebus knew..how juries could decide from the off which way they'd vote.
O'Hagan's review may have some American-puzzling bits in it, but he also trips over his own trans-Atlantic shoelaces when he identifies Ricks as "formerly professor of English at Cambridge, now professor of humanities at Boston". This entangles O'Hagan in the knotty business of reference to academic institutions, which is one of those quasi-regular aspects of English that are as hard to get right as the plural of nouns ending in 'f', or the endings of ethnonyms.
Ricks is actually employed at Boston University, which is also familiarly referred to as "B.U." (or "BU") but (I believe) is not known as "Boston" to anyone (outside of the pages of the Guardian, of course).
I goofed myself on a similar matter of academic nomenclature a few weeks ago, when I wrote about York University instead of the University of York, and had to be set straight by Geoff Pullum, who (as a graduate of the latter institution) pointed out that "York University" is an entirely different place, located in Toronto rather than in the U.K. From evidence on their respective websites, I'm confident that both "York University" and "the University of York" are sometimes familiarly called just plain "York", but if I hadn't seen the evidence there, I wouldn't be sure about it.
This raises the interesting question of how I can possibly be so sure that Boston University is never (appropriately) called just plain "Boston". I've never been told this explicitly; no one has ever corrected me (or anyone else in my hearing) for saying or writing the wrong form; there is no principle I can think of that it follows from. (Of course, I could simply be wrong about this -- but the point here is that I believe that I've learned it.)
In fact this is a case of implicit negative evidence, a phenomenon that some language learning theorists have claimed not to exist. I've learned that it's "wrong" to refer to BU as "Boston" because I've heard people refer to BU, in speech or writing, many thousands of times in my life, and none of them (I think!) ever referred to it as "Boston" before Sean O'Hagan did. This evidence, though statistical in nature, is good enough to give me high confidence in the conclusion.
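The statistical reasoning here can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each reference to the university is an independent draw, and that if "Boston" were an acceptable short form it would turn up at some fixed rate. The function name, the rate, and the counts are all made-up assumptions, not measurements.

```python
def prob_never_observed(rate: float, n_references: int) -> float:
    """Probability of hearing a form zero times in n independent references,
    assuming the form would really occur at the given rate."""
    return (1.0 - rate) ** n_references

# Hypothetical rate: "Boston" used 1 time in 100 if it were an acceptable form.
assumed_rate = 0.01
for n in (100, 1000, 5000):
    print(n, prob_never_observed(assumed_rate, n))
```

Under these assumed numbers, hearing a thousand references without a single "Boston" would be very unlikely if the form were in ordinary use. That is the sense in which the absence of a form, observed often enough, works as evidence against it.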
I believe that the role of this kind of evidence in human language acquisition is relatively uncontroversial now, though some researchers still haven't really digested its implications. However, I can tell you that when I made this point in a talk at MIT about a decade ago, it was by no means uncontroversial. I should also say, for those of you who are not in the biz, that this is all connected to an interesting piece of intellectual history, dealing with the nature and source of human knowledge, which is also connected to a famous sentence from the 1950s, discussed here. But that's a topic for another post.
The little words do big things hypothesis: the most frequent words in a text are closed class words that are essential for stringing together complex sentences and texts, and their frequency is proportional to (or at least some upward monotone function of) the average syntactic complexity of the text.
Camille Paglia owes (the anonymous author of) Semantic Compositions a hug, for his spirited defense of her unsupported generalizations about generational changes in attention span and verbal facility. Not a very big hug, though, because his defense reminds me of the Radio Yerevan jokes that a college friend of mine used to collect:
Question to Radio Yerevan: Is it correct that Grigori Grigorievich Grigoriev won a luxury car at the All-Union Championship in Moscow?
Answer: In principle, yes. But first of all it was not Grigori Grigorievich Grigoriev, but Vassili Vassilievich Vassiliev; second, it was not at the All-Union Championship in Moscow, but at a Collective Farm Sports Festival in Smolensk; third, it was not a car, but a bicycle; and fourth he didn't win it, but rather it was stolen from him. (loosely adapted from this page)
SC's version is something like this:
Question to Semantic Compositions: Is it correct that "interest in and patience with long, complex books and poems have alarmingly diminished not only among college students but college faculty in the U.S.", because "the new generation, raised on TV and the personal computer but deprived of a solid primary education", lacks "the most basic introduction to structure and chronology", has "degraded sensitivity to the individual word and reduced respect for organized argument", as well as "demonstrably reduced attention span", so that "[s]tudents now understand moving but not still images"?
Answer: In principle, yes. But first of all...
Read the rest here.
This reference to a review:
a trendy new English cookbook devoted to the preparation of offcuts, snouts, rectii, marrow, and bladders of all description.
contains a pseudo-Latin plural. What is evidently intended is the plural of rectum, which is properly recta. rectii would be correct if the stem ended in i, that is, were recti, and if its gender were not neuter, in which case the nominative singular would be rectius. It's interesting where these things come from. I suppose that people who don't actually know Latin but think that a word should have a Latin plural work by analogy from other Latin plurals they have heard. In this case, the analogy must be quite indirect, since no Latin noun ending in -um has a plural in -ii. (Such a noun would have to be a non-neuter ending in ium, which to my knowledge does not exist.)
It's easy enough to see how someone who doesn't know Latin could fail to realize that certain plural endings go with certain singular endings. That would account for someone deciding that the plural ending was i, not realizing that this was true only of masculine nouns, not neuters. So recti would not be a surprising error. But it looks like this form is derived by adding the ending ii to the stem rect. Where does this ii come from?
Some plurals do end in ii, but they are all plurals whose singulars end in ius, e.g. gladii "swords", singular gladius. Why don't people assume, as seems natural, that what is invariant, that is, gladi, is the stem, and that the plural ending is therefore just i? And how do they decide that the stem of rectum is rect? My best guess is that they have come up with a generalization about the form of the case/number endings, namely that they consist of one or more vowels possibly followed by a consonant. This is true of nominatives singular and plural of all nouns other than some third declension consonant stems. Someone who doesn't actually know Latin will generally encounter nouns in the nominative case, so that much is plausible. What I can't quite explain is where they get the idea that the ending can contain more than one vowel. Maybe they pronounce ii and i the same and so treat both as a single vowel, leading them to treat ii as a single morpheme and forcing them to conclude that the ius of words like gladius is also a single morpheme.
[Update: A convenient summary of Latin declension and conjugation is available on-line here.]
Today was the last day of spring term classes at Penn, and therefore it was also hey day, when the junior class celebrates the beginning of their end. The OED explains that "hey-day" is "apparently a compound of [the interjection] HEY; the second element is of doubtful origin, but at length identified with day. The early heyda agrees in form, but less in sense, with Ger. ˈheida, heiˈda = hey there!".
The sense is given as "An exclamation denoting frolicsomeness, gaiety, surprise, wonder, etc.", and the citations include, among others, 1598 B. JONSON Ev. Man in Hum. IV. ii, Hoyday, here is stuffe!, a sentiment with which I'm sure we can all agree.
There is also a noun hey-day or heyday, which the OED says is "Of uncertain origin; perh. connected with prec." (I love how the OED saves space by abbreviating words like "perhaps"), and glosses as "1. State of exaltation or excitement of the spirits or passions." or "2. The stage or period when excited feeling is at its height; the height, zenith, or acme of anything which excites the feelings; the flush or full bloom, or stage of fullest vigour, of youth, enjoyment, prosperity, or the like."
And that's exactly what it was.
A while back, we pointed out (following the lead of Cinderella Bloggerfeller) that Google's spelling correction algorithms could sometimes produce amusing exchanges like this one:
Search entry: gaaaaaaaaaa
Helpful Google: Did you mean: gaaaaaaaaa ?
A recent observation by Trevor at kaleboel led me into a new sort of dialogue with Helpful Google:
Query -- Prabble gnack pubble, tnil pniffertrub
Helpful Google-- Did you mean: Prebble gnack pubble, t nil pniffertrub ?
The exchange that Trevor cited, following Syntactic Saccharose, is a bit different, and depends on the fact that "Flurble gronk bloopit, bnip Frundletrune" is a string that is found in certain packets emitted by version 3.2.0 of NetStumbler (version 3.2.3 uses "All your 802.11b are belong to us" instead, etc.). Misspelling "Flurble" in that string leads Google to do something genuinely helpful, namely find the original NetStumbler string. In contrast, the probe cited above produces amusing results precisely because I'm (in effect) "teasing" earnest old Google with something that is beyond hope. Why this should be (even mildly) funny is a question in social psychology whose answer I don't know.
[Update: In a slightly different vein, Scott Parker emailed this conversation:
Search Entry-- ohhhhhh (o followed by 6 h's)
Helpful Google-- Did you mean: ahhh ?
]
Eugene Volokh has found some transcripts of Jacob Weisberg speaking extemporaneously, and given him a small dose of his own "Bushism of the day" medicine.
A few months ago, I pointed out that
You can make any public figure sound like a boob, if you record everything he says and set hundreds of hostile observers to combing the transcripts for disfluencies, malapropisms, word formation errors and examples of non-standard pronunciation or usage. It's even easier if the critics use anecdotes based on the perceptions and verbal memories of equally hostile listeners.
At the time, I looked around on the web for transcripts of Weisberg interviews, and came up empty. So I suggested this:
I'll buy dinner for Jacob Weisberg, if he'll let me record a couple of hours of convivial conversation..., and then examine the transcripts carefully for Weisbergisms ...
EV looked harder, or more cleverly, and found five pieces of evidence that Mr. Weisberg is given to using "you know" as a hedging pause filler. As EV then says:
Our oral comments are full of this sort of filler, and of grammar and usage errors of various sorts. Nearly anyone who has read a transcript of his own comments can tell you that.
But given that articulate, thoughtful people like Weisberg say these sorts of things, where's the humor, the aptness, or anything else in finding instances of Bush doing the same?
It would be even better to go through some recordings, since transcripts always underestimate the degree of disfluency.
Ray Girvan emailed to mention that the Henning Mankell (or was it Hanning Menkell?) thread reminds him of Sellar and Yeatman's "1066 and All That",
"where the authors deliberately play on the names of the pretenders during the reign of Henry VII, Lambert Simnel and Perkin Warbeck. (They refer to them variously as Warmnel, Perbeck, Wimneck, Warmneck, Lamkin, Lamnel, Simkin, Permnel, etc)."
Note that this is exponentially worse than the Manning Henkel problem, since there are not two but four dissyllables to conjure with.
The outlines of a Henning Mankell experimental paradigm are beginning to emerge -- present one or more reference names, and then (after a delay and perhaps some distraction) ask subjects whether each of a set of probe names was in the reference set. The aim would be to predict error rates and reaction times, based on a model of the structure of the morphophonemic subspaces involved.
In fact, there's probably already a relevant literature on this... In any case, I'll bet that such phenomena turn out to exhibit the same lack of sequential independence that was demonstrated repeatedly here (by google counting methods) for spelling variation in words like "emperor", "jennifer", and "attila".
Mark's posting on Camille Paglia's charges of decline in attention is right on the mark -- this is just an antique jeremiad in new packaging. People have been saying the same thing for centuries, with no more justification than anecdotal observations. (As Montesquieu said, looking back on the long line of complaints about the state of culture: "If all of this were true, we would be bears today").
One person who has made an honest effort to quantify these effects is Todd Gitlin. In an article in The Nation a few years ago called "The Dumbdown," he reported a study he'd done that showed that the length of the average sentence in novels on the New York Times bestseller list has decreased by more than 25 percent over the last sixty years, while the average number of punctuation marks per sentence has dropped by more than half.
But that method is subject to lots of confounds -- for one thing, the bestseller lists are computed very differently now. And I was curious enough about this to do my own little study, with Brett Kessler, which revealed a very different pattern.
Kessler and I did similar calculations, not for bestsellers but for articles from The New York Times and Science. (For the Times we took the lead sentences of the most prominent story for each of 40 consecutive daily issues starting on October 1 of every twentieth year, going back to 1856. For Science we used a slightly different but roughly equivalent method for selecting articles, beginning in 1896.)
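For readers who want to try this sort of count on their own samples, here is a minimal sketch of the per-sentence statistics. It is not the procedure Kessler and I actually used. It assumes that words are whitespace-separated tokens and that every ASCII punctuation character counts as one mark, including the final period, hyphens and apostrophes. The function name `sentence_stats` and the sample sentences are invented for illustration.

```python
import string
from statistics import mean, stdev

def sentence_stats(sentences):
    """Return (mean, standard deviation) of words per sentence and of
    punctuation marks per sentence. Needs at least two sentences."""
    # Assumption: a "word" is any whitespace-separated token.
    word_counts = [len(s.split()) for s in sentences]
    # Assumption: every ASCII punctuation character counts as one mark.
    punct_counts = [sum(ch in string.punctuation for ch in s) for s in sentences]
    return {
        "words": (mean(word_counts), stdev(word_counts)),
        "punctuation": (mean(punct_counts), stdev(punct_counts)),
    }

# Invented sample sentences, standing in for a set of lead sentences.
sample = [
    "The bill passed, narrowly, after a long debate.",
    "Rain fell.",
]
stats = sentence_stats(sample)
print(stats["words"])
print(stats["punctuation"])
```

A different tokenizer or a different definition of "punctuation mark" would shift the absolute numbers, so comparisons only make sense when every year's sample is counted the same way.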
We found that both sentence length and number of punctuation marks per sentence had indeed declined slightly over the period that Gitlin looked at -- in 1936 the average Times lead sentence was almost 38 words long; in 1996 it was less than 35 words long. The figures in Science were analogous -- between 1936 and 1996, sentence length dropped from 27.1 to 25.8, though the average number of punctuation marks per sentence actually increased slightly. In fact that was part of a century-long cycle:
Year | NYT words per sentence: Mean | StdDev | NYT punctuation per sentence: Mean | StdDev | Science words per sentence: Mean | StdDev | Science punctuation per sentence: Mean | StdDev |
1856 | 31.8 | 19.6 | 2.6 | 2.7 | - | - | - | - |
1876 | 29.9 | 15.7 | 1.7 | 1.8 | - | - | - | - |
1896 | 30.0 | 13.4 | 2.0 | 1.8 | 34.9 | 23.9 | 3.3 | 4.1 |
1916 | 38.1 | 20.8 | 2.1 | 2.6 | 36.6 | 20.4 | 2.1 | 2.4 |
1936 | 37.8 | 11.1 | 1.8 | 1.3 | 27.1 | 11.8 | 1.3 | 1.3 |
1956 | 21.0 | 6.3 | 0.7 | 0.9 | 30.2 | 13.6 | 1.9 | 1.8 |
1976 | 34.0 | 9.7 | 1.3 | 1.0 | 23.2 | 10.5 | 1.8 | 1.9 |
1996 | 34.7 | 11.8 | 1.9 | 1.1 | 25.8 | 11.2 | 1.8 | 2.1 |
MEAN | 32.2 | | 1.8 | | 29.7 | | 2.0 | |
But what does all this mean? The differences are pretty small, and in any event it would be hard to argue that either publication has been dumbed down over the course of the past 70 years, or that it requires less attention to read Science now than it did then.
And even if you were determined to interpret declining sentence length as an indicator of declining attention capacity, you'd be led to a curious conclusion. In fact, the mean sentence length in the Times reached a low of 21.0 in 1956, but since then it has been climbing -- by 1996 it was up about 65 percent from its low. And the average number of punctuation marks per sentence has more than doubled over that period. Similarly (if less dramatically) for Science, where there has been a 10 percent increase in mean sentence length over the past 20 years.
In short, sentences have gotten longer and more complex since Camille Paglia's youth. If you're looking for a decline in attention, you might start with hers.
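The percentage changes cited in this post can be rechecked directly from the Mean columns of the table above. A minimal sketch, with the relevant table values copied in by hand and a hypothetical helper named `pct_change`:

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100.0

# Mean values copied from the table (year -> mean per sentence).
nyt_words = {1956: 21.0, 1996: 34.7}
nyt_punct = {1956: 0.7, 1996: 1.9}
science_words = {1976: 23.2, 1996: 25.8}

print(f"NYT words, 1956-1996: {pct_change(nyt_words[1956], nyt_words[1996]):+.1f}%")
print(f"NYT punctuation, 1956-1996: {pct_change(nyt_punct[1956], nyt_punct[1996]):+.1f}%")
print(f"Science words, 1976-1996: {pct_change(science_words[1976], science_words[1996]):+.1f}%")
```

Since the table entries are themselves rounded to one decimal place, the computed percentages are only approximate, especially for the small punctuation counts.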
Inspired by Geoff Pullum, Håkan Kjellerstrand at hakank.blogg has written an icon program to generate plausible variants of "Henning Mankell" and compare them with the list in Geoff's original post. Kjellerstrand is also the author of the Perl module MakeRegex, which "composes a regular expression from a list of words", based on common prefixes. I've been hoping, though, that someone will follow up on David Beaver's post by writing a program to help with (various approaches to) estimating the statistical density of what David called "the Henning Mankell morpheme space".
This is a serious issue in psycholinguistics, as should be clear from reading what Geoff and David wrote. More on it later, maybe.
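To give a feel for how quickly the space of confusable names grows, here is a minimal sketch of a variant generator. It is not Kjellerstrand's program, just one simple approach: treat each name as a sequence of slots and take the Cartesian product of the options for each slot. The slot inventories below are assumptions chosen for illustration.

```python
from itertools import product

# Assumed slot inventories: each tuple lists the options for one position.
FIRST_SLOTS = [("h", "m"), ("a", "e"), ("nning",)]
LAST_SLOTS = [("m", "h"), ("a", "e"), ("n", "nn"), ("kel", "kell")]

def expand(slots):
    """All strings formed by picking one option from each slot, in order."""
    return ["".join(parts) for parts in product(*slots)]

def name_variants():
    """Every first-name variant paired with every surname variant."""
    return [
        f"{first.capitalize()} {last.capitalize()}"
        for first in expand(FIRST_SLOTS)
        for last in expand(LAST_SLOTS)
    ]

variants = name_variants()
print(len(variants))
print(variants[:5])
```

Estimating the density of the space would then mean weighting each generated form by how often it actually occurs, for example with counts of the kind gathered from web searches, rather than treating all combinations as equally likely.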
Someone ought to do a textual version of this.
If you want to read Eddaic and Skaldic poetry, and it's not obvious to you that "hunger-diminisher of din-seagulls of animal-gleam of Heiti" is a fancy way to say "warrior", you'll want to check out this Lexicon of Kennings and Similar Poetic Circumlocutions. For an example of a more complex and discursive analysis, see the essay "When is a fish a bridge? An investigation of Grímnismál 21" on the same site.
I learned about all this from an entry in Ray Girvan's Apothecary's Drawer weblog, which also includes references to several other traditions in which reference is mediated by complex allusions: Maori proverbs, and the Tamarian language of Darmok from Star Trek.
Perhaps the biggest single source of kennings in contemporary American culture is The Simpsons. Consider for example the post on Long Story Short Pier entitled "We are all Frank Grimes now". The meaning of the allusion to "Frank Grimes" is clarified for outsiders by a link to Simpsons episode 4F19.
One of the comments on the LSSP post states: "Release the robotic hounds that shoot bees from their mouths!". Unlike the main post, the comment provides no exegetical link, but we can begin to understand it by consulting this page entirely devoted to cataloguing animal-attack references from the Simpsons, which in turn brings us to Simpsons episode 1F16, in which we find this passage:
Homer: Bart, you're coming home. Bart: I want to stay here with Mr. Burns. Burns: I suggest you leave immediately. Homer: Or what? You'll release the dogs, or the bees, or the dogs with bees in their mouths and when they bark they shoot bees at you? Well, go ahead -- do your worst! [Burns slams the door and locks it] [disbelieving] He locked the door! I'll show him -- [rings the doorbell and runs away]
I'm still not sure what it all means -- fish, bridge, dogs, bees, doorbell, whatever -- but Simpsons 1F16, like Grímnismál 21, clearly requires attentive navigation of a dense semiotic network.
Mystery
Name | Ghits |
mennkell | 106 |
mennkel | 1 |
mankel | 9660 |
menkel | 7710 |
mankell | 206000 |
mannkell | 17 |
manning | 2670000 |
menning | 66800 |
hanning | 75100 |
henning | 2080000 |
henkell | 11000 |
hankell | 160 |
hankel | 70500 |
henkel | 661000 |
hennkel | 18 |
hennkell | 9 |
kurt | 7350000 |
wallander | 92200 |