September 30, 2006

Gender myths: letting science mislead

Last night's edition of ABC's 20/20 included several segments ("The Truth Behind Women's Brains" and "Gender Myths: Let Science Decide") that show the growing influence of pseudo-scientific neuro-biologism in American public discourse about sex roles. The "Truth Behind Women's Brains" segment featured an interview with Louann Brizendine, which presented pretty much all of the material from her book that's been discussed in earlier Language Log posts. I won't repeat myself here.

But some striking fragments of misinterpreted science also made it into the "Let Science Decide" segment, for example this one:

"The male brain … actually has a harder time processing the female voice versus the male voice, which is a possible explanation to why we don't listen when our wives call us," Dr. Billy Goldberg said on "20/20."

Goldberg and Mark Leyner are co-authors of "Why Do Men Fall Asleep After Sex?"

They said it was true that men listened less because of biology.

Right. I believe that this is a reference to some research that got a great deal of (spectacularly misleading) media exposure about a year ago: Dilraj S. Sokhi, Michael D. Hunter, Iain D. Wilkinson and Peter W.R. Woodruff, "Male and female voices activate distinct regions in the male brain", NeuroImage 27(3) 572-578, 2005.

I blogged about this research at mind-numbing length in an earlier post ("Rorschach science", 8/12/2005). Here's the executive summary:

  1. They recorded some male and female voices.
  2. They speeded up the male voices until the pitch was like the female voices, and slowed down the female voices until the pitch was like the male voices. As a side effect, this also made the male recordings about 1.5 times faster (and shorter), and the female voices a comparable fraction slower (and longer).
  3. This gives four categories of sounds: original male, original female, hacked male, hacked female.
  4. They played the results to subjects in an fMRI brain-imaging experiment.
  5. The subjects were 12 males (and zero females).
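The side effect described in step 2 follows from simple arithmetic: playing a recording faster by some factor raises its pitch by that factor and shortens it by the same factor. Here is a minimal sketch of that relationship. The starting pitches (120 Hz and 180 Hz) and the 3-second phrase length are hypothetical values chosen only so that the factor comes out at 1.5; they are not taken from the study.

```python
def resample_effect(pitch_hz, duration_s, speed_factor):
    """Return (pitch, duration) after playing a recording speed_factor times faster.

    Plain speed change scales pitch up and duration down by the same factor.
    """
    return pitch_hz * speed_factor, duration_s / speed_factor


# Hypothetical values, for illustration only (not from the paper).
male_pitch_hz = 120.0
female_pitch_hz = 180.0
phrase_s = 3.0

# Factor needed to move the male pitch up to the female pitch.
factor = female_pitch_hz / male_pitch_hz

hacked_male = resample_effect(male_pitch_hz, phrase_s, factor)
hacked_female = resample_effect(female_pitch_hz, phrase_s, 1 / factor)

print("hacked male (pitch Hz, duration s):  ", hacked_male)
print("hacked female (pitch Hz, duration s):", hacked_female)
```

Under these assumptions the pitch-matched recordings differ from the originals in tempo and length as well as in pitch, so pitch is not the only thing the manipulation changes.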

They then determined which brain regions showed various boolean combinations of statistically-significant effects, e.g.

  1. (original female > original male) AND (hacked female > hacked male). (which they called "female versus male")
  2. (original male > original female) AND (hacked male > hacked female). (which they called "male versus female")

The purpose of hacking the voices was to eliminate the obvious possibility that the subjects' brains were just responding to the differences in pitch. The researchers' goal was to find where and how the men's brains were responding to the identification of femaleness or maleness in the voices -- though this is curiously at variance with their assertion that the pitch-adjusted voices were perceived as "gender-neutral". (I suspect that they were probably perceived as species-neutral as well...)

Anyhow, I pointed out that there are lots of alternative ways to describe the effects as they defined them -- e.g. lower-pitch-and-shorter-phrases-vs.-higher-pitch-and-longer-phrases rather than male-vs-female. More to the point, though, the researchers only used men as subjects -- for all they (and we) know, female subjects would have responded in exactly the same way.

However, my objection was not to the research itself. I would have preferred to see more appropriate methods of signal processing applied to the voices, and it seems important to see a comparison with female subjects, but those are issues that could be addressed in follow-up experiments.

What I objected to was the media reaction. Here's how I quoted it, back in August of 2005:

Some headlines: "Er, you what, luv?" -- "Man Leaves Wife, Realizes Six Hours Later" -- "Female Voices are Easier to Hear" -- "What We Have is Failure to Communicate" -- "Men do Have Trouble Hearing Women" -- "Why Imaginary Voices are Male" -- "It's official! Listening to women pays off" -- "Men do have trouble hearing women, scientists find".

The blogospheric reactions are just as creative: "I can't hear you, honey...you're just too difficult to listen to" -- "What to tell your wife when you didn't hear her" -- "Men who are accused of never listening by women now have an excuse -- women's voices are more difficult for men to listen to than other men's, a report said" -- "I've been waiting for this for a long time. I'm often accused of 'selective hearing' in which certain statements just disappear from my consciousness - often statements made by Mrs. HolyCoast. It usually occurs when I'm multi-tasking, such as watching TV or blogging while listening to my better half..." -- "Science explains patriarchal monotheism!" ...

My conclusion?

[A]s for the rorschach-blot reactions in the popular press and the blogs, about how this explains why men have a hard time paying attention to women, or why women's speech is more valuable, or why men and women often fail to communicate... Well, what's responsible for these responses is not the STG or the precuneus, it's the limbic system. When people have strong and complex feelings about a topic, research results become a screen for them to project their preconceptions onto.

And now the writers of pop neuro-psychology books like "Why do men fall asleep after sex?" take it as scientifically established that "men listen less because of biology", and that "The male brain … actually has a harder time processing the female voice versus the male voice".

I keep reminding myself that this is all still a step up from the infamous BBC cow-dialects story, in that at least there is an actual published study behind it, even if the underlying research lends no credence at all to the interpretations it's being given. But then again, maybe it's worse. The only social consequence of belief in cow dialects is to help spread the appellation d'origine contrôlée concept to cheese, and who can really object to that?

In any case, for ABC 20/20 to call this "letting science decide" is splendidly ironic.

Posted by Mark Liberman at 07:25 PM

Stop agglutinating, you devious dog!

I'm working on a post (on musical intervals in speech) that's going to take at least one more breakfast-time to complete, so this morning I'll just send you over to TstT, where the Tensor has a terrific summary of one of my favorite stories, Robert Sheckley's "Shall we have a little talk?"

Posted by Mark Liberman at 08:07 AM

September 29, 2006

Secrets of the BBC sexes

I should have known that there would be a BBC science angle to this words-per-day business. David Beaver writes:

At http://www.bbc.co.uk/science/humanbody/sex there's a six part test to see whether you're more female or male (I'm... male). One of the later tests concerns verbal ability, and along with your result on this part of the test is the following text:

Did you know that, on average, women use 15,000 words a day while men use 7,000?

Reader, did you know that, on average, there are 23 different versions of this phony comparison published every day in the world's media? Well, that's not true either -- but I think this is the 11th different pair of values that I've come across recently, ranging from a high of 50,000 vs. 25,000 to a low of 5,000 vs. 2,500. (And in case you're coming late to this discussion, all of these numbers seem to have been made up or copied from someone who made them up -- the many studies that actually count words and correlate the counts with sex find no group difference, or a relatively small difference. When a difference is found, it's usually in the direction of more words from men.)

David reports that the BBC sex-test message continues:

Women took about twice as long as men to end their online instant messenging conversations in a 2003 study of US university students. The study, which was published in the Journal of Language and Social Psychology, also found that women were much more likely to use emoticons (representations of emotions using punctuation marks).

The most popular emoticon was the smiley face :- )

(He observes that "it really said 'messenging', a nice combination of 'messenger' and 'messaging'". This is a fairly common blend -- 432,000 ghits -- but it seems to be a sporadic error rather than an up-and-coming variant.)

The "2003 study of US university students" seems to be Naomi Baron, "See you online", Journal of Language and Social Psychology 23(4) 397-423, 2004. This paper describes its raw materials as follows:

An IM corpus was collected in April 2003 from 22 college-aged students who were attending school or had graduated the semester before the study was undertaken. [...]

The corpus consisted of 23 distinct IM conversations. A total of 9 conversations took place between females (FF) and 9 between males (MM). An additional 5 conversations involved female/male dyads (FM). In a number of the FF and FM conversations, a single student experimenter had conversations with several people on his or her Buddy List. Because several student experimenters withdrew from the project and could not be replaced, most of the MM conversations were between the same two interlocutors.

Taken collectively, the 23 IM conversations contained a total of 2,185 conversational turns, made up of 11,718 words.

The first ironic statistic of the day: the paper reporting on this corpus is 27 printed pages, comprising roughly 11,000 words, so that the paper is about the same size as the corpus it's based on.
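To put the corpus size in perspective, here is a small arithmetic sketch using only the counts quoted above (23 conversations, 2,185 turns, 11,718 words) and the rough 11,000-word estimate for the paper. The variable names are just illustrative.

```python
# Counts quoted from Baron's description of the corpus.
conversations = 23
turns = 2185
words = 11718

# Rough length of the paper itself, as estimated in the post.
paper_words = 11000

turns_per_conversation = turns / conversations
words_per_turn = words / turns
paper_to_corpus_ratio = paper_words / words

print(f"turns per conversation: {turns_per_conversation:.1f}")
print(f"words per turn:         {words_per_turn:.2f}")
print(f"paper/corpus size:      {paper_to_corpus_ratio:.2f}")
```

On these figures the average turn is only a handful of words long, and the paper is a bit over nine-tenths the size of its own data.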

The comparison of conversation length and ending length involved only the 9 FF and 9 MM conversations, with the numbers provided in the table below. The second ironic statistic of the day: since "most of the MM conversations were between the same two interlocutors", the BBC science journalists (or their advisors, if any) are making general statements about "women" and "men" based mostly on a sample of two male college students. OK, I admit, it's a step up from the cows in terms of sampling methodology.

Anyhow, here's the data:

Baron's footnote indicates that the differences in conversation length were "not significant":

Large variances associated with the means on turns per conversations (female to female [FF] M = 121.89, SD = 82.68; male to male [MM] M = 85.22, SD = 88.96) and on minutes per conversation (FF M = 31.33, SD = 25.57; MM M = 18.89, SD = 17.27) rendered no statistically significant differences between gender pairs.

But I have to say that this is beside the point. Even if the observed differences had been statistically "significant", they could not have been scientifically meaningful. The male part of the sample was mostly derived from conversations between just two males (and there were not a whole lot of females)! If you could prove beyond a shadow of a statistical doubt that those particular two male college students had shorter IM conversations than the dozen or so (geographically and socially uniform) female college students in the sample, then what?

The business about turns and time "to close" is about the semi-ritualized exchanges at the very end of conversations. Baron's example:

Gale: hey I gotta run
Sally: Okay.
Sally: I’ll ttyl?
Gale: gotta do errands.
Gale: yep!!
Sally: Okay.
Sally: :)
Gale: talk to you soon
Sally: Alrighty.

I guess it's plausible that American women generally spend more turns and time in closing conversations than American men do, whether in face-to-face interactions, in telephone conversations, or in IM exchanges. However, I don't know whether this is true or not, in fact, and Baron's article doesn't really help us get closer to an answer. It's true that in her corpus, there was a "statistically significant" difference in closings:

For instant message (IM) closings, however, FF/MM differences were significant both for number of turns per closing (FF M = 9.78, SD = 5.12; MM M = 4.29, SD = 1.13), t (14) = 2.77, p = .015; as well as seconds taken to close (FF M = 41.00, SD = 20.12; MM M = 16.29, SD = 15.48), t (14) = 2.68, p = .018.

And from this we can conclude that Baron's two college males used more abrupt IM closings -- at least in messaging each other -- than her dozen college females did. But generalizing from a sample of two guys to American male college students in general is... well, let's say we shouldn't do it. Using this method, we could easily prove a large number of pretty surprising things about any group you care to name.
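The reported test statistics can be roughly checked from the summary numbers alone. The sketch below assumes an ordinary pooled-variance two-sample t-test, which the paper does not name. It uses 9 FF and 9 MM conversations for the length comparison, as stated above. For the closings it assumes 9 FF and 7 MM, a split inferred from the reported 14 degrees of freedom and not given in the quoted passage.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance two-sample t statistic and degrees of freedom,
    computed from summary statistics (means, SDs, group sizes)."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Closings: group sizes 9 and 7 are an ASSUMPTION inferred from t(14).
closing_turns_t, closing_df = pooled_t(9.78, 5.12, 9, 4.29, 1.13, 7)
closing_seconds_t, _ = pooled_t(41.00, 20.12, 9, 16.29, 15.48, 7)

# Whole conversations: 9 FF and 9 MM conversations, as stated in the post.
length_turns_t, length_df = pooled_t(121.89, 82.68, 9, 85.22, 88.96, 9)
length_minutes_t, _ = pooled_t(31.33, 25.57, 9, 18.89, 17.27, 9)

print(f"closing turns:   t({closing_df}) = {closing_turns_t:.2f}")
print(f"closing seconds: t({closing_df}) = {closing_seconds_t:.2f}")
print(f"length turns:    t({length_df}) = {length_turns_t:.2f}")
print(f"length minutes:  t({length_df}) = {length_minutes_t:.2f}")
```

Under these assumptions the closing statistics land within a couple of hundredths of the reported 2.77 and 2.68, and the conversation-length statistics fall well short of the usual two-tailed 5% threshold (about 2.12 for 16 degrees of freedom). None of this touches the real problem: the test treats each conversation as an independent observation, while most of the MM conversations came from the same two people.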

OK, enough. I'll just point out that the BBC science writers use this (in my opinion meaningless) result in a highly misleading way. Consider again what they wrote:

Did you know that, on average, women use 15,000 words a day while men use 7,000? Women took about twice as long as men to end their online instant messenging conversations in a 2003 study of US university students. The study, which was published in the Journal of Language and Social Psychology, also found that women were much more likely to use emoticons (representations of emotions using punctuation marks).

I believe that most readers will interpret that passage to mean that women's IM conversations were found to be twice as long as men's. But the fact of the cited study, you'll recall, was that there was no significant difference in conversation length. The only difference was in the length of closings -- and the data, again, came mostly from two men.

Science journalism is pretty bad overall, but I wonder, is there any corner of it that is worse than reporting on sex differences? Perhaps reporting on animal communication is also in the running for the booby prize.

[Update -- one good result of the sad Foley scandal (aside from a marginal reduction in the proportion of hypocrites on Capitol Hill) will apparently be a doubling (or more) of Baron's sample size...]

Posted by Mark Liberman at 07:57 AM

September 28, 2006

Teaching: cheaper than therapy

Most of the linguists I know have day jobs as teachers at universities and colleges to support their habits of research and writing. I spent a little over forty years in the classroom, first as a junior high English teacher, then six years teaching linguistics to undergrads, and the rest of the years struggling to help grad students become linguists. Shifting briefly into the Rumsfeldian mode, was it satisfying? Certainly. Was I successful? Hardly. Do I feel good now that it's over? You bet, although it's never really "over." Teaching goes on and on, as illustrated by a recent article in the New York Times (see here).

This article tells us about a CCNY political scientist, Stanley Feingold, now 80 and long since retired, who keeps on keeping on. He doesn't teach in the classroom anymore but his former students still seek his counsel. Well, five times a year anyway at their luncheons in New York, when they get him to fly from his home in Seattle to hold still more seminars with their honored professor. Really neat, huh?

Looking back now, I find that I absorbed most of my own learning from professors in whose classrooms I never had the opportunity to sit. Their books and articles, papers given at conferences, correspondence, and informal discussions did the job for me. Often it wasn't the linguistic content that mattered most. It was their attitude, their excitement, their advice, and their ways of expressing ideas. I've never met Professor Feingold but I'll bet that he must have been one of those who could communicate these qualities exceptionally well.

In today's world we honor publication and research (as we should, of course) but we don't usually think about the other kind of teaching we got, which is often the major cause of how we got where we are. Information sources are now abounding but attitude, advice, and encouragement come from good teachers, giving hope for the importance of teaching and the continuing future of the classroom setting. I can list my own examples of linguistics teachers who made right-angle turns in my life. Most of us can, if we take the time to think about it.

I disagree, however, with Professor Feingold's regrets about choosing teaching as his career. He says that his decision to be a teacher was a mistake because, in his words, "I know of no other profession where your on-the-job performance counts so little." Maybe that's because we do an inadequate job of measuring on-the-job performance in our universities. It does count, but it's not always recognizable in the standard course evaluation forms collected at the end of a term.

On-the-job performance might be better measured by the way students' lives are changed. And as for what the teacher gets out of it, as Professor Feingold puts it, "teaching is cheaper than therapy."

Posted by Roger Shuy at 01:53 PM

Demographic Prediction

David Palfrey has drawn my attention to Microsoft adCenter Labs' "Demographic Prediction" tool, which allows you to "use adCenter technology to predict a customer’s age, gender, and other demographic information according to his or her online behavior—that is, from search queries and webpage views."

OK, let's see if it works with basic stereotypes:

So far, so good. What about Language Log?

As I interpret what the adCenter Labs page says about this, they're predicting that the people who search for and/or read Language Log are youthful, and evenly balanced as to sex:

General Distribution is the breakdown by age of MSN Search users—based on a one-month MSN Search log—regardless of search query used.

Predicted Distribution is the predicted breakdown by age of MSN Search users for a single search query, based on the adLabs predictive model.

Well, if this is true, it's good news for the future of the field of linguistics.

But guess what? It turns out that interest in math is strongest among mature females:

This is clearly due to the role of differential equations in fending off attacks from ticked-off cavemen.

[Update -- Theo Vosse offers the results of some further explorations:

A few tries in the reliability of the adLab tool (Male - Female percentages):

http://www.google.com: 43 - 57
http://www.cnn.com: failed to predict for URL
http://www.bbc.co.uk: 61 - 39

Is Google female and the BBC male? And CNN neuter?

Simple morphological variation:

sister: 52 - 48
sisters: 28 - 72

Synonyms and plurals:
struggle: 67 - 33 struggles: 42 - 58
battle: 61 - 39 battles: 57 - 43
fight: 59 - 41 fights: 62 - 38
competition: 57 - 43 competitions: 30 - 70
match: 50 - 50 matches: 48 - 52
conflict: 50 - 50 conflicts: 30 - 70
contest: 44 - 56 contests: 32 - 68

Class variations:
cherry: 52 - 48
pear: 45 - 55
banana: 41 - 59
strawberry: 27 - 83
fruit: 35 - 65

Named entities:
travolta john: 56 - 44
travolta j: 53 - 47
j travolta: 51 - 49
travolta: 45 - 55
john travolta: 36 - 64

I wonder how you should interpret that...

So "competitions" is 30-70 female, while "struggle" is 67-33 male. Found poetry, or random noise? We report, you decide.]
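One quick way to probe figures like these is to check that each male/female pair adds up to 100, and to see how far the predicted female share moves between a singular and its plural. The sketch below copies the query-term numbers exactly as reported above; the variable names are just illustrative.

```python
# (male %, female %) for each query, as reported in the update above.
reported = {
    "sister": (52, 48), "sisters": (28, 72),
    "struggle": (67, 33), "struggles": (42, 58),
    "battle": (61, 39), "battles": (57, 43),
    "fight": (59, 41), "fights": (62, 38),
    "competition": (57, 43), "competitions": (30, 70),
    "match": (50, 50), "matches": (48, 52),
    "conflict": (50, 50), "conflicts": (30, 70),
    "contest": (44, 56), "contests": (32, 68),
    "cherry": (52, 48), "pear": (45, 55), "banana": (41, 59),
    "strawberry": (27, 83), "fruit": (35, 65),
}

# Any pair that does not total 100 is probably a transcription slip.
bad_sums = {q: m + f for q, (m, f) in reported.items() if m + f != 100}

# Change in predicted female share (percentage points), singular -> plural.
plural_of = {"sister": "sisters", "struggle": "struggles", "battle": "battles",
             "fight": "fights", "competition": "competitions",
             "match": "matches", "conflict": "conflicts", "contest": "contests"}
plural_shift = {s: reported[p][1] - reported[s][1] for s, p in plural_of.items()}

print("pairs not summing to 100:", bad_sums)
for word, shift in sorted(plural_shift.items(), key=lambda kv: kv[1]):
    print(f"{word:12s} {shift:+d}")
```

The only pair that fails the sum check is "strawberry" (27 - 83), so at least one of those two numbers was presumably mistyped somewhere along the way. The singular-to-plural shifts range from a few points in one direction to more than 25 in the other.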

Posted by Mark Liberman at 07:52 AM

If it's a whistle, the dogs aren't hearing it

A couple of days ago, in response to my post "The comma was really a dog whistle" (9/26/2006), Josh Jensen wrote:

A very conservative evangelical, I would never have associated Bush's 'comma' reference with the period/comma saying, though I don't doubt that one of his speech writers got the idea there. Perhaps I'll take the quotation to a seminary class tomorrow (a Greek class, likely to have generally well-educated Evangelicals in it) and then to work (a Christian adoption agency) to find out whether anyone hears the whistle. (The experiment may only prove that we're all the wrong kind of Evangelical, though I suspect that we all voted for Bush.)

(Look at the bottom of the earlier dog-whistle post to see the rest of Josh's comments, including a link to an interesting G.K. Chesterton essay "On the Cryptic and the Elliptic".)

Today, Josh wrote back with a report on what he found.

I talked with 12 people today, all conservative Evangelical (Presbyterian, Southern Baptist, and independent Baptist).

I read Bush's statement and made sure that the person understood the context. Then I asked, "Does the comma reference bring to mind any well-known sayings or images?"

No one came up with any version of "Our periods are God's commas," and no one felt that they would have come up with that on their own.

Some other numbers:

The group was evenly divided by sex (6 female, 6 male).
3 have Ph.D.s (O.T. Interpretation, N.T. Interpretation, Psychology).
1 has a master's degree (MPH).
3 are working on master's degrees.
2 never graduated from college but work as professionals.

Half indicated that they recognized the saying once I revealed the secret code.

Three listened to the Bush quotation and then said, "Like a blip [on the screen]?" (Perhaps Bush is sending coded messages to radar operators.)

A bit later, Josh sent in some additional results:

Perhaps I've run this into the ground, but a few more observations from the very conservative:

One person said that Bush's comment was the most idiotic thing she'd ever heard (she wasn't objecting to using a comma as a metaphor, but to Bush's minimizing the current situation in Iraq).

One person called Wolf Blitzer a bully and suggested that Bush needs a speaking coach.

One person objected to Bush's goal of spreading democracy.

At least three people thought it would be cool if Bush were sending them secret messages, but they didn't indicate that they'd ever received any.

By occupation: two work for a Christian mission board; three are students in seminary classes; two are seminary professors; four are adoption professionals; and one is an accountant.

The fact that President Bush said

I like to tell people when the final history is written on Iraq, it will look like just a comma... [emphasis added]

suggests that the "comma" metaphor was (perhaps a slightly garbled version of) one of the president's talking points on Iraq. And whether he came up with it himself, or got it from a speechwriter or other spinmeister, there's a good chance that Gracie Allen's proverb and its spread by the UCC played some role in evoking the idea. But the "political dog whistle" theory is looking about as plausible as Leonard Sax's story about why girls think their fathers are yelling.

Posted by Mark Liberman at 01:05 AM

September 27, 2006

On getting to know the secretary

Mark's posts (here and also here) about exchanges between males and females in law offices prompt me to report that informal conversation also works well between outside male consultants and the lawyers' female secretaries or administrative assistants. In my 30-some years of working (long distance most of the time) with various law firms around the country, I've learned to get to know the secretaries well enough for us to call each other by our first names and to engage in some small talk before they put me on the phone with their bosses. This really pays off when I need to get the lawyer's (male or female) attention about something important but, perhaps most of all, when I want to get my bills paid. Some lawyers neglect invoices until reminded several times. Don't bother calling or even writing them. It's practical to get to know the secretary. Anyway, small talk can be kinda fun.

Posted by Roger Shuy at 04:44 PM

Sex, status and law-office culture

A reader from the midwest has responded to an earlier post ("Stereotypes and facts"), where I asked whether keeping up with colleagues' personal lives is something that comes more naturally to women than to men:

My father, husband, and I are all attorneys. My father and husband both work in offices with a fairly equal division between male and female attorneys, both at junior and senior levels, but in both offices all the secretaries are female. They both have discovered there's a certain advantage to the female attorneys who more easily discuss personal lives with the secretaries; the secretaries are more willing to save your ass if you remember their birthdays or talk about their kid starting kindergarten ... or if they know your spouse is sick and you want to get home. They also, of course, have a gatekeeping function and some ability to control who gets through to you on the phone, and who talks to new clients. I don't think any of this is malicious or manipulative; just that if Stacy Secretary knows that Annie Attorney is having a rough week because her husband has pneumonia, she'll be more likely to not put through the client from hell, whereas she might put through such a client to Arnold Attorney who is also having a bad week, but Stacy doesn't know about it.

I'm sure a great deal of this is cultural (I know that in real life, my husband is waaaaaaay better than me at this kind of social small talk and remembering people's birthdays and children and pets and things) and that women are often simply more comfortable chatting with other women.

But both my father and my husband make a concerted effort to engage in small talk with the secretarial staff at their respective offices, and some of the other male attorneys wonder why husband and father's secretaries are willing to stay late to help finish a filing, while their own secretaries are off at the dot of five. There's a definite career advantage to learning to navigate the "other side of the gender divide" in their offices.

On the other hand, my brother, also an attorney, practices in an office where at least 3/4 of the attorneys are male, and chatting with the female secretaries is seen as unprofessional and unimportant. In that office, the bigger advantage is to women attorneys who hold themselves aloof from the secretaries.

I'm in solo practice, so I don't have a secretary at present, but I also find that when I drop by my husband's office, if I stop to chat with his secretary and mention that I heard her kid is playing soccer this year ... he gets a bump from that, too. And when I DID work in offices with secretarial staff, the attorneys who were social with the secretaries DEFINITELY got their documents done faster and the "intelligence" from the secretarial staff passed on to them. ("Bill Boss is having surgery next week, so he's really touchy. Just FYI.")

Might be an interesting study, to look at lawyers or doctors (where the gender numbers are equalizing) and their relationships with their staffs, which are still largely or exclusively female. (And potentially removing the science/humanities divide you might find in an engineering office; in a non-patent law office, most of the lawyers and secretaries typically majored in the liberal arts.) Also because in the highest levels of both professions, you'll still have male-heavy leadership (even male-exclusive leadership), so you could get some really interesting comparisons between male/female equalized offices and male-dominated offices.

Posted by Mark Liberman at 02:40 PM

I missed it

A striking example of synchronicity -- September 24th was National Punctuation Day.

Roger Shuy pointed out that the NPD web site includes the punctuation of the expression, "do's and do'nts."

And David Beaver theorized that this is to avoid any potential for confusion of "does 'n don'ts" with "dozen donuts."

Posted by Mark Liberman at 08:52 AM

September 26, 2006

Commas biomedical, theological and poetical

From the midwest, Jonathan Lundell offers a new take on comma/period semiotics:

I'm in Minneapolis today visiting family, and just saw a billboard promoting the cardiac care department of a local hospital:

    heart attack.  or  heart attack,

From the middle of the 17th century, Samuel Sheppard's Epigram 31, "Disorder the fore-runner of Ruine" [from Epigrams theological, philosophical, and romantick (1651)] attributes periods as well as commas to the divine plan, though not in a way that will provide any comfort to those concerned about the situation in Iraq:

Both bodies Politick, and Naturall,
By this ill-shaped enemy doe fall:
Christendomes whip, who now doth soare so high,
By this in her own ruine low shall lie,
Factions those Comma's are, ordain'd by God,
When he'l bring Kingdomes to their period.

And in Aram Saroyan's 1998 "How to be an American poet", commas set off brain-storms:

Moreover, you have a select group who see the comma
As the way in, and out, of all poetic reality. Such
Poems, hunched with the determination to forge an
Electric pattern through plain talk, sometimes de-
Light the mind into déjà-vus, or cause electric storms
In the living rooms of the brain. The heart's telephone
Goes on ringing though, and there is no one to answer
The call. Not the baby nor the kids nor the Mommy
Nor the Daddy nor the neighbors nor the whole town
In the full moon of Grandfather Night. But the next
Morning the birds begin on time, and this is what we
Must remember, what we must hold on to in the terrible
Disorder of our century, the madnesses and absolutes---
Those birds are simple. And being simple, they are
Naturally excellent poets.

Posted by Mark Liberman at 11:54 PM

The comma was really a dog whistle

That's the theory of Ian Welch at The Agonist. According to him, when President Bush said that the Iraq war would "look just like a comma" to future historians, he wasn't using a creative and unexpected metaphor -- he was evoking a well-known proverb that urges steadfastness, "Never put a period where God has put a comma."

This being Language Log, of course we're going to check the numbers. And there are 440,000 Google hits for {period God comma}, mostly indeed variants of this expression:

Don't put a period where God has put a comma.
Never place a period where God has placed a comma.
If we stop there we are placing a period where God has placed a comma.
Never put a period, where God has put a comma.
Don't put a period where God puts a comma.
Don't put a period where God put a comma.
Don't place a period where God intended a comma.
God’s period is what allows our lives to have commas.
...we must be alert to the caution Gracie Allen left us not to put periods where God has put commas.
Today’s Bible stories are both about God putting a commas where humans might be tempted to put periods

Ian Lynch "on behalf of the Commission on Communication, Massachusetts Conference, United Church of Christ" has developed a version of this phrase into a "series of Lenten litanies", under the title "Ellipses and Commas; A Punctuated Lenten Journey":

The comma as a symbol of this campaign comes from the quote by Gracie Allen, “Never place a period where God has placed a comma.” The concept of this series is to slowly build this complete sentence through the seven Sundays of Lent and Easter.

And this sermon by Larry Reimer, dated May 7, 2006, reviews the whole story:

About five years ago our national denomination, the United Church of Christ, was looking for a phrase to define itself. They found the perfect words from Gracie Allen, the wife and comic partner of comedian George Burns.

Gracie Allen was a brilliant and perceptive woman. She left a message in her papers to be discovered by her husband after her death that has become the motto for the United Church of Christ: “Never put a period where God has placed a comma.”

Gracie was encouraging George to remember that life had many chapters. George was 68 when Gracie died. Rather than place a period after his career, Burns went on to star in a number of movies, including playing God, twice. He even headlined at Gator Growl in the 1970’s. He died at age 100, having lived the life of the comma.

Way back when the Pilgrims sailed from Holland to the new world on the Mayflower, their pastor John Robinson, who was forbidden to go with them, sent them off with another momentous phrase, “There is yet more truth and light to break forth from God’s holy word,” a forerunner of the comma.

The Pilgrims became the Congregational Church. The Congregational Church merged with the Evangelical and Reformed Church in 1957 to become the United Church of Christ. Today, we in the United Church of Christ believe that God is still speaking. We gather as a church as our compact says, “to learn from our religious heritage, yet to grow by seeking new dimensions of truth.” William Sloane Coffin, Jr. said “Hell is truth seen too late.”

In grammar, a period is a place where a thought stops dead. A comma is a pause to take a breath and then pick up the thought again.

We pride ourselves here at UCG on being people of the comma. We love our own motto, “It’s not like this every Sunday!” which we say every Sunday. But pride, as they say, goeth before the fall. We can become a little full of ourselves about our wonderful comma-ness.

Wise words. Except for the part about thought stopping dead. I mean, that's going to make it hard to compose a coherent paragraph, not to speak of a whole sermon.

Anyhow, Ian Welch is obviously right about the source of President Bush's comma, and Ken Layne was wrong. It was religion, not drugs.

But why is this allusion a "dog whistle"? Welch argues that President Bush

is constantly littering his speeches with code words and phrases meant for the religious right. Other people don't hear them, but they do, and most of the time it allows Bush both to say what those who aren't evangelical or born again want to hear, while still reassuring the religious right wants to hear.

For example, one of the most famous episodes of this was Bush's reference in the 2004 debates to the Dred Scott decision. Most people couldn't figure out what the heck he was talking about - it seemed like a non-sequitur. But, as Paperwight pointed out at the time, anti-abortion activists see themselves as similar to anti-slavery activists. And they take heart that eventually Dred Scott v. Sandford was overthrown. [...]

The other name for this is dog whistle politics. When you blow a dog whistle humans can't hear it, but the dogs sure can. It's a pitch higher than humans can hear. When you speak in code like this, most of the time the only people who hear and understand what you just said are the intended group, who have an understanding of the world and a use of words that is not shared by the majority of the population. So it allows you to send out two messages at once - one pitched for the majority of Americans, the other pitched for a subgroup. This goes on all the time, and usually it isn't caught - most people don't hear it, and the media is made up of people who can't make the connections because they don't belong to these subgroups. So they can't point out the subtext either.

It's very effective, and it's one reason why Bush still has his hard core of support - he's constantly reassuring them, at a pitch the rest of us can't hear.

It seems to me that this is true on one level, and profoundly unfair on another. We all "constantly litter" our speech and writing with messages that will be fully received only by those who share our verbal and conceptual associations. But we don't usually do this in order to create a Straussian double message, an esoteric wolf in an exoteric sheepskin. We do this because we can't help it, it's how language works, and also how thought works.

New ideas and new discourses are built out of fragments of old ones. As a result, almost everything that we say or write is a "dog whistle": even if the basic meaning is clear to everyone, some people will pick up on implications that are lost to others. And that's just as true for Hillary Clinton -- and for Ian Welch, and for Ken Layne, and for me -- as it is for George W. Bush. At least once a week, I get an email from someone who has seriously misunderstood something that I wrote, not because I expressed it badly (well, that happens a lot, too), but because they missed an association or an allusion, or because they made an association or saw an allusion that I never intended.

The lesson to draw from this episode is not that our president is using his word choices to send coded messages, but that people with different life experiences sometimes receive very different messages from the same text. In the context of the president's answer to Wolf Blitzer's question about Iraq, "comma" was either a reassuringly familiar cliché for steadfastness, or an unexpected, bizarre and inappropriate metaphor -- depending on how you've been spending your Sundays over the past five years.

[Comma tip from Edward Wilford]

[Update -- Rob Sears writes:

Interesting post -- but there is much more to dog whistle politics. The point is that Bush knows that explicitly religious references will cause a fuss among certain audiences (us Europeans will think he is even less rational than we already do, for instance), and so he or his advisors deliberately find ways to get across implications to their base, that will be lost on others. The basis of the charge is that it cannot be coincidence that so many of Bush's religious reference are shrouded. If he spoke naturally, the ratio of explicit to hidden religious invocations would be much higher. (Obviously, it would be a big job to prove this.) So when you say:

"As a result, almost everything that we say or write is a "dog whistle": even if the basic meaning is clear to everyone, some people will pick up on implications that are lost to others. And that's just as true for Hilary Clinton -- and for Ian Welch, and for Ken Layne, and for me -- as it is for George W. Bush. "

...it is missing the point, at least for you, I'm sure, and maybe for the others. You are not deliberately concealing your meaning from a particular group of readers, as Bush is doing from Godless types. If a particular group doesn't get something, that is not because you were deliberately trying to keep something secret from them. You are trying to be clear, not to sneak something past readers.

I hope this distinction is clear, even if the charge against Bush remains unproven.

But if you believe that President Bush chose the word "comma" in this case in order to send a coded message to evangelical Christians, you're giving him credit for a degree of linguistic subtlety that is, let's say, in conflict with the current stereotype. And Kenny Easwaran observes that "religious" doesn't mean "right(-wing)":

It seems to me very improbable that Bush was referring to the phrase "Never put a period where God has put a comma" as a secret reference for the religious right. This is primarily because that phrase is associated with the United Church of Christ, which as far as I can tell is part of the religious left. We've got their churches all around Berkeley (or at least, one very big one right near campus), and I always took their motto to mean that the Bible is not the complete source of divine law, but that it changes to reflect changing sensibilities over time. I've always imagined this is one of the reasons the UCC supports gay marriage. A very unlikely audience for Bush to be targeting indeed.

I agree that this makes it seem even less likely that the word was consciously intended as a "dog whistle" message. But Google provides some reason to think that the "comma" meme has spread beyond the UCC, and it does seem plausible that some version of this saying was in the president's mind when he chose this phrasing, instead of calling Iraq a "bump in the road" or any of the other commonplace phrases for a temporary problem.]

[John Brewer adds some depth to the discussion:

I was intrigued by your post on the possible religious subtext of the "comma" remark, but am left a bit uncertain as to who the dog-whistle-detectors might think the President was signaling to. I would think if you asked most scholars of the contemporary American religious scene to name the major Christian denomination that had the least in common with the so-called Religious Right, the United Church of Christ would probably be the consensus pick, at least if the Unitarians were excluded due to lack of self-identification as Christian. (The UCC's most recent walk-on role in national politics is as the group Howard Dean joined after he left the Episcopal Church due to a dispute over a bicycle path.) Indeed, the UCC seems to use the comma slogan as a way of distinguishing itself in the marketplace from more traditional Christians, who actually believe they are in possession of the capital-T Truth followed by a full-stop period. Like the sequence of periods in God said it. I believe it. That settles it. By contrast, the UCC approach might be caricatured by its opponents, if they were given to linguification, as having lots of commas: God is said to have said it, but it strikes me as outdated, embarrassing, and in conflict with the teachings of the New York Times, so I'm eager to hear a rationale for why it's not applicable to me. Others might phrase that less pejoratively, but the point is not a point about steadfastness, but about continuing revelation, or at least an ecclesiastical analogue to a Whiggish and/or Hegelian theory of history.

Now it's certainly possible that the comma/period contrast is also used in an entirely different way in evangelical circles which might be more appropriately linked to the Religious Right. I haven't looked into the context of your other Google results. Like the American political right, the American RR is subject to caricature as an interesting melange of doom-and-gloom condemnation of sin and decadence, on the one hand, and upbeat motivational-speaker hokum, on the other, and this could be an example of the latter. But from that perspective (to beat the metaphor into the ground), the hope that we're still in the middle of the sentence and the difficult present circumstances may improve before the sentence ends would be based on the faith that God's previously uttered Word has been punctuated with a definitive period guaranteeing that it will not be contradicted by a subsequent clause. "God's period is what allows our life to have commas," from your search results, makes that point. But that's not the UCC's point.

]

[And Ben Zimmer adds some historical perspective:

The Youth's Companion, July 31, 1919, p. 412:
Don't get to thinking in ultimate terms too quickly about life, my dear. There are not so many finalities in life as you young folks think. Remember the old saying, "Man's periods are God's commas."

]

[Josh Jensen writes:

I enjoyed the comma and dog whistle post -- and the post-scripts. My own observations:

A very conservative evangelical, I would never have associated Bush's 'comma' reference with the period/comma saying, though I don't doubt that one of his speech writers got the idea there. Perhaps I'll take the quotation to a seminary class tomorrow (a Greek class, likely to have generally well-educated Evangelicals in it) and then to work (a Christian adoption agency) to find out whether anyone hears the whistle. (The experiment may only prove that we're all the wrong kind of Evangelical, though I suspect that we all voted for Bush.)

So I think you're right to be reluctant in accepting the dog whistle theory. No doubt Bush talks in certain ways because of his Evangelical roots (and his speechwriters'), and no doubt he's at least sometimes careful to restrain himself from making specifically Christian references. But there have to be limits to the value of judging an author's or speaker's secret agenda based on his coded allusions (one thinks of the ink spilled to prove that Shakespeare was a closet Catholic trying to give courage to fellow Catholics). My hunch is that very few people -- even committed and well-educated ideologues -- pick up on subtle allusions in speeches, and only journalists (and politicians' ideological opponents) actually read political speeches after they've been transcribed. (Arguably, journalists don't read the speeches either, but only search the texts for striking metaphors and shocking claims that can be excerpted for article leads.)

(Josh reports back on what he found here.)]

[Jack Collins points out that God dictated or inspired little or no punctuation in the original biblical texts:

From the perspective of a religion student, I always found the "where God put a comma" slogan curious, since, of course, in the original Hebrew and Greek, the Bible texts had no punctuation to speak of, and that the current versification and punctuation, if in the critical editions of the original language texts, are the result of later traditions. Indeed, the idea of commas and periods is kind of alien to Biblical Hebrew, where verbless clauses often stand on their own and the boundaries between sentences are blurred. Even the Masoretic pointing (which is medieval in origin) doesn't have a clear equivalent to commas and periods, instead using a system of accents, pauses and vowel changes to indicate conjunction and disjunction between words and clauses.

But Daphna Shezaf observes that things are different in modern Hebrew usage:

I thought you would like to know that for modern Hebrew speaker, there's nothing odd about Bush comma metaphor. We use comma quite a lot to designate small things, or things of no or little significance. My dictionary dates this use back to the 1950's literature, which is almost ancient history for modern Hebrew. Specifically, some form of "all this will just be a comma in history books", is a rather common idiom (Googling it gave about 100 results, which is a lot considering the number of Hebrew web pages and the difficulties that Hebrew morphology and spelling pose on searching).

In fact, this meaning of "comma" is so natural for me, that I was not sure there wasn't some of "double pun" I was missing in all your posts on the subject. But now I have consulted a friend who is native English speaker. He suggested this is either a very spooky coincidence, or Bush has linguistically blew his cover: he must indeed be an undercover Mossad operative, after all.

The plot thickens.]

Posted by Mark Liberman at 06:53 AM

September 25, 2006

The comma has legs, but the dagger is silent

The best riff on Iraq-as-a-comma was by Ken Layne at Wonkette, who displayed this map, along with the suggestion "How about hitting the bong after the interview?"

Ken added that "Meanwhile, some retired top guns from the Pentagon are on the Hill today demanding Rumsfeld be considered an asterisk … er, 'sent packing.'"

In second place, I guess, was Greg Mitchell at Editor & Publisher, who suggested that

[O]ne can think of other punctuation that might be apt, including "?" for the 140,000 Americans still deployed there, "!" for the cries of the gravely injured, and "$" for Haliburton and other contractors.
Or perhaps, as in the comics pages, when an angry character really wants to curse: "!@#%^&*()#*"
But I'd like to offer one more, the simple period, to replace the hopeful comma. Below you will find some 2,700 periods, each standing for an American life lost in Iraq.

By chance, I read that just before this:

[Update: the comma comes from Gracie Allen! Never mind how the future histories of our time will be worded, it's becoming increasingly clear to me that the script for current events is being written by a team that includes Dorothy Parker, Thomas Pynchon and Christopher Buckley.]

Posted by Mark Liberman at 07:06 PM

Avoiding passive for dummies


Diane Steele, publisher of the Dummies series (over a thousand titles beginning with DOS for Dummies in 1991), explains to Rachel Donadio ("Dumbing Up" in the NYT Book Review, 9/24/06, p. 31) how the books are put together:

The editorial team, based in Indianapolis, gives authors a kind of "Dummies for Dummies" manual and a computer template.  "Copy editors do the line editing and Dummifying," Steele said.  "It's a word we use to talk about how to make text comply with our style guide."  The approach is strict.  "We address the reader as you -- you can, next you do this -- we don't talk about we," she said.  "We try to be funny, or at least lighthearted."  Furthermore, Steele said: "We don't use future tense, we don't use passive voice, we don't have long chapters.  A 26-page chapter is getting pretty long."

Yes, Avoid Passive.  (Also Avoid We and Avoid Future, which we haven't discussed here.)  But sometimes you really want a passive.

According to Steele, Dr. Alan Rubin, author of Diabetes for Dummies

said he had some friendly discussions with his editors about the passive-voice rule.  "Sometimes I'll write something like 'the patient was comatose and was given thyroid hormone,' and they'll change that to 'the patient was comatose and took thyroid hormone,' " Rubin said.  "I have to tell them these are extremely sick patients, they can't take care of themselves, they have to be passive whether Wiley likes it or not."

Ok, the patient was passive (comatose, in fact), but does the sentence have to be?  (Yet another demonstration of why the technical term passive is not such a great choice.)  Of course not.  It could be recast as something like "the patient was comatose, so the doctor gave her/him thyroid hormone", though that's longer and also introduces the doctor as an important participant in the story.  There are ways we -- oh, sorry, you -- can avoid passive and keep the sentence short: "the patient was comatose and got thyroid hormone".  The VP "got thyroid hormone" in this version is not passive in form, true, but it also takes subjects denoting a recipient, rather than an agent, so if you dislike the passive because you want agentive subjects, this version won't really make you happy.  But then "was comatose" doesn't take agentive subjects either, and it's hard to see how you could convey the coma information with a VP that takes agentive subjects; you can devise non-copular VPs -- "lapsed/fell into a coma", for instance -- but their subjects denote affected persons rather than agents.  (Deliciously, a fairly standard technical term for an affected participant in an event is patient.  Yes, "lapsed into unconsciousness" and "fell sick" are VPs taking patient subjects.)

This would be a good time to remind readers that the advice literature is inclined to confuse syntactic functions (like subject and direct object) and participant roles (like agent and patient).  Granted, the world would be simpler if you could get right from syntax to meaning -- if, say, subjects always denoted agents in events -- but this is very much not the world we live in, and we just have to get used to working with two different sets of concepts and separate sets of terminology.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at 11:01 AM

Comma

On CNN's The Situation Room, aired 9/20/2006, Wolf Blitzer interviewed President George W. Bush:

BLITZER: We see these horrible...
BUSH: Of course you do.
BLITZER: ... bodies showing up, torture, mutilation.
BUSH: Yes.
BLITZER: The Shia and the Sunni, the Iranians apparently having a negative role. Of course, Al Qaeda in Iraq still operating.
BUSH: Yes, you see it on TV. And that's the -- that's the power of an enemy that is willing to kill innocent people. But there is also an unbelievable will and resiliency by the Iraqi people.
Twelve million people voted last December. Admittedly, it seems like a decade ago. I like to tell people when the final history is written on Iraq, it will look like just a comma, because there is -- my point is, there is a strong will for democracy.
[emphasis added]

In the comments at the Huffington Post, "zave" observed:

For more on commas, see: "Eats, Shoots & Leaves Iraq"

I was going to say that most of the people who get that joke are probably in favor of leaving, but on reflection, that's probably wrong. Neo-conservatives are just as likely to be interested in militant fussiness about punctuation as realists and isolationists are, I suppose. And the joke is equally good (or bad) from just about any political perspective.

Back in 2003, I heard an informal talk by a political scientist on the prospects for the American occupation of Iraq. His central point was that the rebuilding and transformation of Iraqi society was likely to take a long time, cost a lot of money, and go through some difficult stretches. Given that prospect, he expressed skepticism about whether the American public and the American government would be willing to stick it out, and suggested that the aftermath of American intervention could be pretty ugly if they weren't. I was more optimistic at the time, but (outside of Iraqi Kurdistan) it's certainly starting to look as though "eats, shoots and leaves" might be all too appropriate a context for that "comma".

[Hat tip: David Donnell]

Posted by Mark Liberman at 07:23 AM

September 24, 2006

Stereotypes and facts

"Many women find biological comfort in one another's company, and language is the glue that connects one female to another." That's what Louann Brizendine wrote in The Female Brain, and this idea resonates with many people, including me. At least, I think I recognize the corollary illustrated in this Dilbert strip (9/21/2006):

Perfect, right? Dilbert was focused entirely on the job -- on the numbers! Tina immediately thinks -- and asks! -- about the striking events of Yvonne's personal life. Maybe Dilbert didn't even know about the sextuplets, the fire, and the surgery, or maybe he knew and didn't ask for an update. Either way, he's a poster child for what Simon Baron-Cohen identifies as the "systemizing" tendencies of the male brain, while Tina represents the "empathizing" tendencies of the female brain.

I've recently devoted some effort to debunking exaggerated or outright false characterizations of sex differences. For example, the "biological comfort" quote from The Female Brain was followed by a series of striking overgeneralizations, unsupported by the references given in Brizendine's end-notes, as discussed in mind-numbing detail in "The main job of the girl brain" (9/2/2006). A list of other Language Log posts debunking brain-sex claims can be found here. And in today's Boston Globe Ideas section, I have a short piece fact-checking a couple of specific speech-related claims from The Female Brain ("Sex on the brain", 9/24/2006).

But that Dilbert strip has a particular resonance for me, because I worked for 15 years in an industrial research lab where most of the engineers were male, and all of the secretaries were female. News about people's lives generally reached me through the secretaries. I left that job 16 years ago, but I still feel that keeping up with colleagues' personal lives is something that comes more naturally to women than to men.

Is that feeling a valid one? In the lab where I worked, sex was largely conflated with job description. And in Dilbert's department, Tina is the female technical writer in a culture of mostly-male engineers, so sex is also conflated with humanities-vs.-engineering educational background. When I reflect on it, I can certainly think of many individual women who are all business in the workplace, and many individual men who are endlessly curious about everyone else's lives, but the gender stereotype implied by the exchange between Dilbert and Tina still rings true.

I don't know of any studies specifically bearing on how likely men and women are to know or to ask about the personal lives of co-workers. But you could make some inferences about this from Baron-Cohen's idea of an "empathizing quotient" and a "systemizing quotient", and the results of research measuring these values (operationalized in terms of scores on a self-rating questionnaire with scalar answers).

You can go to the EQSQ site and take the tests to place yourself on these scales. You probably won't be surprised to learn that the results of giving the tests to various groups of subjects show a difference between the average scores of women and the average scores of men -- along with a great deal of overlap between the groups. One graphical representation of the data is this scatter plot from Simon Baron-Cohen, Rebecca C. Knickmeyer, and Matthew K. Belmonte, "Sex Differences in the Brain: Implications for Explaining Autism", Science 310 (5749) 819-823, 2005:

Fig. 3. SQ scores versus EQ scores for all participants, with the boundaries for the different brain types

(The red diamonds are individual females, the blue triangles are individual males, and the green squares are individuals in a group diagnosed with Asperger's Syndrome or High Functioning Autism.)

There's more to say about S-E theory, but for now, let's just observe that as usual, the actual distributions for men and women overlap quite a bit. Here's a plot of the overlap in SQ-R distributions, from S. Wheelwright, S. Baron-Cohen, N. Goldenfeld, J. Delaney, D. Fine, R. Smith, L. Weil and A. Wakabayashi, "Predicting Autism Spectrum Quotient (AQ) from the Systemizing Quotient-Revised (SQ-R) and Empathy Quotient (EQ)", Brain Research 1079(1) 47-56, 2006:

(The "R" in SQ-R stands for "revised" -- the authors modified the original SQ test items "to provide a greater coverage of social systems and domestic systems, not just mechanical or abstract systems", "because items in the original SQ were drawn primarily from traditionally male domains", rendering the prediction of greater male systemizing a circular one.)

The same article gives EQ and SQ means and standard deviations for various populations, from which we can calculate effect sizes:

On the EQ, females in the general population score 47.2 (SD = 10.2), which is significantly higher than the male mean of 41.8 (SD = 11.2), while people with ASC score significantly lower than typical males, with a mean score of 20.4 (SD = 11.6). On the SQ, typical males score a mean of 30.3 (SD = 11.5), which is significantly higher than the mean for typical females of 24.1 (SD = 9.5). People with an ASC score significantly higher than typical males with a mean of 35.7 (SD = 15.3). Finally, on the AQ, not surprisingly, people with ASC have the highest AQ scores (mean 35.8, SD = 6.5), but consistent with predictions, typical males score higher (mean = 17.8, SD = 6.8) than typical females (mean = 15.4, SD = 5.7).

In the case of the EQ (empathizing quotient) that's an effect size of d=0.50 for the difference between women and men; for the SQ (systemizing quotient) we get an effect size of d=0.59. So these are moderately large effects -- but they are by no means categorical properties of women and men.
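For readers who want to check that arithmetic, here is a minimal sketch of the effect-size calculation. It assumes Cohen's d with a pooled standard deviation that weights the two groups equally; the quoted passage gives no group sizes, so that weighting is an assumption, and the function and variable names are mine, not the paper's.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using an equal-weight pooled SD (assumption: group sizes unknown)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Means and SDs as quoted from Wheelwright et al. above.
eq_d = cohens_d(47.2, 10.2, 41.8, 11.2)  # EQ: females minus males
sq_d = cohens_d(30.3, 11.5, 24.1, 9.5)   # SQ: males minus females

print(f"EQ d = {eq_d:.2f}")
print(f"SQ d = {sq_d:.2f}")
```

A pooled SD weighted by the actual sample sizes would shift these values a little, but probably not by much, since the male and female SDs are close to each other.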

Another way to look at the meaning of these numbers is to ask how a randomly selected female and a randomly selected male will compare in terms of scores on these questionnaires. With respect to the empathizing test, the woman will score higher about 64% of the time, while the male will score higher about 36% of the time. For the systemizing test, the random man will score higher about 67% of the time, while the random woman will score higher about 33% of the time. (I believe that this gap is narrowed if the "revised" SQ test is used.)
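One way to arrive at percentages of this kind is sketched below. It assumes that scores in each group are roughly normally distributed and that the two people are drawn independently, so that the difference between their scores is itself normal. Those are modeling assumptions of mine, not claims made in the paper, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def prob_a_exceeds_b(mean_a, sd_a, mean_b, sd_b):
    """P(random A scores above random B), assuming independent normal scores.

    The difference A - B is then normal with mean (mean_a - mean_b)
    and variance (sd_a**2 + sd_b**2).
    """
    z = (mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    return NormalDist().cdf(z)

# Means and SDs as quoted from Wheelwright et al. above.
p_eq = prob_a_exceeds_b(47.2, 10.2, 41.8, 11.2)  # random woman above random man on EQ
p_sq = prob_a_exceeds_b(30.3, 11.5, 24.1, 9.5)   # random man above random woman on SQ

print(f"EQ: woman scores higher with probability {p_eq:.3f}")
print(f"SQ: man scores higher with probability {p_sq:.3f}")
```

Under these assumptions the results should land close to the percentages given above. Differences of a point or so can come from rounding or from how far the real questionnaire scores depart from normality.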

This paper also provides some experimental support for my guess that the difference between Dilbert and Tina might have something to do with their educational and job choices as well as with their X and Y chromosomes. In Table 1, we learn that people with different college degrees differ on average almost as much in EQ (and SQ) as women and men do:

Degree              Sex     n           SQ-R   AQ     EQ
Physical science    Male    294  Mean   65.4   19.4   35.9
                                 SD     17.5    6.4   11.0
                    Female  159  Mean   59.9   18.0   44.7
                                 SD     19.4    5.7   11.3
Biological science  Male    125  Mean   62.0   16.7   41.6
                                 SD     17.8    5.8   11.5
                    Female  290  Mean   52.0   15.6   48.5
                                 SD     19.2    5.8   11.4
Social science      Male    115  Mean   61.9   16.2   41.4
                                 SD     18.8    5.0   11.0
                    Female  181  Mean   51.2   15.0   48.7
                                 SD     19.7    5.1   10.8
Humanities          Male    189  Mean   53.7   15.7   40.5
                                 SD     20.6    6.0   11.7
                    Female  408  Mean   48.4   14.6   48.7
                                 SD     17.9    5.3   11.2

So Dilbert and Tina have a double dose of group difference: male vs. female, engineering vs. humanities. And the result, it seems to me, is a common situation: a stereotype with a basis in fact.

Remember, though, that those female-vs.-male distributions in EQ and SQ still overlap quite a bit. And I'll bet that effects of similar size, on the same questionnaire results, can result from differences in cultural background and life experience -- or even from the short-term influence of interventions to shift perceived group norms and values. So we need to be careful in drawing conclusions from such results, whether about individuals or about groups.

But the most important lesson, in my opinion, is that the facts matter. Where the facts turn out to support consequential cognitive differences between human females and males, let's try to look clearly at what those differences are, where they come from, and what individual, social and political conclusions we should draw. But let's not let popularizers of brain-sex differences bring overgeneralizations and outright fallacies into the discussion as if they were scientific results. The lamentably low factual and logical standards of self-help books should be just as out of place in a public-policy debate as they are in a scientific journal article.

Posted by Mark Liberman at 09:05 AM

September 23, 2006

The Path to Poincaré

The article was presented under the heading of "The New Yorker: Fact". The authors were Sylvia Nasar and David Gruber, the category was "Annals of Mathematics" and the title was "Manifold Destiny: A legendary problem and the battle over who solved it" (New Yorker, 8/28/2006):

But according to a 12-page letter from Todd & Weld LLP, dated 9/18/2006, this gripping morality play, starring the saintly, reclusive genius Grigory Perelman and the pushy, over-the-hill careerist Shing-Tung Yau, was mostly fiction. Or at least, the damning "facts" and "quotes" dealing with Shing-Tung Yau were mostly fabricated, selected from unreliable sources, or presented as uncontested truth despite substantial contrary evidence.

In other words, despite the New Yorker's reputation for diligent fact-checking, it's claimed that this piece of reportage is roughly as accurate as ABC's much-contested mockumentary The Path to 9/11.

I won't review the whole Yau story here -- it's complicated, you can read about it for yourself, and like it or not, I expect that we're going to hear a lot more about this over the next few months and years. But there was one particular passage in the Todd & Weld letter that definitely rang my bell:

... [Y]ou offered purported "facts" in supposed support for your stated conclusion. The centerpiece of this effort was your inclusion of quotes set forth at page 56, which you attributed to Dr. Yau and the "acting director" of his Beijing mathematics institute from a June 3, 2006 press conference held by Dr. Yau in China which you claimed had taken place in order "to explain the relative contributions of the different mathematicians who had worked on the Poincare[.]" Based on the quotes, you represented to reader that Dr. Yau had claimed that he, Professor Zhu and Professor Cao were entitled to "thirty percent" of the credit. To add insult to injury, you pointed out that the numbers supposedly used by the "acting director" did not add up to 100% (e.g., "Evidently, even simple addition can sometimes trip up a mathematician").

The problem is that Dr. Yau did not utter the words you attributed to him and you were so informed prior to the publication of your article. Likewise, there was no "acting director" of Dr. Yau's mathematics institute in Beijing in June of 2006 (or at any other time) who spoke the words you placed in his mouth. (There was a deputy director, Yang Le, but he apparenty did not even attend the June 3rd press conference).

Here's what Nasar and Gruber's Manifold Destiny said about the news conference:

By early June, Yau had begun to promote the proof publicly. On June 3rd, at his mathematics institute in Beijing, he held a press conference. The acting director of the mathematics institute, attempting to explain the relative contributions of the different mathematicians who had worked on the Poincaré, said, “Hamilton contributed over fifty per cent; the Russian, Perelman, about twenty-five per cent; and the Chinese, Yau, Zhu, and Cao et al., about thirty per cent.” (Evidently, simple addition can sometimes trip up even a mathematician.) Yau added, “Given the significance of the Poincaré, that Chinese mathematicians played a thirty-per-cent role is by no means easy. It is a very important contribution.”

A note dated 9/7/2006 in the New Yorker Forums, from the husband of a Science Times reporter who covered the Beijing news conference, supports the Todd & Weld picture of what happened:

About the contrversy around the credit for solving the Ponicare Conjecture.
The Xinhua News Agency first reported on June 4 that Professor Yang Le told the reporter a division of 50%+25%+30% credit between Hmilton, Perelman and Chinese Scienctits. The news is here:
http://news3.xinhuanet.com/newscenter/2006-06/04/content_4644722.htm (in Chinese)
However, on June 9, the same reporter of the Xinhua News Agency
wrote another news in which Yang Le specificly emphasized that he was not an expert in the field to make such judgment and that he was against any attempt to make such judgment. The news is here:
http://beijing.icm2002.org.cn/mcmweb/Three_Said.htm (in Chinese)
Why there were such two completely opposite reports by the same reporter from the Xinhua?
The truth is that before the first news was wrote, Professor Yang Le was not interviewed by the reporter. And after Professor Yang Le’s protest about report to the XinHua reporter, the Xinhua reporter offered in order not to retract the first report he was willing to make a real interview with him in exchange. Believe or not, such unprofessional practice sounds strange, but it does really happen in China. I do not know how such strange number was reached at the beginning, but the truth was that Professor Yang Le was not intviewed by the Xinhua reporter before the interview for the second report.

And the same note indicates that there is a recording:

From the recording of the press conference, where 8 reporters from five Chinese media, including the reporter from the XinHua News agency, were pesented, some reporter asked Professor Yau whether Cao-Zhu’s paper can claim all the credit, and Professor Yau specificly said that Hamilton and Perelman’s contributions were the most important, Cao-Zhu’s paper just presented the complete proof and closed the case, and the proof of the Poincare Conjecture was a group effort. There was no mentioning of the division of credit in the press conference. Professor Yang Le was not present at the press conference.

After the press coference, my wife and one of her colleague at the Sciencetimes had an exclusive interview face to face with Professor Yang Le in the same day. There was no such mentioning of percentage in that interview.

The author of that note says that

I then spent some time a few days ago on the inernet to do my own research on the 30% credit story. Such research should have been done by Ms. Nasar and her associates. I have to say after going through all this materials, I learned how wrong and the New Yorker article was.

I have aways been telling my wife how unprofessional many of the reporters in China are, and how unfortunate that I have to live with this fact. But I have never expected that people like Professor Nasar can be so unprofessinal in writng the article in the New Yorker magazine.

But as regular readers of this and other weblogs know, the effective standards for accuracy in quoting sources in print these days are so low that they're nearly non-existent -- see this post for a list of some of the examples we've documented over the years. It's a small step from these leading questions and misleadingly selective or approximate answers to the alleged Chinese pattern of just making up something that the source might plausibly have said, if they'd ever been asked. (And of course, the Chinese might claim to have learned the technique from high-profile U.S. journalists such as Jayson Blair and Stephen Glass -- except that no one needs to be taught laziness, ambition and fakery.) Similarly, doctored and staged photographs are ubiquitous (Tom Glocer, Reuters CEO, speaking on CNN: "I would think it's extremely uh likely that there are incidents all around us of manipulated images and um staged images").

So here's a modest proposal, which might help abate the flow of falsehood by at least a small percentage. When news agencies interview sources on the record, or at least when they cover news conferences, how about recording everything and putting it up on the web? You could fake the recordings, I guess, but that's a lot more work than just using accurate quotes would be.

[This case also reminds me of an earlier episode of allegedly faked quotations at the New Yorker -- the case of Janet Malcolm and Jeffrey Masson. Malcolm interviewed Masson extensively for the New Yorker articles that became her book In the Freud Archives; Masson claimed that Malcolm fabricated many quotations that put him in a bad light, and the legal case dragged on for many years. One of the most curious aspects of this curious affair was the resonance with Malcolm's next project, The Journalist and the Murderer, a case study of a journalist's relation to his sources in a murder trial, whose theme was that "every journalist knows that what he does is morally indefensible".]

Posted by Mark Liberman at 04:15 PM

Gabby guys: the effect size

Are women really more talkative than men? A few minutes ago, I did a quick experiment that bears on the question, and the answer turned out to be "no".

The experiment was quick and easy, but it wasn't small, because I didn't need to collect any data: I used a published speech corpus. Specifically, I ran a couple of perl scripts over the transcripts and speaker demographics from the Fisher English Corpus Part 1 (FECP1), a collection of 5,850 telephone conversations lasting up to 10 minutes each, recorded in 2003. Speakers were from all over the U.S., ranged in age from teenagers to people in their 80s, and had educational levels from high-school to post-graduate degrees. Participants were assigned conversational partners at random, and asked to talk for up to ten minutes on one of forty topics like "What do each of you think is the most important thing to look for in a life partner?", or "Do either of you think that you would commit perjury for a close friend or family member?". Calls were routed through a computer in Philadelphia, which recorded them with the knowledge and consent of both parties.
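For readers who want to try something similar, here is a minimal sketch in Python (not the perl scripts actually used) of counting words per conversational side. It assumes each transcript line has the form `start end speaker: text`, with `A` and `B` as the speaker labels. That layout, the function name `count_words_per_side`, and the sample lines are all assumptions made up for illustration.

```python
from collections import Counter

def count_words_per_side(lines):
    """Return a Counter mapping speaker label -> number of whitespace-separated words."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        # Assumed layout: "<start> <end> <speaker>: <text>"
        parts = line.split(None, 3)
        if len(parts) < 3 or not parts[2].endswith(":"):
            continue  # not a recognizable utterance line
        speaker = parts[2][:-1]
        text = parts[3] if len(parts) > 3 else ""
        counts[speaker] += len(text.split())
    return counts

# Hypothetical sample, not taken from the corpus.
sample = [
    "# made-up transcript",
    "0.50 1.90 A: hi how are you",
    "2.00 3.10 B: fine thanks",
    "3.20 5.00 A: so what do you think",
]
side_counts = count_words_per_side(sample)
```

Joining counts like these with a table of speaker demographics (sex of side A, sex of side B) would give per-sex totals. Note that "word" here just means a whitespace-separated token, which is a simplification.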

If you don't have much patience for numbers and graphs, here's the summary: in conversations between the sexes, the men used about 6% more words on average than the women did; and in about 55% of such conversations, the male participant talked more than the female participant did. In single-sex conversations, two guys exchanged about 3.2% more words, on average, than two gals did. For more details, read on -- and as a bonus, you'll learn, in exact quantitative terms, whether size really matters. Effect size, that is...

FECP1 includes 1,910 mixed-sex conversations. In 1048 of them (54.9%) the male participant produced more words than the female participant did, while in 862 of them (45.1%) the female participant produced more words than the male participant did. The average number of words produced by a male participant in a mixed-sex conversation was 925.9, while the average number of words produced by a female participant was 866.6, or about 6.4% fewer.

In 2,368 FECP1 conversations where two women were talking, each participant on average produced 901.5 words. This is about 4% more than the average number of words produced by women talking with men. In 1,572 conversations where two males were talking, each participant on average produced 930.4 words. This is about half a percent more than the average number of words produced by men talking with women. So there's a small indication that women might be more talkative when talking with women -- and a smaller indication that men talk more with other men -- but the amount of change is not very large.
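The percentages quoted above follow directly from the reported averages. Here is a small Python sketch of the arithmetic; the helper name `pct_more` is mine, and the inputs are just the figures given in the text.

```python
def pct_more(a, b):
    """Percentage by which a exceeds b (negative if a is smaller)."""
    return 100.0 * (a - b) / b

# Per-speaker mean word counts as reported in the text.
men_mixed, women_mixed = 925.9, 866.6
men_same, women_same = 930.4, 901.5

women_vs_men_mixed = pct_more(women_mixed, men_mixed)    # women relative to men, mixed-sex
women_same_vs_mixed = pct_more(women_same, women_mixed)  # women with women vs. with men
men_same_vs_mixed = pct_more(men_same, men_mixed)        # men with men vs. with women
men_vs_women_same = pct_more(men_same, women_same)       # male pairs vs. female pairs

# Share of the 1,910 mixed-sex conversations in which the man produced more words.
male_more_share = 100.0 * 1048 / 1910
```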

This graph of the distribution of word counts for all participants, female and male, shows how similar the distributions for the two sexes were:

And for completeness, the very similar graph of the similar distributions of word counts for male and female participants in mixed-sex conversations:

One way to measure the size of such group differences is to scale the difference between the group averages according to the amount of variation within each group's distribution. More technically, this is the difference between the means divided by the pooled standard deviation. Or in the form of an equation,

$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}}$$

This measure of "effect size" is known as Cohen's d. According to the Wikipedia, Cohen (1992) suggested that d of "0.2 is indicative of a small effect, 0.5 a medium and 0.8 a large effect size". For the mixed-sex FECP1 conversations, the effect size of the difference between the number of words used by men and the number of words used by women (expressed in terms of Cohen's d) is 0.203. For all the conversations, the effect size is 0.128.
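For concreteness, here is a minimal Python sketch of Cohen's d computed from two lists of values. It uses one common convention for the pooled standard deviation (sample variances weighted by degrees of freedom). Other conventions exist, and I am not claiming this is the exact variant behind the figures above. The function name `cohens_d` and the data are invented for illustration.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Difference between the means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Hypothetical word counts, invented for illustration only.
group_a = [900, 950, 1000, 1050, 1100]
group_b = [880, 930, 980, 1030, 1080]
d = cohens_d(group_a, group_b)
```

With these made-up numbers the means differ by 20 words and the pooled standard deviation is about 79, so d comes out near 0.25, a "small" effect on Cohen's scale.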

In other words, these are small to extra-small effects. But they're in the opposite direction from the predictions of Louann Brizendine's (unsubstantiated) claim that women normally produce almost three times more words per day than men, due to crucial biological differences allegedly laid down in the eighth week of fetal life:

A huge testosterone surge beginning in the eighth week will turn this unisex [fetal] brain male by killing off some cells in the communication centers and growing more cells in the sex and aggression centers. If the testosterone surge doesn't happen, the female brain continues to grow unperturbed. The fetal girl's brain cells sprout more connections in the communications centers and areas that process emotion. How does this fetal fork in the road affect us? For one thing, because of her larger communication center, this girl will grow up to be more talkative than her brother. Men use about seven thousand words per day. Women use about twenty thousand. For another, it defines our innate biological destiny, coloring the lens through which each of us views and engages the world. [From The Female Brain, p. 14 -- emphasis added]

However, these small-or-extra-small word-count effects are actually a bit larger than the effects that are generally found for differences in measures of verbal performance between males and females (though most measures show a small performance advantage for females). According to Janet Shibley Hyde and Marcia C. Linn, "Gender Differences in Verbal Ability: A Meta-Analysis", Psychological Bulletin, 104:1 53-69 (1988):

Many regard gender differences in verbal ability to be one of the well-established findings in psychology. To reassess this belief, we located 165 studies that reported data on gender differences in verbal ability. The weighted mean effect size (d) was +0.11, indicating a slight female superiority in performance. The difference is so small that we argue that gender differences in verbal ability no longer exist. Analyses of effect sizes for different measures of verbal ability showed almost all to be small in magnitude: for vocabulary, d = 0.02; for analogies, d = −0.16 (slight male superiority in performance); for reading comprehension, d = 0.03; for speech production, d = 0.33 (the largest effect size); for essay writing, d = 0.09; for anagrams, d = 0.22; and for tests of general verbal ability, d = 0.20. For the 1985 administration of the Scholastic Aptitude Test-Verbal, d = −0.11, indicating superior male performance. Analysis of tests requiring different cognitive processes involved in verbal ability yielded no evidence of substantial gender differences in any aspect of processing. Similarly, an analysis by age indicated no striking changes in the magnitude of gender differences at different ages, countering Maccoby and Jacklin's (1974) conclusion that gender differences in verbal ability emerge around age 11. For studies published in 1973 or earlier, d = 0.23 and for studies published after 1973, d = 0.10, indicating a slight decline in the magnitude of the gender difference in recent years.

Whatever the size of these effects, are they the direct result of genetic and hormonal effects on brain wiring during embryological (or later) development, as opposed to being a more indirect result of the different life experiences of females and males? (Of course such environmental effects would also be mediated by brain differences, unless we believe that life experiences affect us by modifying our immaterial souls.) The first answer to this question is that no one has a clue. The second answer to this question is that the effects are so small, and so variable according to circumstance, that the question becomes an academic one, in the exact sense of that term -- the answer is of interest to scientists, but it should have no public policy implications at all, except to make us suspicious of people like David Brooks and Leonard Sax.

Here are some comparisons that may help to put these effects and their sizes in perspective. One comparison involves group differences that are mostly genetic, and another involves differences that are mostly environmental.

1. Some secondary sex differences do involve medium-to-large effect sizes. For example, according to Table 5 of the National Center for Health Statistics' Anthropometric Reference Data for Children and Adults: U.S. Population 1999-2002, the average height of 19-year-old American males is 176.7 cm, with s.d. = 10.6, while the average height of 19-year-old females is 162.9 cm, with s.d. = 11.0. This is an effect size of d=1.32. For the same comparison of 19-year-old males and females, the effect size for the average difference in weight was only d=0.51 (because the standard deviations are much larger relative to the means).

2. Some environmental effects on cognitive performance involve medium-to-large effect sizes. Martha J. Farah, et al., ("Childhood poverty: Specific associations with neurocognitive development", Brain Research 1110(1) 166-174, September 2006) "administered a battery of tasks designed to tax specific neurocognitive systems to healthy low and middle SES [socio-economic status] children screened for medical history and matched for age, gender and ethnicity".

Fig. 1. Effect sizes, measured in standard deviations of separation between low and middle SES group performance, on the composite measures of the seven different neurocognitive systems assessed in this study. Black bars represent effect sizes for statistically significant effects; gray bars represent effect sizes for nonsignificant effects.

All the participants in this study were African-American girls between the ages of 10 and 13. As the graph above indicates, the difference in performance on the "Language" part of the test battery between middle SES and low SES girls represented an effect size of about 0.95.

There were two language-related tasks:

Peabody Picture Vocabulary Test (PPVT)
This is a standardized vocabulary test for children between the ages of 2.5 and 18. On each trial, the child hears a word and must select the corresponding picture from among four choices.
Test of Reception of Grammar (TROG)
In this sentence–picture matching task designed by Bishop (1982), the child hears a sentence and must choose the picture, from a set of four, which depicts the sentence. Its lexical–semantic demands are negligible as the vocabulary is simple and a pre-test ensures that subjects know the meanings of the small set of words that occur in the test.

Note that in terms of effect size, this finding is several times the largest difference found in the Hyde and Linn meta-analysis of sex differences in verbal ability.

_______________________________

[A note about the overall numbers of participants of each sex in the Fisher English Part 1 corpus is in order. In this phase of the data collection, there were 5850 conversations, and therefore 11,700 conversational sides. Of those, 6,646 (or 56.8%) were female, and 5,054 (or 43.2%) were male. This imbalance was caused by the fact that participants needed to be callable at a particular phone number during a particular time period. Thus people who don't work outside the home, or who are retired, are likely to be over-represented in the collection; and women in turn are over-represented in these two groups. In fact, we had to work hard to keep the imbalance of sexes in the collection from being larger.]
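The side counts in this note can be cross-checked against the conversation counts given earlier in the post. A short Python sketch of that bookkeeping (the variable names are mine):

```python
# Conversation counts by type, as reported in the post.
mixed, female_pairs, male_pairs = 1910, 2368, 1572

total_conversations = mixed + female_pairs + male_pairs
total_sides = 2 * total_conversations

# Each mixed-sex call contributes one side per sex; each same-sex call contributes two.
female_sides = mixed + 2 * female_pairs
male_sides = mixed + 2 * male_pairs

female_share = 100.0 * female_sides / total_sides
male_share = 100.0 * male_sides / total_sides
```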

[A list of links to other relevant Language Log posts can be found here.]

Posted by Mark Liberman at 07:26 AM

September 22, 2006

Macaca-watch continues

The uproar over Sen. George Allen's use of the peculiar word macaca last month to refer to a young Democratic campaign worker of South Asian descent just won't go away. (See here and here for previous Language Log coverage.) Some have claimed that macaca, as a variant of macaque, has been used as a French racial epithet in North Africa. And since Allen's mother is French Tunisian, the argument goes, Allen himself must have picked it up from her. But the senator's mother, Henrietta "Etty" Allen, put the kibosh on that claim in a Washington Post interview:

Etty Allen said Wednesday that she had never used the word "macaca" before and had to go to a dictionary to look it up when she heard of the controversy. She said the word did not exist in her dictionary.
"I swear to you, I have never used that word," she said. "I must have used a lot of bad words, but not that word."

That seems like a reasonable disavowal, particularly since it has never been proven, to my knowledge, that macaca really has ever been a racial slur in common use in North Africa, despite many bloggers and reporters treating this claim as fact. (Josh Marshall, for instance, wrote that "in Colonial-era North Africa, particularly the Francophone areas, 'Macaca' is a rough equivalent of 'N-ger'." Sez who? So far the only evidence I've seen is for macaque as a slur for dark-skinned people, not macaca.) Etty Allen's profession of ignorance would seem to support her son's recently proffered excuse for using macaca, which is that it was a completely meaningless word he made up on the spot — this despite earlier explanations that it was somehow a blend of Mohawk and caca.

But now comes yet another rationale. As reported on Wonkette, George Allen told World Magazine about a different provenance for macaca:

Allen actually had a pretty credible defense for what he said. No one — including The Washington Post, which featured the story repeatedly for several weeks — ever demonstrated that "macaca" really has such murky racial connotations in any language. But in northern Italy, where Allen's mother had close family connections, "macaca" does seem to mean "clown" or "buffoon." Allen says now that's what he was trying to communicate.

I don't know if this latest explanation will hold any more water than previous ones, especially given the fact that Etty Allen is on record as saying she had never heard of macaca before. But does macaca actually mean 'clown' or 'buffoon' in some northern Italian dialect? I checked out the reasonably comprehensive Dizionario della Lingua Italiana and found this "figurative" definition for macàca (listed after the zoological sense of the monkey genus):

fig., donna goffamente brutta e sciocca

If I'm translating the Italian correctly, this means that monkeyish macàca gets extended in some varieties of Italian to mean "a clumsily ugly and foolish woman." That's the feminine form of macàco, glossed as "uomo goffamente brutto e sciocco" ('a clumsily ugly and foolish man'). So if Allen is sticking by the Italian motivation, he'll next have to explain why he was referring to S.R. Sidarth, the (male) campaign worker, as a female buffoon. Doesn't look like Allen is getting out of this morass of macaca any time soon.

[Update: As for French derogatory uses of macaque, Chris Waigl provides a link to the word's entry in Trésor de la Langue Française informatisé, where one definition is "personne très laide" ('very ugly person'). Still no firm evidence for the Francophone use of macaca, though.]

Posted by Benjamin Zimmer at 07:57 PM

Counting Out Al Gore

A search engine is a tool, no better or no worse than any other tool, an axe, a shovel or anything. A search engine is as good or as bad as the man using it.

Or was it "a gun is a tool"? My memory's shaky on that one, but whatever it was I know for sure that Alan Ladd said it to Jean Arthur in Shane. And with no waiting period and everybody packing, it's a good idea to keep your head beneath the parapet when the hit counts start to fly.

Take the exchange that recently surfaced on the letters pages of The American Prospect. It began with a sentence in Todd Gitlin's July 5 review of Eric Boehlert's book Lapdogs: How the Press Rolled Over for Bush. In the course of describing how media reporting is skewed, Gitlin reported Boehlert as saying:

Outside The Boston Globe. . . the total number of media accounts that mentioned both [Bush's National Guard] absenteeism and Texas pol Ben Barnes' acknowledgment that he tried to sneak young Bush into the Guard: two. The number of accounts of the phony charge that Al Gore claimed to have invented the Internet: more than 4,800.

No way, said Alan Abramowitz, the Alben W. Barkley Professor of political science at Emory, who was impelled to write a letter to the editor to say that his own researches showed that the Gitlin/Boehlert claim was clearly wrong:

A Lexis-Nexis search reveals only 19 mentions of the "Gore-invented-the-Internet" charge in major American newspapers between January 1, 2000, and Election Day. Moreover, the point of several of these articles was that Gore had never made such a claim but that he had been a strong supporter of the development of the Internet. . . Gitlin's (and Boehlert's) claim that the media frequently and uncritically reported this accusation, like the accusation itself, appears to be greatly exaggerated.

Needless to say, that's a wildly and patently incorrect result. In a response to Abramowitz's letter, Gitlin replied:

My source for the "more-than-4,800" claim was Boehlert's Lapdogs (p. 160). Maybe I should have checked earlier. Strangely, when I did so just now, Lexis-Nexis turned up neither 4,800-plus entries, nor the 19 that Professor Abramowitz found, but 445.

Actually, Gitlin's result isn't inconsistent with Boehlert's claim. Lexis-Nexis major papers includes only a small -- if influential -- portion of the American press (and around half the papers in the database are foreign ones). When you search severally in each of Nexis's four regional US News files, you come up with 973 hits for the period in question -- add the previous nine months (since the story hardly began at the beginning of 2000) and you come up with over 2000. In a more careful search at the Prospect's Tapped blog, Paul Waldman notes that the story gets a total of 4349 hits on Nexis's Allnews database over the 18 months prior to the 2000 election. And even that database doesn't include most local TV or AM radio talk shows, not to mention major blogs and Internet sites (the string "gore 'invented the internet'" gets 90 hits on townhall.com alone). In short, Boehlert's "4800 media stories" is unquestionably a considerable understatement. And even if a good number of those stories involve refutations of the charge that Gore claimed to have invented the internet, the very need for them suggests that the damage was done. As Gitlin observed:

. . . lest we succumb to the fog of dueling Nexises, I submit that we recall Karl Rove's principle: When you're explaining, you're losing. Insofar as newspapers were saying that Gore was defending himself against a deceitful charge, he sounded, to some undecided population of voters, like an evasive braggart. That was bad enough.

Of course counts of media stories are only a rough indication of how widely diffused a story is, but even if we restrict ourselves to print, the contrast between Abramowitz's 19 stories and the actual figure of several thousand is pretty striking. But then anybody who lived through this period knows without having to check that the story was all over the place. Which leads me to ask, How could Abramowitz possibly have believed the number his search returned?

There's no way to tell exactly how Abramowitz managed to come up with the figure of 19. Waldman suggests that he must have searched only on the specific string "Gore invented the internet" in the Nexis Major Papers files, which is what I assumed too, but it turns out that even that very sentence gets 24 hits in that database for the first 9 months of 2000. Did he maybe use some other search string but leave the pulldown menu at the default "headline, Lead Paragraph(s), Terms" value rather than doing a full-text search? Did he enter the dates wrong? Did he screw up the search syntax, or enter a string that presumed that Nexis was using Google's search syntax, as a lot of people do these days?

Who knows -- there are an awful lot of ways to get this stuff wrong, and man and boy I've personally explored every one of them -- after 20 years of using Nexis, Dialog, and other news databases, I'm still doing searches and getting results that are implausible on their face, so that I have to give my processor a whack and try again with some other search terms.

From Abramowitz's failure to do just this, it's clear both that he's a newbie to Nexis and that he has an inordinate, or at least unwarranted, confidence in his ability to find his way around with the technology. If that's especially odd in his case, it's only because he does quantitative research in voter behavior, so that you'd figure he'd have to have had the same experience using stat packages: getting some completely implausible result and having to go back and correct some setting -- or more likely, at this stage of his career, making a caustic remark and sending his graduate assistant to run the data again. I mean, who hasn't had this happen to him?

Abramowitz isn't alone in this -- people are always trotting out search results in the service of this or that point that are absurd on their face, even if not a whole lot of them are the holders of named chairs in quantitative disciplines. Blame the googlization of the Web, which gives us all the illusion of search-engine expertise. Or blame people's tendency to believe quantitative results and claims even when they fly in the face of plausibility.

Whatever the cause, you can put it down as one more example that makes the case for universal instruction in information literacy -- even if it probably comes too late for the tenured classes.

Posted by Geoff Nunberg at 04:31 PM

Ineffability again

Several readers sent in complaints about my treatment of Nick Piombino in a recent post on "Ineffability". The longest and most eloquent complaint, from a philosopher, is given in full past the jump.

I'm a long-time reader of language log, but I've never felt compelled to write in. I love your posts in general, but I thought you were too harsh to Nick Piombino in "Ineffability"; in fact, his sentiment seems closer to the truth than you give him credit for.

I take it that Nick-- and others who have similarly expressed this feeling of ineffability-- meant that sentences in English, regardless of their complexity, literally cannot (completely) convey much of the contents of our attitudes and the subtleties of our emotions. This is the case in spite of the fact (as linguists are so fond of pointing out) that natural language is recursive and we therefore have an infinite number of propositions we could assert. 'Infinite' does not mean all. Some conceptual distinctions cannot be got at by logical operations on concepts we already have-- e.g. philosophers argue that the concepts of folk psychology or counterfactuals, though supervenient on the physical, cannot be translated into lower-level vocabulary, not because there aren't an infinite number of things to be said in the vocabulary of neuroscience or particle physics, but because that infinitude of thought doesn't cover every cut in possibility space.

Surely there are things we think and feel that cannot be expressed in words. Part of the problem with assigning truth-conditions to attitude ascriptions is that the meaning of the content-clause only roughly maps on to our actual attitude state. This is not because we speak sloppily, but rather because there are distinctions we cannot draw with words (likely because a word for the right concept would be for the most part useless). Granny's folk psychology only draws so many distinctions to begin with, many of them muddled or hopelessly confused. Would this be remedied if we had a thousand times the words in the dictionary, all concerning human psychological states that previously could not be referred to with our existing vocabulary, however combined? Perhaps. But we would need at least that number.

I think some linguists (and I genuinely don't mean 'some linguists' to be a referential indefinite, picking out you) are afraid to admit that our lexicon + a compositional semantics doesn't get us everything, because this seems like inviting the Whorfians to have a field day. But this need not be so: we can think things we cannot (given our lexicon) say, and we *could* think things we cannot (given our current conceptual repertoire) think now. All Nick seems to be saying, if I read him right, is that it would be very difficult to say most of the things we think, even given a language with a greater vocabulary in all its compositional glory.

I hope this email is clear and not too argumentative. They stop feeding you in philosophy grad school so you'll be more vicious in the ring. I just wanted to defend Nick's side-- what I take to be a common side, perhaps touching, but not stupid.

I agree, mostly.

Certainly my post was hurried and careless. In particular, Nick Piombino said that more words wouldn't solve the problem, and I agreed with him, but I complained in a misleading way about the way he put it. People too often equate a language with its vocabulary, and talk as if you can't express a concept if your language lacks a single word that denotes it. But Piombino didn't say that, and I shouldn't have attributed to him an argument he didn't make.

I also agreed with Jonathan Mayhew's point that it was distracting for Piombino even to bring up the question of vocabulary size, since the expressivity of poetry doesn't increase monotonically with the size of the vocabulary that the poet draws on. I didn't like Jonathan's child:toys::writer:words analogy, but I didn't explain my problems with it in a coherent way.

The philosopher's letter objects that my analogy sentences:concepts::digit-sequences:numbers is a false one, because "there are distinctions we cannot draw with words". On the other hand, there are numbers (e.g. pi) that we cannot express with (finite) digit sequences in a positional number system. My point was not that digit sequences can express all numbers, but that increasing the base of the number system to a larger integer doesn't change its expressivity.

Something similar, though less clear, happens with words. Sometimes adding new words is just a way to save space and time, because the new word is just a convenient short reference to a longer explanation. But other new words are not reducible in this way to definitions in terms of existing words, and how people learn their meaning is more mysterious.

You could try to reconstruct the analogy between numbers and concepts by reference to the fact that given a finite number of axioms, there are an infinite number of mathematical truths that you can't prove. If words are like mathematical axioms rather than like digits in a positional number system, and expressing a concept is like proving a theorem, then vocabulary size is relevant to the problem of ineffability (because adding axioms increases the set of provable truths) but doesn't solve it (because infinitely many unprovable truths will always remain). Perhaps that's what Piombino meant, at some level. Certainly it's a kinder construal, and therefore to be preferred.

One comment about "Granny's folk psychology", and its limited, "muddled or hopelessly confused" distinctions: after a few weeks of reading the oeuvre of Leonard Sax, M.D., Ph.D., and Louann Brizendine, M.D., I find that Granny is looking smarter all the time.

Posted by Mark Liberman at 07:05 AM

September 21, 2006

Dan McGrew and topic marking


Phil Jensen, who recently discovered Language Log and is working his way through it systematically (in the past week, two people have reported to me that they're doing this!), writes that he followed a link from 2005 back to "the earlier long post on prosody" -- Mark Liberman's "An internet pilgrim's guide to accentual-syllabic verse" of 7/6/04 -- and then moved on to my 10/24/05 posting on topic-marking in terms of, so that he experienced the juxtaposition of fair chunks of "Dangerous Dan McGrew" with my example of topic-marking Left Dislocation:

Office hours, they're from two to four.

Ah, he thought, here we want something like

said the T.A. we all called Sue

to follow.

And now I have a wretched earworm.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at