October 10, 2007

Sentiment classification at the Sunday Times

One of the latest fashions in computational linguistics is "sentiment classification", which tries to determine automatically what attitudes writers take towards their topics. For an interesting account of some work in this area by a recent Penn grad, see John Blitzer et al., "Biographies, Bollywood, Boom-boxes, and Blenders: Domain Adaptation for Sentiment Classification".

Most of the information useful to such algorithms comes from particular words or word sequences: thus in the Blitzer et al. paper, positive evaluations of books are associated with snippets like "engaging", "must read", "fascinating" and so on; in contrast, strings like "<num> pages", "predictable", and "plot" tend to be associated with negative evaluations. Strings like "are perfect", "years now", "a breeze" are positive signs for kitchen appliances, whereas "the plastic", "awkward to" and "leaking" are negative indicators.
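To make the idea concrete, here is a minimal sketch in Python of scoring a text by counting indicator words and word pairs. It is a toy, not the method of the Blitzer et al. paper: the indicator strings are the ones quoted above, but the +1/-1 weights, the tokenizer, and the function names (tokenize, ngrams, score) are assumptions made up for illustration. A real classifier would learn its weights from labeled reviews.

```python
import re

# Indicator strings quoted above; the +1 / -1 weights are an assumption
# for illustration (a trained classifier would learn real-valued weights).
INDICATORS = {
    "books": {
        "engaging": 1, "must read": 1, "fascinating": 1,
        "<num> pages": -1, "predictable": -1, "plot": -1,
    },
    "kitchen": {
        "are perfect": 1, "years now": 1, "a breeze": 1,
        "the plastic": -1, "awkward to": -1, "leaking": -1,
    },
}

def tokenize(text):
    """Lowercase, split into words and digit runs, and map numbers to <num>."""
    tokens = re.findall(r"[a-z']+|\d+", text.lower())
    return ["<num>" if t.isdigit() else t for t in tokens]

def ngrams(tokens, max_n=2):
    """Yield every word sequence of length 1..max_n as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def score(text, domain):
    """Sum the weights of all indicator strings found in the text."""
    weights = INDICATORS[domain]
    return sum(weights.get(g, 0) for g in ngrams(tokenize(text)))

if __name__ == "__main__":
    print(score("An engaging, fascinating must read.", "books"))
    print(score("Predictable plot, and 400 pages of it.", "books"))
    print(score("The plastic lid is awkward to clean and keeps leaking.", "kitchen"))
```

Note that the indicator lists are specific to a domain: "predictable" and "plot" count against a book but mean nothing in the kitchen-appliance list, which is the problem that domain adaptation sets out to address.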

A few days ago, I quoted from Ed Caesar's article in the Sunday Times about Deborah Cameron's new book, The Myth of Mars and Venus ("Are pop gender studies from Uranus?", 10/8/2007). I noted that "His tone is just a little bit patronizing (though perhaps he's only being British, it's sometimes hard for me to tell the difference)". To remind you of what I was talking about, here's how he starts -- be on the look-out for negative indicators like "rather prosaically", "pet peeves", "irked", "bludgeoning its brains out":

So it turns out that after all the rows about the washing up, the shopping and the school run, men are not from Mars nor women from Venus. Both sexes are, rather prosaically, from Earth. And, despite anecdotal evidence to the contrary, men and women do speak the same language.

At least we do according to Deborah Cameron, Britain's pre-eminent feminist philologist (not often that you meet one of them) and the current Rupert Murdoch professor of language and communication at Oxford University.

Cameron, 48, is a firebrand with an impressive list of pet peeves, including Tories, Darwinists, GNER's passenger service announcements, Big Brother's language "so-called" experts, man-hating "pseudo-feminists" and societies for the protection of the semicolon. Don't get her started on Lynne Truss.

But the subject that has irked her most recently -- enough for Cameron to dedicate an entire book to bludgeoning its brains out -- is what she calls The Myth of Mars and Venus, published last week by Oxford University Press.

The review by Susannah Herbert, in the Sunday Times of the same day, gives almost exactly the same description of Cameron's book, but with a very different tone. Among the positively-associated words are things like "nuance" and "first-class" -- Caesar's "bludgeoning its brains out" is Herbert's "delightfully spiky":

In the village of Gapun in Papua New Guinea, when a woman is annoyed with her husband, she swears at him for 45 minutes, at the top of her voice so the neighbours catch every nuance. During this "kros" -- the word means "angry" -- the target is not allowed to answer back, nor may anyone interrupt until she's given her feelings full expression.

And what expression it is. The anthropologist Don Kulick recorded a typical kros: "You're a ****ing rubbish man. You hear? Your ****ing ***** is full of maggots. You're a big ****ing semen *****. Stone balls! ...****ing black *****! You *****ing mother's ****!"

When the flowers of English womanhood carry on like this -- at closing time on Friday night in Ipswich, say -- they're thought to be behaving laddishly. When the housewives of Gapun turn the air blue, however, they are only doing what comes naturally to a woman. The village men, apparently, pride themselves on their ability to conceal their opinions and express themselves indirectly: if they need to get a grievance off their chests, they get their wives to do it for them. In Gapun, women are from Mars, men are from Venus.

I sensed early on in this delightfully spiky book that Deborah Cameron -- an Oxford professor of language and communication -- would give a first-class kros, and enjoy it, too. The only problem would be limiting the number of victims to one. Cameron's targets are many: there's John Gray, the author of the psychobabble classic, Men Are from Mars, Women Are from Venus, Deborah Tannen, the author of You Just Don't Understand, Simon Baron-Cohen, the author of The Essential Difference, and the husband-and-wife team behind a slim volume called Why Men Don't Iron.

Some of Caesar's expression of sentiment is more subtle. For example, I noticed that he paired "feminist" with the archaic, rather specialized, and technically inaccurate term philologist rather than linguist, and then emphasized the suggested disciplinary eccentricity with the catty little aside "not often that you meet one of them":

At least we do according to Deborah Cameron, Britain's pre-eminent feminist philologist (not often that you meet one of them) and the current Rupert Murdoch professor of language and communication at Oxford University.

But due to lack of cultural context, I missed what may be the most important features in that sentence. An anonymous academic of British origin, now working in an American university, filled me in.

The Caesar article doesn't strike me as patronising, though the tone is certainly strange. I think this is to do with a generalized wish to debunk self-importance, possibly with a whiff of personal animus bordering on misogyny.

The decalogue for British academics begins:

- thou shalt not self-promote
- thou shalt not be shrill and angry
- thou preferably shalt not come from a bastion of privilege like Oxford University
- thou surely shalt not hold a chair named after the Dirty Digger (= Murdoch)
- if thou art at a bastion of privilege and thou holdest a chair named after the Dirty Digger, thou canst not win.

I gather that "self-promotion" means something like "openly aspiring above your station", and "shrill and angry" means "passionately committed to an unlicensed cause".

Meanwhile, the copy of The Myth of Mars and Venus that I've ordered from amazon.com hasn't come yet, but OUP sent me one from the U.K. After skimming it quickly, I'll go with "delightfully spiky" -- more later.

Posted by Mark Liberman at October 10, 2007 10:02 AM