August 07, 2006

C*m sancto spiritu

Yes, another triumph of the iTunes automatic asterisking program: the innocent Latin preposition "cum" 'with' loses its "u" because of its dirty homograph, as in Blowfly's song "Cum of a Lifetime" and Super 8 Cum Shot's self-titled album.  This wonderful fact from Barbara Partee, who downloaded "Carmina Burana" from iTunes and was confronted with "Si puer c*m puellula", which she would never have understood if it hadn't been for the work of the Taboo Avoidance Crew (also known as the Too Asterisked Crew) here at Language Log Plaza.

And there's fresh news from the TAC!  But first, a little puzzle for you to solve:

Here are six items that iTunes asterisks out that weren't in previous Language Log postings.  Identify the offending words.

1 c*********s [NOT "cocksuckers" or "cuntlickers"], 2 f******o, 3 f*****g [NOT "fucking"], 4 p******t, 5 s***m, 6 t**t

If you said "teat" for the last one, you're wrong; "teat" escapes asterisk-free, as do "nipple(s)", "boob(s)", and "breast(s)" (only "tit(s)" gets caught).  The right answer is "twat".  ("Muff" is ok though, even in "muff dive", "muff diver", and "muff diving".)  Number 5 is "sperm"; this is odd since, as you will recall, "semen" is ok on iTunes.  Number 4 is "pederast", 3 is "fisting", 2 is "fellatio" (an easy one), and 1 is "cunnilingus".  What on earth are the medico-legal "cunnilingus", "fellatio", and "pederast" doing on this list?

You might remember that "whore" is out; I can now report that "pimp", "hustler", "prostitute", "hooker", and "brothel" are ok.  The American iTunes seems not to appreciate the force of "arse" in British English, since the word gets no asterisks, while "ass" is a dirty word.  And as a result of the iTunes whole-word approach to asterisking -- it looks like someone has to enter a word into a list of asteriskables, and then the program searches for that whole word, and not for that letter sequence within a longer word -- "cockring" escapes punishment (while "cock" is, of course, verboten).

As Wendell Kimper pointed out to me in e-mail, there's a lot that iTunes misses because of the whole-word rule: "whore" is out, but "whorecast", "sacredwhore", "whoresulta", and even plain ol' "whoring" get by.  Kimper noted a podcast titled "A Cop and Two Wh***es", which startled me, because it's a violation of the first-and-last-letters-only rule; how did that "e" get preserved?  I then discovered that "cocks" becomes "c**ks", with the k preserved (though "fucks" becomes "f***s", not "f**ks").  Another Apple mystery.
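For the programmers in the audience, here is a minimal sketch of what a whole-word asterisker of this kind might look like.  It is a guess at the behavior described above, not Apple's actual code: the word list `ASTERISKABLES` and the function names `mask` and `asterisk_whole_words` are all hypothetical, and the sketch applies the first-and-last-letters-only rule uniformly, so it does NOT reproduce oddities like "Wh***es" or "c**ks".

```python
import re

# Hypothetical list of asteriskables; the real iTunes list is not public.
ASTERISKABLES = {"cum", "whore", "cock", "twat", "sperm"}

def mask(word):
    """Keep the first and last letters, asterisk everything in between."""
    if len(word) <= 2:
        return word
    return word[0] + "*" * (len(word) - 2) + word[-1]

def asterisk_whole_words(title, banned=ASTERISKABLES):
    """Mask a word only when the WHOLE word is on the list (case-insensitive)."""
    def replace(match):
        word = match.group(0)
        return mask(word) if word.lower() in banned else word
    # Each maximal run of letters counts as one word.
    return re.sub(r"[A-Za-z]+", replace, title)

print(asterisk_whole_words("Cum sancto spiritu"))
print(asterisk_whole_words("cockring"))
```

Because the comparison is against whole words, "cockring" and "whoring" pass through untouched, while Latin "cum" gets caught along with its homograph.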

Then Q. Pheevr wrote to say he'd blogged about iTunes's automatic asterisker a couple of times, and added still another complication: the Canadian iTunes store works differently from the U.S. store, at least to the extent of bleeping (of all things) "pole", which the U.S. store sensibly leaves untouched (since most of its uses in song titles are non-sexual).  Finally, another bit of marginal irregularity:

Also, the Canadian iTunes Store, which is the one I use, censors "bordel" in French, except in one song title ("Le p'tit bordel," by Marie France), but not "bordello" or "brothel" in English. The cross-linguistic inconsistency is amusing, if not entirely surprising, but the fact that one song seems to have escaped the automatic asterisker is a bit mysterious.

Pheevr's blog posts note that while innocent eyes are protected from SEEING the nasty words, most of the time when the songs are sampled innocent ears can HEAR them.  There is, as yet, no automatic bleeping of the samples.

And to see the awful consequences of abandoning the whole-word rule, consider what happened when a British labor union opted for string searching.  As reported by Jon Henley in his "Diary" column in the Guardian of 1 August (passed on to me by Mark McConville), in the wake of Mel Gibson's drunken reviling of the "fucking Jews":

Our spirits are much lifted, though, by news that manufacturing, technical and skilled persons' union Amicus, which has already distinguished itself by monitoring employees' emails and hiring private detectives to keep an eye on its more unruly members, has started automatically filtering its internet discussion forums for unsuitable words, phrases and letter combinations. But while Scunthorpe understandably becomes "S****horpe", and Blackpool (a tad unnecessarily, to our mind) as "Black***l", it seems "entitled", "parse" and even the unspeakable "Saturday" survive intact. Are we alone in wondering how this can be?

You can see the system in action here.  But: POO??  I rushed to my iTunes to check on "poo", and while I was at it, "pee", and was relieved to see that these two nursery words are fit for American eyes.
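A string-searching filter of the sort Henley describes can be sketched in a few lines.  This is only an illustration under stated assumptions: the list `BANNED_SUBSTRINGS` is my guess from the two reported outputs, the function name `asterisk_substrings` is made up, and nothing here is known about how the Amicus software actually works.

```python
import re

# Assumed list, inferred from "S****horpe" and "Black***l"; the real list is unknown.
BANNED_SUBSTRINGS = ["cunt", "poo"]

def asterisk_substrings(text, banned=BANNED_SUBSTRINGS):
    """Asterisk every occurrence of a banned letter sequence, even inside longer words."""
    pattern = re.compile("|".join(re.escape(b) for b in banned), re.IGNORECASE)
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)

print(asterisk_substrings("Scunthorpe"))  # S****horpe
print(asterisk_substrings("Blackpool"))   # Black***l
print(asterisk_substrings("Saturday"))    # Saturday
```

With this two-item list "Saturday" survives, which fits Henley's report; put "turd" on the list and the same function would give "Sa****ay".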

McConville also points out another case where the visual offense was judged worse than the auditory one (see above on iTunes titles and samples):

... on REM's 1992 album "Automatic For The People", a song which is quite clearly meant to be called "Fuck Me Kitten" is listed on the sleeve as "Star Me Kitten". I remember at the time one of the band saying that Warner refused to allow the (written) word "fuck" to appear on the title list (this was the era of Tipper Gore and Prince's "You Sexy Motherfucker"), although they had no problem with the line appearing in the song itself.

Along the same lines, the Nirvana single "Rape Me" was altered to "Waif Me" for Wal-Mart and Kmart releases of the album "In Utero" -- "an intentionally comical name chosen by Cobain himself", according to Wikipedia.  (Thanks to Martin Marks for the pointer.)  Comical, maybe, but also phonologically VERY close to "rape".

Various artists have used their very own avoidance characters, drawing from the set used in comic-strip generic curses (like "*$&!"), in song titles like these:

Can't F#%k Wit Me
F#@k the Creationists
I Wanna F##k U
F@#k Da Police
Don't F$%k With Us

As OSTENTATIOUS (and instantly interpretable) avoidance, these are hard to beat.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at August 7, 2006 07:10 PM