Parsing's a bitch, ain't it? At least, that's the hook for a recent Newsweek article about a theory of the brain's mechanism for telling time, which involves tracking signal propagation in neurons following perceptual events, or something; it was a bit hard to tell from the article's description.
The work in question is by UCLA neurobiologist Dean Buonomano,¹ and among his webpages there are some nice Java animations and sound files illustrating the brain's temporal abilities, suitable for intriguing undergraduates with. One of them shows that timing is important in speech perception. If an [s] sound is followed by a few milliseconds of silence and then by the vowel [i], the listener perceives the [s] as the lone onset consonant in a syllable [si]. If the [s] is separated from the [i] by just a few more milliseconds of silence, on the other hand, the listener perceives the silence as a voiceless stop between the [s] and the [i]. You get the impression of having heard the syllable [sti], instead of [s]...[i].² The point seems to be that in order to perform this feat, the brain has to have some quite sensitive mechanism for distinguishing small temporal intervals.
The Newsweek article starts off with a famous mondegreen to lure the reader, but then doesn't do anything to explain how that particular misparse is related to the question of timing. Never fear, though, Language Log is here!
So, here's the mondegreen in question:
Jimi sounds like he's saying, "Scuse me, while I kiss this guy", when in fact the lyric is "Scuse me, while I kiss the sky". That is, the listener mishears the sequence
/kɪsðəˈskaɪ/
as
/kɪsðəsˈgaɪ/
Now, it is a puzzle why this would happen. One doesn't normally mishear /g/ for /k/ or vice versa; you wouldn't mix up 'coat' and 'goat', for example, in the usual case. But in Purple Haze, some particularities of English pronunciation are at work to mislead you.
/k/ and /g/ are usually described as being pronounced exactly alike except for the vibration of the vocal cords: /k/ is voiceless and /g/ is voiced. Otherwise everything about the configuration of the tongue and oral cavity is identical -- they're both velar stops. In fact, however, there's more than one way to skin a /k/ in English, and some /k/s are more like /g/ than others.
English voiceless stops are pronounced in several different ways depending on their position in the syllable. A voiceless stop alone at the beginning of a syllable gets an extra oomph, an extra puff of air, making for a longer, more perceptible period of voicelessness before the vowel sound starts up (a longer "Voice Onset Time"). That extra puff of air is called aspiration, for those of you keeping track at home, and is transcribed with a superscripted 'h' after the consonant: [kʰ]. In 'coat', e.g., that initial /k/ is aspirated, so it's not pronounced just [koʊt], but rather [kʰoʊt], when you really get down to it. That aspiration really makes the voicelessness of the /k/ stand out and sound quite different from the voiced /g/ at the beginning of 'goat', [goʊt]. You'd never get them mixed up.
The trick is, the aspiration doesn't show up everywhere. Voiceless stops are not aspirated when they occur after an /s/ at the beginning of a syllable. (You can feel the difference if you put your hand in front of your face and alternate saying 'pot', [pʰɑt], and 'spot', [spɑt] -- in the first you should feel the puff on /p/ but not in the second). In such cases, the absence of the little puff of air means that the voiceless period associated with the stop is shorter and less perceptible. The upshot is that the /k/ sound following /s/ in complex syllable onsets sounds a lot more like a /g/ than other /k/s do.
This isn't normally a problem, because there aren't any syllables in English that begin with /sg/ -- that's just not a legal English onset consonant cluster, so it doesn't matter if the /k/ sounds like a /g/; you know it has to be a /k/ in that context. But in connected speech, you might get a word that ends in a vowel (like 'the') in front of a word that starts with /sk/ (like 'sky')...and in the right circumstances, the listener might think that the /s/ belonged with the preceding word that ended in the vowel. That can happen, for instance, if there's a very frequent alternate word (like 'this'), suitable to the syntactic and semantic context, which sounds just like the vowel-ending word ('the') but ends in an /s/.
Imagine the listener got as far as making that mistake, of parsing the /s/ as part of the previous word, hearing 'this ...' rather than 'the s...'. Then the next problem they'd have to solve would be to try to identify the next consonant in line in the speech stream. It's definitely a velar stop -- but is it /k/ or /g/? "Well," their perceptual system reasons to itself, "if it were a /k/ at the beginning of a word like this, it would be aspirated -- I'd expect a longer period of voicelessness right here. Given that there's not too much voicelessness, I guess it must be a /g/." And presto! 'kiss the sky' turns into 'kiss this guy'.³
So it does all have to do with timing after all -- the brain has to detect the difference between 30ms of voicelessness and 60ms of voicelessness, and the research described in the Newsweek article is about figuring out how it pulls that off.
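For the programmers in the audience, the chain of reasoning above can be caricatured in a few lines of Python. This is a toy sketch, not a model of anything the brain (or the research in the Newsweek article) actually does: the function names are made up, the 30 ms and 60 ms values are just the ballpark figures mentioned above, and the 45 ms category boundary is an assumption picked only because it sits halfway between them.

```python
# Toy sketch of the 'kiss the sky' / 'kiss this guy' misparse.
# All names and the boundary value are hypothetical, for illustration only.

SHORT_VOT_MS = 30.0   # unaspirated /k/, as after /s/ in 'sky' (ballpark figure)
LONG_VOT_MS = 60.0    # aspirated /k/, alone in the onset as in 'coat' (ballpark figure)
BOUNDARY_MS = 45.0    # assumed /g/-vs-/k/ boundary for a listener expecting aspiration


def produced_vot_ms(after_s_in_onset: bool) -> float:
    """Voicelessness a speaker produces for /k/ at the start of a syllable."""
    # After /s/ in the same onset the stop is unaspirated, so the
    # voiceless period is short; alone in the onset it is aspirated.
    return SHORT_VOT_MS if after_s_in_onset else LONG_VOT_MS


def perceive_velar_stop(vot_ms: float, s_parsed_into_onset: bool) -> str:
    """Which velar stop the listener settles on, given how the /s/ was parsed."""
    if s_parsed_into_onset:
        # /sg/ is not a legal English onset, so it has to be /k/.
        return "k"
    # The listener thinks the stop is alone in the onset, where a real /k/
    # would be aspirated. Too little voicelessness means it must be /g/.
    return "k" if vot_ms >= BOUNDARY_MS else "g"


if __name__ == "__main__":
    vot = produced_vot_ms(after_s_in_onset=True)  # Jimi sings 'sky'
    print("the + s" + perceive_velar_stop(vot, s_parsed_into_onset=True) + "y")
    print("this + " + perceive_velar_stop(vot, s_parsed_into_onset=False) + "uy")
```

The only thing that changes between the two calls is where the listener put the /s/. The acoustic signal, including the short stretch of voicelessness, is identical.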
Oddly enough, just as Prof. Shuy was posting the Newsweek clipping on the bulletin board next to the water cooler, I was reading a blog post all about this exact same thing over at In A Word: It's all about persbective. How would you spell that word of Mary Poppins's?
Update: Dan Everett writes:
However, the parsing shows also how important context is, because no self-respecting acid head from the 60s would have misparsed this. We all knew that Purple Haze was a brand of acid and that it caused you to want to kiss the sky.
Another Everett case for the interdependence of grammar and culture. The man's obsessed!
¹ Anarthrous, anarthrous, anarthrous. It's just a nifty word.
² The text below the sound file refers to the syllable as a 'phoneme'. Just carelessness, we assume. We're nothing here at LL if not charitable.
³ Of course it's not really thinking this. It's just a network of neurons firing away, trying to settle into a pattern that constitutes a sensible linguistic representation for that stream of sound, given all the contextual factors involved. 'Kiss the sky' and 'kiss this guy' are both strongly activated given Jimi's sound waves. In fact, probably '...kiss this guy' gets an activation edge from the semantic context. Kissing usually involves an animate direct object, after all. Those who hear 'kiss this guy' are seriously underestimating the degree to which the narrator in the song is acting funny.
Posted by Heidi Harley at February 9, 2007 01:30 AM