Language Log: Getting kids (and politicians) wrong

October 13, 2004

Getting kids (and politicians) wrong

Yesterday I heard a fascinating talk by Paul Foulkes, of the University of York, on the development of local dialect features in the speech of children in Newcastle, England. He reported a large number of interesting results that I won't discuss in this post. Instead, I want to speculate about one tantalizing fact that he mentioned in passing: when he asked a large number of adults, both local and non-local, to guess the sex of two-to-four-year-old children by listening to short utterances, about two-thirds of the judgments were wrong.

Here's the background. Some characteristic features of the local Newcastle dialect are gender-associated in young adults: glottalized variants of medial /p/, /t/, /k/ are found more in males than females, while pre-aspirated variants of final /p/, /t/, /k/ are commoner in females than in males. This differentiation begins to develop in children from about the age of three. Part of the reason is that mothers' speech to their children is differentiated in the appropriate way by sex.

Foulkes and his colleagues wanted to look at the perception of these features, by local and non-local listeners. Since it's hard -- maybe impossible -- to tell the sex of children at the age of 2-4 from their voice alone, they figured that these gender-associated dialect features might be used by local listeners. And they were.

The experiment was designed using a selection of naturally-occurring isolated-word stimuli, recorded during the original studies of the development of this structured variation. So the stimuli were not well controlled, as Paul was at pains to explain. That is, half of them overall came from girls and half from boys; the stimuli with medial glottalized stops were likewise half from girls and half from boys; the stimuli with pre-aspirated final stops were ditto; and so on; but the interaction of these features with speech rate, loudness, pitch range, and other stereotypically gender-associated features was not controlled. It was not exactly random, either -- it was just whatever it happened to be in the real-life recordings they had made.

It's probably because of this complex and uncontrolled variation that the results of the experiment were somewhat messy. The effect they were looking for -- evidence that the local listeners were sensitive to the gender-linked features of the local dialect -- could be seen in the data, but it was a small effect and a somewhat erratic one.

The thing that most interested me, though, was something that Paul didn't focus on in his talk, because it wasn't part of the design or the planned interpretation of the experiment in question. He hadn't analyzed the effect in the experimental data, didn't have the exact figure on his slides, and only mentioned it in response to a question. So I don't want to put him on the spot with respect to any claims about this result -- for now let's just say that it's a speculation on my part about a result that might be true. If it turns out (after Paul checks) that I've gotten it wrong, I'll post a correction. However, effects of this general type do happen, and it's interesting to see why.

The way this experiment was designed, if the subjects had guessed "boy" and "girl" without any information from the stimuli, they would have been right half the time. A subject could have guessed "boy" all the time, or "girl" all the time, or "boy" and "girl" alternately, or "boy" and "girl" randomly, or "boy" through the first half of the experiment and "girl" through the second half. It doesn't matter -- they'd still be right half the time. (I think -- I didn't check for the relevant experimental design issues.)
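
The half-the-time claim is easy to check with a toy simulation. What follows is only a sketch under assumptions of my own, not a reconstruction of the actual experiment: the stimulus set is exactly half boys and half girls, the presentation order is shuffled at random, and the strategy names, stimulus count and run count are all made up for illustration.

```python
import random

def accuracy(responses, truths):
    """Fraction of responses that match the true labels."""
    return sum(r == t for r, t in zip(responses, truths)) / len(truths)

def simulate(strategy, n_stimuli=100, n_runs=2000, seed=0):
    """Mean accuracy of a stimulus-blind strategy over shuffled balanced sets."""
    rng = random.Random(seed)
    # Assumption: exactly half the stimuli come from boys, half from girls.
    truths = ["boy"] * (n_stimuli // 2) + ["girl"] * (n_stimuli // 2)
    total = 0.0
    for _ in range(n_runs):
        rng.shuffle(truths)  # assumption: presentation order is random
        responses = [strategy(i, n_stimuli, rng) for i in range(n_stimuli)]
        total += accuracy(responses, truths)
    return total / n_runs

# Each strategy sees only the trial index, never the stimulus itself.
STRATEGIES = {
    "always boy": lambda i, n, rng: "boy",
    "always girl": lambda i, n, rng: "girl",
    "alternate": lambda i, n, rng: "boy" if i % 2 == 0 else "girl",
    "random": lambda i, n, rng: rng.choice(["boy", "girl"]),
    "boy then girl": lambda i, n, rng: "boy" if i < n // 2 else "girl",
}

if __name__ == "__main__":
    for name, strategy in STRATEGIES.items():
        print(f"{name:14s} {simulate(strategy):.3f}")
```

Under these assumptions every strategy averages out at about one half. The order-dependent ones (alternating, or "boy" first and "girl" second) only do so because the order is random with respect to sex, which is one of the design issues I didn't check.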

So if the subjects were wrong about 2/3 of the time overall, that means that they were getting some useful information from the stimuli. It was just wrong information, information that led them to a false conclusion. What could this have been? Paul did some multiple regression analyses that showed that subjects' judgments were being influenced by features like amplitude, pitch, speech rate and voice quality. Basically (as I recall) his subjects showed the effects of the conventional stereotypes that girls' speech is softer, slower, higher in pitch, and more breathy-voiced, while boys' speech is louder, faster, lower in pitch, and more creaky-voiced. These stereotypes are not random: they correspond to an image of girls as more polite and controlled, and boys as more, well, boisterous. (Except for the pitch business, which is complicated but probably reflects an influence of adult norms along with a confounding influence of the pitch-raising effects of greater loudness and vocal effort.)

Now at least for some of these features, these stereotypes are known to be wrong. For example, girls talk faster than boys do, on average, according to the experimental results I've seen cited. In the particular stimuli used in Paul's experiment, it might well happen to be true that several salient features were distributed in the anti-stereotypical way.

Notice that it's not enough for the stereotyped features to be distributed randomly. Suppose that girls and boys are equally likely to be loud or soft, but listeners think that "loud" means "boy" and "soft" means "girl", and so whenever in the experiment they hear a recording of a kid yelling, they say "boy", and whenever they hear a recording of a kid speaking softly, they say "girl". The result will still be that they're right half the time.

For them to be wrong more often than right, the basis of their response has to be worse than wrong in the sense of unconnected with the facts. It actually has to be the opposite of the facts, to some extent. There's an interesting story to be told about why humans -- who are usually such accurate statistical learners -- can (all too often) develop shared perceptual associations that are so significantly at variance with the facts. (Positive feedback due to greater salience of stereotypical features will get you random beliefs, but sometimes there's something more...)
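
To make the arithmetic concrete, here is a toy model, again only a sketch under invented assumptions: there is a single binary cue (loud versus soft), the stimulus set is balanced by sex, and the listener always says "boy" for loud and "girl" for soft. The function names and the probabilities are hypothetical and are not taken from Paul's data.

```python
import random

def expected_accuracy(p_loud_given_boy, p_loud_given_girl):
    """Exact accuracy of the 'loud means boy' listener on a balanced set."""
    # Right on loud boys and on soft girls; each sex is half of the stimuli.
    return 0.5 * p_loud_given_boy + 0.5 * (1.0 - p_loud_given_girl)

def stereotype_accuracy(p_loud_given_boy, p_loud_given_girl, n=100_000, seed=0):
    """Simulated accuracy of the same listener over n balanced stimuli."""
    rng = random.Random(seed)
    correct = 0
    for i in range(n):
        sex = "boy" if i % 2 == 0 else "girl"  # balanced by construction
        p_loud = p_loud_given_boy if sex == "boy" else p_loud_given_girl
        loud = rng.random() < p_loud
        guess = "boy" if loud else "girl"  # the stereotype-driven response
        correct += guess == sex
    return correct / n

if __name__ == "__main__":
    # Hypothetical cue distributions, chosen only to illustrate three cases.
    cases = {
        "cue unrelated to sex": (0.5, 0.5),
        "cue matches stereotype": (0.8, 0.2),
        "cue reversed": (1 / 3, 2 / 3),
    }
    for label, (p_boy, p_girl) in cases.items():
        print(f"{label:24s} exact={expected_accuracy(p_boy, p_girl):.3f} "
              f"simulated={stereotype_accuracy(p_boy, p_girl):.3f}")
```

In this model an unrelated cue leaves the listener at one half, and accuracy drops to one third only when the cue is distributed against the stereotype (here, boys loud a third of the time and girls loud two thirds of the time). A real listener is presumably weighing several cues at once, so this shows the logic and nothing more.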

The point that I want to make here is that stereotypes have a powerful influence on the perception of the "subtle style cues" in speech that so much of current political journalism is focusing on. These stereotypes can be associated with groups (like Southerners or New Englanders), or with individuals (like George W. Bush or John Kerry). And despite their powerful influence on perception, they can be quite wrong.

Posted by Mark Liberman at October 13, 2004 08:22 AM