April 22, 2005

Smoke signals and sounds

Geoff Pullum, being a syntactician, looked at the smoke over the Sistine chapel on 4/19 and saw a moral about the complex relations between form and meaning in language:

The white smoke emerging from the chimney ... to announce the election of Pope Benedict XVI was unquestionably a communication, but not a linguistic one. ...

If all human communication were done in ways similar to the way the cardinals initially signal their votes (as opposed to the way the camerlengo ultimately makes the official announcement to the waiting crowd), then although there might be a discipline of semiotics (created by extra-terrestrial visitors, presumably, since such crude forms of communicative signalling would hardly put humans in a position to create academic disciplines), there would be no linguistics.

Being a phonetician, I saw a different moral, one about the difficult relations between messages and signals in speech.

According to an article by Alessandra Stanley, published on 4/20/2005 in the NYT:

Infallibility is expected of popes and television anchors, so there was something arresting about the confused scramble to interpret the first creamy wisps of smoke floating from the Vatican chimney yesterday.

"Darned if it doesn't look darker," said Charles Gibson of ABC, trying to square the appearance of white smoke with the absence of confirmation from the Vatican bell tower. All the networks went live at the first puff of smoke and as they waited, watched and deliberated (beige? charcoal?), none of the anchors could be certain of what they were seeing.

The first few newswire reports (found on Google News) were equally confused and confusing. The confirmatory bells also were rung, but it was almost time for them to sound the hour anyway, and so some sources discounted this signal and called the whole thing a false alarm, until that camerlengo came out and spoke.

The problem with the smoke signals is that everyone involved gets so little practice. The Vatican employees who burn the ballots don't get any rehearsals, at least not in the real setting, and the people watching outside don't get (what psychologists would call) practice trials with feedback. I'm sure that with a few dozen rounds of practice, everyone involved would get their signals straight.

There's a lesson here for language as well as for communication. These smoke-signaling problems help explain why in human spoken languages, the sound of a word is not defined directly (in terms of mouth gestures and noises). Instead, it's encoded in terms of a phonological system, whereby a word's pronunciation is defined as a structured combination of a small set of elements, meaningless in themselves. This was called "duality of patterning" by Charles Hockett in his celebrated list of characteristic properties of human language. More concretely, we could call it the "phonological principle".

Why is phonological encoding needed? Here's the math: a typical child learns about 40,000 words in the ten years between the ages of 3 and 13. 40,000/(10*365) = 10.96 words per day on average. Most of this learning is without explicit instruction, just from hearing the words used in meaningful contexts. Usually, a word is learned after hearing only a handful of examples. Experiments have shown that young children can learn a word (and retain it for at least a year) from hearing just one casual use.
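
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The figures are the ones given above; the variable names are just mine, and the calculation ignores leap days.

```python
# Rough vocabulary-growth arithmetic, using the figures from the text.
words_learned = 40_000      # words acquired between ages 3 and 13
years = 10                  # length of that period
days = years * 365          # ignoring leap days, as in the text

words_per_day = words_learned / days
print(f"{words_per_day:.2f} words per day")  # 10.96 words per day
```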

Let's put aside the question of how to figure out the meaning of a new word, and focus on how to learn its sound.

You only get to hear the word a few times -- maybe only once. You have to cope with many sources of variation in pronunciation: individual, social and geographical, attitudinal and emotional. Any particular performance of a word simultaneously expresses the word, the identity of the speaker, the speaker's attitude and emotional state, the influence of the performance of adjacent words, and the structure of the message containing the word. Yet you have to tease these factors apart so as to register the sound of the word in a way that will let you produce it yourself, and understand it as spoken by anyone else, in any style or state of mind or context of use.

In subsequent use, you (and those who listen to you speak) need to distinguish this one word accurately from tens of thousands of others. (The perceptual error rate for spoken word identification can be less than one percent, where words are chosen at random from a list of dictionary headwords and spoken by arbitrary and previously-unknown speakers, and transcribed by careful and motivated listeners under good acoustic conditions.)

Let's call this the pronunciation learning problem. If every word were an arbitrary pattern of sound, this problem would probably be impossible to solve.

The phonological principle solves this problem by splitting it into two problems, each one easier. One problem is to learn the general relationship between phonological "spellings" and sounds; the other problem is to learn the specific phonological "spellings" of individual words.

  • Phonological representations are digital, i.e. made up of discrete elements in discrete structural relations.
  • Copying can be exact: members of a speech community can share identical phonological representations.
  • Within the performance of a given word on a particular occasion, the (small) amount of information relevant to the phonological identity of the word is clearly defined.
  • The acoustic interpretation of phonological representations is general, i.e. mostly independent of word identity.
  • Thus every performance of every word by every member of the speech community teaches about the system as a whole, and therefore helps listeners to sharpen up their perception of all words, not just the particular one spoken.
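
To get a feel for how much a small set of discrete elements buys you, here is a rough sketch that counts the distinct "spellings" available from a small inventory. The inventory size of 40 and the maximum length of 5 are assumptions chosen for illustration, not figures from any particular language, and the count ignores the restrictions that real languages place on which sequences are allowed. The function name count_spellings is hypothetical.

```python
def count_spellings(inventory_size: int, max_length: int) -> int:
    """Count the sequences of length 1..max_length over a discrete inventory.

    This is an upper bound: it treats every sequence as a possible word,
    which no real language does.
    """
    return sum(inventory_size ** k for k in range(1, max_length + 1))


# Assumed values, for illustration only.
INVENTORY_SIZE = 40   # hypothetical number of distinct phonological elements
MAX_LENGTH = 5        # hypothetical maximum word length, in elements
VOCABULARY = 40_000   # the vocabulary figure used in the text

possible = count_spellings(INVENTORY_SIZE, MAX_LENGTH)
print(possible > VOCABULARY)  # True: far more spellings than words to learn
```

The point of the sketch is only that the learner has a few dozen elements to master rather than tens of thousands of unrelated sound patterns, and every word heard provides more evidence about those same few elements.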

Later in the NYT article there is a telling phrase about the color of the Sistine smoke:

It was those few moments of uncertainty, however, that haunted those who had to hold forth, live, on the air, for minutes with no idea what color smoke was floating to the sky.

But that's not true. They knew what color it was: it was right there in front of their eyes. They just didn't know what its color meant, because they didn't know where to put the threshold in their perceptual space, because they hadn't had enough practice with Sistine smoke, and they didn't have any other relevant experience to bring to bear. At least not in a precise enough way.
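
The threshold idea can be made concrete with a toy sketch. Everything in it is an assumption for illustration: smoke is reduced to a single brightness value between 0 (black) and 1 (white), the "practice trials with feedback" are pairs of a brightness and its correct label, and the threshold is simply placed midway between the average brightness of each category. The names learn_threshold and interpret are hypothetical.

```python
def learn_threshold(trials):
    """Estimate a decision threshold from (brightness, label) practice trials.

    Returns None when there is no experience of one or both categories,
    which is roughly the position the watchers were in.
    """
    white = [b for b, label in trials if label == "white"]
    black = [b for b, label in trials if label == "black"]
    if not white or not black:
        return None
    mean_white = sum(white) / len(white)
    mean_black = sum(black) / len(black)
    return (mean_white + mean_black) / 2


def interpret(brightness, threshold):
    """Say what a given brightness means, given a learned threshold."""
    if threshold is None:
        return "uncertain"
    return "white" if brightness >= threshold else "black"


# With no practice, the percept is clear but its meaning is not.
print(interpret(0.6, learn_threshold([])))  # uncertain

# With even two labeled examples, the same percept gets an interpretation.
practice = [(0.75, "white"), (0.25, "black")]
print(interpret(0.6, learn_threshold(practice)))  # white
```

Note that the brightness value passed to interpret is the same in both calls. What changes is only where the boundary sits.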

Once the feedback came, the watchers tried hard to adjust their thresholds:

Mr. Blitzer on CNN kept going back to the tape.

"It's clearly white," he said. "In hindsight."

It's not only the news anchors who had trouble interpreting what they were seeing. Newsday quotes another watcher whose experience of the smoke's color was also semiotically uncertain and temporally unstable:

"It looks white," said the Rev. Carlos Encina, 40, who is from the small European country of Liechtenstein, "but at the beginning it was black."

Ah, but that was before he knew what it meant.

[Note: some bits of this post are recycled from my lecture notes for ling001]

Posted by Mark Liberman at April 22, 2005 07:20 AM