Ask Language Log: sounds and meanings

Barbara Duncan asks:

Do you know of any language where sounds have consistent meanings from word to word? If so, what language? If not, why not?

If by "sounds" you mean phonemes or phonetic features, then there are no languages where sounds have consistent meanings from word to word. This generalization is what Charles Hockett ("The origin of speech", Scientific American, 203: 88-96, 1960) called "duality of patterning":

The meaningful elements in any language -- "words" in everyday parlance, "morphemes" to the linguist -- constitute an enormous stock. Yet they are represented by small arrangements of a relatively very small stock of distinguishable sounds which are in themselves wholly meaningless. This "duality of patterning" is illustrated by the English words "tack," "cat" and "act." They are totally distinct as to meaning, and yet are composed of just three basic meaningless sounds in different permutations. Few animal communicative systems share this design-feature of language -- none among the other hominoids, and perhaps none at all.

A hundred years ago, Ferdinand de Saussure referred to the meaninglessness of the atomic elements of linguistic sound with the famous phrase "the arbitrariness of the sign" ("l'arbitraire du signe" in the original French).

Why do all languages exhibit duality of patterning? Why are linguistic signs, in general, arbitrary combinations of elementary discrete sound elements that are themselves meaningless? Basically, this digital encoding is the only way to have a large and expandable vocabulary whose elements are transmitted reliably.
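As a rough illustration of why a small stock of meaningless elements can support a very large vocabulary, here is a minimal Python sketch. The phonemic spellings are approximate, and the inventory size of 40 and maximum length of 4 are assumptions chosen only for illustration; the count ignores the restrictions that any real language places on possible sound sequences.

```python
from itertools import permutations

# Hockett's example: three meaningless sounds, written here in a rough
# phonemic spelling (an assumption made for illustration).
SOUNDS = ("t", "æ", "k")

# The three English words Hockett cites, keyed by their sound sequence.
ENGLISH_WORDS = {
    ("t", "æ", "k"): "tack",
    ("k", "æ", "t"): "cat",
    ("æ", "k", "t"): "act",
}

# All orderings of the three sounds, and the ones that are cited words.
orderings = list(permutations(SOUNDS))
attested = [ENGLISH_WORDS[o] for o in orderings if o in ENGLISH_WORDS]


def possible_forms(inventory_size: int, max_length: int) -> int:
    """Count all sequences of 1..max_length elements drawn from the inventory."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))


print(len(orderings))         # 6
print(sorted(attested))       # ['act', 'cat', 'tack']
print(possible_forms(40, 4))  # 2625640 (hypothetical inventory of 40, length <= 4)
```

Even under these toy assumptions, the number of available forms runs into the millions, far more than any vocabulary needs.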

Here's how I explain it in the lecture notes for Linguistics 001:

Experiments on vocabulary sizes at different ages suggest that children must learn an average of more than 10 items per day, day in and day out, over long periods of time.

A sample calculation:

* 40,000 items learned in 10 years
* 10 x 365 = 3,650 days
* 40,000 words / 3,650 days = 10.96 words per day
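
The same arithmetic can be checked in a few lines of Python. The variable names are illustrative; the figures are simply those from the list above.

```python
# Check the vocabulary-learning arithmetic from the sample calculation.
ITEMS_LEARNED = 40_000   # vocabulary items, as in the list above
YEARS = 10               # learning period in years
DAYS_PER_YEAR = 365      # leap days ignored, as in the list above

days = YEARS * DAYS_PER_YEAR
items_per_day = ITEMS_LEARNED / days

print(days)                     # 3650
print(round(items_per_day, 2))  # 10.96
```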

Most of this learning is without explicit instruction, just from hearing the words used in meaningful contexts. Usually, a word is learned after hearing only a handful of examples. Experiments have shown that young children can learn a word (and retain it for at least a year) from hearing just one casual use.

Let's put aside the question of how to figure out the meaning of a new word, and focus on how to learn its sound.

You only get to hear the word a few times -- maybe only once. You have to cope with many sources of variation in pronunciation: individual, social and geographical, attitudinal and emotional. Any particular performance of a word simultaneously expresses the word, the identity of the speaker, the speaker's attitude and emotional state, the influence of the performance of adjacent words, and the structure of the message containing the word. Yet you have to tease these factors apart so as to register the sound of the word in a way that will let you produce it yourself, and understand it as spoken by anyone else, in any style or state of mind or context of use.

In subsequent use, you (and those who listen to you speak) need to distinguish this one word accurately from tens of thousands of others.

(The perceptual error rate for spoken word identification by motivated listeners is less than one percent, where words are chosen at random and spoken by arbitrary and previously-unknown speakers.)

Let's call this the pronunciation learning problem. If every word were an arbitrary pattern of sound, this problem would probably be impossible to solve.

What makes it work? In human spoken languages, the sound of a word is not defined directly (in terms of mouth gestures and noises). Instead, it is mediated by encoding in terms of a phonological system:

  1. A word's pronunciation is defined as a structured combination of a small set of elements
    The available phonological elements and structures are the same for all words (though each word uses only some of them)
  2. The phonological system is defined in terms of patterns of mouth gestures and noises
    This "grounding" of the system is called phonetic interpretation, and it's the same for all words

How does the phonological principle help solve the pronunciation learning problem? Basically, by splitting it into two problems, each one easier to solve.

  1. Phonological representations are digital, i.e. made up of discrete elements in discrete structural relations.
    Copying can be exact: members of a speech community can share identical phonological representations.
    Within the performance of a given word on a particular occasion, the (small) amount of information relevant to the identity of the word is clearly defined.
  2. Phonetic interpretation is general, i.e. independent of word identity
    Every performance of every word by every member of the speech community helps teach phonetic interpretation, because it applies to the phonological system as a whole, rather than to any particular word.
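
The two-part split can be sketched with a toy model in Python. Everything here is a simplifying assumption made for illustration: the three categories, the single phonetic dimension, and the speaker offsets are all hypothetical, and real phonetic interpretation is far richer. The point is only that two physically different performances can map back to an identical discrete representation, using one mapping that is shared by all words.

```python
# A toy model: each phonological element is a discrete category, and each
# performance is a noisy point on a single made-up phonetic dimension.
PROTOTYPES = {"a": 0.0, "b": 1.0, "c": 2.0}  # hypothetical categories


def perform(word, offsets):
    """Render a word as continuous values: prototype plus a per-token offset."""
    return [PROTOTYPES[element] + offset for element, offset in zip(word, offsets)]


def perceive(signal):
    """Map each continuous value back to the nearest discrete category."""
    return [
        min(PROTOTYPES, key=lambda e: abs(PROTOTYPES[e] - value))
        for value in signal
    ]


word = ["c", "a", "b"]

# Two "speakers" produce the same word with different (made-up) deviations.
speaker_1 = perform(word, [0.2, -0.3, 0.1])
speaker_2 = perform(word, [-0.4, 0.25, -0.15])

print(speaker_1 == speaker_2)                              # False
print(perceive(speaker_1) == perceive(speaker_2) == word)  # True
```

Note that `perceive` knows nothing about any particular word: it is defined over the categories themselves, so every performance of every word provides evidence about the same mapping.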

For more on this topic, see Michael Studdert-Kennedy and Louis Goldstein, "The Gestural Origin of Discrete Infinity", in Christiansen & Kirby, eds., Language Evolution, OUP, 2003. And for an amusing and interesting -- but completely unsuccessful -- attempt to design a language with more systematic sound-meaning correspondences, see Jorge Luis Borges' essay "El Idioma Analítico de John Wilkins", Otras Inquisiciones (1952).

However, the arbitrariness of the sign is not complete. Every language has a certain amount of phonetic symbolism. There are always onomatopoeic words like whoosh and tick-tock, and this natural sound/meaning connection may bleed through to some extent into fairly large areas of vocabulary, as in English flip/flap/flop, clink/clank/clunk, etc. And in some languages, there are specific classes of words, sometimes called "ideophones", where these quasi-natural sound-meaning correspondences are systematized, sometimes to a considerable extent.

I've been interested in this area since I was an undergraduate, when I worked for a while on sign languages, and thought about the imitative aspects of signing. And it came up again in my work in graduate school, in connection with the form and interpretation of pitch patterns in intonational languages, again because a crucial feature of ideophonic systems is that their semantics is mainly iconic rather than symbolic. Here's something that I wrote about this in my dissertation (The intonational system of English, 1975):

An additional complication in the analysis of intonational units arises because of the extremely strong role played by phonetic symbolism in constraining intonational meanings. In no other aspect of language is "l'arbitraire du signe" less manifest than in intonation, and we have every reason to believe that a substantial portion of the content of the intonational lexicon of English is determined by the universal symbolic (better: metaphorical) value of tones and tone-sequences. However, there are also many clear examples of language-specific tunes, and meanings for tunes, so that some degree of arbitrariness or conventionalization must be built into the system.

In the non-intonational lexicon, phonetic symbolism clearly cross-cuts morphology and even phonology, with non-distinctive oppositions (e.g. short/long, for English) and non-morphological sequences (e.g. -ink in wink and blink) often playing a role.

Some psychologically significant intonational oppositions (e.g. terminally falling / terminally rising) should be seen as being of this nature. [...] [In a description based on level tones,] the rising/falling distinction is not a direct characteristic of the phonology or morphology of the intonational system, but rather an overlaid distinction, a complex property of the systematic representation of the tune, like those distinctions which would be required in defining phonetic symbolism in general.

I think that there is excellent evidence that this is true. We have a sense that "rising" gestures in general share some property by opposition to "falling" gestures. Weak and strong beats in music are conceived of as rising and falling respectively (arsis/thesis, levatio/positio, upbeat/downbeat etc.). In dance, rising up on the toes is generally an arsic gesture, while coming down flatfooted is generally thetic. Raising the eyebrows is an other-directed gesture (greetings, expression of skepticism etc.), while lowering the eyebrows is a more self-directed gesture (signaling concentration, etc.). In sign languages, questions, nonterminal pauses etc. are usually signaled with an upward motion of the hands, while more "final" terminations are signaled with a downward motion (superimposed on whatever signs are being employed in the "utterance"). Examples could be multiplied indefinitely; the point is simply that "rising" and "falling" have some general metaphorical value independent of any role that they may play in intonation, and that the roles which can in general be attributed to these concepts in intonation (e.g., other-directed vs. self-directed, nonfinal vs. final) are exactly what would be expected on the theory that we have proposed, that they are essentially para-linguistic metaphors.

The fact that the normal metaphorical value of "rising" and "falling" is sometimes violated in the case of particular intonational tunes shows that universal sound symbolism does not completely determine the meaning of intonational words, although it obviously has a strong influence.

If we are to understand this situation, we would do well to examine the properties of conventionalized systems of sound-symbolism in general. Following a usage originally established to cover such phenomena in Bantu languages, and since extended to other cases, we will call these aspects of language ideophonic systems.

Ideophonic systems have five properties that will be of interest to us; the first and last of these they share with more conventional linguistic systems, while the remaining three tend to differentiate them from other aspects of language. [...]

1) Ideophones are words; that is, they are made up of sequences of systematically distinctive elements (= phonemes), in patterns whose structure is determined by a morphology.

2) In general, the meaningful units in an ideophonic system are not directly driven by the morphological analysis of a particular ideophone, but rather by some set of (more or less complex) properties defined on it.

3) The meanings of these units are typically metaphorical rather than referential; that is, they refer to a class of analogous aspects of different cognitive structures, rather than to any particular aspect of any particular such structure.

4) Ideophonic signs are not arbitrary -- the meanings of particular elements of an ideophonic system are strongly influenced by universal considerations. However, in any particular case, these form-meaning correspondences may become a specific, characteristic system, which is usually consistent with the universal basis, but is not entirely predicted by it.

5) Within a given system, "lexicalization" is possible -- that is, specific ideophonic words may take on particular meanings which are not predicted either by the universal basis, or by the particular system they belong to.

The most familiar linguistic examples of ideophones are echoic words, like English clang, clank, etc. Words that are not exclusively echoic may also have an ideophonic component -- for example, it is not completely accidental that "gong" refers to a large metallic disk that gives a loud, resonant tone when struck, while "flute" refers to a high-pitched wind instrument. However, there are cases in which ideophonic systems extend far beyond the metaphorical relationship of the sound of a word to a non-linguistic sound.

For example, in Bahnar (Guillemet 1959, cited in Diffloth 1972), the words /blɔːl/ and /bloːl/ are glossed as follows:

/blɔːl/ 1. when a small fish quickly jumps out of the water.
2. when a man who has debts comes to your door or appears at your window.
/bloːl/ 1. when a big fish quickly jumps out of the water.
2. when an important person comes to your door or appears at your window.
3. when a great effort is made to reach an object which is out of reach.
4. suddenly speaking louder when one cannot be heard well.

Several aspects of the above example deserve comment. 1) We are dealing with words, made up of sequences of phonemes, not just free expressive noises; 2) the aspect of the word that is changed to produce a difference in meaning is (in this case) a single feature of a single phoneme, not a substitution of phonemes or sequences of phonemes; 3) the meanings of these words are extremely abstract properties, which pick out classes of situations related in some intuitively reasonable, but highly metaphorical way: the general "meaning" seems hopelessly vague and difficult to pin down, yet the application to a particular usage is vivid, effective and often very exact; 4) the particular phonological opposition which differentiates the two words, /ɔ/ vs. /o/, has a non-arbitrary connection to the meaning difference [...]

This last point deserves some amplification. Suppose we make a partial listing of certain pairs of adjectives with intuitively corresponding properties:


Now, there is some phonological feature opposition (say tense/lax) which characterizes the difference between Bahnar /ɔ/ and /o/. In the system of ideophones of which the examples /blɔːl/ and /bloːl/ are members, this feature opposition (if it occurs in the proper position) has semantic content. But it is not at all clear that we want to say that tense means "big" while lax means "little", and that the other meanings are metaphorical extensions of these core meanings. Rather, what seems to happen is that a systematic analogy is made between the phonological opposition "tense/lax" itself and the class of semantic oppositions big/little, important/unimportant etc., so that the actual "meaning" of the choice tense depends entirely on the nature of the situation to which one decides to apply the ideophone in question. Thus it may be true, in an ideophonic system, that the only meaning of a given element lies in the ability to analogize (in a systematic way) from its phonetic or phonological character to a large number of different concepts. In one sense, then, an ideophonic element means itself, given the human ability to make a kind of free-ranging metaphor out of its particular phonological or phonetic properties.

On this view, the non-arbitrary character of ideophonic sound-meaning correspondences, and their referential indeterminacy, the apparent abstractness of their meanings, are closely connected. This connection between the non-arbitrariness of the meaningful element and the essentially metaphorical character of the meaning is clearly exemplified in these examples from Korean (cited in Diffloth 1972):

tɔllɔŋ tɔllɔŋ 1. sound of small bells.
2. swaying movement of something suspended.
3. feeling of being left alone when everyone has gone.
4. someone appears flippant.
ttɔllɔŋ ttɔllɔŋ 1. sound of narrow bells, bells hit hard.
2. swaying movement of short object, tightly suspended.
3. feeling of being left alone when everyone has gone; shock of solitude comes more suddenly.

N.B. Diffloth cites more than twenty Korean examples from a "paradigm" created by holding constant the formal property "repeated disyllabic word with medial -l-"; this property seems to represent some meaning like "back and forth movement, oscillation, suspension, etc." Apparently there are thousands of possible words in the ideophonic system of Korean, including the possibility of nonce formations.

The meaning of the distinction t/tt, in these examples, is again best considered as the ability to construct a metaphorical connection between the phonological opposition itself and any one of a class of rather different concepts.

Following what I believe is a fairly standard usage, we will call this mode of meaning, in which the signifié is a general metaphorical extension of some intrinsic property of the signifiant, by the term iconic. [...]

Iconic meaning is also characteristic of non-linguistic expressive noises, gestures, etc. However, an interesting aspect of ideophonic systems is that they are linguistic, made up of phonemic sequences which are often arranged according to fairly restrictive morpheme structure constraints. As a result, they are more prone to conventionalization than paralinguistic systems generally are -- the range of possible metaphors is often restricted, the meanings of the ideophonic elements become less iconic and more arbitrary, and the degree of compositionality of the resulting ideophonic words may decrease, in just the same way that polymorphemic words in general decrease in compositionality across time.

For more on this, from people who know a lot more about it than I do, you could take a look at two relatively recent books:

Leanne Hinton, ed., Sound Symbolism, CUP 1994
Erhard Friedrich Karl Voeltz & Christa Kilian-Hatz, eds., Ideophones, John Benjamins, 2001

[John Cowan writes:

Even Wilkins's language doesn't truly have phonemes with consistent meaning. All that can be said is that words for closely related concepts differ in the last phoneme, and so on up the tree, but initial 'r' does not mean the same thing as 'r' in the third position: the former has a single (broad) meaning, but the latter's meaning depends on the identity of the first two phonemes.

Put that way, the language isn't so very different from natural languages with single-phoneme morphemes, like the one-letter clitic prepositions in the Slavic languages, or the "changed tone" in Cantonese, which is a set of homophonous bound morphemes (or a single highly polysemous one) that changes the tone of the preceding syllable to mid rising (35). In both cases, these morphemes formerly had more phonemes that have been lost; the 35 toneme was apparently once a whole syllable that happened to have 35 tone.]


