March 22, 2005

Up with (more) good, down with (more) bad

Yesterday, I heard a fascinating talk on intonational meaning in Japanese. Conclusion: speakers use higher pitches for more intense or emphatic evaluative phrases -- when the evaluation is positive. When the evaluation is negative, speakers use lower pitches for more intense or emphatic evaluation. This was true in conversational speech, but when speakers read lists of phrases, no such effect was found.

The talk was given by Yoshinori Sagisaka, for many years a researcher and research manager at ATR, and now a professor at Waseda University. He described some experiments (done in collaboration with Takumi Yamashita and Yoko Kokenawa) on the sort of simple adverb+adjective phrases that someone might say in response to a "how is it?" question, e.g.

Q. "Aji doo?" (How does it taste?)
A. "Hijooni umai" (It's extremely tasty.)

These phrases involved a number of adverbs, in decreasing scalar intensity:

hijooni    extremely
sootoo     very
wariai     quite
sokosoko   relatively
futsuuni   normally
anmari     not so much

and also a number of adjectives, which come in "positive" and "negative" pairs like clean vs. dirty:

positive (+)              negative (-)
kirei     clean           kitanai    dirty
umai      delicious       mazui      unsavory
kawaii    charming        busaiku    ugly
yasasii   mild            kibisii    strict
omosiroi  interesting     tumaranai  boring

When subjects read such phrases in lists, the pitch contours were basically all the same (these examples are chosen to use accentless four-mora adverbs and adjectives, for ease of comparison, and the plot shows the fundamental frequency at the mid-point of each mora):

Things were different when people were giving their evaluation of a scene, situation, substance etc. in conversation with an experimenter. In this case, the scalar intensity of the adverb interacted with the emotional direction of the adjective. For positive adjectives (clean, charming, interesting, etc.), more adverbial intensification produced higher pitches (this plot shows pitch values at vowel mid-points for six adverbs):

and this plot shows the (proportional) difference between conversational speech and read speech for the highest pitch of each adverb:

In contrast, for negative adjectives (dirty, ugly, boring, etc.), more adverbial intensification produced lower pitches (this plot shows the proportional difference between conversational speech and read speech for the highest pitch of each available adverb):

This table gives the correlation between adverb F0 and adverb intensity:

Condition                                     Correlation between adverbial F0 and adverb "intensity"
Read speech                                   (no significant correlation)
Conversational speech, positive adjectives    +0.85
Conversational speech, negative adjectives    -0.83

Sagisaka also did some perceptual experiments, in which 10 subjects listened to combinations of adverb+adjective at 12 different overall pitch levels, and judged them on a five-point scale from 1 (very bad) to 5 (very good). The average judgments for the positive adjectives showed preference for higher pitch with greater intensification:

max F0 (Hz)  not at all  not so much  normally  relatively  quite  very  extremely
185 (F#)     1.56        1.42         1.70      1.96        2.26   3.48  3.78
174 (F)      1.76        1.62         2.14      2.48        2.60   3.74  4.10
164 (E)      2.10        2.00         2.62      2.94        3.18   4.00  4.16
155 (D#)     2.36        2.56         3.20      3.48        3.82   3.98  4.06
146 (D)      2.84        2.88         3.52      3.72        4.04   3.84  3.98
138 (C#)     3.20        3.16         3.96      4.14        4.14   3.56  3.50
130 (C)      3.48        3.50         4.12      4.18        4.00   3.28  3.12
123 (B)      3.80        3.90         4.10      3.98        3.64   2.94  2.70
116 (A#)     3.98        4.08         3.66      3.60        3.30   2.50  2.38
110 (A)      4.34        3.92         3.12      3.00        2.66   2.20  1.94
103 (G#)     4.34        3.72         2.56      2.56        2.36   1.84  1.64
98 (G)       4.18        3.54         2.30      2.32        2.12   1.70  1.54
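One way to read this table is to ask, for each adverb, which pitch level got the highest average rating. The following is a minimal sketch of that reading, not part of Sagisaka's analysis: the numbers are transcribed from the table above, and the names `ADVERBS`, `RATINGS` and `preferred_pitch` are my own.

```python
# Mean ratings for positive adjectives, transcribed from the table above.
# Keys are max F0 in Hz, listed from high to low; values follow ADVERBS order.
ADVERBS = ["not at all", "not so much", "normally", "relatively",
           "quite", "very", "extremely"]

RATINGS = {
    185: [1.56, 1.42, 1.70, 1.96, 2.26, 3.48, 3.78],
    174: [1.76, 1.62, 2.14, 2.48, 2.60, 3.74, 4.10],
    164: [2.10, 2.00, 2.62, 2.94, 3.18, 4.00, 4.16],
    155: [2.36, 2.56, 3.20, 3.48, 3.82, 3.98, 4.06],
    146: [2.84, 2.88, 3.52, 3.72, 4.04, 3.84, 3.98],
    138: [3.20, 3.16, 3.96, 4.14, 4.14, 3.56, 3.50],
    130: [3.48, 3.50, 4.12, 4.18, 4.00, 3.28, 3.12],
    123: [3.80, 3.90, 4.10, 3.98, 3.64, 2.94, 2.70],
    116: [3.98, 4.08, 3.66, 3.60, 3.30, 2.50, 2.38],
    110: [4.34, 3.92, 3.12, 3.00, 2.66, 2.20, 1.94],
    103: [4.34, 3.72, 2.56, 2.56, 2.36, 1.84, 1.64],
    98:  [4.18, 3.54, 2.30, 2.32, 2.12, 1.70, 1.54],
}


def preferred_pitch(ratings, adverbs):
    """Return {adverb: F0 level (Hz) with the highest mean rating}.

    Ties go to the higher pitch, because max() keeps the first maximum
    it sees and the rows are visited from high to low F0.
    """
    best = {}
    for column, adverb in enumerate(adverbs):
        best[adverb] = max(ratings, key=lambda hz: ratings[hz][column])
    return best


PREFERRED = preferred_pitch(RATINGS, ADVERBS)
for adverb in ADVERBS:
    print(f"{adverb:12s} {PREFERRED[adverb]} Hz")
```

On these figures the best-rated level is 110 Hz for "not at all" (tied with 103 Hz), 116 Hz for "not so much", 130 Hz for "normally" and "relatively", 138 Hz for "quite", and 164 Hz for "very" and "extremely". So the preferred pitch never falls as the adverb gets stronger.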

The judgments for the negative adjectives showed preferences in the opposite direction:

max F0 (Hz)  not at all  not so much  normally  very  extremely
185 (F#)     2.80        2.32         1.52      1.96  2.20
174 (F)      3.12        2.72         1.78      2.26  2.32
164 (E)      3.46        3.22         2.24      2.46  2.46
155 (D#)     3.50        3.56         2.82      2.62  2.52
146 (D)      3.64        3.72         3.22      2.88  2.84
138 (C#)     3.66        3.86         3.66      3.22  3.26
130 (C)      3.52        3.80         3.92      3.64  3.50
123 (B)      3.10        3.60         4.14      3.80  3.84
116 (A#)     2.74        3.10         4.06      4.04  3.92
110 (A)      2.38        2.54         3.76      4.12  4.14
103 (G#)     2.18        2.34         3.54      4.22  4.12
98 (G)       1.94        2.08         3.44      4.04  3.94
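The size of the shift is easier to judge in semitones than in Hz, since the 12 pitch levels are labeled as successive notes of the chromatic scale. Here is a rough sketch, again my own and not from the talk, that finds the best-rated level for each adverb in the negative-adjective table and converts the distance between two of them with the usual 12 * log2(f1/f2) formula. The names `RATINGS_NEG`, `best_level` and `semitones` are hypothetical helpers.

```python
import math

# Mean ratings for negative adjectives, transcribed from the table above.
# Keys are max F0 in Hz, listed from high to low; values follow ADVERBS_NEG.
ADVERBS_NEG = ["not at all", "not so much", "normally", "very", "extremely"]

RATINGS_NEG = {
    185: [2.80, 2.32, 1.52, 1.96, 2.20],
    174: [3.12, 2.72, 1.78, 2.26, 2.32],
    164: [3.46, 3.22, 2.24, 2.46, 2.46],
    155: [3.50, 3.56, 2.82, 2.62, 2.52],
    146: [3.64, 3.72, 3.22, 2.88, 2.84],
    138: [3.66, 3.86, 3.66, 3.22, 3.26],
    130: [3.52, 3.80, 3.92, 3.64, 3.50],
    123: [3.10, 3.60, 4.14, 3.80, 3.84],
    116: [2.74, 3.10, 4.06, 4.04, 3.92],
    110: [2.38, 2.54, 3.76, 4.12, 4.14],
    103: [2.18, 2.34, 3.54, 4.22, 4.12],
    98:  [1.94, 2.08, 3.44, 4.04, 3.94],
}


def best_level(ratings, column):
    """F0 level (Hz) whose mean rating is highest in the given column."""
    return max(ratings, key=lambda hz: ratings[hz][column])


def semitones(f_high, f_low):
    """Musical interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(f_high / f_low)


PREFERRED_NEG = {adverb: best_level(RATINGS_NEG, column)
                 for column, adverb in enumerate(ADVERBS_NEG)}

# Distance from the weakest to the strongest adverb's preferred level.
DROP = semitones(PREFERRED_NEG["not at all"], PREFERRED_NEG["extremely"])

for adverb in ADVERBS_NEG:
    print(f"{adverb:12s} {PREFERRED_NEG[adverb]} Hz")
print(f"not at all -> extremely: {DROP:.1f} semitones lower")
```

Here the best-rated levels are 138, 138, 123, 103 and 110 Hz, in the order of the columns. The trend is downward but not strictly so, since "very" peaks one step below "extremely". The drop from "not at all" to "extremely" comes to about 3.9 semitones.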

I suspect that English speakers would show similar effects, at least in some circumstances.

 

Posted by Mark Liberman at March 22, 2005 07:53 AM