Yesterday, I heard a fascinating talk on intonational meaning in Japanese. Conclusion: speakers use higher pitches for more intense or emphatic evaluative phrases -- when the evaluation is positive. When the evaluation is negative, speakers use lower pitches for more intense or emphatic evaluation. This was true in conversational speech, but when speakers read lists of phrases, no such effect was found.
The talk was given by Yoshinori Sagisaka, for many years a researcher and research manager at ATR, and now a professor at Waseda University. He described some experiments (done in collaboration with Takumi Yamashita and Yoko Kokenawa) on the sort of simple adverb+adjective phrases that someone might say in response to a "how is it?" question, e.g.
Q. "Aji doo?" (How does it taste?)
A. "Hijooni umai" (It's extremely tasty.)
These phrases involved a number of adverbs, in decreasing scalar intensity:
hijooni | extremely |
sootoo | very |
wariai | quite |
sokosoko | relatively |
futsuuni | normally |
anmari | not so much |
and also a number of adjectives, which come in "positive" and "negative" pairs like clean vs. dirty:
positive (+) | negative (-) |
kirei (clean) | kitanai (dirty) |
umai (delicious) | mazui (unsavory) |
kawaii (charming) | busaiku (ugly) |
yasasii (mild) | kibisii (strict) |
omosiroi (interesting) | tumaranai (boring) |
When subjects read such phrases in lists, the pitch contours were basically all the same (these examples are chosen to use accentless four-mora adverbs and adjectives, for ease of comparison, and the plot shows the fundamental frequency at the mid-point of each mora):
Things were different when people were giving their evaluation of a scene, situation, substance etc. in conversation with an experimenter. In this case, the scalar intensity of the adverb interacted with the emotional direction of the adjective. For positive adjectives (clean, charming, interesting, etc.), more adverbial intensification produced higher pitches (this plot shows pitch values at vowel mid-points for six adverbs):
and this plot shows the (proportional) difference between conversational speech and read speech for the highest pitch of each adverb:
In contrast, for negative adjectives (dirty, ugly, boring, etc.), more adverbial intensification produced lower pitches (this plot shows the proportional difference between conversational speech and read speech for the highest pitch of each available adverb):
This table gives the correlation between adverb F0 and adverb intensity:
Condition | Correlation between adverbial F0 and adverb "intensity" |
Read speech | (no significant correlation) |
Conversational speech, positive adjectives | +0.85 |
Conversational speech, negative adjectives | -0.83 |
Sagisaka also did some perceptual experiments, in which 10 subjects listened to combinations of adverb+adjective at 12 different overall pitch levels, and judged them on a five-point scale from 1 (very bad) to 5 (very good). The average judgments for the positive adjectives showed a preference for higher pitch with greater intensification:
max F0 (Hz) | not at all | not so much | normally | relatively | quite | very | extremely |
185 (F#) | 1.56 | 1.42 | 1.70 | 1.96 | 2.26 | 3.48 | 3.78 |
174 (F) | 1.76 | 1.62 | 2.14 | 2.48 | 2.60 | 3.74 | 4.10 |
164 (E) | 2.10 | 2.00 | 2.62 | 2.94 | 3.18 | 4.00 | 4.16 |
155 (D#) | 2.36 | 2.56 | 3.20 | 3.48 | 3.82 | 3.98 | 4.06 |
146 (D) | 2.84 | 2.88 | 3.52 | 3.72 | 4.04 | 3.84 | 3.98 |
138 (C#) | 3.20 | 3.16 | 3.96 | 4.14 | 4.14 | 3.56 | 3.50 |
130 (C) | 3.48 | 3.50 | 4.12 | 4.18 | 4.00 | 3.28 | 3.12 |
123 (B) | 3.80 | 3.90 | 4.10 | 3.98 | 3.64 | 2.94 | 2.70 |
116 (A#) | 3.98 | 4.08 | 3.66 | 3.60 | 3.30 | 2.50 | 2.38 |
110 (A) | 4.34 | 3.92 | 3.12 | 3.00 | 2.66 | 2.20 | 1.94 |
103 (G#) | 4.34 | 3.72 | 2.56 | 2.56 | 2.36 | 1.84 | 1.64 |
98 (G) | 4.18 | 3.54 | 2.30 | 2.32 | 2.12 | 1.70 | 1.54 |
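One way to see the trend in these numbers is to ask, for each adverb, which pitch level got the highest mean rating. The following is only a rough sketch of that reading, not the analysis presented in the talk: the ratings are copied from the table above, while the names `ADVERBS`, `RATINGS`, and `preferred_pitch` are my own, and the tie-breaking rule (the higher pitch wins) is an arbitrary assumption.

```python
# Mean ratings for POSITIVE adjectives, copied from the table above.
# Keys: max F0 in Hz (highest first). Values: one rating per adverb,
# ordered from weakest to strongest intensification.
ADVERBS = ["not at all", "not so much", "normally", "relatively",
           "quite", "very", "extremely"]

RATINGS = {
    185: [1.56, 1.42, 1.70, 1.96, 2.26, 3.48, 3.78],
    174: [1.76, 1.62, 2.14, 2.48, 2.60, 3.74, 4.10],
    164: [2.10, 2.00, 2.62, 2.94, 3.18, 4.00, 4.16],
    155: [2.36, 2.56, 3.20, 3.48, 3.82, 3.98, 4.06],
    146: [2.84, 2.88, 3.52, 3.72, 4.04, 3.84, 3.98],
    138: [3.20, 3.16, 3.96, 4.14, 4.14, 3.56, 3.50],
    130: [3.48, 3.50, 4.12, 4.18, 4.00, 3.28, 3.12],
    123: [3.80, 3.90, 4.10, 3.98, 3.64, 2.94, 2.70],
    116: [3.98, 4.08, 3.66, 3.60, 3.30, 2.50, 2.38],
    110: [4.34, 3.92, 3.12, 3.00, 2.66, 2.20, 1.94],
    103: [4.34, 3.72, 2.56, 2.56, 2.36, 1.84, 1.64],
    98:  [4.18, 3.54, 2.30, 2.32, 2.12, 1.70, 1.54],
}


def preferred_pitch(ratings, adverbs):
    """Map each adverb to the max F0 (Hz) with the highest mean rating.

    Assumption: on a tie, the pitch level listed first (the higher one,
    given the ordering of RATINGS) is reported.
    """
    best = {}
    for col, adverb in enumerate(adverbs):
        best[adverb] = max(ratings, key=lambda hz: ratings[hz][col])
    return best


if __name__ == "__main__":
    for adverb, hz in preferred_pitch(RATINGS, ADVERBS).items():
        print(f"{adverb:>12}: {hz} Hz")
```

On this reading the best-rated pitch never drops as the adverb gets stronger: it starts at 110 Hz for "not at all" (tied with 103 Hz) and ends at 164 Hz for "very" and "extremely".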
The judgments for the negative adjectives showed preferences in the opposite direction:
max F0 (Hz) | not at all | not so much | normally | very | extremely |
185 (F#) | 2.80 | 2.32 | 1.52 | 1.96 | 2.20 |
174 (F) | 3.12 | 2.72 | 1.78 | 2.26 | 2.32 |
164 (E) | 3.46 | 3.22 | 2.24 | 2.46 | 2.46 |
155 (D#) | 3.50 | 3.56 | 2.82 | 2.62 | 2.52 |
146 (D) | 3.64 | 3.72 | 3.22 | 2.88 | 2.84 |
138 (C#) | 3.66 | 3.86 | 3.66 | 3.22 | 3.26 |
130 (C) | 3.52 | 3.80 | 3.92 | 3.64 | 3.50 |
123 (B) | 3.10 | 3.60 | 4.14 | 3.80 | 3.84 |
116 (A#) | 2.74 | 3.10 | 4.06 | 4.04 | 3.92 |
110 (A) | 2.38 | 2.54 | 3.76 | 4.12 | 4.14 |
103 (G#) | 2.18 | 2.34 | 3.54 | 4.22 | 4.12 |
98 (G) | 1.94 | 2.08 | 3.44 | 4.04 | 3.94 |
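To put a rough number on "the opposite direction", one can take the best-rated pitch level for each adverb in the negative-adjective table and correlate it with the adverb's position on the intensity scale. This is a back-of-the-envelope sketch, not something from the talk: treating the five adverbs as equally spaced ranks 0 to 4 is my assumption, the helper names are made up, and the resulting coefficient describes these perceptual preference peaks, not the production correlations reported earlier.

```python
from math import sqrt

# Mean ratings for NEGATIVE adjectives, copied from the table above.
# Keys: max F0 in Hz. Values: ratings for the adverbs
# "not at all", "not so much", "normally", "very", "extremely".
RATINGS = {
    185: [2.80, 2.32, 1.52, 1.96, 2.20],
    174: [3.12, 2.72, 1.78, 2.26, 2.32],
    164: [3.46, 3.22, 2.24, 2.46, 2.46],
    155: [3.50, 3.56, 2.82, 2.62, 2.52],
    146: [3.64, 3.72, 3.22, 2.88, 2.84],
    138: [3.66, 3.86, 3.66, 3.22, 3.26],
    130: [3.52, 3.80, 3.92, 3.64, 3.50],
    123: [3.10, 3.60, 4.14, 3.80, 3.84],
    116: [2.74, 3.10, 4.06, 4.04, 3.92],
    110: [2.38, 2.54, 3.76, 4.12, 4.14],
    103: [2.18, 2.34, 3.54, 4.22, 4.12],
    98:  [1.94, 2.08, 3.44, 4.04, 3.94],
}


def peak_pitches(ratings, n_adverbs):
    """For each adverb column, return the max F0 (Hz) rated highest."""
    return [max(ratings, key=lambda hz: ratings[hz][col])
            for col in range(n_adverbs)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    ranks = [0, 1, 2, 3, 4]  # assumed equal spacing, weakest to strongest
    peaks = peak_pitches(RATINGS, len(ranks))
    print("best-rated pitch per adverb (Hz):", peaks)
    print("correlation with intensity rank:", round(pearson(ranks, peaks), 2))
```

The best-rated pitches come out as 138, 138, 123, 103 and 110 Hz, so the correlation with intensity rank is strongly negative. With only five adverbs and one peak per column, this should be read as an illustration of the pattern rather than as a statistical result.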
I suspect that English speakers would show similar effects, at least in some circumstances.
Posted by Mark Liberman at March 22, 2005 07:53 AM