According to the New York Times, some scientists found that
Hungarians are far better than Americans at recalling long prices; on average, they can recall 19 to 24 syllables with decent accuracy, while Americans can recall only 13. The authors suggested that this was because Hungarians speak 41 percent faster, both out loud and when repeating sounds to themselves “subvocally.”
I expressed skepticism, and promised to say more when the paper came out. Google Scholar found me an online preprint: Marc Vanhuele, Gilles Laurent, and Xavier Drèze, Consumers' Immediate Memory for Prices, and Eszter Hargittai (who blogged about it at Crooked Timber) pointed me to the final version online here. (This discussion is mostly based on the preprint, but a scan of the final version suggests that the experiments haven't changed much, except that the researchers appear to have added a subvocalization-rate experiment on American subjects.)
What I learned from reading the manuscript, alas, is that my skepticism was amply warranted.
1. The American and Hungarian experiments on price recall involved different stimuli and different experimental methods, so that cross-language comparison of the number of syllables remembered becomes problematic.
2. Specifically, the subjects' "syllable span" in recall of prices was not measured in the same way in the two languages. Instead, it was inferred from averages combining quite different sets of factors in the different languages, reflecting the different experimental designs. Given the differing structures of the experiments and the low percentage of variance accounted for by regression models of the data, these inferred estimates of digit-string memory shouldn't be compared.
3. The researchers never measured the actual speaking rates of either the American or the Hungarian participants, nor (as far as the paper tells us, anyhow) of any other Americans or Hungarians. Instead, they measured the subvocalizing (i.e. imaginary speaking) rates of Hungarian speakers and French speakers. (In the final version, they compare subvocalization rates of Hungarian speakers and American English speakers.) This is a problematic thing to do: how do you really tell when someone starts subvocalizing and when they finish? The authors don't tell us their method, or give any assurance that the same method was used in France and in Hungary. Just as important, the strings being subvocalized were not comparable in the two languages, being samples of the differing stimuli used in the two different experiments.
The American and Hungarian stimuli were different on purpose -- that was the whole point of using Hungarians:
"...given our previous demonstration of the effect of the number of syllables of the price on price recall, we are interested in examining how consumers in countries with currencies with high face value deal with the challenges of price recall. Our two previous recall tests used monetary units with comparable face values -- the euro and the dollar -- but in some other currencies, the prices of the products include many more digits. The Hungarian marketplace is ideal to examine both dimensions. The Hungarian forint is pegged to the euro at an exchange rate of approximately 250 forint per euro (202 forint per dollar at the time of the study). In addition, price endings in Hungary follow a restricted number of patterns, which enables us to manipulate the visual "usualness" of prices in the Arabic format. Hungarian prices do not use decimals, and though coins of 1, 2, and 5 forint exist, unit prices that are three digits long (e.g. for the candy product category used in our study) always end in a 0 or 9. Prices with five or six digits (e.g., digital cameras) always end in 90 or 00."
In other words, the Americans saw prices like $3.91 or $543, whereas the Hungarian prices for the corresponding items would have been 789 or 109,690 (presumably with whatever the forint sign is, and I suppose with periods instead of commas marking the thousands). We'd have to look at the full range of stimuli and their likely pronunciations in English and Hungarian to figure out what effects this might have -- but in any case, the stimuli were quite different in kind as well as in detail.
The experimental conditions were different, it seems, out of practical necessity. In America, "ninety-one U.S. undergraduate business students participated for course credit in this online experiment", presumably participating individually, at times and in places of their choosing. In Hungary, "Due to practical constraints participants could not be tested individually and were tested in groups, which precluded a full randomization of prices across word lengths and presentation and test orders. Participants viewed a PowerPoint slide show with the instructions, study, and test screens." I believe that the Hungarians were tested in class; at least, this is explicitly what was done with the French participants in a pilot experiment.
Could these differences make a difference? Well, from what I know about Penn undergraduates (and as faculty master of Ware College House, I live among them), I'd guess that some of the American participants might have taken part in "this online experiment" while simultaneously listening to music (their own or their roommate's), participating actively or passively in a conversation, and otherwise multi-tasking. As for the Hungarians, they might have been distracted in other ways, but they also might have (even without wanting to) been able to see the answers written down by their neighbors in class.
Note that these experiments were fairly complex, and in different ways:
The main experimental variables of interest are the price length of the target product in number of syllables and the price length of the other products on the study screen (3-8 syllables each). Recall performance might also be influenced by the number of products per study screen (2 or 3), the presentation order (from left to right, 1, 2, or 3), and the question order (1, 2, or 3), which affects the time the consumer has to rehearse an item before recalling it. Therefore, we introduce these factors as control variables. ...
For each product, we created a price we labelled "short" (S) and another we labeled "long" (L). For study screens that displayed two products, we used the ... four price/length combinations: SS, SL, LS and LS [sic -- probably should be LL]. For three-product screens, we provided eight combinations from SSS to LLL. ... To limit the length of the task, we used a fractional factorial design in which we showed each subject only two of the three product categories ... Each subject therefore viewed a total of 24 screens with 64 product-price pairs to recall, and the product sequence, price length conditions were randomized for each subject.
That's for the American subjects, who were tested on three product categories (candy, DVDs and digital cameras). The Hungarians were tested in groups, and in a different way, using "the staircase method in which the total number of syllables to be remembered on each subsequent screen increases". They were tested on only two categories, candy and digital cameras. The candy trials escalated from 8 to 21 syllables, and the camera trials escalated from 21 to 42 syllables. (The Americans would have seen screens with a total of between 6 and 24 syllables, a much lower range.) As a result, in the Hungarian experiment,
...there is a high correlation between the number of syllables of the target and the other price(s) in a given trial (candy prices are three digits long; camera prices are six digits). In the logistic regression, we therefore used only one variable for syllable length to code the total syllable length of all prices in a given trial.
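As a rough sanity check on the two designs, here is a small Python sketch that recomputes the American totals quoted above and the range of total syllables per screen. It assumes (this is my reading of the quoted design, not something the paper spells out) that each of a subject's two product categories was shown once in every short/long combination. The variable names are mine, and the Hungarian range is simply copied from the description above.

```python
from itertools import product

# Assumption: every subject saw 2 of the 3 product categories, and each
# category appeared once in every short (S) / long (L) price combination.
CATEGORIES_PER_SUBJECT = 2
MIN_SYLLABLES, MAX_SYLLABLES = 3, 8  # per price, from the quoted design


def length_combinations(n_products):
    """All S/L price-length patterns for a screen with n_products items."""
    return ["".join(combo) for combo in product("SL", repeat=n_products)]


two_product = length_combinations(2)    # SS, SL, LS, LL
three_product = length_combinations(3)  # SSS ... LLL

screens_per_category = len(two_product) + len(three_product)
pairs_per_category = 2 * len(two_product) + 3 * len(three_product)

total_screens = CATEGORIES_PER_SUBJECT * screens_per_category
total_pairs = CATEGORIES_PER_SUBJECT * pairs_per_category

# Total syllables on one American screen: 2 or 3 prices of 3-8 syllables each.
american_range = (2 * MIN_SYLLABLES, 3 * MAX_SYLLABLES)

# Hungarian staircase totals, taken directly from the text (candy start,
# camera end); not derived here.
hungarian_range = (8, 42)

print("screens per subject:", total_screens)
print("product-price pairs per subject:", total_pairs)
print("American syllables per screen:", american_range)
print("Hungarian syllables per screen:", hungarian_range)
```

Under that assumption the counts match the quoted 24 screens and 64 product-price pairs, and the American screens top out at 24 syllables, barely past where the Hungarian camera trials begin.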
And see the paper for the details of an additional complication: the variation in the Hungarian experiment between "usual" prices (3-digit prices ending in 0 or 9, 6-digit prices ending in 90 or 00) and "unusual" ones (all the other possibilities). One group of subjects got only "usual" prices; the other group got 50% "usual" prices and 50% unusual ones. This is important, because it means that 3/4 of the Hungarian prices (assuming the two groups were about the same size) -- all the prices for the "usual" group, and half the prices for the "unusual" group -- contained significantly less information than the number of digits involved would indicate. In the case of the three-digit "usual" prices, there were only 200 possibilities instead of 1,000 (about 7.6 bits instead of about 10 bits), and in the case of the six-digit "usual" prices, there were only 20,000 possibilities instead of 1,000,000 (about 14.3 bits instead of about 19.9). Combined with the fact that the length-in-syllables conditions were randomized in the English-language experiment, but always yoked stepwise for the Hungarians, it starts to look like an information-theoretic measure of the difficulty of the Hungarian and English tasks might end up nearly the same, despite the apparent difference in the number of digits in the prices.
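For anyone who wants to check the bit counts, here is a minimal Python sketch. It assumes that every digit string is equally likely and that leading zeros are allowed, which is the simplification behind the round numbers of 1,000 and 1,000,000 possibilities; real prices are certainly not uniformly distributed, so treat these as upper bounds on the information involved. The function and variable names are mine.

```python
import math


def bits(n_possibilities):
    """Information, in bits, needed to pick one of n equally likely options."""
    return math.log2(n_possibilities)


# Three-digit prices: 10^3 strings in all; "usual" ones are restricted to
# two possible final digits, leaving 10^2 * 2 possibilities.
three_digit_all = 10 ** 3
three_digit_usual = 10 ** 2 * 2

# Six-digit prices: 10^6 strings in all; "usual" ones are restricted to
# two possible two-digit endings (90 or 00), leaving 10^4 * 2 possibilities.
six_digit_all = 10 ** 6
six_digit_usual = 10 ** 4 * 2

for label, usual, total in [
    ("three-digit", three_digit_usual, three_digit_all),
    ("six-digit", six_digit_usual, six_digit_all),
]:
    print(
        f"{label}: {usual:,} usual prices = {bits(usual):.1f} bits, "
        f"versus {total:,} = {bits(total):.1f} bits unrestricted"
    )
```

On these assumptions a "usual" six-digit price carries roughly as much information as an unrestricted price a bit over four digits long, which is the sense in which the raw digit counts overstate the difficulty of the Hungarian task.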
OK, enough. I'm not convinced by the proposed inferences about comparative memory spans, whether measured in digits or in syllables. The purpose of this experiment was to "[examine] how consumers in countries with currencies with high face value deal with the challenges of price recall", and I'm happy to grant that it succeeds in doing so. It convinces me that Hungarians have no more trouble than Americans or French people do in remembering the prices of candy and digital cameras, despite the fact that their currency exchange rates make typical prices for these items take more syllables to pronounce than corresponding American or French prices.
But the researchers didn't measure the digit spans or speaking rates of Americans and Hungarians (or French either). They measured the ability to remember the prices of candy, digital cameras (and for the Americans, DVDs), using different stimuli, different experimental tasks, and different experimental settings in the experiments on subjects in different countries. Specifically, they grouped objects by 2s and 3s, but grouped the stimuli in different ways, presented the stimuli by different methods, and recorded the responses in different ways in the different countries. Then to estimate the number of syllables in a price that subjects were likely to remember 50% of the time, they averaged across the (many) experimental conditions other than the number of syllables in the target price -- conditions which were very different for the Hungarian participants and for the English-speaking (and French-speaking) participants. The result is interesting but extremely unlikely to yield reliable or comparable numbers for the subjects' ability to recall a given number of syllables.
In a more ideal universe, the people who write for a major paper like the New York Times would be able to figure such things out. I guess we have to blame people like me (by which I mean the faculty of American universities) for failing to educate them properly.
Posted by Mark Liberman at August 16, 2006 10:47 PM