October 22, 2005

English the most idiosyncratic and wordy?

Daphne Bramham's column [subscription required] in today's Vancouver Sun is entitled:

Keeping up with the English race:
The most idiosyncratic and wordiest of languages acquires and sheds words with stunning speed

For the most part, the column is an innocuous discussion of words that have recently entered the language. My favorite is ignoranus, for a person who is both ignorant and an asshole. I can think of many uses for that one. What struck me as peculiar are the claims that of all languages English is the "most idiosyncratic" and "wordiest". These are repeated in the body of the article:
English is the most idiosyncratic and wordiest of all languages. It has none of the rigidity of form that is the hallmark of German. And in all of its creative exuberance, unlike French, there's been no need for official word police.

The claim that English is the wordiest language has a fairly straightforward interpretation. "Wordy" means "using or containing too many words", so the wordiest language would be the language that, on average, uses the most words to express the same content. We can't really evaluate this claim without making it more precise - we need to have consistent cross-linguistic notions of "word" and "same content" - but I strongly suspect that, insofar as we can make the claim precise enough to test, it will turn out not to be true. One reason is that, other things being equal, we should expect wordiness to be greater in isolating languages, in languages spoken by people with simple and unspecialized technology, and in languages whose relatively recent history has been such that they have not acquired multiple layers of vocabulary via language contact. Since English is not at the extreme isolating end of the morphological spectrum, is associated with complex and specialized technology, and has multiple lexical strata, we would not expect it to be particularly wordy.

Another reason is that both I and others who have some experience with translation have the distinct impression that documents generally lengthen noticeably when translated from English into French but not conversely. I don't know if this topic has been studied rigorously, but I did a quick check that seems to bear it out. I counted the words in the English and French versions of the decisions of the Supreme Court of Canada. The French versions contained 19,687,757 words, the English versions 18,682,563, for a ratio of 1.054. By this measure French is about 5% wordier than English.
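The arithmetic behind that figure can be reproduced with a minimal sketch. The two word counts are the ones reported above; the function name and the choice to express the result as a percentage are illustrative assumptions, and nothing here says how the words themselves were counted.

```python
# Word counts reported above for the Supreme Court of Canada decisions.
FRENCH_WORDS = 19_687_757
ENGLISH_WORDS = 18_682_563


def wordiness_ratio(target_words: int, source_words: int) -> float:
    """Return the number of target-text words per source-text word.

    Hypothetical helper for illustration only.
    """
    if source_words <= 0:
        raise ValueError("source_words must be positive")
    return target_words / source_words


ratio = wordiness_ratio(FRENCH_WORDS, ENGLISH_WORDS)
percent_longer = (ratio - 1) * 100

# Prints: ratio = 1.054, French is about 5.4% longer
print(f"ratio = {ratio:.3f}, French is about {percent_longer:.1f}% longer")
```

A ratio computed this way says only that the French texts contain more orthographic words than the English ones. It does not settle what counts as a "word" across the two languages.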

I suspect, however, in light of her remarks on "creative exuberance", that Bramham means something different, namely that the English lexicon contains more words than that of any other language. That may be true, if one can get past the very sticky problems of defining and counting words, but in my experience "wordy", when applied to a language, cannot mean "having a large lexicon".

More problematic still is the claim that English is the most idiosyncratic language. To begin with, what does this mean? It must mean that English deviates more from the linguistic norm than any other language. What norm, and how do we quantify deviation? Is idiosyncrasy some sort of statistical measure of deviation from central tendency or is it to be based on notions like markedness of parameter settings and/or presence or absence of peripheral constructs? And insofar as we have a measure of idiosyncrasy, how does anyone know which language is most idiosyncratic, much less someone whose acquaintance with languages appears to be restricted to English, French, and German? How are we to know that one of the other 6,000+ languages isn't more idiosyncratic? Indeed, if French and German are taken to represent the norm, virtually all of the native languages of British Columbia are by any reasonable standard more different from them than English is.

My purpose here is not to pick on Daphne Bramham. She's an experienced, award-winning journalist (profile) whose column I generally like. She's knowledgeable and on the side of truth, justice, and the Canadian way. She may even read Language Log: she quotes from a post by Mark Liberman later in the same column. My point is that for some reason otherwise sensible people seem to feel free to toss off dubious statements about language without much thought, investigation, or careful phrasing. This column of Daphne Bramham's is a minor offender: the dubious statements don't affect the main point of the column and Bramham does not present herself as having any expertise on linguistic matters.

More disturbing are journalists who make serious mistakes on major points of their articles, especially those who write with some frequency about language and consider themselves to have expertise. Geoff Pullum has pointed out the deficiencies of a piece by BBC science reporter Alex Kirby and a report by CBS reporter Bob Simon. Mark Liberman has dealt with The Atlantic's Cullen Murphy and The Boston Globe's John Powers. An example that I've discussed is New York Times science reporter Nicholas Wade's lack of understanding of the rudiments of historical linguistics. As I've suggested here before, the poor performance of journalists writing about linguistics probably reflects the generally low level of knowledge about language. It may not be reasonable to expect journalists to learn much about linguistics, but they could do a lot better just by taking the subject more seriously and doing a little more research and thinking before they write about it.

Posted by Bill Poser at October 22, 2005 11:49 PM