Linguistics in General Science Journals

An issue that has come up here on Language Log on several occasions is the inadequate refereeing of papers in some areas of linguistics, especially historical linguistics, by general science journals such as Nature, the Proceedings of the National Academy of Sciences, and Science. The problem is that the editors of such journals falsely believe themselves and the referees that they choose competent to evaluate papers in areas of linguistics remote from those that overlap areas traditionally covered by these journals, such as neurolinguistics and psycholinguistics. As a result, highly controversial if not outright crank work is treated as if it were solid. When linguists criticize their editorial practices, the editors of such journals tend to respond huffily that they know what they are doing and have selected competent referees: the linguists who complain are just old fuddy-duddies, irritated at being left in the wake by the new wave.

In theory the names of the referees are confidential, so it isn't possible directly to debate their qualifications. In practice, we do sometimes know who the referees were, and when we do, it turns out as we suspect that they are either linguists competent in some other area of linguistics who know little about the relevant area or people not competent in linguistics at all. Even in these cases, however, a public debate is not feasible.

There is, however, another route, which I am going to take here. I will present an example of a paper that unquestionably should not have been published in the form in which it was published. To be precise, I will argue the following two propositions:

  1. Any competent editor would have recognized that the paper should be refereed by a person competent in historical linguistics;
  2. Important aspects of the paper depend on claims that no one competent in historical linguistics would let pass.

The paper that I will discuss is "On the origin of internal structure of word forms" by Peter F. MacNeilage and Barbara L. Davis, which appeared in Science on 21 April 2000, volume 288, pages 527-531. Let me immediately make clear that this isn't meant as an attack on either Peter MacNeilage, whom I know, or Barbara Davis, whom I don't know. The problem with this paper arises from the fact that it brings together material from two quite different areas of linguistics about which no single person can be expected to be knowledgeable. One of the important functions of the editorial process is providing expert review of areas in which the authors are not experts. It is the editors of Science who failed more than MacNeilage and Davis.

Here is the abstract published with the paper:

This study shows that a corpus of proto-word forms shares four sequential sound patterns with words of modern languages and the first words of infants. Three of the patterns involve intrasyllabic consonant-vowel (CV) co-occurrence: labial (lip) consonants with central vowels, coronal (tongue front) consonants with front vowels, and dorsal (tongue back) consonants with back vowels. The fourth pattern is an intersyllabic preference for initiating words with a labial consonant-vowel-coronal consonant sequence (LC). The CV effects may be primarily biomechanically motivated. The LC effect may be self-organizational, with multivariate causality. The findings support the hypothesis that these four patterns were basic to the origin of words.

The paper deals with the relationship among three sets of forms:

  • the words of modern languages as spoken by adults
  • the first words of modern infants
  • the reconstructed forms of words of the original human language

As the claims about proto-language are important and lie outside the areas of specialization of both authors, the editors should have known that obtaining and paying heed to at least one referee competent in historical linguistics was essential.

Let us turn now to the second point, that there are severe problems with the "corpus of proto-word forms" used by MacNeilage and Davis. The corpus in question consists of the 27 "reconstructions" proposed by Bengtson and Ruhlen (1994) in a paper entitled "Global Etymologies". This paper is well known to be badly flawed.

The first problem with B&R's "reconstructions" is that it remains to be established that all human languages are descended from a common ancestor. B&R purport to demonstrate this using the notoriously flawed technique dubbed "multilateral comparison" or "mass comparison" by its proponents. This technique, more accurately called "superficial lexical comparison", consists in presenting sets of words from various languages that allegedly resemble each other sufficiently in sound and meaning that the resemblance cannot reasonably be attributed to chance and declaring "Behold!". In point of fact, the probability of finding similarities of the sort adduced as evidence is quite high. Moreover, the technique is unable to distinguish similarities due to common descent from those due to language contact. The technique is known to be unsound both on theoretical grounds and on the basis of experience: although those ignorant of the history of historical linguistics often claim that it is an innovation introduced to overcome the limitations of the comparative method, in fact it is the older technique, displaced by the comparative method as linguistics developed modern, scientific methods. (For some examples of false conclusions produced by "mass comparison", see Poser and Campbell (1992).)
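A rough calculation shows why chance resemblances are to be expected. The sketch below is only an illustration: the per-comparison match probability, the number of candidate words examined per language, and the number of languages searched are hypothetical figures chosen for the example, not estimates drawn from any study, and the function names are mine.

```python
def prob_at_least_one_match(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of n independent candidates matches by chance."""
    return 1.0 - (1.0 - p_single) ** n_candidates


def expected_languages_with_match(p_single: float,
                                  words_per_language: int,
                                  n_languages: int) -> float:
    """Expected number of languages that supply at least one chance 'match'."""
    p_language = prob_at_least_one_match(p_single, words_per_language)
    return p_language * n_languages


if __name__ == "__main__":
    # All three figures are hypothetical, chosen only to illustrate the effect.
    p_single = 0.01          # chance that one word loosely resembles the target
    words_per_language = 20  # words admitted by a loose semantic criterion
    n_languages = 500        # languages searched for a resemblance

    p_language = prob_at_least_one_match(p_single, words_per_language)
    expected = expected_languages_with_match(p_single, words_per_language, n_languages)
    print(f"chance of a 'match' in one language: {p_language:.3f}")
    print(f"expected number of languages with a 'match': {expected:.1f}")
```

Under these assumed figures, a comparison that looks unlikely in any one case still yields dozens of "matching" languages once enough words and languages are searched. The looser the criteria for similarity in sound and meaning, the larger the per-comparison probability and the stronger the effect.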

The second problem with B&R's "reconstructions" is that they are not. The term "reconstruction" has a specific, generally accepted meaning in historical linguistics. It does not mean just any guess as to what the ancestral form of a word might have been. Rather, it refers to the result of applying a fairly well defined procedure. That procedure involves discovering regular sound correspondences among a set of languages, determining which sound correspondences are in complementary distribution and therefore reflect developments from a single proto-phoneme, and using both the interaction of the sound changes and our knowledge of the directionality of sound change to determine what the most likely phonetic properties of the reconstructed phonemes were and what sound changes led to the observed sound correspondences. Laying this out in detail would require a lengthy post in itself, but it is discussed in any good textbook of historical linguistics, such as Campbell (2004), Crowley (1998), Hock and Joseph (1996), Rankin (2003), or Trask (1996). For something online, the Wikipedia article on the Comparative Method is pretty good.

The point is that real reconstructions are supported by very specific kinds of evidence and that they represent a network of falsifiable hypotheses. Bengtson and Ruhlen's "reconstructions" are not the result of such a procedure but are mere guesswork. They have quite literally looked at words in a number of languages and said: "well, these might have come from something like X" and proclaimed X to be a reconstruction. Such "reconstructions" cannot even be wrong. If I were to claim that the proto-Indo-European ancestor of English bear, Latin fero, Sanskrit bharāmi, Greek φέρω, Armenian berem, etc. was *pero rather than the accepted *bhero, I would be proven wrong by the fact that the sound correspondences would not work: Proto-Indo-European */p/ does not yield Latin /f/, Sanskrit /bh/, Greek φ, etc. But if someone disagrees with Bengtson and Ruhlen as to the proto-World reconstruction of, say, *tik "finger" and claims that it was really *dik, or *tug, or *tink, there is no way to settle the issue, because each hypothesis is as good as any other. There is no network of interlocking, falsifiable hypotheses, just a bunch of independent guesses.
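The sense in which a real reconstruction can be proven wrong can be sketched in a few lines. The example assumes a drastically simplified table of word-initial reflexes for two Proto-Indo-European stops. The table, the function name, and the romanized transcription (Greek φ written "ph") are illustrative simplifications, not a full statement of the sound laws.

```python
# Simplified word-initial reflexes of two PIE stops (an illustrative assumption,
# ignoring conditioning environments and later changes).
REFLEXES = {
    "bh": {"English": "b", "Latin": "f", "Sanskrit": "bh", "Greek": "ph", "Armenian": "b"},
    "p":  {"English": "f", "Latin": "p", "Sanskrit": "p",  "Greek": "p",  "Armenian": "h"},
}

# Initial consonants of bear, fero, bharāmi, φέρω, berem.
ATTESTED_INITIALS = {
    "English": "b", "Latin": "f", "Sanskrit": "bh", "Greek": "ph", "Armenian": "b",
}


def mismatches(proto_initial: str, attested: dict) -> list:
    """Return (language, predicted, observed) for each language where the
    hypothesized proto-consonant predicts the wrong initial consonant."""
    predicted = REFLEXES[proto_initial]
    return [
        (language, predicted[language], observed)
        for language, observed in attested.items()
        if predicted[language] != observed
    ]


if __name__ == "__main__":
    for hypothesis in ("bh", "p"):
        failures = mismatches(hypothesis, ATTESTED_INITIALS)
        verdict = "consistent" if not failures else f"refuted in {len(failures)} languages"
        print(f"*{hypothesis}ero: {verdict}")
        for language, expected, observed in failures:
            print(f"  {language}: predicts {expected}-, attested {observed}-")
```

The hypothesis *bhero survives the check and *pero fails it in every language, because each entry in the table is independently supported by many other cognate sets. No comparable table exists for the proto-World forms, so there is nothing against which *tik could be tested in preference to *dik or *tug.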

Indeed, it is worth noting that B&R do not present any systematic argument for their "reconstructions", nor have they, nor anyone else, ever offered a justification for this "technique". Proponents of superficial lexical comparison have offered justifications for their approach, unconvincing though they be, but neither I nor any of the other linguists with whom I have checked is aware of any attempt at justifying "reconstructions" produced by methods other than the comparative method.

The third problem with B&R's "reconstructions" is that the data on which they are based are badly flawed. Two major critiques have been published. One, Salmons (1992), looked at the evidence underlying a single putative proto-form *tik. It found grave errors in much of the data. The other, Picard (1998), is amenable to compact summary. Picard reviewed all of the data from Algonquian languages cited by B&R. He found five types of error:

Incorrect Language
The form cited is not from the language it is said to be from. For example, the form woxos "shin" is identified as Blackfoot but is in fact Arapaho.

Incorrect Gloss
The meaning given for the form is incorrect. For example, Natick mukketchouks is glossed as "boy", but is actually "son, man child".

Incorrect Transcription
The pronunciation given is incorrect. For example, the Shawnee word for "girl" is given as kwan-iswa but is actually kwaaniswa.

Incorrect Segmentation
The word is broken into morphemes incorrectly or without justification. For example, Blackfoot nóoma "my husband" is given by B&R as no-ma, incorrectly glossed as "husband". The hyphen indicates that B&R think that the word consists of two morphemes. This allows them to compare this word with other words containing ma, which they claim go back to a Proto-World form mano. This Blackfoot word actually consists of the prefix /n/ "my" and the noun stem /-óoma/ "husband". /óo/ is part of the noun stem. /óoma/ looks less like /mano/ than /ma/ does; the effect of B&R's incorrect segmentation was to make the Blackfoot form look more similar to the other forms cited than it really is. Such errors of segmentation do not have a random effect - they almost always are of such a nature as to make the forms compared look more similar than they really are.

Ancestral Disparity
In some cases, a word that in its modern, attested form resembles the other members of the equation is known to derive from an ancestral form that looks very little like the other members. For example, Arapaho /woxos/, which does resemble the putative Proto-World /bu(n)ka/ and words from other languages with which it is compared, is derived from Proto-Algonquian /meθkwaθkana/, which looks nothing like /bu(n)ka/.

Of B&R's 27 "global etymologies", nine involve forms from Algonquian languages. Picard found an average of two errors per form, with at least one error in every form. Picard's findings are summarized in the following table, based on the table in his paper.

Error Type / Form            1  2  3  4  5  6  7  8  9
Incorrect Language (Group)   X        X              X
Incorrect Gloss                          X  X  X  X
Incorrect Transcription         X     X
Incorrect Segmentation          X           X     X  X
Ancestral Disparity          X     X  X        X  X

Some errors do not necessarily render the form useless. Assignments to the wrong language may not matter if it belongs to the same family. Incorrect glosses may not matter if the true meaning of the form is close enough. Incorrect transcriptions do not always change relevant factors. On the other hand, incorrect segmentation usually means that the real form does not fit the equation. Similarly, when the ancestral form is unlike the form cited, this means that the form does not fit the equation. Errors in these two categories therefore generally invalidate the comparison.

Picard found either an incorrect segmentation or ancestral disparity in eight of the nine Algonquian forms cited. In sum, all of the Algonquian forms cited by B&R are erroneous; eight out of nine forms are flawed so seriously as to invalidate their inclusion in the equation. (Only seven of the nine equations are actually invalidated because in one case B&R compared both an actual Arapaho form (misidentified as Blackfoot) and a reconstructed Proto-Algonquian form. Although the Arapaho form is descended from a Proto-Algonquian form that looks nothing like the other members of the equation, the Proto-Algonquian form that they cite is sufficiently similar in form and meaning to the forms from other languages that it might validly be compared with them.)
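The tallies quoted above can be checked mechanically. The sketch below encodes the error table, assuming that each X belongs to the form number whose column it occupies, and recomputes the counts. The variable and function names are mine.

```python
# Form numbers (1-9) marked with an X for each error type in the table.
ERRORS = {
    "Incorrect Language (Group)": {1, 4, 9},
    "Incorrect Gloss": {5, 6, 7, 8},
    "Incorrect Transcription": {2, 4},
    "Incorrect Segmentation": {2, 6, 8, 9},
    "Ancestral Disparity": {1, 3, 4, 7, 8},
}
FORMS = range(1, 10)


def errors_per_form(errors: dict) -> dict:
    """Count how many error types are marked for each form."""
    return {form: sum(form in marked for marked in errors.values()) for form in FORMS}


def summarize(errors: dict) -> dict:
    """Recompute the totals cited in the text from the table."""
    counts = errors_per_form(errors)
    total = sum(counts.values())
    # Forms with the two error types that generally invalidate a comparison.
    serious = errors["Incorrect Segmentation"] | errors["Ancestral Disparity"]
    return {
        "total_errors": total,
        "mean_per_form": total / len(counts),
        "min_per_form": min(counts.values()),
        "forms_with_serious_error": len(serious),
    }


if __name__ == "__main__":
    for name, value in summarize(ERRORS).items():
        print(f"{name}: {value}")
```

On this reading the table contains 18 errors across nine forms, an average of two per form with none error-free, and eight forms carry an incorrect segmentation or an ancestral disparity, which agrees with the figures given in the text.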

The fourth problem has to do with the ambiguity of the term Proto-World. In one sense, this means the hypothetical first language of human beings. In the other sense, it means the reconstructed ancestor of the attested human languages. If all languages that had ever existed were attested, their reconstructed ancestor would be an approximation to the first language of human beings at the point at which it first diversified into two or more speech varieties. Unfortunately, we have no idea how long a period elapsed between the origin of language and the point at which the first diversification occurred, much less of what changes may have occurred, nor do we know what branches of the family tree may have become extinct without attestation. It is perfectly possible that thousands of years elapsed between the origin of language and the first diversification. Moreover, if major branches are unattested, the common ancestor of all known languages may postdate the original human language by tens of thousands of years. The result is that even if we could reconstruct the ancestor of all attested languages, it would not be safe to equate it with the original language.

In sum, there are four problems with MacNeilage and Davis' use of B&R's "reconstructions":

  • It has yet to be demonstrated that all spoken languages are descended from a single ancestor;
  • These so-called "reconstructions" are not in fact reconstructions. They have no scientific basis whatsoever;
  • There are severe flaws in the data on which the "reconstructions" are based;
  • Even if we assume that all spoken languages are related and that B&R's "reconstructions" are valid reconstructions of forms from the ancestor of all attested spoken languages, the relationship of this proto-language to the first language spoken by human beings is unknown.

These problems are well known to historical linguists and were well known in 1999 when MacNeilage and Davis' paper was under review by Science. They are so severe as to completely invalidate MacNeilage and Davis' reliance on Bengtson and Ruhlen's "global etymologies". These "reconstructions" are not merely controversial; they are nonsense. Relying on them is like relying on Pons and Fleischmann's work on cold fusion or Erich von Däniken's work in archaeology. I submit, then, that we can conclude with confidence that in evaluating this paper the editors of Science did not rely on the advice of anyone competent in historical linguistics.

Bengtson, John D. and Merritt Ruhlen (1994)
"Global etymologies," in Merritt Ruhlen (ed.) On the Origin of Languages. Stanford: Stanford University Press. pp. 277-336.
Campbell, Lyle (2004)
Historical Linguistics: An Introduction. Cambridge: The MIT Press. 2nd edition.
Crowley, Terry (1998)
An Introduction to Historical Linguistics. Oxford: Oxford University Press.
Hock, Hans H. and Brian D. Joseph (1996)
Language History, Language Change, and Language Relationship: An Introduction to Historical and Comparative Linguistics. Berlin: de Gruyter. ISBN 311014784X.
Picard, Marc (1998)
"The Case against Global Etymologies: Evidence from Algonquian," International Journal of American Linguistics 64.2.141-147.
Poser, William J. and Lyle Campbell (1992)
"Indo-European Practice and Historical Methodology" Proceedings of the 18th Annual Meeting of the Berkeley Linguistics Society pp. 214-236.
Rankin, Robert L. (2003)
The comparative method. In Brian D. Joseph & Richard D. Janda (eds.) The handbook of historical linguistics, Oxford: Blackwell, pp. 183-212. ISBN 1405127473.
Salmons, Joseph P. (1992)
"A look at the data for a global etymology: *tik 'finger'," In W. Davis Garry and Gregory K. Iversen (eds.) Explanation in Historical Linguistics. Amsterdam: John Benjamins. pp. 207-228.
Trask, R. L. (1996)
Historical Linguistics. New York: Oxford University Press.

Note: the dates of the papers as published are misleading. All of these papers were circulated long in advance of publication, so the fact that Salmons' critique of B&R's paper was published two years prior to the appearance of the paper of which it is a critique is not anomalous. Indeed, B&R complained quite publicly about the rejection of their paper by Language, in response to which a public discussion took place.

