December 04, 2007

Rational inquiry: past, present and future

Remember that National Geographic spread last year about the newly discovered "Gospel of Judas Iscariot"? An opinion piece by April DeConick in the New York Times ("Gospel Truth", 12/1/2007) argues that the story's central claim was untrue and perhaps even dishonest:

It was a great story. Unfortunately, after re-translating the society's transcription of the Coptic text, I have found that the actual meaning is vastly different. While National Geographic's translation supported the provocative interpretation of Judas as a hero, a more careful reading makes clear that Judas is not only no hero, he is a demon.

Several of the translation choices made by the society's scholars fall well outside the commonly accepted practices in the field. For example, in one instance the National Geographic transcription refers to Judas as a "daimon," which the society's experts have translated as "spirit." Actually, the universally accepted word for "spirit" is "pneuma" — in Gnostic literature "daimon" is always taken to mean "demon."

Likewise, Judas is not set apart "for" the holy generation, as the National Geographic translation says, he is separated "from" it. He does not receive the mysteries of the kingdom because "it is possible for him to go there." He receives them because Jesus tells him that he can't go there, and Jesus doesn't want Judas to betray him out of ignorance. Jesus wants him informed, so that the demonic Judas can suffer all that he deserves.

Beyond the pattern of translations "well outside the commonly accepted practices in the field", Prof. DeConick described a mistake that seems to rise to the level of outright fakery:

Perhaps the most egregious mistake I found was a single alteration made to the original Coptic. According to the National Geographic translation, Judas's ascent to the holy generation would be cursed. But it's clear from the transcription that the scholars altered the Coptic original, which eliminated a negative from the original sentence. In fact, the original states that Judas will "not ascend to the holy generation." To its credit, National Geographic has acknowledged this mistake, albeit far too late to change the public misconception.

She leaves the question of motivation unresolved; or, to put it another way, she leaves the National Geographic dangling in an ethical limbo:

How could these serious mistakes have been made? Were they genuine errors or was something more deliberate going on? This is the question of the hour, and I do not have a satisfactory answer.

You can read more about this textual controversy, and additional background on the Tchacos Codex, on The Forbidden Gospels Blog. Her arguments about the translation seem persuasive to me, though I haven't tried to evaluate them in detail. But this case underlines an important respect in which the norms of inquiry in traditional humanistic scholarship are superior to those of modern science.

When scholars like Prof. DeConick debate a point, they normally do so in a context where all participants have access to all of the underlying data. If someone mis-translates a crucial word, or leaves out a negative, other scholars will catch it -- because they have independent access to the same texts.

But in this case, as Prof. DeConick observes in her NYT OpEd piece, the National Geographic violated those norms, at least temporarily, in pursuit of a scoop:

National Geographic wanted an exclusive. So it required its scholars to sign nondisclosure statements, to not discuss the text with other experts before publication. The best scholarship is done when life-sized photos of each page of a new manuscript are published before a translation, allowing experts worldwide to share information as they independently work through the text.

Another difficulty is that when National Geographic published its transcription, the facsimiles of the original manuscript it made public were reduced by 56 percent, making them fairly useless for academic work. Without life-size copies, we are the blind leading the blind. The situation reminds me of the deadlock that held scholarship back on the Dead Sea Scrolls decades ago. When manuscripts are hoarded by a few, it results in errors and monopoly interpretations that are very hard to overturn even after they are proved wrong.

To avoid this, the Society of Biblical Literature passed a resolution in 1991 holding that, if the condition of the written manuscript requires that access be restricted, a facsimile reproduction should be the first order of business. It's a shame that National Geographic, and its group of scholars, did not follow this sensible injunction.

Now that full transcriptions and/or full-sized facsimiles are available, the normal situation in humanistic scholarship has been restored. And this is a situation that scientists, in general, can only dream about.

When scientists publish a new and controversial claim, they normally keep their basic data secret, publishing only a few illustrative examples, along with summaries in the form of tables, graphs and evaluative numbers from various statistical tests. This is supposed to present the material that is essential to the argument. But even if there has been no out-and-out faking of data, the path that leads to the illustrative examples and statistical summaries is usually full of choices that are not neutral.

There may be seriously confounding factors (in the selection of materials or subjects, or in the process of running experiments or gathering data), and the scientists may fail to notice these, or may notice them and choose not to document the problems.

The measurements, even if they're made from a scientific display like a spectrum or an MRI image, often involve some subjective judgment. Sometimes measurements are made "blind", i.e. by people who are unaware of the hypothesis to be tested and also of the category of each measured display. But often this kind of blind measurement isn't possible, or isn't done where it might have been possible.
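As an illustration of what such blinding amounts to in practice, here is a minimal sketch in standard-library Python (the filenames, conditions and helper names are all hypothetical, not anyone's actual protocol): the condition labels are stripped off, each display gets an opaque code in a random order, and the key is set aside until all the measurements are in.

```python
import csv
import random

def blind_items(items, seed=None):
    """Give each (filename, condition) pair an opaque code, so that the
    person doing the measurements never sees the condition labels.

    Returns (codes, key): 'codes' is the list shown to the measurer, in an
    order that carries no information about condition; 'key' maps each code
    back to the original item, to be opened only after measurement is done."""
    rng = random.Random(seed)
    codes = [f"display_{i:03d}" for i in range(len(items))]
    rng.shuffle(codes)                      # random code-to-item assignment
    key = dict(zip(codes, items))
    return sorted(key), key

# Hypothetical stimuli: displays tagged with the condition that the
# measurer is not supposed to know about.
items = [("spkr1_a.png", "stressed"), ("spkr1_b.png", "unstressed"),
         ("spkr2_a.png", "stressed"), ("spkr2_b.png", "unstressed")]
codes, key = blind_items(items, seed=1)
with open("blinding_key.csv", "w", newline="") as f:
    csv.writer(f).writerows((code, *key[code]) for code in codes)
print(codes)   # only these opaque codes are handed to the measurer
```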

The data used in the final argument is often selected and corrected, with outliers or other "bad" data detected and removed. This is a fine thing to do, in principle -- but if "outliers" are in effect defined as data points that disagree with the hypothesis, this cleaning process may bias the results.
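To make that worry concrete, here is a small simulation of my own (arbitrary numbers, standard-library Python, not taken from any particular study): the true effect is exactly zero, but a cleaning rule that trims the points disagreeing with the hoped-for positive effect manufactures one anyway.

```python
import random
import statistics

def biased_clean(values, k=1.5):
    """Drop 'outliers' -- but measure deviance from the *hypothesized*
    positive effect rather than from the data's own center, so only
    points that undercut the hypothesis get removed."""
    spread = statistics.stdev(values)
    return [v for v in values if v > -k * spread]

rng = random.Random(0)
raw_means, cleaned_means = [], []
for _ in range(1000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(30)]   # true effect: zero
    raw_means.append(statistics.mean(sample))
    cleaned_means.append(statistics.mean(biased_clean(sample)))

print(f"mean estimated effect, raw data:       {statistics.mean(raw_means):+.3f}")
print(f"mean estimated effect, 'cleaned' data: {statistics.mean(cleaned_means):+.3f}")
# Typical output: the raw estimate hovers near zero, while the 'cleaned'
# estimate comes out reliably positive -- an effect created by the rule itself.
```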

And once a data set is created and cleaned up, there are many different kinds of statistical models and tests that might in principle be applied to it. It's common for scientists to analyze their data in dozens of ways, and publish only one or two of these. This may be the result of an honest evaluation of the best way to show what's going on -- but it's usually at least an attempt to find the most "interesting" angle, one that makes a case in the strongest way. And sometimes it's a frankly partisan choice, with equivocal or contrary indications suppressed.
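The arithmetic of that kind of analytic flexibility is easy to demonstrate. The toy simulation below (again mine, with made-up parameters) runs twenty tests on data in which there is nothing to find and reports only the best p-value; "significant" results turn up far more often than the nominal 5% of the time.

```python
import math
import random
import statistics

def two_sample_p(a, b):
    """Rough two-sided p-value for a difference in means
    (normal approximation -- good enough for a toy demonstration)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = abs(statistics.mean(a) - statistics.mean(b)) / se
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2))))

rng = random.Random(0)
n_analyses, n_experiments, hits = 20, 1000, 0
for _ in range(n_experiments):
    # No real effect anywhere; each 'analysis' is modeled, for simplicity,
    # as an independent test on fresh noise.
    best_p = min(two_sample_p([rng.gauss(0, 1) for _ in range(20)],
                              [rng.gauss(0, 1) for _ in range(20)])
                 for _ in range(n_analyses))
    hits += best_p < 0.05
print(f"null 'experiments' yielding a reportable p < 0.05: {hits / n_experiments:.0%}")
# With 20 shots at significance, about 1 - 0.95**20, i.e. roughly 64%,
# of purely null experiments produce at least one 'significant' analysis.
```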

The area most open to abuse is the selection of "characteristic" or "typical" examples, where the goal is to illustrate the phenomena under discussion, but the result may be to leave readers with a very misleading idea of what things are like.

Disciplinary norms are supposed to prevent most of these problems, and referees and editors are supposed to catch the rest. But I can tell you that disciplinary norms are spotty in the fields that I'm familiar with, and even the most active referee can't easily penetrate the veil that separates scientific raw materials from the summary presentations in papers submitted for publication.

These are among the reasons that it's crucial for results to be replicated, especially by people who are not committed to any particular outcome. In effect, scientists who are replicating -- or failing to replicate -- someone else's results are going back to Nature's texts for an independent reading. (That's Mother Nature, not the Nature Publishing Group...). However, replications are slow, and in some cases they are dauntingly (or even prohibitively) expensive. And when the phenomena under investigation are diverse, as linguistic behavior certainly is, a failure to replicate can be just as misleading as an initial result.

So in general, research would progress a lot faster, and with fewer false starts and blind alleys, if scientists in most fields normally published their raw data, as well as a record of the crucial stages in cooking the final presentation. This was once completely impractical, but cheap mass storage now makes it relatively easy. And in the fields where it has become normal for people to work with shared raw (and curated) data, the effect has been more cost-effective research and faster progress.
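As a purely hypothetical illustration of what publishing those "crucial stages" might look like, an analysis script can write out each intermediate version of the data along with a note recording what was done and why (the function and file names below are invented for the sketch):

```python
import csv
import json
from datetime import datetime, timezone

def publish_stage(rows, name, note, log_path="processing_log.json"):
    """Write one stage of the data to disk and append a provenance record,
    so every step from raw measurements to final table is on the record."""
    out_path = f"{name}.csv"
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    try:
        with open(log_path) as f:
            log = json.load(f)
    except FileNotFoundError:
        log = []
    log.append({"stage": name, "file": out_path, "rows": len(rows),
                "note": note,
                "timestamp": datetime.now(timezone.utc).isoformat()})
    with open(log_path, "w") as f:
        json.dump(log, f, indent=2)

# Hypothetical usage: each transformation leaves a published, documented trace.
raw = [("item01", 412.0), ("item02", 385.5), ("item03", 2950.0)]
publish_stage(raw, "01_raw", "raw response times as exported from the rig")
cleaned = [row for row in raw if row[1] < 1000]
publish_stage(cleaned, "02_cleaned",
              "dropped responses over 1000 ms as presumed equipment glitches")
```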

There are many reasons that people in less enlightened subdisciplines give for not wanting to do this. There's the extra work of documenting and organizing their raw and partly-cooked materials so as to make them coherently accessible to others. There are problems about data formats. There can be a problem of confidentiality of human subjects. There's yadda yadda yadda. These arguments have some force, but (in my opinion) most of them are suspiciously self-serving.

The continuing development of networked computing makes it inevitable, in my opinion, that scientific practice will change in the direction of fuller publication of experimental data. It'll be a slow process, especially in academic science, since modern academic culture is among the most conservative cultures in history. But eventually, the science of the future will be as empirically responsible as the humanism of the past.

Posted by Mark Liberman at December 4, 2007 07:09 AM