March 21, 2008

Wiki rage in Sussex

Tara Brabazon, professor of media studies at the University of Brighton, has written a column about her reactions to the Kindle wireless electronic reading device in the online edition of Times Higher Education. In general she seems to like it, but she does make this remark:

Kindle includes wireless access to Wikipedia. I do not need wireless access to Wikipedia. I would prefer to stir-fry my own small intestines than to have continual access to a site where the entry for Klingon is longer than the entry for Latin.

Now, I don't want to seem insensible to the genre of humorous hyperbole (I believe I might even have used it myself in the past); but let me just enter a mild demurral concerning the use of gross byte count of Wikipedia articles as a means of assessing the encyclopedia's values. [It has been pointed out to me (hat tip: Aaron Davies) that this practice is a recognized sport among whiners; it is called wikigroaning, and it is discussed very entertainingly in this blog post.]

In a literal sense, Professor Brabazon is not wrong on the claim about differential length. I did make some crude efforts to double-check the factual point by hand. There are many problems with measuring the length of Wikipedia entries (because of charts, images, special characters, cultural trivia lists, links, see-also lists, notes, references, etc.), but each time I counted non-blank character sequences in the two main articles down to various roughly comparable stopping places, the results did mostly come out with slightly higher word counts for the Klingon language article than for the Latin language article. [Note: this automatic checking site, however, seems to disagree as of March 23rd, 2008.]

The differences are small: gross word counts for the entire articles, including all the Klingon cultural trivia, yield only a 10% difference. Choosing other end points sometimes makes the difference smaller. And of course, the difference starts to go massively the other way when all the related articles on Latin literature and culture are considered. But as things stand, it can truly be said that slightly more has been written on the Klingon language than the Latin language in Wikipedia. (One reason is that Klingon has a separate script that needs some discussion, whereas Latin is written using essentially the same alphabet that English still uses after all these years.)
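For anyone who wants to repeat this kind of crude count, here is a minimal sketch in Python. It assumes the article text has already been saved as plain text; the function names, the optional stop marker, and the sample strings are all hypothetical, chosen only for illustration, and nothing here is the method actually used for the figures above.

```python
from typing import Optional


def count_tokens(text: str, stop_marker: Optional[str] = None) -> int:
    """Count non-blank character sequences (whitespace-delimited tokens).

    If stop_marker is given and occurs in the text, only the part before
    its first occurrence is counted. This imitates choosing a "roughly
    comparable stopping place" such as a references heading.
    """
    if stop_marker is not None:
        idx = text.find(stop_marker)
        if idx != -1:
            text = text[:idx]
    # str.split() with no arguments splits on any run of whitespace
    # and discards empty strings, so this counts non-blank sequences.
    return len(text.split())


def percent_longer(a: int, b: int) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    if b == 0:
        raise ValueError("cannot compare against a zero-length article")
    return 100.0 * (a - b) / b


if __name__ == "__main__":
    # Placeholder strings, NOT real article text.
    klingon = "placeholder text standing in for one article References notes"
    latin = "placeholder text standing in for another References notes"
    k = count_tokens(klingon, stop_marker="References")
    l = count_tokens(latin, stop_marker="References")
    print(k, l, percent_longer(k, l))
```

Moving the stop marker changes the result, which is exactly why counts of this kind should be treated as rough.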

I just wouldn't want anyone to be tempted to take Professor Brabazon seriously and really look down on Wikipedia just because, at this point in its evolution, the Klingon hobbyists seem to have posted slightly more than the classicists. One point to be made is that excessive length in one article has no implications at all regarding quality of a different article. If Encyclopedia Britannica spends too many pages on a minor topic, you pay for it in actual paper and ink and leather bindings, and there is less space for everything else in the encyclopedia for which you paid your hundreds of dollars. But Encyclopedia Britannica is beginning to look a little bit like the typewriter to a lot of younger people today. Professor Brabazon seems not to be quite on board with this new medium in which storage space is essentially free and article lengths are unlimited by media considerations. Long articles on junk topics you don't care about no longer cost you space that could have been devoted to topics you do care about.

More substantively, as the first commenter on the column notes, for scientists Wikipedia is in general remarkably useful, and of astonishingly high factual accuracy. If there has been trouble with destructive battling over biased text, it seems to have been not in science but instead mainly in politics, recent history, and controversial current events.

In fact let me point out right now a factually incorrect statement obviously entered with malice, in a place where you need technical skills in linguistics to spot it. In the article on renegade AIDS researcher Peter Duesberg, when accessed on March 21, a pronunciation was given after Professor Duesberg's surname, in IPA phonetic script. Only the pronunciation shown was not the one for Duesberg; it was the one for douche bag. (See why everyone should learn at least the IPA?) That's the sort of minor sabotage you do get on Wikipedia (one can imagine the prank might have been the work of a gay activist who's taken a phonetics class). But even there, the rest of the article seems accurate. [And, underlining the reliability and resilience of Wikipedia rather than its weakness, the insertion was deleted almost immediately. The day after this post went up, that malicious phonetic transcription was gone.]

Stephen Colbert did base a lot of funny material on the idea of a world in which everyone could rewrite the encyclopedia to make it say whatever they wanted it to say (recall the sketch in which he repeatedly altered his own entry). But his entry looks normal and accurate now. Whenever I look up articles on random topics in fields where I have some technical knowledge, in general I am amazed at the quality of Wikipedia. It is constantly revised and improved, and is astonishingly up to date (for example, try looking up the entry for a famous person who died yesterday, and you will probably find that the death has been covered). Adding yet more to the Klingon language entry won't alter its quality, any more than adding to the Latin language entry would reduce Professor Brabazon's rather snobbish hostility to it.

Update, March 22: A case could be made that I am vastly too generous to Professor Brabazon. In fact the case has been made. Jake Troughton writes:

I agree with pretty much everything you said regarding Wikipedia, but I do think it's worth noting that the articles related to the Latin language aren't just "on Latin literature and culture", but on aspects of the language itself. There are separate articles, for instance, on Latin grammar, Latin declension, Latin conjugation, History of Latin, Latin spelling and pronunciation, and more, which (as near as I can tell) don't have counterparts for Klingon. Thus, I don't think you're quite right when you say, "But as things stand, it can truly be said that slightly more has been written on the Klingon language than the Latin language in Wikipedia." Klingon may have the longer main page, but that's very nearly the extent of Wikipedia's information on the topic, whereas in the case of Latin, there are subpages that go into some detail.

My point, of course, is purely factual. I do not think Wikipedia is made better by an increase of its Latin-information-to-Klingon-information ratio. If anything, I'd say that its contributions on the Klingon language are sorely lacking; if you think about it, a constructed language that was designed to sound as alien as possible, and that was built to reflect the culture and values of a fictional race of space-faring warriors, yet that some actual Earth-bound human beings apparently still manage to (and choose to) hold mundane conversations in, is a pretty fascinating topic.

Eli Bishop has made very similar points. He notes that "Comparing the lengths of Wikipedia articles is even trickier than you suggested. Whenever an article threatens to become unmanageably long, it is almost always broken up into smaller articles, and the main article becomes merely a summary. ... So any measure of just the main article's length is meaningless; certainly it's far from the case that 'at this point in [Wikipedia's] evolution, the Klingon hobbyists seem to have posted slightly more than the classicists.'" It's quite clear that he is right, which makes Professor Brabazon's wiki rage much sillier. As always, I am too generous. It is an oft-noted fault of mine.

Posted by Geoffrey K. Pullum at March 21, 2008 10:09 AM