September 25, 2004

A random monkey begins Julius Caesar

Bill Poser describes it as "a well known claim" that "If you have enough monkeys banging randomly on typewriters, they will eventually type the works of William Shakespeare." There's one little thing he does not note which I think might be worth pointing out here, and that is that the claim is definitely true. I think a lot of people might not realize that: some incorrectly call it a conjecture. No, it is absolutely true, a special case of a theorem of Kolmogorov. I went to the monkey simulator site that Bill refers to, and ran it during lunch. After lunch I found that my monkey had actually started typing out Act 1, Scene 1, of Julius Caesar, from the very beginning.

The play opens with a speech by Flavius (spelled "Flauius" in the old Latin style where u and v are not distinguished since the latter is just the semivowel corresponding to the vowel represented by the former):

Flauius. Hence: home you idle Creatures, get you home:
Is this a Holiday? What, know you not
Being mechanical, you ought not walk
Upon a labouring day without the sign
Of your profession? Speak, what trade art thou?

My monkey (actually a random character generator) had begun typing out the play, as far as the 17th character:

Flaui us. He nce: h
12345 67891011 121314151617

Now, to say that my monkey would not be permitted to choose o as the next character after a sequence like this would be to say that his choices are not random. Full randomness means that he can pick any letter he chooses, and we have no way to predict which one he might pick. Therefore he could choose o as his next letter. And by the same reasoning he could pick m as the one after that, and choose e after that, and so on. As the monkeys randomly type through the billions and trillions of years, one day I'll get the whole of Julius Caesar, simply by an accident of such choices. You can take that to the bank.

This time, I missed: my monkey chose b instead of o, and I got:

Flauius. Hence: hb'in-p1:s]Ij"PpXeygefFPXD)gg8Ns...

So it didn't work out. Not today. But it doesn't matter. The claim is true. One day the sequence Flauius. Hence: h will come back, this time with an o and the whole of the rest of Julius Caesar following it, followed by all the other plays. That's the really staggering thing about the claim: it's not a speculation that one day we'll get the whole of the Immortal Bard's works out of an untiring team of monkeys working away on keyboards; it's definitely true — unless astronomy imposes its cosmic time limit on everything and the earth is destroyed or the universe shuts down before we get there. And because of the randomness we have no way to tell whether that will happen before we get the right character sequence.

[Note added later: Fernando Pereira warns me to be cautious about jumping from mathematical monkeys (independent random variables) to real monkeys. He notes (quite rightly) that a finite computational device such as a monkey cannot generate an infinite sequence of independent random draws, because any random number generator has a cycle, however large. It may turn out that the cycle of your number generator is smaller than the number of draws needed before you ever hit what you want. The monkeys would need a true source of randomness to be really truly random So let me just make it clear that I am assuming very special monkeys here, with access to a true and perfect source of total randomness. They never develop repetitive habits in their typing, it truly is wide open at every point what key they will hit next. They also don't crap on the keyboard or urinate into the system unit. You don't find monkeys of this sort in the typical zoo. In fact you won't even find them in the above-referenced simulation, since it will be using a finite computing device with a finite cycle for its random generation. But abstract away from all that. In the beautiful, abstract world of pure math, where random sequences are genuinely random, the monkey Shakespeare claim really is true. Trust me. Why would I lie to you?

But Mike Albaugh has convinced me that it may be true even with real monkeys, in principle. Mike points out that we don't need total randomness; we just need for the monkeys to be working their way through the logical space of all character sequences without getting stuck in repetitive loops that are too small (and for there to be enough of them working for long enough in a universe that lasts for long enough, of course): "if the state-space is sufficiently large to contain the complete works of Shakespeare," writes Mike, "then even the simplest, linear traversal of it will eventually 'find' the works. To a first approximation, all pseudo-random number generators are just 'fancy ways of counting' the states of their state-space, so as to 'look random'. There is a long history of folks 'looking sideways' at such series and seeing distinct patterns. This goes back at least to the cryptanalysis of the Lorentz Machine, in WWII. John Von Neuman is quoted to the effect that people who attempt to generate random number by computation are '.. in a state of sin.' But for a well-specified task of finite length, simply having a large enough state-space will do." Mike also points out that there will come a problem at the point where the first monkey violates someone's jealously guarded copyright. Presumably the monkey that starts typing out the source code for Windows XP will be a monkey in deep, deep trouble.]

Posted by Geoffrey K. Pullum at September 25, 2004 05:26 PM