March 16, 2006

printing a list of all English words

The GNOME project has just released GNOME 2.14. The release notes include a section on performance improvements which begins with:

Figure 1. GNOME Terminal performance improvements between GNOME 2.12 and 2.14. Time taken is the time to print a list of all English words to the screen. [emphasis mine]

which refers to this image:

That's so impressive you know it can't be right: there are infinitely many English words, so they can't all be printed in finite time. If this isn't obvious to you, the classic examples are kinship terms like "great-grandmother" and "great-grandson", which can be continued indefinitely:

great-grandmother
great-great-grandmother
great-great-great-grandmother
great-great-great-great-grandmother
great-great-great-great-great-grandmother
great-great-great-great-great-great-grandmother
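
For the programmers in the audience, the pattern above can be sketched as a generator that never runs out. This is only an illustration in Python; the name great_grandmothers is made up for the example, and nothing like it appears in the GNOME release notes.

```python
from itertools import count, islice


def great_grandmothers():
    """Yield great-grandmother, great-great-grandmother, ... without end.

    Hypothetical helper, named for this illustration only.
    """
    for n in count(1):  # n = 1, 2, 3, ... with no upper bound
        yield "great-" * n + "grandmother"


# Print only the first six terms; the generator itself never terminates.
for word in islice(great_grandmothers(), 6):
    print(word)
```

However long the terminal is given, a loop over this generator always has another word left to print.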

Unfortunately, they don't make clear what list they actually used, though a good guess would be the list found on Unix systems in /usr/dict/words or /usr/share/dict/words, which is usually a list of only 45,425 words, though there is a longer variant containing 235,882 words. Both, being finite, are infinitely shorter than a list of infinite length. Fortunately, the quality of the GNOME project doesn't depend on its theoretical linguistic sophistication. (I say "theoretical" because GNOME now supports 45 languages.)

Posted by Bill Poser at March 16, 2006 05:32 PM