July 08, 2004

Filename generation idiocy.doc

No matter how low your esteem for modern word processors might already be, the manufacturers will find ways of adding new features that will drive your respect for them even lower. Take the filename-suggestion feature that has evolved over the past few years in both Word and WordPerfect. Start a blank document and put something in it with a note to a colleague at the top: let's say you put "TRY AND MAKE THIS PIECE OF SHIT INTO SOMETHING, HARRY; I'VE TAKEN BRADSHAW'S STUPID IDEA AND DONE WHAT I CAN WITH IT, BUT IT STILL LOOKS LIKE A CROCK TO ME. --- BOB". Now choose Save As from the File menu to save it. The program will choose a default filename for you. But what it chooses is simply what it takes to be the title, and it assumes the title is whatever you have on the first line, whatever its length and whether or not it contains spaces. So if you just say yes to the default, you will find that you own a file with a fairly eye-opening name. I tried this with two word processors (your mileage may differ). WordPerfect 11 for Windows gave me a file called


Word for Mac OS X gave me a file called


(And then, incidentally, the program crashed without explanation in a freshly booted environment and exited unexpectedly. Microsoft should really try and make this piece of shit into something.)

The problem is that the designers have no idea how to map documents onto suitable words or phrases that might serve as good mnemonic filenames for them (it's nontrivial, of course), but they try to do it anyway. It's another case of tomorrow's technology today.
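To make the complaint concrete, here is a minimal Python sketch of the kind of heuristic described above. It is emphatically not the actual code of Word or WordPerfect, which I have not seen; the function names, the ".doc" extension, the four-word limit, and the sample note are all assumptions made up for illustration. The first function takes the whole first line as the filename; the second trims it to a few punctuation-free words.

```python
import re


def naive_default_filename(document_text, extension=".doc"):
    """Hypothetical heuristic: treat the entire first non-blank line as the title."""
    for line in document_text.splitlines():
        stripped = line.strip()
        if stripped:
            # No length limit, no handling of spaces or punctuation.
            return stripped + extension
    return "Untitled" + extension


def cautious_default_filename(document_text, extension=".doc", max_words=4):
    """Hypothetical variant: keep only the first few alphanumeric words."""
    first_line = naive_default_filename(document_text, extension="")
    words = re.findall(r"[A-Za-z0-9]+", first_line)
    if not words:
        return "untitled" + extension
    return "_".join(word.lower() for word in words[:max_words]) + extension


# Invented sample document with a note to a colleague on the first line.
note = "PLEASE FIX THIS MESS, HARRY; I HAVE DONE WHAT I CAN. --- BOB\n\nDraft text follows."

print(naive_default_filename(note))
print(cautious_default_filename(note))
```

The second version yields a shorter and tidier name, but it is no closer to a good mnemonic for the document: cleaning up the string is easy, and deciding what the document is about is the part that is hard.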

It's always the linguistic factors on which they fall down: a silly filename-suggesting feature; text conversion from upper case to lower case with initial caps that doesn't quite know which words to capitalize; hopelessly unreliable conversion of characters from one word processor to another (ask a linguist like Arnold Zwicky, who has switched from Windows to OS X and thus has to convert his WordPerfect documents into Word, whether he's ever had a conversion that was trouble-free); and of course a grammar checker that is a complete joke, and a spell checker that is little better (spell checkers try to suggest corrections, another thing they are not any good at: Barbara carelessly accepted one such suggestion the other day without intending to, and we found that a joint paper of ours referred to Kimchee instead of Chomsky).

It's not that nothing at all has been improving: some of the plain graphics programming improvements are extraordinarily clever. But linguistics and natural language processing have not been feeding into the word processor industry at all.

Posted by Geoffrey K. Pullum at July 8, 2004 06:43 PM