July 23, 2007

"Expressions of negative Clippy feelings"

Michael Kaplan, who works on "internationalization and localization issues" for Microsoft, especially "collation and keyboard issues", recently posted some useful information about Clippy:

I had a friend complain to me the other day (the way that all folks who have friends working at Microsoft tend to do) about Clippy and how to turn him off in Office 2003.

Now I have mentioned before that Clippy is off in the default install and has been for a few versions now.

But I figure if even Charles Simonyi can be confused by it then I suppose anyone can. :-)

So I remembered an old trick someone had mentioned to me and asked my friend "Have you tried being rude to him?"

"What do you mean?" she asked me. "How can you be rude to a talking paper clip?"

"Well," I suggested, "try venting your anger at him. Tell him in a few concise words how you feel about him."

Here's an example of Michael's suggested remedy:

He explains further:

After telling Clippy this, the first item on the list explains how to change the Office Assistant, and the second item explains how to hide or show it.

Now this is obviously not the only way to find the message, but I find three different language issues amusing here:

  • An amazing number of people use this exact phrase;
  • There are reportedly many other expressions of negative Clippy feelings that will have the same effect on search in help;
  • There are disadvantages to a formal education that make this method of finding a solution less obvious.

And this immediately brings up some questions:

I wonder how sophisticated the "unhappy user" detection is here in language. And whether it has been appropriately localized.

Apparently Michael, who works on collation issues in Redmond, is not on the mailing list for the Translingual Cussing Committee. (Perhaps it's run out of Microsoft's new Bishkek Lab...)

Michael's questions raise some problems in machine learning. Some of the engineering issues are now covered under topics like "sentiment detection". More scientifically, we could ask for a discovery procedure, applied to a corpus of texts in an (otherwise unknown) new language, that will find all and only the cuss words. (By which I mean something like "taboo expressions of negative affect" -- though it's not easy to define this across languages and cultures. More links on this are here.)
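For concreteness, here is a toy sketch of the crudest possible "unhappy user" detector: a hand-built word list mapped onto the two help topics Michael describes. Everything in it is an assumption made for illustration. The word list, the phrase list, the topic titles and the function names are all hypothetical, and nothing here reflects how Office's help search actually works.

```python
import re
from typing import List

# Hypothetical lexicon of negative-affect terms (illustrative only).
NEGATIVE_TERMS = {"hate", "annoying", "stupid", "useless"}
# Hypothetical multi-word dismissals, matched as substrings.
DISMISSAL_PHRASES = ("go away", "get lost", "shut up")

# Hypothetical topic titles, paraphrasing the two results described in the post.
ASSISTANT_TOPICS = [
    "Change the Office Assistant",
    "Hide or show the Office Assistant",
]


def looks_unhappy(query: str) -> bool:
    """Return True if the query contains any listed term or phrase."""
    text = query.lower()
    words = set(re.findall(r"[a-z']+", text))
    if words & NEGATIVE_TERMS:
        return True
    return any(phrase in text for phrase in DISMISSAL_PHRASES)


def help_results(query: str) -> List[str]:
    """Route an unhappy-sounding query to the assistant topics."""
    return ASSISTANT_TOPICS if looks_unhappy(query) else []
```

A fixed English list like this is exactly what would fail the localization question: it says nothing about other languages, and it cannot discover taboo expressions it was not given in advance.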

Imagine if Zellig Harris, half a century ago, had assigned that problem to Noam Chomsky, rather than the (easier?) task of inducing syntactic structures!

Posted by Mark Liberman at July 23, 2007 05:56 AM