September 05, 2007

Parsing Miss Upton

While the world was laughing at Miss Teen South Carolina's disfluent answer to an ill-posed question, Lukas Biewald and Brendan O'Connor at Powerset got serious about it. They fed a transcript of Lauren Caitlin Upton's response into their version of the XLE parser, with this result:

Powerset's goal is to use such analytic techniques to improve web search, and one of the questions about this idea has always been how well the analysis copes with material that's fragmentary or carelessly written or non-canonical in some other way. So they were proud to announce that their system was able to use the parser's output to answer a question about Miss Upton's answer:

Some nitpickers may complain that the answer is incomplete -- what about South Africa and the Iraq and like such as? Still, an impressive achievement.

I found it striking that their analysis of Miss Upton's remarks involved such deeply right-branched embedding. That's because their grammar treats strings of fragments that it can't analyze further in terms of structures generated by rewrite rules of the form

FRAGMENTS → X FRAGMENTS

As a result, a string of apparently disconnected babble -- say, Vicki Pollard's classic string of discourse markers -- will look something like this

rather than like this:

This is a sensible-enough way to approach the problem. From a computational point of view, uniformly right- (or left-) branching structures are easily handled by finite-state methods; and from a psychological point of view, right-branching structures have the advantage over left-branching structures that you don't need to decide how deep you're going to go before you start talking.
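The two shapes contrasted above can be sketched in a few lines of Python. This is only an illustration, not Powerset's or XLE's actual grammar: the function names are made up, the base case (a FRAGMENTS node that holds a single X) is an assumption needed to stop the recursion, and the token string is the "yeah but no but yeah but" opening.

```python
# Illustrative sketch only; names and the base case are assumptions,
# not taken from the XLE grammar.

def right_branching(tokens):
    """Build a tree by applying FRAGMENTS -> X FRAGMENTS repeatedly.

    Assumed base case: the last token sits alone under FRAGMENTS.
    """
    if not tokens:
        raise ValueError("need at least one token")
    head, rest = tokens[0], tokens[1:]
    if not rest:
        return ("FRAGMENTS", head)
    return ("FRAGMENTS", head, right_branching(rest))


def flat(tokens):
    """Build the alternative: one FRAGMENTS node with every token as a child."""
    return ("FRAGMENTS", *tokens)


def bracket(tree):
    """Render a tree as a labeled bracketing."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


def depth(tree):
    """Count the levels of FRAGMENTS nodes above the deepest token."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(child) for child in tree[1:])


if __name__ == "__main__":
    tokens = "yeah but no but yeah but".split()
    print(bracket(right_branching(tokens)))
    print(bracket(flat(tokens)))
```

With six tokens, the right-branching tree is six levels deep and the flat one is a single level, which is the sense in which the parser's output looks "deeply" embedded even though the rule itself asserts nothing about how the fragments relate to each other.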

But a set of interesting psychological issues lurks behind those formal choices. When people produce (or understand) such strings of fragments, is there a sense in which they're processing them in layers, as we assume they're doing in dealing with phrases like "Kim doesn't like stale fruitcake", or "Leslie's uncle's dog was barking"? Are strings of discourse markers (including maybe the communicatively meaningful disfluencies that you can read about in Michael Erard's Um) different from false starts in this respect?

[OK, I know that the elements of Vicki Pollard's conversational opening are actually grouped as (yeah but) (no but) etc., as indicated both by the prosody of her performance and (I think) by the interpretation of the content. All the more reason to wonder about the constituent structure of disfluencies... And I've always thought that the most amusing part of the "yeah but no but yeah but" business was the suspicion that there's a way of construing it as a recursive stack of discourse markers, rather than as a series of false starts -- if only I had enough short-term memory to grasp it.]

Posted by Mark Liberman at September 5, 2007 06:45 AM