February 10, 2008

Ontological Promiscuity v. Recursion

On Friday, the colloquium speaker at the Institute for Research in Cognitive Science was Jerry Hobbs, talking about "Deep Lexical Semantics": a project to express the meaning of words and phrases in a way that's integrated with formal theories of commonsense knowledge. You can learn more about his work in this area from the references here, but that's not what this post is really about. Instead, it's about something that seems to be completely unrelated: the controversy over recursion and its role in human language in general, and in certain languages such as Pirahã in particular.

In brief, the debate has gone as follows. Noam Chomsky and others "hypothesize that FLN [the faculty of language in the narrow sense] only includes recursion and is the only uniquely human component of the faculty of language"; but Ray Jackendoff, Steve Pinker and others deny this, arguing that "language is a complex adaptation for communication which evolved piecemeal". Meanwhile, Dan Everett has argued that a contemporary language of Brazil, Pirahã, lacks recursion entirely, a claim that echoes Ken Hale's analysis, several decades earlier, of Australian languages such as Warlpiri; but Andrew Nevins and others have challenged Dan's description.

For more (than you want to know) about these recursion controversies, you could read some old Language Log posts, such as "JP versus FHC+CHF versus PJ versus HCF" (8/25/2005), "Dan Everett and the Pirahã in the New Yorker" (4/9/2007), and "The enveloping Pirahã brouhaha" (6/11/2007).

Recursion, in this context, means "linguistic structures that are embedded inside other structures of the same type". Familiar examples of recursive embedding of sentences include subordinate clauses ("before <sentence>"), sentential complements ("noticed that <sentence>"), and relative clauses ("the book that <sentence>"). In English, these are freely combined -- picking a news story at random from today's NYT, I find that the first sentence has three levels of sentential embedding, and so does the (shorter and simpler) fourth sentence:

[ It could also help the administration make its case that 
  [ some detainees at Guantánamo, where [ 275 men remain ], would pose a threat if
    [ they are not held at Guantánamo or elsewhere ] ] ]
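
To see how freely those frames combine, here's a toy sketch in Python -- my illustration, not anything from the linguistic literature -- in which a sentence can embed another sentence of the same type, to whatever depth you ask for:

    # A toy recursive grammar: each frame embeds a whole sentence,
    # so the embedding depth is limited only by the `depth` argument.
    import random

    def sentence(depth):
        """Generate a toy sentence with `depth` levels of clausal embedding."""
        if depth == 0:
            return "275 men remain"
        frame = random.choice([
            "Kim noticed that {}",    # sentential complement: "noticed that <sentence>"
            "before {}, Kim left",    # subordinate clause: "before <sentence>"
        ])
        return frame.format(sentence(depth - 1))

    print(sentence(3))
    # e.g. "Kim noticed that before Kim noticed that 275 men remain, Kim left"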

For a funnier and more transparent example of clausal recursion, check out the third panel of this recent Cathy strip.

OK, what about Jerry Hobbs' talk? Well, a remark that he made in passing reminded me of an idea that he had more than 20 years ago, described in a paper entitled "Ontological Promiscuity" (ACL 23, pp. 61-69, July 1985). The abstract begins like this:

To facilitate work in discourse interpretation, the logical form of English sentences should be both close to English and syntactically simple. In this paper I propose a logical notation which is first-order and nonintensional, and for which semantic translation can be naively compositional. The key move is to expand what kinds of entities one allows in one's ontology, rather than complicating the logical notation, the logical form of sentences, or the semantic translation process.

The basic idea is to replace recursive embedding (in the semantics) with reference to abstract entities created for the occasion -- events, times, places, propositions and so on. To illustrate how this works, he analyzes a complex journalistic sentence:

(4) The government has repeatedly refused to deny that Prime Minister Margaret Thatcher vetoed the Channel Tunnel at her summit meeting with President Mitterand on 18 May, as New Scientist revealed last week. [New Scientist, 6/3/1982, p. 632]

When this sentence is analyzed following his prescription, the result is a conjunction of 11 simple predications, with the embeddings all replaced by anaphoric references to abstract individuals (via the subscripted variables En):

The representation of just the verb, nominalizations, adverbials and tenses of sentence (4) runs, schematically, as follows (I've reconstructed the formula from the gloss below, so the predicate names are only an approximation of Hobbs's notation):
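
    complete(E1) ∧ repeat(E1) ∧ refuse'(E1, Govt, E2) ∧ deny'(E2, Govt, E3) ∧ veto'(E3, MT, Tunnel) ∧ ...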

The upside-down Vs -- "wedges" -- are semantic conjunctions; translated into heavy English, this means something like "Thing1 is completed, and Thing1 is repeated, and Thing1 is the Government refusing Thing2, and Thing2 is the Government denying Thing3, and Thing3 is Margaret Thatcher vetoing the Channel Tunnel, and ..."

As Jerry explains,

Sentence (4) shows that virtually anything can be embedded in a higher predication. This is the reason, in the logical notation, for flattening everything into predication about individuals.

This is more or less what I had in mind when I wrote that Dan Everett's no-embedding claim "imposes a lot fewer constraints on what the Pirahã can say than you might think" ("Parataxis in Pirahã", 5/19/2006); or when I suggested a fake homework assignment ("Communicating", 7/29/2007), to

... rewrite paratactically (i.e. by stringing phrases together without embedding, using explicit or implicit anaphora to keep track of the connections) what Jeremy expressed syntactically (here, using complement clauses): "Tell her that Brittany said that Zuma said that Sara said that it's okay with her if that's what D'ijon said."

In both of those posts, I forgot a key precedent: back in 1985, Jerry Hobbs suggested that the semantics of all natural-language sentences ought to be mapped into a paratactic form, i.e. a conjunction of simple propositions, tied together with anaphoric references to appropriate abstract entities. Jerry's purpose was to make the logic of discourse more tractable:

The real problem in natural language processing is the interpretation of discourse. Therefore, the other aspects of the total process should be in the service of discourse interpretation. This includes the semantic translation of sentences into a logical form, and indeed the logical notation itself. Discourse interpretation processes, as I see them, are inferential processes that manipulate or perform deductions on logical expressions encoding the information in the text and on other logical expressions encoding the speaker's and hearer's background knowledge.

Towards that end, he argued, the logical notation

... should be syntactically simple. Since discourse processes are to be defined primarily in terms of manipulations performed on expressions in the logical notation, the simpler that notation, the easier it will be to define the discourse operations.

And so his candidate for the logical notation is a conjunction of simple propositions, where everything that appears to motivate more complex structures is handled by anaphoric references to entities in a promiscuous ontology.
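
Here's a minimal sketch of that flattening move, in toy Python (mine, not Hobbs's; the term encoding and the predicate names are invented for illustration):

    # Flatten a nested predication into a conjunction of simple predications,
    # minting a fresh abstract entity (E1, E2, ...) for each embedded one.
    from itertools import count

    def flatten(term, fresh=None, conjuncts=None):
        """A term is an atom (string) or a tuple (predicate, arg, ...).
        Returns the entity variable standing for `term`, accumulating
        the simple predications in `conjuncts`."""
        if fresh is None:
            fresh, conjuncts = count(1), []
        if isinstance(term, str):              # an ordinary individual
            return term, conjuncts
        pred, *args = term
        e = "E%d" % next(fresh)                # the abstract entity for this predication
        flat = [flatten(a, fresh, conjuncts)[0] for a in args]
        conjuncts.append("%s'(%s, %s)" % (pred, e, ", ".join(flat)))
        return e, conjuncts

    _, props = flatten(("refuse", "Govt", ("deny", "Govt", ("veto", "MT", "Tunnel"))))
    print(" ∧ ".join(props))
    # veto'(E3, MT, Tunnel) ∧ deny'(E2, Govt, E3) ∧ refuse'(E1, Govt, E2)

The recursion hasn't disappeared, of course -- it has just moved out of the logical form and into the translation procedure, and into the chain of indexed entities E1, E2, E3.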

Yesterday, when I pointed out to him the connection between this work and the current recursion controversy, Jerry got worried. "But wait a minute -- whose side am I on?" he asked.

I'm not sure. Maybe he's against the language=recursion side, since his proposal suggests (as the ancient Greeks believed) that the choice between hypotaxis and parataxis is just a stylistic or rhetorical one, so that a language can give up syntactic recursion without suffering any essential expressive loss. Or maybe he's on the language=recursion side, since his proposal essentially replaces structural recursion with a sort of ontological and anaphoric recursion. Or maybe his proposal gives Dan Everett a more precise way to phrase the claim that a culture's ontological puritanism might translate into avoidance of syntactic recursion. Or again, maybe the point is that nominalizations and (implicit or explicit) anaphora are the moral equivalent of recursion, which undermines Dan's assumption of a one-to-one correspondence between ontological attitudes and syntactic resources.

In any case, his proposal clarifies the questions in the recursion wars, at least for me, even if it doesn't make the answers any plainer.

Meanwhile, I'm still working on the Hobbsian translation of "You're one of those people who says you're not one of those people who says you're not one of those people".

[Fernando Pereira writes:

I'm not sure that Jerry's proposal really "clarifies the questions in the recursion wars", for two reasons. Formally, as far as I know there is no proof that Jerry's unfolding method preserves the combinatorial distinctions of recursive formulas, in particular with respect to scope ambiguities. Computationally, what Jerry's proposal reminds me of are ideas like "last-call optimization" in programming languages and "recursive ascent" in parsing (an idea of René Leermakers that is less well known than it deserves to be, and that generalizes left-corner parsing, the Marcus deterministic parser, and others). In both cases, recursive computations that do not really need to be recursive are automatically transformed into iterative computations, by recognizing when elements saved in the recursion stack can be overwritten because they will never be used again. However, such transformations still require a stack data structure for genuine center embedding -- which, in Jerry's proposal, would require an unbounded number of pending anaphoric relations. So it's not clear that Jerry's proposal makes testable predictions with respect to sentence processing.

Well, I think it's clear that Jerry meant "promiscuous" to embrace an unbounded number of ontological one-night-stands. And even though I haven't seen even a sketch of a proof that there's a well-defined translation between recursion in the automata-theory sense and a formal language with "ontological promiscuity" in its semantics, I reckon that if you get (to keep a record of) as many typed and indexed entities as you care to create, you ought to be able to imitate a stack, and (for example) prune Σ* back to some arbitrary context-free language.
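
To make the stack-imitation point concrete, here's a toy recognizer (again my invention) for the language of balanced brackets -- context-free, and famously beyond finite-state means -- that keeps no explicit stack, just a flat table of indexed entities, each holding an anaphoric reference back to its antecedent:

    # Each "(" mints a fresh indexed entity whose antecedent is the entity
    # currently in focus; each ")" resolves focus back to that antecedent.
    # The chain of back-references does all the work a stack would do.
    def accepts(s):
        antecedent = {}                # entity index -> index of its antecedent
        focus, n = None, 0
        for ch in s:
            if ch == "(":
                n += 1
                antecedent[n] = focus
                focus = n
            elif ch == ")":
                if focus is None:      # a ")" with nothing to refer back to
                    return False
                focus = antecedent[focus]
            else:
                return False
        return focus is None           # every entity introduced has been resolved

    print(accepts("(()())"), accepts("(()"))   # True False

And the price is just what Fernando says it is: the number of unresolved entities at any moment is exactly the depth of the nesting, so genuine center embedding means an unbounded number of pending anaphoric relations.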

But I had in mind a much narrower point.

In Dan Everett's current account, the sentences that he once analyzed in his dissertation as involving sentential embedding instead involve things like nominalizations and anaphoric references to evoked situations. That's the Paratactic Way, and it's also pretty much how Jerry argued that the semantics of English ought to be modeled.

This might or might not be a good way to describe natural-language meaning, whether in English or in Pirahã. But it makes me wonder about some of the arguments on both sides: Everett's argument that the Pirahã avoid clausal embedding because of a cultural commitment to certain ontological restrictions; and Nevins' argument that the Pirahã can't possibly lack clausal embedding, because that would prevent them from expressing a significant part of the range of things that any human language (including, clearly, theirs) can express.

Fernando's response:

These arguments are more evidence, if any was needed, of the deleterious effect of a kind of formalistic thinking that poisoned linguistics with the advent of transformational grammar. Harris was, in my reading, quite careful not to make claims based on the form of particular representations; in fact, if we think of Jerry's "representation" as just a concise notation for natural-language sentences, then Jerry's account has much in common with Harris's account of nesting. On the other hand, Chomsky and his followers, maybe influenced by early formal language theory, have read more computational/cognitive import into particular representations than they can actually bear, missing the existence of efficient reductions between certain types of representations. Both Everett and Nevins are amusingly similar in their biases even though their conclusions are opposed.

The empirical question for me is how embedding vs. flat-plus-anaphora representations use memory. In embedding, information about incomplete clauses must be kept in short-term memory as nested clauses are processed. In flat representations, referents must stay available for easy retrieval by later clauses in the discourse. Are these memory systems the same or different? My slightly informed guess is that they are different. The first is more likely to be shared with the systems involved in perceptual-motor nesting in complex tasks. The second is more likely to be shared with the systems involved in sequencing complex interactions with peers. A further guess is that both of these systems are more refined in us than in other primates, and that both share in supporting language, although the relative shares of the load vary between languages.

Mark Eli Kalderon writes:

Nice post.

The trade-off of recursion for ontology may only be plausible modulo assumptions about what constraints there are on natural language semantics. If the semantics is merely a formal representation, then ontological profligacy is no problem. If, however, a competent speaker who understands a sentence is supposed to be implicitly committed to the entities posited by the semantics, then there may be problems. Consider, for example, Davidson's semantic analysis of adverbs in terms of quantification over events. Works reasonably enough. But consider "a rapidly converging function". If semantics is meant to be ontologically committing, then Davidson's analysis, if correct, would commit us to the existence of events when we are not so committed (and indeed where there are none).
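
[(myl) The textbook Davidsonian example: "Jones buttered the toast in the bathroom" comes out as something like

    ∃e [butter(e, Jones, the-toast) ∧ in(e, the-bathroom)]

with the adverbial rendered as a predicate of an event entity -- exactly the sort of abstract individual that Jerry's ontology multiplies.]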

A key insight of early analytic philosophy, as I understand things, was that the routine ontological commitments of natural-language semantics are sometimes disastrously misleading. On this view, we must sometimes either find new ways of talking, or else agree to interpret old ways of talking in artificial and perhaps unnatural ways. It's not a surprise to find ourselves in this situation with the semantics of mathematical language like "rapidly converging function".

But the philosophy of language is even further outside my job description than automata theory is, so I may be missing the point. ]

[Update #2 -- John Cowan writes:

It's interesting to note that Flanagan, Sabry, Duba, and Felleisen introduced the same notion to computer science in 1993, under the name of "administrative normal form" (ANF): the transformation maps "f(g(x), h(y))" into "let v0 = g(x) in (let v1 = h(y) in f(v0, v1))", so that all arguments to functions are constants or variables, never other function calls.
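
In the same toy Python as above (my sketch, not John's code), the transformation looks like this:

    # Convert a nested call into administrative normal form: every argument
    # that is itself a call gets bound to a fresh variable first.
    from itertools import count

    def to_anf(expr):
        """expr is an atom (string) or a call (fn, arg, ...)."""
        fresh, bindings = count(0), []

        def atomize(e):                  # reduce an argument to a constant or variable
            if isinstance(e, str):
                return e
            fn, *args = e
            call = "%s(%s)" % (fn, ", ".join(atomize(a) for a in args))
            v = "v%d" % next(fresh)
            bindings.append((v, call))
            return v

        fn, *args = expr
        result = "%s(%s)" % (fn, ", ".join(atomize(a) for a in args))
        for v, call in reversed(bindings):   # earliest binding ends up outermost
            result = "let %s = %s in %s" % (v, call, result)
        return result

    print(to_anf(("f", ("g", "x"), ("h", "y"))))
    # let v0 = g(x) in let v1 = h(y) in f(v0, v1)

Which is Jerry's move in compiler clothing: every intermediate value gets a name of its own, and nesting flattens into a sequence of bindings, with the let-variables playing the role of the En entities.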

]

Posted by Mark Liberman at February 10, 2008 08:14 AM