Penn sponsors an annual lecture on issues in the cognitive sciences, endowed in memory of Benjamin and Anne Pinkel. Last Friday, Ray Jackendoff gave the 2004 Pinkel lecture, on the topic "Towards a Cognitive Science of Culture and Society".
Ray's presentation explored an analogy between language and social interaction, which he laid out at the start of his handout in a sort of table:
| Language | Social interaction |
|---|---|
| Unlimited number of understandable sentences | Unlimited number of understandable social situations |
| Requires combinatorial rule system in mind of language user | Requires combinatorial rule system in mind of social agent |
| Rule system not available to consciousness | Rule system only partly available to consciousness |
| Rule system must be acquired by child with only imperfect evidence in environment, virtually no teaching | Rule system must be acquired by child with only imperfect evidence, only partially taught |
| Learning thus requires inner unlearned resources, perhaps partly specific to language | Learning thus requires inner unlearned resources, perhaps partly specific to social cognition |
| Inner resources must be determined by genome interacting with processes of biological development | Inner resources must be determined by genome interacting with processes of biological development |
Ray went on to discuss other aspects of his proposed research program, without dwelling further on the analogy between sentences and "understandable social situations". However, this analogy reminded me of an interesting undergraduate term project from a course I taught last fall with Lyle Ungar, "Introduction to Cognitive Science".
A senior engineering student named Chris Osborn wanted to explore a class of statistical sequence models called "Aggregate Markov Models" (AMMs), which Fernando Pereira used a couple of years ago to show that Noam Chomsky was wrong in 1957 about the statistical status of "Colorless green ideas sleep furiously". Chris decided to try fitting an AMM not to the sequence of words in texts, but to the sequence of speaker names in a discussion. As a source of data, he chose the transcripts of three oral arguments from the 2001 term of the U.S. Supreme Court.
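For readers curious about the machinery: an AMM (in the sense of Saul & Pereira's 1997 paper) is a class-based bigram model, P(next | previous) = Σ_c P(c | previous) × P(next | c), where the hidden classes are induced by expectation-maximization. The sketch below is mine, not Chris's actual code -- the function name and default settings are my own inventions -- but it shows how such a model can be fit to a sequence of speaker names:

```python
import numpy as np

def train_amm(speakers, n_classes=2, n_iter=200, seed=0):
    """Fit an aggregate Markov model P(next|prev) = sum_c P(c|prev) P(next|c)
    to a sequence of speaker names, using expectation-maximization."""
    vocab = sorted(set(speakers))
    idx = {w: i for i, w in enumerate(vocab)}
    V, C = len(vocab), n_classes

    # Bigram counts: N[i, j] = number of times speaker i is followed by speaker j.
    N = np.zeros((V, V))
    for a, b in zip(speakers, speakers[1:]):
        N[idx[a], idx[b]] += 1.0

    rng = np.random.default_rng(seed)
    A = rng.dirichlet(np.ones(C), size=V)  # A[i, c] = P(class c | previous speaker i)
    B = rng.dirichlet(np.ones(V), size=C)  # B[c, j] = P(next speaker j | class c)

    for _ in range(n_iter):
        # E-step: posterior over the hidden class for each bigram type (i, j).
        post = np.einsum('ic,cj->icj', A, B)          # post[i, c, j] = A[i,c] * B[c,j]
        post /= post.sum(axis=1, keepdims=True) + 1e-30
        # Weight each posterior by how often that bigram actually occurred.
        counts = post * N[:, None, :]
        # M-step: renormalize the expected counts into fresh distributions.
        A = counts.sum(axis=2)
        A /= A.sum(axis=1, keepdims=True) + 1e-30
        B = counts.sum(axis=0)
        B /= B.sum(axis=1, keepdims=True) + 1e-30

    return vocab, A, B
```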
Chris found that a two-class AMM accurately distinguishes the justices from other participants (court officials and lawyers), even when trained on a single transcript of about 250 turns. This is analogous (on a smaller scale) to Fernando's success in inducing word classes that distinguish the probabilities of the different word orders in Chomsky's example by a factor of 200,000. The evaluation is different -- Chris was interested in finding induced classes that make sense, while Fernando wanted lifelike probability estimates for very improbable sequences -- but both applications show unsupervised learning of implicit structure from sequence data.
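To read the induced classes off a model like the sketch above, you would just look at each speaker's most probable class assignment; on an oral-argument transcript, the hope (borne out in Chris's experiments) is that the justices land in one class and the lawyers and court officials in the other. Again, the variable names here are my own:

```python
# turn_sequence: the list of speaker names in turn order (hypothetical input).
vocab, A, B = train_amm(turn_sequence)
for speaker, row in zip(vocab, A):
    print(f"{speaker}\tclass {row.argmax()}\tP = {row.max():.2f}")
```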
Chris was not trying to suggest that humans normally analyze turn-taking independently of the content of the turns or other aspects of the context, and I'm not suggesting this either. But if you think about it, an orthographic transcript is a highly abstracted characterization of the actual communicative interaction -- it leaves out everything except the sequence of word identities, more or less -- and the sequence of speaker names is another such abstracted characterization, one in which there is also often quite a bit of structure. We know that humans are exquisitely sensitive to the statistical properties of communicatively relevant behavioral sequences, and there is no reason to suppose that this sensitivity ends at the edges of speaker turns.
Posted by Mark Liberman at February 23, 2004 10:11 AM