February 22, 2007

Labels Are Not Definitions

My recent posting on Extris constructions in English began, very carefully:

For at least 35 years, English speakers have been producing sentences with an occurrence of a form of BE that is not licensed in standard English (SE) and is not a disfluency...

This was designed -- perhaps too subtly, but my posting was an abstract for a conference paper, so it was necessarily concise -- to exclude two types of examples: those that are in fact licensed in SE, and those that are disfluencies.  Nevertheless, examples of the Isis subtype (most of which have is is or was was in them), and, especially, the widespread labeling of Isis as the "double is", "is is", "double be", or "be be", phenomenon often cause people to think that any sentence with is is or was was in it is an instance of the phenomenon.  But some are just SE and some are disfluencies.  People have been misled by the labels.  The larger lesson is:

Labels Are Not Definitions (Or Descriptions)

Early on in our investigations of the phenomenon, the Stanford/Colorado research group began to use the label Isis or ISIS (pronounced /ájsIs/), just to get away from the possibly misleading "is is" etc. stuff.  The label is suggestive, but doesn't look like a characterization or description of the phenomenon.  (This tactic doesn't always work, but we still think it's better than the alternatives.)

In any case, people come to us with examples of both of the types we try so carefully to exclude.  I'll look at disfluencies first.

The kind of disfluency that can get confused with genuine Isis is a repetition disfluency: speakers, in the heat of speech production, pause and repeat some material while they're groping for how to formulate what they want to say next.  Usually these repetitions are of "little words", like the, a, infinitival to, personal pronouns, and, yes, forms of the verb BE.  The article the is an especially frequent target of repetition disfluency; if you look at carefully transcribed natural speech, you'll see an awful lot of occurrences of "the, the".  These are inadvertent repetitions, not part of the speaker's system of English syntax.

The published material on Isis mixes examples that are punctuated with a comma (suggesting a pause, and a possible disfluency) and those punctuated without.  In collecting our own examples on the fly -- my personal file has 133 examples in it at the moment -- the Isis group has tried to distinguish the two cases.  We are particularly impressed by examples that are pronounced smoothly and without pause, like the ones below:

The difference is is that I don't want him to find you.
Part of the problem is is that they gave me a project and...
The problem is is that you get into...

The first is from an episode of the television show Charmed, and was surely not a scripted bit (and was probably not noticed by anyone involved with the episode).  The other two are from a male speaker, a Silicon Valley type, overheard in the Palo Alto Gordon Biersch restaurant on 11/20/06.

But there are also plenty of repetition disfluencies.  (Some people have suggested that these are, historically, the source of Isis.  But no other such disfluency seems to have been grammaticalized, and Brenier & Michaelis (2005) -- in the list of references in my Extris posting -- have argued that Isis has a much more interesting motivation, involving separate functions for each of the forms of BE in Isis.)  Eventually, the more phonetically minded members of the Isis group set about studying how to distinguish repetition disfluencies from instances of a (non-standard) syntactic construction Isis.  The result was the Coppock et al. paper cited in my earlier posting (which is, alas, not yet published, but is, hooray, available on-line here).

Ordinary people -- people who aren't into Isis professionally -- don't distinguish the two phenomena, and they probably notice the disfluencies much more than the Isis occurrences; certainly, my non-linguist friends are often astonished that I've detected a smooth Isis production when they didn't.  What this means is that the cases that non-linguists bring to me are just the ones most likely to involve disfluencies.  So it is with two examples offered to me by Edith Maxwell yesterday:

The thing is is, is that..
What it is is, it's a...

These look like spectacular triple-is examples.  But note the punctuation.  I suspect that the first one has Isis ("The thing is is") followed by a repetition disfluency, and that the second starts out as a perfectly well-formed pseudocleft ("What it is is" -- see below) followed by a repetition, in this case a partial restart with "it is" repeated in the form "it's".  Neither exemplifies a triple-is construction.

On to examples that are just SE, for instance:

What it was, was football.
What this was, was a victory for North Carolina's team.

Colleagues who know that I'm a student of "double be" every so often write and post to suggest the first of these, an Andy Griffith punchline, as an example.  And a colleague wrote to offer me the second.  These are just ordinary pseudocleft sentences in which the subject clause happens to end in a form of BE, which is then, of course, followed by a form of BE that belongs to the main clause (it's part of the construction).  (The commas indicate an intonation contour rather than a pause.)  There's nothing even slightly non-standard about these examples; they're in pretty much everybody's syntactic system.  In fact, eliminating one of the occurrences of was yields rubbish:

*What it was(,) football.
*What this was(,) a victory for North Carolina's team.

Now we have a way to get Isis examples that have three occurrences of is in a row, but where the first two are just part of a pseudocleft sentence, and the third is the extra is of Isis:

What part of it is is is that the irony...

This one's from KFJC's Robert Emmett, host of the Norman Bates Memorial Soundtrack Show, on 3/5/05.  Emmett is a virtuoso Isis user and has been providing me with examples for six years now.

(Some people have suggested pseudoclefts that have a form of BE at the end of the subject and another at the beginning of the predicate as the historical source of Isis.  But such examples are not especially frequent, and some kind of "double function" account is much more satisfying.)

If you take the name "double be" to be not just a label, but actually a definition, you'll be tempted into seeing repetition disfluencies and entirely standard pseudocleft sentences as instances of the phenomenon.  But, to hammer it home again:

Labels Are Not Definitions (Or Descriptions)

I make this point every few months.  Here's a version of the idea in a discussion of the English "subjunctive":

I've been providing arbitrary designations for both phrase properties (Constr:286) and word properties (Form:I), along with suggestive labels of my own devising or from CGEL (plain counterfactual, irrealis). It's important to realize that these suggestive labels play absolutely no role in the description of the language. If they're well chosen, they allude to some relevant aspect of syntax or semantics, but the labels are in no way descriptions, of either the syntax or the semantics.

So there's no substantive issue here. "Irrealis" is a much better name for Form:I than, say, "cislocative" or, for that matter, "elephant", but it's at best a hint at the semantics of the constructions in which it occurs.

And one from a treatment of plural, mass, and collective in English:

I'm going to reject the standard labels, because they encourage you to think that the grammatical categories are semantically defined -- with a singular word used to refer to one thing and a plural to more than one -- while the fact is that the connection between grammatical categories and meaning is much more indirect.  What I'll do instead is use the labels SG and PL, which are helpfully suggestive but also evidently novel.

[Later, I introduce C vs. M and COLL vs. ~COLL]

And from a piece on somewhat woozy VPE examples, where I make the principle explicit:

Though the construction is usually known as Verb Phrase Ellipsis (sometimes Verb Phrase Deletion), the omitted phrase is not always a VP.  In (4), it's an AdjP.  "VPE" isn't a bad name, but it doesn't tell you everything.  The slogan is: Labels Are Not Definitions.

Finally, in connection with pronoun case:

"Nominative" and "accusative" (or "subject case" and "object case") aren't bad names, but the labels aren't definitions, and they aren't descriptions.

Still, people are inclined to think that words are tightly bound to their referents; pigs are so called because they're, well, pig-like.  Linguists know better, or so we think; we understand about the arbitrariness of the sign, after all.  But when we're confronted with derivative or complex expressions, pretty much everybody, linguists included, hopes that the labels are going to be definitions; just unpack the expression, and you know what it means.  But this is almost never going to be the case -- certainly not for ordinary-language expressions, and hardly ever even for technical terminology.

Our touching faith in labels as definitions is routinely exploited in creating new labels or choosing labels from a set of alternatives.  We argue, for example, over whether a particular set of people should be called Black or African American or something else, but (even if we agree on who belongs to this set) no choice of label could possibly bear the burden of picking out just this set of people.  We similarly argue over which of a number of labels to use for language varieties used by many of the people in this set, and once again the labels can't do the work of picking out just this collection of varieties.  As with grammatical terminology, the best we can do is nod in the right direction.

Occasionally, you see real silliness that comes from a naive faith in the power of labels.  Every so often, someone explains to me that you can't end a sentence with a preposition because the word preposition means '[a word] placed before [its object]', and stranded prepositions aren't followed by their objects.  Q.E.D.  (Note also the Etymological Fallacy in this reasoning.)

And you see people desperately hoping that technical terms will mean what they would mean in ordinary language.  Here's a complaint from Mark Morton in The Lover's Tongue (2003), p. 17:

... some people, such as my first-year English students, mistakenly call the language of Shakespeare Old English.  It's not.  In fact, Shakespeare wrote in Early Modern English, which is also the language of the King James Bible.

Well, it's a form of English, and it's certainly old, meaning from a considerable time ago, and these people have probably heard of Old English, so if anything should deserve this label it's the language of Shakespeare.  But, too bad, the label isn't a definition.

zwicky at-sign csli period stanford period edu

Posted by Arnold Zwicky at February 22, 2007 02:00 PM