June 26, 2004

Oops, I did "that" again

Sometimes people put in more thats than they ought to:

(link) It's strange that given the huge community of programmers at slashdot, that the number of books isn't really that long.
(link) No but seriously, I think it's obvious that because Michael Moore eats frequently that everything he says must not have one iota of truth.
(link) I'm surprised that after a bus falling 100 feet and landing on its front, that anybody survived at all.
(link) I'm sure that given the technological complexities and the demanding financial needs of both theater and national missile defense programs, that we will be able to spend this additional $1 billion.

I used the prescriptive formulation "put in more thats than they ought to" because as far as I can see, this is not a variant or non-standard grammatical pattern; it's just a mistake. I take it as obvious that there's a difference. But in fact, I'm not entirely sure which diagnosis applies here.

[Warning: what follows is some stream-of-consciousness musing about this general question, focused on the "extra that" case. Unless you're interested in both syntax and psychology, you'll probably want to turn your attention to some of our other fine posts...]

Why do I think these apparently extra complementizers are a mistake, rather than a non-standard grammatical pattern? Logically, there are two arguments for such a position: that a "different grammar" theory doesn't work, and that a "production error" theory does.

I won't say anything specific about the failure of "different grammar" theories, except to say that I've tried to make up grammars that would license "... that ADV that ..." patterns, and don't find any of them convincing. I might well be wrong about this, and welcome suggestions.

What about the production error theory? If these repetitions of "that" are mistakes, why do they happen? The obvious answer is that you put "that" in before the adverbial, and then forget you did it and put it in a second time after the adverbial. One piece of evidence for this is that the stretch between the two instances of "that" is often very long:

(link) I'm convinced that given they totally fucked up the planning of this whole thing, and they're totally chained to their neo-con preconceptions, and they totally lied from start to finish, that if the same people are left in charge certainly someday this will all end, but it won't be a good ending.

However, there are a couple of problems with the forgetfulness theory. One is that the mistake sometimes happens with short adverbials, both in writing:

(link) I believe that we have a Creator who made us to live in a certain way, and that therefore that there is ‘natural law’.
(link) This ensures that all students have the same outline information in anticipation of examinations, and that therefore that coverage across discussion groups is as uniform as possible.

and in speech:

(link) MR. BOUCHER: We have not changed our position, and in fact, we believe that the jurisdiction of the International Criminal Court needs to be -- can't be established over nationals of states that are not party to the Rome statute and that, therefore, that Americans and others who are not members of the Rome statute, who participate in UN peacekeeping, need to be protected from some kind of misguided prosecution because of actions they might undertake while participating in those operations.

If the forgetfulness theory is correct, some people have very short memories, at least in some circumstances, or else such examples are the result of different sorts of mistakes, namely careless post-editing in writing, or self-correction and restarting in speech.

We could certainly check whether the double-that construction is more frequent, other things equal, when the adverbial is longer. After a bit of searching for examples, I have the (scientifically unreliable) impression that it is. This is another case where one could perhaps do some valid, quantitative Google psycholinguistics.
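As a rough illustration of what such a check might measure, here is a minimal Python sketch that counts the words between consecutive instances of "that" in a sentence. The function name `that_gaps` and the tokenizing regex are my own assumptions, not an established tool, and the heuristic is crude: it also counts perfectly legitimate pairs (coordinated complements, relative clauses, degree "that"), so it is only meant for candidate sentences that have already been checked by hand.

```python
import re

def that_gaps(sentence):
    """Return the number of words between each pair of consecutive "that"s.

    Hypothetical helper for illustration only. It does not know which
    "that" is a complementizer, so every pair of "that"s is counted.
    """
    # Lowercase and split into word-like tokens (apostrophes kept, so "I'm" is one token).
    tokens = re.findall(r"[\w']+", sentence.lower())
    positions = [i for i, tok in enumerate(tokens) if tok == "that"]
    # Gap = number of tokens strictly between neighbouring "that"s.
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]

long_example = ("I'm surprised that after a bus falling 100 feet and landing "
                "on its front, that anybody survived at all.")
short_example = ("This ensures that all students have the same outline information "
                 "in anticipation of examinations, and that therefore that coverage "
                 "across discussion groups is as uniform as possible.")

print(that_gaps(long_example))   # one gap: the long "after a bus ..." adverbial
print(that_gaps(short_example))  # the last gap is the one-word adverbial "therefore"
```

To test the length hypothesis properly, you would need the same measurement over comparable sentences with only a single "that", which this sketch does not attempt.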

A second problem with the "oops I did that again" theory is that the two string positions for "that" seem to be semantically incompatible. This doesn't invalidate the theory, but it carries some additional implications.

We're talking about phrases with a matrix (like "It's clear", "I'm surprised", "It's obvious", "I'm convinced") and a complement clause ("that S"). Possible locations for an adverbial in such sentences include these three:

(a) ADVERB MATRIX that S ("Therefore I'm convinced that blah")
(b) MATRIX ADVERB that S ("I'm convinced therefore that blah")
(c) MATRIX that ADVERB S ("I'm convinced that therefore blah")

When the adverbial modifies the whole thing, you get (a) or (b):

Moreover, I'm convinced that our alumni and friends recognize and accept the basic premise of what I'm proposing.
I’m convinced, moreover, that this sentiment is shared by the US Administration.

but not (c):

?I'm convinced that, moreover, our alumni and friends recognize and accept the basic premise of what I'm proposing.
?I’m convinced that, moreover, this sentiment is shared by the US Administration.

In contrast, when the adverbial modifies only the complement clause, you get (a) or (c):

I'm convinced that before long she'll be sleeping in it.
Before the rebuild, I'm convinced that oil in the rocker area couldn't drain away fast enough and was getting sucked up by the PCV and blown into the intake manifold.

but not (b):

?I'm convinced before long that she'll be sleeping in it.
?I'm convinced before the rebuild that oil in the rocker area couldn't drain away fast enough and was getting sucked up by the PCV and blown into the intake manifold.

Anyhow, some kinds of adverbial phrases like "given blah" can take either scope. I can intend to say

(1) "given X, I'm convinced that Y", where I mean that X is what convinces me of Y

or

(2) "I'm convinced that given X, Y", where I mean that I'm convinced of the inference from X to Y.

If you object that the difference in meaning is a subtle one in most cases, you're right. After all, it's not easy to think of a circumstance in which (1) would be objectively true and (2) false, or vice versa.

It seems plausible to me that a speaker or writer may on a given occasion mean both (1) and (2) at the same time, and/or may be in a sort of mixed state of mind, part way in between intending to say (1) and intending to say (2). You can think of this as a mixture of underlying communication intentions, or a mixture of implementational choices, or both; it doesn't matter to this argument.

The point is that meaning (1) licenses the two forms (a) or (b), while meaning (2) licenses the two forms (a) or (c). Since what we actually get (in the "double-that" examples) is a blend of form (b) and form (c), it seems that the folks who produced the "double-that" examples were in a kind of psychological superposition of state (1) and state (2).
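The licensing argument can be spelled out as a small exercise in sets. The following Python sketch is only a restatement of the claims above, with made-up names (`LICENSED`, `licensors`); it is not a model of how anyone actually produces sentences.

```python
# Forms licensed by each intended meaning, as stated in the text.
LICENSED = {
    "meaning_1": {"a", "b"},  # adverbial modifies the whole sentence
    "meaning_2": {"a", "c"},  # adverbial modifies only the complement clause
}

def licensors(form):
    """Return the set of meanings that license a given form."""
    return {meaning for meaning, forms in LICENSED.items() if form in forms}

# The "double-that" pattern blends form (b) with form (c).
blend = {"b", "c"}

# Meanings that license every form in the blend on their own: there are none.
single_source = [meaning for meaning, forms in LICENSED.items() if blend <= forms]

print(licensors("b"))   # only meaning_1 licenses (b)
print(licensors("c"))   # only meaning_2 licenses (c)
print(single_source)    # empty: no single meaning licenses both (b) and (c)
```

Since neither meaning licenses both halves of the blend, producing it requires both meanings to be active at once, which is the "superposition" described above.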

That's fine with me. I generally feel like I'm trying to say several different things at once, and I'm fond of the idea that linguistic knowledge and linguistic intentions should be modeled as distributions over linguistic structures, not as specific, individual forms. But others will find this notion less attractive.

I have a feeling that someone is going to write in to complain that all such examples are fully and completely grammatical, and that someone else will inform me that this usage was the norm in Chaucer's English, or 18th-century legal prose, or something. Well, we'll see.

Posted by Mark Liberman at June 26, 2004 02:02 PM