Suppose you wanted to track changes in the relative usages of syntactic
variants by writers in, oh say, the past three or four decades.
You would, of course, take advantage of tagged corpora (with
part-of-speech information) that have become available in the past
half-century and compare rates of occurrence in earlier and later
corpora.
This is just what Geoffrey Leech and Nicholas Smith (of Lancaster
University) have been doing recently, looking at a collection of
variables in both American and British English and using corpora of
texts published in 1961 and 1991-92. Overall, they find significant
increases in several colloquial variants at the expense of more formal
ones, with more extreme changes in American English than in British.
On one variable,
relative that vs. which, the change in American
English writing has been enormous; the change in British English is in
the same direction, in favor of that,
but is very much smaller.
The obvious interpretation is that the forces of prescription have been
winning big in the U.S., at least with respect to this one
variable. (Meanwhile, stranded prepositions, contracted
auxiliaries, and contracted negatives, among other variables, have
risen in frequency as against their more formal alternatives -- this
despite the strictures of the advice literature on standard written
English.) American writers are, apparently, conforming more and
more to what the advice books say: use that rather than which in restrictive relatives (the
That Rule, a recurrent theme in the halls of Language Log Plaza, most
recently discussed here).
But then a passing remark by Anne Fadiman in her valedictory editor's
column in the American Scholar
reminded me that there's another factor at work here, and it's hard to
assess its role in these corpus statistics: what appears in the
corpora is not, exactly, what people wrote; instead, it's what got
published, and in the U.S. there's an almost religious attachment to
the That Rule in the editorial establishment, which intervenes between
the writer's original text and the version that appears in print.
Already published is an article by Leech, "Recent grammatical change in
English: data, description, theory", in K. Aijmer & B. Altenberg
(eds.), Advances in Corpus
Linguistics (Papers from the 23rd International Conference on
English Language Research on Computerized Corpora, Göteborg 22-26
May 2002), Amsterdam: Rodopi (2004), 61-81. Still in press is
Leech & Smith, "Recent grammatical change in written English,
1961-1992: some preliminary findings of a comparison of American and
British English", in Antoinette Renouf & Andrew Kehoe (eds.), The Changing Face of Corpus Linguistics,
Amsterdam: Rodopi. My summary here relies on further discussion
by Leech in e-mail to Rodney Huddleston and to me.
Though restrictive vs. non-restrictive relatives are not entirely
factored out, Leech did some recent quick calculations that factored
out prepositional relatives (obviously a potentially important
consideration, given the decline of fronted prepositions in favor of
stranded prepositions), and came up with, in the U.S. data, a decrease of 41.5%
in the frequency of non-prepositional which
relatives and a corresponding increase of 48.5% in the frequency of
non-prepositional that
relatives. This is pretty stunning, and exceeds the changes in
British English by roughly a FACTOR of 5.
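For readers who want to see how figures like these are computed, here is a minimal sketch of the arithmetic, assuming each corpus runs to about a million words. The raw counts are invented, chosen only so that the output matches the percentages quoted above; they are not Leech's data, and the function names are mine.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


def percent_change(earlier, later):
    """Change from the earlier rate to the later one, as a percentage of the earlier rate."""
    return (later - earlier) / earlier * 100


# Invented counts, for illustration only; NOT Leech's figures.
CORPUS_SIZE = 1_000_000  # assumed size of each corpus, in words
which_1961, which_1992 = 2000, 1170
that_1961, that_1992 = 1000, 1485

which_change = percent_change(per_million(which_1961, CORPUS_SIZE),
                              per_million(which_1992, CORPUS_SIZE))
that_change = percent_change(per_million(that_1961, CORPUS_SIZE),
                             per_million(that_1992, CORPUS_SIZE))

print(f"which: {which_change:+.1f}%")
print(f"that:  {that_change:+.1f}%")
```

Each percentage is taken against its own 1961 base, so the drop in which and the rise in that need not be equal in size, even if every lost which had turned into a that.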
Now, this effect is in the direction of "colloquialization", given that
restrictive which is less
frequent in spoken than in written English. But the effect is, in
Leech & Smith's words, "dramatic", way beyond simple American
colloquialization. Leech & Smith conclude: "This preference
[for restrictive that],
amounting to an increasing taboo against which as a restrictive relativizer,
is now built into grammar checking software, and we can expect it to be
making even greater headway at present than in the early 1990s."
But. But. What we're looking at here is what comes out of
the publishing enterprise. We don't know what went into it.
Here Anne Fadiman's passing reference to copyediting suddenly becomes
relevant: