February 06, 2007

Systematic irregularization

According to a story by Laura M. Holson in yesterday's NYT ("The Director Lines Up a Shot", 2/5/2007), Steven Spielberg "is concerned that the studio he built with the music impresario David Geffen and the movie executive Jeffrey Katzenberg is losing its identity" to its new owner, Paramount Pictures. But Ben Stiller is in a more upbeat mood about the state of DreamWorks:

“In terms of movies getting green-lit,” he said, “it kicked us into a gear that we hadn’t been in before.”

Here at Language Log, we don't have any professional interest in who owns whom in Hollywood. But when it comes to a failure of systematic regularization, we're all over it.

"Systematic regularization" is one of several different terms used to name a familiar phenomenon, long noted in English and some other languages as well. Sometimes, a morphologically-irregular word form becomes regularized when the word is used in a new way:

Factories churn out Barbies, Mickey Mouses (*Mickey Mice) and Ninja Turtles.
Brian Pierce flied out (*flew out) to center to end the inning.

There's been some controversy over what the conditions are for this to happen. It's clearly not enough for the word to be used in an extended or idiomatic sense -- thus we say "straw men" not "straw mans", "caught a cold" not "catched a cold", and so forth. What's going on?

About 25 years ago, some clever linguists suggested an explanation. The linguists were Edwin Williams ("On the notions 'lexically related' and 'head of a word'", Linguistic Inquiry 12, 245-274, 1981) and Paul Kiparsky ("Lexical phonology and morphology", in I. S. Yang (Ed.) Linguistics in the morning calm. Seoul: Hansin, pp. 3-91). And their idea, shorn of complexities, is that the crucial difference is in the words' derivational history.

For example, "flied" in the baseball sense involves first making the verb fly into a noun (as in "pop fly" or "fly ball"), and then making that noun back into a verb. In the notation of labelled brackets, that's something like [V [N fly N] V].

The phrase "I caught a cold" really involves an inflected form of the verb catch, even though the verb is used in an idiomatic sense. And therefore, according to this argument, the usual irregular past tense form "caught" is used. But the phrase "He flied out to left field" involves a new verb, made from a noun that (at least historically) was in turn derived from the verb -- and so the default, regular past tense form "flied" is used -- as it would be for a borrowed or completely invented word -- instead of the normally-associated irregular form "flew".

In the case of nouns and the regularization of plural forms, the idea is similar. "Mickey Mouse" used as a noun is not a compound form of the noun mouse -- rather, it's a noun formed from a name formed from that noun. A lineman is a kind of man -- which could be explained by saying that it's a compound whose head is the noun man -- and so the plural is "linemen". But walkman is not a kind of man, it's just a brand name that happens to include the noun man; and so the plural is more likely to be "walkmans" than "walkmen". (The Sony Walkman was hot stuff during the period when these points were first being hashed out -- the example dates the discussion.)

Anyhow, that's why some people have called this kind of regularization "systematic" -- on this account, it's a rule-governed consequence of morphosyntactic structure, not a sporadic lexical change like the ones that happen now and then with pairs like dreamt/dreamed.
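For readers who like to see a rule spelled out, here is a toy sketch of the structural account just described. Everything in it is an assumption made for illustration: the `Word` class, the `path` encoding of derivational history, and the tiny lexicon are hypothetical, and none of it comes from Williams or Kiparsky. The idea it encodes is only this: an irregular form is available when the word has a head, and nothing between that head and the whole word changes category.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Toy lexicon of irregular forms, keyed by (root, category). Illustrative only.
IRREGULAR = {
    ("fly", "V"): "flew",
    ("catch", "V"): "caught",
    ("light", "V"): "lit",
    ("man", "N"): "men",
    ("mouse", "N"): "mice",
}


@dataclass
class Word:
    form: str              # citation form, e.g. "lineman"
    category: str          # "V" (inflect for past) or "N" (inflect for plural)
    head: Optional[str]    # root acting as head, or None if the word is headless
    path: Tuple[str, ...]  # categories passed through from the head root up to the word


def regular(word: Word) -> str:
    """Default inflection: -ed for verbs (final y becomes ied), -s for nouns."""
    if word.category == "V":
        if word.form.endswith("y"):
            return word.form[:-1] + "ied"
        return word.form + "ed"
    return word.form + "s"


def inflect(word: Word) -> str:
    """Use the irregular form only if it can be inherited from the head."""
    inherits = (
        word.head is not None
        and all(cat == word.category for cat in word.path)
        and (word.head, word.category) in IRREGULAR
    )
    if inherits:
        # Swap the head root at the end of the form for its irregular form.
        stem = word.form[: len(word.form) - len(word.head)]
        return stem + IRREGULAR[(word.head, word.category)]
    return regular(word)


examples = [
    Word("catch", "V", "catch", ("V",)),                  # idiomatic, but still the verb
    Word("fly", "V", "fly", ("V", "N", "V")),             # verb -> noun -> verb
    Word("lineman", "N", "man", ("N",)),                  # a kind of man
    Word("walkman", "N", None, ()),                       # headless brand name
    Word("Mickey Mouse", "N", "mouse", ("N", "Name", "N")),
    Word("green-light", "V", "light", ("N", "V")),        # noun phrase -> verb
]

for w in examples:
    print(f"{w.form} -> {inflect(w)}")
```

Note that this sketch predicts "green-lighted", which is exactly the prediction that Hollywood usage fails to respect.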

Now, the background of "movies getting green-lit" is pretty much like that of "flied out". A "green light" has been a standard metaphorical (as well as literal) signal for "go ahead" at least since traffic lights were invented, as the OED's citations indicate:

1937 T. RATTIGAN French without Tears III. i. 126 We had a bottle of wine and got pretty gay, and all the time she was giving me the old green light.
1954 WODEHOUSE Jeeves & Feudal Spirit xxii. 216 Carry on, old sport. You have the green light.

And any noun (or nominal phrase) is liable to be verbed, as this one has been in Hollywood, long since. For example, a story in the Los Angeles Times, May 3, 1948, "Warners Lists Large Spring Movie Slate":

With eleven films greenlighted by Jack L. Warner to go before the cameras during May and June, record production activity looks for Warner Bros. for the first half of 1948.

Given the fact that greenlighted is written solid and without quotation marks, this was clearly a well-established usage in 1948.

These days, both "green-lit" and "green-lighted" seem to be in fairly common entertainment-industry use:

It was green lit by Apr. 2 and we started shooting Aug. 8. (Hollywood Reporter, 5/9/2003)
Within six weeks, Lions Gate had green-lit "Diary" as a feature. (New York Post, 3/5/2005)

MTV also green-lighted ''Beat Sweep,'' a hard-core hip-hop series; (Atlanta Constitution, 4/30/1999)
The network has greenlighted a second season of "Idol Tonight," the exclusive preshow for Fox's "Idol," ... (Hollywood Reporter, 1/23/2007)

I don't have the time to investigate in detail whether the distinction between the preterite ("They greenlighted/greenlit it") and the past participle ("It got greenlighted/greenlit") makes any statistical difference; nor whether there's really a significant difference in usage between the entertainment industry and the rest of us. However, from a quick scan of the first few pages of Google News Archive's 2,050 returns for {greenlit} and 1,030 returns for {green-lit}, I reckon that this is mostly an entertainment-industry usage. The 2,180 returns for {greenlighted} and the 3,430 returns for {green-lighted} seem to deal with a wider variety of topics.
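As a rough back-of-the-envelope check, the four counts just quoted can be pooled to see what share of the hits use the irregular form. This is only a sketch: the numbers are the Google News Archive return counts cited above, the variable names are mine, and raw hit counts of this kind are notoriously noisy.

```python
# Google News Archive return counts quoted in the text above.
counts = {
    "greenlit": 2050,
    "green-lit": 1030,
    "greenlighted": 2180,
    "green-lighted": 3430,
}

# Pool the spelling variants of each past-tense form.
irregular = counts["greenlit"] + counts["green-lit"]
regular = counts["greenlighted"] + counts["green-lighted"]

# Fraction of all hits that use the irregular "lit" form.
share_irregular = irregular / (irregular + regular)

print(f"irregular: {irregular}, regular: {regular}")
print(f"share of irregular forms: {share_irregular:.1%}")
```

On these figures the irregular forms account for a bit more than a third of the total, with the caveat that the two sets of hits seem to come from different mixes of topics.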

OK, so what's going on out there in Hollywood, morphologically speaking? Didn't they get the memo about systematic regularization? And why does it matter?

Well, believe it or not, these issues were a big deal among psycholinguists in the 1980s and 1990s, a period known to students of intellectual history as The Great Past Tense War. (Not really, but it could be.)

And during those Homeric struggles, some other explanations for the flew/flied phenomenon emerged. George Lakoff, in an unpublished (and apparently now unavailable) 1987 paper "Connectionist explanations in linguistics", suggested that the essential generalization was semantic rather than syntactic. His proposal was quoted (and rejected) in one of the central presentations of the syntax-based accounts, J.J. Kim, S. Pinker, A. Prince and S. Prasada, "Why no mere mortal has ever flown out to center field", Cognitive Science, 15(2) 173-218, 1991. Their quote from Lakoff:

[Pinker and Prince (1988)] cite the well-known fact that certain polysemous lexical items have different past tense forms for different senses of the verb. For example, fly in its central sense, takes the past tense flew, but takes flied in its extended baseball sense. There is a general constraint on such cases: It is always the central senses that have irregular past tenses.

Other versions of the semantics-based theory are discussed in Yasuhiro Shirai, "Is regularization determined by semantics, or grammar, or both?", J. Child Lang. 24 495-501, 1997.

Kim et al. (1994) claim, based on four experiments, that both school-age and pre-school children are sensitive to the grammatical status of verbs and nouns. More specifically, they claim that children avoid using the irregular past form for certain verbs simply because they know these verbs are denominal, and they prefer instead to use a regular past tense form. With respect to nouns, Kim et al. (1994) explain children's preference for `walkmans' over `walkmen' as resulting from this word's exocentric (headless) status. Their explanation depends exclusively on grammatical information -- the derivational status or headedness of the verbs or nouns in question.

However, their findings are perfectly consistent with a completely different explanation, what I will call the Semantics Hypothesis: speakers avoid irregular forms simply because they do not want to convey the meanings associated with those forms. In the 'fly' case above, if one says `The batter flew out to centre field' it may erroneously activate the image of the batter flying through the air. In the case of `walkmen', the irregular plural form men activates the image of human beings, not portable audio-cassette players. Each irregular form is strongly associated with its conventionalized meaning, which may not be the meaning intended by the speaker in a particular situation. On this account, speakers tend not to use the irregular form when its conventionalized meaning conflicts with the meaning the speaker wants to convey, and opt instead for the regular form. This account, proposed in Harris (1992, 1993) and Daugherty, MacDonald, Petersen & Seidenberg (1993), is at least intuitively appealing.

Note that the "semantic" account of these phenomena has an essentially Gricean component -- as Shirai puts it, "speakers avoid irregular forms simply because they do not want to convey the meanings associated with those forms". This predicts that in a context where the "extended" sense of a verb becomes commoner, and thus a priori more likely, the irregular past tense should also become commoner, since its use is less likely to cause misunderstanding. That's exactly the pattern that we see in the case of greenlit/greenlighted -- the irregular form greenlit is widely used in the entertainment industry, where getting or not getting green-lighted is a ubiquitous concern.

And perhaps something similar has been going on with "flew out" as well. To start with, the facts as sometimes asserted need a bit of a reality check. K Daugherty, M MacDonald, A Petersen and M Seidenberg pointed this out in 1993, countering Kim et al.'s clever title "Why no mere mortal has ever flown out to center field" with their own "Why no mere mortal has ever flown out to center field, but people often say they do", 15th Annu. Conf. Cogn. Sci. Soc., 1993.

I don't have a copy of that presentation, which doesn't seem to be available on line [Update: here it is...], but I do have access to the Google News Archive, where a search for {"flew out to * field"} yields 1,300 hits like these (the first four hits at the moment):

But Mike Marshall flew out to right field to end the inning.
Piazza simply flew out to right field to end the second inning.
He flew out to right field in the first inning, grounded out on a bunt try in the third, flew out to left in the fifth, and flew out to center in the seventh.
A single by Mark Grace drove in Sandy Martinez, but Rodriguez flew out to left field to end the threat.

And we can even find things like

With Monty Tech ahead, 5-1, and the first batter of the inning having already flown out to center field, it was hard to imagine the drama that was about to [unfold].

It's true that {"flied out to * field"} yields 2,600 hits. So you might think that "flied" beats "flew", if only stochastically. But there's something funny about the "flied" examples -- at least on the first few pages, a large fraction of them are simple recaps, often apparently generated mechanically from the scorecard, e.g.

3rd: Nix flied out to right field. Barajas doubled to right. M.Young flied out to center field. Blalock doubled to right, Barajas scored. ...


Rockies third - Mohr flied out to left field. Helton flied out to left field. ... Taguchi flied out to right field. Cardinals 4, Rockies 4. ...

So it seems that when an actual sportswriter-type human being is writing prose about baseball, at least in the texts indexed by the Google News Archive, more human beings "flew out to center field" than "flied out to center field". Nor is this a linguistic innovation -- thus in the Los Angeles Times for Feb. 14, 1898, on p. 5, we can read that

Tyler then steadied down and struck out the next two men up, and the third one flew out to center field.

So why did a whole series of linguists and psycholinguists put a star (indicating ungrammaticality) on "*flew out to center field" -- and even make this alleged impossibility the theme of a clever title in a major journal article ("Why no mere mortal has ever flown out to center field")? There's an obvious hypothesis, awaiting empirical test. Perhaps (talk about) baseball is less frequent in the life of linguists than in the life of sportswriters.

Anyhow, whether the true account of the facts is syntactic or semantic, it needs to deal with the fact that "systematic regularization" is in fact rather patchy. And Hollywood's embrace of "greenlit" over "greenlighted" is just another patch in the quilt.

(For more on the background of this discussion, see the tag-team match in Trends in Cognitive Sciences 6(11), 2002, between Steven Pinker and Michael Ullman on one side, and James McClelland and Karalyn Patterson on the other. P&U "The past and future of the past tense"; M&P "'Words or Rules' cannot exploit the regularity in exceptions"; M&P "Rules or connections in past-tense inflections"; P&U "Combination and structure, not gradedness, is the issue".)

[Update -- Lane Greene writes:

Several funny things seem to be going on with your counts of "flied out" and "flew out". First, I can't recall ever hearing a baseball announcer say "flown out", and I watch a lot of baseball, perhaps unlike Pinker et al. I'm sure I've heard it, but I've never noticed it; if I did, I'd giggle at it as an ungrammatical hypercorrection.

In the middle of writing this e-mail I think I figured it out. Sportswriters and announcers are probably a lot more likely to say "flied out to right" than "flied out to right FIELD". "Flied out to right" yields 190,000 hits in plain Google (not news). "Flied out to right field" yields 371. "Flew out to right" yields 684.

So to get a better picture of sportswriter usage (if you have the time, which you may well not) you'd want to add the counts for "flied out to right", "flied out to center" and "flied out to left" and compare them to the counts for "flew out to right..." etc. You'll see on the first page that most "flew out to right" results are baseball-related, and don't involve getting in a plane or using super powers to fly out to right anything, so it's a good comparison without too much non-baseball noise.

If you look, I think you'll see that Pinker is right: "flew out" is a widely attested but definitely idiosyncratic usage. "Flied out" is several orders of magnitude more common.

Well, I accept Lane's intuition, which for that matter I share. But the facts of usage these days seem to be distinctly otherwise. Here are detailed counts from the Google News Archive for left, right and center, with a following field and with the following context unspecified.

                           left     right    center
"flied out to __ field"     732       895       840
"flew out to __ field"      338       453       419
"flied out to __"        18,200    19,300    21,700
"flew out to __"          1,450     1,500     1,750

The results are comparable to what I found earlier with the interpolated asterisk in place of {left, right, center}.

Why the difference? Well, we can get a clue if we look at the first five hits for "flied out to center" (from the Google News Archive search snippets):

TORONTO 5TH: Mondesi flied out to center. C Delgado popped out to third. Fullmer homered to right.
Renteria flied out to center fielder Gathright. Ortiz singled to center. M.Ramirez singled to center, Ortiz to third. Nixon singled to center, Ortiz scored ...
Womack flied out to center fielder Damon. Walker walked. Pujols popped out to second baseman Bellhorn. 0 runs, 0 hits, 0 errors, 1 left on.
TEXAS 2ND: Segui flied out to center. R Mateo lined out to third. Lamb doubled to left. R Clayton singled to center, Lamb scored.
P Reese flied out to center. J Damon walked. M Bellhorn singled to left, J Damon to second. D Ortiz doubled to deep center, J Damon scored, M Bellhorn to ...

In other words, these are not general prose, they're recaps that are mechanically-generated (maybe even computer-generated) from the scorecard.

(And even with that assist, the difference is just about one factor of ten, not "several orders of magnitude".)
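The arithmetic behind that parenthetical can be checked directly from the counts in the table above. The sketch below is mine, not part of the original exchange; the dictionary layout and function name are just one convenient way to hold the numbers.

```python
import math

# Google News Archive counts from the table above: (left, right, center).
HITS = {
    "with field": {"flied": (732, 895, 840), "flew": (338, 453, 419)},
    "open context": {"flied": (18200, 19300, 21700), "flew": (1450, 1500, 1750)},
}


def flied_to_flew_ratio(context: str) -> float:
    """Total 'flied out' hits divided by total 'flew out' hits for one context."""
    row = HITS[context]
    return sum(row["flied"]) / sum(row["flew"])


for context in HITS:
    ratio = flied_to_flew_ratio(context)
    # log10 of the ratio is the difference in orders of magnitude.
    print(f"{context}: ratio {ratio:.2f}, log10 {math.log10(ratio):.2f}")
```

With "field" included the ratio comes out at about 2 to 1; with the following context left open it is about 12.6 to 1, or roughly 1.1 orders of magnitude, and that larger figure includes all the scorecard recaps.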

As for the 190,000 hits in the regular Google search for {"flied out to right"}, again, every single hit on the first five pages is in a scorecard recap.

So we're left with a psycholinguistic mystery -- why do some literate baseball fans like Lane Greene believe, contrary to the truth, that "flew out" is an "idiosyncratic usage" that is "several orders of magnitude" less common than "flied out" in baseball contexts, including sportswriting? I'll confess that I drank the "systematic regularization" Kool-Aid on this point for decades myself -- right up until the point last night when I checked the facts.

As for "flown out", I agree that it's rare, but announcers don't have occasion to use the pluperfect very often. ]

Posted by Mark Liberman at February 6, 2007 06:42 AM