Language Log: The world in a grain of sand

January 29, 2008

The world in a grain of sand

Andrew Gelman (at the Statistical Modeling, Causal Inference and Social Science blog) recently posted "A message for the graduate students out there"

Research is fun. Just about any problem has subtleties when you study it in depth (God is in every leaf of every tree), and it's so satisfying to abstract a generalizable method out of a solution to a particular problem.

which references a post of his from 2005:

In a recent article in the New York Review of Books, Freeman Dyson quotes Richard Feynman:

No problem is too small or too trivial if we can really do something about it.

This reminds me of the saying, "God is in every leaf of every tree," which I think applies to statistics in that, whenever I work on any serious problem in a serious way, I find myself quickly thrust to the boundaries of what existing statistical methods can do.

That's certainly what happens in every area of linguistic research that I've ever worked on. Except that the terra incognita around us is not only methodological, but also descriptive and conceptual.

Andrew draws a cheerful -- and true -- conclusion, which also applies in the various areas of linguistics:

[This] is good news for statistical researchers, in that we can just try to work on interesting problems and the new theory/methods will be motivated as needed.

Last week at dinner after my talk at the University of Chicago, several of us were reminiscing about our grad-school experiences. John Goldsmith and I contributed a number of stories about Morris Halle, who taught this lesson so intensely and effectively.

As Andrew says, the initial effect of looking carefully at an arbitrary problem is generally additional complexity:

I could give a zillion examples of times when I've thought, hey, a simple logistic regression (or whatever) will do the trick, and before I know it, I realize that nothing off-the-shelf will work. Not that I can always come up with a clean solution (see here for something pretty messy). But that's the point--doing even a simple problem right is just about never simple.

Occasionally -- more often if you're smart or lucky -- you run across a simple problem for which the standard treatment is complicated and not very good, where you can find a better, simpler and generalizable solution.

It's commoner to find a better solution that's just as complicated, if not more so. But Morris taught us to have faith that if you keep at it, glimpses of the truth will be revealed.

One of my favorite examples comes from a rather different area of speech and language research. In the 1920s and 30s, Harvey Fletcher made basic discoveries about auditory physiology and the nature of speech perception, as a consequence of looking carefully at a mundane-seeming problem: how to predict the intelligibility of nonsense syllables as a function of the frequency response of a telephone circuit.

For an accessible and entertaining discussion, see Jont Allen, "Harvey Fletcher's role in the creation of communication acoustics", J. Acoust. Soc. Am. 99(4), 1996. And for an entirely different perspective on the "articulation index" research, see Aleksandr Solzhenitsyn's fictionalized account (in The First Circle) of Gleb Nerzhin's doomed attempts to apply and extend the same ideas.

Yakonov wants Nerzhin to give up his work on the acoustics of perception, and to focus instead on hacking the circuitry of the vocoder that Stalin demanded to compete with SIGSALY, which Fletcher's group at Bell Labs had built during WW II (with the participation of Alan Turing on the British side).

Posted by Mark Liberman at January 29, 2008 05:22 PM