August 25, 2003

In that case, Dottore, you leave me no choice.

I learned yesterday that Antonio Zampolli died Friday morning, as a result of a freak fire in his office.

As a small and eccentric addition to the obituaries and memorials that are already starting to appear, I'd like to tell a story about the well-known linguist G., Antonio Zampolli and the Archbishop of Pisa.

I heard this story late one night sometime in 1990, while walking near the Pisa cathedral after a long, somewhat alcoholic dinner. My informant was one of the participants. I'm not at all sure that the tale is factually true, but I think that what it says about Antonio's relationship with society and with life is true enough.

On the grounds of the Pisa cathedral, there is a series of low stone pillars with heavy iron chains hung between them, marking the boundary between paved walkways and grassy areas. My informant pointed to a row of these chain fences, the hundred meters or so of grass beyond them, and on the far side, a low building forming part of the wall that surrounds the cathedral grounds. "When G. was visiting," he said, "he and Antonio were passing just here on a night just like this one, after just such a dinner as we have had, and Antonio challenged him to a race, hurdling the fences, crossing the grass and ending at the wall over there."

Neither Antonio nor G. looks much like a sprinter, but Antonio was deceptively athletic, having (he once told me) played in goal for the Italian Olympic hockey team in 1956. Apparently G. was game, and off they went. After the race, both men were winded as well as a little drunk, and so they sat down at the base of the wall to recover.

Then "it was a beautiful night -- the air was warm, the moon was out, the stars were bright -- and so of course Antonio began to sing."

After a few minutes, there was the sound of a siren, a police car drove up, and two policemen got out and approached them. "Ah, Dottore," said the senior policeman, "I'm sorry to trouble you, but that window up there? it's the bedroom of the Archbishop. The poor old man is not in good health, and your singing has awakened him. He has telephoned to my superior, and thus I must ask you to stop singing and let him sleep in peace."

But as soon as the police left, Antonio began to sing again. A few minutes later they were back, and repeated the warning, somewhat less politely. However, no sooner had the police car driven off for the second time, than Antonio again resumed singing. And the police returned yet again, clearly in a state of considerable exasperation. "Dottore," said the senior policeman, "please think, the poor old man, at 2:00 in the morning! His health! My superior! We cannot keep warning you again and again, without taking some action!"

Antonio said nothing. So the policeman continued: "One last time, Dottore, I must ask you: will you sing or will you not sing?"

"I will sing!"

"In that case, Dottore, you leave me no choice... I will sing with you!"

Antonio could often be exasperating, and not only to policemen, but in the end, his colleagues usually saw no choice but to sing along with him. His voice will be missed.

Posted by Mark Liberman at 08:36 PM

August 09, 2003

Those who are not authorized are not authorized

My gym's plural its sign reminds me of a story about another sign, in another place and time.

In the spring of 1970, I was patching up helicopters at an Army camp near the place where Vietnam, Laos and Cambodia come together. We had a problem with tools, parts and supplies disappearing, and our first sergeant blamed it on outsiders wandering through our work areas. I had access to sheetmetal, paint and stencils, so he told me to make up some big signs to warn off passing lurps, crews from other units, and such-like suspicious types.

"Here's what I want," he said. "Big red letters on a white background: 'PERSONNEL WHO ARE NOT AUTHORIZED TO BE IN THE HANGAR ARE NOT AUTHORIZED TO BE IN THE HANGAR.'"

I wrote it down and looked at it.

"Don't you think it's kind of redundant?" I asked.

"What do you mean 'redundant'?"

"Well, it kind of says the same thing twice."

He looked at the message for a while. "OK, I see what you mean. So instead, let's make it: 'ONLY PERSONNEL WHO ARE AUTHORIZED TO BE IN THE HANGAR ARE AUTHORIZED TO BE IN THE HANGAR.'"

I wrote that down too.

"Sarge, I hate to say it, but I think that one's got the same problem."

Silence. Grunt. Silence. "If you're so smart, what do you suggest?"

"Well, the usual thing is 'AUTHORIZED PERSONNEL ONLY.'"

Silence. Suddenly, a big grin. "OK, college boy, now you tell me this -- 'authorized personnel only WHAT?'"

Too quickly, I answered "well, I guess it's something like 'authorized personnel are the only ones who are allowed, uh..." Lamely: "... allowed to be in the hangar.'"

Triumphantly: "Ha! And what does 'allowed' mean, smartass?"

In the end, I made up six signs, with big red letters on a white background, reading 'PERSONNEL WHO ARE NOT AUTHORIZED TO BE IN THE HANGAR ARE NOT AUTHORIZED TO BE IN THE HANGAR.'

Tools, parts and supplies continued to evaporate at the same rate as before. And a third of a century later, I still don't have a really good answer to the question "authorized personnel only WHAT?"

Posted by Mark Liberman at 09:12 AM

"All lockers must be emptied of its contents."

Plural pronouns with nominally singular antecedents like "everyone" have been a major battlefield in the grammar wars. "Everyone loves their mother": right or wrong?

My gym just fired the opening gun in a new skirmish, by posting dozens of signs reading "All lockers must be emptied of its contents by August 22 at 5:00 p.m."

Someone has learned, from the "singular their" fuss, a lesson that no one wanted to teach: Universally quantified antecedents should get singular pronouns. Or something like that.

This is clearly a case of hypercorrection. However, it's not clear which side has gained: the prescriptivists can claim (correctly?) that even their opponents surely agree that THIS is a mistake; the anti-prescriptivists can counter that pedantry is the root cause of the error.

It's interesting that everyone, prescriptivists and anti-prescriptivists alike, seems to think that hypercorrection is wrong, morally as well as logically. For the prescriptivists, any form that deviates from a postulated universal standard is wrong. For (at least some of) their opponents, use makes right, as long as you conform to your own group's norms -- but it's a sin to imitate the norms of a more prestigious group, if you get it wrong. On this point, the prescriptivists seem to me to be fairer and more democratic in their attitudes, even if their particular prescriptions are often foolish.

Posted by Mark Liberman at 07:58 AM

August 05, 2003

English on the anvil

David Gardner writes about Hinglish. I particularly like the idea of P.G. Wodehouse entering the Hindu pantheon -- he's certainly in mine.

Here is a magazine article on a more code-switching version of Hinglish. And for lexicographic completeness, here is a discussion of Hinglish as a film genre.

Posted by Mark Liberman at 10:14 AM

August 03, 2003


On 20 June 2003, ICANN announced the deployment of a new system of Internationalized Domain Names (IDNs), which permit domain names to use non-Roman scripts. Thanks to this we now have spam about "multilingual web addresses" and hype which confuses languages and writing systems: "register any domain name in any language."

The new system permits web addresses in any Unicode-supported script, covering all the "prominent" languages but leaving many others unsupported.

To avoid modifying the infrastructure of the internet, IDNs are uniquely and reversibly translated into ASCII strings that carry the special prefix "xn--" and consist only of letters, digits and hyphens. These ASCII names are then resolved to IP addresses by nameservers in the usual way. End-users don't actually see these ASCII names, since the business of mapping between Unicode and ASCII is handled by IDN-aware web applications. For example, consider the following Devanagari IDN:

यहलोगहिन्दीक्योंनहींबोलसकतेहैं.com

This would be mapped to:
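
The translation can be reproduced with a minimal Python sketch. It assumes the standard library's `punycode` codec (an implementation of the Punycode algorithm in RFC 3492) and skips the Nameprep normalization that a full IDN-aware application would apply first. The helper names `to_ascii_label` and `to_unicode_label` are hypothetical, and the label is spelled out as code points so that the example does not depend on how this page is displayed.

```python
# Code points of the Devanagari label, listed explicitly so the example
# does not depend on how the surrounding text was encoded.
CODE_POINTS = [
    0x092F, 0x0939, 0x0932, 0x094B, 0x0917, 0x0939, 0x093F, 0x0928,
    0x094D, 0x0926, 0x0940, 0x0915, 0x094D, 0x092F, 0x094B, 0x0902,
    0x0928, 0x0939, 0x0940, 0x0902, 0x092C, 0x094B, 0x0932, 0x0938,
    0x0915, 0x0924, 0x0947, 0x0939, 0x0948, 0x0902,
]

ACE_PREFIX = "xn--"  # the special prefix that marks an encoded label


def to_ascii_label(label: str) -> str:
    """Encode one Unicode label as prefix + Punycode (no Nameprep step)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_unicode_label(ascii_label: str) -> str:
    """Reverse the mapping for a label that carries the prefix."""
    if not ascii_label.lower().startswith(ACE_PREFIX):
        return ascii_label  # plain ASCII label, nothing to decode
    return ascii_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


label = "".join(chr(cp) for cp in CODE_POINTS)
ascii_label = to_ascii_label(label)

# Only the label is encoded; the ".com" suffix is already ASCII.
print(ascii_label + ".com")
```

Running the script prints the ASCII form of the domain. Passing that label back through `to_unicode_label` returns the original Devanagari text, which is the reversibility the scheme relies on.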

RFC 3490, the first of three IDN Standards, recognizes a linguistic problem: "the introduction of the larger repertoire of characters creates more opportunities of similar looking and similar sounding names." ICANN's Guidelines address this by requiring top-level domain registries to associate each IDN with one language or set of languages, and to employ language-specific rules, such as the reservation of all domain names with equivalent character variants in the languages associated with the registered domain name.

Language identification: Associating an IDN with a language or set of languages is problematic when the standard for language identification covers less than a tenth of the world's languages, with a host of attendant problems as explored by Peter Constable and Gary Simons (2000) in their paper Language identification and IT: Addressing problems of linguistic diversity on a global scale.

Language-specific rules: Unicode already introduces indeterminacy, since a single visual form, such as a URL printed on a business card, has many Unicode representations. For example, U+00C7 (LATIN CAPITAL LETTER C WITH CEDILLA) can also be expressed as the sequence U+0043 (LATIN CAPITAL LETTER C) U+0327 (COMBINING CEDILLA). However, each language has its own possibilities for orthographic confusability and variability, especially those languages lacking a standard orthography. In ICANN's system, the top-level domain registries will handle these by establishing language-specific rules of character equivalences.
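
The cedilla case can be checked directly. The following is a minimal sketch using Python's standard `unicodedata` module. NFC normalization is used here only as one illustrative way of collapsing the two representations; it is an assumption of the example, not a description of what any particular registry does, and the helper name `same_after_nfc` is hypothetical.

```python
import unicodedata

precomposed = "\u00C7"       # LATIN CAPITAL LETTER C WITH CEDILLA
decomposed = "\u0043\u0327"  # LATIN CAPITAL LETTER C + COMBINING CEDILLA


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (composed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings look identical on screen but differ as code point sequences.
print(precomposed == decomposed)                # False
print(same_after_nfc(precomposed, decomposed))  # True

# Listing the character names makes the hidden difference visible.
print([unicodedata.name(ch) for ch in precomposed])
print([unicodedata.name(ch) for ch in decomposed])
```

Normalization of this kind removes only the Unicode-level indeterminacy. Language-specific equivalences, such as alternative spellings in a language without a standard orthography, are not captured by it and would still need the per-language rules described above.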

This model has three indeterminacies: language identification, Unicode character identification, and language-specific character equivalences. Perhaps it had to be this complicated. If nothing else, the indeterminacies will be a great marketing opportunity. For example, Verisign's IDN Marketing Guide presents several ploys for getting people to buy variants of the same IDN.

Posted by Steven Bird at 08:52 AM