November 10, 2006

Recursively nested quotation marks

The theory of quotation marks in printed Standard English says that whether you use single quotation marks (‘ ’) or double quotation marks (“ ”) as the default, if you have to enclose one quotation within another you switch to the other kind, and so on recursively, alternating quotation mark types: either ‘...“...‘...’...”...’, and so on, or “...‘...“...”...’...”, and so on, to any depth of embedding of quotes. But of course, even just a quoted string that is within a quoted string that is itself within a quoted string is very rare. If you would like to see one in the wild, I can tell you one place to look.

In Elizabeth Kostova's best-selling vampire novel The Historian (Time Warner Books, 2005), starting four lines from the bottom on page 365 of the paperback edition (8th reprint) that I purchased in England this summer, you can see a sequence like this (I omit a long portion just before “The epitaph”):

   “‘“A Traveller” had visited the monastery in Snagov in 1605. He had talked a good deal with the monks there [. . .] The epitaph, which I copied down with care — out of what instinct I didn't know — was in Latin.’ Hugh dropped his voice, glanced behind him, and stubbed out his cigarette in the ashtray on our table.

    “‘After I'd written it down and struggled with it a while, I read my translation aloud: “Reader, unbury him with a —” You know how it goes [. . .]

   [. . .substantial amounts of text omitted here —GKP . . .] My father looked very upset.’ Here Hugh lit another cigarette, and the match shook in the gathering darkness. [. . . much more text omitted here —GKP . . .] 

What is going on here is that the character named Hugh James is telling a long story (shown here in green) which is embedded within a longer story being told by the father of the narrator of the whole novel (shown in red). The father's words are signalled by opening double quotation marks (in red), which, in a style familiar to those who know 18th and 19th-century epistolary novels, are repeated at the beginning of each paragraph but only closed once, at the end of the whole section in that person's voice (so the closing red quotation marks are not shown above; they occur on page 376, at the end of the chapter). Hugh James's words are shown in single quotation marks (green), also not closed at the end of each paragraph but only at the end of a complete section of direct quotation in his voice (as, for example, just before “Hugh dropped his voice”; the second green left single quotation mark above is actually not closed until a couple of paragraphs later, on the lower half of page 366, after the words “My father looked very upset”). The first double quotation marks in blue are scare quotes; there is a character in a manuscript identified only as “A Traveller”, and the novelist uses scare quotes in the written form of Hugh James's spoken utterance to make it clear that, in the part shown here in blue, Hugh James is not using an indefinite noun phrase in his own voice but rather using a repetition of the manuscript's way of identifying a certain definite individual. Later the green type is interrupted by another blue section, in double quotation marks, where Hugh quotes the epitaph.

Also embedded in the narrator's father's double-quoted sections of the novel are letters from another character, the father's mentor Bartolomeo Rossi, and these are in italics. Had they been shown in quotes instead, those would have been single quotation marks.

This is really a novel of very complex narrative structure. One has to keep one's eye on the ball, and one's recursion-depth counter on the level of quotations in which one is currently embedded. This kind of complexity will not be found very often in any kind of literature. But at least I have been able to show you a paragraph that opens with the sequence <Left Double Quotation Mark> <Left Single Quotation Mark> <Left Double Quotation Mark>. And I could have put things differently by saying that the paragraph begins with ‘“‘“’, giving you a four-quotation-mark sequence to read (the outermost single quotation marks would be mine, to indicate that I am quoting a string composed of the other multicolored ones). And if someone else quoted me, they would need to add yet another set of quotation marks (those should be double quotation marks, since I used single). In principle, there is no limit.

Nerd note: I leave it as an exercise for those readers who are acquainted with the methods of formal language theory to turn what I have just explained into a rigorous argument that the set of all possible properly punctuated English texts cannot be accepted by any finite-state automaton.
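
For readers who would rather watch the recursion-depth counter at work than prove things about it, here is a minimal sketch in Python. It is not the rigorous argument asked for above, only an illustration. The function name max_quote_depth is made up for this example, and the sketch rests on two simplifying assumptions: apostrophes are typed as the straight ' character rather than as a right single quotation mark, and every opening mark gets its own closing mark (so the novel's habit of repeating an opening mark at the start of each paragraph is not handled).

```python
# Hypothetical helper, not from the post: checks that curly quotation
# marks nest properly and alternate in type, and reports the deepest level.
# Assumes apostrophes are straight (U+0027) and every opener is closed once.
OPEN_TO_CLOSE = {"\u201c": "\u201d", "\u2018": "\u2019"}  # double and single pairs
CLOSERS = set(OPEN_TO_CLOSE.values())


def max_quote_depth(text):
    """Return the deepest quotation level in text.

    Raises ValueError if the marks fail to nest or fail to alternate.
    """
    stack = []    # currently open quotation marks, outermost first
    deepest = 0
    for i, ch in enumerate(text):
        if ch in OPEN_TO_CLOSE:
            # An embedded quotation must switch to the other kind of mark.
            if stack and stack[-1] == ch:
                raise ValueError(f"quotation mark type not alternated at index {i}")
            stack.append(ch)
            deepest = max(deepest, len(stack))
        elif ch in CLOSERS:
            # A closing mark must match the most recently opened one.
            if not stack or OPEN_TO_CLOSE[stack[-1]] != ch:
                raise ValueError(f"unmatched closing quotation mark at index {i}")
            stack.pop()
    if stack:
        raise ValueError("quotation left open at end of text")
    return deepest


print(max_quote_depth("“She said ‘he wrote “Reader” on it’ and left.”"))  # 3
```

The stack is the point of interest: it holds one entry per level of embedding, and since there is in principle no limit to the embedding, there is no fixed bound on how large it may have to grow.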

For more nerdy typographical stuff about quotation marks in various languages, see this post by John Cowan.

Posted by Geoffrey K. Pullum at November 10, 2006 02:06 PM