July 04, 2006

More on the "one-time rings"

I've gotten dozens of responses to the post on "Matrimonial cryptography" (6/28/2006), and I'll quote or summarize them all later. But so far, none of them really solve the problem that Dan and Sarah set, and some of them don't really address it at all. So I thought I'd say a little more about the problem, and explain in greater detail what a solution might look like.

Dan and Sarah are planning to get married, and they hatched the charming plan to find three messages written in the ordinary alphabet of letters from A to Z -- call the messages D, S, and M -- such that (for some simple, general and metaphorically satisfying function f) it will be true that f(D,S) = M. They'll have D engraved on Dan's wedding band, and S engraved on Sarah's, and then their individual messages will combine to create something new, implicit in the two of them but present in neither one alone, just as their marriage does.

They had a specific idea for the function f: use the simple trick of mod-26 addition found in some versions of "one-time pad" cryptosystems. (There could be other choices for f, and Dan and Sarah wouldn't turn down a good one, but choices that involve clever variants of letter-shapes, or remembering secret keys, aren't really in the spirit of the puzzle.) For those of you who aren't familiar with the mod-26 version of the one-time pad idea, here's an explanation. Suppose we align the letters from A to Z (ignoring case) with the numbers from 0 to 25, and with all the other integers modularly related to them:

... x y z a b c d e f g h i j k l m n o p q r s t u v w x y z a b c ...
... -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ...
    ... 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 ...      
      ... -26 -25 -24 -23 -22 -21 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 ...    

Then we can combine strings of letters by adding them. We find that ||aced + scam = seep||, because

a+s=s (0+18=18)
c+c=e (2+2=4)
e+a=e (4+0=4)
d+m=p (3+12=15)

The combination inverts by simple subtraction: ||seep - scan = aced|| (because 18-18=0, 4-2=2, etc.) and likewise ||seep-aced = scam||.

Similarly, ||zulu + soul = riff|| because

z+s is 25+18=43, and 43 mod 26 (i.e. 43 - 26) is 17 → r
u+o is 20+14=34, and 34 mod 26 is 8 → i
etc.

In a net-accessible list of 109,582 English wordforms, there are 853 three-letter words, and 25,446 triples in which two three-letter words from the list add up by these rules to form a third word that is also on the list. This list of triples starts with ||aah+ado=adv||, ||aah+ads=adz||, ||aah+ale=all||, and runs through ||zoo+wax=vol||, zoo+zag=you||, and ||zoo+zap=yod||. There are 3,130 three-letter words, and 51,296 four-letter triples, from ||aahs+alfa=alms|| to ||zuni+tuck=sops||. There are 6,919 five-letter words, and 15,370 five-letter triples, from ||aahed+lance=laugh|| to ||zunis+tulsa=soyas||. There are 11,492 six-letter words, and 2,665 six-letter triples, from ||aahing+setons=seaway|| to ||zooids+casaba=bogies||.

And of course we don't need to limit ourselves to message mappings where the word edges all line up. We have ||an_ear+troth=testy||, ||salute+in_case=annuli||, and so on (where we'll elide the spaces in the old-fashioned cryptographical style...).

To use this scheme as a cryptosystem, the sender and receiver of the message need a shared keystream of sufficiently random letters. The sender adds the secret keystream to the plaintext to create the cyphertext, and the receiver then subtracts the keystream from the cyphertext to re-create the plaintext.

In Dan and Sarah's plan, there is no secret random keystream. Instead, each ring's message forms the keystream for the other ring's plaintext, and the two rings together -- taken in either order, because addition is commutative -- create a new message, which is implicit in the two of them, but present in neither one alone. The combination-function is simple and hallowed by tradition. The puzzle is to find three messages that are about the right length to engrave on the rings -- at most 20 characters or so -- and also are appropriate to the occasion.

I'm sure that Dan and Sarah would accept lines from Petrarch or Goethe, but let's limit things to English messages for the moment. The number of up-to-20-letter-long English-word-sequence triples will be astronomical, and nearly all of them will be incoherent and matrimonially inappropriate -- and crashingly boring to boot. Given a word list, it's easy to write a program to enumerate the possibilities (though it might take a long time to run to completion), but searching that program's output for attractive phrases is not a plausible plan.

We can set one of the three strings to something appropriate, and look for ways to fill the other two strings with genuine word sequences. It's easy to write a program to do this, but again, nearly all of the results will be incoherent and silly at best, even if we scan pages of output for something reasonable:

  e a r s a g b e n t i f e l l b l a m ...
+ d o f l o c k k i l l c a t a d i m s ...
  h o w d o i l o v e t h e e l e t m e ...

An approach that might be more would start with an appropriate output message, automatically derive a large number of word-sequence-pairs that sum to it, and then sort the results by estimated sequence probability based on some simple but reasonable model (like a trigram model or an aggregate bigram model). You could train or adapt the model on a collection of poetry. Then you might hope that scanning a few dozen screenfuls of output would turn up a plausible candidate or two. That's what I was hoping some kind and clever reader would take the time to to program up for Dan and Sarah -- if I had a spare couple of hours, I'd do it myself.

But today's the Fourth of July, and those of us in the U.S. of A. will be spending today celebrating with family and friends. Still, if you're from a fourthless culture, or otherwise find yourself at loose ends at some point in the near future, you couldn't find a more sentimentally satisfactory puzzle than this one.

Posted by Mark Liberman at July 4, 2006 07:03 AM