June 15, 2004

base26

A visualization of (pieces of) the space of four-letter words in English, by toxi, using Processing.

It would be nice to apply some appropriate dimensionality reduction first, so that (a version of) the whole four-letter-word space could be shown at once, or seen from perspectives other than those provided by adjacent letter pairs. The same approach would also allow visualization of the whole vocabulary, or of various pieces of it. It would be interesting to try similar things in the space of pronunciations, or in the joint spelling/pronunciation space.
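To make the idea concrete, here is a minimal sketch of one possible reduction, not anything that base26 itself does. It assumes each word is encoded as a 104-dimensional vector (one slot per position and letter) and projects the words onto their first two principal components, found by simple power iteration. The word list, the encoding, and every function name here are illustrative assumptions; a real experiment would use a full lexicon and probably a more suitable method.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical toy sample; a real run would load every four-letter word.
WORDS = ["word", "ward", "card", "cord", "core",
         "care", "bare", "bore", "wore", "ware"]

def one_hot(word):
    """Encode a word as 26 * len(word) numbers: one slot per (position, letter)."""
    vec = [0.0] * (len(ALPHABET) * len(word))
    for pos, ch in enumerate(word):
        vec[pos * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
    return vec

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(rows):
    """Subtract the column means so the data cloud sits at the origin."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def top_component(rows, iters=200, seed=0):
    """Leading principal direction, by power iteration on X^T X."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    norm = dot(v, v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        scores = [dot(row, v) for row in rows]            # X v
        w = [dot(scores, col) for col in zip(*rows)]      # X^T (X v)
        norm = dot(w, w) ** 0.5
        if norm == 0.0:                                   # no variance left
            break
        v = [x / norm for x in w]
    return v

def deflate(rows, v):
    """Remove the part of each row that lies along direction v."""
    return [[x - dot(row, v) * vj for x, vj in zip(row, v)] for row in rows]

def project_2d(words):
    """Map each word to (x, y) coordinates on the first two components."""
    rows = center([one_hot(w) for w in words])
    v1 = top_component(rows)
    v2 = top_component(deflate(rows, v1), seed=1)
    return {w: (dot(row, v1), dot(row, v2)) for w, row in zip(words, rows)}

if __name__ == "__main__":
    for word, (x, y) in sorted(project_2d(WORDS).items()):
        print(f"{word}  {x:+.3f}  {y:+.3f}")
```

The resulting coordinates could be handed to any plotting tool. The same scheme would carry over to pronunciations by swapping the alphabet for a phoneme inventory, though variable-length transcriptions would need a different encoding than the fixed position-by-letter grid assumed here.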

Posted by Mark Liberman at June 15, 2004 11:47 AM