Mark Nandor, a math teacher at Wellington School in Columbus, Ohio, has posted a list of all of the English words that can be spelled using the symbols for the first 111 elements, as well as lists of magic squares made up of chemical symbols. His definition of an English word is "listed in the ENABLE word list", which is used by Scrabble players. You can get your own copy here if you like: enable.zip.
Nandor says that he computed the list using Mathematica in about 25 hours including programming time. Mathematica is a wonderful tool for doing mathematics, but it isn't ideal for this sort of problem. I solved the same problem by matching this regular expression case-insensitively against the ENABLE word list:
^((ac)|(ag)|(al)|(am)|(ar)|(as)|(at)|(au)|(b)|(ba)|(be)|(bh)|(bi)|(bk)|(br)|(c)|(ca)|(cd)|(ce)|(cf)|(cl)|(cm)|(co)|(cr)|(cs)|(cu)|(db)|(ds)|(dy)|(er)|(es)|(eu)|(f)|(fe)|(fm)|(fr)|(ga)|(gd)|(ge)|(h)|(he)|(hf)|(hg)|(ho)|(hs)|(i)|(in)|(ir)|(k)|(kr)|(la)|(li)|(lr)|(lu)|(md)|(mg)|(mn)|(mo)|(mt)|(n)|(na)|(nb)|(nd)|(ne)|(ni)|(no)|(np)|(o)|(os)|(p)|(pa)|(pb)|(pd)|(pm)|(po)|(pr)|(pt)|(pu)|(ra)|(rb)|(re)|(rf)|(rg)|(rh)|(rn)|(ru)|(s)|(sb)|(sc)|(se)|(sg)|(si)|(sm)|(sn)|(sr)|(ta)|(tb)|(tc)|(te)|(th)|(ti)|(tl)|(tm)|(u)|(v)|(w)|(xe)|(y)|(yb)|(zn)|(zr))+$
using the GNU version of the standard Unix utility grep (specifically, its egrep avatar). It took ten minutes or so to locate and download the ENABLE list and construct the regular expression. The computation time? Less than one second on my 1.6GHz P4 with 512MB of RAM, not exactly a supercomputer. Moreover, I think that I got the correct result. Nandor's program somehow missed the valid words berg and urges, but included the non-words cryosurg ical, urg es, and v irgins.
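For readers without grep at hand, the same search can be sketched in Python. This is only an illustration under stated assumptions, not the egrep command used above: the names SYMBOLS, is_element_word, and element_words are made up for the example, and the path passed to element_words is whatever file name your copy of the ENABLE list unpacks to, assumed to hold one word per line.

```python
import re

# Lowercased symbols for the first 111 elements, in the same order as the
# alternatives of the regular expression above.
SYMBOLS = """
ac ag al am ar as at au b ba be bh bi bk br c ca cd ce cf cl cm co cr cs cu
db ds dy er es eu f fe fm fr ga gd ge h he hf hg ho hs i in ir k kr la li
lr lu md mg mn mo mt n na nb nd ne ni no np o os p pa pb pd pm po pr pt pu
ra rb re rf rg rh rn ru s sb sc se sg si sm sn sr ta tb tc te th ti tl tm u
v w xe y yb zn zr
""".split()

# One or more symbols in a row; fullmatch() below plays the role of ^ and $.
ELEMENT_WORD = re.compile("(?:" + "|".join(SYMBOLS) + ")+", re.IGNORECASE)


def is_element_word(word):
    """Return True if the whole word is a concatenation of element symbols."""
    return ELEMENT_WORD.fullmatch(word.strip()) is not None


def element_words(path):
    """Return the matching words from a file with one word per line.

    The path is a placeholder for a local copy of the word list.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if is_element_word(line)]
```

With this sketch, is_element_word("berg") and is_element_word("urges") both come out true (be-rg and u-rg-es), while a word containing a letter such as j or q, which begins no symbol, is rejected.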
Personally, I don't find this sort of exercise all that fascinating, though I know some people do. It does, however, provide a nice illustration of the utility of regular expression matching for linguistic searching.
Posted by Bill Poser at January 3, 2006 10:13 PM