January 11, 2006

Stupid machine-generated spiritual blather

The site established by the Devi Press (find them if you want at http://www.devipress.com/, but I am damned if I am going to give them a link) for the purpose of advertising its books on Christian, gnostic, and mystical topics has a set of pages containing 1,185 (one thousand, one hundred eighty-five) articles on religious topics, each with an accompanying link to a page advertising a book called The Mystic Christ. The article titles, indexed in ASCII order, run from "1 John God Is Love" and "A Love Sent From God Above" down to "Youth Group Devotions" and "Zohar Kabbalah". And a typical piece of prose from one of them looks like this:

Abounding opposites present a few pages but he was carried out of members who went to our conversation ends where it really fulminating on ecumenically united states submitted as well i daniel had died of morality and a final analysis be fighting with which are by subject browse for an unbearable. The fourth year of heaven on a thousand strong and he had the christians about. We will of your policy who are his entrance into sticking it was called the only your sects so that will take the whole world or unpopular they do you?

That's right. Every single article was generated by an extremely crude random text-generation algorithm. (Fantastically crude. Computational linguists can do a lot better than this. Heck, a trained trunk monkey could do better.) The articles even bear a notice saying (lest the program actually write something intelligible) "DISCLAIMER: The text for this article was generated automatically by a computer. As such, nothing in this article should be construed as a statement of fact or as the opinion of the maintainers of this site." And each article has ten links to others on the list. The entire fraudulent assemblage is just an exercise in Google-bombing: Devi Press is trying to raise its Google ranking by having more than a thousand pages that link to ads for its crappy books, each of those pages being the target of links by multiple other articles.
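To see how little machinery it takes to produce prose of this quality, here is a minimal Python sketch. It is not the program Devi Press used, which is unknown; it assumes a plain word-bigram chain (pick each next word at random from the words that followed the current one in some source text) plus a random assignment of ten outgoing links per page. The function names `build_bigram_table`, `babble`, and `link_pages` and the sample sentence are all invented for illustration.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words that directly follow it in text."""
    words = text.lower().split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def babble(table, length, seed=None):
    """Emit `length` words by repeatedly choosing a random follower.

    Assumption: a bare bigram chain with no grammar at all, which is
    about as crude as a text generator can get.
    """
    rng = random.Random(seed)
    starts = sorted(table)  # sorted so a fixed seed gives a fixed result
    word = rng.choice(starts)
    output = [word]
    for _ in range(length - 1):
        followers = table.get(word)
        # Dead end (word never had a follower): jump to a random start word.
        word = rng.choice(followers) if followers else rng.choice(starts)
        output.append(word)
    return " ".join(output)


def link_pages(titles, links_per_page, seed=None):
    """Give every page `links_per_page` random links to other pages."""
    rng = random.Random(seed)
    links = {}
    for title in titles:
        others = [t for t in titles if t != title]
        links[title] = rng.sample(others, links_per_page)
    return links


if __name__ == "__main__":
    # Made-up source text; any pile of words will do.
    source = ("the fourth year of heaven on a thousand strong and he had "
              "the christians about and the whole world of heaven")
    print(babble(build_bigram_table(source), 25))
```

Run it over any large enough body of text and the output has the same flavor as the passage quoted above: locally plausible word pairs, no syntax, no sense. Calling `link_pages` with ten links per page then wires the generated pages to one another, which is the part that matters for the Google-bombing.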

Would you like me to begin my rant now?

<RANT>What I'm objecting to is not that this crap is religious drivel. It's that it's dishonest drivel. It's an illicit attempt to get advertising space (in the form of appearances on Google search results lists) that other people ultimately pay for. It's like having thousands of huge styrofoam cubes with your company's name on them delivered to a public landfill so that others will see them (only that doesn't happen because neither styrofoam nor landfill space is free).

The poor Google corporation is buying new CPUs and disk drives every day as it tries to keep up with the growth of the web, and every byte of this asyntactic Christian-gnostic-mystical garbage, this useless verbal waste, has to be stored in some huge refrigerated data barn somewhere and indexed and searched every single day. Every legitimate site and every genuine shopper (and — declaration of interest — every honest syntactician trying to explore language using Google as a corpus) is being slowed down (at least a tiny bit), and sometimes baffled and misled, by the totally fake pseudo-text these venal morons are stashing on their server for the sole purpose of masking the fact that nobody is very interested in their boring useless crappy books, and it makes me mad, OK? As Stephen Colbert would say, Devi Press, you're dead to me. You're on my Dead To Me board. All right, I'm done.</RANT>

Posted by Geoffrey K. Pullum at January 11, 2006 10:00 AM