October 27, 2004

The Axis of Spam

America is under attack. Unwanted linguistic material, possibly hazardous, is being launched at us in vast quantities. Spam received at my email address, which I now never advertise anywhere in machine-readable form (though it is of course too late), now vastly outnumbers genuine messages. And I have noticed that a large and increasing proportion of it is now coming from accounts in the national domains of foreign countries. To fight this we must know our enemy. Which countries are the most guilty? Who are the gravest offenders in the Axis of Spam? Here are some very rough details for the spam I have received (all of it trapped by the indispensable SpamAssassin) in the last few weeks.

The details have to be rough because, of course, many (perhaps most) headers are forged: often the "From" lines, "From:" lines, "Received from" lines, "Sender" lines, and "Reply-To" lines don't match, and so on. The Unix tools I have used make it easy to automate the count without any human effort, but they go wrong on some unexpected kinds of message, they will not detect duplicates, and so on. For what it is worth, I have ranked the countries named below by one very crude measure: the number of times their suffix appears in material that has been correctly identified as spam by my filter. Here, then, is a very rough guide to the evilness of countries as spam sources, in decreasing order of evildoerhood. The number at the left is the count for that country's suffix in my sample of spam.

33 Italy (.it)
16 Japan (.jp), Germany (.de)
14 France (.fr)
13 The Czech Republic (.cz)
12 The United Kingdom (.uk)
11 China (.cn)
8 Guatemala (.gt), South Africa (.za), Spain (.es), Taiwan (.tw)
7 Portugal (.pt)
5 Austria (.at), Latvia (.lv), Poland (.pl), Sweden (.se)
4 Argentina (.ar), Australia (.au), Canada (.ca)
3 Brazil (.br), Greece (.gr), Iceland (.is), Israel (.il), Lithuania (.lt), Russia (.ru), Slovenia (.si)
2 Cocos (Keeling) Islands (.cc), Hungary (.hu), Netherlands (.nl), Niue (.nu), Panama (.pa), Peru (.pe), Romania (.ro)
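
For anyone who wants to try a similarly crude tally, here is a minimal sketch in Python. It is not the set of Unix tools I actually used; the function name, the list of header names, and the pattern for spotting host names are all assumptions made up for illustration. It simply counts every two-letter suffix that ends a host name in the chosen header lines, so it shares the weaknesses described above: forged headers are counted at face value and duplicates are not detected.

```python
import re
from collections import Counter

# Assumed set of header lines to scan; which headers to trust is a guess.
HEADER_PREFIXES = ("From:", "Sender:", "Reply-To:", "Received:")

# A dotted host name whose final label is exactly two letters,
# e.g. "mail.example.it". Names ending in .com or .net do not match.
HOST_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+([a-z]{2})\b(?!\.)", re.IGNORECASE)


def count_country_suffixes(lines):
    """Count two-letter domain suffixes seen in selected header lines.

    Every occurrence is counted, so a suffix that appears twice on one
    line contributes two to its total.
    """
    counts = Counter()
    for line in lines:
        if line.startswith(HEADER_PREFIXES):
            for suffix in HOST_RE.findall(line):
                counts["." + suffix.lower()] += 1
    return counts
```

Feeding it the lines of a spam folder and calling `most_common()` on the result gives a ranking in decreasing order of count, which is the shape of the list above.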

There are the data. Make of them what you will. I cannot see much rhyme or reason in any of it — except for one extraordinary fact that John Bell pointed out to me (oddly I had missed it): the three original Axis Powers of World War II are right at the top of the list! Could spam be a continuation of World War II by other means?

Apart from that, not much that is significant. Countries using Indo-European languages are in the majority, but that's not much of a surprise considering the character of the Internet. Putative friends and allies like the U.K. appear to rank as more evil than known enemies like France. (Just kidding! A hearty welcome to all our French readers!) Tiny Latvia and Lithuania are in on the act along with huge countries like Brazil and Australia. Poor countries like Guatemala and rich countries like Japan are cheek by jowl.

I have had at least some spam from other countries that do not show up in the sample surveyed above: Chile (.cl), Denmark (.dk), Finland (.fi), India (.in), Ireland (.ie), Malaysia (.my), Micronesia (.fm), and Switzerland (.ch).

Switzerland! "Five hundred years of peace and democracy," says Harry Lime (played by Orson Welles), contemptuously, in The Third Man; "and what did that produce? The cuckoo clock." Well, that, and good chocolate, and the Swiss Guard at the Vatican, and also spam advertising cheap software, it seems.

Absent from the above list are countries the USA has recently made war on (Afghanistan, .af, and Iraq, .iq), but also the two countries belonging to the original Axis of Evil that have not yet been attacked (Iran, .ir, and the Democratic People's Republic of Korea, .kp). The two countries that are usually at the top of Transparency International's list for freedom from corruption (Denmark, .dk, and Finland, .fi) are sending spam, and the two countries most usually found right at the other end, the most corrupt in the world (Cameroon, .cm, and Bangladesh, .bd), are not.

A rather weird thing is the absence of Nigeria (.ng), considering that every day I am contacted several times by Nigerian widows, businessmen, and ex-aides to fallen government ministers who want my help in shifting hundreds of millions of illicit dollars into the country. But they always seem to get hold of .com or .net addresses.

Of course, right at the top of the list would be the most evil country in cyberspace: the USA, which sends most of the spam in the world, as Noam Chomsky would certainly want me to point out. But I cannot work out how many American spams I get, because .us is hardly used so far, and names in the .com and .net domains and the like can be purchased or used by non-Americans too. What I can tell you is that if yahoo.com and hotmail.com were countries, they would be the world's most evil — Yahoo! (which has several foreign subsidiaries) being the most evil, generating about 10% of all the spam I get, and Hotmail second, with about 7%.

Posted by Geoffrey K. Pullum at October 27, 2004 04:43 PM