Combating Image Spam

Spammers are ingenious little devils. One kind of spam that has always been nearly impossible to filter out is image spam. Quite simply, image spam is where the spammer puts the spam message inside an image. Since most spam analyzers only look at text, not images, the spam lands in your mailbox. To combat this type of spam, you need an e-mail service that will analyze images for spam. That service should also let you dictate where that e-mail goes based on the results it finds. Tuffmail fits the bill on both counts. Let’s take a look at an e-mail that I get daily at one of my e-mail addresses:

Do you notice the text at the bottom of the e-mail? If I take parts of this text and throw them into Google, I get some interesting results. “emma davenport, a quiet, bright-eyed girl” comes from Chapter 11 of the book “An Old-Fashioned Girl”. “bess vanished from the room, seeming to take all the light with” comes from Chapter 2 of Parnassus. The spammer is adding legitimate text at the bottom of his e-mail to make it look less “spammy” to Bayesian filters. Bayesian filters look at the whole e-mail and give it a score based on how strongly the words it contains are associated with spam. By adding more legitimate, non-spam-like text, the spammer gets the message a lower spam score. Clever, very clever.
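To make the padding trick concrete, here is a minimal Python sketch of how a Bayesian-style score reacts to extra "innocent" words. It is not how Tuffmail or any particular filter is implemented: the word probabilities are made up for illustration, the names WORD_SPAM_PROB, UNKNOWN_WORD_PROB and spam_score are hypothetical, and the combining formula is the simple naive Bayes one.

```python
import math

# Hypothetical per-word spam probabilities, invented for this example.
# A real filter learns these from mail you have marked as spam or not spam.
WORD_SPAM_PROB = {
    "ambien": 0.99,
    "price": 0.80,
    "click": 0.85,
    "quiet": 0.20,
    "girl": 0.30,
    "vanished": 0.15,
    "room": 0.25,
    "light": 0.25,
}
UNKNOWN_WORD_PROB = 0.5  # words never seen before count as neutral


def spam_score(text):
    """Combine per-word probabilities into one score between 0 and 1."""
    log_spam = 0.0
    log_ham = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word.strip('.,"'), UNKNOWN_WORD_PROB)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # Same as prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


bare = "click ambien price"
padded = bare + " quiet girl vanished room light"

print(round(spam_score(bare), 3))
print(round(spam_score(padded), 3))  # lower than the score for the bare text
```

With these made-up numbers, the padded message scores lower than the bare one, which is exactly the effect the spammer is after.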

Looking at the e-mail header at webmail.tuffmail.net, I found this under the X-Spam-Report line:
X-Spam-Report: Content analysis details:
  0.0 BAYESSCORE 0.502318
  0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
  1.1 EXTRA_MPART_TYPE Header has extraneous Content-type:…type= entry
  0.1 FORGED_RCVD_HELO Received: contains a forged HELO
  4.0 RCVD_HELO_IP_MISMATCH Received: HELO and IP do not match, but should
  1.5 RCVD_NUMERIC_HELO Received: contains an IP address used for HELO
  4.1 FUZZY_OCR BODY: Message contains an image with common spam text
      [price with fuzz 0: score 0.50]
      [browser with fuzz 0: score 0.50]
      [ambien with fuzz 16: score 0.10]
      [click with fuzz 0: score 0.50]
      [type with fuzz 0: score 0.50]
      [click with fuzz 0: score 0.50]
      [browser with fuzz 0: score 0.50]
  0.0 HTML_MESSAGE BODY: HTML included in message
      [score: 0.5000]
  0.0 TM_IMG_ATTACH FULL: Email has a inline image
  0.8 SARE_GIF_ATTACH FULL: Email has a inline gif
  2.0 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
      [85.16.21.168 listed in dnsbl.sorbs.net]
  1.9 RCVD_IN_NJABL_DUL RBL: NJABL: dialup sender did non-local SMTP
      [85.16.21.168 listed in combined.njabl.org]
  1.1 MY_CID_ARIAL_STYLE SARE cid arial2 style
  0.7 MY_CID_AND_STYLE SARE cid and style
  0.7 MY_CID_AND_ARIAL2 SARE CID and Arial2

Most of that is probably Greek to you, so focus on the important part we can filter on: the FUZZY_OCR entry. The analyzer engine found an image with common spam text, including ambien and price. Therefore, we can make a rule like this in Tuffmail’s IMP4 webmail interface:
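Outside of the webmail interface, the same idea boils down to checking whether the X-Spam-Report header mentions FUZZY_OCR. The Python sketch below is only an illustration of that check, not Tuffmail's or IMP4's actual rule format. The function names triggered_rules and choose_folder and the folder names "Spam" and "INBOX" are my own assumptions.

```python
import re
from email import message_from_string

# Matches a "score RULE_NAME" pair such as "4.1 FUZZY_OCR".
RULE_PATTERN = re.compile(r"(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]+)\b")


def triggered_rules(raw_message):
    """Return {rule name: score} parsed from the X-Spam-Report header."""
    msg = message_from_string(raw_message)
    report = msg.get("X-Spam-Report", "")
    return {name: float(score) for score, name in RULE_PATTERN.findall(report)}


def choose_folder(raw_message, rule="FUZZY_OCR", spam_folder="Spam"):
    """Pick a destination folder based on whether the given rule fired."""
    if rule in triggered_rules(raw_message):
        return spam_folder
    return "INBOX"


example = (
    "Subject: test\n"
    "X-Spam-Report: 4.1 FUZZY_OCR BODY: Message contains an image with "
    "common spam text 0.8 SARE_GIF_ATTACH FULL: Email has a inline gif\n"
    "\n"
    "body\n"
)
print(choose_folder(example))
```

Matching on the rule name rather than on the total score keeps the filter narrow: only mail where the image analyzer actually found spam text gets moved.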

Voilà! Goodbye, image spam!

– Soli Deo Gloria
