Internet.com ISP-Planet

Image Spam

An anti-spam company's founder explains this increasingly troublesome scourge of e-mail.

by David Skoll
Roaring Penguin Software, Inc. President and Founder
[May 4, 2007]

What is it?
Image spam is spam e-mail that carries its sales pitch as an image, such as a JPEG or GIF. The message may contain nothing else, or it may include nonsensical text, unrelated text such as jokes or news reports, or simply gibberish.

Why image spam?
As content-filtering anti-spam software became more sophisticated and accurate, spammers found it more difficult to pitch their wares using normal text or HTML messages. As a result, they turned to encoding their sales pitch as an image. This completely bypasses most anti-spam content filters, because they analyze text and cannot read the words inside an image.

How can we combat image spam?
It turns out that image spam can be detected quite accurately using the same techniques that fight other spam:

The gibberish or nonsense text included with image spam very quickly becomes "red-flag" text for a Bayesian filter. A distributed Bayesian database such as Roaring Penguin's Training Network adapts extremely quickly to most image spam.

An image with little or no accompanying text is also a red flag, because almost all legitimate mail that contains images also includes a reasonable amount of body text.

Normal connection-level techniques such as greylisting and DNS-based RBLs continue to be effective against image spam.
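
The second point above, an image with little or no accompanying text, is simple enough to sketch in code. What follows is only an illustration using Python's standard `email` package, not how any particular product implements the check. The function names and the 20-word threshold are assumptions made up for this example; a real filter would treat the result as one signal among many and not as a verdict.

```python
import re
from email import message_from_bytes, policy

# Hypothetical cut-off: fewer body words than this alongside an image is suspicious.
MIN_WORDS = 20


def image_text_profile(raw: bytes) -> tuple[int, int]:
    """Return (number of image parts, approximate body word count)."""
    msg = message_from_bytes(raw, policy=policy.default)
    images = 0
    plain_words = 0
    html_words = 0
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_maintype() == "image":
            images += 1
        elif part.get_content_type() == "text/plain":
            plain_words += len(part.get_content().split())
        elif part.get_content_type() == "text/html":
            # Crude tag stripping; good enough for a rough word count.
            text = re.sub(r"<[^>]+>", " ", part.get_content())
            html_words += len(text.split())
    # Plain and HTML parts usually duplicate each other, so take the larger count.
    return images, max(plain_words, html_words)


def looks_like_image_spam(raw: bytes, min_words: int = MIN_WORDS) -> bool:
    """Flag messages that carry at least one image but very little body text."""
    images, words = image_text_profile(raw)
    return images > 0 and words < min_words
```

Because spammers often pad image spam with gibberish to defeat exactly this kind of word count, the check works best when paired with the Bayesian scoring described in the first point.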

What about OCR?
Some anti-spam vendors have resorted to using Optical Character Recognition (OCR) tools to extract the text from an image spam for analysis. Unfortunately, OCR has met with limited success. The state of the art in OCR is not very advanced. Furthermore, OCR tools are not designed to extract text from an image that is actively being manipulated by an adversary. Spammers have reacted to OCR tools by obfuscating the text in the images they send. The obfuscated text is still relatively easy for humans to recognize, but very difficult for OCR tools to extract.

In addition to the accuracy problem, OCR is very compute-intensive and can greatly slow down a content filter.


 


 
