NAME

How To Use - SpamCannibal

PREFACE

Today's email systems are called upon to examine and classify incoming mail in ways they were never designed to. DNSBL servers and sophisticated filters help immensely in this task by quickly identifying viruses, spam and spam sources, but there is no good way to stop this traffic from consuming bandwidth. The tactic from yesteryear of bouncing messages back to the envelope sender only makes the matter worse, as ALL spam and virus mail comes with bogus headers. This practice triples or quadruples the bandwidth consumed by the spam: first the inbound transit, second the bounce to the innocent envelope domain owner, third the return bounce from the mailer daemon for the unknown envelope user, and a potential fourth refusal from those sites equipped with a double bounce filter. Every time a piece of spam is received, even from a known source, this process is repeated, and there is no burden placed on the sender or incentive for them to stop.

Before discussing how SpamCannibal addresses this problem, let us consider the path that a message takes as it enters a well designed mail system.

1. connection

A connection is made to the host TCP/IP port 25 and is handed off to the Mail Transfer Agent or front-end-filter by the operating system.

2. access control

The MTA examines the source of the message and checks it against remote DNSBL's and its own access list to see if the source is in its reject list. If rejected, the message is usually returned to the envelope sender with an error code.

3. content filtering

The message is filtered for spam content and either marked for special delivery disposition or bounced to the envelope sender as in step 2.

While these steps do a reasonable job of reducing the unwanted mail delivered to the end user, they do nothing to reduce or eliminate the bandwidth consumed by the ever-increasing load of spam and virus mail, nor do they impose any penalty on the feckless sender.

SpamCannibal, the missing piece

SpamCannibal provides the missing element in email system design. It provides the piece needed to reduce and eliminate unwanted spam traffic. SpamCannibal does this in a surprisingly simple way, in a multi-step process. Since we will reference the three steps that the MTA takes to receive mail, the SpamCannibal steps will be labeled a), b), c), etc.

With a SpamCannibal enhanced mail system, an incoming connection to TCP/IP port 25 goes through these steps.

a. access control

The incoming host IP address is checked against a local database of banned hosts. If the IP address is acceptable OR UNKNOWN, it is logged into the archive database and the MTA is connected for step 1) of its process.

Let's assume for the sake of discussion that the UNKNOWN host delivered a spam load for which the MTA will complete steps 2) and 3) and provide some subsequent disposition.

b. c. d. skipped for normal messages

This connection is passed through to the MTA.

e. automated spam source identification

Some few minutes later, a cron script checks all of the collected archive IP addresses against the same DNSBL list used by the MTA. Addresses for which "A" records are returned from the DNSBL's are added to the database of banned hosts to be tarpitted. If you wish to be polite and impose a minimum cost on the spam sender, SpamCannibal can be configured to simply ignore the incoming connection request as if port 25 had no service.
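The first pass (steps a and e) can be sketched in Python as follows. This is an illustrative model only, not SpamCannibal's actual code: the in-memory sets stand in for the tarpit and archive databases, the names `admit`, `dnsbl_query_name`, `listed_in_dns` and `promote_listed` are hypothetical, and `dnsbl.example` is a placeholder zone. The lookup convention used (reverse the IPv4 octets, append the zone, treat a returned "A" record as a listing) is the usual DNSBL one.

```python
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + zone


def listed_in_dns(ip, zone):
    """Return True if the zone answers with an "A" record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (or a lookup failure): treat the address as not listed.
        return False


def admit(ip, tarpit, archive):
    """Step a: banned hosts go to the tarpit; all others are logged and passed on."""
    if ip in tarpit:
        return "tarpit"
    archive.add(ip)
    return "mta"


def promote_listed(archive, tarpit, zones, is_listed=listed_in_dns):
    """Step e: the periodic job adds DNSBL-listed archive addresses to the tarpit."""
    for ip in sorted(archive):
        if any(is_listed(ip, zone) for zone in zones):
            tarpit.add(ip)


if __name__ == "__main__":
    tarpit, archive = set(), set()
    print(admit("192.0.2.1", tarpit, archive))  # unknown host reaches the MTA
    # Stub resolver (an assumption for the demo): pretend the address is listed.
    promote_listed(archive, tarpit, ["dnsbl.example"],
                   is_listed=lambda ip, zone: True)
    print(admit("192.0.2.1", tarpit, archive))  # now handled by the tarpit
```

Passing the resolver in as `is_listed` keeps the sketch testable without network access; a real deployment would query the same DNSBL's that the MTA uses.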

The spam source has now been identified. Let us repeat the steps for SpamCannibal.

a. (again) access control / tarpit action

The incoming host IP address is checked against a local database of banned hosts. The IP address is found to match an entry in the database.

b. tarpit response

SpamCannibal ACKnowledges the connecting host's SYN packet with a small window size, then drops the packet.

c. tarpit acknowledgement

The connecting host responds with its own ACK and may attempt to send data using the reduced window size or simply ask for a larger window. Either way it will take some time before the connection is terminated.

d. persistent tarpit complete

The connecting host sends data. SpamCannibal ACKnowledges the data receipt and further reduces the transmission window size. The remote host now will hang on indefinitely trying to send the balance of its payload.

e. never reached

The local host never sees the banned connection. What little traffic remains is handled entirely by the tarpit daemon.

All of the steps that the SpamCannibal tarpit takes are stateless. There is no forked child, suspended job, or memory storage. Each incoming connection is treated anew based only on the information in the inbound packet. What SpamCannibal accomplishes is a threefold reduction in the traffic caused by spam and virus payloads because they NEVER LEAVE THE TRANSMITTING HOST. This multiplies itself in a reduction in the resources consumed on the local mail host, since it does not have to process the payload through the MTA, interrogate DNSBL's, run filters or waste human time emptying overfilled email boxes.
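The stateless behaviour of steps b through d can be illustrated with a pure function that chooses a reply from the fields of a single inbound packet. This is a conceptual Python sketch, not the dbtarpit implementation: the `Packet` and `Reply` types, the function name `tarpit_reply`, the two window sizes, and the choice to stay silent on a bare ACK are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

SMALL_WINDOW = 5   # assumed window (bytes) advertised in reply to a SYN
CLOSED_WINDOW = 0  # assumed window once the sender has pushed data


@dataclass(frozen=True)
class Packet:
    syn: bool = False
    ack: bool = False
    seq: int = 0
    payload_len: int = 0


@dataclass(frozen=True)
class Reply:
    flags: str
    ack_no: int
    window: int


def tarpit_reply(pkt: Packet) -> Optional[Reply]:
    """Choose a reply using only the inbound packet; nothing is remembered."""
    if pkt.syn and not pkt.ack:
        # Step b: accept the connection but advertise a tiny window.
        return Reply("SYN,ACK", (pkt.seq + 1) % 2**32, SMALL_WINDOW)
    if pkt.payload_len > 0:
        # Step d: acknowledge the data and shrink the window further.
        return Reply("ACK", (pkt.seq + pkt.payload_len) % 2**32, CLOSED_WINDOW)
    # Step c: a bare ACK gets no answer in this sketch (an assumption);
    # the sender is left waiting on its own timers.
    return None


if __name__ == "__main__":
    print(tarpit_reply(Packet(syn=True, seq=100)))
    print(tarpit_reply(Packet(ack=True, seq=101, payload_len=5)))
```

Because the function keeps no per-connection table, the cost of holding one more sender does not grow with the number of hosts already caught.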

The flip side of this is not so pleasant. The sending mail host has a loaded task waiting for a response from its TCP/IP stack. The TCP/IP stack has full buffers that have not been transmitted and the timeout mechanism is reset each time it attempts to send data. Every additional thread caught by SpamCannibal requires another task and additional resources on the TCP/IP stack. This could easily stall the sending process on a host that distributes UBE, UCE or virus mail to a large number of sites where SpamCannibal has been deployed.

USING SPAMCANNIBAL WITH YOUR MAIL SYSTEM

SpamCannibal has four runtime elements and one optional daemon.

1. Front end "dbtarpit" daemon.

This daemon interfaces directly with Linux's "iptables" and receives every packet destined for port 25 before it is passed to the MTA. As far as human operators are concerned, this is the most passive looking of the operations since there is no external interface.

2. The sc_BLcheck script

This script runs periodically to check the accumulated (logged) IP addresses that connected to port 25 against your preferred list of DNSBL's. This should be the same set of DNSBL's that are used by your MTA. IP addresses with returned "A" records are added to the "tarpit" database for subsequent denial of access.

3. Inbox robot

Spam that escapes DNSBL detection can be emailed to SpamCannibal's secure mail robot, sc_mailfilter, to process the headers, extract the originating MTA IP address, and add that address to the tarpit database.

4. Web administration tools

SpamCannibal's secure web administration tools allow the system administrator to manually add spam hosts through a simple cut and paste operation or to manually add or delete hosts from the database.

In addition to these tools, there's also a nifty statistics display that is borrowed from the LaBrea::Tarpit perl module. It provides a realtime snapshot of the current and recent spam host activity on the mail host.

5. Optional multi_dnsbl daemon

multi_dnsbl is a DNS emulator daemon that increases the efficacy of DNSBL look-ups in a mail system. multi_dnsbl may be used as a stand-alone DNSBL or as a plug-in for a standard BIND 9 installation. multi_dnsbl shares a common configuration file format with the Mail::SpamCannibal sc_BLcheck.pl script so that DNSBL's can be maintained in a common configuration file for an entire mail installation.

It is recommended that SpamCannibal installations utilize multi_dnsbl for their MTA's DNSBL lookups, as this minimizes network traffic to the DNSBL's and optimizes the order in which they are interrogated.
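As a rough illustration of the header processing performed by the inbox robot (element 3 above), the following Python sketch pulls a candidate relay address out of a message's Received headers. It is not the sc_mailfilter algorithm: the function name `first_external_relay`, the bracketed-IPv4 pattern, and the rule of skipping addresses inside caller-supplied trusted networks are assumptions chosen for the example, and the sample message uses documentation addresses.

```python
import ipaddress
import re
from email import message_from_string

# Assumed convention: the relay address appears in square brackets.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def first_external_relay(raw_message, trusted_networks=()):
    """Return the first bracketed IPv4 address outside the trusted networks.

    Each hop prepends its Received header, so reading them top-down walks
    from the local mail host back toward the origin.
    """
    trusted = [ipaddress.ip_network(net) for net in trusted_networks]
    msg = message_from_string(raw_message)
    for header in msg.get_all("Received", []):
        for candidate in BRACKETED_IPV4.findall(header):
            try:
                addr = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # not a valid address, e.g. an octet above 255
            if any(addr in net for net in trusted):
                continue  # one of our own relays; keep looking
            return candidate
    return None


SAMPLE = (
    "Received: from mx.example.net ([10.0.0.5]) by mail.example.net\n"
    "Received: from unknown ([198.51.100.7]) by mx.example.net\n"
    "Subject: sample\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(first_external_relay(SAMPLE, ["10.0.0.0/8"]))
```

Header contents below the first trusted hop are supplied by the sender and can be forged, which is one reason the sketch stops at the first address outside the trusted networks rather than reading further down.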

TYPICAL INSTALLATIONS

The SpamCannibal site.

System 1

A standalone system incorporating an MTA, DNS daemon, web server, and SpamCannibal installation. This system runs three SpamCannibal daemons.

1. dbtarpit

Denies access to banned hosts and collects incoming connection IP addresses.

2. dnsbls

Provides blacklist DNS service on an internally accessible port from the SpamCannibal databases. The primary DNS server (bind-9.xx) is slaved to dnsbls to provide external DNSBL service.

3. bdbaccess

Provides privileged access to the SpamCannibal web pages running on the local web server.

In addition, the sc_lbdaemon (LaBrea) data collection daemon runs on the localhost and provides statistics for the local web server pages.

System 2

A standalone system incorporating an MTA and SpamCannibal installation. This system runs two SpamCannibal daemons.

1. dbtarpit

Denies access to banned hosts and collects incoming connection IP addresses in the same manner as System 1.

2. bdbaccess

Provides privileged remote access to the SpamCannibal web pages running on a remote host (actually in another city with a different network provider).

System 2 uses System 1's DNSBL to check its IP archive database. System 2 is the secondary mail host for System 1, but bears roughly twice as much spam traffic based on traffic analysis.

In addition, System 2 runs the sc_lbdaemon and provides remote statistics access for a web process running on another host.