9.03.2008

Green Eggs and Spam

Though I am just now starting a blog, I think that a part of me has wanted to for a long time. Blogs, and the Internet in general, can be fickle things. Who wants to hear you talk about your opinions on things that may or may not matter? It's hard to say whether your opinion on a topic is interesting or just more of the same white noise that already permeates much of the Internet. I suppose the only way to find out is to try.

My overarching idea for this blog is to focus on the new and strange things in technology, as well as random musings about anything in between, though I promise to keep the tangents to a minimum.

As an opening, I figure I should share a bit about what I do, to show I am not just some kook who thinks he knows about computers. In my job I serve as the IT department for a wide range of companies. These companies either cannot afford their own internal IT department or would rather not deal with it themselves, so they come to us. In this role I deal with a wide range of issues, from "I just deleted the Internet, help!" to "the mail server just crashed." Out of all these issues, there is one that seems to eat up more of my time than anything else: spam. Blocking spam, removing clients from spam lists, explaining to users why they get spam ("Yes ma'am, I realize you don't need a larger penis, but...").

For the last few days I have been working on this for a new client. We took over their network and picked up everything that their previous IT manager had put in place, and we are expected to keep it running or make it better. They had a few things set up to deal with spam on their network, the first of which is an application called JEP(S) by Proxmea (don't ask me what JEP(S) stands for). This application does "greylisting," a concept that seems OK in theory but works badly in practice.

Their administration document describes how it works as follows:
The theory of Greylisting
The basics of greylisting works by collecting what is called a triplet made out of the sending mail servers IP address, the senders email address and the recipient email address.

An example triplet could look like this:
62.122.56.27,alice@companyxyz.com,bob@yourdomain.com

This information is saved in a database together with a time stamp of when this combination was first and last seen. Before an email session is accepted the triplet is compared to what is saved in the database and depending on if it’s a new entry or if this triplet has been seen before, it will be blocked or passed.

For example; the first time the above triplet is seen the session will be blocked as it has not been seen before. If the mail is resent immediately (seconds after the first one), the triplet will be compared to what is in the database and then the server will see that it’s only seconds old. This session will then also be blocked.
When the mail is transmitted next time (let’s say 10 minutes after the initial session) then it will once more be compared to the database and now it will be passed.

So in my case, I have just started doing work with this client. I send them an email and JEP(S) grabs my mail server's address, my email address, and the recipient's address. It then blocks all mail that matches these three things for 10 minutes because I might be a spammer. The probable assumption of the company making this application is that someone sending spam would randomly generate a new email address every time they attempt to connect to the mail server. Since there would be a new sending address every time, their message would be blocked every time. I see two problems with this.

  1. For legitimate messages, everything has to wait out this 10-minute limit before it can go through. Granted, sending mail servers should keep retrying for at least 3 days, but depending on how the JEP(S) application responds, the sending server might think it was denied outright and not retry (unlikely, but still a possibility). Best case, you have to wait 10 minutes for your first message to go through. Let's hope it isn't a time-sensitive message!
  2. On the other side, you might not even block most spam. Who is to say the sender is going to use a random email address on each connection, or that they won't retry their message in 10 minutes?

The concept seems good on the surface, but when you probe deeper it just looks like a great hassle for your end users, and that appears to be exactly what is going on. Their users call in claiming that mail is being blocked, but after some testing they report that it just came in. Other messages are blocked and the sender gets a message back stating that their message was blocked for an unknown reason, while at the same time the users are complaining about getting spam. Though the end users are getting most of their email, it takes longer to arrive and the delay does little to stop the spam, so what is the point?
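
To make the triplet logic from the administration document quoted above a bit more concrete, here is a minimal Python sketch of the check as I read it. This is not how JEP(S) is actually implemented; the class name, the `check` method, and the 600-second delay (taken from the document's "let's say 10 minutes") are my own assumptions for illustration. Also note that real greylisting setups typically answer a "blocked" session with a temporary SMTP failure, which is what prompts a well-behaved sending server to retry later.

```python
from dataclasses import dataclass, field

# (sending server IP, sender address, recipient address)
Triplet = tuple[str, str, str]


@dataclass
class Greylist:
    """Hypothetical greylist: a triplet passes only after min_delay seconds."""

    min_delay: float = 600.0  # assumed 10 minutes, per the document's example
    # triplet -> (first seen, last seen), both as timestamps in seconds
    seen: dict[Triplet, tuple[float, float]] = field(default_factory=dict)

    def check(self, ip: str, sender: str, recipient: str, now: float) -> bool:
        """Return True if the session should be passed, False if blocked."""
        triplet = (ip, sender.lower(), recipient.lower())
        if triplet not in self.seen:
            # Brand new triplet: record it and block this first attempt.
            self.seen[triplet] = (now, now)
            return False
        first, _ = self.seen[triplet]
        self.seen[triplet] = (first, now)  # update the "last seen" time stamp
        # Pass only once enough time has gone by since the first attempt.
        return now - first >= self.min_delay


greylist = Greylist()
msg = ("62.122.56.27", "alice@companyxyz.com", "bob@yourdomain.com")
print(greylist.check(*msg, now=0))    # False: first time seen
print(greylist.check(*msg, now=5))    # False: resent seconds later
print(greylist.check(*msg, now=600))  # True: retried 10 minutes later
```

Seen this way, both of my complaints are easy to spot: every new sender/recipient pair eats the full delay once, and anything that simply retries with the same triplet after the delay gets through, spam or not.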

Spam is a complicated foe. If you are too strict with it you run the risk of blocking legit email, but if you are too lenient you let too much spam through. Of the various methods of blocking I have used in the past, two have stood out: Microsoft Exchange Hosted Services (Frontbridge) and the Barracuda Networks Spam Firewall. Both are very good at what they do, and they suit different groups. Frontbridge lets you be almost completely hands off. They handle all the work, and about once a day (depending on the amount of spam you get) you receive a message listing any blocked messages, allowing you to deliver the legit ones and whitelist their senders. My personal pick is the Barracuda, as it gives the network administrator more control over how email is sorted. You can choose to just mark messages as spam but still deliver them and let the user decide how to handle them (by setting up filtering rules in Outlook or another mail client), or quarantine them just like Frontbridge, or do both. This is in addition to blocking messages from senders on blacklists, messages sent in bulk, and messages matched on key words and content. The system will even learn as it goes and can be trained to be more effective.

I could probably go on about these devices and what makes them good, but that's really a different topic, and this post is quite long enough. Given my familiarity with them and how often I deal with these issues, don't be surprised if you see this topic come up again.
