Office of Information Technology
Network Manager's Technote Spam / Virus Filter Information
This TechNote discusses the College's new Anti Spam / Anti Virus mail filter appliance.
What it is and What it Does:
The College recently purchased a mail filtering appliance by Barracuda Networks which has been deployed at other LVAIC institutions and various other Colleges and Universities. The term "appliance" refers to the fact that this system is provided as a package of special hardware and software largely managed by Barracuda Networks. There is no software to update or maintain as with other solutions and the system largely does its job without requiring a lot of maintenance.
Mail entering GroupWise and Hal is first filtered and scored by the appliance. The appliance looks for "unsafe attachments" which are used to harbor viruses, it finds virus-infected messages, and it scores the messages for spam and phishing attempts.
The appliance looks for viruses. If the appliance determines the message contains a virus, the message is logged and discarded before it reaches your mailbox. This prevents viruses from entering your mailbox, making campus mail safer. The appliance uses at least two different commercial anti-virus products, like Sophos and McAfee, to scan your mail for viruses. This increases the likelihood that relatively new virus threats are filtered.
The appliance looks for "unsafe attachments." An "unsafe attachment" is an attachment which is commonly used to carry viruses but is very unlikely to be used for legitimate documents. Unsafe attachments include .exe, .pif, .lnk, and so on. Messages with unsafe attachments are logged and discarded before they reach your mailbox. We use the list of "unsafe attachments" maintained by Microsoft and feel that following the lead of Microsoft in this matter is prudent.
"Phishing" (pronounced "fishing") is Internet-speak for attempts to steal your identity, credentials, or other personal information by using "social engineering" and other means. Phishing attempts include fake notices from banks, eBay, and other services trying to fool you into logging in to a fake server with your password and user ID. Once you do this, you have provided the phisher with your real password and login ID, and perhaps other information like your social security number. This information is then used for fraud or sold to others. So at its heart, phishing is about identity theft.
The appliance detects phishing by examining the mail for its intent. Is it trying to get you to log in to known bogus web sites? For obvious phishing attempts, the appliance logs and discards the message before it reaches your mailbox.
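To make the unsafe-attachment check described earlier more concrete, here is a minimal sketch in Python of filtering by file extension. It is only an illustration, not the appliance's actual logic: the function name is made up for this example, and the list holds just the three extensions named in this TechNote, whereas the real list is longer.

```python
import os

# Assumption: only the three extensions named in this TechNote.
# The list actually used by the appliance is longer.
UNSAFE_EXTENSIONS = {".exe", ".pif", ".lnk"}


def is_unsafe_attachment(filename):
    """Return True if the file name ends in an extension on the unsafe list."""
    # splitext returns the final extension, so "photo.jpg.pif" yields ".pif".
    _, ext = os.path.splitext(filename.strip())
    return ext.lower() in UNSAFE_EXTENSIONS
```

Note that the check looks at the last extension only, so a file disguised as "photo.jpg.pif" is still caught, while an ordinary "report.doc" passes.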
Spam In General:
The appliance uses a spam scoring system to examine the contents of a message. In addition, the appliance checks the origin of the mail message to see if it comes from known spam sources or the ISPs which harbor them. Obviously there are cases where one person's spam is another person's legitimate mail. Since the appliance does not contain a human brain, it's not able to really read each message and "know" for a fact that it is or is not spam. It uses a very sophisticated scoring system which takes into account hundreds of different litmus tests. It also uses Bayesian analysis to do a better job of "understanding" what the content of the message is relative to our own norms.
But no system will make everyone 100% happy. The nature of the scoring system is such that the appliance sorts mail into three categories: "probably not spam," "probably spam," and "yikes! that's absolutely spam all right!" For cases where the appliance is "conflicted" it will optionally tag the message, just like the old JSpamFilter system. For instances where the message is very obviously spam, the appliance will log and discard the message before it reaches your mailbox.
Obvious Spam Is Discarded Unilaterally:
The threshold for unilaterally discarding a spam message is extremely high. The appliance nominally scores spam from 0 to 10, and the current discard threshold is set to 9.5. This value is higher than the value of 7.0 recommended by the vendor. However, it is a "safer" value which reduces the risk of discarding legitimate mail (false positives). Some of the most horribly offensive spam scores higher than 10; we have seen messages with scores higher than a whopping 26!
So you may ask, exactly what sort of messages are being discarded? Well, after examining log entries for nearly 2700 of these discarded messages (really an unpleasant job, believe me), we found that not a single one was legitimate mail, not even close. In most cases these were some of the most horrible spam messages imaginable. So we are totally confident that legitimate mail from any source will make it to your mailbox. However, this leaves the matter of tagging.
The appliance currently tags suspected spam with "[JSpamFilter] [*]" for backwards compatibility with your existing spam rules. The appliance tags any messages with a score of 3.5 or greater. This is the recommended value provided by the vendor, and also is the value used by several other Colleges and Universities we have been in touch with. It strikes a balance between false positives and false negatives. Mail scoring below a 3.5 will not be tagged.
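The two thresholds above can be sketched as a small Python function. This is an illustration of the policy as described in this TechNote, not the vendor's code. The function name is hypothetical, and two details are assumptions: that the tag is placed at the front of the subject line, and that a score of exactly 9.5 is discarded.

```python
TAG_THRESHOLD = 3.5      # scores of 3.5 or greater are tagged
DISCARD_THRESHOLD = 9.5  # assumption: 9.5 or greater is logged and discarded
TAG_PREFIX = "[JSpamFilter] [*]"


def handle_message(score, subject):
    """Return (action, subject) for a message with the given spam score."""
    if score >= DISCARD_THRESHOLD:
        return ("discard", subject)
    if score >= TAG_THRESHOLD:
        # Assumption: the tag is prepended to the subject line.
        return ("tag", f"{TAG_PREFIX} {subject}")
    return ("deliver", subject)
```

For example, a message scoring 2.0 is delivered untouched, one scoring 5.0 is delivered with the tag in front of its subject, and one scoring 26 is discarded.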
Spam Tagging Accuracy:
I am sure that you have already received messages which were spam and not tagged, as well as legitimate mail which was tagged. This is the reality of spam identification. There will be false positives and false negatives. However, none of this mail is being dropped; you receive it all. So it is not, as it were, the end of the world if a message is inappropriately tagged.
We have also been in touch with several people who have historically received huge mountains of spam, so much so that it is crippling at times. We are talking more than 100 spam messages a day. In each of these cases these individuals have reported tagging accuracy near 100%. For example, one person went from an average of 70 untagged spam messages a day to fewer than 5!
I am sure that seeing that "[JSpamFilter]" tag next to legitimate mail is irksome. People who receive a lot of spam may use rules to automatically file these messages in a folder for later inspection, which is the best way to handle larger volumes of spam. There is the risk that a small number of those messages might be important. However, nobody is forcing you to automatically delete tagged mail. There is no excuse to later complain that "I never received that mail because your spam thing tagged it." It is up to the individual to decide how serious / debilitating their spam mail is and take appropriate action.
In cases where you receive a small number of spam messages a day, the least risky way to deal with the problem is to use the tags as a hint rather than a litmus test. You can easily use the tags to quickly remove these messages from your mailbox in perhaps 30 seconds each morning. This way you will never miss a message even if it is tagged inappropriately. It's the safest and recommended way to deal with the problem.
Some people receive huge quantities of spam and use rules to automatically file or delete tagged messages. If you want to do this, we cannot stop you. Since no anti-spam system is perfect, if you simply automatically delete tagged mail you run the risk of deleting legitimate mail, even mail from the campus community. The risk of this is very small; however, you take on that risk by automatically deleting the messages. A safer route would be to have rules file the messages in a spam folder which you periodically peruse and empty out.
Tuning Rule Based Handling of Tagged Mail:
As mentioned above, no system will be perfect. And you are likely to have those pesky listservs, daily news blasts, and other content you want to receive. You can easily tune your spam filtering / filing rule to make exceptions for these. You can add conditions to the rule to prevent mail from certain sources or with specific subjects from being deleted or filed. It's easy. The support desk can assist you in fine-tuning your GroupWise rules.
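GroupWise rules are set up in the mail client rather than written as code, but the logic of a filing rule with exceptions can be sketched in Python as follows. Everything named here is hypothetical and for illustration only: the sender address, the subject keyword, and the folder names stand in for whatever you choose in your own rule.

```python
TAG = "[JSpamFilter]"

# Hypothetical exceptions; substitute your own lists and newsletters.
TRUSTED_SENDERS = {"news-digest@lists.example.org"}
TRUSTED_SUBJECT_WORDS = ("daily news",)


def choose_folder(sender, subject):
    """Decide where a message goes: file tagged mail unless an exception applies."""
    if TAG not in subject:
        return "Mailbox"            # untagged mail is never touched
    if sender.lower() in TRUSTED_SENDERS:
        return "Mailbox"            # exception by source
    lowered = subject.lower()
    if any(word in lowered for word in TRUSTED_SUBJECT_WORDS):
        return "Mailbox"            # exception by subject
    return "Spam"                   # file for later inspection, do not delete
```

The sketch files tagged mail in a folder instead of deleting it, which matches the safer route recommended above.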
Tagging - Win, Lose, Draw:
So the usefulness of the tagging scheme is mostly in the eye of the beholder. For many people the tags are extremely useful and accurate. Others may need to ignore the tags, or create loopholes in their GroupWise rules to ignore tags on messages they don't consider spam.
Tagging of Mail Sent By Muhlenberg Community Members:
Because we are still in the middle of migrating people off our old mail servers, and for a number of other technical reasons, we do not have a means to tell the appliance "hey don't tag stuff sent by Muhlenberg people." Unfortunately about 20-30% of spam is sent with forged addresses indicating the sender is @muhlenberg.edu. So we cannot simply allow all mail sent from an @muhlenberg.edu address. Today it is technically possible that something from "firstname.lastname@example.org" could be tagged as spam! The following section addresses this legitimate issue.
The above message clearly has too many recipients, and was sent TO: Everyone Group rather than BC: Everyone Group. Every copy of the message includes a complete list of the hundreds of recipients. This makes each message very large. In addition, this message was also sent off campus to an AOL recipient. This fully discloses our entire address book to these 3rd parties. That's bad.
How to Avoid the Wrath of the Appliance:
Eventually we will be in a position to know precisely what is mail originating from Muhlenberg constituencies. Until then please use the following rules when sending mass / group mailings. These are basic rules to follow anyway.
- Address group / everyone mail using the BC: / BCC: field. This means "Blind Copy" which prevents the recipient list from being disclosed. If you address these messages using the TO: or CC: fields, the spam filter will see you are sending to many, many recipients and will increment the score to the point that it might be tagged. Always use BC: / BCC: to address Everyone / Faculty / Staff or other large groups. This is a good thing to do anyway.
- When sending mail use lucid subject lines; don't be cute or terse. Having a subject like "Important" is exactly the sort of thing spammers do. It is better to be specific: "Important information for faculty advisees." "Please see me" / "Look at this" is another favorite of spammers. Be specific, e.g. "Please see me regarding spam filter technote." Many people see vague subjects and don't bother to open your mail, assuming it is spam even if untagged.
- Avoid sending spam. In some instances the spam filter will identify your mail as spam if it looks like spam.
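As a rough illustration of the first two rules, the following Python sketch checks a draft message before it is sent. It is not how the appliance scores mail. The function name, the limit of 10 visible recipients, and the short list of vague subjects are all assumptions made up for this example.

```python
from email.message import EmailMessage
from email.utils import getaddresses

MAX_VISIBLE_RECIPIENTS = 10  # hypothetical limit, not a value from the appliance
VAGUE_SUBJECTS = {"important", "please see me", "look at this"}


def check_outgoing(msg):
    """Return a list of warnings for a draft EmailMessage."""
    warnings = []
    # Only TO: and CC: recipients are visible to everyone; BCC: is not counted.
    visible = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    if len(visible) > MAX_VISIBLE_RECIPIENTS:
        warnings.append("Too many TO:/CC: recipients; use BCC: for group mail.")
    subject = (msg.get("Subject") or "").strip().lower()
    if not subject or subject in VAGUE_SUBJECTS:
        warnings.append("Subject is vague; say what the message is about.")
    return warnings
```

A draft addressed TO: a dozen people with the subject "Important" gets both warnings, while one addressed by BCC: with a specific subject gets none.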