Setting up Evolution to filter spam away
What is SpamAssassin?
SpamAssassin is a piece of software intended for mail servers. It reads each e-mail and scans it for telltale words such as the ever-present Viagra. It also looks for other signs typical of spam, such as addresses containing numbers or heavy use of uppercase words. Each time such an indication is found, the letter is assigned a number of points. When a specified number of points has been reached, the letter is marked as spam, which allows easy filtering at the client.
SpamAssassin is intended for the mail server, but unfortunately not many ISPs bother to install it even though it is free (both as in "free beer" and as in "free speech"). However, it is also possible to install SpamAssassin on the client. You can either set it up with some evil procmail solution, or you can just set up your favorite e-mail client to pipe the contents of each letter through SpamAssassin and have it filter on SpamAssassin's reply.
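To make the points idea concrete, here is a toy sketch in shell and awk. It is not SpamAssassin's real rule set: the three rules, their point values and the function name score_message are all made up for illustration, and the default threshold of 5 is likewise just an assumption for this example.

```shell
#!/bin/sh
# score_message [THRESHOLD]
# Reads one e-mail on stdin, adds up points for a few HYPOTHETICAL rules
# and prints "spam N" or "ham N", where N is the total score.
score_message() {
    awk -v threshold="${1:-5}" '
        BEGIN { score = 0 }
        # Hypothetical rule: a telltale word anywhere in the mail (3 points, once)
        tolower($0) ~ /viagra/ && !seen_word { score += 3; seen_word = 1 }
        # Hypothetical rule: sender address containing a run of digits (2 points)
        /^From:.*[0-9][0-9][0-9]+.*@/ && !seen_addr { score += 2; seen_addr = 1 }
        # Hypothetical rule: subject line written entirely in uppercase (1 point)
        /^Subject: [^a-z]*[A-Z][^a-z]*$/ && !seen_caps { score += 1; seen_caps = 1 }
        END { printf "%s %d\n", (score >= threshold ? "spam" : "ham"), score }
    '
}
```

Feeding it a saved message with score_message < message.txt prints the verdict and the total. Passing a different number as the first argument moves the threshold.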
SpamAssassin can be downloaded in all flavors at the SpamAssassin Download Page and installed whatever way you prefer.
I installed SpamAssassin with
apt-get install spamassassin
How to run SpamAssassin
SpamAssassin accepts a lot of arguments. To use it as a spam filter for Ximian Evolution, these two are used: -P and -e.
-P tells SpamAssassin to pipe the output instead of trying to
-e tells SpamAssassin to return an exit code when it finishes. This exit code then indicates whether the piped letter is spam or not.
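Before wiring this into a mail client, it can help to try the exit code by hand. The sketch below assumes SpamAssassin lives at /usr/bin/spamassassin and accepts the -P and -e arguments described above (check spamassassin --help if your version differs). The function name classify is my own invention, and it accepts an alternative command so that it can be tried without SpamAssassin installed.

```shell
#!/bin/sh
# classify FILE [COMMAND...]
# Pipes FILE into the classifier command and prints "ham" if the command
# exits with 0 and "spam" for any other exit code.
classify() {
    msg=$1
    shift
    # Fall back to the command used in this article if none was given.
    [ $# -eq 0 ] && set -- /usr/bin/spamassassin -P -e
    if "$@" < "$msg" > /dev/null; then
        echo ham
    else
        # Note: a missing or broken command also lands here,
        # because any non-zero exit code counts as spam.
        echo spam
    fi
}
```

Save a letter to a file and run classify message.txt to see what SpamAssassin thinks of it.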
Setting up Ximian Evolution
A very strong feature in Ximian Evolution is the handling of filters. Here is how I set up Ximian Evolution to use SpamAssassin as a spam filter.
Select "Filters..." from the "Tools" menu in the tool bar. Then choose "Add."
Write a decent name for the filter. Then beneath the "If" choose Add and select "Pipe Message to Shell Command." In the field to the right type in the command /usr/bin/spamassassin -P -e. Then choose "does not return" and select 0. Beneath "Then" set up what is going to happen to the letter classified as spam. In my setup I first move the letter to a folder named Spam and then set the status of the message to Read.
I have chosen to have the letter moved instead of deleted right away because the spam filter does make mistakes. Therefore it is in general a very bad idea to delete letters without looking at them first. However, moving the spam away from the inbox allows me to pay less attention to spam.
Whitelisting: Allowing all mail from a user or a domain
SpamAssassin often marks newsletters as spam although you have specifically signed up for them. To allow newsletters and mail from your friends to pass the spam filter without being tested, all you have to do is whitelist the addresses.
Whitelisting is done in the SpamAssassin configuration file, which probably resides in /etc/mail/spamassassin.
To whitelist an address like email@example.com, append a whitelist_from line for that address to the file.
You can also whitelist a whole domain at the same time.
Additionally, if you want to block a certain domain or user, you can use blacklist_from followed by the address, for example blacklist_from firstname.lastname@example.org.
The above is, however, quite easy for a spammer to exploit by using false From addresses. This happens often enough that blacklisting is close to useless. However, it is my impression that whitelisting as described above works OK. If you want to be really sure that the whitelisting will not get exploited, you can use a stricter whitelist rule which demands that the hostname the e-mail appears to originate from also shows up in the headers of the e-mail.
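As a sketch of what such rules can look like, the helper below appends a few example lines to a configuration file. The directives whitelist_from, blacklist_from and whitelist_from_rcvd are real SpamAssassin settings. My assumption is that whitelist_from_rcvd is the stricter rule meant above, since it takes a second argument naming the domain the relaying host must belong to. The addresses are placeholders and the function name append_rules is made up.

```shell
#!/bin/sh
# append_rules FILE
# Appends example whitelist and blacklist rules to the given config file.
append_rules() {
    conf=${1:?usage: append_rules FILE}
    cat >> "$conf" <<'EOF'
# Always accept mail from this one address
whitelist_from email@example.com
# Always accept mail from anyone at this domain
whitelist_from *@example.com
# Always treat mail from this address as spam
blacklist_from firstname.lastname@example.org
# Stricter whitelist: accept only if the relaying host is in example.com
whitelist_from_rcvd email@example.com example.com
EOF
}
```

Try it on a scratch file first, for instance append_rules ./test.cf, and copy the lines you want into the real file afterwards. On many installations that file is local.cf inside /etc/mail/spamassassin, but that name is an assumption about your setup.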
Pros and Cons
The SpamAssassin filter does catch spam. However, it tends to misclassify friendly newsletters containing many links and lots of HTML tags. Spam with almost no words in the text body also slips past the filter. The SpamAssassin solution also demands some resources to run, and it will generally slow down the fetching of new mail from the server.
If you don't want spam in your inbox, there are also other e-mail clients, such as Mozilla Mail or Mozilla Thunderbird, which have a built-in spam filter that you train to recognize spam. I think it works well and recognizes more spam than SpamAssassin. I have grown to like Ximian Evolution, and that is basically what is holding me back from switching. The fact that I have lots of mail archived in Ximian Evolution also keeps me from switching.
Ximian Evolution could use a built-in spam filter that would run faster than this solution. However, until then the SpamAssassin solution is worth living with.
Tuning SpamAssassin to Recognise More Spam
Former weblog entry now moved here.
Are you stuck with Ximian Evolution 1.4 and unable to wait for a stable release of Evolution 2 with built-in spam protection? Then this is for you...
Evolution 2 is to be released soon. However, in the last couple of days I have received around 25 spam e-mails a day of which many were not recognised as spam by my SpamAssassin + Evolution combination. With the latest w32/mydoom.a virus flying around too it turned out to be too much for me. Action had to be taken.
Four Steps for Training SpamAssassin
What Mozilla Mail, Thunderbird, MS Outlook and others have been doing for some time can also be accomplished with SpamAssassin. Bayesian filtering is the fancy name for this function. Making use of it requires a little fiddling in the console. Here's what to do:
- Put all your unrecognised spam in a folder within Evolution. In this example I'll use the folder UnrecognisedSpam. Remember that some Evolution setups don't delete e-mail right away but just mark it for deletion. In those cases be sure to use expunge! (Thanks to Bengt J. Olsson for pointing that out.)
- Open a console and navigate to your newly created mail folder in your home directory. On my installation it is /home/krath/evolution/local/Inbox/subfolders/UnrecognisedSpam
- Run this command from within the directory. It will categorise every e-mail in the folder as spam.
sa-learn --showdots --mbox --spam mbox
- Be sure to have some e-mail that is not spam. SpamAssassin needs to categorise some so-called "ham" in order to function correctly. This is done by, for instance, navigating to your Inbox folder (see above for location) and running:
sa-learn --showdots --mbox --ham mbox
Thereby you tell SpamAssassin that your Inbox does not contain any spam.
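If you end up retraining regularly, the two sa-learn calls above can be wrapped in a small script. This is only a sketch: the function name train_bayes and the SA_LEARN override variable are my own inventions, and you pass in the paths to your own spam and ham mbox files.

```shell
#!/bin/sh
# train_bayes SPAM_MBOX HAM_MBOX
# Feeds one mbox file to sa-learn as spam and another as ham.
# Set SA_LEARN to use a different binary (hypothetical override, handy for dry runs).
train_bayes() {
    spam_mbox=$1
    ham_mbox=$2
    learn=${SA_LEARN:-sa-learn}
    # Refuse to run if either mbox file is missing.
    for f in "$spam_mbox" "$ham_mbox"; do
        if [ ! -f "$f" ]; then
            echo "missing mbox file: $f" >&2
            return 1
        fi
    done
    # Same flags as in the steps above.
    "$learn" --showdots --mbox --spam "$spam_mbox" || return 1
    "$learn" --showdots --mbox --ham "$ham_mbox"
}
```

Running it with SA_LEARN=echo in front prints the commands it would run instead of executing them. Afterwards, sa-learn --dump magic should show how many spam and ham messages have been learned so far. As far as I know, SpamAssassin only starts using the Bayesian scores once it has seen a fair number of both kinds.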
Hat tip goes to the SpamAssassin Wiki: Bayes In SpamAssassin
If anything goes wrong, turn to the Bayesian FAQ also at the SA Wiki.
Tests performed by SpamAssassin: http://eu.spamassassin.org/tests.html
Apt-get for Red Hat Linux 6.2, 7.2, 7.3, 8.0 and 9.0: apt-get from FreshRPMS
Ximian Evolution: http://www.ximian.com/products/evolution/