Piggie Site Admin

Joined: 10 Dec 2005 Posts: 59 Location: Florida
Posted: Tue May 02, 2006 12:51 am Post subject: Spam Assassin - Setup and Use
Updated June 9, 2006
Spam Assassin is now installed and working on the server. It takes a little getting used to and a little work to get it to learn what is spam and what is ham. Ham is good mail or non-Spam mail.
Spam Assassin is set up for every mail name separately. You train spam assassin for every mail name and set up personal white and black lists for each mail name as well. So what you do to one email account doesn't affect other email accounts even in the same domain name.
There is a Global whitelist and Global blacklist, but there is very little in them except my email address so it can't get blocked when I email you as a client.
The filter works on 2 principles: a hit number and a Bayesian filter. When you first use Spam Assassin, there are standard built-in filters that give an email hits, or points. If the hit total exceeds the number you set, the email is considered spam. Filters with positive numbers count toward spam and filters with negative numbers count toward ham, though there are only a few ham filters in the package.
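To make the hit idea concrete, here is a minimal sketch in Python. The rule names and point values are made up for illustration and are not Spam Assassin's real rules. Treating a total exactly equal to the threshold as spam is also an assumption.

```python
# Hypothetical rule names and scores, for illustration only.
RULE_SCORES = {
    "SUBJECT_ALL_CAPS": 1.5,
    "MENTIONS_MEDICATION": 3.0,
    "HTML_ONLY_BODY": 1.2,
    "SENDER_KNOWN_GOOD": -2.0,  # negative score: a "ham" filter
}


def total_hits(matched_rules):
    """Add up the scores of every rule that matched a message."""
    return sum(RULE_SCORES[name] for name in matched_rules)


def is_spam(matched_rules, hits_required=7.0):
    """Compare the hit total with the 'Hits required for spam' setting.

    Assumption: a total equal to the threshold counts as spam.
    """
    return total_hits(matched_rules) >= hits_required


# The same message can land on either side depending on the threshold.
message_rules = ["SUBJECT_ALL_CAPS", "MENTIONS_MEDICATION", "HTML_ONLY_BODY"]
print(is_spam(message_rules, hits_required=7))  # below the default of 7
print(is_spam(message_rules, hits_required=5))  # above a setting of 5
```

A ham filter that matches pulls the total back down, which is why a message from a known good sender can stay under the threshold even when a few spam filters fire.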
After you classify 200 emails as spam and 200 emails as ham, the Bayesian filter starts working, learning from which mails you mark spam & which you mark ham in the Training page. While the system does have an auto-learn feature for spam and ham, unless you get dozens of emails per day, this can take months and months to reach the 200 of each type needed for the Bayes Filter to start working.
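As a rough back-of-the-envelope sketch of why auto-learn alone can take so long, the snippet below estimates the number of days until both the spam and ham counts reach 200. The daily rates are assumptions you would replace with your own numbers. The function name is hypothetical.

```python
import math


def days_until_bayes_active(spam_per_day, ham_per_day, needed_each=200):
    """Estimate days until BOTH spam and ham counts reach the minimum.

    The slower of the two categories decides when the Bayes filter starts.
    """
    if spam_per_day <= 0 or ham_per_day <= 0:
        raise ValueError("both daily rates must be greater than zero")
    days_for_spam = math.ceil(needed_each / spam_per_day)
    days_for_ham = math.ceil(needed_each / ham_per_day)
    return max(days_for_spam, days_for_ham)


# Example: 3 spam and 2 ham messages learned per day (assumed rates).
print(days_until_bayes_active(3, 2))
```

Training by hand raises both rates, which is the whole point of the Training page.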
Note: In this text, the term mail name is an email address.
VERY IMPORTANT (if you skim this part, you might as well not use the filter!): When training your Bayesian filter, be careful what you classify as spam and ham. Don't just use your personal preferences about mail you don't want to get, but be sure it's really spam. An example of what not to mark as spam is an email from a mail list you signed up with but no longer like. Go unsubscribe. And don't mark a link request as spam just because it's from someone you don't like or who is sending you one a week; blacklist their email address instead. Since the format of these types of messages is a pattern that looks like ham, marking them spam will pollute your Bayesian filter. You want to mark real unsolicited emails as spam, the ones that come out of nowhere and fill your mailbox with medication, sex, watch and other stupid offers. If at any time you can't decide whether an email is spam or ham, and you see it already has a little grey gear next to it, use the forget box to mark the message. The little gear means that email has been learned, either automatically or by you marking it spam or ham.
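To show what "polluting" means, here is a toy word-counting Bayes classifier in Python. It is a deliberately simplified sketch and not the algorithm Spam Assassin actually uses. The class name and sample messages are invented for illustration. It compares a cleanly trained filter with one where an unwanted (but legitimate) newsletter was marked as spam.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy Bayes filter: counts which words appear in spam and ham messages."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Count each distinct word once per message.
        self.tokens[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def spam_probability(self, text):
        # Start from the smoothed ratio of spam to ham messages seen so far.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for tok in set(text.lower().split()):
            # Add-one smoothing so unseen words never give a zero probability.
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


def trained_filter(pollute):
    bayes = TinyBayes()
    bayes.learn("cheap watches medication offer", "spam")
    bayes.learn("medication offer cheap pills", "spam")
    bayes.learn("weekly newsletter gardening tips", "ham")
    bayes.learn("meeting notes project update", "ham")
    if pollute:
        # A newsletter you signed up for but got tired of, wrongly marked spam.
        bayes.learn("weekly newsletter gardening tips", "spam")
    return bayes


wanted_mail = "weekly newsletter spring gardening"
clean = trained_filter(pollute=False).spam_probability(wanted_mail)
polluted = trained_filter(pollute=True).spam_probability(wanted_mail)
print(polluted > clean)  # the polluted filter is more suspicious of similar ham
```

The single bad training mark makes ordinary newsletter words look spammy, so other legitimate mail with the same pattern starts scoring higher too.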
You must mark the messages and click ok before you download them, or they won't be in the training section for the Bayesian filter to learn. Many of you probably use either web mail or Outlook Express. Both of those are fine, but you must learn how to use them to train the filter.
Outlook. If you run Outlook Express, Outlook, or one of the varieties and you are looking at an email in it, then it's too late to train on that email. Once you download an email it's gone off the server, and you can't train the Bayes filter on it at that point. As far as I know, Outlook can't just look at what mail is waiting on the server without downloading it. It will let you look if you run in IMAP instead of POP3, but that is beyond the scope of this topic.
So what you need is a little reader program, one that can connect to your email account but not download the messages. There are several programs that can do this. Mail Washer Pro is probably the best of the best, and sells for $37 USD.
There is a free version that doesn't have all the features and has a large banner ad. If you would like it, let me know so I can email you a copy.
You set up your email accounts in a program like Mail Washer. Then you can see what mail is waiting on the server without actually downloading it. You can even set it to read the first few hundred lines of the email if you can't tell enough from the subject. If you see spam then log in to Plesk and mark those messages. More details below.
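For the curious, this is roughly what such a reader program does, sketched with Python's standard poplib module. The POP3 TOP command fetches only the headers plus a chosen number of body lines, and nothing is removed from the server. The host name, login details and function names are placeholders, and using an SSL connection is an assumption about your mail server.

```python
import poplib
from email import message_from_bytes


def peek_headers(conn, body_lines=0):
    """List (number, sender, subject) for each waiting message.

    Uses TOP, so messages are only peeked at and stay on the server.
    """
    count, _size = conn.stat()
    summaries = []
    for number in range(1, count + 1):
        _resp, lines, _octets = conn.top(number, body_lines)
        msg = message_from_bytes(b"\r\n".join(lines))
        summaries.append((number, msg.get("From", ""), msg.get("Subject", "")))
    return summaries


def open_mailbox(host, user, password):
    """Log in over POP3 with SSL (assumption: the server offers it).

    Use poplib.POP3 instead for a plain connection.
    """
    conn = poplib.POP3_SSL(host)
    conn.user(user)
    conn.pass_(password)
    return conn


# Example usage with placeholder values:
#   conn = open_mailbox("mail.example.com", "you@example.com", "secret")
#   for number, sender, subject in peek_headers(conn):
#       print(number, sender, subject)
#   conn.quit()  # nothing was deleted, so the mail is still there to train on
```

Raising body_lines shows the start of each message as well, similar to reading the first lines in Mail Washer when the subject alone isn't enough.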
Those who read their mail via web mail can read it and still mark it in Spam Assassin, as long as they don't move it to another folder. This also applies to anyone using any email client in IMAP mode to read their mail.
ALSO! If you set the filter to delete spam, it's gone forever. There is no place to go get it back or see what it killed. It's gone, good bye, can't see it anymore, actually you never see it, as it's killed when it comes in your mail account. Be very careful setting your Spam Assassin to delete incoming emails.
1) Turn on Spam Assassin: From one of your domain name screens, go to one of your mail name screens by clicking on the Mail icon. You will see a list of your mail names toward the bottom of the screen. Note that to the left of the mail names are tiny icons. These are shortcuts into specific areas of that mail name. If the one in the S column is in color, then Spam Assassin is already turned on for that mail name! Click on it and it will take you to Step 2 below. If S is not in color, then look to see if the icon for that mail name under the B column is in color. If it's not, then that is a forwarded mail name, and Spam Assassin is not available for forwarded mail names. If it is, click on it and it will take you to the Mailbox screen for that address. Look at the bottom of that screen and you will see a checkbox to activate Spam Assassin for that mail name. It is labeled Enable spam filtering.
Click ok and you are taken back to that specific mail name screen.
2) Set up Spam Assassin: Still in the same mail name screen from above, click on the Spam Filter icon.
A) Jump down to Personal settings section.
a. Please leave Use server wide settings checked. All I do with that is put my web hosting emails in the white list so you always get email from me about the server or your problems.
b. Hits required for spam: The default is 7. Just about all emails with a hit total greater than 7 are spam. I find that a setting of 5 is the best. Some good mail will get marked as spam at first at 5, but once I had trained the Bayesian filter, I found no good mail got marked as spam at 5.
c. What to do with spam mail: Stick with the default of Mark as spam and store in mailbox checked. Until you are very very very very very sure you have your Bayesian filter set correctly and have your hits at the right number, don't tell it to delete spam. Why? Because it doesn't save deleted mail, it's gone forever!
d. Modify spam mail subject: I suggest leaving this check box marked. But I also suggest changing the default message added to the subject line of spam to [SPAM], which is the industry wide default setting (again Plesk does something not standard).
In addition there is a secret setting. If you check Mark as spam and store in mailbox but uncheck the 'Modify spam mail subject' check box, then the subject is not changed, but spam lines are still added to the header of the email. This is handy if you are into some fancy filtering in your email client and don't want to pollute the subject line of spam (like they don't deserve it!). Two headers are available for filtering on header content alone. Filters can be set to look for either of these lines in the header: X-Spam-Flag: YES or X-Spam-Status: Yes
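If you want to see what a header-only filter looks like, here is a small sketch using Python's standard email module. It checks for the two header lines named above. The function name and the sample messages are made up for illustration.

```python
from email import message_from_string


def flagged_as_spam(raw_message):
    """Return True if either spam header says the message is spam."""
    msg = message_from_string(raw_message)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    status = (msg.get("X-Spam-Status") or "").strip().lower()
    # X-Spam-Status usually carries extra details after the Yes/No,
    # so only the start of the value is checked.
    return flag == "YES" or status.startswith("yes")


# Made-up sample messages for illustration.
marked = "X-Spam-Flag: YES\nSubject: Cheap watches\n\nBuy now"
normal = "X-Spam-Status: No\nSubject: Lunch on Friday?\n\nSee you there"
print(flagged_as_spam(marked), flagged_as_spam(normal))
```

Most email clients can do the same test with a rule on message headers, which leaves the subject line untouched.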
B) Black List & White List: Here you can add email addresses that you wish to always include (white list) or always exclude (black list). These steps are just a matter of entering an email address and hitting Add, or highlighting one and hitting Remove. Wild cards are accepted. For example, one of my favorite wild cards for my black lists is paypal@*, which blacklists any email from paypal@anythingatall.anytld
You can blacklist or white list an entire domain with *@somedomain.com, which will include anynameatall@somedomain.com. A broader use of wild cards is another favorite of mine, *confirm@*, which matches any email address at all whose user part ends in "confirm". Phishing scammers love to use things like aw-confirm@ebay.com or confirm@paypal.com and this one catches them all! Be creative, but be careful, as wild cards are powerful.
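Before adding a wild card, it can help to test what it will and won't catch. The sketch below uses Python's fnmatch module as a stand-in. It assumes the lists treat * the same way a shell-style pattern does and ignore upper and lower case, which may not match Spam Assassin exactly in every detail. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def matches_any(address, patterns):
    """Return True if the address matches at least one wild card pattern."""
    address = address.lower()
    return any(fnmatchcase(address, pattern.lower()) for pattern in patterns)


blacklist = ["paypal@*", "*confirm@*"]

for sender in [
    "paypal@anythingatall.anytld",
    "aw-confirm@ebay.com",
    "confirm@paypal.com",
    "confirmations@shop.example",  # user part does not end in "confirm"
]:
    print(sender, matches_any(sender, blacklist))
```

Trying a few addresses you do want to receive mail from is a quick way to catch a pattern that is too greedy.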
3) Training the Bayesian Filter:
NOTE: Tell me whenever you add a new mail name to one of your sites, and I can train the filter to at least a basic level of usage. Still read the following, as the filter will need further training from time to time as the spammers change techniques.
Now comes the fun part. It's not hard, just gets a little time consuming, and again, it must be done for every mail name. This training is not universal. This is a good reason to use aliases that point to one real mail name. Not only do you have to check only one mail account, but the filters then also work across all the aliases for that mail name. (As a more advanced aside, you can forward emails from other domains to the trained mail name on the site, so that mail is filtered with the same filter and you still only check one mailbox.)
1. Before you download your mail, log into Plesk and go to one of your domains (some of you only have one domain). Click on the Mail Icon for that domain.
2. Then on the new screen you see the list of mailboxes for that domain at the bottom of the screen. Before each mail name you see some tiny icons labeled L B R G A S AV. These are shortcuts to the functions in that mail name (saves a click). If Spam Assassin is turned on for a mail name, the little icon under the S (spam) is in color and not grayed out. Clicking on this little icon jumps you straight to the Spam Assassin screen for that mail name. (This trick also works to jump straight to the other functions of the mailbox.)
3. On the Spam Assassin screen for that mail name, click on the Training icon.
4. Then mark each email in your account as spam, not spam, or forget.
5. Click OK at the top of the screen. Wait till it finishes before you download your mail with your email program.
6. Done! And WARNING!!! Never click CLEAR SPAM FILTER DATABASE, or that will be the end of all your training up to that point!