An effective way to filter spam.

I’m sure everyone has tried a number of different ways to filter spam e-mail. There are many filter programs out there: blacklist-based filters, Bayesian filters, keyword filters, etc.

I have found a method that is very effective at filtering the vast majority of spam I receive: filter HTML e-mails. I noticed a long time ago that the vast, vast majority of spam is in HTML format, so simply looking for the <html> tag in the body of the e-mail is quite effective.

Of course, there are some real people who use HTML-formatted e-mail as well. Thus, instead of blocking HTML e-mail completely, I simply redirect it into a separate folder that I skim through every once in a while.
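
For anyone who wants to script this rather than use a mail client rule, here is a minimal sketch in Python using only the standard library. The function names and the folder names ("html-review", "inbox") are made up for illustration, and actually moving the message is left to whatever mail setup you use. It treats a message as HTML if any part is declared as text/html or if a text part contains an <html> tag.

```python
from email import message_from_string
from email.message import Message


def has_html(msg: Message) -> bool:
    """Return True if any part is declared text/html or contains an <html> tag."""
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return True
        if part.get_content_maintype() == "text":
            # decode=True undoes base64 / quoted-printable and returns bytes
            payload = part.get_payload(decode=True) or b""
            if b"<html" in payload.lower():
                return True
    return False


def choose_folder(raw: str) -> str:
    """Pick a folder name (hypothetical names) for a raw message string."""
    msg = message_from_string(raw)
    # HTML mail is set aside for occasional review, not deleted
    return "html-review" if has_html(msg) else "inbox"
```

Note that nothing is deleted here: HTML mail only lands in a folder to be skimmed later.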

So far in my experience of filtering HTML e-mail, only about 1 in 80 HTML e-mails is from a real person, while about 2 in 3 plain-text e-mails are from real people. So for me, it cuts down the spam considerably.

Anyway, just thought I’d share this with the Blender community. Feel free to use it or leave it as you see fit. :slight_smile:

That’s pure genius…

Actually, a very very good way to filter spam is by parsing the e-mail header. A good majority of spam e-mails don’t respect header standards, have broken headers, missing information and so forth.
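
As a rough sketch of what such a header check could look like in Python (standard library only): the function name and the particular checks are assumptions chosen for illustration, not a complete standards validator. It looks for the From and Date fields, which RFC 5322 requires, plus Message-ID, which well-behaved mailers normally add, and then checks that the From address and the date actually parse.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime
from typing import List


def header_problems(raw: str) -> List[str]:
    """Return a list of header problems found in a raw message (hypothetical helper)."""
    msg = message_from_string(raw)
    problems = []

    # From and Date are required; Message-ID is normally present in legitimate mail
    for name in ("From", "Date", "Message-ID"):
        if msg[name] is None:
            problems.append("missing " + name)

    if msg["From"] is not None:
        _, addr = parseaddr(msg["From"])
        if "@" not in addr:
            problems.append("unparseable From address")

    if msg["Date"] is not None:
        try:
            parsedate_to_datetime(msg["Date"])
        except (TypeError, ValueError):
            # the exception type depends on the Python version
            problems.append("unparseable Date")

    return problems
```

A message with an empty list is not necessarily legitimate; the list is only a hint that could feed into a score or a folder rule.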

Martin

I never get spam for some reason. I’ve had the same accounts for ages, and I have received maybe 2-3 emails that fit within the category of spam. One was from Yahoo, and the other was from some gaming site.

That is a very bad idea, especially for ISPs to do something like this site-wide. A lot of financial institutions send out statements, bills, receipts… in an HTML format to make it all “pretty” and “customer friendly”. That’s a lawsuit waiting to happen.

A Bayesian filter like Popfile can do wonders, if your ISP doesn’t already stop the crap at the gate. At my place of business, where I’m the IT Manager, we crank through about 2000 incoming emails a week. Around 91% of that is spam. Popfile, when set up to sort only for spam/not-spam, has been running at 99.5% accuracy for months. It works. It’s easy to set up. Give it a go.

I use qmail + spamassassin. It works great.

Did I ever say that ISPs should do that?

I was simply pointing out something that was effective for me. And bear in mind that I didn’t say to block HTML e-mails. I said to filter them into a separate folder that is looked through every once in a while. I have a few friends and family members who send HTML e-mails, so I’m certainly not advocating the blatant blocking of them.

Bluedemon, please read my post more carefully.

Guys…spam is already filtered at the factory. Otherwise how would it be legal to sell it for consumption?

Maybe I should read the rest of the thread. :wink:

I find that blocking the sender every time I get an email I don’t want immediately cuts down on the number of messages I get. I also realized that most spam is sent to many addresses at once, and at one point I noticed that spam coming from a particular address was copied to the same people every time. So I simply picked one of the other email addresses that wasn’t mine and said, “If an email contains this name, delete it.”
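
In case it helps anyone, that last rule could be sketched roughly like this in Python (standard library only). The function name and the marker address are invented for the example, and I am assuming the rule only looks at the To and Cc fields; a mail client rule that searches the whole message would behave a bit differently.

```python
from email import message_from_string
from email.utils import getaddresses


def addressed_to_marker(raw: str, marker_addresses: set) -> bool:
    """Return True if any To/Cc recipient is one of the known marker addresses.

    marker_addresses is a set of lowercase addresses that keep showing up
    alongside yours in spam (hypothetical values in the example below).
    """
    msg = message_from_string(raw)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _, addr in getaddresses(fields)}
    return not recipients.isdisjoint(marker_addresses)


# Example: treat anything also sent to this (made-up) address as spam
markers = {"stranger@example.org"}
spam = "From: x@example.net\nTo: me@example.com, Stranger <Stranger@Example.org>\n\nBuy now\n"
print(addressed_to_marker(spam, markers))
```

The obvious risk is that the marker address belongs to a real person who might one day be copied on legitimate mail, so moving matches to a folder is safer than deleting them.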

I apologize, I’m a system administrator and think that spam should be stopped there, before it gets to the end user. :slight_smile: