I'm stuck on having to write a simple spam filter, and I'm not really sure how to go about it.
So far I've come up with wordlist and domain filtering, which will give or remove points up to a certain threshold.
For example, if you're writing about "v1agr4" from a blacklisted domain, you'll get like 2 points for spam, but if you're writing about "v1agr4" from a hotmail.com account, you'll get only 1 "spam point".
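A minimal Python sketch of the point scheme described above might look like this. The word list, the domain `spam.example`, the weights, and the threshold of 2 are all made-up values for illustration, not a recommended configuration.

```python
import re

# Hypothetical lists and weights, chosen only to mirror the example above.
SPAM_WORDS = {"v1agr4": 1}
BLACKLISTED_DOMAINS = {"spam.example": 1}
THRESHOLD = 2  # assumed cut-off: this many points or more counts as spam


def spam_score(sender, body):
    """Add up points for a blacklisted sender domain and for listed words."""
    domain = sender.rsplit("@", 1)[-1].lower()
    score = BLACKLISTED_DOMAINS.get(domain, 0)
    for word in re.findall(r"[a-z0-9]+", body.lower()):
        score += SPAM_WORDS.get(word, 0)
    return score


def is_spam(sender, body):
    """Flag the message once its score reaches the threshold."""
    return spam_score(sender, body) >= THRESHOLD
```

With these values, a message mentioning "v1agr4" from `spam.example` scores 2 points, while the same text from a hotmail.com address scores 1. Negative weights for trusted words or domains would "remove points" in the same way.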
Do you guys have any other suggestions or resources?
This is more about learning how spam filters work than developing something enterprise-grade.
A spam filter is a program used to detect unsolicited, unwanted and virus-infected emails and prevent those messages from getting to a user's inbox. Like other types of filtering programs, a spam filter looks for specific criteria on which to base its judgments.
Some really good algorithm info here:
http://www.paulgraham.com/spam.html
http://www.paulgraham.com/better.html
But, seriously, why reinvent the wheel?
Just download K9: http://keir.net/k9.html
Some open source Java projects related to Bayesian Spam Filtering (that was mentioned by LFSR Consulting):
And one extra for C++:
Look into Bayesian Spam Filtering.
I know Perl has a library for it, so I'd assume Java has one too.
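To give a feel for the idea without any library, here is a rough Python sketch of a plain multinomial naive Bayes classifier with add-one smoothing. It is a textbook variant, not the exact method from the Paul Graham essays, and the class name `NaiveBayesFilter` and the tokenizer are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Very simple tokenizer (an assumption): lowercase letters, digits, apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical minimal Bayesian filter trained on labelled messages."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message under label 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated P(spam | text) as a value between 0 and 1."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained class does not cause log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            denominator = total_words + len(vocab) + 1
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            log_scores[label] = score
        # Convert log scores back to a normalized probability.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)
```

Usage would be along the lines of calling `train(text, "spam")` or `train(text, "ham")` on messages you have sorted by hand, then treating anything whose `spam_probability` exceeds some cut-off (0.9, say, as a guess) as spam.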