Suggestion for Spam Filters
One of the issues with spam filtering is false positives. “Did you check your spam folder?” is often the first question to ask when your email has not been received on the other end.
I’m not a machine learning expert, I’ve never made a spam filter, and I only know the naive Bayes approach. So this suggestion is not a machine-learning “breakthrough”. But what I do know about classification algorithms is that they usually provide a likelihood of an item belonging to one group or another. Some items are not identified as spam with absolute certainty – they are 51% likely to be spam, for example.
My suggestion is: for borderline items (those with lower certainty that they should be classified as spam), the spam filter should send an email to the sender indicating that their message was considered spam. A genuine sender will probably take additional steps, like sending another short email or calling/messaging the recipient (‘click here to confirm you are not spam’ won’t work, because it can easily be automated).
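To make the idea concrete, here is a minimal sketch of that triage step. It assumes the classifier already returns a spam probability between 0 and 1; the names `triage`, `Thresholds` and `Action`, and the 0.5 and 0.9 cut-offs, are hypothetical and chosen only for illustration, not taken from any real spam filter.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"              # goes to the inbox
    NOTIFY_SENDER = "notify_sender"  # filed as spam, but the sender is told
    SPAM = "spam"                    # filed as spam silently


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs; a real filter would tune these.
    spam: float = 0.5       # at or above this, the message is treated as spam
    confident: float = 0.9  # at or above this, the filter is "sure"


def triage(spam_probability: float, t: Thresholds = Thresholds()) -> Action:
    """Map a classifier's spam probability to one of three actions."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    if spam_probability < t.spam:
        return Action.DELIVER
    if spam_probability < t.confident:
        # Borderline case: classified as spam, but with low certainty.
        return Action.NOTIFY_SENDER
    return Action.SPAM
```

With these assumed cut-offs, a message that is 51% likely to be spam falls into the borderline band, so `triage(0.51)` returns `Action.NOTIFY_SENDER`, while a message scored at 0.97 is filed as spam without any notification.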
It’s a usability suggestion rather than a technical one, and I’m sure there are some issues that I’m missing. But I thought it was at least worth sharing.
It would be a great advantage for spammers to know that their emails are marked as “spam”.
I think the proper solution might be for the spam filter to pass the certainty value to the email client (I don’t know how these spam filters work; they might be passing that value already). A good email client could then tell the user that the spam folder contains emails that might not be spam.
Good point, although this feature can also be used by spammers – they register a mail account and send spam to themselves, analyzing what gets through and what gets labelled as “uncertain”.
Since everyone sends emails, imagine what happens when spammers disguise themselves as spam filters.
Hah, also a good point, but they can’t know the full details of the email – the recipient and the (abridged) content. Still, phishing sites disguised as friendly spam filters will probably appear. The question is – would they do damage?
Real damage, maybe, maybe not. But now spam filters have to work extra hard. Infinite loops?
There’s a spam filter that changes the header to [Possible Spam] and sends the message anyway when it THINKS it’s spammy, which is cool. You can actually set the threshold and everything. It’s really good, actually, and it’s called Xeams (exams.com). Best of all, it’s free, and I know a few universities use it!