The Most Used Anti Spam Filtering Technique - The Bayesian Filter

Many of us get flooded with spam in our personal mailboxes. Flooding inboxes with junk mail has become a common form of advertising these days. The main reason is that it is highly cost-effective and easy to maintain. Because of this, ad hoc computer programs have been using anti-spam filters to fight spam since the 1990s.

Filtering programs these days have become smarter at identifying spam messages. Messages that are purposely forged to look as if they were sent through Outlook Express are treated as spam. The same goes for messages that consist only of an image. Why? Spammers do this to prevent anti-spam software from checking the text.

Altered or missing headers, such as the "sender" field, are also indicative of spam. Spammers do this to complicate sender identification. The same applies to messages that contain unusual or misspelled words, a method spammers use to evade Bayesian filtering. It is actually the spammers' most effective technique, which is why many anti-spam software developers are looking for the best way to detect it.
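
As a rough illustration of the header and image-only checks described above, here is a minimal Python sketch built on the standard `email` package. The list `EXPECTED_HEADERS` and the helper names `missing_headers` and `is_image_only` are assumptions made for this example, not part of any particular anti-spam product, and a real filter would weigh many more signals.

```python
from email import message_from_string
from email.message import Message
from typing import List

# Hypothetical set of headers we expect a well-formed message to carry.
EXPECTED_HEADERS = ("From", "Date", "Message-ID")


def missing_headers(msg: Message) -> List[str]:
    """Return the expected headers that are absent or empty."""
    return [h for h in EXPECTED_HEADERS if not (msg.get(h) or "").strip()]


def is_image_only(msg: Message) -> bool:
    """True if the message has an image part and no non-empty text part."""
    has_image = False
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        maintype = part.get_content_maintype()
        if maintype == "image":
            has_image = True
        elif maintype == "text":
            payload = part.get_payload(decode=True) or b""
            if payload.strip():
                return False  # readable text exists, so not image-only
    return has_image


if __name__ == "__main__":
    sample = message_from_string("From: a@example.com\nSubject: hi\n\nHello there\n")
    print(missing_headers(sample))  # headers a filter might treat as suspicious
    print(is_image_only(sample))
```

Either result would only be one hint among several; on its own, a missing header does not prove a message is spam.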

Now that spam is driving people to turn to science and anti-spam filtering tools, the hope is that software like the Bayesian filter can detect it using probability algorithms. These filters are based on the so-called "Bayes' rule", a theorem of conditional probability that estimates the likelihood of an event given that another event is known to have occurred. Put another way, the rule implies that the likelihood of an event occurring in the future can be inferred from the number of times it has occurred in the past.
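
To make the rule concrete, the sketch below applies Bayes' rule to a single word: given how often the word shows up in spam and in ordinary mail, it estimates the probability that a message containing it is spam. The function name `p_spam_given_word`, the 0.5 prior, and the example figures are assumptions chosen for illustration only.

```python
def p_spam_given_word(p_word_given_spam, p_word_given_ham, p_spam=0.5):
    """Bayes' rule: P(spam | word) = P(word | spam) * P(spam) / P(word)."""
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word in any message.
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    if p_word == 0:
        return p_spam  # word never seen anywhere: fall back to the prior
    return p_word_given_spam * p_spam / p_word


if __name__ == "__main__":
    # Assumed figures: the word appears in 40% of spam and 1% of ordinary mail.
    print(round(p_spam_given_word(0.40, 0.01), 3))  # about 0.976
```

With these made-up numbers, a word that is forty times more common in spam than in ordinary mail pushes the estimate close to certainty, even though half of all mail was assumed to be legitimate.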

Applied to spam, this means that if you break a message into discrete elements and find particular elements that recur frequently in spam but not in ordinary mail, then the message is probably spam.
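
The idea of breaking messages into elements and comparing how often each one recurs can be sketched as follows. The token pattern, the function names `tokenize` and `count_tokens`, and the tiny sample messages are all assumptions for this example; real filters usually tokenize headers and markup as well.

```python
import re
from collections import Counter

# Assumed token pattern: runs of letters, digits and a few symbols.
TOKEN_RE = re.compile(r"[a-z0-9$'-]+")


def tokenize(text):
    """Break a message into a set of lower-cased tokens."""
    return set(TOKEN_RE.findall(text.lower()))


def count_tokens(messages):
    """Count in how many messages each token appears."""
    counts = Counter()
    for text in messages:
        counts.update(tokenize(text))
    return counts


if __name__ == "__main__":
    spam = ["Buy cheap pills now", "Cheap loans approved now"]
    ham = ["Lunch at noon?", "Meeting notes attached"]
    spam_counts, ham_counts = count_tokens(spam), count_tokens(ham)
    # Tokens that recur in spam but never show up in ordinary mail.
    suspicious = sorted(t for t, n in spam_counts.items() if n > 1 and ham_counts[t] == 0)
    print(suspicious)
```

Each message contributes a token at most once, so the counts record how many messages contain a token rather than how many times it was typed.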

Bayesian filters are a type of plug-in with a built-in database for collecting messages and an inference engine for assigning probability ratings. When a message arrives, the filter vets its individual elements, rates them, and assigns a composite rating to the message as a whole. It then copies the message into the database. If the rating indicates a high probability that the message is spam, the filter blocks it from the inbox. Users can also flag any spam that manages to get through.
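
The workflow above (a message store, per-element ratings, a composite rating, a blocking decision, and user flagging) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the class name `TinyBayesFilter`, the 0.9 threshold, the in-memory counters standing in for the database, and the use of add-one smoothing with a naive (independence) combination of token evidence are all choices made for the example, not a description of any specific product. Automatic copying of every incoming message into the store is left out to keep the sketch short.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy stand-in for a filter's message database and inference engine."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off for blocking a message
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def _tokens(text):
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def train(self, text, label):
        """Store a message's tokens under 'spam' or 'ham'."""
        self.token_counts[label].update(self._tokens(text))
        self.message_counts[label] += 1

    def _token_prob(self, token, label):
        # Add-one smoothing so unseen tokens never yield a zero probability.
        return (self.token_counts[label][token] + 1) / (self.message_counts[label] + 2)

    def rate(self, text):
        """Composite estimate of P(spam | message), combined in log-odds space."""
        log_odds = math.log((self.message_counts["spam"] + 1) /
                            (self.message_counts["ham"] + 1))
        for token in self._tokens(text):
            log_odds += math.log(self._token_prob(token, "spam"))
            log_odds -= math.log(self._token_prob(token, "ham"))
        # Numerically stable conversion from log-odds to a probability.
        if log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1.0 + e)

    def is_spam(self, text):
        """Block the message when its rating reaches the threshold."""
        return self.rate(text) >= self.threshold

    def flag(self, text, label="spam"):
        """User feedback: record a message the filter got wrong."""
        self.train(text, label)


if __name__ == "__main__":
    f = TinyBayesFilter()
    f.train("cheap pills buy now", "spam")
    f.train("cheap loans approved now", "spam")
    f.train("lunch at noon tomorrow", "ham")
    f.train("meeting notes attached", "ham")
    print(f.is_spam("cheap pills now"), f.is_spam("lunch meeting tomorrow"))
```

The `flag` method is how the sketch models users reporting spam that slipped through: each flagged message updates the counts, so similar messages rate higher the next time.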



