Talk:Bayesian message classification
Revision as of 11:10, 30 June 2007
Workgroup category or categories | Computers Workgroup [Editors asked to check categories] |
Article status | External article: from another source, with little change |
Underlinked article? | Yes |
Basic cleanup done? | Yes |
Checklist last edited by | Derek Harkness 04:15, 2 May 2007 (CDT) |
It's more complicated than that.
This article gives the impression that Bayesian spam filtering is done in one particular way, i.e. by treating the probability of each word independently. That is not the only possible way to do Bayesian spam filtering, and I don't think it's the way it's usually (or always) done. Another way is to look at probabilities of phrases. Yet another is to look at probabilities of certain combinations of words (regardless of where in the message the words appear). For example, the word "interest" might not by itself increase the spam score (or not by much), but if it appears in the same message as "mortgage" and "house", it might add significantly to the probability that the message is about mortgages. The message would then get a Bayesian spam score based on the user's previous reactions to other messages about mortgages. In other words, it can be done in two steps, using Bayes' theorem at each step. --Catherine Woodgold 21:22, 2 May 2007 (CDT)
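For illustration only, the two-step idea in the comment above could be sketched roughly as below. This is a minimal sketch, not a description of how any real filter works: the topic names, word counts, priors, and reaction history are all made up, and the function names (`topic_posterior`, `spam_given_topic`, `spam_score`) are hypothetical. Step 1 applies Bayes' theorem to estimate the probability of each topic given the words in the message; step 2 applies it again to estimate the probability of spam given the topic, from the user's past reactions. The final score averages step 2 over the topics from step 1, which is one possible way to combine them.

```python
import math
from collections import Counter

# Hypothetical training data: word counts per topic (all numbers are made up).
TOPIC_WORD_COUNTS = {
    "mortgages": Counter({"interest": 5, "mortgage": 8, "house": 6}),
    "other": Counter({"interest": 4, "meeting": 7, "lunch": 5}),
}
TOPIC_PRIORS = {"mortgages": 0.5, "other": 0.5}
VOCAB = {w for counts in TOPIC_WORD_COUNTS.values() for w in counts}

# Hypothetical record of the user's past reactions: (topic, marked_as_spam).
REACTIONS = (
    [("mortgages", True)] * 9 + [("mortgages", False)] * 1
    + [("other", True)] * 1 + [("other", False)] * 9
)


def topic_posterior(words):
    """Step 1: P(topic | words) by Bayes' theorem, with add-one smoothing."""
    log_scores = {}
    for topic, counts in TOPIC_WORD_COUNTS.items():
        total = sum(counts.values())
        log_p = math.log(TOPIC_PRIORS[topic])
        for w in words:
            log_p += math.log((counts.get(w, 0) + 1) / (total + len(VOCAB)))
        log_scores[topic] = log_p
    # Normalize in log space so the posteriors sum to 1.
    top = max(log_scores.values())
    unnorm = {t: math.exp(s - top) for t, s in log_scores.items()}
    z = sum(unnorm.values())
    return {t: v / z for t, v in unnorm.items()}


def spam_given_topic(topic):
    """Step 2: P(spam | topic) = P(topic | spam) * P(spam) / P(topic)."""
    n_total = len(REACTIONS)
    n_spam = sum(1 for _, is_spam in REACTIONS if is_spam)
    n_topic = sum(1 for t, _ in REACTIONS if t == topic)
    n_both = sum(1 for t, is_spam in REACTIONS if t == topic and is_spam)
    if n_spam == 0:
        return 0.0  # the user has never marked anything as spam
    p_spam = n_spam / n_total
    if n_topic == 0:
        return p_spam  # no history for this topic: fall back to the prior
    return (n_both / n_spam) * p_spam / (n_topic / n_total)


def spam_score(words):
    """Combine both steps: sum over topics of P(spam | topic) * P(topic | words)."""
    return sum(p * spam_given_topic(t) for t, p in topic_posterior(words).items())


print(spam_score(["interest"]))
print(spam_score(["interest", "mortgage", "house"]))
```

With these made-up numbers, "interest" alone leaves the score near the prior, while "interest" together with "mortgage" and "house" pushes the message firmly into the mortgages topic, so the user's past reactions to mortgage messages dominate the score.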
Why is this article marked external?
If it's from another source, then what is it? Greg Woodhouse 12:10, 30 June 2007 (CDT)