
What is "naive" in a naive Bayes classifier?

What is naive about Naive Bayes?

Peddler asked May 16 '12 08:05

People also ask

Why is Bayes classification called naive?

Naïve Bayes classification is called Naïve because it assumes class conditional independence. The effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is made to reduce computational costs and hence is considered Naïve.

Why naive Bayes theorem is naive?

Why is it called Naïve Bayes? The name combines two words, Naïve and Bayes. It is called Naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features.

What does naive mean in machine learning?

A naive classifier is a simple classification model that assumes little to nothing about the problem; its performance provides a baseline against which all other models evaluated on a dataset can be compared.
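The simplest such baseline just predicts the majority class from the training labels. A minimal sketch (the labels and function name are made up for illustration):

```python
from collections import Counter

def majority_class_baseline(train_labels, test_size):
    """A 'naive classifier' baseline: always predict the class
    that appears most often in the training labels."""
    most_common, _ = Counter(train_labels).most_common(1)[0]
    return [most_common] * test_size

# 'spam' is the majority class here, so every prediction is 'spam'.
train = ["spam", "spam", "ham", "spam", "ham"]
preds = majority_class_baseline(train, test_size=3)
print(preds)  # ['spam', 'spam', 'spam']
```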


2 Answers

There's actually a very good example on Wikipedia:

In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature, given the class variable. For example, a fruit may be considered to be an apple if it is red, round, and about 4" in diameter. Even if these features depend on each other or upon the existence of the other features, a naive Bayes classifier considers all of these properties to independently contribute to the probability that this fruit is an apple.

Basically, it's "naive" because it makes assumptions that may or may not turn out to be correct.
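To make the apple example concrete, here is a tiny sketch of how the independence assumption lets you score a class by simply multiplying per-feature probabilities (all the numbers are made up for illustration):

```python
# Per-feature likelihoods for the class "apple" (illustrative values).
p_red_given_apple = 0.8    # P(red | apple)
p_round_given_apple = 0.9  # P(round | apple)
p_4in_given_apple = 0.7    # P(~4" diameter | apple)
p_apple = 0.3              # prior P(apple)

# Naive Bayes treats each feature as contributing independently,
# so the class score is just the product of the factors above.
score_apple = p_red_given_apple * p_round_given_apple * p_4in_given_apple * p_apple
print(round(score_apple, 4))  # 0.1512
```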

laurent answered Oct 10 '22 16:10

If your data consists of a feature vector X = {x1, x2, ..., x10} and your class labels y = {y1, y2, ..., y5}, a Bayes classifier picks the class label that maximizes the posterior, which by Bayes' rule is proportional to:

P(y|X) ∝ P(X|y) * P(y) = P(x1,x2,...,x10|y) * P(y)

So far, nothing is naive. However, it is hard to estimate the joint likelihood P(x1,x2,...,x10|y), so we assume the features are conditionally independent given the class; this is the naive assumption. We then end up with the following instead:

P(y|X) ∝ P(x1|y) * P(x2|y) * ... * P(x10|y) * P(y)
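The factored formula above can be sketched directly in code: multiply the prior by each per-feature likelihood and pick the class with the highest score. The classes, feature values, and probability tables below are made up for illustration:

```python
def naive_bayes_predict(likelihoods, priors, x):
    """Pick the class y maximizing P(y) * prod_i P(x_i | y).

    likelihoods[y][i] maps the value of feature i to P(x_i = value | y);
    priors[y] is P(y). All numbers used here are illustrative.
    """
    scores = {}
    for y, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[y][i][value]  # naive independence step
        scores[y] = score
    return max(scores, key=scores.get)

# Two classes, two binary features (made-up conditional probabilities).
likelihoods = {
    "y1": [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}],
    "y2": [{0: 0.3, 1: 0.7}, {0: 0.6, 1: 0.4}],
}
priors = {"y1": 0.5, "y2": 0.5}
print(naive_bayes_predict(likelihoods, priors, [1, 1]))  # y2
```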

gr33ndata answered Oct 10 '22 16:10