 

Naive Bayes and Logistic Regression Error Rate

I have been trying to figure out how the error rate relates to the number of features in both of these models. I watched some videos, and the creator said that a simple model can be better than a complicated model. So I figured that the more features I had, the greater the error rate would be. This did not prove to be true in my work: when I had fewer features, the error rate went up. I'm not sure if I'm doing this incorrectly or if the person in the video made a mistake. Could someone explain? I am also curious how the number of features relates to Logistic Regression's error rate.

asked Oct 02 '13 by Taztingo


People also ask

Why is logistic regression more accurate than Naive Bayes?

If the features are correlated, Naive Bayes' independence assumption is violated and its classifications suffer. Logistic regression fits a linear decision boundary that splits the weight among correlated features, so it can still work well, and often gives better results than Naive Bayes, when features are correlated.

How is Naive Bayes different from logistic regression?

Naive Bayes is an example of a generative classifier, while Logistic Regression is an example of a discriminative classifier.

Why does Naive Bayes give less accuracy?

The assumption that all features are independent rarely holds in real life, which can make the Naive Bayes algorithm less accurate than more sophisticated algorithms.

Why is Naive Bayes better than logistic regression for text classification?

Because of its "naive" independence assumption, the Naive Bayes algorithm has very few parameters to estimate, so it can perform well on text classification compared with algorithms like Logistic Regression or tree-based methods, and its probability calculations are much faster.


1 Answer

Naive Bayes and Logistic Regression are a "generative-discriminative pair," meaning they have the same model form (a linear classifier), but they estimate parameters in different ways.

For features x and label y, naive Bayes estimates a joint probability p(x,y) = p(y)*p(x|y) from the training data (that is, it builds a model that could "generate" the data), and uses Bayes' rule to predict p(y|x) for new test instances. On the other hand, logistic regression estimates p(y|x) directly from the training data by minimizing an error function (which is more "discriminative").
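
To make the contrast concrete, here is a minimal sketch (assuming scikit-learn and a made-up synthetic binary-feature dataset; none of this comes from the original post). BernoulliNB estimates p(y) and p(x|y) from counts and predicts through Bayes' rule, while LogisticRegression fits p(y|x) directly, yet both end up as linear classifiers with one weight per feature:

    # Hedged sketch: generative (naive Bayes) vs. discriminative (logistic
    # regression) estimation on the same synthetic binary features.
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, d = 500, 20
    y = rng.integers(0, 2, size=n)                    # labels
    rates = np.where(y[:, None] == 1, 0.7, 0.3)       # class-dependent feature rates
    X = (rng.random((n, d)) < rates).astype(int)      # binary features

    nb = BernoulliNB().fit(X, y)                      # learns p(y) and p(x|y), predicts via Bayes' rule
    lr = LogisticRegression(max_iter=1000).fit(X, y)  # learns p(y|x) directly by minimizing log loss

    # Both are linear classifiers: compare per-feature evidence/weights.
    nb_log_odds = nb.feature_log_prob_[1] - nb.feature_log_prob_[0]  # log p(x_j=1|y=1) - log p(x_j=1|y=0)
    print("NB per-feature log-likelihood ratios:", np.round(nb_log_odds[:5], 2))
    print("LR per-feature weights:              ", np.round(lr.coef_[0][:5], 2))

Either set of numbers can be read as "how much does turning this feature on push the prediction toward class 1"; the difference is only in how they were estimated.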

These differences have implications for error rate:

  1. When there are very few training instances, logistic regression might "overfit," because there isn't enough data to estimate p(y|x) reliably. Naive Bayes might do better because it models the entire joint distribution.
  2. When the feature set is large (and sparse, like word features in text classification), naive Bayes might "double count" features that are correlated with each other, because it assumes that each p(x|y) event is independent when it is not. Logistic regression can do a better job by naturally "splitting the difference" among these correlated features (illustrated in the sketch after this list).
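
The "double counting" in point 2 can be seen in a toy experiment. This is only a sketch under assumed scikit-learn defaults and invented data: one informative binary feature is duplicated ten times, so the copies are perfectly correlated. Naive Bayes multiplies the same evidence ten times and grows overconfident, while logistic regression spreads its weight across the copies and keeps roughly the same predicted probability:

    # Hedged sketch of the "double counting" effect with perfectly correlated
    # copies of one informative feature (scikit-learn assumed; data is synthetic).
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 1000
    y = rng.integers(0, 2, size=n)
    x = (rng.random(n) < np.where(y == 1, 0.8, 0.2)).astype(int)  # one informative feature
    noise = rng.integers(0, 2, size=(n, 5))                       # five uninformative features

    X_single = np.column_stack([x] + [noise])                     # informative feature once
    X_copied = np.column_stack([x] * 10 + [noise])                # same feature repeated 10 times

    for name, X in [("1 copy   ", X_single), ("10 copies", X_copied)]:
        nb = BernoulliNB().fit(X, y)
        lr = LogisticRegression(max_iter=1000).fit(X, y)
        probe = np.zeros((1, X.shape[1]))
        probe[0, :X.shape[1] - 5] = 1                             # informative column(s) on, noise off
        print(name,
              "NB p(y=1|x) =", round(nb.predict_proba(probe)[0, 1], 3),
              "LR p(y=1|x) =", round(lr.predict_proba(probe)[0, 1], 3))

The exact numbers depend on the seed, but the naive Bayes probability should shoot toward 1.0 once the feature is copied, while logistic regression stays near the single-copy value.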

If the features really are (mostly) conditionally independent, both models might actually improve with more and more features, provided there are enough data instances. The problem comes when the training set size is small relative to the number of features. Priors on naive Bayes feature parameters, or regularization methods (like L1/Lasso or L2/Ridge) on logistic regression can help in these cases.
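
As a rough illustration (the data sizes and hyperparameter values below are invented, and the exact numbers will vary with the random seed), Laplace smoothing (the alpha prior in BernoulliNB) and the L2 penalty (the C parameter in LogisticRegression) both rein in the parameter estimates when there are far more features than training examples:

    # Hedged sketch of smoothing/regularization in a small-sample, many-features
    # regime (scikit-learn assumed; everything here is synthetic and illustrative).
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n, d = 60, 500                                  # few examples, many sparse features
    y = rng.integers(0, 2, size=n)
    rates = np.where(y[:, None] == 1, 0.06, 0.04)   # weakly informative sparse features
    X = (rng.random((n, d)) < rates).astype(int)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=0, stratify=y)

    for alpha in (1e-9, 1.0):                       # near-zero smoothing vs. Laplace prior
        acc = BernoulliNB(alpha=alpha).fit(X_tr, y_tr).score(X_te, y_te)
        print(f"NB alpha={alpha}: test accuracy {acc:.2f}")

    for C in (1e6, 1.0):                            # near-zero penalty vs. default L2 strength
        acc = LogisticRegression(C=C, max_iter=5000).fit(X_tr, y_tr).score(X_te, y_te)
        print(f"LR C={C}: test accuracy {acc:.2f}")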

answered Sep 27 '22 by burr