Linear Discriminant Analysis vs Naive Bayes

Question

What are the advantages and disadvantages of LDA vs Naive Bayes in terms of machine learning classification?

I know some of the differences: Naive Bayes assumes the features are independent given the class, while LDA assumes Gaussian class-conditional densities. But I don't understand when to use LDA and when to use NB in a given situation.

Maxim · Accepted Answer

Both methods are pretty simple, so it's hard to say in advance which one will work better. It's often faster just to try both and compare their test accuracy. But here is a list of characteristics that usually indicate that a given method is less likely to give good results. It all boils down to the data.

Naive Bayes

The first disadvantage of the Naive Bayes classifier is the feature independence assumption: features are treated as independent of each other given the class. In practice, data is multi-dimensional and features do correlate. Because of this, the results can be pretty bad, though not always significantly so. If you know for sure that the features are dependent (e.g. pixels of an image), don't expect Naive Bayes to shine.
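To see why correlated features hurt, consider the extreme case where one binary feature is simply copied several times. The following is a minimal sketch with made-up numbers (the prior and the two likelihoods are illustrative assumptions, not estimates from real data, and `nb_posterior` is a hypothetical helper):

```python
def nb_posterior(prior_pos, lik_pos, lik_neg, n_copies):
    """Posterior P(class = pos | x) when Naive Bayes sees the same binary
    feature n_copies times and treats every copy as independent evidence."""
    score_pos = prior_pos * lik_pos ** n_copies
    score_neg = (1.0 - prior_pos) * lik_neg ** n_copies
    return score_pos / (score_pos + score_neg)

# Assumed values: P(x=1 | pos) = 0.8, P(x=1 | neg) = 0.4, equal priors.
single = nb_posterior(0.5, 0.8, 0.4, n_copies=1)
tripled = nb_posterior(0.5, 0.8, 0.4, n_copies=3)

print(f"one copy:     {single:.3f}")
print(f"three copies: {tripled:.3f}")
```

The copies carry no new information, yet the posterior moves from about 0.67 to about 0.89. Real features are rarely exact duplicates, but strongly correlated ones push the classifier toward the same kind of overconfidence.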

Another problem is data scarcity. For each possible value of a feature, the likelihood is estimated from frequency counts in the training data. With few samples, these estimates can end up at or very close to 0 or 1, which in turn leads to numerical instabilities and worse results.
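The scarcity problem is easiest to see when a feature value never occurs together with a class in the training set. Below is a small sketch on hypothetical toy data; `estimate_likelihood` is an illustrative helper, and add-one (Laplace) smoothing is shown as one common remedy rather than something the answer itself prescribes:

```python
from collections import Counter

def estimate_likelihood(values, value, n_categories, alpha=0.0):
    """Estimate P(feature = value | class) from the feature values observed
    for one class. alpha=0 gives the raw frequency estimate; alpha=1 gives
    add-one (Laplace) smoothing."""
    counts = Counter(values)
    return (counts[value] + alpha) / (len(values) + alpha * n_categories)

# Hypothetical data: colours observed for one class, out of 3 possible colours.
observed = ["red", "red", "blue", "red"]

raw = estimate_likelihood(observed, "green", n_categories=3)
smoothed = estimate_likelihood(observed, "green", n_categories=3, alpha=1.0)

print(raw)       # raw estimate for the unseen value
print(smoothed)  # smoothed estimate stays strictly positive
```

With the raw estimate, a single unseen value zeroes out the whole product of likelihoods for that class, no matter what the other features say. Smoothing keeps the estimate positive, at the cost of a small bias.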

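A third problem arises with continuous features. The count-based Naive Bayes classifier described above works only with categorical variables, so continuous features have to be discretized first, which throws away a lot of information. The Gaussian variant of Naive Bayes avoids the binning, but it assumes instead that each feature is normally distributed within each class. If there is a continuous variable in the data and that assumption looks doubtful, it's a sign against Naive Bayes.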
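The information lost by discretization can be illustrated with a few lines of code. This is only a sketch: the heights, the bin edges, and both helper functions are hypothetical, and the fitted normal density is used purely as one example of a model that keeps the raw value:

```python
import math
import statistics

def gaussian_likelihood(x, samples):
    """Density of x under a normal distribution fitted to samples."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def to_bin(x, edges):
    """Index of the bin x falls into, given sorted bin edges."""
    return sum(x >= e for e in edges)

# Hypothetical heights (cm) for one class, and hypothetical bin edges.
heights = [160.0, 165.0, 170.0, 175.0, 180.0]
edges = [150.0, 170.0, 190.0]

# After binning, 171 cm and 189 cm become the same categorical value...
same_bin = to_bin(171.0, edges) == to_bin(189.0, edges)

# ...whereas a density over the raw values still tells them apart.
near = gaussian_likelihood(171.0, heights)
far = gaussian_likelihood(189.0, heights)

print(same_bin, near > far)
```

A value near the class mean and one far out in the tail end up indistinguishable once they share a bin. Finer bins reduce this loss, but they also leave fewer samples per bin, which brings back the scarcity problem.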