
In which cases is the cross-entropy preferred over the mean squared error? [closed]

Both of the above methods give a better score the closer the prediction is to the target, yet cross-entropy is usually preferred. Is it preferred in every case, or are there particular scenarios where we prefer cross-entropy over MSE?

Amogh Mishra asked Apr 09 '16 09:04



1 Answer

Cross-entropy is preferred for classification, while mean squared error is one of the best choices for regression. This follows directly from the statement of the problems themselves: in classification you work with a very particular set of possible output values, so MSE is poorly suited (it has no knowledge of that structure and therefore penalizes errors in an incompatible way). To better understand this, it helps to work through the relations between

  1. cross entropy
  2. logistic regression (binary cross entropy)
  3. linear regression (MSE)

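You will notice that both logistic and linear regression can be seen as maximum likelihood estimators, simply with different assumptions about the dependent variable (Bernoulli-distributed for logistic regression, Gaussian for linear regression).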
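The maximum likelihood view can be checked numerically. The following is a minimal NumPy sketch, not part of the original answer: the function names, the toy arrays, and the choice of a fixed unit variance for the Gaussian are all assumptions made for illustration. It shows that MSE matches the Gaussian negative log-likelihood up to a scale and an additive constant, that binary cross-entropy is the Bernoulli negative log-likelihood, and how differently the two losses treat a confident wrong class prediction.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy; predictions are clipped to avoid log(0)."""
    p = np.clip(p_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def gaussian_nll(y_true, y_pred, sigma=1.0):
    """Mean negative log-likelihood assuming y ~ Normal(y_pred, sigma^2)."""
    return np.mean(
        0.5 * np.log(2 * np.pi * sigma ** 2)
        + (y_true - y_pred) ** 2 / (2 * sigma ** 2)
    )

def bernoulli_nll(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood assuming y ~ Bernoulli(p_pred)."""
    p = np.clip(p_pred, eps, 1 - eps)
    likelihood = np.where(y_true == 1, p, 1 - p)
    return -np.mean(np.log(likelihood))

# Regression (toy data): with sigma = 1 the Gaussian NLL equals
# 0.5 * MSE plus a constant, so both have the same minimizer.
y_reg = np.array([1.5, -0.3, 2.0])
y_hat = np.array([1.0, 0.1, 2.5])
const = 0.5 * np.log(2 * np.pi)
gap = gaussian_nll(y_reg, y_hat) - 0.5 * mse(y_reg, y_hat)

# Classification (toy data): binary cross-entropy is the Bernoulli NLL.
y_cls = np.array([1.0, 0.0, 1.0])
p_hat = np.array([0.9, 0.2, 0.6])
bce = binary_cross_entropy(y_cls, p_hat)
nll = bernoulli_nll(y_cls, p_hat)

# A confident wrong prediction: true class 1, predicted probability 0.001.
# MSE on probabilities is bounded by 1; cross-entropy grows without bound
# as the predicted probability of the true class approaches 0.
wrong_mse = mse(np.array([1.0]), np.array([0.001]))
wrong_bce = binary_cross_entropy(np.array([1.0]), np.array([0.001]))

print(np.isclose(gap, const), np.isclose(bce, nll))
print(wrong_mse, wrong_bce)
```

The last comparison is one way to read "penalizes errors in an incompatible way": on probability outputs, MSE caps the penalty for being confidently wrong, while cross-entropy does not.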

lejlot answered Sep 17 '22 17:09