Making sure I am getting this right:
If we use sklearn.metrics.log_loss standalone, i.e. log_loss(y_true,y_pred), it generates a positive score -- the smaller the score, the better the performance.
However, if we use 'neg_log_loss' as the scoring scheme, e.g. in cross_val_score, the score is negative -- the higher the score, the better the performance.
And this is because the scoring scheme is built to be consistent with the other scoring schemes: since higher is generally better, the usual log_loss is negated to follow that convention, and it is done solely for that purpose. Is this understanding correct?
[Background: I got positive scores from metrics.log_loss and negative scores from 'neg_log_loss', and both refer to the same documentation page.]
sklearn.metrics.log_loss implements the error metric as it is typically defined, which, like most error metrics, is a positive number. It is a metric that is generally minimized (like mean squared error for regression), in contrast to metrics such as accuracy, which are maximized.
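A minimal sketch of the standalone metric, using made-up labels and predicted probabilities: the returned value is positive, and a smaller value indicates better predictions.

```python
from sklearn.metrics import log_loss

y_true = [0, 1, 1, 0]
y_pred_good = [0.1, 0.9, 0.8, 0.2]   # confident, mostly correct probabilities
y_pred_poor = [0.5, 0.5, 0.5, 0.5]   # uninformative probabilities

print(log_loss(y_true, y_pred_good))  # small positive value (~0.16)
print(log_loss(y_true, y_pred_poor))  # larger positive value (~0.69)
```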
neg_log_loss is hence a technicality that turns the metric into a utility value, which lets sklearn's optimizing functions and classes (for instance cross_val_score, GridSearchCV, RandomizedSearchCV, and others) always maximize the score without having to change their behavior for each metric.
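A minimal sketch of that behavior, assuming a synthetic dataset from make_classification and a LogisticRegression classifier: with scoring='neg_log_loss', cross_val_score returns the negated log loss, so the values are negative and higher (closer to zero) is better.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

scores = cross_val_score(clf, X, y, cv=5, scoring='neg_log_loss')
print(scores)          # negative values, e.g. [-0.28, -0.31, ...]
print(-scores.mean())  # negating recovers the usual (positive) log loss
```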