
Why does scikit-learn say F1 score is ill-defined with FN bigger than 0?

I run a Python program that calls functions from sklearn.metrics to calculate precision and F1 score. Here is the output when there are no predicted samples:

/xxx/py2-scikit-learn/0.15.2-comp6/lib/python2.6/site-packages/sklearn/metrics/metrics.py:1771: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples.
  'precision', 'predicted', average, warn_for)
/xxx/py2-scikit-learn/0.15.2-comp6/lib/python2.6/site-packages/sklearn/metrics/metrics.py:1771: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 due to no predicted samples.
  'precision', 'predicted', average, warn_for)

When there is no predicted sample, it means that TP+FP is 0, so

  • precision (defined as TP/(TP+FP)) is 0/0, not defined,
  • F1 score (defined as 2TP/(2TP+FP+FN)) is 0 if FN is not zero.

In my case, sklearn.metrics also returns the accuracy as 0.8, and recall as 0. So FN is not zero.

But why does scikit-learn say F1 is ill-defined?

What is the definition of F1 used by scikit-learn?

Tim asked Jan 13 '16 02:01




2 Answers

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/classification.py

F1 = 2 * (precision * recall) / (precision + recall)

precision = TP/(TP+FP). As you've just said, if the predictor doesn't predict the positive class at all, TP+FP is 0, and precision is set to 0 (as the warning states).

recall = TP/(TP+FN). If the predictor doesn't predict the positive class, TP is 0, so recall is 0.

So now you are dividing 0/0.
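To make the 0/0 concrete, here is a minimal sketch in plain Python. It is not scikit-learn's actual code, and the helper names are hypothetical. The counts TP=0, FP=0, FN=2, TN=8 are assumed for illustration; they were picked to match the accuracy of 0.8 and recall of 0 mentioned in the question.

```python
def f1_from_counts(tp, fp, fn):
    """F1 as 2TP / (2TP + FP + FN); None when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return None if denom == 0 else 2 * tp / denom


def f1_from_precision_recall(tp, fp, fn):
    """F1 as 2PR / (P + R); None when P + R is zero (the 0/0 case)."""
    # Mirror the warning: an undefined precision or recall is set to 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return None  # 2 * 0 * 0 / (0 + 0) is undefined
    return 2 * precision * recall / (precision + recall)


# Assumed counts: 10 samples, 2 positives, nothing predicted positive.
tp, fp, fn, tn = 0, 0, 2, 8
accuracy = (tp + tn) / (tp + fp + fn + tn)
count_based = f1_from_counts(tp, fp, fn)
pr_based = f1_from_precision_recall(tp, fp, fn)

print("accuracy:", accuracy)
print("F1 from counts:", count_based)
print("F1 from precision and recall:", pr_based)
```

With these counts, the count-based formula from the question gives 0.0, while the precision/recall form ends in 0/0, which is the situation the warning describes.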

Ibraim Ganiev answered Sep 24 '22 18:09



Precision, Recall, F1-score and Accuracy calculation

- In a given image of Dogs and Cats
   * Total Dogs - 12  (D = 12)
   * Total Cats - 8   (C = 8)
- Computer program predicts
   * Dogs - 8: 5 are actually Dogs (T.P = 5), 3 are not (F.P = 3)
   * Cats - 12: 6 are actually Cats (T.N = 6), 6 are not (F.N = 6)
- Calculation
   * Precision = T.P / (T.P + F.P) => 5 / (5 + 3) = 0.625
   * Recall    = T.P / D           => 5 / 12 ≈ 0.417
   * F1 = 2 * (Precision * Recall) / (Precision + Recall)
   * F1 = 0.5
   * Accuracy = (T.P + T.N) / (D + C) => (5 + 6) / (12 + 8)
   * Accuracy = 0.55
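The arithmetic above can be checked with a short Python sketch. The variable names are illustrative only, and the numbers are taken as stated in the example (recall is divided by the total number of dogs, D, as the answer does).

```python
# Values as given in the example above.
tp, fp, tn = 5, 3, 6              # true positives, false positives, true negatives
total_dogs, total_cats = 12, 8    # D and C

precision = tp / (tp + fp)                            # 5 / 8
recall = tp / total_dogs                              # 5 / 12
f1 = 2 * (precision * recall) / (precision + recall)
accuracy = (tp + tn) / (total_dogs + total_cats)      # 11 / 20

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f}")
```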

Wikipedia reference

Wazy answered Sep 22 '22 18:09
