I'm getting this weird error:
classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
but then it also prints the f-score the first time I run:
metrics.f1_score(y_test, y_pred, average='weighted')
The second time I run, it provides the score without error. Why is that?
>>> y_pred = test.predict(X_test)
>>> y_test
array([ 1, 10, 35, 9, 7, 29, 26, 3, 8, 23, 39, 11, 20, 2, 5, 23, 28, 30, 32, 18, 5, 34, 4, 25, 12, 24, 13, 21, 38, 19, 33, 33, 16, 20, 18, 27, 39, 20, 37, 17, 31, 29, 36, 7, 6, 24, 37, 22, 30, 0, 22, 11, 35, 30, 31, 14, 32, 21, 34, 38, 5, 11, 10, 6, 1, 14, 12, 36, 25, 8, 30, 3, 12, 7, 4, 10, 15, 12, 34, 25, 26, 29, 14, 37, 23, 12, 19, 19, 3, 2, 31, 30, 11, 2, 24, 19, 27, 22, 13, 6, 18, 20, 6, 34, 33, 2, 37, 17, 30, 24, 2, 36, 9, 36, 19, 33, 35, 0, 4, 1])
>>> y_pred
array([ 1, 10, 35, 7, 7, 29, 26, 3, 8, 23, 39, 11, 20, 4, 5, 23, 28, 30, 32, 18, 5, 39, 4, 25, 0, 24, 13, 21, 38, 19, 33, 33, 16, 20, 18, 27, 39, 20, 37, 17, 31, 29, 36, 7, 6, 24, 37, 22, 30, 0, 22, 11, 35, 30, 31, 14, 32, 21, 34, 38, 5, 11, 10, 6, 1, 14, 30, 36, 25, 8, 30, 3, 12, 7, 4, 10, 15, 12, 4, 22, 26, 29, 14, 37, 23, 12, 19, 19, 3, 25, 31, 30, 11, 25, 24, 19, 27, 22, 13, 6, 18, 20, 6, 39, 33, 9, 37, 17, 30, 24, 9, 36, 39, 36, 19, 33, 35, 0, 4, 1])
>>> metrics.f1_score(y_test, y_pred, average='weighted')
C:\Users\Michael\Miniconda3\envs\snowflakes\lib\site-packages\sklearn\metrics\classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
0.87282051282051276
>>> metrics.f1_score(y_test, y_pred, average='weighted')
0.87282051282051276
>>> metrics.f1_score(y_test, y_pred, average='weighted')
0.87282051282051276
Also, why is there a trailing 'precision', 'predicted', average, warn_for) line in the warning? There is no opening parenthesis, so why does it end with a closing one? I am running sklearn 0.18.1 with Python 3.6.0 in a conda environment on Windows 10.
I also looked here and I don't know if it's the same bug. This SO post doesn't have a solution either.
The F-score, also called the F1-score, is a measure of a model's accuracy on a dataset. It is used to evaluate binary classification systems, which classify examples into 'positive' or 'negative'.
The sklearn.metrics module provides functions for different metrics. Those functions can take a zero_division parameter, which sets the value to return when there is a zero division: "warn", 0, or 1 (default="warn"). If set to "warn", this acts as 0, but warnings are also raised.
The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative. The best value is 1 and the worst value is 0. Read more in the User Guide.
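As a quick illustration of that formula (a minimal sketch with made-up toy labels, not from the original post), you can compute precision by hand from the true-positive and false-positive counts and compare it to precision_score:

from sklearn.metrics import precision_score

y_true = [0, 0, 1, 1, 1]
y_hat  = [0, 1, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 1)  # true positives for label 1
fp = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 1)  # false positives for label 1
print(tp / (tp + fp))                   # 0.666..., computed by hand as tp / (tp + fp)
print(precision_score(y_true, y_hat))   # same value from scikit-learn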
As mentioned in the comments, some labels in y_test don't appear in y_pred. Specifically in this case, label '2' is never predicted:
>>> set(y_test) - set(y_pred)
{2}
This means that there is no F-score to calculate for this label, and thus the F-score for this case is considered to be 0.0. Since you requested an average of the score, you must take into account that a score of 0 was included in the calculation, and this is why scikit-learn is showing you that warning.
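To see where that 0.0 comes from, here is a minimal sketch using the y_test/y_pred arrays from the question (assuming labels 0-39 all appear, so index 2 of the result corresponds to label 2): asking for the per-label scores with average=None exposes the zero.

from sklearn import metrics

per_label = metrics.f1_score(y_test, y_pred, average=None)  # one F1 score per label
print(per_label[2])   # 0.0 -- label 2 has no predicted samples, so its F1 is set to 0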
This brings me to why you don't see the warning the second time. As I mentioned, this is a warning, which is treated differently from an error in Python. The default behavior in most environments is to show a specific warning only once. This behavior can be changed:
import warnings
warnings.filterwarnings('always')  # "error", "ignore", "always", "default", "module" or "once"
If you set this before importing the other modules, you will see the warning every time you run the code.
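For example, a minimal sketch (with made-up toy labels) where the filter is set before importing scikit-learn, so the warning is printed on every call instead of only the first:

import warnings
warnings.filterwarnings('always')   # show warnings every time they occur

from sklearn import metrics         # import after the filter is set

y_true = [0, 1, 2, 2]
y_hat  = [0, 1, 1, 1]               # label 2 is never predicted

for _ in range(3):
    # UndefinedMetricWarning is now printed on every iteration, not just the first
    print(metrics.f1_score(y_true, y_hat, average='weighted'))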
There is no way to avoid seeing this warning the first time, aside from setting warnings.filterwarnings('ignore'). What you can do is decide that you are not interested in the scores of labels that were not predicted, and then explicitly specify the labels you are interested in (which are the labels that were predicted at least once):
>>> metrics.f1_score(y_test, y_pred, average='weighted', labels=np.unique(y_pred))
0.91076923076923078
The warning will be gone.
The same problem also happened to me when I was training my classification model. The cause is what the warning message says: "labels with no predicted samples" lead to a zero division when computing the F1-score. I found another solution while reading the sklearn.metrics.f1_score docs; there is a note as follows:
When true positive + false positive == 0, precision is undefined; When true positive + false negative == 0, recall is undefined. In such cases, by default the metric will be set to 0, as will f-score, and UndefinedMetricWarning will be raised. This behavior can be modified with zero_division
The zero_division default value is "warn"; you can set it to 0 or 1 to avoid the UndefinedMetricWarning. It works for me. One more catch: when I first used zero_division, sklearn reported that no such keyword argument exists, because I was running scikit-learn 0.21.3, which does not support it yet. Just update sklearn to the latest version by running pip install scikit-learn -U
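For example, a minimal sketch (assuming a scikit-learn version recent enough to have zero_division, and made-up toy labels): passing zero_division=0 gives the same number as the default "warn" behaviour, just without emitting the warning.

from sklearn import metrics

y_true = [0, 1, 2, 2]
y_hat  = [0, 1, 1, 1]   # label 2 is never predicted

# Same score as with the default zero_division="warn", but no UndefinedMetricWarning
print(metrics.f1_score(y_true, y_hat, average='weighted', zero_division=0))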