I imported classification_report from sklearn.metrics and when I enter my np.arrays
as parameters I get the following error:
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 'precision', 'predicted', average, warn_for)
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. 'recall', 'true', average, warn_for)
Here is the code :
from sklearn.svm import SVC
from sklearn.metrics import classification_report

svclassifier_polynomial = SVC(kernel='poly', degree=7, C=5)
svclassifier_polynomial.fit(X_train, y_train)
y_pred = svclassifier_polynomial.predict(X_test)
poly = classification_report(y_test, y_pred)
When I was not using np.array in the past it worked just fine. Any ideas on how I can correct this?
Precision quantifies the fraction of positive-class predictions that actually belong to the positive class. Recall quantifies the fraction of all positive examples in the dataset that the model correctly predicts as positive. The F-measure provides a single score that balances the concerns of both precision and recall in one number.
The F-score, commonly reported as the F1-score (the harmonic mean of precision and recall), is a measure of a model's accuracy on a dataset. It is typically used to evaluate binary classification systems, which classify examples into 'positive' or 'negative'.
A classification report is used to measure the quality of predictions from a classification algorithm: how many predictions are correct and how many are wrong. More specifically, the counts of true positives, false positives, true negatives and false negatives are used to compute the metrics in a classification report, as shown below.
F1 score vs. accuracy: both of these metrics take class predictions as input, so you will have to adjust the decision threshold regardless of which one you choose. Remember that the F1 score balances precision and recall on the positive class, while accuracy looks at correctly classified observations, both positive and negative.
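To make the relationship between these metrics and the confusion-matrix counts concrete, here is a minimal sketch; the toy y_true / y_pred values below are made up purely for illustration:

from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

# toy binary example (labels invented for illustration only)
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

# counts: TN=1, FP=1, FN=1, TP=3
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)                           # 3 / 4 = 0.75
recall = tp / (tp + fn)                              # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)   # 0.75

# the sklearn helpers give the same numbers
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred), f1_score(y_true, y_pred))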
This is not an error, just a warning that not all your labels are included in your y_pred, i.e. there are some labels in your y_test that your classifier never predicts. For such a label, precision is 0/0 (there are no predicted samples at all), so it is ill-defined, and scikit-learn sets it to 0 while warning you about it.
Here is a simple reproducible example:
from sklearn.metrics import precision_score, f1_score, classification_report
y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'
precision_score(y_true, y_pred, average='macro')
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
0.16666666666666666
precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333
precision_score(y_true, y_pred, average=None)
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])
Exact same warnings are produced for f1_score (not shown).
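To see where the numbers above come from: the average=None call returns the per-class precisions, the macro average is simply their unweighted mean, and the micro average pools all predictions together (its denominator is the total number of predictions, which can never be zero, hence no warning). A quick check:

import numpy as np

per_class = np.array([0.5, 0.0, 0.0])  # precision_score(..., average=None) from above
print(per_class.mean())                # 0.1666... -> the 'macro' value
print(2 / 6)                           # total TP / total predictions -> 0.3333..., the 'micro' value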
Practically this only warns you that, in the classification_report, the respective values for labels with no predicted samples (here 2) will be set to 0:
print(classification_report(y_true, y_pred))
              precision    recall  f1-score   support

           0       0.50      1.00      0.67         2
           1       0.00      0.00      0.00         2
           2       0.00      0.00      0.00         2

   micro avg       0.33      0.33      0.33         6
   macro avg       0.17      0.33      0.22         6
weighted avg       0.17      0.33      0.22         6
[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
When I was not using np.array in the past it worked just fine
Highly doubtful, since in the example above I have used simple Python lists, and not NumPy arrays...
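If you simply want to silence this warning, newer scikit-learn versions (0.22 and later) accept a zero_division argument in classification_report and in the individual metric functions; the following sketch assumes such a version is installed:

from sklearn.metrics import classification_report

# zero_division=0 keeps the behaviour shown above (scores of 0 for labels
# that are never predicted) but suppresses the UndefinedMetricWarning
print(classification_report(y_true, y_pred, zero_division=0))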
It means that some labels are present only in the training data and some labels are present only in the test data. Run the following code to understand the distribution of your train and test labels.
from collections import Counter

# class distribution in the train and test splits
print(Counter(y_train))
print(Counter(y_test))
Use a stratified train_test_split to avoid the situation where some labels are present only in the test dataset.
It might have worked in the past simply because of the random splitting of the dataset. Hence, stratified splitting is always recommended; a minimal sketch is given below.
The first situation, where the classifier never predicts some of the labels at all, is more about model fine-tuning or the choice of model.
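Here is a sketch of the stratified split (the test_size and random_state values are arbitrary choices for illustration):

from sklearn.model_selection import train_test_split

# stratify=y keeps the class proportions identical in the train and test splits,
# so no label ends up only in the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)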