 

Classification Report - Precision and F-score are ill-defined

I imported classification_report from sklearn.metrics, and when I pass my np.arrays as arguments I get the following error:

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)

Here is the code :

from sklearn.svm import SVC
from sklearn.metrics import classification_report

svclassifier_polynomial = SVC(kernel='poly', degree=7, C=5)
svclassifier_polynomial.fit(X_train, y_train)
y_pred = svclassifier_polynomial.predict(X_test)

poly = classification_report(y_test, y_pred)

When I was not using np.array in the past it worked just fine. Any ideas on how I can correct this?

asked Jan 11 '19 by Andrew Zacharakis


People also ask

What is precision and F-score?

Precision quantifies the number of positive class predictions that actually belong to the positive class. Recall quantifies the number of positive class predictions made out of all positive examples in the dataset. F-Measure provides a single score that balances both the concerns of precision and recall in one number.
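For instance, a minimal sketch with made-up counts (the tp, fp, and fn values are hypothetical) illustrates these definitions:

tp, fp, fn = 8, 2, 4   # hypothetical true positives, false positives, false negatives

precision = tp / (tp + fp)   # 8 / 10 = 0.8 -- how many positive predictions were correct
recall = tp / (tp + fn)      # 8 / 12 ≈ 0.667 -- how many actual positives were found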

What is F-score in classification report?

The F-score, also called the F1-score, is a measure of a model's accuracy on a dataset. It is used to evaluate binary classification systems, which classify examples into 'positive' or 'negative'.
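Concretely, the F1-score is the harmonic mean of precision and recall; a minimal sketch, reusing the hypothetical values from above:

precision, recall = 0.8, 0.667   # hypothetical values

f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.727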

How do you define a classification report?

A classification report is used to measure the quality of predictions from a classification algorithm: how many predictions are correct and how many are not. More specifically, the counts of true positives, false positives, true negatives, and false negatives are used to compute the metrics of a classification report, as shown in the sketch below.
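As an illustration of those four counts, scikit-learn's confusion_matrix can extract them for a binary problem (the toy labels below are my own construction):

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0]   # toy ground-truth labels
y_pred = [0, 1, 1, 1, 0, 0]   # toy predictions

# in the binary case, .ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)   # 2 1 1 2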

What is the difference between F-score and accuracy?

Both metrics take class predictions as input, so you will have to adjust the decision threshold regardless of which one you choose. Remember that the F1 score balances precision and recall on the positive class, while accuracy looks at correctly classified observations, both positive and negative.
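A small sketch (toy data, my own construction) makes the difference tangible: on an imbalanced set, a degenerate classifier can score high accuracy yet zero F1:

from sklearn.metrics import accuracy_score, f1_score

y_true = [0] * 9 + [1]   # 9 negatives, 1 positive
y_pred = [0] * 10        # a classifier that always predicts 'negative'

print(accuracy_score(y_true, y_pred))   # 0.9 -- looks good
print(f1_score(y_true, y_pred))         # 0.0 -- the positive class is never found (and the same warning fires)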


2 Answers

This is not an error, just a warning that not all your labels are included in your y_pred, i.e. there are some labels in your y_test that your classifier never predicts.

Here is a simple reproducible example:

from sklearn.metrics import precision_score, f1_score, classification_report

y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'

precision_score(y_true, y_pred, average='macro') 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
0.16666666666666666

precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333

precision_score(y_true, y_pred, average=None) 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])

Exactly the same warnings are produced for f1_score.
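For instance, with the same y_true and y_pred as above, the macro-averaged F1 triggers the analogous warning (output sketched, wording may vary by version):

f1_score(y_true, y_pred, average='macro')
[...] UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
0.2222222222222222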

Practically, this only warns you that, in the classification_report, the respective values for labels with no predicted samples (here, label 2) will be set to 0:

print(classification_report(y_true, y_pred))


              precision    recall  f1-score   support

           0       0.50      1.00      0.67         2
           1       0.00      0.00      0.00         2
           2       0.00      0.00      0.00         2

   micro avg       0.33      0.33      0.33         6
   macro avg       0.17      0.33      0.22         6
weighted avg       0.17      0.33      0.22         6

[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)

When I was not using np.array in the past it worked just fine

Highly doubtful, since in the example above I have used simple Python lists, and not NumPy arrays...
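As an aside, and assuming a newer scikit-learn (0.22+), the zero_division parameter lets you make this behavior explicit and silence the warning; a minimal sketch with the question's y_test and y_pred:

from sklearn.metrics import classification_report

# zero_division=0 keeps the 0.0 convention but suppresses the UndefinedMetricWarning
poly = classification_report(y_test, y_pred, zero_division=0)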

answered Nov 09 '22 by desertnaut


It means that some labels are present only in the training data and some labels are present only in the test dataset. Run the following code to understand the distribution of the train and test labels.

from collections import Counter

print(Counter(y_train))   # class distribution in the training set
print(Counter(y_test))    # class distribution in the test set

Use a stratified train_test_split to avoid the situation where some labels are present only in the test dataset; see the sketch below.
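A minimal sketch of a stratified split (the toy X and y here stand in for your data):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # toy feature matrix
y = np.array([0] * 6 + [1] * 4)    # toy, imbalanced labels

# stratify=y keeps the class proportions identical in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)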

It might have worked in the past simply because of the random splitting of the dataset. Hence, stratified splitting is always recommended.

The first situation (the classifier never predicting some labels that do appear in y_test) is more about model fine-tuning or the choice of model.

answered Nov 09 '22 by Venkatachalam