I imported classification_report from sklearn.metrics and when I enter my np.arrays
as parameters I get the following error:
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 'precision', 'predicted', average, warn_for)
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. 'recall', 'true', average, warn_for)
Here is the code :
from sklearn.svm import SVC
from sklearn.metrics import classification_report

svclassifier_polynomial = SVC(kernel='poly', degree=7, C=5)
svclassifier_polynomial.fit(X_train, y_train)
y_pred = svclassifier_polynomial.predict(X_test)
poly = classification_report(y_test, y_pred)
When I was not using np.array in the past it worked just fine. Any ideas on how I can correct this?
Precision quantifies the fraction of positive-class predictions that actually belong to the positive class. Recall quantifies the fraction of all positive examples in the dataset that the model correctly predicts as positive. The F-measure provides a single score that balances the concerns of both precision and recall in one number.
The F-score, commonly reported as the F1-score (the harmonic mean of precision and recall), is a measure of a model's accuracy on a dataset. It is typically used to evaluate binary classification systems, which classify examples into 'positive' or 'negative'.
A classification report is used to measure the quality of predictions from a classification algorithm: how many predictions are correct and how many are wrong. More specifically, the counts of true positives, false positives, true negatives and false negatives are used to compute the metrics in a classification report, as shown below.
F1 score vs. accuracy: both of these metrics take class predictions as input, so you will have to adjust the decision threshold regardless of which one you choose. Remember that the F1 score balances precision and recall on the positive class, while accuracy looks at correctly classified observations, both positive and negative.
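To make the relationship between these metrics and the confusion-matrix counts concrete, here is a minimal sketch; the toy y_true / y_pred values below are made up purely for illustration:

from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

# toy binary example (labels invented for illustration only)
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

# counts: TN=1, FP=1, FN=1, TP=3
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)                           # 3 / 4 = 0.75
recall = tp / (tp + fn)                              # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)   # 0.75

# the sklearn helpers give the same numbers
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred), f1_score(y_true, y_pred))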
This is not an error, just a warning that not all your labels are included in your y_pred, i.e. there are some labels in your y_test that your classifier never predicts. For such a label, precision is 0/0 (there are no predicted samples at all), so it is ill-defined, and scikit-learn sets it to 0 while warning you about it.
Here is a simple reproducible example:
from sklearn.metrics import precision_score, f1_score, classification_report
y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'
precision_score(y_true, y_pred, average='macro')
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
0.16666666666666666
precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333
precision_score(y_true, y_pred, average=None)
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])
Exact same warnings are produced for f1_score (not shown).
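To see where the numbers above come from: the average=None call returns the per-class precisions, the macro average is simply their unweighted mean, and the micro average pools all predictions together (its denominator is the total number of predictions, which can never be zero, hence no warning). A quick check:

import numpy as np

per_class = np.array([0.5, 0.0, 0.0])  # precision_score(..., average=None) from above
print(per_class.mean())                # 0.1666... -> the 'macro' value
print(2 / 6)                           # total TP / total predictions -> 0.3333..., the 'micro' value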
Practically this only warns you that, in the classification_report, the respective values for labels with no predicted samples (here 2) will be set to 0:
print(classification_report(y_true, y_pred))
              precision    recall  f1-score   support

           0       0.50      1.00      0.67         2
           1       0.00      0.00      0.00         2
           2       0.00      0.00      0.00         2

   micro avg       0.33      0.33      0.33         6
   macro avg       0.17      0.33      0.22         6
weighted avg       0.17      0.33      0.22         6
[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
When I was not using np.array in the past it worked just fine
Highly doubtful, since in the example above I have used simple Python lists, and not NumPy arrays...
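If you simply want to silence this warning, newer scikit-learn versions (0.22 and later) accept a zero_division argument in classification_report and in the individual metric functions; the following sketch assumes such a version is installed:

from sklearn.metrics import classification_report

# zero_division=0 keeps the behaviour shown above (scores of 0 for labels
# that are never predicted) but suppresses the UndefinedMetricWarning
print(classification_report(y_true, y_pred, zero_division=0))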
It means that some labels are present only in the training data and some labels are present only in the test data. Run the following code to understand the distribution of your train and test labels.
from collections import Counter

# class distribution in the train and test splits
print(Counter(y_train))
print(Counter(y_test))
Use a stratified train_test_split to avoid the situation where some labels are present only in the test dataset.
It might have worked in the past simply because of the random splitting of the dataset. Hence, stratified splitting is always recommended; a minimal sketch is given below.
The first situation, where the classifier never predicts some of the labels at all, is more about model fine-tuning or the choice of model.
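Here is a sketch of the stratified split (the test_size and random_state values are arbitrary choices for illustration):

from sklearn.model_selection import train_test_split

# stratify=y keeps the class proportions identical in the train and test splits,
# so no label ends up only in the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)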