
sklearn metrics for multiclass classification

I have performed GaussianNB classification using sklearn. I tried to calculate the metrics using the following code:

print(accuracy_score(y_test, y_pred))
print(precision_score(y_test, y_pred))

The accuracy score is computed correctly, but the precision score calculation raises this error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

As the target is multiclass, how can I get metric scores such as precision and recall?

asked Aug 25 '17 22:08 by dino




1 Answer

The function call precision_score(y_test, y_pred) is equivalent to precision_score(y_test, y_pred, pos_label=1, average='binary'). The documentation (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html) tells us:

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

So the problem is that your labels are not binary: the error message says the target is multiclass, which means y_test contains more than two distinct class labels. Fortunately, there are other options which should work with your data:

precision_score(y_test, y_pred, average=None) will return the precision scores for each class, while

precision_score(y_test, y_pred, average='micro') will return the overall ratio tp/(tp + fp), with the true positives and false positives counted globally across all classes.
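As a minimal sketch of how these settings differ, the snippet below runs precision_score on a small made-up three-class example. The label arrays are hypothetical and chosen only for illustration. The 'macro' and 'weighted' settings, which are also accepted by the average parameter, are included for comparison.

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical 3-class labels, invented for illustration only.
y_test = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1, 0, 1, 1])

# One precision value per class, in sorted label order.
per_class = precision_score(y_test, y_pred, average=None)

# Global tp / (tp + fp), counted over all classes together.
micro = precision_score(y_test, y_pred, average="micro")

# Unweighted mean of the per-class precisions.
macro = precision_score(y_test, y_pred, average="macro")

# Mean of the per-class precisions, weighted by each class's support in y_test.
weighted = precision_score(y_test, y_pred, average="weighted")

print("per class:", per_class)
print("micro:    ", micro)
print("macro:    ", macro)
print("weighted: ", weighted)
```

recall_score and f1_score take the same average parameter, so the same pattern should carry over to them. For single-label multiclass data like this, the micro-averaged precision coincides with the accuracy.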

The pos_label argument is ignored if you choose an average option other than 'binary'.
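If it is unclear how scikit-learn interprets a label array, one way to check is type_of_target from sklearn.utils.multiclass, which reports the inferred target type. The sketch below uses made-up arrays (not the asker's data) to show what integer class labels, binary labels, and a one-hot matrix are classified as, and how a one-hot matrix could be converted back to class indices with argmax.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical label arrays, invented for illustration only.
binary_labels = np.array([0, 1, 1, 0])
int_labels = np.array([0, 1, 2, 1])
one_hot = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 1, 0]])

# Ask scikit-learn which target type it infers for each array.
kinds = {
    "binary_labels": type_of_target(binary_labels),
    "int_labels": type_of_target(int_labels),
    "one_hot": type_of_target(one_hot),
}
print(kinds)

# A one-hot matrix can be turned into class indices by taking the
# position of the 1 in each row.
class_indices = one_hot.argmax(axis=1)
print(class_indices)
```

An array reported as 'multiclass' is the case that triggers the error in the question when average is left at its default of 'binary'.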

answered Sep 28 '22 05:09 by ml4294