Why is the value of precision and recall almost the same as the precision and recall of the underrepresented class?

Question

I have a binary classification problem in which one of the classes is about one tenth the size of the other.

I am using sklearn to create a model and evaluate it. I am using these two functions:

from sklearn.metrics import precision_recall_fscore_support
print(precision_recall_fscore_support(y_real, y_pred))

out: 
(array([0.99549296, 0.90222222]), # precision of the first class and the second class
 array([0.98770263, 0.96208531]), # recall of the first class and the second class
 array([0.99158249, 0.93119266]), # F1 score of the first class and the second class
 array([1789,  211]))             # instances of the first class and the second class

This returns the precision, recall, F-score and support for each class.

from sklearn.metrics import precision_score, recall_score
print(precision_score(y_real, y_pred), recall_score(y_real, y_pred))

out:
0.90222222 , 0.96208531 # precision and recall of the model

This returns the precision and recall of the prediction.

Why do the precision and recall functions return exactly the same values as those of the class with fewer instances (here, the class with 211 instances)?

desertnaut · Accepted Answer

If you look closely at the documentation of both precision_score and recall_score, you will see two arguments: pos_label, with a default value of 1, and average, with a default value of 'binary':

pos_label : str or int, 1 by default

The class to report if average='binary' and the data is binary.

average : string, [None, ‘binary’ (default), ‘micro’, ‘macro’, ‘samples’, ‘weighted’]

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

In other words, as explained clearly in the docs, these two functions return respectively the precision and recall of one class only - the one designated with the label 1.

From what you show, it would seem that this class is what you call 'second class' here, and the results indeed are consistent with what you report.
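A minimal sketch of this behaviour, using hypothetical toy labels (not the data from the question) in which class 1 is the minority: with the defaults, precision_score and recall_score should match the entries for label 1 in the per-class output, and passing pos_label=0 should select the other class instead.

```python
from sklearn.metrics import (
    precision_recall_fscore_support,
    precision_score,
    recall_score,
)

# Hypothetical toy data: 8 samples of class 0 (majority), 2 of class 1 (minority).
y_real = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

# Per-class arrays, ordered by sorted label: index 0 -> class 0, index 1 -> class 1.
per_class_precision, per_class_recall, _, support = precision_recall_fscore_support(
    y_real, y_pred
)

# Defaults are average='binary' and pos_label=1, so only class 1 is reported.
p_default = precision_score(y_real, y_pred)
r_default = recall_score(y_real, y_pred)

# Asking explicitly for class 0 instead.
p_class0 = precision_score(y_real, y_pred, pos_label=0)
r_class0 = recall_score(y_real, y_pred, pos_label=0)

print("per class precision:", per_class_precision)
print("per class recall:   ", per_class_recall)
print("default (class 1):  ", p_default, r_default)
print("pos_label=0:        ", p_class0, r_class0)
```

Which class counts as "positive" is purely a matter of the pos_label argument; the functions do not look at class sizes.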

In contrast, the precision_recall_fscore_support function, according to the docs (emphasis mine):

Compute precision, recall, F-measure and support for each class

In other words, there is nothing strange or unexpected here; there is no "overall" precision and recall, and they are always by definition computed per class. Practically speaking, in imbalanced binary settings like this one, they are usually computed for the minority class only.
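To make the per-class definition concrete, here is a rough sketch that derives precision and recall for each class by hand from the confusion matrix and compares them with what scikit-learn reports. The labels are hypothetical toy data, and the variable names are chosen only for illustration.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical toy data: class 0 is the majority, class 1 the minority.
y_real = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

# Rows are true labels, columns are predicted labels.
# For binary labels {0, 1}, ravel() yields tn, fp, fn, tp with class 1 as "positive".
tn, fp, fn, tp = confusion_matrix(y_real, y_pred).ravel()

# Class 1 treated as the positive class.
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)

# Class 0 treated as the positive class: the roles of the cells swap.
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)

print("class 1:", precision_1, recall_1)
print("class 0:", precision_0, recall_0)
print("sklearn, pos_label=1:", precision_score(y_real, y_pred), recall_score(y_real, y_pred))
print(
    "sklearn, pos_label=0:",
    precision_score(y_real, y_pred, pos_label=0),
    recall_score(y_real, y_pred, pos_label=0),
)
```

The same four confusion-matrix cells give two different precision/recall pairs depending on which class is treated as positive, which is why a single pair of numbers always refers to one class.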

Tags:

precision

scikit-learn

precision-recall

imbalanced-data

Nima Dolatshad
