
Why is the value of precision and recall almost the same as the precision and recall of the underrepresented class?

I have a binary classification problem in which one of the classes is roughly 0.1 times the size of the other class.

I am using sklearn to create a model and evaluate it. I am using these two functions:

from sklearn.metrics import precision_recall_fscore_support
print(precision_recall_fscore_support(y_real, y_pred))

out: 
(array([0.99549296, 0.90222222]), # precision of the first class and the second class
 array([0.98770263, 0.96208531]), # recall of the first class and the second class
 array([0.99158249, 0.93119266]), # F1 score of the first class and the second class
 array([1789,  211]))             # instances of the first class and the second class

This returns the precision, recall, F-score and support for each class.

from sklearn.metrics import precision_score, recall_score
print(precision_score(y_real, y_pred), recall_score(y_real, y_pred))

out:
0.90222222 , 0.96208531 # precision and recall of the model

This returns the precision and recall of the prediction.

Why do the precision and recall functions return exactly the same values as those of the class with fewer instances (here, the class with 211 instances)?

Nima Dolatshad asked Dec 05 '25 18:12


1 Answer

Looking closely at the documentation of both precision_score and recall_score you will see two arguments - pos_label, with a default value of 1, and average, with a default value of 'binary':

pos_label : str or int, 1 by default

The class to report if average='binary' and the data is binary.

average : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

In other words, as explained clearly in the docs, these two functions return respectively the precision and recall of one class only - the one designated with the label 1.

From what you show, it would seem that this class is what you call 'second class' here, and the results indeed are consistent with what you report.
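To make the role of pos_label and average concrete, here is a minimal sketch. The labels are synthetic, not the asker's data: they are an assumption built so that the counts (1767 and 203 correct predictions, 22 and 8 errors) reproduce the numbers reported in the question. The variable names are illustrative only.

```python
import numpy as np
from sklearn.metrics import (
    precision_recall_fscore_support,
    precision_score,
    recall_score,
)

# Synthetic labels (an assumption, not the asker's data), chosen so that the
# per-class counts reproduce the output shown in the question:
# class 0: 1767 predicted as 0, 22 predicted as 1
# class 1: 8 predicted as 0, 203 predicted as 1
y_real = np.array([0] * 1789 + [1] * 211)
y_pred = np.array([0] * 1767 + [1] * 22 + [0] * 8 + [1] * 203)

# Per-class arrays: index 0 is class 0, index 1 is class 1.
per_class_precision, per_class_recall, per_class_f1, support = (
    precision_recall_fscore_support(y_real, y_pred)
)

# Defaults (average='binary', pos_label=1): metrics for class 1 only.
p_default = precision_score(y_real, y_pred)
r_default = recall_score(y_real, y_pred)

# pos_label=0 reports the other class instead.
p_class0 = precision_score(y_real, y_pred, pos_label=0)
r_class0 = recall_score(y_real, y_pred, pos_label=0)

# average=None returns one value per class, like the combined function.
p_all = precision_score(y_real, y_pred, average=None)

print(p_default, r_default)
print(p_class0, r_class0)
print(p_all)
```

With these labels, the default call matches the second entry of each per-class array, and pos_label=0 matches the first, which is the behaviour described above.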

In contrast, the precision_recall_fscore_support function, according to the docs (emphasis mine):

Compute precision, recall, F-measure and support for each class

In other words, there is nothing strange or unexpected here; there is no "overall" precision and recall, and they are always by definition computed per class. Practically speaking, in imbalanced binary settings like this one, they are usually computed for the minority class only.
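As a rough illustration of the per-class definition, the sketch below computes precision, recall and F1 by hand from a confusion matrix. The matrix is an assumption: it is the one implied by the support and scores reported in the question, not data supplied by the asker.

```python
import numpy as np

# Confusion matrix implied by the numbers in the question (an assumption).
# Rows are true classes, columns are predicted classes.
cm = np.array([[1767, 22],
               [8, 203]])

true_positives = np.diag(cm)

# Precision of class k: correct predictions of k / everything predicted as k.
precision = true_positives / cm.sum(axis=0)

# Recall of class k: correct predictions of k / everything that really is k.
recall = true_positives / cm.sum(axis=1)

# F1 is the harmonic mean of the two, again per class.
f1 = 2 * precision * recall / (precision + recall)

# Support is the number of true instances of each class.
support = cm.sum(axis=1)

print(precision, recall, f1, support)
```

Each quantity has one entry per class, and the second entry of precision and recall is the pair that precision_score and recall_score report with their default arguments.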

desertnaut answered Dec 09 '25 13:12


