
Good ROC curve but poor precision-recall curve

I have some machine learning results that I don't quite understand. I am using Python scikit-learn, with 2+ million samples of about 14 features each. The classification of 'ab' looks pretty bad on the precision-recall curve, but the ROC curve for 'ab' looks just as good as that of most other groups. What can explain that?


KubiK888 asked Oct 23 '15 03:10



People also ask

How do precision and recall relate to the ROC curve?

As a general rule, use ROC curves when there are roughly equal numbers of observations for each class, and use precision-recall curves when there is a moderate to large class imbalance.

What is the difference between ROC and precision-recall curve?

The main difference between the ROC and PR curves is that the former plots the false positive rate, FP/(FP+TN), whereas the latter plots precision, TP/(TP+FP). The two quantities behave very differently on imbalanced data.
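
To make the difference concrete, here is a small arithmetic sketch. The confusion-matrix counts are hypothetical, chosen only to illustrate a rare class (100 positives against 10,000 negatives); they are not taken from the question's data.

```python
# Hypothetical confusion-matrix counts for one rare class at one threshold:
# 100 actual positives, 10,000 actual negatives.
tp, fn = 90, 10        # positives caught / missed
fp, tn = 500, 9_500    # negatives wrongly flagged / correctly rejected

recall = tp / (tp + fn)      # true positive rate, the ROC y-axis
fpr = fp / (fp + tn)         # false positive rate, the ROC x-axis
precision = tp / (tp + fp)   # the PR y-axis

print(f"recall={recall:.2f}  FPR={fpr:.3f}  precision={precision:.3f}")
```

With these counts the ROC point looks strong (recall 0.90 at an FPR of 0.05), while precision is only about 0.15, because the 500 false positives outnumber the 90 true positives.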

What does a precision-recall curve tell you?

The precision-recall curve shows the tradeoff between precision and recall at different thresholds. A high area under the curve represents both high recall and high precision, where high precision means few false positives among the predicted positives, and high recall means few false negatives among the actual positives.

What can you say about the precision-recall (PR) curve?

A PR curve is simply a graph with precision values on the y-axis and recall values on the x-axis. In other words, the PR curve plots TP/(TP+FP) on the y-axis and TP/(TP+FN) on the x-axis. Note that precision is also called the positive predictive value (PPV).
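
As a minimal sketch of how those axis values arise, the points of a PR curve can be computed by lowering the decision threshold one sample at a time. The helper name `pr_points` and the toy labels and scores are hypothetical; the sketch assumes binary labels, at least one positive, and no tied scores. In scikit-learn the corresponding function is `sklearn.metrics.precision_recall_curve`.

```python
def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per threshold, highest score first.

    Assumes binary labels (1 = positive), at least one positive, distinct scores.
    """
    ranked = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    for _score, label in ranked:
        # Lowering the threshold to this score predicts one more sample positive.
        if label == 1:
            tp += 1
        else:
            fp += 1
        recall = tp / total_pos        # TP / (TP + FN), x-axis
        precision = tp / (tp + fp)     # TP / (TP + FP), y-axis
        points.append((recall, precision))
    return points


# Toy data, purely illustrative.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
for recall, precision in pr_points(labels, scores):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```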


1 Answer

Class imbalance.

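Unlike the ROC curve, PR curves are very sensitive to class imbalance. The false positive rate on the ROC x-axis is FP/(FP+TN), so when negatives vastly outnumber positives, even a large number of false positives barely moves it. Precision, TP/(TP+FP), is hit directly by those same false positives. If you optimize your classifier for good AUC on unbalanced data, you are likely to obtain poor precision-recall results.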
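
The effect can be reproduced with synthetic scores. The sketch below is an illustration under stated assumptions, not the asker's setup: positive and negative scores are drawn from two fixed Gaussians (a made-up choice), and only the number of negatives changes. The helpers `roc_auc` and `average_precision` are hand-rolled in plain Python and assume no tied scores; in scikit-learn the analogous functions are `roc_auc_score` and `average_precision_score` in `sklearn.metrics`.

```python
import random


def roc_auc(pos, neg):
    """ROC AUC via the rank-sum (Mann-Whitney) formula; assumes no tied scores."""
    ranked = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg])
    rank_sum = sum(rank for rank, (_s, y) in enumerate(ranked, start=1) if y == 1)
    n_pos, n_neg = len(pos), len(neg)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(pos, neg):
    """Mean of the precision measured at each true positive, best score first."""
    ranked = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg], reverse=True)
    tp, total = 0, 0.0
    for k, (_s, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / k
    return total / len(pos)


rng = random.Random(0)  # fixed seed so the run is repeatable
pos = [rng.gauss(1.5, 1.0) for _ in range(500)]            # scores of positives
neg_balanced = [rng.gauss(0.0, 1.0) for _ in range(500)]   # 1:1 class ratio
neg_imbalanced = [rng.gauss(0.0, 1.0) for _ in range(50_000)]  # 1:100 class ratio

auc_bal, ap_bal = roc_auc(pos, neg_balanced), average_precision(pos, neg_balanced)
auc_imb, ap_imb = roc_auc(pos, neg_imbalanced), average_precision(pos, neg_imbalanced)

print(f"balanced:   ROC AUC={auc_bal:.3f}  average precision={ap_bal:.3f}")
print(f"imbalanced: ROC AUC={auc_imb:.3f}  average precision={ap_imb:.3f}")
```

Because the score distributions are identical in both runs, the ROC AUC stays roughly the same, while the average precision (a summary of the PR curve) drops sharply once negatives outnumber positives 100 to 1.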

Calimo answered Oct 12 '22 21:10
