
Calculating the number of true positives from a precision-recall curve

Using the precision-recall graph below, where recall is on the x-axis and precision is on the y-axis, can I use this formula to calculate the number of predictions for a given precision/recall threshold?

[Image: precision-recall graph]

These calculations are based on the orange trend line.

Assume this model is a binary classifier that has been trained on 100 instances.

At a recall value of 0.2 there are (0.2 * 100) = 20 relevant instances. At a recall value of 0.2 the precision is 0.95, so the number of true positives is (20 * 0.95) = 19. Is this a correct method to calculate the number of true positives from a precision-recall graph?

blue-sky asked Mar 13 '19 16:03




1 Answer

I will argue that it is not possible to do it like this. For ease of calculation I will take a recall of 20%, a precision of 90% and 100 observations.

I can construct two result matrices that both produce those numbers. Here TP/TN denote Test Positive/Negative (the model's prediction) and CP/CN denote Condition Positive/Negative (the actual class):

     CP   CN
TP    9    1
TN   36   54

and

     CP   CN
TP   18    2
TN   72    8

Matrix 1 has 9 true positives, 1 false positive and 36 false negatives, resulting in a recall of 9 / (36 + 9) = 20% and a precision of 9 / (1 + 9) = 90%.

Matrix 2 has 18 true positives, 2 false positives and 72 false negatives, resulting in a recall of 18 / (72 + 18) = 20% and a precision of 18 / (2 + 18) = 90%.

Since I can produce two matrices with different numbers of true positives but the same recall and precision, the graph alone does not give enough information to recover the number of true positives.
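As a quick check of the argument above, here is a minimal Python sketch that recomputes precision and recall for both matrices. The function names `precision_recall` and `true_positives_from_recall` are made up for illustration. The second function relies on the definition recall = true positives / condition positives, so it only works if you know how many actual positives there are, which the curve by itself does not tell you.

```python
from fractions import Fraction


def precision_recall(tp, fp, fn):
    """Return (precision, recall) as exact fractions from confusion-matrix counts."""
    precision = Fraction(tp, tp + fp)
    recall = Fraction(tp, tp + fn)
    return precision, recall


def true_positives_from_recall(recall, condition_positives):
    """Hypothetical helper: recover true positives when the number of
    actual positives is known (recall = TP / condition positives)."""
    return recall * condition_positives


# Counts taken from the two matrices in this answer.
pr_1 = precision_recall(tp=9, fp=1, fn=36)
pr_2 = precision_recall(tp=18, fp=2, fn=72)

# Same point on the precision-recall curve, different true-positive counts.
print(pr_1 == pr_2)  # True

# With the number of condition positives known, the count is recoverable.
tp_1 = true_positives_from_recall(pr_1[1], 9 + 36)   # matrix 1 has 45 condition positives
tp_2 = true_positives_from_recall(pr_2[1], 18 + 72)  # matrix 2 has 90 condition positives
print(tp_1, tp_2)
```

So the calculation in the question would need the number of actual positives in the data, not the total number of instances, as the factor multiplied by recall.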

CIAndrews answered Sep 29 '22 03:09