 

How to combine False positives and false negatives into one single measure

I'm trying to measure the performance of a computer vision program that detects objects in video. I have 3 different versions of the program, each with different parameters. I've benchmarked each of these versions and got 3 pairs of (false positive percentage, false negative percentage).

Now I want to compare the versions with each other, and I wonder whether it makes sense to combine false positives and false negatives into a single value and use that to make the comparison. For example, take the ratio falsePositives/falseNegatives and see which version's value is smaller.

dnul asked Jul 23 '10


People also ask

How do you calculate false positives and false negatives?

The false positive rate is calculated as FP / (FP + TN), where FP is the number of false positives and TN is the number of true negatives (FP + TN being the total number of negatives). It's the probability that a false alarm will be raised: that a positive result will be given when the true value is negative.
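
For concreteness, here is a minimal sketch (not from the original page; the confusion-matrix counts are made up for illustration) computing both error rates:

    # Minimal sketch: false positive rate and false negative rate from
    # hypothetical confusion-matrix counts (values are illustrative only).
    def error_rates(tp, fp, tn, fn):
        fpr = fp / (fp + tn)   # fraction of actual negatives flagged as positive
        fnr = fn / (fn + tp)   # fraction of actual positives that were missed
        return fpr, fnr

    print(error_rates(tp=80, fp=10, tn=90, fn=20))  # (0.1, 0.2)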

Which performance metric considers both false positive and false negative?

If the costs of false positives and false negatives differ, then F1 is your savior. F1 is best if you have an uneven class distribution. Precision is how sure you are of your true positives, whilst recall is how sure you are that you are not missing any positives.

How can you reduce false positives in binary classification?

To minimize the number of false negatives (FN) or false positives (FP), we can also retrain a model on the same data with slightly different output values, tailored to its previous results. This method involves taking a model and training it on a dataset until it reaches a global minimum.

Which is more important false positives or false negatives?

Depending on the desired test result, either kind of error can be considered bad. For example, in a test for COVID you want a negative result. Although a positive result is bad news, a false negative is the worst outcome.


4 Answers

In addition to the popular Area Under the ROC Curve (AUC) measure mentioned by @alchemist-al, there's a score that combines both precision and recall (which are defined in terms of TP/FP/TN/FN) called the F-measure that goes from 0 to 1 (0 being the worst, 1 the best):

F-measure = 2*precision*recall / (precision+recall)

where

precision = TP / (TP + FP),  recall = TP / (TP + FN)
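
As a rough illustration (the TP/FP/FN counts below are invented, not from the question), the F-measure can be computed per version and used to rank them; the beta parameter of the more general F-beta score lets you weight recall (fewer false negatives) against precision (fewer false positives):

    # Illustrative sketch: precision, recall and F-measure from hypothetical
    # TP/FP/FN counts for three detector versions (counts are made up).
    def f_measure(tp, fp, fn, beta=1.0):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        # F_beta: beta > 1 favours recall (penalises false negatives more),
        # beta < 1 favours precision (penalises false positives more).
        return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

    versions = {"v1": (80, 10, 20), "v2": (75, 5, 25), "v3": (90, 30, 10)}
    for name, (tp, fp, fn) in versions.items():
        print(name, round(f_measure(tp, fp, fn), 3))
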
Amro answered Oct 14 '22


A couple of other possible solutions:

- Your false-positive rate (fp) and false-negative rate (fn) may depend on a threshold. If you plot the curve where the y-value is (1 - fn) and the x-value is fp, you'll be plotting the Receiver Operating Characteristic (ROC) curve. The Area Under the ROC Curve (AUC) is one popular measure of quality.

- AUC can be weighted if there are certain regions of interest.

- Report the Equal-Error Rate (EER): for some threshold, fp = fn; report that value. (See the sketch after this list.)
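
A small sketch of both ideas, assuming the detector exposes a score threshold and that fp and fn can be measured at each setting (the rates below are invented for illustration, not measured):

    # Sketch: ROC curve, AUC, and equal-error rate from a hypothetical
    # threshold sweep (fp/fn values below are invented, not real data).
    import numpy as np

    thresholds = np.linspace(0.0, 1.0, 11)
    fp = np.array([1.0, 0.8, 0.6, 0.45, 0.3, 0.2, 0.12, 0.07, 0.03, 0.01, 0.0])
    fn = np.array([0.0, 0.02, 0.05, 0.08, 0.12, 0.2, 0.3, 0.45, 0.6, 0.8, 1.0])

    tp = 1.0 - fn                      # true-positive rate (y-axis of the ROC)
    order = np.argsort(fp)             # sort by x-value before integrating
    auc = np.trapz(tp[order], fp[order])
    print("AUC ~", round(auc, 3))

    # Equal-error rate: the point where the two error rates cross.
    i = np.argmin(np.abs(fp - fn))
    print("EER ~", round((fp[i] + fn[i]) / 2, 3), "at threshold", thresholds[i])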

user402675 answered Oct 14 '22


It depends on how much detail you want in the comparison.

Combining the two figures will give you an overall sense of the error margin but no insight into what sort of error it is, so if you just want to know which version is "more correct" in an overall sense, then it's fine.

If, on the other hand, you actually want to use the results for a more in-depth determination of whether the process is suited to a particular problem, then I would imagine keeping them separate is a good idea. e.g. Sometimes false negatives are a very different problem from false positives in a real-world setting. Did the robot just avoid an object that wasn't there... or fail to notice it was heading off the side of a cliff?

In short, there's no hard and fast global rule for determining how effective a vision system is based on one super calculation. What you're planning to do with the information is the important bit.

lzcd answered Oct 14 '22


You need to factor in how "important" false positives are relative to false negatives.

For example, if your program is designed to recognise people's faces, then both false positives and false negatives are equally harmless and you can probably just combine them linearly.

But if your program was designed to detect bombs, then false positives aren't a huge deal (i.e. saying "this is a bomb" when it's actually not), but false negatives (that is, saying "this isn't a bomb" when it actually is) would be catastrophic.
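
One simple way to express that trade-off (a sketch with made-up weights and rates, not part of the original answer) is a cost-weighted linear combination of the two error rates:

    # Sketch: cost-weighted combination of error rates (weights are hypothetical).
    def weighted_error(fp_rate, fn_rate, fp_cost=1.0, fn_cost=1.0):
        # Lower is better; raise fn_cost when missed detections are catastrophic.
        return fp_cost * fp_rate + fn_cost * fn_rate

    print(weighted_error(0.10, 0.15))                 # face tagging: 0.25
    print(weighted_error(0.10, 0.15, fn_cost=100.0))  # bomb detection: 15.1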

Dean Harding answered Oct 14 '22