 

Why do we use the mAP score to evaluate object detectors in deep learning?

In the TensorFlow detection model zoo, a COCO mAP score is listed for each detection architecture, and it is said that the higher the mAP score, the higher the accuracy. What I don't understand is: how is this score calculated? What is the maximum value it can take? And why does the mAP score differ from data set to data set?

asked Sep 07 '17 by Shamane Siriwardhana


1 Answer

To understand MAP (Mean Average Precision), I would start with AP (Average Precision) first.

Suppose we are searching for images of a flower and we give our image retrieval system a sample picture of a rose (the query). The system returns a ranked list of images (from most likely to least likely), and usually not all of them are correct. So we compute the precision at every correctly returned image and then take the average.


Example:

If our returned result is 1, 0, 0, 1, 1, 1, where 1 means the image is a flower and 0 means it is not, then the precision at each correct point is:

Precision at each correct image = 1/1, 2/4, 3/5, 4/6 (the zeros contribute nothing, since precision is only taken at correct images)
Summation of these precisions = 1/1 + 2/4 + 3/5 + 4/6 = 83/30
Average Precision = (precision summation)/(total correct images) = (83/30)/4 = 83/120 ≈ 0.69

Side note:

This section provides a detailed explanation behind the calculation of precision at each correct image in case you're still confused by the above fractions.

For illustration purposes, let 1, 0, 0, 1, 1, 1 be stored in an array so results[0] = 1, results[1] = 0 etc.

Let totalCorrectImages = 0, totalImagesSeen = 0, pointPrecision = 0

The formula for pointPrecision is totalCorrectImages / totalImagesSeen

At results[0], totalCorrectImages = 1, totalImagesSeen = 1 hence pointPrecision = 1

Since results[1] != 1, we ignore it but totalImagesSeen = 2 && totalCorrectImages = 1

Since results[2] != 1, totalImagesSeen = 3 && totalCorrectImages = 1

At results[3], totalCorrectImages = 2, totalImagesSeen = 4 hence pointPrecision = 2/4

At results[4], totalCorrectImages = 3, totalImagesSeen = 5 hence pointPrecision = 3/5

At results[5], totalCorrectImages = 4, totalImagesSeen = 6 hence pointPrecision = 4/6
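
In code, the walkthrough above boils down to a few lines. Here is a minimal Python sketch (average_precision is just an illustrative name, not a library function):

```python
def average_precision(results):
    # results: ranked binary labels, 1 = correct image, 0 = incorrect
    total_correct = 0
    precision_sum = 0.0
    for total_seen, label in enumerate(results, start=1):
        if label == 1:
            total_correct += 1
            precision_sum += total_correct / total_seen  # pointPrecision
    return precision_sum / total_correct if total_correct else 0.0

print(average_precision([1, 0, 0, 1, 1, 1]))  # 0.6916... = 83/120
```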


A simple way to interpret an AP value is to produce a combination of zeros and ones that would give that AP. For example, an AP of 0.5 could come from results like 0, 1, 0, 1, 0, 1, ... where every second image is correct, while an AP of 0.333 corresponds to 0, 0, 1, 0, 0, 1, 0, 0, 1, ... where every third image is correct.

For an AP of 0.1, only every 10th image is correct, which is definitely a bad retrieval system. On the other hand, for an AP above 0.5, we encounter more correct images than incorrect ones among the top results, which is definitely a good sign.

MAP is just an extension of AP: you simply take the average of the AP scores over a number of queries. The above interpretation of AP scores holds for MAP as well. MAP ranges from 0 to 1 (often reported as a percentage, 0 to 100); higher is better.
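
As a sketch of that extension (reusing the average_precision helper from the snippet above; the second query's results are made up for illustration):

```python
def mean_average_precision(results_per_query):
    # mAP: the mean of the per-query AP scores
    aps = [average_precision(results) for results in results_per_query]
    return sum(aps) / len(aps)

queries = [
    [1, 0, 0, 1, 1, 1],  # AP = 83/120 ≈ 0.692
    [0, 1, 0, 1, 0, 0],  # AP = (1/2 + 2/4) / 2 = 0.5
]
print(mean_average_precision(queries))  # ≈ 0.596
```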

AP formula on Wikipedia

MAP formula on Wikipedia

Credits to this blog

EDIT I:

The same concept applies to object detection. In this scenario you calculate the AP for each class, which is given by the area under that class's precision-recall curve. You then average the per-class APs to obtain the mAP.
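
To make the detection case concrete, here is a rough NumPy sketch, assuming the matching of detections to ground-truth boxes (by an IoU threshold) has already been done. It computes a plain, non-interpolated area under the precision-recall curve; the actual VOC/COCO evaluations add interpolation and other details, and the function names below are made up:

```python
import numpy as np

def detection_ap(scores, is_true_positive, num_ground_truth):
    # AP for one class: area under its precision-recall curve
    order = np.argsort(-np.asarray(scores))                # sort detections by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)
    recall = cum_tp / num_ground_truth
    # accumulate precision over each increment in recall
    ap = recall[0] * precision[0]
    ap += np.sum((recall[1:] - recall[:-1]) * precision[1:])
    return float(ap)

def detection_map(per_class_inputs):
    # mAP: the mean of the per-class APs
    return sum(detection_ap(*c) for c in per_class_inputs) / len(per_class_inputs)

# one class, three detections (already IoU-matched), two ground-truth boxes
print(detection_ap([0.9, 0.8, 0.7], [1, 0, 1], num_ground_truth=2))  # ≈ 0.833
```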

For more details, refer to sections 3.4.1 and 4.4 of the 2012 Pascal VOC Dev Kit. The related paper can be found here.

answered Oct 12 '22 by eshirima