 

Understanding COCO evaluation "maximum detections"

I started using the cocoapi to evaluate a model trained using the Object Detection API. After reading various sources that explain mean average precision (mAP) and recall, I am confused about the "maximum detections" parameter used in the cocoapi.

From what I understood (e.g. here, here or here), one calculates mAP by computing precision and recall at various model score thresholds. This gives the precision-recall curve, and mAP is calculated as an approximation of the area under this curve or, expressed differently, as the average of the maximum precision at defined recall levels (0:0.1:1).
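For illustration, this is what I mean by that definition. A minimal sketch in my own words (not code from the cocoapi): sort the detections by score, sweep the threshold, and average the best precision reachable at each of the recall levels 0, 0.1, ..., 1.0:

import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """Illustrative 11-point AP over one class."""
    order = np.argsort(-np.asarray(scores, dtype=float))   # highest score first
    tp = np.asarray(is_true_positive, dtype=float)[order]
    tp_cum = np.cumsum(tp)            # true positives among the top-k detections
    fp_cum = np.cumsum(1.0 - tp)      # false positives among the top-k detections
    precision = tp_cum / (tp_cum + fp_cum)
    recall = tp_cum / num_gt
    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):          # recall levels 0:0.1:1
        candidates = precision[recall >= r]      # best precision at recall >= r
        ap += (candidates.max() if candidates.size else 0.0) / 11.0
    return ap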

However, the cocoapi seems to calculate precision and recall only for a given number of highest-scoring detections (maxDets), and from there derives the precision-recall curve for maxDets = 1, 10, 100. Why is this a good metric? It is clearly not the same as the method above and potentially excludes data points.
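As far as I can tell from cocoeval.py, the per-image evaluation sorts the detections by score and keeps only the top maxDet of them before matching them to the ground truth. A simplified sketch of that step (my paraphrase, not the actual library code):

import numpy as np

def keep_top_detections(detections, max_det):
    # Sort this image's detections by descending score and keep only the
    # first max_det of them; only these take part in the matching.
    order = np.argsort([-d['score'] for d in detections], kind='mergesort')
    return [detections[i] for i in order[:max_det]]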

In my example, I have ~ 3000 objects per image. Evaluating the result using the cocoapi gives terrible recall because it limits the number of detected objects to 100.
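To put numbers on that: even if every one of the ~3000 objects were detected perfectly, only 100 detections per image can count, so recall is bounded by 100/3000. A quick back-of-the-envelope check:

num_gt_per_image = 3000   # objects per image in my data
max_dets = 100            # default cocoapi cap
recall_ceiling = min(max_dets, num_gt_per_image) / num_gt_per_image
print(recall_ceiling)     # ~0.033, even with perfect detections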

For testing purposes, I fed the evaluation dataset in as both the ground truth and the detected objects (with some artificial scores). I would therefore expect precision and recall to be very good, which is indeed what happens. But as soon as I feed in more than 100 objects, precision and recall go down as the number of "detected objects" increases, even though they are all "correct". How does that make sense?

mincos asked Oct 16 '18




2 Answers

You can change the maxDets parameter and define a new summarize() instance method.

Let's create a COCOeval object:

from pycocotools.cocoeval import COCOeval

# cocoGt / cocoDt are the ground-truth and detection COCO objects,
# annType is e.g. 'bbox', and imgIdsDt lists the image ids to evaluate.
cocoEval = COCOeval(cocoGt, cocoDt, annType)
cocoEval.params.maxDets = [200]      # raise the per-image detection cap
cocoEval.params.imgIds  = imgIdsDt
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize_2()               # instead of calling cocoEval.summarize()
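Note that params.maxDets needs to be set before evaluate() is called: that is where the detections are cut down to the top maxDets per image, so changing it only before summarize() will not have the intended effect.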

Now, define a summarize_2() method in the cocoeval.py module in the following way:

def summarize_2(self):
    # Copy everything from the original `summarize()` method here,
    # except its inner `_summarizeDets()` function, which is replaced below.
    def _summarizeDets():
        stats = np.zeros((12,))
        stats[0] = _summarize(1, maxDets=self.params.maxDets[0])
        stats[1] = _summarize(1, iouThr=.5, maxDets=self.params.maxDets[0])
        stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[0])
        stats[3] = _summarize(1, areaRng='small', maxDets=self.params.maxDets[0])
        stats[4] = _summarize(1, areaRng='medium', maxDets=self.params.maxDets[0])
        stats[5] = _summarize(1, areaRng='large', maxDets=self.params.maxDets[0])
        stats[6] = _summarize(0, maxDets=self.params.maxDets[0])
        # stats[7] and stats[8] stay at zero: with a single value in
        # params.maxDets there is only one AR@maxDets entry to report.
        stats[9] = _summarize(0, areaRng='small', maxDets=self.params.maxDets[0])
        stats[10] = _summarize(0, areaRng='medium', maxDets=self.params.maxDets[0])
        stats[11] = _summarize(0, areaRng='large', maxDets=self.params.maxDets[0])
        return stats
    # Copy the rest of the original `summarize()` body here unchanged.

If you run the above method over your dataset, you will get an output similar to this:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=200 ] = 0.507
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=200 ] = 0.699
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=200 ] = 0.575
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=200 ] = 0.586
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=200 ] = 0.519
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=200 ] = 0.501
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=200 ] = 0.598
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=200 ] = 0.640
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=200 ] = 0.566
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=200 ] = 0.564
ashkan answered Oct 07 '22


I came to the conclusion that this is simply the way the cocoapi defines its metric. It probably makes sense in their context, but one can just as well define one's own metric (which is what I did), based on the articles I read and linked above.
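For reference, here is a minimal sketch of the kind of metric I mean (my own illustrative code, not from the cocoapi; the IoU threshold and function names are just placeholders): greedy IoU matching over all detections, with no cap on their number:

import numpy as np

def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(gt_boxes, det_boxes, det_scores, iou_thr=0.5):
    """Precision/recall over ALL detections (no maxDets cap): match each
    detection, in descending score order, to the best still-unmatched
    ground-truth box with IoU >= iou_thr."""
    order = np.argsort(-np.asarray(det_scores, dtype=float))
    matched = [False] * len(gt_boxes)
    tp = fp = 0
    for i in order:
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gt_boxes):
            if not matched[j]:
                o = iou(det_boxes[i], gt)
                if o > best_iou:
                    best_iou, best_j = o, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True
            tp += 1
        else:
            fp += 1
    precision = tp / max(tp + fp, 1)
    recall = tp / max(len(gt_boxes), 1)
    return precision, recall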

mincos answered Oct 07 '22