 

What are the outputs of the Object Detection API of Tensorflow?

I used Tensorflow's Object Detection API found in https://github.com/tensorflow/models/tree/master/research/object_detection. I used summarize_graph and verified that the outputs are detection_boxes, detection_scores, detection_classes, and num_detections.

What are these? Which of these contains the coordinates of the detection box of the detected objects?

I printed the shape of each output and got the following:

  • detection_boxes.shape = (1, 300, 4)
  • detection_scores.shape = (1, 300)
  • detection_classes.shape = (1, 300)
  • num_detections.shape = (1,)

This was for a single test image containing 8 playing cards. The classes considered were the card values A, 2, 3, 4, 5, and 6.

Chaine asked Dec 31 '25 21:12



1 Answer

They represent exactly what the names suggest:

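detection_boxes: the coordinates of the predicted bounding boxes, so this is the output you are looking for. Each box is given as [ymin, xmin, ymax, xmax], normalized to the range [0, 1] relative to the image height and width; multiply by the image size to get pixel coordinates. A shape of (1, 300, 4) means 1 image, up to 300 boxes, and 4 coordinates per box.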

detection_scores: the confidence score of each prediction. For example, a score of 0.69 means the model is 69% sure that the box contains an A card.

detection_classes: the class ID predicted for each box, returned as a float. The IDs correspond to the entries of the label map used for training.

num_detections: the number of valid detections for the image. The other outputs are padded to a fixed length (300 here), so only the first num_detections entries of each are meaningful.
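
As a rough illustration of how the four outputs fit together, here is a minimal sketch that decodes them for a single image. The function name decode_detections, the min_score cutoff, and the sample values are hypothetical and only for illustration; it assumes the boxes are normalized [ymin, xmin, ymax, xmax], and it relies on indexing alone, so it should work on the NumPy arrays returned by the model as well as on plain lists.

```python
def decode_detections(boxes, scores, classes, num_detections,
                      image_width, image_height, min_score=0.5):
    """Turn the four raw outputs for a batch of one image into a list of dicts.

    Assumes each box is [ymin, xmin, ymax, xmax], normalized to [0, 1].
    min_score is an illustrative cutoff, not a value taken from the API.
    """
    count = int(num_detections[0])  # valid entries; the rest is padding
    results = []
    for i in range(count):
        score = float(scores[0][i])
        if score < min_score:
            continue  # skip low-confidence predictions
        ymin, xmin, ymax, xmax = (float(v) for v in boxes[0][i])
        results.append({
            "class_id": int(classes[0][i]),  # class IDs arrive as floats
            "score": score,
            # Scale to pixels, reordered as (left, top, right, bottom).
            "box_pixels": (
                round(xmin * image_width),
                round(ymin * image_height),
                round(xmax * image_width),
                round(ymax * image_height),
            ),
        })
    return results


# Made-up outputs: two valid detections followed by one padded row.
boxes = [[[0.1, 0.2, 0.5, 0.6], [0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.0, 0.0]]]
scores = [[0.9, 0.3, 0.0]]
classes = [[1.0, 4.0, 0.0]]
num_detections = [2.0]

detections = decode_detections(boxes, scores, classes, num_detections,
                               image_width=640, image_height=480)
print(detections)
```

With these sample values only the first detection passes the 0.5 cutoff. In your case you would pass in the arrays you got from running the graph, together with the real width and height of your test image.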

Vitor França answered Jan 03 '26 11:01



