I used TensorFlow's Object Detection API, found at https://github.com/tensorflow/models/tree/master/research/object_detection. I ran summarize_graph and verified that the outputs are detection_boxes, detection_scores, detection_classes, and num_detections.
What are these? Which of these contains the coordinates of the detection box of the detected objects?
I printed the shape of each output when testing on one image that contains 8 playing cards:
detection_boxes.shape = (1, 300, 4)
detection_scores.shape = (1, 300)
detection_classes.shape = (1, 300)
num_detections.shape = (1,)
The classes considered were the card values A, 2, 3, 4, 5, and 6.
They represent exactly what the names suggest:
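detection_boxes: the bounding-box coordinates of the predicted objects. In the Object Detection API each box is given as [ymin, xmin, ymax, xmax], normalized to the range 0 to 1 relative to the image height and width, so multiply by the image size to get pixel coordinates.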
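detection_scores: the confidence score of each prediction, e.g., the model is 69% sure that a given box contains an A card.
detection_classes: the class label (a numeric ID from the label map) of each prediction.
num_detections: the number of valid detections the model returned for the image after its post-processing (including any score threshold). The other outputs are padded to a fixed length (300 here), so only the first num_detections entries are meaningful.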
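As a rough illustration of how the four outputs fit together, here is a minimal Python sketch that turns them into pixel-space boxes for the first image in the batch. The function name decode_detections, the min_score cutoff, and the sample values are all hypothetical and not part of the API; the sketch assumes the boxes are normalized [ymin, xmin, ymax, xmax] and that only the first num_detections entries are valid. It indexes the inputs generically, so it should work with NumPy arrays as well as nested lists.

```python
def decode_detections(boxes, scores, classes, num_detections,
                      image_width, image_height, min_score=0.5):
    """Convert batched detector outputs for the first image into pixel boxes."""
    results = []
    count = int(num_detections[0])  # valid entries; the rest is padding
    for i in range(count):
        score = float(scores[0][i])
        if score < min_score:  # hypothetical cutoff chosen by the caller
            continue
        # Assumed order: normalized [ymin, xmin, ymax, xmax]
        ymin, xmin, ymax, xmax = (float(v) for v in boxes[0][i])
        results.append({
            "class_id": int(classes[0][i]),
            "score": score,
            # Reordered to (left, top, right, bottom) in pixels
            "box_pixels": (
                round(xmin * image_width),
                round(ymin * image_height),
                round(xmax * image_width),
                round(ymax * image_height),
            ),
        })
    return results


# Made-up outputs for one image: 3 padded slots, 2 valid detections.
boxes = [[[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]]
scores = [[0.9, 0.3, 0.0]]
classes = [[1.0, 4.0, 0.0]]
num_detections = [2.0]

detections = decode_detections(boxes, scores, classes, num_detections,
                               image_width=640, image_height=480)
for d in detections:
    print(d["class_id"], d["score"], d["box_pixels"])
```

With the sample values above, only the first detection passes the 0.5 cutoff. To map class_id back to a card name, look it up in the label map used for training.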