I am currently testing the YOLO9000 model for object detection. From the paper I understand that the image is split into a 13x13 grid of cells and that P(Object) is calculated for each cell. But how is that calculated? How can the model know whether there is an object in a given cell or not? I need help understanding this.
I am using TensorFlow.
Thanks,
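The network is trained to predict a confidence score = P(object) * IOU, where IOU is the intersection over union between the predicted box and the ground-truth box. For the grid cell that contains a ground-truth box, the target P(object) is 1, and for all the other grid cells the target P(object) is 0. So you are training the network to tell you whether there is an object at each grid location: the target output is 0 where there is no object, and where an object is present it is the IOU, which reaches 1 when the predicted box matches the ground truth exactly. At test time no ground truth is needed, because the model has learned to output this score directly from the image features at that location.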
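To make the training targets concrete, here is a minimal sketch in plain Python (not TensorFlow, and not the authors' code). It assumes a 416x416 input, boxes given as (x_min, y_min, x_max, y_max) in pixels, and that the cell containing the centre of a ground-truth box is the one that gets P(object) = 1. The function names `iou` and `objectness_targets` are made up for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def objectness_targets(gt_boxes, image_size=416, grid=13):
    """Build a grid x grid table of target P(object) values.

    Assumption: the cell containing the centre of a ground-truth box
    gets 1.0, every other cell stays at 0.0.
    """
    cell = image_size / grid
    targets = [[0.0] * grid for _ in range(grid)]
    for x_min, y_min, x_max, y_max in gt_boxes:
        cx = (x_min + x_max) / 2
        cy = (y_min + y_max) / 2
        col = min(int(cx // cell), grid - 1)
        row = min(int(cy // cell), grid - 1)
        targets[row][col] = 1.0
    return targets


# One hypothetical ground-truth box and one hypothetical prediction for it.
gt_box = (100, 150, 200, 300)
pred_box = (110, 160, 210, 290)

targets = objectness_targets([gt_box])
# Target confidence for the cell holding the object: P(object) * IOU.
target_confidence = 1.0 * iou(pred_box, gt_box)
```

In this example the box centre (150, 225) falls in row 7, column 4 of the grid, so only that cell has a target P(object) of 1; its target confidence is the IOU of the prediction with the ground truth, and every other cell has a target of 0.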