I am trying to understand how YOLO works and how it detects objects in an image. My question is: what role does k-means clustering play in determining the bounding boxes around objects? Thanks.
K-means clustering is a well-known algorithm in data science. It aims to partition n observations into k clusters. It mainly consists of these steps:
Initialization : K means (i.e. centroids) are generated at random.
Assignment : Clusters are formed by associating each observation with its nearest centroid.
Update : The centroid of each newly formed cluster becomes the mean of its members.
The Assignment and Update steps are repeated until convergence. The final result is that the sum of squared errors between points and their respective centroids is (locally) minimized.
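For reference, the three steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a production implementation: the function name `kmeans`, its parameters, and the choice to initialize from k randomly picked observations (rather than arbitrary random points) are my own assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Initialization (assumption): use k distinct observations as centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment: each observation goes to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

centroids, labels = kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)
print(centroids, labels)
```

The two nearby points end up in one cluster and the two far points in the other, whichever observations are picked at initialization.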
EDIT :
Why use k-means?
What does it really do in determining the anchor boxes?
Thanks!
In general, bounding boxes for objects are given by tuples of the form (x0, y0, x1, y1), where (x0, y0) are the coordinates of the lower left corner and (x1, y1) are the coordinates of the upper right corner.
We need to extract the width and height from these coordinates and normalize them with respect to the image width and height.
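As a rough sketch of that preprocessing step, assuming all boxes come from images of the same size (the helper name `boxes_to_wh` and its arguments are hypothetical):

```python
import numpy as np

def boxes_to_wh(boxes, img_w, img_h):
    """Turn (x0, y0, x1, y1) boxes into normalized (width, height) pairs."""
    boxes = np.asarray(boxes, dtype=float)
    w = (boxes[:, 2] - boxes[:, 0]) / img_w  # width as a fraction of image width
    h = (boxes[:, 3] - boxes[:, 1]) / img_h  # height as a fraction of image height
    return np.stack([w, h], axis=1)

wh = boxes_to_wh([[10, 20, 110, 220]], img_w=200, img_h=400)
print(wh)
```

If the images in your dataset have different sizes, each box would have to be divided by the width and height of its own image instead.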
Metric for k-means
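IoU turns out to be better than Euclidean distance, the usual k-means metric, because with Euclidean distance larger boxes generate more error than smaller ones.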
Jaccard index = (Intersection between selected box and cluster head box)/(Union between selected box and cluster head box)
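Since only widths and heights are clustered, the IoU can be computed as if both boxes shared the same corner. A minimal sketch under that assumption (`iou_wh` is a hypothetical helper name):

```python
def iou_wh(box, head):
    """IoU of two (width, height) boxes, assumed to share one corner."""
    w1, h1 = box
    w2, h2 = head
    # With a shared corner, the overlap is the smaller width times the smaller height.
    intersection = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - intersection
    return intersection / union

print(iou_wh((2, 2), (2, 2)))  # identical boxes
print(iou_wh((1, 2), (2, 1)))  # same area, different shape
```

Identical boxes give an IoU of 1, and the value drops as the shapes differ, regardless of how large the boxes are in absolute terms.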
At initialization we can choose k random boxes as our cluster heads. Then assign each bounding box to a cluster based on its IoU with the cluster head (IoU value > threshold) and calculate the mean IoU of each cluster.
This process can be repeated until convergence.
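Putting it together, here is a hedged sketch of the whole loop. Note two assumptions that differ from the description above: each box is assigned to the cluster head with the highest IoU (instead of using a fixed threshold), and the new cluster head is the mean width and height of its members. The names `iou_matrix` and `anchor_kmeans` are hypothetical.

```python
import numpy as np

def iou_matrix(wh, heads):
    """IoU between every (w, h) box and every cluster head (shared corner)."""
    inter = (np.minimum(wh[:, None, 0], heads[None, :, 0])
             * np.minimum(wh[:, None, 1], heads[None, :, 1]))
    area_wh = (wh[:, 0] * wh[:, 1])[:, None]
    area_heads = (heads[:, 0] * heads[:, 1])[None, :]
    return inter / (area_wh + area_heads - inter)

def anchor_kmeans(wh, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    wh = np.asarray(wh, dtype=float)
    # Initialization: k random boxes become the cluster heads.
    heads = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment (assumption): highest IoU, i.e. smallest 1 - IoU distance.
        labels = iou_matrix(wh, heads).argmax(axis=1)
        # Update (assumption): mean width/height of each cluster's members.
        new_heads = np.array([
            wh[labels == j].mean(axis=0) if np.any(labels == j) else heads[j]
            for j in range(k)
        ])
        if np.allclose(new_heads, heads):
            break
        heads = new_heads
    # Mean IoU between each box and its best-matching cluster head.
    mean_iou = iou_matrix(wh, heads).max(axis=1).mean()
    return heads, mean_iou

wh = [[0.10, 0.10], [0.12, 0.10], [0.80, 0.90], [0.90, 0.80]]
anchors, mean_iou = anchor_kmeans(wh, k=2)
print(anchors, mean_iou)
```

The resulting cluster heads are the anchor box shapes, and the mean IoU gives a rough idea of how well k anchors cover the box shapes in the dataset.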