Hi, I have been working on this for a while and have yet to find a good solution.
I am reading a video frame by frame, using background subtraction to identify the regions where there is movement, and using cvFindContours() to get the rectangular boundaries of the moving objects.
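For reference, my detection stage is along these lines (sketched here with the newer C++ API rather than the C API I actually call; the file name, the MOG2 subtractor, and the area threshold are just placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::VideoCapture cap("people.avi");               // placeholder input file
    auto bgsub = cv::createBackgroundSubtractorMOG2();

    cv::Mat frame, fgMask;
    while (cap.read(frame)) {
        bgsub->apply(frame, fgMask);                  // foreground mask from background subtraction

        // Clean up noise before looking for blobs.
        cv::morphologyEx(fgMask, fgMask, cv::MORPH_OPEN,
                         cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5)));

        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(fgMask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        for (const auto& c : contours) {
            if (cv::contourArea(c) < 500) continue;   // ignore small blobs (threshold is a guess)
            cv::Rect box = cv::boundingRect(c);       // rectangle boundary of the moving object
            cv::rectangle(frame, box, cv::Scalar(0, 255, 0), 2);
        }
        cv::imshow("detections", frame);
        if (cv::waitKey(30) == 27) break;
    }
    return 0;
}
```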
To keep the program simple, assume there are only two humans.
They move in such a way that they can overlap, turn, and move apart at certain intervals.
How can I label these two humans correctly?
cvFindContours() can return the boundaries in an arbitrary order from frame to frame (Frame1, Frame2, Frame3, ..., FrameN).
Initially I can compare the centroids of the bounding rectangles to label each human correctly, but once the humans overlap and then move apart, this approach fails.
I tried keeping track of the pixel colors of the original objects, but the two humans are fairly similar and certain areas (hands, legs, hair) have similar colors, so this is not good enough.
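The centroid matching I mean is roughly this (a minimal sketch assuming exactly two tracked people; labelByCentroid is just an illustrative name):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <limits>
#include <vector>

// Assign each new detection to the closest previous centroid.
// Works while the two people stay separate; breaks down once the blobs merge.
std::vector<int> labelByCentroid(const std::vector<cv::Point2f>& prev,
                                 const std::vector<cv::Point2f>& curr) {
    std::vector<int> labels(curr.size(), -1);
    for (size_t i = 0; i < curr.size(); ++i) {
        double best = std::numeric_limits<double>::max();
        for (size_t j = 0; j < prev.size(); ++j) {
            double dx = curr[i].x - prev[j].x;
            double dy = curr[i].y - prev[j].y;
            double d = std::hypot(dx, dy);
            if (d < best) { best = d; labels[i] = static_cast<int>(j); }
        }
    }
    return labels;   // labels[i] = index of the previous person closest to detection i
}
```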
I was considering using image statistics such as CountNonZero(), SumPixels(), Mean(), Mean_StdDev(), MinMaxLoc(), and Norm() to uniquely distinguish the two objects; I believe that would be a better approach.
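Something like the following is what I have in mind, sketched with the C++ equivalents of those functions (the BlobStats struct and the crude distance measure are just my guess at an appearance signature):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

struct BlobStats {
    cv::Scalar mean;    // per-channel mean color inside the blob (Mean)
    cv::Scalar stddev;  // per-channel standard deviation (Mean_StdDev)
    double area;        // number of foreground pixels (CountNonZero)
};

BlobStats describeBlob(const cv::Mat& frame, const std::vector<cv::Point>& contour) {
    // Build a mask containing only this contour, so statistics are restricted to the blob.
    cv::Mat mask = cv::Mat::zeros(frame.size(), CV_8UC1);
    cv::drawContours(mask, std::vector<std::vector<cv::Point>>{contour}, 0,
                     cv::Scalar(255), cv::FILLED);

    BlobStats s;
    cv::meanStdDev(frame, s.mean, s.stddev, mask);
    s.area = cv::countNonZero(mask);
    return s;
}

// A crude distance between two signatures; smaller means "more likely the same person".
double statsDistance(const BlobStats& a, const BlobStats& b) {
    double d = 0;
    for (int c = 0; c < 3; ++c)
        d += std::abs(a.mean[c] - b.mean[c]) + std::abs(a.stddev[c] - b.stddev[c]);
    return d;
}
```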
This is a difficult problem and any solution will not be perfect. Computer vision is jokingly known as an "AI-complete" discipline: if you solve computer vision, you have solved all of artificial intelligence.
Background subtraction can be a good way of detecting objects. If you need to improve the background subtraction results, you might consider using an MRF (Markov random field). Presumably, you can tell when there is a single object and when the two blobs have merged, based on the size of the blob. If the trajectories don't change quickly during the times the blobs are merged, you can do Kalman tracking and use some heuristics to disambiguate the blobs afterwards.
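For example, a rough sketch of the per-person Kalman part might look like this (constant-velocity model; the noise covariances are placeholder values you would have to tune):

```cpp
#include <opencv2/opencv.hpp>

cv::KalmanFilter makeTracker(const cv::Point2f& start) {
    // State: [x, y, vx, vy], measurement: [x, y].
    cv::KalmanFilter kf(4, 2, 0, CV_32F);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-2));   // guess
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1)); // guess
    cv::setIdentity(kf.errorCovPost, cv::Scalar::all(1.0));
    kf.statePost = (cv::Mat_<float>(4, 1) << start.x, start.y, 0, 0);
    return kf;
}

// Per frame, for each person:
//   cv::Mat prediction = kf.predict();                  // where we expect them to be
//   if (their blob is visible and separate)
//       kf.correct((cv::Mat_<float>(2, 1) << cx, cy));  // update with the measured centroid
//   While the two blobs are merged, just keep predicting; when they split again,
//   assign each new centroid to the filter whose prediction is closest.
```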
Even though the colors of the two objects are similar, you might consider trying a mean shift tracker. You may also need to do some particle filtering to keep track of multiple hypotheses about who is who.
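A minimal mean shift sketch, assuming you track each person with a hue histogram built from their initial bounding box (the bin count and ranges are just typical values):

```cpp
#include <opencv2/opencv.hpp>

// Build a normalized hue histogram of the region inside `box`.
cv::Mat hueHistogram(const cv::Mat& frameBGR, const cv::Rect& box) {
    cv::Mat hsv, hist;
    cv::cvtColor(frameBGR(box), hsv, cv::COLOR_BGR2HSV);

    int channels[] = {0};          // hue channel only
    int histSize[] = {30};
    float hueRange[] = {0, 180};
    const float* ranges[] = {hueRange};
    cv::calcHist(&hsv, 1, channels, cv::Mat(), hist, 1, histSize, ranges);
    cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);
    return hist;
}

// Per frame, for each person (hist built once from their initial box):
//   cv::Mat hsv, backproj;
//   cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
//   int channels[] = {0};
//   float hueRange[] = {0, 180};
//   const float* ranges[] = {hueRange};
//   cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);
//   cv::meanShift(backproj, box,   // box is updated in place to follow the mode
//                 cv::TermCriteria(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 10, 1));
```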
There are also some even more complicated techniques called layered tracking. There is some more recent work by Jojic and Frey, by Winn, by Zhou and Tao, and by others. Most of these techniques come with very strong assumptions and/or take a lot of work to implement correctly.
If you're interested in this topic in general, I highly recommend taking a computer vision course and/or reading a textbook such as Ponce and Forsyth's.