How to implement a better sliding window algorithm?

So I have been writing my own code for HoG and its variants to work with depth images. However, I am stuck on the detection-window part of testing my trained SVM.

All I've done so far is create image pyramids from the original image and run a 64x128 sliding window over each level, from the top-left corner to the bottom-right.
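For reference, the scan described above amounts to roughly the following. This is only a sketch of the window enumeration: the function name `pyramid_windows`, the 8-pixel `stride` and the 1.25 `scale` factor are illustrative assumptions, not values from my actual code.

```python
def pyramid_windows(width, height, win_w=64, win_h=128, stride=8, scale=1.25):
    """Yield (level, x, y) for every window position at each pyramid level.

    stride and scale are assumed values chosen for illustration only.
    """
    level = 0
    w, h = width, height
    # Keep shrinking the image until the 64x128 window no longer fits.
    while w >= win_w and h >= win_h:
        # Scan from the top-left corner to the bottom-right corner.
        for y in range(0, h - win_h + 1, stride):
            for x in range(0, w - win_w + 1, stride):
                yield level, x, y
        level += 1
        w, h = int(w / scale), int(h / scale)


# Example: an 80x160 image gives 15 windows at level 0 and 1 at level 1.
windows = list(pyramid_windows(80, 160))
```

Each yielded position is then cropped, turned into a HoG descriptor and passed to the SVM.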

Here's a video capture of it: http://youtu.be/3cNFOd7Aigc

Now the issue is that I'm getting more false positives than I expected.

Is there a way to remove all these false positives (besides training with more images)? So far I can get the 'score' from the SVM, which is the distance to the margin itself. How can I use that to improve my results?

Does anyone have any insight into implementing a good sliding window algorithm?

sub_o asked Apr 08 '13 08:04


1 Answer

What you could do is add a processing step that finds the locally strongest response from the SVM. Let me explain.

What you appear to be doing right now:

for each sliding window W, record category[W] = SVM.hardDecision(W)

Hard decision means it returns a boolean or an integer; for 2-class classification it could be written like this:

hardDecision(W) = bool( softDecision(W) > 0 )
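To make the relation concrete, here is a minimal sketch that assumes a linear SVM with weight vector `w` and bias `b`. The names `soft_decision` and `hard_decision` and the numbers are made up for illustration; they are not OpenCV functions.

```python
def soft_decision(w, b, x):
    """Signed score of a linear SVM (assumed form: w . x + b).

    The sign tells which side of the boundary x falls on; the magnitude
    grows with the distance from it.
    """
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hard_decision(w, b, x):
    """Class decision: keeps only the sign of the soft score."""
    return soft_decision(w, b, x) > 0


# Hypothetical 2-D weights, purely for illustration.
w, b = [1.0, -1.0], -0.5
print(soft_decision(w, b, [2.0, 0.0]), hard_decision(w, b, [2.0, 0.0]))
print(soft_decision(w, b, [0.0, 1.0]), hard_decision(w, b, [0.0, 1.0]))
```

The hard decision throws away how confident the classifier was, which is exactly the information needed to rank overlapping detections.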

Since you mentioned OpenCV, in CvSVM::predict you should set returnDFVal to true:

returnDFVal – Specifies a type of the return value. If true and the problem is 2-class classification then the method returns the decision function value that is signed distance to the margin, else the function returns a class label (classification) or estimated function value (regression).

from the documentation.

What you could do is:

  1. for each sliding window W, record score[W] = SVM.softDecision(W)
  2. for each W, compute and record:
    • neighbors = max(score[W_left], score[W_right], score[W_up], score[W_bottom])
    • local[W] = score[W] > neighbors
    • powerful[W] = score[W] > threshold.
  3. for each W, you have a positive if local[W] && powerful[W]
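The three steps above could be sketched as follows, assuming the scores of one pyramid level have already been collected into a 2-D grid (one entry per window position). The function name `detect_peaks` and the sample scores are made up for illustration.

```python
def detect_peaks(score, threshold=0.0):
    """Return (row, col) of windows that are local maxima and above threshold.

    score is a list of rows, one soft SVM score per window position.
    """
    rows, cols = len(score), len(score[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            # Left, right, up and down neighbours that exist inside the grid.
            candidates = ((r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c))
            neighbors = [score[rr][cc] for rr, cc in candidates
                         if 0 <= rr < rows and 0 <= cc < cols]
            local = all(score[r][c] > n for n in neighbors)
            powerful = score[r][c] > threshold
            if local and powerful:
                peaks.append((r, c))
    return peaks


# Made-up scores: a strong response at (1, 1) surrounded by weaker ones.
grid = [[0.1, 0.2, 0.1],
        [0.2, 1.5, 0.3],
        [-1.0, 0.4, 0.2]]
peaks = detect_peaks(grid, threshold=0.5)
```

Here a plain hard decision would flag eight of the nine windows as positive, while the local-maximum test keeps only the centre one. Comparing against windows at neighbouring pyramid levels as well would be a natural extension, but that is not shown here.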

Since your classifier will have a positive response for windows close (in space and/or appearance) to your true positive, the idea is to record the scores for each window, and then only keep positives which

  • are a local maximum (score greater than that of their neighbors) --> local
  • are strong enough --> powerful

You could set threshold to 0 and adjust it until you get satisfactory results. Or you could calibrate it automatically using your training set.
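One possible way to do the automatic calibration, as a sketch only: take windows with known labels, and pick the lowest threshold at or above 0 whose detections reach a target precision. The function name `calibrate_threshold`, the `min_precision` parameter and the sample data are assumptions for illustration; other criteria (a fixed false-positive rate, for instance) would work just as well.

```python
def calibrate_threshold(scores, labels, min_precision=0.9):
    """Return the lowest threshold >= 0 reaching min_precision, or None.

    scores: soft SVM scores of labelled windows.
    labels: True for windows that really contain the object.
    """
    # Start at 0 and only move upward, trying each positive score in turn.
    candidates = sorted({0.0} | {s for s in scores if s > 0})
    for t in candidates:
        detected = [label for s, label in zip(scores, labels) if s > t]
        if not detected:
            break  # nothing scores above t, so higher thresholds cannot help
        precision = sum(detected) / len(detected)
        if precision >= min_precision:
            # Lowest qualifying threshold, so it keeps the most detections.
            return t
    return None


# Made-up labelled scores.
scores = [-1.0, 0.2, 0.5, 0.8, 1.5]
labels = [False, False, True, False, True]
threshold = calibrate_threshold(scores, labels, min_precision=0.6)
```

If possible, run this on windows that were not used to train the SVM, since scores on the training windows themselves tend to look better than they will at detection time.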

Antoine answered Oct 10 '22 08:10