Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Improving accuracy OpenCV HOG people detector

I'm working in a project. A part of project consist to integrate the HOG people detector of OpenCV with a camera streaming .

Currently It's working the camera and the basic HOG detector (CPP detectMultiScale -> http://docs.opencv.org/modules/gpu/doc/object_detection.html). But don't work very well... The detections are very noising and the algorithm isn't very accuracy...

Why?

My camera image is 640 x 480 pixels.

The snippet code I'm using is:

std::vector<cv::Rect> found, found_filtered;
cv::HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
hog.detectMultiScale(image, found, 0, cv::Size(8,8), cv::Size(32,32), 1.05, 2);

Why don't work properly? What need for improve the accuracy? Is necessary some image size particular?

PS: Do you know some precise people detection algorithm, faster and developed in cpp ??

like image 835
Ricardo Avatar asked Oct 28 '14 11:10

Ricardo


People also ask

How to detect people with OpenCV?

However, OpenCV has a built-in method to detect pedestrians. It has a pre-trained HOG(Histogram of Oriented Gradients) + Linear SVM model to detect pedestrians in images and video streams. This algorithm checks directly surrounding pixels of every single pixel.

What is hog detectMultiScale?

The value returned by hog.detectMultiScale is just a list. This list represents the number of people in the image. Therefore, to get the total number of people in the image, just take the len(rects) .

How does a histogram of oriented gradients work?

The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision and image processing for the purpose of object detection. The technique counts occurrences of gradient orientation in localized portions of an image.


1 Answers

The size of the default people detector is 64x128, that mean that the people you would want to detect have to be atleast 64x128. For your camera resolution that would mean that a person would have to take up quite some space before getting properly detected.

Depending on your specific situation, you could try your hand at training your own HOG Descriptor, with a smaller size. You could take a look at this answer and the referenced library if you want to train your own HOG Descriptor.

For the Parameters:

win_stride: Given your input image has a size of 640 x 480, and the defaultpeopleDetector has a window size of 64x128, you can fit the HOG Detection window ( the 64x128 window) multiple times in the input image. The winstride tells HOG to move the detection window a certain amount each time. How does this work: Hog places the detection window on the top left of your input image. and moves the detection window each time by the win_stride.

Like this (small win_stride): enter image description here

or like this (large win_stride) enter image description here

A smaller winstride should improve accuracy, but decreases preformance, and the other way around

padding Padding adds a certain amount of extra pixels on each side of the input image. That way the detection window is placed a bit outside the input image. It's because of that padding that HOG can detect people who are very close to the edge of the input image.

group_threshold The group_treshold determines a value by when detected parts should be placed in a group. Low value provides no result grouping, a higher value provides result grouping if the amount of treshold has been found inside the detection windows. (in my own experience, I have never needed to change the default value)

I hope this makes a bit of sense for you. I've been working with HOG for the past few weeks, and read alot of papers, but I lost some of the references, so I can't link you the pages where this info comes from, I'm sorry.

like image 117
Timmynator0 Avatar answered Oct 17 '22 06:10

Timmynator0