 

Histogram of Oriented Gradients

I have been reading theory about HOG descriptors for object(human) detection. But I have some questions about the implementation, which might sound like an insignificant detail.

Regarding the window that contains the blocks: should the window be moved over the image pixel by pixel, so that the windows overlap at each step, as illustrated here: [image: overlapping detection windows]

or should the window be moved without any overlap, as here: [image: non-overlapping detection windows]

The illustrations I have seen so far use the second approach. But considering that the detection window is 64x128, it is quite likely that sliding the window that way will not cover the whole image. If the image is 64x255, for example, the last 127 pixels will never be checked for an object. So the first approach seems more reasonable, although it is more time- and CPU-consuming.
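For concreteness, here is a small Python sketch (the image size and stride values are only illustrative) of what each strategy covers on a 64x255 image with a 64x128 window:

```python
# Illustrative sketch: vertical coverage of a 64x255 image by a 64x128
# detection window under the two sliding strategies.

def window_tops(image_h, win_h, stride):
    """Top row of every window position for a given vertical stride."""
    return list(range(0, image_h - win_h + 1, stride))

image_h, win_h = 255, 128

# Second approach: non-overlapping windows (stride == window height).
print(window_tops(image_h, win_h, win_h))  # [0] -> rows 128..254 are never examined

# First approach: dense overlap (here a stride of 8 pixels).
tops = window_tops(image_h, win_h, 8)
print(tops[-1] + win_h)                    # 248 -> only rows 248..254 are missed
```

In practice detectors usually pick a stride somewhere between these extremes, and often add one extra window flush with the image border so nothing is skipped.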

Any ideas? Thank you in advance.

EDIT: I am trying to stick to the original Dalal and Triggs paper. One paper that implements the algorithm and uses the second approach can be found here: http://www.cs.bilkent.edu.tr/~cansin/projects/cs554-vision/pedestrian-detection/pedestrian-detection-paper.pdf

Ahmet Keskin asked Apr 08 '11


People also ask

How does Histogram of Oriented Gradients work?

The HOG descriptor focuses on the structure or shape of an object. It is more informative than a plain edge descriptor because it uses both the magnitude and the angle of the gradient to compute the features. For each region of the image, it generates a histogram from the gradient magnitudes and orientations.
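As a rough illustration (a simplified sketch, not taken from the answer above; it skips the bilinear bin interpolation used in the full algorithm), the per-cell histogram can be computed like this:

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Orientation histogram of one 8x8 cell: unsigned gradients, 9 bins."""
    gx = np.zeros_like(cell, dtype=float)
    gy = np.zeros_like(cell, dtype=float)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]      # centered [-1, 0, 1] filter
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180  # unsigned orientation in [0, 180)
    bins = (angle / (180 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    for b, m in zip(bins.ravel(), magnitude.ravel()):
        hist[b] += m                              # each pixel votes with its magnitude
    return hist

print(cell_histogram(np.random.rand(8, 8)))
```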

What is Histogram of Oriented Gradients for face detection?

Histograms of Oriented Gradients are an effective descriptor for object recognition and detection. These descriptors remain robust under occlusion and under pose and illumination changes because they are extracted on a regular grid.

What is HOG and SVM?

Histogram of oriented gradients (HOG) is used for feature extraction in the human detection process, whilst linear support vector machines (SVM) are used for human classification. A set of tests is conducted to find the classifiers which optimize recall in the detection of persons in visible video sequences.
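A minimal version of that pipeline, using OpenCV's pretrained linear SVM for pedestrians (the file name is only a placeholder), might look like:

```python
import cv2

hog = cv2.HOGDescriptor()  # default Dalal-Triggs parameters: 64x128 window, 9 bins
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread("street.jpg")                  # placeholder input image
boxes, weights = hog.detectMultiScale(image, winStride=(8, 8), scale=1.05)
for (x, y, w, h) in boxes:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("street_detections.jpg", image)
```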

What is the output of HOG?

In the case of the HOG feature descriptor, the input image is of size 64 x 128 x 3 and the output feature vector is of length 3780.
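The length 3780 follows directly from the standard parameters (64x128 window, 8x8-pixel cells, 2x2-cell blocks, one-cell block stride, 9 orientation bins); a quick sanity check:

```python
win_w, win_h = 64, 128
cell, block, stride_cells, bins = 8, 2, 1, 9      # standard Dalal-Triggs parameters

blocks_x = (win_w // cell - block) // stride_cells + 1   # 7
blocks_y = (win_h // cell - block) // stride_cells + 1   # 15
print(blocks_x * blocks_y * block * block * bins)        # 7 * 15 * 4 * 9 = 3780
```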


1 Answer

EDIT: Sorry -- I misunderstood your question. (Also, the answer I provided to the wrong question was in error -- I've since adjusted that below for context.)

You're asking about using the HOG descriptor for detection, not generating the HOG descriptor.

In the implementation paper you reference above, it looks like they are overlapping the detection window. The window size is 64x128, while they use a horizontal stride of 32 pixels and a vertical stride of 64. They also mention that they tried smaller stride values, but this led to a higher false positive rate (in the context of their implementation).
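To make that concrete, here is a small sketch (the 320x240 image size is just an example) of where those windows land; with strides of half the window size, neighbouring windows overlap by 50%:

```python
img_w, img_h = 320, 240        # example image size
win_w, win_h = 64, 128         # detection window
stride_x, stride_y = 32, 64    # strides used in the referenced implementation

origins = [(x, y)
           for y in range(0, img_h - win_h + 1, stride_y)
           for x in range(0, img_w - win_w + 1, stride_x)]
print(len(origins), origins[:3])   # 18 window positions: (0, 0), (32, 0), (64, 0), ...
```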

On top of that, they're using 3 scales of the input image: 1, 1/2, and 1/4. They don't mention any corresponding scaling of the detection window -- I'm not sure what effect that would have from a detection standpoint. It seems that this would implicitly create overlap as well.
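Something like the following sketch (again, the input file is a placeholder) would produce those three scales; sliding the fixed 64x128 window over the smaller levels effectively looks for people that are 2x and 4x larger in the original image:

```python
import cv2

image = cv2.imread("street.jpg")                  # placeholder input image
pyramid = [image]
for _ in range(2):                                # add the 1/2 and 1/4 scales
    prev = pyramid[-1]
    pyramid.append(cv2.resize(prev, (prev.shape[1] // 2, prev.shape[0] // 2)))
for level, img in enumerate(pyramid):
    print(level, img.shape[:2])                   # each level is half the previous size
```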


Original answer (corrected):

Looking at the Dalal and Triggs paper (section 6.4), they mention both (i) no block overlap and (ii) half- and quarter-block overlap when generating the HOG descriptor. Based on their results, it sounds like greater overlap produced better detection performance (albeit at a greater resource/processing cost).
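For what it's worth, the block stride also changes the descriptor length; a quick sketch comparing no overlap with half-block overlap for the standard 64x128 window:

```python
def hog_length(stride_cells, cells_x=8, cells_y=16, block=2, bins=9):
    """Descriptor length for a 64x128 window with 8x8 cells and 2x2-cell blocks."""
    blocks_x = (cells_x - block) // stride_cells + 1
    blocks_y = (cells_y - block) // stride_cells + 1
    return blocks_x * blocks_y * block * block * bins

print(hog_length(2))   # no overlap:         4 *  8 * 36 = 1152
print(hog_length(1))   # half-block overlap: 7 * 15 * 36 = 3780
```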

Rick Haffey answered Oct 19 '22