
Why does OpenCV's HOG descriptor return so many values?

I'm trying to use OpenCV's HOG descriptor, but the feature vector computed from it seems far too long. Here is a snippet that demonstrates the problem:

#include <stdio.h>
#include <opencv2/opencv.hpp>
#include <stdlib.h>
#include <vector>

int main()
{
    cv::Mat image = cv::imread("1.jpg");
    std::vector<float> features;
    cv::HOGDescriptor hogdis;
    hogdis.compute(image, features);
    printf("HOG feature's length is %zu %zu\n", hogdis.getDescriptorSize(), features.size());
    return 0;
}

The output is

HOG feature's length is 3780 1606500

The latter value seems absurd. The image 1.jpg has dimensions 256x256x3, which is far fewer values than the feature vector contains. Why does OpenCV fill the feature vector with so many values? How do I obtain the 3780-long vector to feed to my SVM trainer?

Siyuan Ren asked Mar 13 '14 09:03



1 Answer

Why does OpenCV fill the feature vector with so many values?

The length of the HOG descriptor for a single detection window is given by the following expression; it depends on the descriptor's parameters, not on the image dimensions:

size_hog_features = (size_t)nbins
    * (blockSize.width / cellSize.width)
    * (blockSize.height / cellSize.height)
    * ((winSize.width - blockSize.width) / blockStride.width + 1)
    * ((winSize.height - blockSize.height) / blockStride.height + 1);

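With the default parameters (nbins = 9, 16x16 blocks, 8x8 cells, a 64x128 window and an 8x8 block stride) this gives 9 * 2 * 2 * 7 * 15 = 3780. compute() evaluates one such descriptor at every window position in the image, so for an image larger than the window it's quite normal to get such a long HOG feature vector.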
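As a quick sanity check, the expression above can be evaluated without OpenCV. This is only a sketch: Size2 and HogParams are hypothetical stand-ins, filled in with what I understand to be the cv::HOGDescriptor defaults.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cv::Size so the sketch compiles without OpenCV.
struct Size2 { int width; int height; };

// Hypothetical parameter bundle; the values mirror the HOGDescriptor defaults.
struct HogParams {
    Size2 winSize{64, 128};
    Size2 blockSize{16, 16};
    Size2 blockStride{8, 8};
    Size2 cellSize{8, 8};
    int nbins = 9;
};

// Length of the descriptor for ONE detection window.
std::size_t descriptorSize(const HogParams& p) {
    assert(p.cellSize.width > 0 && p.cellSize.height > 0);
    assert(p.blockStride.width > 0 && p.blockStride.height > 0);
    // Cells inside one block (2 * 2 with the defaults).
    const int cellsPerBlock = (p.blockSize.width / p.cellSize.width)
                            * (p.blockSize.height / p.cellSize.height);
    // Block positions inside one window (7 across, 15 down with the defaults).
    const int blocksX = (p.winSize.width - p.blockSize.width) / p.blockStride.width + 1;
    const int blocksY = (p.winSize.height - p.blockSize.height) / p.blockStride.height + 1;
    return static_cast<std::size_t>(p.nbins) * cellsPerBlock * blocksX * blocksY;
}
```

Calling descriptorSize(HogParams{}) reproduces the 3780 reported by getDescriptorSize(); shrinking winSize to 64x64 drops it to 1764.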

How do I obtain the 3780-long vector to feed to my SVM trainer?

You can set up the parameters of the HOG descriptor (i.e. nbins, blockSize, blockStride, cellSize, winSize) before computing it, in order to get a feature vector of the size you want.

But why are hogdis.getDescriptorSize() and features.size() inconsistent?

They measure different things. getDescriptorSize() returns the number of coefficients in the descriptor of a single detection window, which is what the classifier expects. It can be computed as follows:

HOG descriptor length = #Blocks * #CellsPerBlock * #BinsPerCell

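On the other hand, features.size() is the total number of values computed over the whole image: one descriptor for every window position, concatenated. The reported numbers are consistent with this. Assuming the 64x128 window steps by 8 pixels in each direction (the cell size) when no window stride is given, a 256x256 image holds 25 * 17 = 425 windows, and 425 * 3780 = 1606500.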
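One way to see where 1606500 comes from is to count window positions. The sketch below assumes that compute() produces one descriptor per window position, with no padding, and that the window steps by the cell size (8x8) when no stride is passed; windowCount is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>

// Number of positions a winW x winH window can take inside an imgW x imgH
// image when it moves in steps of strideX, strideY and no padding is added.
std::size_t windowCount(int imgW, int imgH, int winW, int winH,
                        int strideX, int strideY) {
    assert(strideX > 0 && strideY > 0);
    if (imgW < winW || imgH < winH) {
        return 0;                                  // window does not fit at all
    }
    const int nx = (imgW - winW) / strideX + 1;    // positions along x
    const int ny = (imgH - winH) / strideY + 1;    // positions along y
    return static_cast<std::size_t>(nx) * ny;
}
```

For the 256x256 image in the question, windowCount(256, 256, 64, 128, 8, 8) is 425, and 425 windows times 3780 values per window is 1606500, the value of features.size().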

To train, you need to pass in features.

herohuyongtao answered Oct 19 '22 05:10