I'm trying to use OpenCV's HOG descriptor, but the feature vector computed from it seems far too long. Here is a snippet that demonstrates the problem:
#include <stdio.h>
#include <opencv2/opencv.hpp>
#include <stdlib.h>
#include <vector>

int main()
{
    cv::Mat image = cv::imread("1.jpg");
    if (image.empty()) {
        // imread returns an empty Mat when the file cannot be read
        fprintf(stderr, "could not read 1.jpg\n");
        return 1;
    }
    std::vector<float> features;
    cv::HOGDescriptor hogdis;  // default parameters
    hogdis.compute(image, features);
    printf("HOG feature's length is %zu %zu\n", hogdis.getDescriptorSize(), features.size());
    return 0;
}
The output is
HOG feature's length is 3780 1606500
The latter value seems absurd. The image 1.jpg has dimensions 256x256x3, which is far fewer values than the feature vector holds. Why does OpenCV fill the feature vector with so many values? How do I obtain the 3780-element vector to feed to my SVM trainer?
The HOG descriptor focuses on the structure or shape of an object. It carries more information than a plain edge descriptor because it uses the magnitude as well as the angle of the gradient to compute the features. For each region of the image it generates a histogram from the magnitudes and orientations of the gradient.
This is done by extracting the gradient and orientation (or you can say magnitude and direction) of the edges. Additionally, these orientations are calculated in 'localized' portions. This means that the complete image is broken down into smaller regions and for each region, the gradients and orientation are calculated ...
The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision and image processing for the purpose of object detection. The technique counts occurrences of gradient orientation in localized portions of an image.
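To make "counting gradient orientations in a localized portion" concrete, here is a minimal sketch of a histogram for a single grayscale cell. It is a simplification, not OpenCV's implementation: the function name `cellHistogram`, the 9 unsigned bins of 20 degrees each, and the central-difference gradient are assumptions chosen for illustration, and the bin interpolation and block normalization that real HOG implementations apply are left out.

```cpp
#include <array>
#include <cmath>
#include <vector>

// Simplified sketch: magnitude-weighted histogram of unsigned gradient
// orientations (0..180 degrees, 9 bins) for one grayscale cell.
// Bin interpolation and block normalization are deliberately omitted.
std::array<float, 9> cellHistogram(const std::vector<std::vector<float>>& cell)
{
    std::array<float, 9> hist{};  // all bins start at zero
    const double kPi = 3.14159265358979323846;
    const int rows = static_cast<int>(cell.size());
    const int cols = rows > 0 ? static_cast<int>(cell[0].size()) : 0;

    // Central differences need both neighbours, so border pixels are skipped.
    for (int y = 1; y + 1 < rows; ++y) {
        for (int x = 1; x + 1 < cols; ++x) {
            const float gx = cell[y][x + 1] - cell[y][x - 1];
            const float gy = cell[y + 1][x] - cell[y - 1][x];
            const float magnitude = std::sqrt(gx * gx + gy * gy);
            if (magnitude == 0.0f) continue;  // flat pixel, no orientation

            // Fold the angle into [0, 180) so opposite directions share a bin.
            double angle = std::atan2(gy, gx) * 180.0 / kPi;
            if (angle < 0.0) angle += 180.0;
            if (angle >= 180.0) angle -= 180.0;

            const int bin = static_cast<int>(angle / 20.0) % 9;  // 20 degrees per bin
            hist[bin] += magnitude;  // vote weighted by gradient magnitude
        }
    }
    return hist;
}
```

For a cell whose intensity rises only from left to right, every interior pixel has a purely horizontal gradient, so all of the weight lands in the first bin.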
Typically, a feature descriptor converts an image of size width x height x 3 (channels) to a feature vector / array of length n. In the case of the HOG feature descriptor with its default 64 x 128 detection window, the input image is of size 64 x 128 x 3 and the output feature vector is of length 3780.
Why does OpenCV fill the feature vector with so many values?
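The size of the HOG descriptor is determined by the following equation (it does not depend on the image dimensions alone):
size_hog_features = (size_t)nbins
    * (blockSize.width / cellSize.width)
    * (blockSize.height / cellSize.height)
    * ((winSize.width - blockSize.width) / blockStride.width + 1)
    * ((winSize.height - blockSize.height) / blockStride.height + 1);
With the default parameters (64x128 window, 16x16 blocks, 8x8 block stride, 8x8 cells, 9 bins) this gives 3780, which is the descriptor length for a single detection window. compute() evaluates that window at every position in the image, and when no window stride is passed it steps by the cell size (8x8). A 256x256 image therefore holds ((256 - 64)/8 + 1) * ((256 - 128)/8 + 1) = 25 * 17 = 425 windows, and 425 * 3780 = 1606500. So it's quite normal that you got such a long HOG feature vector.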
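The formula above can be checked by plugging in numbers. The sketch below evaluates it without OpenCV; `HogSize` is a hypothetical stand-in for `cv::Size`, `hogDescriptorLength` is an illustrative name, and the values used in the usage note are assumed to be OpenCV's defaults (64x128 window, 16x16 block, 8x8 block stride, 8x8 cell, 9 bins).

```cpp
#include <cstddef>

// Hypothetical stand-in for cv::Size so the sketch compiles without OpenCV.
struct HogSize {
    int width;
    int height;
};

// Descriptor length for ONE detection window, term by term as in the formula.
std::size_t hogDescriptorLength(HogSize winSize, HogSize blockSize,
                                HogSize blockStride, HogSize cellSize, int nbins)
{
    // Cells inside one block, e.g. (16/8) * (16/8) = 4.
    const int cellsPerBlock = (blockSize.width / cellSize.width)
                            * (blockSize.height / cellSize.height);

    // Block positions inside one window, e.g. 7 * 15 = 105.
    const int blocksPerWindow =
        ((winSize.width - blockSize.width) / blockStride.width + 1)
      * ((winSize.height - blockSize.height) / blockStride.height + 1);

    return static_cast<std::size_t>(nbins)
         * static_cast<std::size_t>(cellsPerBlock)
         * static_cast<std::size_t>(blocksPerWindow);
}
```

Calling `hogDescriptorLength({64, 128}, {16, 16}, {8, 8}, {8, 8}, 9)` evaluates 9 * 4 * 105, which is the 3780 reported by getDescriptorSize() in the question.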
How do I obtain the 3780 long vector to feed to my SVM trainer?
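You can set the parameters (i.e. nbins, blockSize, cellSize, winSize) of the HOG descriptor before computing it, in order to get a HOG feature vector of the size you want. Note that compute() returns exactly one descriptor when the image has the same size as winSize.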
But why are hogdis.getDescriptorSize() and features.size() inconsistent?
They measure different things. getDescriptorSize() returns the number of coefficients in the descriptor of a single detection window, which is what the classifier expects. It can be computed as follows:
HOG descriptor length = #Blocks * #CellsPerBlock * #BinsPerCell
On the other hand, features.size() is the total number of HOG values computed over the whole image, that is, the descriptors of all window positions concatenated.
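The gap between the two numbers can be reproduced with a little arithmetic. This sketch assumes that compute() slides the window in steps equal to the cell size (8x8) when no window stride is given and that no padding is applied; `Extent`, `windowCount`, and `totalFeatureCount` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for cv::Size so the sketch compiles without OpenCV.
struct Extent {
    int width;
    int height;
};

// Number of window positions that fit in the image for a given stride
// (no padding assumed).
std::size_t windowCount(Extent image, Extent window, Extent stride)
{
    const int across = (image.width - window.width) / stride.width + 1;
    const int down = (image.height - window.height) / stride.height + 1;
    return static_cast<std::size_t>(across) * static_cast<std::size_t>(down);
}

// Total length of the concatenated feature vector: one descriptor per window.
std::size_t totalFeatureCount(Extent image, Extent window, Extent stride,
                              std::size_t descriptorLength)
{
    return windowCount(image, window, stride) * descriptorLength;
}
```

For the 256x256 image in the question, `windowCount({256, 256}, {64, 128}, {8, 8})` is 25 * 17 = 425, and `totalFeatureCount({256, 256}, {64, 128}, {8, 8}, 3780)` is 1606500, the value printed for features.size().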
To train, you need to pass in features.