
OpenCV HOG feature data layout?

I'm working with OpenCV's CPU version of Histogram of Oriented Gradients (HOG). I'm using a 32x32 image with 4x4 cells, 4x4 blocks, no overlap among blocks, and 15 orientation bins. OpenCV's HOGDescriptor gives me a 1D feature vector of length 960. This makes sense, because (32*32 pixels) / (4*4 pixels per cell) = 64 cells, and 64 cells * 15 orientations = 960.
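As a sanity check on that length, here is a minimal sketch of the arithmetic for a single window. The `Size2` struct and `hogDescriptorLength` function are hypothetical names made up for illustration, not OpenCV API; the sketch only assumes that blocks tile the window at the given stride and that each block holds a whole number of cells.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cv::Size so the sketch needs no OpenCV headers.
struct Size2 { int width; int height; };

// Hypothetical helper: number of floats in the descriptor of one window.
std::size_t hogDescriptorLength(Size2 win, Size2 block, Size2 stride, Size2 cell, int nbins)
{
    // Blocks must step evenly across the window, and cells must tile each block.
    assert((win.width - block.width) % stride.width == 0);
    assert((win.height - block.height) % stride.height == 0);
    assert(block.width % cell.width == 0 && block.height % cell.height == 0);

    const int blocksX = (win.width - block.width) / stride.width + 1;
    const int blocksY = (win.height - block.height) / stride.height + 1;
    const int cellsPerBlock = (block.width / cell.width) * (block.height / cell.height);

    // One histogram of nbins values per cell, per block.
    return static_cast<std::size_t>(blocksX) * blocksY * cellsPerBlock * nbins;
}
```

With the settings above, `hogDescriptorLength({32, 32}, {4, 4}, {4, 4}, {4, 4}, 15)` gives 8 * 8 * 1 * 15 = 960.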

However, I'm not sure about how these 960 numbers are laid out in memory. My guess would be that it's like this:

vector<float> descriptorsValues =
[15 bins for cell 0, 0] 
[15 bins for cell 0, 1]
...
[15 bins for cell 0, 7]
....
[15 bins for cell 7, 0] 
[15 bins for cell 7, 1]
...
[15 bins for cell 7, 7]

Of course, this is a 2D problem flattened into 1D, so it would actually look like this:

[cell 0, 0] [cell 0, 1] ... [cell 7, 0] ... [cell 7, 7]

So, do I have the right idea for the data layout? Or is it something else?
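To make the guess concrete, this is a small sketch of the index arithmetic it implies. `guessedIndex` is a hypothetical helper, and the row-major cell order it encodes is only my assumption about the layout, not something confirmed by OpenCV.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: flat index under the GUESSED layout, i.e. cells stored
// row by row, each cell contributing nbins consecutive values.
std::size_t guessedIndex(int cellRow, int cellCol, int bin,
                         int cellsPerRow = 8, int nbins = 15)
{
    assert(cellRow >= 0);
    assert(cellCol >= 0 && cellCol < cellsPerRow);
    assert(bin >= 0 && bin < nbins);
    return (static_cast<std::size_t>(cellRow) * cellsPerRow + cellCol) * nbins + bin;
}
```

Under this guess, bin 0 of cell (0, 1) would sit at index 15 and the last bin of cell (7, 7) at index 959.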


Here's my example code for this:

#include <cstdio>
#include <vector>
#include <opencv2/opencv.hpp>

using namespace cv;
using std::vector;

//32x32 image, 4x4 blocks, 4x4 cells, 4x4 blockStride
vector<float> hogExample(cv::Mat img)
{
    img = img.rowRange(0, 32).colRange(0,32); //trim image to 32x32
    bool gamma_corr = true;
    cv::Size win_size(img.cols, img.rows); //cv::Size takes (width, height); using just one window
    int c = 4;
    cv::Size block_size(c,c);
    cv::Size block_stride(c,c); //no overlapping blocks
    cv::Size cell_size(c,c);
    int nOri = 15; //number of orientation bins

    cv::HOGDescriptor d(win_size, block_size, block_stride, cell_size, nOri, 1, -1,
                              cv::HOGDescriptor::L2Hys, 0.2, gamma_corr, cv::HOGDescriptor::DEFAULT_NLEVELS);

    vector<float> descriptorsValues;
    vector<cv::Point> locations;
    d.compute(img, descriptorsValues, cv::Size(0,0), cv::Size(0,0), locations);

    printf("descriptorsValues.size() = %zu \n", descriptorsValues.size()); //prints 960
    return descriptorsValues;
}

Related resources: This StackOverflow post and this tutorial helped me to get started with the OpenCV HOGDescriptor.

asked Nov 12 '12 21:11 by solvingPuzzles



1 Answer

I believe you got the right idea.

The original paper, Histograms of Oriented Gradients for Human Detection (page 2), says:

[...] The detector window is tiled with a grid of overlapping blocks in which Histogram of Oriented Gradient feature vectors are extracted. [...]

[...] Tiling the detection window with a dense (in fact, overlapping) grid of HOG descriptors and using the combined feature vector [...]

All the paper says is that the descriptors are tiled together; it gives no detail on exactly how. My guess is that nothing fancy happens here (otherwise they would have described it), i.e. the block histograms are simply concatenated in a regular order (from left to right, top to bottom).

After all, that is the simplest reasonable way to lay out the data.


Edit: You can convince yourself further by looking at how people access and visualize the data:

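int descriptorDataIdx = 0; // running index into the flat descriptor vector

for (int blockx=0; blockx<blocks_in_x_dir; blockx++) // outer loop: block x position
{
    for (int blocky=0; blocky<blocks_in_y_dir; blocky++) // inner loop: block y position
    {
        // this snippet assumes 4 cells per block; with one cell per block,
        // as in the question, the bound would be 1
        for (int cellNr=0; cellNr<4; cellNr++)
        {
            for (int bin=0; bin<gradientBinSize; bin++)
            {
                float gradientStrength = descriptorValues[ descriptorDataIdx ];
                descriptorDataIdx++;

                // ... ...

            } // for (all bins)
        } // for (all cells)
    } // for (all block y pos)
} // for (all block x pos)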
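
The running counter in that loop can also be written in closed form. The sketch below is only a restatement of the loop nesting shown above (block x outermost, then block y, then cell, then bin); `loopOrderIndex` and its parameter names are hypothetical and not part of OpenCV. Taken at face value, the nesting means consecutive blocks step in y first, so it is worth checking against your own data before relying on either order.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the value descriptorDataIdx has when the loop above
// reaches (blockx, blocky, cellNr, bin), assuming exactly that nesting order.
std::size_t loopOrderIndex(int blockx, int blocky, int cellNr, int bin,
                           int blocksInY, int cellsPerBlock, int nbins)
{
    assert(blockx >= 0);
    assert(blocky >= 0 && blocky < blocksInY);
    assert(cellNr >= 0 && cellNr < cellsPerBlock);
    assert(bin >= 0 && bin < nbins);

    // Blocks are counted with y varying fastest, because y is the inner loop.
    const std::size_t block = static_cast<std::size_t>(blockx) * blocksInY + blocky;
    return (block * cellsPerBlock + cellNr) * nbins + bin;
}
```

For the setup in the question (8 blocks in y, 1 cell per block, 15 bins), `loopOrderIndex(0, 1, 0, 0, 8, 1, 15)` is 15 and `loopOrderIndex(1, 0, 0, 0, 8, 1, 15)` is 120.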
answered Oct 19 '22 23:10 by herohuyongtao
