
OpenCV ORB GPU implementation slower than CPU






I am running OpenCV's ORB algorithm on the frames of a video, and I noticed that the CPU version runs much faster than the GPU version. Here is the code:

#include <iostream>
#include "opencv2/core/core.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/gpu/gpu.hpp"
#include <fstream>
#include <sstream> 
#include <math.h>
#include <omp.h>

#include <algorithm>
#include <vector>
#include <string>

using namespace cv;
using namespace std;
using namespace cv::gpu;

void process_cpu(string vid, int start_frame, int end_frame)
{
    VideoCapture myCapture(vid);
    Mat frame, gray_frame;
    ORB myOrb(400);
    Mat descriptors;
    vector<KeyPoint> keypoints;

    myCapture.set(CV_CAP_PROP_POS_FRAMES, start_frame);

    for (int i=0; i<end_frame-start_frame; i++) {
        // Grab the next frame; stop early if the video runs out.
        if (!myCapture.read(frame)) break;
        cvtColor(frame, gray_frame, CV_RGB2GRAY);
        // Detect keypoints and compute descriptors on the CPU.
        myOrb(gray_frame, Mat(), keypoints, descriptors);
    }
}

void process_gpu(string vid, int start_frame, int end_frame)
{
    VideoCapture myCapture(vid);
    Mat frame, gray_frame;
    GpuMat gpu_frame;
    ORB_GPU myOrb(400);
    GpuMat keypoints, descriptors;

    myCapture.set(CV_CAP_PROP_POS_FRAMES, start_frame);

    for (int i=0; i<end_frame-start_frame; i++) {
        // Grab the next frame; stop early if the video runs out.
        if (!myCapture.read(frame)) break;
        cvtColor(frame, gray_frame, CV_RGB2GRAY);
        // Copy the grayscale frame to device memory before running ORB.
        gpu_frame.upload(gray_frame);
        myOrb.blurForDescriptor = true;
        myOrb(gpu_frame, GpuMat(), keypoints, descriptors);
    }
}

int main (int argc, char* argv[])
{
    int n = 4;
    VideoCapture myCapture(argv[1]);
    double frameNumber = myCapture.get(CV_CAP_PROP_FRAME_COUNT);

    double TimeStart = 0;
    double TotalTime = 0;
    TimeStart = (double)getTickCount();

    process_gpu(argv[1], 0, frameNumber);

    // Elapsed ticks divided by ticks per second gives seconds.
    TotalTime = (double)getTickCount() - TimeStart;
    TotalTime = TotalTime / getTickFrequency();
    cout << "Gpu Time : " << TotalTime << endl;

    TimeStart = (double)getTickCount();

    process_cpu(argv[1], 0, frameNumber);

    TotalTime = (double)getTickCount() - TimeStart;
    TotalTime = TotalTime / getTickFrequency();
    cout << "Cpu Time : " << TotalTime << endl;

    return 0;
}
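
For reference, the start/stop timing that main repeats for each run can be wrapped in a single helper. This is only a minimal sketch, assuming a C++11 compiler and using std::chrono in place of getTickCount() and getTickFrequency(); the name secondsTaken is hypothetical and not part of OpenCV.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not an OpenCV function): runs `work` once and
// returns the wall-clock time it took, in seconds.
template <typename F>
double secondsTaken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    // duration<double> defaults to a period of one second.
    return std::chrono::duration<double>(stop - start).count();
}
```

It could be called as secondsTaken([&] { process_gpu(argv[1], 0, frameNumber); }). Either way, the measured time covers everything inside process_gpu and process_cpu, including opening the video and the CPU-side grayscale conversion, not just ORB itself.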

After running this on a video with 3000 frames and 720x480 resolution, the GPU time is 54 sec and the CPU time 24 sec. I get similar results with other videos (not HD). PC specs:

  • i7-4770K CPU 3.50 GHz

  • NVIDIA GeForce GTX 650

  • 16 GB RAM

Other feature detection/description algorithms like SURF perform faster with the GPU implementation on my machine.

Has anyone compared the two implementations of ORB on their machine?

Mr Alexander asked Jul 17 '14 12:07


1 Answer

Taken from this post:

cv::ORB applies a GaussianBlur (about 20 lines from the end of orb.cpp) before computing descriptors. There is no way to control this through the public interface.

cv::gpu::ORB_GPU has a public member bool blurForDescriptor, which by default constructs as false. When I set it instead to true, I find that min/avg/max hamming distance drops to 0/7.2/30 bits, which seems much more reasonable.
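
To make the min/avg/max figures above concrete, here is a minimal sketch of how such Hamming-distance statistics can be computed for binary descriptors. It uses only the standard library; the names Descriptor, hammingDistance, HammingStats and hammingStats are hypothetical, and the assumption that descriptor i from one implementation is compared with descriptor i from the other is mine, since the quoted post does not say how the pairs were formed. ORB descriptors are 32 bytes (256 bits) each.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One binary descriptor stored as raw bytes (32 bytes for ORB).
typedef std::vector<std::uint8_t> Descriptor;

// Number of bits that differ between two equal-length descriptors.
int hammingDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    int bits = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint8_t x = static_cast<std::uint8_t>(a[i] ^ b[i]);
        while (x) {                       // clear the lowest set bit each pass
            x = static_cast<std::uint8_t>(x & (x - 1));
            ++bits;
        }
    }
    return bits;
}

struct HammingStats { int min; double avg; int max; };

// Assumption: first[i] and second[i] describe the same keypoint.
HammingStats hammingStats(const std::vector<Descriptor>& first,
                          const std::vector<Descriptor>& second) {
    assert(!first.empty() && first.size() == second.size());
    const int d0 = hammingDistance(first[0], second[0]);
    HammingStats s = { d0, 0.0, d0 };
    long total = 0;
    for (std::size_t i = 0; i < first.size(); ++i) {
        const int d = hammingDistance(first[i], second[i]);
        s.min = std::min(s.min, d);
        s.max = std::max(s.max, d);
        total += d;
    }
    s.avg = static_cast<double>(total) / first.size();
    return s;
}
```

With OpenCV descriptors, each row of the descriptor matrix would be copied into one Descriptor before calling hammingStats.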

Rick Smith answered Oct 17 '22 06:10
