OpenCV: C++ and C performance comparison

Right now I'm developing an application using the OpenCV C++ API. The application processes video.

On the PC everything runs really fast. Today I decided to port the application to Android (to use the camera as video input). Fortunately, there's OpenCV for Android, so I just added my native code to a sample Android application. Everything works fine except performance. I benchmarked my application and found that it runs at 4-5 fps, which is not acceptable (my device has a single-core 1 GHz processor). I want it to run at about 10 fps.

Does it make sense to fully rewrite my application in C? I know that things like std::vector are much more comfortable for the developer, but I don't care about that.

It seems that OpenCV's C interface has the same functions/methods as the C++ interface.

I googled this question but didn't find anything.

Thanks for any advice.

ArtemStorozhuk asked Jul 07 '12 15:07


5 Answers

I've worked quite a lot with Android and optimizations (I wrote a video processing app that processes a frame in 4ms) so I hope I will give you some pertinent answers.

There is not much difference between the C and C++ interfaces in OpenCV. Some of the code is written in C and has a C++ wrapper, and some vice versa. Any significant differences between the two (as measured by Shervin Emami) are either regressions, bug fixes or quality improvements. You should stick with the latest OpenCV version.

Why not rewrite?

You will spend a good deal of time, which you could use much better. The C interface is cumbersome, and the chance to introduce bugs or memory leaks is high. You should avoid it, in my opinion.

Advice for optimization

A. Turn on optimizations.

Enabling compiler optimizations and disabling debug assertions can both make a big difference in your running time.

B. Profile your app.

Do it first on your computer, since it is much easier. Use a profiler (the Visual Studio profiler, for example) to identify the slow parts, then optimize them. Never optimize because you think something is slow, but because you measured it. Start with the slowest function, optimize it as much as possible, then take the second slowest. Measure your changes to make sure the code is indeed faster.
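If no profiler is at hand (on the device, for instance), a rough alternative is to time each stage of the frame pipeline by hand. This is only a sketch: StageTimer and the stage names are made up for illustration and are not part of OpenCV.

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: accumulates wall-clock time per named pipeline stage.
class StageTimer {
public:
    // Runs fn once and adds its duration to the running total for `stage`.
    template <typename Fn>
    void measure(const std::string& stage, Fn&& fn) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        Entry& e = entries_[stage];
        e.totalMs += std::chrono::duration<double, std::milli>(stop - start).count();
        ++e.calls;
    }

    // Total milliseconds spent in a stage (0 if it was never measured).
    double totalMs(const std::string& stage) const {
        const auto it = entries_.find(stage);
        return it == entries_.end() ? 0.0 : it->second.totalMs;
    }

    // Number of times a stage was measured.
    int calls(const std::string& stage) const {
        const auto it = entries_.find(stage);
        return it == entries_.end() ? 0 : it->second.calls;
    }

    // Prints one line per stage so the slowest one is easy to spot.
    void report() const {
        for (const auto& kv : entries_)
            std::printf("%-12s %9.3f ms over %d calls\n",
                        kv.first.c_str(), kv.second.totalMs, kv.second.calls);
    }

private:
    struct Entry {
        double totalMs = 0.0;
        int calls = 0;
    };
    std::map<std::string, Entry> entries_;
};
```

Wrap each stage of the per-frame code in measure("capture", ...), measure("filter", ...) and so on, call report() every few hundred frames, and attack the stage with the largest total first.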

C. Focus on algorithms.

A faster algorithm can improve performance by orders of magnitude (100x). A C++ trick will give you maybe a 2x performance boost.

Classical techniques:

  • Resize your video frames to be smaller. Often you can extract the information from a 200x300px image instead of a 1024x768 one. The area of the first one is about 13 times smaller.

  • Use simpler operations instead of complicated ones. Use integers instead of floats. And never use double in a matrix or a for loop that executes thousands of times.

  • Do as little calculation as possible. Can you track an object only in a specific area of the image, instead of processing it all for all the frames? Can you make a rough/approximate detection on a very small image and then refine it on a ROI in the full frame?
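As a rough illustration of the first two points (smaller frames, integer math), here is a sketch that halves an 8-bit grayscale frame by averaging 2x2 blocks. In a real OpenCV pipeline cv::resize would normally do this job; downscaleHalf is a made-up name and plain buffers are used only to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Halves the width and height of an 8-bit grayscale image by averaging
// each 2x2 block, using integer arithmetic only.
// Assumes src holds width * height bytes in row-major order.
// A trailing odd row or column is dropped.
std::vector<std::uint8_t> downscaleHalf(const std::vector<std::uint8_t>& src,
                                        int width, int height) {
    const int outW = width / 2;
    const int outH = height / 2;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y) {
        const std::uint8_t* row0 = src.data() + static_cast<std::size_t>(2 * y) * width;
        const std::uint8_t* row1 = row0 + width;
        std::uint8_t* out = dst.data() + static_cast<std::size_t>(y) * outW;
        for (int x = 0; x < outW; ++x) {
            const int sum = row0[2 * x] + row0[2 * x + 1] +
                            row1[2 * x] + row1[2 * x + 1];
            out[x] = static_cast<std::uint8_t>((sum + 2) / 4); // rounded average
        }
    }
    return dst;
}
```

Every later stage then touches a quarter of the pixels, which usually matters far more than how the loop itself is written.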

D. Use C where it matters

In loops, it may make sense to use C style instead of C++. A pointer to a data matrix or a float array is much faster than mat.at or std::vector<>. Often the bottleneck is a nested loop. Focus on it. It doesn't make sense to replace vector<> all over the place and spaghettify your code.
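To make the difference concrete, here is a sketch with a made-up GrayImage type (it is not an OpenCV class). In OpenCV the corresponding calls would be cv::Mat::at<T>() for per-pixel access and cv::Mat::ptr<T>(row) for a row pointer. How much you gain depends on the build settings, so measure it.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical minimal image type, for illustration only.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data; // row-major, width * height bytes

    GrayImage(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h) {}

    // Per-pixel accessor with a bounds check and an index computation per call.
    std::uint8_t& at(int y, int x) {
        if (x < 0 || x >= width || y < 0 || y >= height)
            throw std::out_of_range("pixel out of range");
        return data[static_cast<std::size_t>(y) * width + x];
    }

    // Pointer to the first pixel of row y; no per-pixel checks afterwards.
    std::uint8_t* row(int y) {
        return data.data() + static_cast<std::size_t>(y) * width;
    }
};

// Inverts the image through the checked accessor.
void invertChecked(GrayImage& img) {
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x)
            img.at(y, x) = static_cast<std::uint8_t>(255 - img.at(y, x));
}

// Inverts the image C style: fetch the row pointer once, then walk it.
void invertRowPointer(GrayImage& img) {
    for (int y = 0; y < img.height; ++y) {
        std::uint8_t* p = img.row(y);
        for (int x = 0; x < img.width; ++x)
            p[x] = static_cast<std::uint8_t>(255 - p[x]);
    }
}
```

Only the hot inner loop changes; the rest of the code can keep using the comfortable C++ interface.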

E. Avoid hidden costs

Some OpenCV functions convert data to double, process it, then convert back to the input format. Beware of them, they kill performance on mobile devices. Examples: warping, scaling, type conversions. Color space conversions are also known to be slow. Prefer grayscale obtained directly from the native YUV frame.
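For the grayscale case, here is a sketch that assumes the camera hands you NV21 frames (the usual Android preview format). Check what your device actually delivers. In NV21 the first width * height bytes are the luma (Y) plane, followed by the interleaved V/U samples, so the grayscale image is already there and no color conversion is needed. grayFromNv21 is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies the luma plane out of an NV21 frame.
// Assumed layout: width * height Y bytes, then width * height / 2 bytes
// of interleaved V/U samples (which are ignored here).
std::vector<std::uint8_t> grayFromNv21(const std::uint8_t* nv21,
                                       int width, int height) {
    const std::size_t lumaBytes = static_cast<std::size_t>(width) * height;
    std::vector<std::uint8_t> gray(lumaBytes);
    std::memcpy(gray.data(), nv21, lumaBytes);
    return gray;
}
```

If the frame buffer outlives the processing step, you can skip even this copy and work on the first width * height bytes in place.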

F. Use vectorization

ARM processors implement vectorization with a technology called NEON. Learn to use it. It is powerful!

A small example:

float* a, *b, *c;
// allocate a, b and c with 1000001 elements each, then initialize a and b
for(int i=0;i<1000001;i++)
    c[i] = a[i]*b[i];

can be rewritten as follows. It's more verbose, but much faster.

#include <arm_neon.h>

float* a, *b, *c;
// allocate a, b and c with 1000001 elements each, then initialize a and b
float32x4_t a_, b_, c_;
int i;
// stop while a full group of 4 elements remains, so the loads stay in bounds
for(i=0;i+4<=1000001;i+=4)
{
    a_ = vld1q_f32( &a[i] ); // load 4 floats from a into a NEON register
    b_ = vld1q_f32( &b[i] );
    c_ = vmulq_f32(a_, b_); // perform 4 float multiplies in parallel
    vst1q_f32( &c[i], c_); // store the four results in c
}
// the vector size is not always a multiple of 4 or 8 or 16.
// Process the remaining elements
for(;i<1000001;i++)
    c[i] = a[i]*b[i];

Purists say you must write in assembler, but for a regular programmer that's a bit daunting. I had good results using gcc intrinsics, like in the above example.
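If the same source also has to build for the PC, the intrinsics can be wrapped behind a guard with a scalar fallback. This is a sketch, not tested on every toolchain: it assumes the compiler defines __ARM_NEON (or the older __ARM_NEON__) when NEON is enabled, as GCC and Clang do. HAVE_NEON and multiplyArrays are made-up names.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1 // hypothetical macro name used only in this sketch
#else
#define HAVE_NEON 0
#endif

// Computes c[i] = a[i] * b[i] for i in [0, n).
// Uses NEON for full groups of 4 floats when available, scalar code otherwise.
void multiplyArrays(const float* a, const float* b, float* c, std::size_t n) {
    std::size_t i = 0;
#if HAVE_NEON
    for (; i + 4 <= n; i += 4) {
        const float32x4_t va = vld1q_f32(a + i); // load 4 floats from a
        const float32x4_t vb = vld1q_f32(b + i); // load 4 floats from b
        vst1q_f32(c + i, vmulq_f32(va, vb));     // multiply and store 4 results
    }
#endif
    // Remaining elements (all of them when NEON is not available).
    for (; i < n; ++i)
        c[i] = a[i] * b[i];
}
```

On the PC this compiles to the plain loop, so you can keep profiling and testing there and still get the NEON path on the device.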

Another way to jump-start is to convert hand-coded SSE-optimized code in OpenCV into NEON. SSE is the NEON equivalent in Intel processors, and many OpenCV functions use it, like here. This is the image filtering code for uchar matrices (the regular image format). You shouldn't blindly convert instructions one by one, but take it as an example to start with.

You can read more about NEON in this blog and the following posts.

G. Pay attention to image capture

It can be surprisingly slow on a mobile device. Optimizing it is device- and OS-specific.

Sam answered Oct 17 '22 10:10


Before making any decision like this, you should profile your code to locate the hotspots in your code. Without this information, any changes you make to speed things up will be guesswork. Have you tried this Android NDK profiler?

Alex Wilson answered Oct 17 '22 10:10


There are some performance tests done by Shervin Emami on his website. You can check them to get some ideas.

http://www.shervinemami.info/timingTests.html

Hope it helps.

(Also, it would be nice if you shared your own findings somewhere if you find a way to boost performance.)

Abid Rahman K answered Oct 17 '22 10:10


I guess the question needs to be reformulated as: is C faster than C++? And the answer is no. Both are compiled to native machine code, and C++ is designed to be as fast as C. The STL (especially the ISO standard library) is also designed with care to be as fast as raw pointers while offering more flexibility. The only reason to use C is that your platform doesn't support C++. In my humble opinion, don't convert everything to C, as you'll probably get almost the same performance. Try instead to improve your code or use other OpenCV functionality to do what you want.

Not convinced? Then write a simple function, once in C and once in C++, run each in a loop 100 million times, and measure the time yourself. Maybe this will help you make the right decision.
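A minimal sketch of such an experiment could look like this. The function names are made up, and the numbers will depend on the compiler and flags, so build with optimizations on before drawing conclusions.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// C style: raw pointer plus length.
long long sumArray(const int* data, std::size_t n) {
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// C++ style: std::vector with a range-based for loop.
long long sumVector(const std::vector<int>& values) {
    long long total = 0;
    for (int v : values)
        total += v;
    return total;
}

// Returns the wall-clock time of one call to fn, in milliseconds.
template <typename Fn>
double timeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Times both versions on n elements, prints the timings,
// and returns true if the two sums agree.
bool runComparison(std::size_t n) {
    const std::vector<int> values(n, 1);
    long long a = 0;
    long long b = 0;
    const double msArray = timeMs([&] { a = sumArray(values.data(), values.size()); });
    const double msVector = timeMs([&] { b = sumVector(values); });
    std::printf("C array: %.3f ms, std::vector: %.3f ms\n", msArray, msVector);
    return a == b;
}
```

Call runComparison(100000000) a few times and compare the timings; a single run is too noisy to conclude anything.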

Moataz Elmasry answered Oct 17 '22 10:10


I've never used C or C++ on Android. But on a PC you can get C++ to run as fast as C code (sometimes even faster). Most of C++ was designed specifically to allow more features, but not at the cost of speed (templates are resolved at compile time). Most compilers are pretty good at optimizing your code: your std::vector calls will be inlined and the code will be almost the same as using a native C array.

I'd suggest you look for another way of improving your performance. Maybe there are some multimedia hardware extensions on the Android device that you can access and use to optimize the code.

user1494736 answered Oct 17 '22 12:10