OpenCV: C++ and C performance comparison

Right now I'm developing an application using the OpenCV API (C++). The application does video processing.

On the PC everything works really fast. Today I decided to port the application to Android (to use the camera as video input). Fortunately, there's OpenCV for Android, so I just added my native code to the sample Android application. Everything works fine except performance. I benchmarked my application and found that it runs at 4-5 fps, which is not acceptable (my device has a single-core 1 GHz processor). I want it to run at about 10 fps.

Does it make sense to fully rewrite my application in C? I know that things like std::vector are much more comfortable for the developer, but I don't care about that.

It seems that OpenCV's C interface has the same functions/methods as the C++ interface.

I googled this question but didn't find anything.

Thanks for any advice.

asked Jul 07 '12 15:07

ArtemStorozhuk

5 Answers

I've worked quite a lot with Android and optimizations (I wrote a video processing app that processes a frame in 4ms) so I hope I will give you some pertinent answers.

There is not much difference between the C and C++ interfaces in OpenCV. Some of the code is written in C and has a C++ wrapper, and some vice versa. Any significant differences between the two (as measured by Shervin Emami) are either regressions, bug fixes or quality improvements. You should stick with the latest OpenCV version.

Why not rewrite?

You will spend a good deal of time, which you could use much better. The C interface is cumbersome, and the chance to introduce bugs or memory leaks is high. You should avoid it, in my opinion.

Advice for optimization

A. Turn on optimizations.

Both compiler optimizations and the lack of debug assertions can make a big difference in your running time.

B. Profile your app.

Do it first on your computer, since it is much easier. Use the Visual Studio profiler to identify the slow parts, then optimize them. Never optimize because you think something is slow, but because you measured it. Start with the slowest function, optimize it as much as possible, then take the second slowest. Measure your changes to make sure the code is indeed faster.
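When a full profiler is not at hand (for instance on the device), a rough timer around a suspect function is often enough to confirm a measurement. Here is a minimal sketch using only the standard library; `best_time_ms` is a hypothetical helper name, not something from OpenCV, and taking the best of several runs is just one way to reduce noise.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper (not part of OpenCV): calls `fn` `runs` times and
// returns the fastest wall-clock time of a single call, in milliseconds.
template <typename Fn>
double best_time_ms(Fn&& fn, int runs = 5) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

You would wrap the code under test in a lambda, for example `best_time_ms([&] { process_frame(frame); })`, where `process_frame` stands for whatever function you suspect. Time it before and after each change.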

C. Focus on algorithms.

A faster algorithm can improve performance by orders of magnitude (100x). A C++ trick will give you maybe a 2x performance boost.

Classical techniques:

Resize your video frames to be smaller. Often you can extract the information from a 200x300px image instead of a 1024x768 one. The area of the first is about 13 times smaller.
Use simpler operations instead of complicated ones. Use integers instead of floats. And never use double in a matrix or a for loop that executes thousands of times.
Do as little calculation as possible. Can you track an object only in a specific area of the image, instead of processing it all for all the frames? Can you make a rough/approximate detection on a very small image and then refine it on a ROI in the full frame?
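To make the first two points concrete, here is a small sketch that halves a grayscale frame using integer arithmetic only. It works on a plain row-major byte buffer rather than on an OpenCV type, and `downscale_half` is a hypothetical name. In a real OpenCV app you would more likely call `cv::resize` (for example with `cv::INTER_AREA`); this just shows how little work a small-image pass needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: halves an 8-bit grayscale image stored row by row,
// averaging each 2x2 block with integer arithmetic only.
// Assumption: width and height are even.
std::vector<std::uint8_t> downscale_half(const std::vector<std::uint8_t>& src,
                                         int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int out_w = width / 2;
    const int out_h = height / 2;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(out_w) * out_h);
    for (int y = 0; y < out_h; ++y) {
        // Two source rows feed one destination row.
        const std::uint8_t* top = src.data() + static_cast<std::size_t>(2 * y) * width;
        const std::uint8_t* bottom = top + width;
        std::uint8_t* out = dst.data() + static_cast<std::size_t>(y) * out_w;
        for (int x = 0; x < out_w; ++x) {
            const int sum = top[2 * x] + top[2 * x + 1] +
                            bottom[2 * x] + bottom[2 * x + 1];
            out[x] = static_cast<std::uint8_t>((sum + 2) >> 2);  // rounded average
        }
    }
    return dst;
}
```

Applying it twice gives a quarter-size image with 16 times fewer pixels, which is often enough for a rough detection pass before refining on a ROI in the full frame.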

D. Use C where it matters

In loops, it may make sense to use C style instead of C++. A pointer to a data matrix or a float array is much faster than mat.at or std::vector<>. Often the bottleneck is a nested loop. Focus on it. It doesn't make sense to replace vector<> all over the place and spaghettify your code.
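A sketch of what "C style in the hot loop" can look like: fetch the row pointer once per row and index it directly in the inner loop. The function below works on a raw buffer so it stays self-contained; `count_bright` and its parameters are made up for illustration. With a `cv::Mat` you would get the row pointer from `mat.ptr<uchar>(y)`, and `stride` plays the role of `mat.step`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts pixels brighter than `threshold` in a row-major 8-bit image.
// The row pointer is computed once per row, so the inner loop is plain
// pointer indexing with no per-pixel accessor call or bounds check.
// `stride` is the distance in bytes between the starts of two rows.
std::size_t count_bright(const std::uint8_t* data, int width, int height,
                         std::size_t stride, std::uint8_t threshold) {
    assert(data != nullptr && width >= 0 && height >= 0);
    assert(stride >= static_cast<std::size_t>(width));
    std::size_t count = 0;
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = data + static_cast<std::size_t>(y) * stride;
        for (int x = 0; x < width; ++x)
            count += row[x] > threshold;
    }
    return count;
}
```

Only the nested loop is written this way. The rest of the app can keep using `std::vector` and `cv::Mat` normally.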

E. Avoid hidden costs

Some OpenCV functions convert data to double, process it, then convert back to the input format. Beware of them, they kill performance on mobile devices. Examples: warping, scaling, type conversions. Also, color space conversions are known to be lazy. Prefer grayscale obtained directly from native YUV.
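For the grayscale point, here is a sketch under one loud assumption: the camera delivers NV21 frames (the usual Android preview layout), where the first width*height bytes are the luma (Y) plane and the interleaved V/U bytes follow. In that case the grayscale image is already sitting at the start of the buffer. `luma_from_nv21` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: `frame` holds one NV21 image: width*height luma bytes,
// followed by width*height/2 interleaved V/U bytes.
// The luma plane already is the grayscale image, so no color conversion runs.
std::vector<std::uint8_t> luma_from_nv21(const std::vector<std::uint8_t>& frame,
                                         int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t luma_size = static_cast<std::size_t>(width) * height;
    assert(frame.size() >= luma_size + luma_size / 2);
    return std::vector<std::uint8_t>(
        frame.begin(), frame.begin() + static_cast<std::ptrdiff_t>(luma_size));
}
```

The copy is only there to keep the sketch simple. You can also wrap the first width*height bytes in place, for example with the `cv::Mat(rows, cols, CV_8UC1, data)` constructor, which does not copy the pixels.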

F. Use vectorization

ARM processors implement vectorization with a technology called NEON. Learn to use it. It is powerful!

A small example:

float* a, *b, *c;
// init a and b to 1000001 elements
for(int i=0;i<1000001;i++)
    c[i] = a[i]*b[i];

can be rewritten as follows. It's more verbose, but much faster.

#include <arm_neon.h>

float *a, *b, *c;
// init a and b to 1000001 elements
float32x4_t a_, b_, c_;
int i;
// stop while a full group of 4 elements is left, so the loads stay in bounds
for (i = 0; i + 4 <= 1000001; i += 4)
{
    a_ = vld1q_f32(&a[i]);   // load 4 floats from a into a NEON register
    b_ = vld1q_f32(&b[i]);
    c_ = vmulq_f32(a_, b_);  // perform 4 float multiplies in parallel
    vst1q_f32(&c[i], c_);    // store the four results in c
}
// the vector size is not always a multiple of 4 or 8 or 16.
// Process the remaining elements
for (; i < 1000001; i++)
    c[i] = a[i] * b[i];

Purists say you must write in assembler, but for a regular programmer that's a bit daunting. I had good results using gcc intrinsics, like in the above example.
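If the same source also has to build on the PC, one possible arrangement is to guard the intrinsics with the compiler's NEON macro and let the scalar loop handle both the non-ARM build and the leftover elements. This is a sketch, not the only way to do it; `multiply` is a hypothetical function name, and I'm assuming a GCC or Clang toolchain that defines `__ARM_NEON` (or the older `__ARM_NEON__`) when NEON is enabled.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Element-wise product: c[i] = a[i] * b[i] for i in [0, n).
void multiply(const float* a, const float* b, float* c, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && c != nullptr));
    std::size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // NEON path: 4 floats per iteration, only while a full group of 4 is left.
    for (; i + 4 <= n; i += 4) {
        const float32x4_t va = vld1q_f32(a + i);
        const float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(c + i, vmulq_f32(va, vb));
    }
#endif
    // Scalar path: the tail on ARM, or the whole array on other targets.
    for (; i < n; ++i)
        c[i] = a[i] * b[i];
}
```

This way you can test the logic on the PC and only measure the NEON gain on the device.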

Another way to jump-start is to convert hand-coded SSE-optimized code in OpenCV into NEON. SSE is the NEON equivalent on Intel processors, and many OpenCV functions use it, for example the image filtering code for uchar matrices (the regular image format). You shouldn't blindly convert instructions one by one, but take it as an example to start with.

You can read more about NEON in this blog and the following posts.

G. Pay attention to image capture

It can be surprisingly slow on a mobile device. Optimizing it is device and OS specific.

answered Oct 17 '22 10:10

Sam

Before making any decision like this, you should profile your code to locate the hotspots in your code. Without this information, any changes you make to speed things up will be guesswork. Have you tried this Android NDK profiler?

answered Oct 17 '22 10:10

Alex Wilson

There are some performance tests done by Shervin Emami on his website. You can check them to get some ideas.

http://www.shervinemami.info/timingTests.html

Hope it helps.

(Also, it would be nice if you shared your own findings somewhere if you find a way to boost performance.)

answered Oct 17 '22 10:10

Abid Rahman K

I guess the question needs to be reformulated as: is C faster than C++? And the answer is no. Both are compiled to native machine code, and C++ is designed to be as fast as C. The STL containers (especially in the ISO standard library) are also designed to be as fast as raw pointers, and they offer more flexibility. The only reason to use C is that your platform doesn't support C++. In my humble opinion, don't convert everything to C, as you'll probably get almost the same performance. Try instead to improve your code or use other functionality in OpenCV to do what you want.

Not convinced? Well, then write a simple function, once in C and once in C++, run it in a loop 100 million times, and measure the time yourself. Maybe this will help you make the right decision.
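Something along these lines would do for that experiment. It is only a sketch: `sum_c`, `sum_cpp` and `compare` are made-up names, and summing integers is just a stand-in for your real function. Build it with optimizations turned on, otherwise the comparison says little.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// C style: raw pointer plus length.
long long sum_c(const int* data, std::size_t n) {
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// C++ style: std::vector and a range-based for loop.
long long sum_cpp(const std::vector<int>& data) {
    long long total = 0;
    for (int v : data) total += v;
    return total;
}

// Runs both versions `repeats` times over the same data and prints the timings.
void compare(std::size_t n, int repeats) {
    const std::vector<int> data(n, 1);
    using steady = std::chrono::steady_clock;
    long long total_c = 0, total_cpp = 0;

    const auto t0 = steady::now();
    for (int r = 0; r < repeats; ++r) total_c += sum_c(data.data(), data.size());
    const auto t1 = steady::now();
    for (int r = 0; r < repeats; ++r) total_cpp += sum_cpp(data);
    const auto t2 = steady::now();

    assert(total_c == total_cpp);  // both versions must compute the same result
    const std::chrono::duration<double, std::milli> c_ms = t1 - t0;
    const std::chrono::duration<double, std::milli> cpp_ms = t2 - t1;
    std::printf("C style: %.3f ms, C++ style: %.3f ms (checksum %lld)\n",
                c_ms.count(), cpp_ms.count(), total_c);
}
```

Call it as, say, `compare(1000000, 100)`. An optimizing compiler may simplify a loop this trivial, so treat the numbers as a rough indication and repeat the test with a function closer to your real workload.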

answered Oct 17 '22 10:10

Moataz Elmasry

I've never used C or C++ on Android. But on a PC you can get C++ to run as fast as C code (sometimes even faster). Most of C++ was designed specifically to allow more features, but not at the cost of speed (templates are resolved at compile time). Most compilers are pretty good at optimizing your code, and your std::vector calls will be inlined, so the code will be almost the same as using a native C array.

I'd suggest you look for another way of improving your performance. Maybe there are some multimedia hardware extensions on the Android device that you can get access to and use to optimize the code.

answered Oct 17 '22 12:10

user1494736

Tags: c++, performance, c, opencv
