Right now I'm developing some application using OpenCV API (C++
). This application does processing with video.
On the PC everything works really fast. And today I decided to port this application on Android (to use camera as videoinput). Fortunately, there's OpenCV for Android so I just added my native code to sample Android application. Everything works fine except perfomance. I benchmarked my application and found that application works with 4-5 fps, what is actually not acceptable (my device has singlecore 1ghz processor) - I want it to work with about 10 fps.
Does it make a sence to fully rewrite my application on C
? I know that using such things as std::vector
is much comfortable for developer, but I don't care about it.
It seems that OpenCV's C
interface has same functions/methods as C++
interface.
I googled this question but didn't find anything.
Thanks for any advice.
All google results for openCV state the same: that python will only be slightly slower. But not once have I seen any profiling on that. So I decided to do some and discovered: Python is significantly slower than C++ with opencv, even for trivial programs.
The OpenCV version ran at an impressive 50 ms per frame and was 6x faster than the reference implementation. Remember, these are both CPU implementations.
OpenCV is a popular Computer Vision library to develop applications built using C++ and C. It has several uses like Object Detection and Video Processing. Computer Vision overlaps with fields like Image Processing, Photogrammetry, and Pattern Recognition. A popular wrapper Emgu CV is used to run OpenCV using C#.
C++ as it's mostly the openCV main language and the computer vision algorithms mostly applied in C/C++ , you can use the python API but C++ will allow you to understand more what is happening.
I've worked quite a lot with Android and optimizations (I wrote a video processing app that processes a frame in 4ms) so I hope I will give you some pertinent answers.
There is not much difference between the C and C++ interface in OpenCV. Some of the code is written in C, and has a C++ wrapper, and some viceversa. Any significant differences between the two (as measured by Shervin Emami) are either regressions, bug fixes or quality improvements. You should stick with the latest OpenCV version.
Why not rewrite?
You will spend a good deal of time, which you could use much better. The C interface is cumbersome, and the chance to introduce bugs or memory leaks is high. You should avoid it, in my opinion.
Advice for optimization
A. Turn on optimizations.
Both compiler optimizations and the lack of debug assertions can make a big difference in your running time.
B. Profile your app.
Do it first on your computer, since it is much easier. Use visual studio profiler, to identify the slow parts. Optimize them. Never optimize because you think is slow, but because you measure it. Start with the slowest function, optimize it as much as possible, then take the second slower. Measure your changes to make sure it's indeed faster.
C. Focus on algorithms.
A faster algorithm can improve performance with orders of magnitude (100x). A C++ trick will give you maybe 2x performance boost.
Classical techniques:
Resize you video frames to be smaller. Often you can extract the information from a 200x300px image, instead of a 1024x768. The area of the first one is 10 times smaller.
Use simpler operations instead of complicated ones. Use integers instead of floats. And never use double
in a matrix or a for
loop that executes thousands of times.
Do as little calculation as possible. Can you track an object only in a specific area of the image, instead of processing it all for all the frames? Can you make a rough/approximate detection on a very small image and then refine it on a ROI in the full frame?
D. Use C where it matters
In loops, it may make sense to use C style instead of C++. A pointer to a data matrix or a float array is much faster than mat.at or std::vector<>. Often the bottleneck is a nested loop. Focus on it. It doesn't make sense to replace vector<> all over the place and spaghettify your code.
E. Avoid hidden costs
Some OpenCV functions convert data to double, process it, then convert back to the input format. Beware of them, they kill performance on mobile devices. Examples: warping, scaling, type conversions. Also, color space conversions are known to be lazy. Prefer grayscale obtained directly from native YUV.
F. Use vectorization
ARM processors implement vectorization with a technology called NEON. Learn to use it. It is powerful!
A small example:
float* a, *b, *c;
// init a and b to 1000001 elements
for(int i=0;i<1000001;i++)
c[i] = a[i]*b[i];
can be rewritten as follows. It's more verbose, but much faster.
float* a, *b, *c;
// init a and b to 1000001 elements
float32x4_t _a, _b, _c;
int i;
for(i=0;i<1000001;i+=4)
{
a_ = vld1q_f32( &a[i] ); // load 4 floats from a in a NEON register
b_ = vld1q_f32( &b[i] );
c_ = vmulq_f32(a_, b_); // perform 4 float multiplies in parrallel
vst1q_f32( &c[i], c_); // store the four results in c
}
// the vector size is not always multiple of 4 or 8 or 16.
// Process the remaining elements
for(;i<1000001;i++)
c[i] = a[i]*b[i];
Purists say you must write in assembler, but for a regular programmer that's a bit daunting. I had good results using gcc intrinsics, like in the above example.
Another way to jump-start is to convrt handcoded SSE-optimized code in OpenCV into NEON. SSE is the NEON equivalent in Intel processors, and many OpenCV functions use it, like here. This is the image filtering code for uchar matrices (the regular image format). You should't blindly convert instructions one by one, but take it as an example to start with.
You can read more about NEON in this blog and the following posts.
G. Pay attention to image capture
It can be surprisingly slow on a mobile device. Optimizing it is device and OS specific.
Before making any decision like this, you should profile your code to locate the hotspots in your code. Without this information, any changes you make to speed things up will be guesswork. Have you tried this Android NDK profiler?
There is some performance tests done by shervin imami on his website. You can check it to get some ideas.
http://www.shervinemami.info/timingTests.html
Hope it helps.
(And also, it would be nice if you share your own findings somewhere if you get any way for performance boost.)
I guess the question needs to be formulated to: is C faster than C++? and the answer is NO. Both are compiled to the native machine language and C++ is designed to be as fast as C As for the STL (espeically ISO standard) are also designed and taken care that they are as fast as pointers + they offer flexibility. The only reason to use C is that your platform doesn't support C++ In my humble openion, don't convert everything to C, as you'll probably get almost the same performance. and try instead to improve your code or use other functionalities of opencv to do what you want.
Not convinced? well then write a simple function, once in C and once in C++, and run it in a loop of 100 million times and measure the time yourself. Maybe this helps you taking the right decision
I've never used C or C++ in Android. But in a PC you can get C++ to run as fast as C code (sometimes even faster). Most of C++ was designed specifically to allow more features, but not at the cost of speed (Templates are solved at compile time). Most compilers are pretty good at optimizing your code, and your std::vector calls will be inlined and the code will be almost the same as using a native C array.
I'd suggest you look for another way of improving your performance. Maybe there are some multimedia hardware extensions in the Android you can get access to and use to optimize the code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With