Android: why is native code so much faster than Java code

In the following SO question, https://stackoverflow.com/questions/2067955/fast-bitmap-blur-for-android-sdk, @zeh claims a port of a Java blur algorithm to C runs 40 times faster.

Given that the bulk of the code is pure calculation, and all allocations are done "one time" before the actual number crunching, can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and dramatically reduce the gap to native compiled code speed?

Note: I have not confirmed the x40 performance gain myself for this algorithm, but every serious image manipulation algorithm I encounter for Android uses the NDK, which supports the notion that NDK code runs much faster.

asked Jan 28 '14 by Guy

People also ask

Is C code faster than Java?

Java is compiled to bytecode, which is then interpreted or JIT-compiled. It also has automatic garbage collection, and it is farther from machine code in the first place. Because of this, C code tends to run faster than Java, but the difference depends on what is being done and how well the code has been optimized.

Why is C++ so much faster than Java?

Speed and performance: Java is a favorite among developers, but because the code must first be interpreted at run time, it is also slower. C++ is compiled to native binaries, so it runs immediately and is therefore generally faster than Java programs.

Is Objective C faster than Java?

Run-time performance: when creating iOS apps, developers benefit from the strong run-time performance of the compiled Objective-C language. Since Java code has to be both compiled and interpreted, its performance tends to be lower.

Can Java be fast?

In fact, when compared against its peers, Java is pretty fast. Java is able to compete with, and sometimes outperform, other interpreted languages thanks to how it manages memory, performs just-in-time (JIT) compilation, and takes advantage of various features of its underlying architecture.


1 Answer

For algorithms that operate over arrays of data, there are two things that significantly change performance between a language like Java and C:

  • array bounds checking - Java checks every access, bmap[i], and confirms that i is within the array bounds. If the code tries to access out of bounds, you get a useful exception. C and C++ do not check anything and just trust your code. The best-case response to an out-of-bounds access is a page fault; a more likely result is "unexpected behavior". (A sketch of what that check amounts to follows this list.)

  • pointers - You can significantly reduce the address-calculation work by using pointers.
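
To make the bounds-checking point concrete, here is a minimal C++ sketch of what a Java-style checked access amounts to. checked_get is a hypothetical helper, and a real VM emits the check inline rather than calling a function, but the extra compare-and-branch on every element is the cost in question:

#include <cstddef>
#include <stdexcept>

// Hedged sketch: roughly what a Java-style checked read looks like when
// written out by hand. A real VM inlines the check rather than calling a
// helper, but the compare-and-branch on every access is the point.
inline int checked_get(const int* a, std::size_t len, std::size_t i) {
    if (i >= len) {                        // bounds test on every access
        throw std::out_of_range("index");  // Java throws ArrayIndexOutOfBoundsException
    }
    return a[i];                           // only then the actual load
}

// The unchecked C/C++ form is just the load, with no test:
inline int unchecked_get(const int* a, std::size_t i) {
    return a[i];
}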

Take this innocent example of a common filter (similar to blur, but 1D):

for(i=0; i<ndata-ncoef; ++i) {
  z[i] = 0;
  for(k=0; k<ncoef; ++k) {
    z[i] += coef[k] * d[i+k];  // every pass recomputes base address + index
  }
}

When you access an array element, coef[k], the steps are:

  • load the base address of the array coef into a register
  • load the value of k into a register
  • scale k by the element size and add it to the base address
  • fetch the memory at that address

Every one of those array accesses can be improved because you know that the indexes are sequential. Neither the compiler nor the JIT can always prove that the indexes are sequential, so they cannot optimize fully (although they keep trying).

In C++, you would write code more like this:

int d[10000];
int z[10000];
int coef[10];
const int ndata = 10000; // sizes of d and z
const int ncoef = 10;    // size of coef
int* zptr;
int* dptr;
int* cptr;
int i, k;
dptr = &(d[0]); // Just being overly explicit here; more likely you would dptr = d;
zptr = &(z[0]); // or zptr = z;
for(i=0; i<(ndata-ncoef); ++i) {
  *zptr = 0;
  cptr = coef;  // reset to the first coefficient (assign the pointer, not *cptr)
  dptr = d + i; // start of the current input window
  for(k=0; k<ncoef; ++k) {
    *zptr += *cptr * *dptr;
    cptr++;
    dptr++;
  }
  zptr++;
}
       

When you first do something like this (and succeed in getting it correct), you will be surprised how much faster it can be. All the array address calculations of fetching the index, scaling it, and adding it to the base address are replaced with a single increment instruction.
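
If you want to see the difference on your own machine, a rough timing harness along these lines works. The array sizes, std::vector storage, and the checksum (to keep the optimizer honest) are my own choices, not part of the original answer:

#include <chrono>
#include <cstdio>
#include <vector>

// Hedged sketch of a timing harness for the two loops above.
int main() {
    const int ndata = 1 << 20;
    const int ncoef = 16;
    std::vector<int> d(ndata, 1), z(ndata, 0), coef(ncoef, 2);

    auto t0 = std::chrono::steady_clock::now();

    // Indexed version.
    for (int i = 0; i < ndata - ncoef; ++i) {
        z[i] = 0;
        for (int k = 0; k < ncoef; ++k) {
            z[i] += coef[k] * d[i + k];
        }
    }

    auto t1 = std::chrono::steady_clock::now();

    // Pointer version.
    int* zptr = z.data();
    for (int i = 0; i < ndata - ncoef; ++i) {
        *zptr = 0;
        const int* cptr = coef.data();
        const int* dptr = d.data() + i;
        for (int k = 0; k < ncoef; ++k) {
            *zptr += *cptr * *dptr;
            ++cptr;
            ++dptr;
        }
        ++zptr;
    }

    auto t2 = std::chrono::steady_clock::now();

    long long checksum = 0;
    for (int v : z) checksum += v;

    using us = std::chrono::microseconds;
    std::printf("indexed: %lld us, pointer: %lld us, checksum: %lld\n",
                (long long)std::chrono::duration_cast<us>(t1 - t0).count(),
                (long long)std::chrono::duration_cast<us>(t2 - t1).count(),
                checksum);
    return 0;
}

Do not be surprised if a modern optimizing compiler makes the two versions time nearly the same; as noted above, compilers keep trying to do this strength reduction themselves, and the gap was larger on the Dalvik-era JIT the question is about.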

For 2D array operations such as blurring an image, an innocent access like data[r][c] involves two value fetches, a multiply, and a sum just to form the address. So with 2D arrays, the benefit of pointers is that they also let you remove the multiply operations, as the sketch below shows.
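
A minimal sketch of that idea follows. The function names and the flat row-major layout are assumptions for illustration, not the blur code from the question:

// Hedged sketch: summing a row-major 2D image two ways.
long long sum_indexed(const int* data, int rows, int cols) {
    long long s = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            s += data[r * cols + c];  // multiply + add per pixel just to form the address
        }
    }
    return s;
}

long long sum_pointer(const int* data, int rows, int cols) {
    long long s = 0;
    const int* p = data;                              // one pointer walks the whole image
    const int* end = data + (long long)rows * cols;
    while (p != end) {
        s += *p++;                                    // address arithmetic is a single increment
    }
    return s;
}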

So the language allows a real reduction in the operations the CPU must perform. The cost is that the C++ code is horrendous to read and debug, and pointer errors and buffer overflows are food for hackers. But when it comes to raw number-crunching algorithms, the speed improvement is too tempting to ignore.

answered Sep 29 '22 by jdr5ca