Android: why is native code so much faster than Java code

In the following SO question, https://stackoverflow.com/questions/2067955/fast-bitmap-blur-for-android-sdk, @zeh claims a port of a Java blur algorithm to C runs 40 times faster.

Given that the bulk of the code is pure calculation, and all allocations are done "one time" before the actual number crunching, can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and dramatically reduce the gap to native compiled code speed?

Note: I have not confirmed the x40 performance gain myself for this algorithm, but every serious image manipulation algorithm I encounter for Android uses the NDK, which supports the notion that NDK code runs much faster.

asked Jan 28 '14 by Guy

People also ask

Is C code faster than Java?

Java is compiled to bytecode, which is then interpreted or JIT-compiled. It also has automatic garbage collection, and it is farther from machine code in the first place. Because of this, C code tends to run faster than Java, but the difference depends on what is being done and how well the code has been optimized.

Why is C++ so much faster than Java?

Speed and performance: Java is a favorite among developers, but because the code must first be interpreted at run time, it is also slower. C++ is compiled to native binaries, so it runs immediately and is therefore generally faster than Java programs.

Is Objective C faster than Java?

Run-time performance: when creating iOS apps, developers benefit from the strong run-time performance of the compiled Objective-C language. Since Java code has to be both compiled and interpreted, its performance tends to be lower.

Can Java be fast?

In fact, when compared against its peers, Java is pretty fast. Java is able to compete with, and sometimes outperform, other interpreted languages thanks to how it manages memory, performs just-in-time (JIT) compilation, and takes advantage of various features of its underlying architecture.


1 Answer

For algorithms that operate over arrays of data, there are two things that significantly change performance between a language like Java and C:

  • array bounds checking - Java checks every access, bmap[i], and confirms that i is within the array bounds. If the code tries to access out of bounds, you get a useful exception. C and C++ do not check anything and just trust your code. The best-case response to an out-of-bounds access is a page fault; a more likely result is "unexpected behavior". (A sketch of what that check amounts to follows this list.)

  • pointers - You can significantly reduce the address-calculation work by using pointers.
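
To make the bounds-checking point concrete, here is a minimal C++ sketch of what a Java-style checked access amounts to. checked_get is a hypothetical helper, and a real VM emits the check inline rather than calling a function, but the extra compare-and-branch on every element is the cost in question:

#include <cstddef>
#include <stdexcept>

// Hedged sketch: roughly what a Java-style checked read looks like when
// written out by hand. A real VM inlines the check rather than calling a
// helper, but the compare-and-branch on every access is the point.
inline int checked_get(const int* a, std::size_t len, std::size_t i) {
    if (i >= len) {                        // bounds test on every access
        throw std::out_of_range("index");  // Java throws ArrayIndexOutOfBoundsException
    }
    return a[i];                           // only then the actual load
}

// The unchecked C/C++ form is just the load, with no test:
inline int unchecked_get(const int* a, std::size_t i) {
    return a[i];
}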

Take this innocent example of a common filter (similar to blur, but 1D):

for(i=0; i<ndata-ncoef; ++i) {
  z[i] = 0;
  for(k=0; k<ncoef; ++k) {
    z[i] += coef[k] * d[i+k];  // every pass recomputes base address + index
  }
}

When you access an array element, coef[k], the steps are:

  • load the base address of the array coef into a register
  • load the value of k into a register
  • scale k by the element size and add it to the base address
  • fetch the memory at that address

Every one of those array accesses can be improved because you know that the indexes are sequential. Neither the compiler nor the JIT can always prove that the indexes are sequential, so they cannot optimize fully (although they keep trying).

In C++, you would write code more like this:

int d[10000];
int z[10000];
int coef[10];
const int ndata = 10000; // sizes of d and z
const int ncoef = 10;    // size of coef
int* zptr;
int* dptr;
int* cptr;
int i, k;
dptr = &(d[0]); // Just being overly explicit here; more likely you would dptr = d;
zptr = &(z[0]); // or zptr = z;
for(i=0; i<(ndata-ncoef); ++i) {
  *zptr = 0;
  cptr = coef;  // reset to the first coefficient (assign the pointer, not *cptr)
  dptr = d + i; // start of the current input window
  for(k=0; k<ncoef; ++k) {
    *zptr += *cptr * *dptr;
    cptr++;
    dptr++;
  }
  zptr++;
}
       

When you first do something like this (and succeed in getting it correct), you will be surprised how much faster it can be. All the array address calculations of fetching the index, scaling it, and adding it to the base address are replaced with a single increment instruction.
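
If you want to see the difference on your own machine, a rough timing harness along these lines works. The array sizes, std::vector storage, and the checksum (to keep the optimizer honest) are my own choices, not part of the original answer:

#include <chrono>
#include <cstdio>
#include <vector>

// Hedged sketch of a timing harness for the two loops above.
int main() {
    const int ndata = 1 << 20;
    const int ncoef = 16;
    std::vector<int> d(ndata, 1), z(ndata, 0), coef(ncoef, 2);

    auto t0 = std::chrono::steady_clock::now();

    // Indexed version.
    for (int i = 0; i < ndata - ncoef; ++i) {
        z[i] = 0;
        for (int k = 0; k < ncoef; ++k) {
            z[i] += coef[k] * d[i + k];
        }
    }

    auto t1 = std::chrono::steady_clock::now();

    // Pointer version.
    int* zptr = z.data();
    for (int i = 0; i < ndata - ncoef; ++i) {
        *zptr = 0;
        const int* cptr = coef.data();
        const int* dptr = d.data() + i;
        for (int k = 0; k < ncoef; ++k) {
            *zptr += *cptr * *dptr;
            ++cptr;
            ++dptr;
        }
        ++zptr;
    }

    auto t2 = std::chrono::steady_clock::now();

    long long checksum = 0;
    for (int v : z) checksum += v;

    using us = std::chrono::microseconds;
    std::printf("indexed: %lld us, pointer: %lld us, checksum: %lld\n",
                (long long)std::chrono::duration_cast<us>(t1 - t0).count(),
                (long long)std::chrono::duration_cast<us>(t2 - t1).count(),
                checksum);
    return 0;
}

Do not be surprised if a modern optimizing compiler makes the two versions time nearly the same; as noted above, compilers keep trying to do this strength reduction themselves, and the gap was larger on the Dalvik-era JIT the question is about.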

For 2D array operations such as blurring an image, an innocent access like data[r][c] involves two value fetches, a multiply, and a sum just to form the address. So with 2D arrays, the benefit of pointers is that they also let you remove the multiply operations, as the sketch below shows.
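
A minimal sketch of that idea follows. The function names and the flat row-major layout are assumptions for illustration, not the blur code from the question:

// Hedged sketch: summing a row-major 2D image two ways.
long long sum_indexed(const int* data, int rows, int cols) {
    long long s = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            s += data[r * cols + c];  // multiply + add per pixel just to form the address
        }
    }
    return s;
}

long long sum_pointer(const int* data, int rows, int cols) {
    long long s = 0;
    const int* p = data;                              // one pointer walks the whole image
    const int* end = data + (long long)rows * cols;
    while (p != end) {
        s += *p++;                                    // address arithmetic is a single increment
    }
    return s;
}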

So the language allows a real reduction in the operations the CPU must perform. The cost is that the C++ code is horrendous to read and debug, and pointer errors and buffer overflows are food for hackers. But when it comes to raw number-crunching algorithms, the speed improvement is too tempting to ignore.

answered Sep 29 '22 by jdr5ca