Question

In the following SO question: https://stackoverflow.com/questions/2067955/fast-bitmap-blur-for-android-sdk @zeh claims a port of a java blur algorithm to C runs 40 times faster.

Given that the bulk of the code includes only calculations, and all allocations are only done "one time" before the actual algorithm number crunching - can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and dramatically reduce the gap to native compiled code speed?

Note: I have not confirmed the x40 performance gain myself for this algorithm, but all serious image manipulation algorithm I encounter for Android, are using the NDK - so this supports the notion that NDK code will run much faster.

Was it helpful?

Solution

For algorithms that operate over arrays of data, there are two things that significantly change performance between a language like Java, and C:

  • Array bound checking: Java will check every access, bmap[i], and confirm i is within the array bounds. If the code tries to access out of bounds, you will get a useful exception. C & C++ do not check anything and just trust your code. The best case response to an out of bounds access is a page fault. A more likely result is "unexpected behavior".

  • Pointers: You can significantly reduce the operations by using pointers.

Take this innocent example of a common filter (similar to blur, but 1D):

for(int i = 0; i < ndata - ncoef; ++i) {  
    z[i] = 0;  
    for(int k = 0; k < ncoef; ++k) {  
        z[i] += c[k] * d[i + k];  
    }  
}  

When you access an array element, coef[k] is:

  • Load address of array coef into register;
  • Load value k into a register;
  • Sum them;
  • Go get memory at that address.

Every one of those array accesses can be improved because you know that the indexes are sequential. Neither the compiler, nor the JIT can know that the indexes are sequential so they cannot optimize fully (although they keep trying).

In C++, you would write code more like this:

int d[10000];  
int z[10000];  
int coef[10];  
int* zptr;  
int* dptr;  
int* cptr;  
dptr = &(d[0]); // Just being overly explicit here, more likely you would dptr = d;  
zptr = &(z[0]); // or zptr = z;  
for(int i = 0; i < (ndata - ncoef); ++i) {  
    *zptr = 0; 
    *cptr = coef;  
    *dptr = d + i;  
    for(int k = 0; k < ncoef; ++k) {  
        *zptr += *cptr * *dptr;  
        cptr++;  
        dptr++;  
    }  
    zptr++;  
}  
       

When you first do something like this (and succeed in getting it correct) you will be surprised how much faster it can be. All the array address calculations of fetching the index and summing the index and base address are replaced with an increment instruction.

For 2D array operations such as blur on an image, an innocent code data[r,c] involves two value fetches, a multiply and a sum. So with 2D arrays the benefits of pointers allows you to remove multiply operations.

So the language allows real reduction in the operations the CPU must perform. The cost is that the C++ code is horrendous to read and debug. Errors in pointers and buffer overflows are food for hackers. But when it comes to raw number grinding algorithms, the speed improvement is too tempting to ignore.

OTHER TIPS

Another factor not mentioned above is the garbage collector. The problem is that garbage collection takes time, plus it can run at any time. This means that a Java program which creates lots of temporary objects (note that some types of String operations can be bad for this) will often trigger the garbage collector, which in turn will slow down the program (app).

Following is an list of Programming Language based on the levels,

  • Assembly Language ( Machine Language, Lover Level )
  • C Language ( Middle Level )
  • C++, Java, .net, ( Higher Level )

Here Lower level language has direct access to the Hardware. As long as the level gets increased the access to the hardware gets decrease. So Assembly Language's code runs at the highest speed while other language's code runs based on their levels.

This is the reason that C Language's code run much faster than the Java's code.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top