faster Math.exp() via JNI?

https://stackoverflow.com/questions/66402

09-06-2019

Question

I need to call Math.exp() from Java very frequently. Is it possible to get a native version that runs faster than Java's Math.exp()?

I tried plain JNI + C, but it was slower than pure Java.

Solution

+1 to writing your own exp() implementation, provided this really is a bottleneck in your application. If you can tolerate a little inaccuracy, there are a number of extremely efficient approximation algorithms for the exponential out there, some of them dating back centuries. As I understand it, Java's exp() implementation is fairly slow, even for an algorithm that must return "exact" results.

Oh, and don't be afraid to write that exp() implementation in pure Java. JNI has a lot of overhead, and the JVM can optimize bytecode at runtime, sometimes even beyond what C/C++ is able to achieve.

OTHER TIPS

This has already been requested several times (see e.g. here). Here is an approximation to Math.exp(), copied from this blog posting:

public static double exp(double val) {
    final long tmp = (long) (1512775 * val + (1072693248 - 60801));
    return Double.longBitsToDouble(tmp << 32);
}

It is basically the same as a lookup table with 2048 entries and linear interpolation between the entries, but all done with IEEE floating-point tricks. It's 5 times faster than Math.exp() on my machine, but this can vary drastically if you run the JVM with -server.
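Before relying on the approximation, it is worth measuring how far it drifts from Math.exp(). The following is a minimal sketch of such a check; the class name ExpApproxCheck and the helper maxRelativeError are hypothetical names introduced here, and fastExp is the quoted snippet unchanged. It assumes inputs well inside the range where the bit trick is valid (roughly |x| < 700).

```java
final class ExpApproxCheck {

    // The approximation quoted above, copied without changes.
    static double fastExp(double val) {
        final long tmp = (long) (1512775 * val + (1072693248 - 60801));
        return Double.longBitsToDouble(tmp << 32);
    }

    // Largest relative error against Math.exp() over [lo, hi],
    // sampled at `samples` evenly spaced points (samples >= 2).
    static double maxRelativeError(double lo, double hi, int samples) {
        double worst = 0.0;
        for (int i = 0; i < samples; i++) {
            double x = lo + (hi - lo) * i / (samples - 1);
            double exact = Math.exp(x);
            double err = Math.abs(fastExp(x) - exact) / exact;
            worst = Math.max(worst, err);
        }
        return worst;
    }
}
```

Calling ExpApproxCheck.maxRelativeError(-20, 20, 100001) should report an error of a few percent, so the trick only suits code that can live with that.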

Use Java's.

Also, cache the results of exp() so that you can look up an answer instead of calculating it again.

You'd want to wrap whatever loop's calling Math.exp() in C as well. Otherwise, the overhead of marshalling between Java and C will overwhelm any performance advantage.

You might be able to get it to run faster if you do them in batches. Making a JNI call adds overhead, so you don't want to do it for each exp() you need to calculate. I'd try passing an array of 100 values and getting the results to see if it helps performance.

The real question is: has this become a bottleneck for you? Have you profiled your application and found this to be a major cause of the slowdown?

If not, I would recommend using Java's version. Try not to optimize prematurely, as this will just slow development down. You may spend an extended amount of time on a problem that may not be a problem.

That being said, I think your test gave you your answer. If JNI + C is slower, use Java's version.

Commons Math3 ships with an optimized version: FastMath.exp(double x). It did speed up my code significantly.

Fabien ran some tests and found out that it was almost twice as fast as Math.exp():

 0.75s for Math.exp     sum=1.7182816693332244E7
 0.40s for FastMath.exp sum=1.7182816693332244E7

Here is the javadoc:

Computes exp(x), function result is nearly rounded. It will be correctly rounded to the theoretical value for 99.9% of input values, otherwise it will have a 1 ULP error.

Method:

    Lookup intVal = exp(int(x))
    Lookup fracVal = exp(int(x-int(x) / 1024.0) * 1024.0 );
    Compute z as the exponential of the remaining bits by a polynomial minus one
    exp(x) = intVal * fracVal * (1 + z)

Accuracy: Calculation is done with 63 bits of precision, so result should be correctly rounded for 99.9% of input values, with less than 1 ULP error otherwise.
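
The table-plus-polynomial method outlined in the javadoc can be illustrated with a much-simplified sketch. This is not the FastMath source: the class name SplitExp, the table bound of 700, and the plain Taylor polynomial are assumptions made for illustration, and the tables are filled with Math.exp() rather than with extended-precision constants, so it shows the structure of the method and not its accuracy guarantees.

```java
final class SplitExp {
    private static final int MAX_INT = 700;                 // assumed table bound
    private static final double[] INT_TABLE = new double[2 * MAX_INT + 1];
    private static final double[] FRAC_TABLE = new double[1024];

    static {
        // intVal table: exp(k) for k = -700 .. 700
        for (int k = -MAX_INT; k <= MAX_INT; k++) {
            INT_TABLE[k + MAX_INT] = Math.exp(k);
        }
        // fracVal table: exp(j / 1024) for j = 0 .. 1023
        for (int j = 0; j < 1024; j++) {
            FRAC_TABLE[j] = Math.exp(j / 1024.0);
        }
    }

    static double exp(double x) {
        // Outside the tables (or NaN): fall back to the library call.
        if (Double.isNaN(x) || x < -MAX_INT || x >= MAX_INT) {
            return Math.exp(x);
        }
        int k = (int) Math.floor(x);                        // integer part
        double frac = x - k;                                // in [0, 1]
        int j = Math.min(1023, (int) (frac * 1024.0));      // index of the 1/1024 slice
        double r = frac - j / 1024.0;                       // remaining bits, at most 1/1024
        // z = exp(r) - 1 via a short Taylor polynomial (Horner form)
        double z = r * (1.0 + r * (0.5 + r * (1.0 / 6.0 + r * (1.0 / 24.0))));
        return INT_TABLE[k + MAX_INT] * FRAC_TABLE[j] * (1.0 + z);
    }
}
```

Because the remainder r is at most 1/1024, four polynomial terms are enough for the truncation error to drop below double precision; the cost is two table lookups and a handful of multiplications.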

Since the Java code will get compiled to native code with the just-in-time (JIT) compiler, there's really no reason to use JNI to call native code.

Also, you shouldn't cache the results of a method where the input parameters are floating-point real numbers. The gains obtained in time will be very much lost in amount of space used.

The problem with using JNI is the overhead involved in making the call to JNI. The Java virtual machine is pretty optimized these days, and calls to the built-in Math.exp() are automatically optimized to call straight through to the C exp() function, and they might even be optimized into straight x87 floating-point assembly instructions.

There is simply an overhead associated with using JNI; see also: http://java.sun.com/docs/books/performance/1st_edition/html/JPNativeCode.fm.html

So, as others have suggested, try to batch together the operations that would go through JNI.

Write your own, tailored to your needs.

For instance, if you only ever need integer powers of two, you can use bit-shifting. If you work with a limited range or set of values, you can use look-up tables. If you don't need pin-point precision, you can use an imprecise but faster algorithm.
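
Reading the bit-shifting remark as "compute 2^n for an integer n" (an assumption about what was meant), a minimal sketch could look like this. The class and method names are hypothetical; Math.scalb is the standard library call for scaling a double by a power of two.

```java
final class PowerOfTwo {

    // 2^n as a long, valid for 0 <= n <= 62.
    static long pow2(int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        return 1L << n;
    }

    // 2^n as a double; also handles negative n.
    static double pow2AsDouble(int n) {
        return Math.scalb(1.0, n);
    }
}
```

Neither method evaluates a transcendental function, which is why this special case is so much cheaper than a general exp() or pow().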

There is a cost associated with calling across the JNI boundary.

If you could move the loop that calls exp() into the native code as well, so that there is just one native call, then you might get better results, but I doubt it will be significantly faster than the pure Java solution.

I don't know the details of your application, but if you have a fairly limited set of possible arguments for the call, you could use a pre-computed look-up table to make your Java code faster.

There are faster algorithms for exp, depending on what you're trying to accomplish. Is the problem space restricted to a certain range? Do you only need a certain resolution, precision, or accuracy?

If you define your problem very well, you may find that you can use a table with interpolation, for instance, which will blow nearly any other algorithm out of the water.

What constraints can you apply to exp to gain that performance trade-off?

-Adam

I run a fitting algorithm and the minimum error of the fitting result is way larger than the precision of the Math.exp().

Transcendental functions are always much slower than addition or multiplication and are a well-known bottleneck. If you know that your values are in a narrow range, you can simply build a lookup table (two sorted arrays: one for inputs, one for outputs). Use Arrays.binarySearch to find the correct index and interpolate between the elements at [index] and [index+1].
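
The lookup-table idea above can be sketched as follows. The class name ExpTable, its constructor parameters, and the fallback to Math.exp() outside the table are assumptions made for this illustration.

```java
import java.util.Arrays;

final class ExpTable {
    private final double[] xs;   // sorted inputs
    private final double[] ys;   // exp(xs[i])

    // Builds n evenly spaced entries covering [min, max]; requires n >= 2.
    ExpTable(double min, double max, int n) {
        if (n < 2 || !(min < max)) {
            throw new IllegalArgumentException("need n >= 2 and min < max");
        }
        xs = new double[n];
        ys = new double[n];
        double step = (max - min) / (n - 1);
        for (int i = 0; i < n; i++) {
            xs[i] = min + i * step;
            ys[i] = Math.exp(xs[i]);
        }
    }

    double exp(double x) {
        // Outside the table (or NaN): fall back to the library call.
        if (!(x >= xs[0] && x <= xs[xs.length - 1])) {
            return Math.exp(x);
        }
        int i = Arrays.binarySearch(xs, x);
        if (i >= 0) {
            return ys[i];                       // exact hit
        }
        int hi = -i - 1;                        // insertion point: first entry > x
        int lo = hi - 1;
        double t = (x - xs[lo]) / (xs[hi] - xs[lo]);
        return ys[lo] + t * (ys[hi] - ys[lo]);  // linear interpolation
    }
}
```

With evenly spaced entries, as here, the index could also be computed directly as (int) ((x - min) / step), which avoids the binary search altogether; the search is only needed when the inputs are unevenly spaced.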

Another method is to split the number. Let's take e.g. 3.81 and split it into 3 + 0.81. First, multiply e = 2.718 by itself three times (e^3) to get 20.08.

Now for 0.81. For all values between 0 and 1, the well-known exponential series converges quickly:

1 + x + x^2/2 + x^3/6 + x^4/24 + ...

Take as many terms as you need for precision; unfortunately, convergence gets slower as x approaches 1. Say you stop at x^4: then you get 2.2445 instead of the correct 2.2479.

Then multiply the two results, 2.718^3 = 20.08 and 2.718^0.81 ≈ 2.2445, and you get 45.07, an error of roughly two parts in a thousand (correct: 45.15).
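
The split-and-series approach can be sketched as below. The class name SplitSeriesExp, the `terms` parameter, and the handling of negative or very large arguments are assumptions added for the illustration; the answer itself only describes the positive case.

```java
final class SplitSeriesExp {

    // Approximates exp(x) as e^n * series(f), where n is the integer part of x,
    // f the fractional part, and the series is truncated after f^terms / terms!.
    static double exp(double x, int terms) {
        if (Double.isNaN(x) || x > 709.0) {
            return Math.exp(x);              // NaN or overflow range: defer to the library
        }
        if (x < 0) {
            return 1.0 / exp(-x, terms);     // exp(-x) = 1 / exp(x)
        }
        int n = (int) x;                     // integer part
        double f = x - n;                    // fractional part, in [0, 1)

        double intPart = 1.0;
        for (int i = 0; i < n; i++) {
            intPart *= Math.E;               // repeated multiplication by e
        }

        double sum = 1.0;
        double term = 1.0;
        for (int k = 1; k <= terms; k++) {
            term *= f / k;                   // term is now f^k / k!
            sum += term;
        }
        return intPart * sum;
    }
}
```

Calling SplitSeriesExp.exp(3.81, 4) stops the series at x^4, as in the worked example, and lands about 0.15% below Math.exp(3.81); raising `terms` to around 12 brings the two into close agreement.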

It might not be relevant any more, but just so you know, in the newest releases of the OpenJDK (see here), Math.exp should be made an intrinsic (if you don't know what that is, check here).

This will make performance very hard to beat on most architectures, because it means the HotSpot VM will replace the call to Math.exp with a processor-specific implementation of exp at runtime. You are unlikely to beat these calls, as they are optimized for the architecture.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow