Question

We've been having some weird crashes in some Intel FORTRAN code, and I eventually tracked the line down to:

L_F = EXP(-L_B2*L_BETASQ*L_DS)

Where the -L_B2*L_BETASQ*L_DS term evaluated to approximately -230. As it happens, EXP(-230) evaluates to about 1e-100. In all other known cases, L_DS is much smaller, resulting in the smallest (known) return from EXP being about 1e-50, which does not cause an error.

As soon as FORTRAN evaluates the clause EXP(-230), you get:

forrtl: severe (157): Program Exception - access violation
Image              PC        Routine            Line      Source

But no other information.

Exception 157 is generally concerned with interoperability, and you cannot debug into EXP in FORTRAN as it cannot find a particular .c file - which presumably means that EXP is implemented in C (which I find surprising).

My hypothesis is that FORTRAN has implemented EXP in C, but the interface cannot translate floats which are smaller than 1e-100 into REAL(4)s. As I had previously believed floats and REAL(4)s to be byte-wise identical, I cannot back this hypothesis up - and I cannot find anything anywhere about it.

Before I close this bug down, can anyone confirm or deny my hypothesis - or provide me with another?

Best regards,

Mike

EDIT: I'm going to mark this question as answered, as High Performance Mark has answered the immediate question.

My hypothesis is unfortunately incorrect - I've tried to trap the problem doing this:

L_ARG = L_B2*L_BETASQ*L_DS

IF (L_ARG .GT. 230.0) THEN
    L_F = 0.0
ELSE
    L_F = EXP(-L_ARG)
ENDIF

Unfortunately, the exception now (apparently) occurs in the L_ARG .GT. 230.0 clause. This either means that the debugging in Release mode is worse than I thought, or it's some sort of 'stored up' floating point error (see "Floating-point invalid operation" when inputting float to a stringstream).

Was it helpful?

Solution

Fortran hasn't (necessarily) implemented anything in C. The implementation of standard intrinsics is compiler-specific; it is common to find that implementations call libm or one of its relatives. From Intel's (or any other compiler writer's) point of view this makes sense, write one robust and fast implementation of exp in whatever language takes your fancy and call it from Fortran, C, Ada, COBOL, and all the other languages you've ever heard of. It may even be sensible to write it in C. Part of your hypothesis may, therefore, be correct.

However, unless you are explicitly writing C code and Fortran code and making a single binary from it there's not really any interoperability (in the Fortran standard sense) going on, all the dirty details of that are (or should be) hidden from you; the compiler ought to generate correct calls to whatever libraries it uses to implement exp and get the return values whatever they may be including NaNs and similar.

Certainly, the value of exp(-230) is 0.00000000 for a 4-byte real but I see no reason why a Fortran program which uses a library written in C should raise an access violation because it comes across those numbers. I think it is far more likely that you have an error elsewhere in your program, perhaps trying to access an array element outside the bounds of the array, and that your run-time fails to identify it at the right location in the source code. This is not uncommon.

EDIT

I wrote this stuff about interoperability before (re-)reading the question. Now that you've clarified that you are using interoperability features, it may be of interest or use ...

You can certainly not depend on your Fortran's real(4) and your C's float being identical; it is very likely but not certain. Most of the modern Fortran compilers (including Intel's) use kind type parameters which match the number of bytes in their representation, so the code 4 indicates that you are dealing with a 4-byte real which, on an IEEE-754 compliant processor, should be the same as a C float. The Fortran standards do not require any correspondence between those kind type parameters and the number of bytes used to represent a number. It's always worth checking your compiler documentation or doing some tests.

If you are concerned about interoperability you should probably be using Fortran's intrinsic features. For example if you

use :: iso_c_binding

you have available a number of constants including C_FLOAT which can be used like this:

real(C_FLOAT) :: a_fortran_float

If your Fortran compiler supports this then a_fortran_float should match a C float on a companion processor. This last term is left somewhat undefined, in practice compilers from the same stable seem always to be companions, for different stables sometimes yes sometimes no. The Intel Fortran and C and C++ compilers seem to be companions in the required sense. It would not surprise me to learn that the Intel Fortran and MS C++ compilers do not play nicely together.

My vague recollections of C include an uncertainty that float is standardised, in which case you can't be sure, without testing or reading the documentation, that you really do have a 4-byte IEEE single-precision floating-point number on that side of your interoperation either.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top