Possible compiler bug: Weird results using boost bessel functions with Intel compiler between two machines?

StackOverflow https://stackoverflow.com/questions/22178333

  •  03-06-2023
  •  | 
  •  

Question

I'm trying to use boost's bessel function (cyl_bessel_j) in a project. However, I'm finding that the function is returning results with an incorrect sign after around 2000 calls to it.

I've tested this between two machines, one is a CentOS 5.8 (Final) machine, where it oddly enough works, and a RHEL 6.3 (Santiago) machine where it fails.

Both machines are using Boost 1.50.0, and the 13.1.3 20130607 Intel compiler. The CentOS machine is using gcc 4.1.2 20080704, and the RHEL machine is using gcc 4.4.6 20120305.

Here is my code:

template<typename FloatType> FloatType funcT(FloatType z, FloatType phi,
                                             int n, int m, int p)
{
  using namespace boost::math;
  FloatType sqrt2PiZ = sqrt((2 * M_PI)/z);
  FloatType nrmLeg = normalizedLegendre(n,m,-sin(phi));
  FloatType besselJ = cyl_bessel_j(p + 0.5, z);

  std::cout << "  " << p << "," << z << "  besselJ: " << besselJ << std::endl;

  return sqrt2PiZ * nrmLeg * besselJ;
}

Running on the two machines, I found the only term that was coming out different between the two was the besselJ term. For the first 1980 calls to the function, they return identical results, however, on the 1981st call, the RHEL machine suddenly switches sign in its results. The first few failed terms print out as follows (RHEL SIDE):

  ...
  1,7.90559 besselJ: -0.0504874
  2,7.90559 besselJ: 0.264237
  3,7.90559 besselJ: 0.217608
  ...

Running a reference test in MATLAB using the besselJ function, I find that for these inputs, the signs should be reversed, and indeed the CentOS machine agrees with MATLAB.

I decided to write a simple hello-world style example with the besselJ function to try and determine the cause of the failure:

#include <boost/math/special_functions/bessel.hpp>
#include <iostream>

int main(int argc, char** argv)
{
  double besselTerm = boost::math::cyl_bessel_j(1.5, 7.90559);
  std::cout << besselTerm << std::endl;
}

This test returns the expected value of 0.0504874 on BOTH machines.

At this point, I'm ripping my hair out trying to determine the cause of the problem. It seems to be some weird compiler bug or stack corruption.But then, how can stack corruption give the exact correct answer with the exception of a single bit?

Has anyone run into an issue like this with boost or the Intel compiler in general?

Additional info: It has been found that the minimal case test breaks on GCC 4.4.6 with the --fast-math flag. I was also able to get the minimal test case to fail with the Intel compiler by using the -std=c++11 flag (which is what the larger project uses).

Was it helpful?

Solution

A bug was submitted to the boost library trac system.

It was found, however, that the problem is in the std c++ libraries and indicates an issue in GCC 4.4.7.

Boost devs created a workaround patch that addresses the issue at: https://github.com/boostorg/math/commit/9f8ffee4b7a3f82b1c582735d43522d7d0cde746

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top