Adding long doubles gives the wrong answer in C++

https://stackoverflow.com/questions/12219259

29-06-2021

Question

I have a section of code that reads:

std::cerr << val1 << " " << val2 << std::endl;
val1 = val1 + val2;
std::cerr << val1 << std::endl;

Both val1 and val2 are long double.

The problem comes from the fact that the result of this is:

-5.000000000000722771452063564190e-01 2.710505431213761085018632002175e-20
-5.000000000000722771452063564190e-01

This doesn't make sense. It appears that val2 is NOT being added to val1, even though there are obviously enough digits in the fractional part of val1 for val2 to make a difference.

I'm stumped, anyone have any ideas?

I'm using GCC 4.2, I believe. Does G++ use the IEEE quadruple-precision format, or something else, like the 80-bit extended precision format? That would explain this problem, though why do more than 18 decimal places show up then?

Solution 2

Well, I should have guessed... it looks like long double on G++ (on x86) takes up as much storage as a quadruple-precision value, but is computed using the 80-bit extended precision format. So it will print lots of digits, but only about the first 19 of those are meaningful.
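
One way to check this on a given compiler is to ask std::numeric_limits what long double really provides. The following is only a minimal sketch, not something from the original answer: LongDoubleInfo and describe_long_double are hypothetical names chosen for illustration, and it assumes a C++11 compiler (for max_digits10).

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical helper type: a summary of how this platform implements long double.
struct LongDoubleInfo {
    std::size_t storage_bits;  // bits occupied in memory, including any padding
    int significand_bits;      // bits of precision used in arithmetic
    int max_decimal_digits;    // decimal digits needed to print a value exactly enough to round-trip
};

// Collects the properties reported by the standard library for long double.
LongDoubleInfo describe_long_double() {
    using lim = std::numeric_limits<long double>;
    LongDoubleInfo info;
    info.storage_bits = sizeof(long double) * CHAR_BIT;
    info.significand_bits = lim::digits;
    info.max_decimal_digits = lim::max_digits10;
    return info;
}
```

With the 80-bit x87 format, significand_bits would be 64 and max_decimal_digits 21, while storage_bits is typically 96 or 128 because of padding. A true IEEE quadruple-precision long double would report 113 significand bits. Digits printed beyond max_decimal_digits carry no extra information about the stored value.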

OTHER TIPS

If your val1 and val2 are printed correctly, then the output is correct:

-5.000000000000722771452063564190e-01 = -5.000000000000722771452063564190 X 10^(-1)

where ^ denotes "to the power of", and

2.710505431213761085018632002175e-20 = 2.710505431213761085018632002175 X 10^(-20)

Since |val1| >> |val2|
=> val2/val1 ≈ 0 (it is about -5.4 X 10^(-20)) .... eq (A)

Consider y=val1+val2
=> y= ((val1+val2)/val1)*val1  (multiplying and dividing by val1)
=> y= {(val1/val1)+(val2/val1)} * val1
=> y= {1+val2/val1}*val1
=> y ≈ {1+0}*val1 .........................................From eq (A)
=> y ≈ val1

That's why the output is -5.000000000000722771452063564190e-01: the change produced by the addition is too small to be represented in the binary long double format. With a 64-bit significand (80-bit extended precision), adjacent long double values near 0.5 are 2^(-64), or about 5.4 X 10^(-20), apart. val2 is only half of that, so the sum rounds back to val1 here.
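
The absorption described above can be checked directly by comparing the small operand with the gap between neighbouring long double values. This is a minimal sketch rather than part of the original answer: spacing_at and is_absorbed are hypothetical helper names, and it assumes a C++11 standard library (for the long double overload of std::nextafter) and the default round-to-nearest mode.

```cpp
#include <cmath>
#include <limits>

// Gap between |x| and the next representable long double of larger magnitude.
long double spacing_at(long double x) {
    const long double ax = std::fabs(x);
    return std::nextafter(ax, std::numeric_limits<long double>::infinity()) - ax;
}

// True when adding `small` to `big` leaves `big` unchanged after rounding.
bool is_absorbed(long double big, long double small) {
    // volatile forces the sum to be stored as a long double before comparing
    volatile long double sum = big + small;
    return sum == big;
}
```

For example, is_absorbed(-0.75L, spacing_at(-0.75L) / 4) is true, while adding a full spacing_at(-0.75L) does change the value. On a platform where long double is the 80-bit extended format, the spacing at the val1 from the question is 2^(-64), about 5.4 X 10^(-20), which is twice val2.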

Licensed under: CC-BY-SA with attribution
