Question

My code is quite simple:

double d = 405, g = 9.8, v = 63;
double r = d * g / (v * v);
printf("%s\n",(r>1.0)?"GT":"LE");

Here are my results:

  • g++-mingw32 v4.8.1: LE (r actually compares equal to 1.0)
  • g++ on Ubuntu: GT (this result comes from a friend; I do not have a Linux machine at hand)
  • VC++ 11: GT
  • C# (.NET 4.5): GT
  • Python v2.7.3: GT (this also comes from a friend)
  • Haskell (GHCi v7.6.3): GT

g++-mingw, VC++, C# and Haskell are all running on my machine, which has an i7-2630QM.

The Python (win32) result comes from a friend, who also gets LE from his g++-mingw 3.4.2.

The Ubuntu result comes from another friend.

Only g++ gives me LE, and the others are all GT.

I just want to know which one is wrong, g++ or the rest.

Or what SHOULD it be, GT or LE, in IEEE 754?


Solution

The IEEE 754 64-bit binary result is GT.

The two exactly representable values bracketing 9.8 are:

9.7999999999999989341858963598497211933135986328125
9.800000000000000710542735760100185871124267578125

The second one is closer to 9.8, so it is the one chosen in the default round-to-nearest mode. It is slightly larger than 9.8, so the product d * g also comes out slightly larger than it would in real-number arithmetic: after rounding to double it is 3969.00000000000045474735088646411895751953125. The conversion of v to double is exact, as is the v*v multiplication. The result is therefore the division of a number slightly greater than 3969 by 3969, and the rounded quotient is 1.0000000000000002220446049250313080847263336181640625, which is greater than 1.
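The chain of roundings described above can be checked with a short C sketch. It assumes the compiler converts the literal 9.8 to the nearest double and that storing into a volatile double rounds each intermediate to 64 bits; the function names are made up for illustration.

```c
#include <float.h>

/* Hypothetical helper: 1 if the literal 9.8 became the upper bracketing
   value, 5516909543528858 * 2^-49. Both operands are exactly representable
   and dividing by a power of two is exact. */
int g_is_upper_neighbor(void)
{
    volatile double g = 9.8;
    return g == 5516909543528858.0 / 562949953421312.0; /* divisor is 2^49 */
}

/* Hypothetical helper: d * g rounded to double. */
double product_in_double(double d, double g)
{
    volatile double p = d * g;
    return p;
}

/* Hypothetical helper: d * g / (v * v) with every step rounded to double. */
double ratio_in_double(double d, double g, double v)
{
    volatile double num = d * g;
    volatile double den = v * v;
    volatile double r = num / den;
    return r;
}

/* 1 if the ratio for the question's inputs is the double just above 1.0. */
int ratio_is_next_after_one(void)
{
    return ratio_in_double(405, 9.8, 63) == 1.0 + DBL_EPSILON;
}
```

Under these assumptions, product_in_double(405, 9.8) should match the product quoted above and ratio_is_next_after_one() should return 1, which is why r > 1.0 is true in pure double arithmetic.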

OTHER TIPS

The conversion from a decimal fraction to a binary fraction is exact only if the decimal fraction can be written as a finite sum of binary fractions such as 0.5, 0.25, 0.125, and so on.

The number 9.8 in your example contains the fraction 0.8, which cannot be represented exactly in the binary number system. Thus different compilers can give you different results, depending on the precision they use to represent and evaluate fractional numbers.

Run your program using the number 9.75, then all the compilers will give you the same result, because

0.75 = 0.5 + 0.25 = 2^-1 + 2^-2

So the number 9.75 can be represented exactly by using binary fractions.
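As a rough illustration of the point above, the following C sketch tests whether a value is an integer multiple of 2^-bits, that is, whether it needs at most that many binary digits after the point. The helper name is hypothetical, and the check assumes the scaled value stays well below 2^63.

```c
/* Hypothetical helper: returns 1 if x is an integer multiple of 2^-bits.
   Assumes x is finite and |x| * 2^bits stays below 2^63. */
int fits_in_fraction_bits(double x, int bits)
{
    double scaled = x;
    for (int i = 0; i < bits; i++)
        scaled *= 2.0;                      /* multiplying by 2 is exact */
    /* After scaling, x is a multiple of 2^-bits iff scaled is an integer. */
    return scaled == (double)(long long)scaled;
}
```

For example, fits_in_fraction_bits(9.75, 2) is true because 9.75 = 39/4, while fits_in_fraction_bits(9.8, 2) is false. The double that stands in for 9.8 only passes the check for a much larger bit count, because it is a rounded neighbor of 9.8 rather than 9.8 itself.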

The difference likely occurs when d * g is evaluated, because the mathematical result of that product must be rounded upward to produce a value representable in double, but the long double result is more accurate.

Most compilers convert 405 and 63 to double exactly and convert 9.8 to 9.800000000000000710542735760100185871124267578125, although the C++ standard gives them some leeway. The evaluation of v * v is also generally exact, since the mathematical result is exactly representable.

Commonly, on Intel processors, compilers evaluate d * g in one of two ways: using double arithmetic, or using long double with Intel’s 80-bit floating-point format. When evaluated with double, 405 × 9.800000000000000710542735760100185871124267578125 produces 3969.00000000000045474735088646411895751953125, and dividing this by 3969 yields a number slightly greater than one.

When evaluated with long double, the product is 3969.000000000000287769807982840575277805328369140625. This product, although greater than 3969, is slightly less than the double product, and dividing it by 3969 using long double arithmetic produces 1.000000000000000072533125339280246635098592378199100494384765625. When this value is assigned to r, the compiler is required to convert it to double. (Extra precision may be used only in intermediate expressions, not in assignments or casts.) This value is sufficiently close to one that rounding it to double produces exactly one.
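The long double path described above can be imitated explicitly. This is only a sketch: it assumes long double has at least a 64-bit significand (as with the x87 80-bit format), and the function names are invented for the example.

```c
#include <float.h>

/* Hypothetical helper: evaluates d * g / (v * v) entirely in long double
   and rounds to double only at the end, roughly what x87 code may do. */
double ratio_via_long_double(double d, double g, double v)
{
    long double wide = (long double)d * g / ((long double)v * v);
    volatile double narrowed = (double)wide;   /* force the final rounding */
    return narrowed;
}

/* 1 if long double carries at least a 64-bit significand. */
int long_double_is_wider(void)
{
    return LDBL_MANT_DIG >= 64;
}
```

Where long_double_is_wider() returns 1, ratio_via_long_double(405, 9.8, 63) should come out as exactly 1.0, which would print "LE". On platforms where long double is the same as double, the sketch just reproduces the double result.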

You can mitigate some (but not all) of the variation between compilers by using casts or assignments with each individual operation:

double t0 = d * g;
double t1 = v * v;
double r = t0/t1;
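To see which evaluation strategy a given compiler reports, C99 provides the FLT_EVAL_METHOD macro in <float.h>. The sketch below just maps its value to a description; the function name is hypothetical, and a compiler may report a value that does not fully match the code it generates.

```c
#include <float.h>

/* Hypothetical helper: describes how the compiler says it evaluates
   floating-point expressions, based on the C99 macro FLT_EVAL_METHOD. */
const char *eval_method_name(void)
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:  return "operations evaluated in their own type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "everything evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
#else
    return "FLT_EVAL_METHOD not provided";
#endif
}
```

A value of 2 would be consistent with the extended-precision behavior described above, while 0 would be consistent with plain double arithmetic.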

To answer your question: in exact real-number arithmetic the expression evaluates to exactly 1, so the test r>1.0 would be false and the result printed would be "LE".

In reality you are running into the problem that a number like 9.8 cannot be represented exactly as a binary floating-point number (there are hundreds of good links on the web that explain why). If you need exact math, you have to use integers, or a decimal type such as BigDecimal, or something similar.
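As a minimal sketch of the integer approach, the comparison d * g > v * v can be done exactly by expressing g in tenths. The helper name and the choice of tenths are assumptions made for this example; it also assumes the products do not overflow and that v is nonzero.

```c
/* Hypothetical helper: compares d * g with v * v exactly, where g is given
   in tenths (98 means 9.8). Returns 1, 0 or -1 for greater, equal, less. */
int compare_exact(long long d, long long g_tenths, long long v)
{
    long long lhs = d * g_tenths;   /* 10 * d * g */
    long long rhs = 10 * v * v;     /* 10 * v * v */
    return (lhs > rhs) - (lhs < rhs);
}
```

For the values in the question, compare_exact(405, 98, 63) returns 0, since 405 * 98 and 10 * 63 * 63 are both 39690.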

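I tested the code below; it returns GT.

#include <stdio.h>

int main() {
    /* Note: the f suffix makes these float literals, so g starts from
       9.8f = 9.80000019073486328125, not from the double nearest 9.8. */
    double d = 405.0f, g = 9.8f, v = 63.0f;
    double r = d * g / (v * v);
    printf("%s\n",(r>1.0f)?"GT":"LE");
    return 0;
}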

GCC sees the literal 405 as an int, so "double d = 405" is effectively "double d = (double) 405". That conversion is exact.

Licensed under: CC-BY-SA with attribution