Pregunta

When I run

printf("%.8f\n", 971090899.9008999);
printf("%.8f\n", 9710908999008999.0 / 10000000.0);

I get

971090899.90089989
971090899.90089977

I know why neither is exact, but what I don't understand is why doesn't the second match the first?
I thought basic arithmetic operations (+ - * /) were always as accurate as possible...
Isn't the first number a more accurate result of the division than the second?

¿Fue útil?

Solución

Judging from the numbers you're using and based on the standard IEEE 754 floating point standard, it seems the left hand side of the division is too large to be completely encompassed in the mantissa (significand) of a 64-bit double.

You've got 52 bits worth of pure integer representation before you start bleeding precision. 9710908999008999 has ~54 bits in its representation, so it does not fit properly -- thus, the truncation and approximation begins and your end numbers get all finagled.

EDIT: As was pointed out, the first number that has no mathematical operations done on it doesn't fit either. But, since you're doing extra math on the second one, you're introducing extra rounding errors not present with the first number. So you'll have to take that into consideration too!

Otros consejos

Evaluating the expression 971090899.9008999 involves one operation, a conversion from decimal to the floating-point format.

Evaluating the expression 9710908999008999.0 / 10000000.0 involves three operations:

  • Converting 9710908999008999.0 from decimal to the floating-point format.
  • Converting 10000000.0 from decimal to the floating-point format.
  • Dividing the results of the above operations.

The second of those should be exact in any good C implementation, because the result is exactly representable. However, the other two add rounding errors.

C does not require implementations to convert decimal to floating-point as accurately as possible; it allows some slack. However, a good implementation does convert accurately, using extra precision if necessary. Thus, the single operation on 971090899.9008999 produces a more accurate result than the multiple operations.

Additionally, as we learn from a comment, the C implementation used by the OP converts 9710908999008999.0 to 9710908999008998. This is incorrect by the rules of IEEE-754 for the common round-to-nearest mode. The correct result is 9710908999009000. Both of these candidates are representable in IEEE-754 64-bit binary, and both are equidistant from the source value, 9710908999008999. The usual rounding mode is round-to-nearest, ties-to-even, meaning the candidate with the even low bit should be selected, which is 9710908999009000 (with significand 0x1.1400298aa8174), not 9710908999008998 (with significand 0x1.1400298aa8173). (IEEE 754 defines another round-to-nearest mode: ties-to-away, which selects the candidate with the larger magnitude, which is again 9710908999009000.)

The C standard permits some slack in conversions; either of these two candidates conforms to the C standard, but good implementations also conform to IEEE 754.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top