Question

I stumbled upon a difference in the way floating point arithmetic is done between MS VS 2010 builds for x86 and x64 (both executed on the same 64 bit machine).

This is a reduced code sample:

float a = 50.0f;
float b = 65.0f;
float c =  1.3f;
float d = a*c;
bool bLarger1 = d<b;
bool bLarger2 = (a*c)<b;

The boolean bLarger1 is always false (d is set to 65.0 in both builds). Variable bLarger2 is false for x64 but true for x86!

I am well aware of floating point arithmetic and the rounding effects that take place. I also know that 32 bit builds sometimes use different instructions for floating point operations than 64 bit builds. But in this case I am missing some information.

Why is there a discrepancy between bLarger1 and bLarger2 in the first place? Why is it only present in the 32 bit build?

Left: x86, Right: x64


Solution

The issue hinges on this expression:

bool bLarger2 = (a*c)<b;

I looked at the code generated under VS2008, not having VS2010 to hand. For 64 bit the code is:

000000013FD51100  movss       xmm1,dword ptr [a] 
000000013FD51106  mulss       xmm1,dword ptr [c] 
000000013FD5110C  movss       xmm0,dword ptr [b] 
000000013FD51112  comiss      xmm0,xmm1 

For 32 bit the code is:

00FC14DC  fld         dword ptr [a] 
00FC14DF  fmul        dword ptr [c] 
00FC14E2  fld         dword ptr [b] 
00FC14E5  fcompp           

So under 32 bit the calculation is performed in the x87 unit, and under 64 bit it is performed by the SSE unit.

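And the difference here is that the x87 operations are all performed to higher than single precision. By default the calculations are performed to double precision. On the other hand the SSE instructions used here (mulss, comiss) are pure single precision calculations. That also explains bLarger1: storing the product in the float variable d rounds it to single precision, which gives exactly 65.0, while in (a*c)<b the x87 unit compares the unrounded product, which is slightly less than 65.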
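The effect can be reproduced on any compiler by spelling out the two evaluation strategies with explicit types. This is only a sketch for illustration, not the code the compiler generates, and the function names are made up for this example:

```cpp
// Emulates the two evaluation strategies with explicit types.
// Both function names are hypothetical, chosen for this illustration.

// x87-like: the product stays at double precision for the comparison.
bool less_with_wide_intermediate(float a, float c, float b) {
    double product = static_cast<double>(a) * static_cast<double>(c);
    return product < static_cast<double>(b);
}

// SSE-like: the product is rounded to single precision before the comparison.
// The product of two floats is exact in double, so rounding it to float
// gives the same value a single precision multiply would produce
// (for values like these, far from overflow and underflow).
bool less_with_float_intermediate(float a, float c, float b) {
    double exact = static_cast<double>(a) * static_cast<double>(c);
    float product = static_cast<float>(exact);
    return product < b;
}
```

With a = 50.0f, c = 1.3f and b = 65.0f the first function returns true and the second returns false, matching the x86 and x64 results in the question. The reason is that 1.3f is slightly below 1.3, so the double product is slightly below 65, but the nearest float to it is exactly 65.0f.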

You can persuade the x87 unit to perform all calculations to single precision accuracy like this:

_controlfp(_PC_24, _MCW_PC);

When you add that to your 32 bit program you will find that the booleans are both set to false.
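A slightly more defensive way to make that call, as a sketch: the precision control only applies to x87 code, so the helper below (a hypothetical name) only makes the call in MSVC x86 builds and reports whether it did anything.

```cpp
#include <float.h>  // on MSVC this declares _controlfp, _PC_24 and _MCW_PC

// Hypothetical helper: switches the x87 unit to 24 bit (single) precision.
// Returns true if the setting was applied, and false on targets where the
// x87 precision control is not used (for example x64 builds).
bool use_single_precision_x87() {
#if defined(_MSC_VER) && defined(_M_IX86)
    _controlfp(_PC_24, _MCW_PC);
    return true;
#else
    return false;
#endif
}
```

Keep in mind that this changes floating point state for the calling thread, so it also affects every later x87 calculation on that thread, including double arithmetic and library code.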

There is a fundamental difference in the way that the x87 and SSE floating point units work. The x87 unit uses the same instructions for both single and double precision types. Data is loaded into registers in the x87 FPU stack, and those registers always hold values in the 10 byte Intel extended format. You can control the precision using the floating point control word. But the instructions that the compiler writes are ignorant of that state.

On the other hand, the SSE unit uses different instructions for operations on single and double precision. Which means that the compiler can emit code that is in full control of the precision of the calculation.

So, the x87 unit is the bad guy here. You can try to persuade your compiler to emit SSE instructions even for 32 bit targets, for example with the /arch:SSE2 option. And certainly when I compiled your code under VS2013 I found that both 32 and 64 bit targets emitted SSE instructions.
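If you want a quick hint about which unit a given build targets without reading the disassembly, the predefined compiler macros can be checked. The helper below is a sketch with a made-up name; it only reports what the macros suggest, and the generated code remains the definitive answer.

```cpp
// Hypothetical helper: reports, from predefined compiler macros, which unit
// scalar float arithmetic is most likely compiled for. This is a hint only.
const char* likely_float_unit() {
#if defined(_M_X64) || defined(__x86_64__)
    return "SSE (64 bit x86 target)";
#elif defined(_M_IX86_FP) && (_M_IX86_FP >= 1)
    return "SSE (MSVC x86 build with /arch:SSE or higher)";
#elif defined(__SSE_MATH__)
    return "SSE (GCC or Clang x86 build using SSE for float math)";
#elif defined(_M_IX86) || defined(__i386__)
    return "x87";
#else
    return "unknown (not an x86 target)";
#endif
}
```

Printing the returned string at startup of the 32 bit and 64 bit builds is enough to see whether the two are expected to agree.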

OTHER TIPS

Floating point operations are generally inexact, and comparing two floats that are this close (or equal) often does not give the result you expect.
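One common way to make such comparisons robust is to compare with a tolerance. The following is a minimal sketch; the helper names and the tolerance of four float epsilons are assumptions for illustration, and a suitable tolerance depends on the application.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Assumed relative tolerance for this illustration: four float epsilons.
const float kRelTol = 4.0f * std::numeric_limits<float>::epsilon();

// Hypothetical helper: true if x and y differ by no more than the relative
// tolerance, scaled by the larger of the two magnitudes.
bool nearly_equal(float x, float y, float rel_tol = kRelTol) {
    float scale = std::max(std::fabs(x), std::fabs(y));
    return std::fabs(x - y) <= rel_tol * scale;
}

// Hypothetical helper: true only if x is below y by more than the tolerance.
bool definitely_less(float x, float y, float rel_tol = kRelTol) {
    return x < y && !nearly_equal(x, y, rel_tol);
}
```

Replacing (a*c)<b with definitely_less(a*c, b) treats the product as equal to 65, so the result is false regardless of whether the product was computed at single or double precision.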

Floating point numbers are stored and processed differently in 32 bit and 64 bit builds (as the comments also suggest). If I remember correctly, in 32 bit VC builds floats are pushed onto the register stack of the FPU (Floating-Point Unit), which processes them, whereas in 64 bit builds floats can be stored in dedicated SSE registers and are calculated by a different unit of the CPU.

I have no definite source for my answer, but please look at this page or this.

Licensed under: CC-BY-SA with attribution