Environment settings needed for learning "Computer Systems: A Programmer's Perspective"

Question 1

The example in the book is most likely aimed at a particular property of the x87 FPU in Intel CPUs: its registers hold values with (visible) 80-bit precision. A 32-bit or 64-bit float is therefore converted to an 80-bit float when it is loaded into an FPU register. Arithmetic operations are normally carried out in full precision as well, so a value that stays in an FPU register for later use is not rounded to 32 or 64 bits, whereas a value that is copied to memory and loaded back later is. Because of this, it makes a difference whether a value is kept in a register or not.
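To make the register-versus-memory difference concrete, here is a minimal sketch. It assumes that long double is the 80-bit x87 format, which is typical for GCC and Clang on x86 but not guaranteed elsewhere, and uses it as a stand-in for a value sitting in an FPU register. The names changes_when_stored and report are mine, not from the book.

```c
#include <float.h>
#include <stdio.h>

/* Returns 1 if 1/denom, computed in long double, changes when it is
   stored to a 64-bit double in memory, and 0 otherwise. */
int changes_when_stored(int denom)
{
    /* Assumption: long double stands in for the 80-bit FPU register. */
    long double wide = 1.0L / (long double) denom;
    /* volatile forces a real store to memory, which rounds to 64 bits. */
    volatile double stored = (double) wide;
    return (long double) stored != wide;
}

/* Hypothetical helper: prints the result for one denominator. */
void report(int denom)
{
    printf("1/%d %s when stored as a double (long double has %d mantissa bits)\n",
           denom,
           changes_when_stored(denom) ? "changes" : "is unchanged",
           LDBL_MANT_DIG);
}
```

Calling report(10) and report(2) from main shows the contrast: 1/2 is exactly representable, so it never changes, while 1/10 changes whenever long double carries more mantissa bits than double.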

However, Mac OS X (which I suppose you are using on a MacBook) does not use the x87 FPU; it uses the SSE unit. SSE provides 32-bit and 64-bit floating-point registers and operations, so it makes no difference to a value's precision whether it is kept in a register or stored in memory: the result is rounded after each operation. The same normally applies to 64-bit executables on Windows and Linux.

On 32-bit Linux or Windows, for example, the situation is different. Whether the x87 or the SSE unit is used depends on the environment. Often the x87 FPU is used because 32-bit machines might not support the required SSE2 instructions, although the last CPUs without SSE2 were built approximately 10 years ago.
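One way to see which behavior a given build uses is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below only reports what the compiler claims; the helper names are hypothetical. With GCC on x86, -mfpmath=387 and -mfpmath=sse (together with -msse2) select the unit explicitly. I would expect the x87 build to report 2 and the SSE build to report 0, but treat that as an assumption to check on your own toolchain.

```c
#include <float.h>
#include <stdio.h>

/* Maps a FLT_EVAL_METHOD value (C99, <float.h>) to a description. */
const char *eval_method_name(int method)
{
    switch (method) {
    case 0:  return "operations are evaluated in the precision of their own type";
    case 1:  return "float and double operations are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}

/* Hypothetical helper: prints what this particular build reports.
   Possible builds to compare (GCC on x86, not verified here):
     gcc -m32 -mfpmath=387 ...
     gcc -m32 -msse2 -mfpmath=sse ... */
void print_fp_environment(void)
{
    printf("FLT_EVAL_METHOD = %d: %s\n", (int) FLT_EVAL_METHOD,
           eval_method_name((int) FLT_EVAL_METHOD));
    printf("sizeof(long double) = %zu bytes\n", sizeof(long double));
}
```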

Question 2

Not much of an answer, but I researched this a little. I found this fcomp.c, http://csapp.cs.cmu.edu/public/1e/ics/code/data/fcomp.c, which is probably the source of the example in your book, although your version contains only the first test. I tried various gcc versions with -m32 and -m64 and found that test1 (the same as your test) always comes up equal, at least for i386 and x86_64.

However, there is one test (test2) that seems to exhibit architecture-dependent behavior:

void test2(int denom)
{
  double r1;
  int t1;
  r1 = recip(denom);             /* Default: register, Forced store: memory */
  t1 = r1 == 1.0/(double) denom; /* Compares register or memory to register */
  printf("test2 t1: r1 %f %c= 1.0/10.0\n", r1, t1 ? '=' : '!');  
  printf("A long double on this machine requires %zu bytes\n", sizeof(long double)); /* %zu matches size_t */
}

(test2() is called with a denom of 10.)
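The excerpt above does not show recip(), so here is a sketch with an assumed definition, plus a variant of test2 that forces both operands through memory. The name test2_forced_store is mine. Because both values are rounded to 64 bits before the comparison, I would expect them to compare equal whether the x87 or the SSE unit is used.

```c
#include <stdio.h>

/* Assumed definition; the quoted excerpt does not include recip(). */
double recip(int denom)
{
    return 1.0 / (double) denom;
}

/* Like test2, but both operands are stored to memory before comparing,
   so both are rounded to 64 bits. Returns 1 if they compare equal. */
int test2_forced_store(int denom)
{
    volatile double r1 = recip(denom);          /* forced store: memory */
    volatile double r2 = 1.0 / (double) denom;  /* forced store: memory */
    return r1 == r2;
}

/* Hypothetical helper: prints the result in the style of test2. */
void print_test2_forced_store(int denom)
{
    printf("forced store: r1 %c= 1.0/%d\n",
           test2_forced_store(denom) ? '=' : '!', denom);
}
```

GCC also has a -ffloat-store option that aims at a similar effect for variables without editing the source, at some cost in speed.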

When compiling with gcc -m64 -o fcomp fcomp.c I get this output:

test2 t1: r1 0.100000 == 1.0/10.0
A long double on this machine requires 16 bytes

Whereas when compiling with gcc -m32 -o fcomp fcomp.c I get this output:

test2 t1: r1 0.100000 != 1.0/10.0
A long double on this machine requires 12 bytes

For the record, I got these results with both gcc 3.4.6 and 4.1.2.

All of the other tests come up equal, regardless of what compiler/arch I use.
