Question

I am reading "Numerical Recipes in C: The Art of Scientific Computing", and chapter one has a section that discusses how floating-point numbers are represented from a somewhat architecture-agnostic standpoint. This post is about the i386 family of Intel processors and how floating-point numbers are represented on that architecture. My question is specifically about how the biased exponent is computed and how the mantissa is represented: is the leading one stored in the mantissa or not?

In "Numerical Recipes in C The Art of Scientific Computing", I am given the formula:

s × M × B^(e - E), where s is a single bit denoting the sign, M is the mantissa, B is the base (base 2), e is the exponent, and E is the bias on the exponent.

  1. Is e stored in 2's complement, or is it an unsigned 8 bit field?
  2. E is the bias. Is the bias 127?
  3. Is the mantissa read as 1.00000(2) or as .0000000(2), where (2) denotes base 2?

Solution

  1. e is an unsigned 8-bit field. The bias (E) is there to let you represent both positive and negative exponents. This is a slightly saner representation than two's complement for doing actual calculations, even if it's slightly awkward to think about.

  2. What the bias is depends on the floating-point type. For a standard IEEE float, it's 127. For a standard IEEE double, it's 1023.

  3. Not sure what you mean. For the standard float and double types, there's an implied 1 bit before the mantissa for normal numbers and none for subnormal numbers. If you have the IEEE float whose binary representation (sign, exponent, mantissa) is 0 01111111 01110111011101110111011, you can read this as (-1)^0 * 2^(01111111b - 127) * 1.01110111011101110111011b. Note the leading 1. before the mantissa, and that the bias is subtracted from the stored exponent: 01111111b is 127, so the actual exponent here is 0.
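To make the three points above concrete, here is a minimal sketch in C that pulls the fields out of a float and rebuilds the value of a normal number. It assumes float is a 32-bit IEEE single, which holds on x86 but is not guaranteed by the C standard. The names float_fields, split_float and normal_value are made up for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three raw fields of an IEEE 754 single. */
struct float_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored biased, read as unsigned */
    uint32_t mantissa;  /* 23 bits, without the implied leading 1 */
};

/* Copy the float's bits into an integer and slice out the fields. */
static struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields out;

    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);

    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}

/* Rebuild a normal number as (-1)^s * 1.M * 2^(e - 127).
 * Not valid for exponent fields 0 or 255 (subnormals, infinities, NaNs). */
static double normal_value(struct float_fields p)
{
    double significand = 1.0 + (double)p.mantissa / 8388608.0; /* 2^23 */
    double magnitude   = ldexp(significand, (int)p.exponent - 127);
    return p.sign ? -magnitude : magnitude;
}
```

The example bit pattern above is 0x3FBBBBBB when written as a 32-bit integer; passing that float to split_float should give sign 0, exponent 127 and mantissa 0x3BBBBB.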

When the exponent field is as small as possible (all zeros), you have zero (if the mantissa is also zero) or the subnormal numbers. When the exponent field is as large as possible (all ones), you have infinities and NaNs. The mantissa means something different in these cases. All other exponents represent "normal numbers."
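As a rough sketch of how the exponent field selects the kind of number, the following C function classifies a float from its raw bits. It again assumes a 32-bit IEEE single. The enum float_kind and the function classify_float are names invented for this example; in real code the standard fpclassify macro from <math.h> does the same job.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the five kinds of IEEE 754 value. */
enum float_kind { FK_ZERO, FK_SUBNORMAL, FK_NORMAL, FK_INFINITY, FK_NAN };

static enum float_kind classify_float(float f)
{
    uint32_t bits, exponent, mantissa;

    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    exponent = (bits >> 23) & 0xFFu;
    mantissa = bits & 0x7FFFFFu;

    if (exponent == 0)                 /* smallest exponent field */
        return mantissa == 0 ? FK_ZERO : FK_SUBNORMAL;
    if (exponent == 0xFFu)             /* largest exponent field */
        return mantissa == 0 ? FK_INFINITY : FK_NAN;
    return FK_NORMAL;                  /* implied leading 1 applies */
}
```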

For Intel's 80-bit long double type, there is no implied 1 bit: the leading bit is stored explicitly in the high bit of the mantissa. I can't recall what happens when you try doing arithmetic with long doubles that have a normal exponent but that leading bit switched off. I think they did this to make the x87 easier to build.
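For comparison, here is a sketch that decodes the 80-bit x87 format from its ten raw bytes in little-endian order, as they sit in memory on x86. The layout assumed is a 64-bit significand whose top bit is the explicit leading bit, followed by a 15-bit exponent (bias 16383) and a sign bit. The names x87_fields and split_x87 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper type: fields of an x87 80-bit extended value. */
struct x87_fields {
    unsigned sign;         /* 1 bit */
    unsigned exponent;     /* 15 bits, biased by 16383 */
    unsigned integer_bit;  /* the explicitly stored leading bit */
    uint64_t fraction;     /* the remaining 63 significand bits */
};

/* raw[0..7] hold the significand, raw[8..9] the exponent and sign. */
static struct x87_fields split_x87(const unsigned char raw[10])
{
    struct x87_fields out;
    uint64_t significand = 0;
    unsigned sign_exp;
    int i;

    for (i = 7; i >= 0; i--)           /* assemble little-endian bytes */
        significand = (significand << 8) | raw[i];
    sign_exp = (unsigned)raw[8] | ((unsigned)raw[9] << 8);

    out.sign        = sign_exp >> 15;
    out.exponent    = sign_exp & 0x7FFFu;
    out.integer_bit = (unsigned)(significand >> 63);
    out.fraction    = significand & UINT64_C(0x7FFFFFFFFFFFFFFF);
    return out;
}
```

With a compiler whose long double really is this format (not all are, so treat that as an assumption), you could memcpy the first 10 bytes of a long double into a buffer and pass it in. For 1.0 the result should be exponent 16383 with the integer bit set and a zero fraction.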

OTHER TIPS

Yes, the whole Intel x86 family, including the 64-bit processors, supports IEEE 754, the standard for floating-point arithmetic.

From the source:

http://www.intel.com/standards/floatingpoint.pdf

For how IEEE 754 works, see:

http://en.wikipedia.org/wiki/IEEE_754-2008

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow