Question

I'm trying to wrap my head around the floating-point representation of binary numbers, but no matter where I looked I couldn't find a good answer to this question.

Why is the exponent biased?

What's wrong with the good old reliable two's complement method?

I tried looking at the Wikipedia article on the topic, but all it says is: "the usual representation for signed values, would make comparison harder."


Solution

The IEEE 754 encodings have a convenient property: an order comparison between two positive non-NaN numbers can be performed simply by comparing the corresponding bit strings lexicographically, or equivalently, by interpreting those bit strings as unsigned integers and comparing those integers. This works across the entire floating-point range from +0.0 to +Infinity (and it is then a simple matter to extend the comparison to take the sign into account). For example, in IEEE 754 binary64 format, 1.1 is encoded as the bit string (msb first)

0011111111110001100110011001100110011001100110011001100110011010

while 0.01 is encoded as the bit string

0011111110000100011110101110000101000111101011100001010001111011

which occurs lexicographically before the bit string for 1.1.

For this to work, numbers with smaller exponents need to compare before numbers with larger exponents. A biased exponent makes that work, while an exponent represented in two's complement would make the comparison more involved. I believe this is what the Wikipedia comment applies to.
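As a quick sanity check, here is a small Python sketch (my own illustration, not part of the original answer) that reinterprets the binary64 encodings as unsigned integers and compares them:

```python
import struct

def bits(x: float) -> int:
    """Reinterpret the IEEE 754 binary64 encoding of x as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

a, b = 0.01, 1.1
print(f"{bits(a):064b}")          # the bit string shown above for 0.01
print(f"{bits(b):064b}")          # the bit string shown above for 1.1

# For positive, non-NaN doubles the integer comparison agrees with the float comparison.
print(bits(a) < bits(b), a < b)   # True True
```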

Another observation is that with the chosen encoding, the floating-point number +0.0 is encoded as a bit string consisting entirely of zeros.
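As for extending the comparison to take the sign into account, one common trick (an assumption on my part; the answer above doesn't spell it out) is to flip every bit of negative encodings and only the sign bit of non-negative ones, after which a plain unsigned comparison gives numeric order:

```python
import struct

MASK = (1 << 64) - 1

def total_order_key(x: float) -> int:
    """Map a non-NaN double to an integer whose unsigned order matches numeric order.
    Note: this deliberately orders -0.0 just below +0.0, unlike IEEE equality."""
    k = struct.unpack(">Q", struct.pack(">d", x))[0]
    if k >> 63:              # negative: flip every bit so more-negative values sort lower
        return (~k) & MASK
    return k | (1 << 63)     # non-negative: set the sign bit so it sorts above all negatives

values = [1.1, -2.5, 0.01, -0.0, float("inf")]
print(sorted(values, key=total_order_key))   # [-2.5, -0.0, 0.01, 1.1, inf]
```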

OTHER TIPS

I do not recall the specifics, but there was some desire for the largest exponent to be slightly farther from zero than the smallest normal exponent. This increases the number of values x for which both x and its reciprocal are approximately representable. For example, with IEEE-754 64-bit binary floating-point, the normal exponent range is -1022 to 1023. This makes the largest finite representable value just under 2^1024, so the interval for which x and its reciprocal are both approximately representable is almost 2^-1024 to almost 2^1024. (Numbers at the very low end of this interval are subnormal, so some precision is lost, but they are still representable.)

With a two’s complement representation, the exponent values would range from -1024 to 1023, and two of them would still have to be reserved to handle zeros, subnormals, infinities, and NaNs. That leaves a range of -1023 to 1022. With this, the interval for which both x and its reciprocal are approximately representable is only almost 2^-1023 to almost 2^1023. Thus, the biased arrangement provides a greater useful range of values.
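To see these numbers concretely, here is a small check of my own (using Python's standard sys.float_info, not code from the answer) that the largest finite binary64 value sits just under 2^1024 and that its reciprocal is still representable, albeit as a subnormal:

```python
import sys

big = sys.float_info.max                  # largest finite binary64 value
print(big == (2 - 2**-52) * 2.0**1023)    # True: just under 2**1024

recip = 1.0 / big                         # a little over 2**-1024
print(recip > 0.0)                        # True: the reciprocal is representable...
print(recip < sys.float_info.min)         # True: ...but subnormal (below the smallest normal)
```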

(Image: an illustration of the offset-binary encoding; see the link below.)

I believe this picture will help you understand what Mark Dickinson said about "simply comparing the corresponding bit strings lexicographically, or equivalently, by interpreting those bit strings as unsigned integers and comparing those integers."

https://en.wikipedia.org/wiki/Offset_binary

Licensed under: CC-BY-SA with attribution