Question

I'm trying to wrap my head around the floating-point representation of binary numbers, but no matter where I looked I couldn't find a good answer to this question.

Why is the exponent biased?

What's wrong with the good old reliable two's complement method?

I tried looking at the Wikipedia article on the topic, but all it says is: "the usual representation for signed values, would make comparison harder."


Solution

The IEEE 754 encodings have a convenient property: an order comparison between two positive non-NaN numbers can be performed simply by comparing the corresponding bit strings lexicographically, or equivalently, by interpreting those bit strings as unsigned integers and comparing those integers. This works across the entire floating-point range from +0.0 to +Infinity (and it is then a simple matter to extend the comparison to take the sign into account). For example, in IEEE 754 binary64 format, 1.1 is encoded as the bit string (msb first)

0011111111110001100110011001100110011001100110011001100110011010

while 0.01 is encoded as the bit string

0011111110000100011110101110000101000111101011100001010001111011

which occurs lexicographically before the bit string for 1.1.

For this to work, numbers with smaller exponents need to compare before numbers with larger exponents. A biased exponent makes that work, while an exponent represented in two's complement would make the comparison more involved. I believe this is what the Wikipedia comment applies to.
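As a quick sanity check, here is a small Python sketch (my own illustration, not part of the original answer) that reinterprets the binary64 encodings as unsigned integers and compares them:

```python
import struct

def bits(x: float) -> int:
    """Reinterpret the IEEE 754 binary64 encoding of x as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

a, b = 0.01, 1.1
print(f"{bits(a):064b}")          # the bit string shown above for 0.01
print(f"{bits(b):064b}")          # the bit string shown above for 1.1

# For positive, non-NaN doubles the integer comparison agrees with the float comparison.
print(bits(a) < bits(b), a < b)   # True True
```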

Another observation is that with the chosen encoding, the floating-point number +0.0 is encoded as a bit string consisting entirely of zeros.
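As for extending the comparison to take the sign into account, one common trick (an assumption on my part; the answer above doesn't spell it out) is to flip every bit of negative encodings and only the sign bit of non-negative ones, after which a plain unsigned comparison gives numeric order:

```python
import struct

MASK = (1 << 64) - 1

def total_order_key(x: float) -> int:
    """Map a non-NaN double to an integer whose unsigned order matches numeric order.
    Note: this deliberately orders -0.0 just below +0.0, unlike IEEE equality."""
    k = struct.unpack(">Q", struct.pack(">d", x))[0]
    if k >> 63:              # negative: flip every bit so more-negative values sort lower
        return (~k) & MASK
    return k | (1 << 63)     # non-negative: set the sign bit so it sorts above all negatives

values = [1.1, -2.5, 0.01, -0.0, float("inf")]
print(sorted(values, key=total_order_key))   # [-2.5, -0.0, 0.01, 1.1, inf]
```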

OTHER TIPS

I do not recall the specifics, but there was some desire for the largest exponent to be slightly farther from zero than the smallest normal exponent. This increases the number of values x for which both x and its reciprocal are approximately representable. For example, with IEEE-754 64-bit binary floating-point, the normal exponent range is -1022 to 1023. This makes the largest finite representable value just under 2^1024, so the interval for which x and its reciprocal are both approximately representable is almost 2^-1024 to almost 2^1024. (Numbers at the very low end of this interval are subnormal, so some precision is lost, but they are still representable.)

With a two’s complement representation, the exponent values would range from -1024 to 1023, and two of them would still have to be reserved to handle zeros, subnormals, infinities, and NaNs. That leaves a range of -1023 to 1022. With this, the interval for which both x and its reciprocal are approximately representable is only almost 2^-1023 to almost 2^1023. Thus, the biased arrangement provides a greater useful range of values.
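To see these numbers concretely, here is a small check of my own (using Python's standard sys.float_info, not code from the answer) that the largest finite binary64 value sits just under 2^1024 and that its reciprocal is still representable, albeit as a subnormal:

```python
import sys

big = sys.float_info.max                  # largest finite binary64 value
print(big == (2 - 2**-52) * 2.0**1023)    # True: just under 2**1024

recip = 1.0 / big                         # a little over 2**-1024
print(recip > 0.0)                        # True: the reciprocal is representable...
print(recip < sys.float_info.min)         # True: ...but subnormal (below the smallest normal)
```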

(Image: an illustration of the offset-binary encoding; see the link below.)

I believe this picture will help you understand what Mark Dickinson said about "simply comparing the corresponding bit strings lexicographically, or equivalently, by interpreting those bit strings as unsigned integers and comparing those integers."

https://en.wikipedia.org/wiki/Offset_binary

Licensed under: CC-BY-SA with attribution