Question

Today in class my Computing teacher was explaining to us (or attempting to explain) how to write a negative binary number using Two's Complement. My question is this:

How does the end user determine the difference between 11101100 being 236 and -20? I know you can always check the most significant bit, but is this always 100% accurate? Is it a convention of negative binary numbers to have the most significant bit indicate the sign?

Side note:
Why do we learn binary subtraction when we can just do:

Convert binary to denary -> subtract denary -> reconvert into binary


Solution

Question 1:

"Is it a convention of negative binary numbers to have the most significant bit indicate the sign?"

There are multiple ways to represent negative numbers in binary. The most common is the two's-complement representation that you're learning about. In that system, yes, the highest-order bit will indicate the sign of the number (if 0 is grouped with the positive numbers).

In standard unsigned binary, numbers are represented by sequences of bits in positional notation (for brevity I'll only be using three bits):

b₂b₁b₀ = 2²b₂ + 2¹b₁ + 2⁰b₀ = 4b₂ + 2b₁ + b₀

111₂ = 7₁₀
110₂ = 6₁₀
101₂ = 5₁₀
100₂ = 4₁₀
011₂ = 3₁₀
010₂ = 2₁₀
001₂ = 1₁₀
000₂ = 0₁₀

Two's-complement
There are several ways to look at 2s-complement, and I think the answer is apparent in all of them. One way to get this system is to take the top half of the unsigned numbers (which all have the high bit set) and move them below zero:

011₂ = 3₁₀
010₂ = 2₁₀
001₂ = 1₁₀
000₂ = 0₁₀
111₂ = -1₁₀
110₂ = -2₁₀
101₂ = -3₁₀
100₂ = -4₁₀

You can clearly see that the high bit indicates the sign. Again, 0 takes up one of the 4 non-negative representations, which is why the range is not symmetric: [-4, 3] (though sometimes the most negative value is considered special, making the usable range symmetric). Equivalently, we can reinterpret the highest-order bit as carrying a negative weight:

b₂b₁b₀ = -(2²)b₂ + 2¹b₁ + 2⁰b₀ = -4b₂ + 2b₁ + b₀

Clearly, since the high bit has a bigger weight (in the sense of absolute value) than all the other bits combined, if it's set, the result is negative. If it's not set, all the remaining weights are positive and hence so is the result.

From this definition, we can derive the third interpretation: the commonly known rule that -a = ~a + 1 (where - means arithmetic negation, ~ means bitwise complementation, and we ignore overflow):

a + ~a = (-4b₂ + 2b₁ + b₀) + (-4(~b₂) + 2(~b₁) + ~b₀)
a + ~a = -4(b₂ + ~b₂) + 2(b₁ + ~b₁) + (b₀ + ~b₀)
a + ~a = -4(1) + 2(1) + (1)
a + ~a = -1
a = -(~a + 1)
-a = ~a + 1

Here, we see that negation flips the high bit, so it indicates the sign of the number. Note that this is not strictly true, as the addition of one can flip the high bit back if all the other bits are set. However, this is only the case for 0 and the most negative number (-4₁₀, or 100₂, in this case), both of which remain the same when negated.
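
If you want to check this rule mechanically, here is a minimal C sketch (my addition, using 8-bit values rather than the 3-bit ones above) that compares arithmetic negation against complement-and-add-one for every possible byte:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    for (int v = -128; v <= 127; v++) {
        int8_t a = (int8_t)v;
        uint8_t neg_arith = (uint8_t)(-a);      /* arithmetic negation, reduced mod 256     */
        uint8_t neg_rule  = (uint8_t)(~a + 1);  /* complement then add one, reduced mod 256 */
        if (neg_arith != neg_rule)
            printf("mismatch at %d\n", v);      /* never triggers */
    }
    puts("-a == ~a + 1 holds for every 8-bit value");
    return 0;
}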

The benefit of using 2s-complement is that the same hardware can be used to do signed and unsigned addition. This nice property does not hold for the other negative binary representations that have been used in the past, some of which I'll touch on briefly. Due to this fact, modern CPUs almost always use this representation for integer arithmetic (I'm not aware of any recent commercial counterexamples, but they may be out there). This is why you are learning about it (as opposed to Convert binary to denary -> subtract denary -> reconvert into binary): to understand how operations work at the gate level of an ALU.
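
To make that concrete, here is a small C sketch (my example, not part of the original answer) that performs one 8-bit addition on the raw bit patterns and then reads the result both ways, reusing the byte from the question:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t a = 0xEC;                 /* 236 unsigned, -20 signed         */
    uint8_t b = 0x05;                 /*   5 under either interpretation  */

    uint8_t sum = (uint8_t)(a + b);   /* one addition on the bit patterns */

    printf("bits:     0x%02X\n", (unsigned)sum);  /* 0xF1          */
    printf("unsigned: %u\n", (unsigned)sum);      /* 241 = 236 + 5 */
    printf("signed:   %d\n", (int8_t)sum);        /* -15 = -20 + 5 */
    return 0;
}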

One's-complement
1s-complement is closely related to 2s-complement. Negation is performed by inverting the bits only (no addition of 1). The leading bit still indicates sign, but there are different representations for positive and negative zero. I've never personally come across a real-world use of 1s-complement, but it's of historical interest.

b₂b₁b₀ = -3b₂ + 2b₁ + b₀

011₂ = 3₁₀
010₂ = 2₁₀
001₂ = 1₁₀
000₂ = 0₁₀
111₂ = -0₁₀
110₂ = -1₁₀
101₂ = -2₁₀
100₂ = -3₁₀
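
A quick C sketch of this table (the helper function is just my illustration): the high bit is given weight -(2² - 1) = -3, and the negative zero necessarily prints as plain 0:

#include <stdio.h>

/* Decode a 3-bit one's-complement pattern (illustrative helper, not a library routine). */
static int ones_complement_3bit(unsigned bits) {
    int b2 = (bits >> 2) & 1, b1 = (bits >> 1) & 1, b0 = bits & 1;
    return -3 * b2 + 2 * b1 + b0;   /* high bit weighs -(2^2 - 1) = -3 */
}

int main(void) {
    for (unsigned bits = 0; bits < 8; bits++)
        printf("%u%u%u -> %d\n",
               (bits >> 2) & 1, (bits >> 1) & 1, bits & 1,
               ones_complement_3bit(bits));
    return 0;
}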

Sign-and-magnitude
Sign-and-magnitude is closest to how humans normally write negative numbers. The low two bits have the same weight as in the systems above and the high bit has no (additive) weight. Instead, it only changes the sign of the result. Here, obviously, the leading bit indicates sign. Like 1s-complement, there are two representations of 0. It is still used today in the mantissa of IEEE floating point numbers (although the exponent sits between the sign and the magnitude).

b₂b₁b₀ = (-1)^b₂ · (2b₁ + b₀)

0 11₂ = +3₁₀
0 10₂ = +2₁₀
0 01₂ = +1₁₀
0 00₂ = +0₁₀
1 00₂ = -0₁₀
1 01₂ = -1₁₀
1 10₂ = -2₁₀
1 11₂ = -3₁₀
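
The corresponding C sketch (again, the helper is mine): the low bits give the magnitude and the high bit only flips the sign, so 000 and 100 both decode to 0:

#include <stdio.h>

/* Decode a 3-bit sign-and-magnitude pattern (illustrative helper). */
static int sign_magnitude_3bit(unsigned bits) {
    int sign      = (bits >> 2) & 1;  /* b2: 0 = positive, 1 = negative */
    int magnitude = (int)(bits & 3);  /* 2*b1 + b0                      */
    return sign ? -magnitude : magnitude;
}

int main(void) {
    for (unsigned bits = 0; bits < 8; bits++)
        printf("%u %u%u -> %d\n",
               (bits >> 2) & 1, (bits >> 1) & 1, bits & 1,
               sign_magnitude_3bit(bits));
    return 0;
}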

Excess-n
Excess-n is really more like a family of systems. All values are shifted up by n (known as the bias) and then represented as in the unsigned case. The leading bit may indicate the sign if the right bias is chosen, though with different polarity than the above systems (and 0 could be grouped with either the negatives or the positives). This is still used in the exponent of IEEE floating point numbers. For n = 3, the high bit does indicate sign and 0 is grouped with the negative numbers:

b₂b₁b₀ = 4b₂ + 2b₁ + b₀ - n

111₂ = 4₁₀
110₂ = 3₁₀
101₂ = 2₁₀
100₂ = 1₁₀
011₂ = 0₁₀
010₂ = -1₁₀
001₂ = -2₁₀
000₂ = -3₁₀
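
And a sketch for excess-n (the helper name is mine): decode by reading the bits as an unsigned number and subtracting the bias. The exponent of an IEEE single-precision float works the same way with a bias of 127.

#include <stdio.h>

/* Decode a 3-bit excess-n pattern: unsigned value minus the bias (illustrative helper). */
static int excess_n_3bit(unsigned bits, int n) {
    return (int)(bits & 7u) - n;
}

int main(void) {
    for (unsigned bits = 0; bits < 8; bits++)
        printf("%u%u%u -> %d\n",
               (bits >> 2) & 1, (bits >> 1) & 1, bits & 1,
               excess_n_3bit(bits, 3));   /* bias n = 3, as in the table above */
    return 0;
}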

Others
There are yet other, more esoteric, integer representations such as balanced-ternary, base-negative-2, or (arguably) binary coded decimal (or BCD for short). The reason I say BCD is arguable is that modern processors often still have support for it (though it's not how numbers are represented internally) and many calculators used to be based on it. In these systems, the leading bit (or trit, or base-n digit) may or may not indicate the sign (or may indicate it in some cases but not others).

Question 2:

"How does the end user determine the difference between 11101100 being 236 and -20?"

In general there's no way to tell whether a number stored in a register or memory is actually meant to be 2s-complement or unsigned, as pointed out by others. You essentially have to track what's done with it to determine that.

However, if the number is an immediate value stored directly in a machine code instruction, the opcode could indicate whether or not it's signed (depending on the architecture). This may change, for instance, how overflows are handled, or whether or not sign-extension is performed.

For example, there may be separate "load immediate" and "load signed immediate" instructions that copy an immediate value into a larger register, the second doing sign extension and the first not. "Branch" instructions often have a signed immediate to indicate the size of the jump (so that both forward and backward branches can use a single instruction). There may be different "add immediate" and "add unsigned immediate" instructions that set overflow flags appropriately for the type of addition.

Sign extension
Sign extension means copying the high bit into the new upper bits when a value is widened, which preserves the value of a 2s-complement number. It will produce incorrect results for half the unsigned numbers (those with the high bit set).

Sign extension not performed:

100₂ = 00000100₂
Unsigned: 4₁₀ = 4₁₀
Signed: -4₁₀ = 4₁₀

Sign extension performed:

100₂ = 11111100₂
Signed: -4₁₀ = -4₁₀
Unsigned: 4₁₀ = 252₁₀

001₂ = 00000001₂
Signed and unsigned: 1₁₀ = 1₁₀
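
In C, which behavior you get is decided by the signedness of the narrow type you widen from. A minimal sketch using the byte from the question (my example; the cast of 0xEC to int8_t is implementation-defined in C but wraps to -20 on ordinary two's-complement machines):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t raw = 0xEC;                     /* 11101100: 236 unsigned, -20 signed */

    uint32_t zero_extended = raw;           /* unsigned source: pad with 0 bits   */
    int32_t  sign_extended = (int8_t)raw;   /* signed source: copy the high bit   */

    printf("zero-extended: 0x%08X = %u\n",
           (unsigned)zero_extended, (unsigned)zero_extended);      /* 0x000000EC = 236 */
    printf("sign-extended: 0x%08X = %d\n",
           (unsigned)(uint32_t)sign_extended, (int)sign_extended); /* 0xFFFFFFEC = -20 */
    return 0;
}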

Overflow
Adding or subtracting two numbers may give a result that is too big (in an absolute sense) to be correctly represented. Addition of the same two binary sequences may cause overflow for signed numbers but not unsigned (or vice versa).

Signed overflows but unsigned does not:

011₂ + 011₂ = 110₂
Signed: 3₁₀ + 3₁₀ = -2₁₀
Unsigned: 3₁₀ + 3₁₀ = 6₁₀

Unsigned overflows but signed does not:

111₂ + 010₂ = 001₂
Unsigned: 7₁₀ + 2₁₀ = 1₁₀
Signed: -1₁₀ + 2₁₀ = 1₁₀
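
The same two situations, scaled to 8 bits in C (my sketch; the casts back to 8-bit types model the hardware discarding the carry, and reading a high-bit-set byte as int8_t wraps as shown on ordinary two's-complement machines):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Signed overflows but unsigned does not: 0x60 + 0x60 */
    uint8_t sum1 = (uint8_t)(0x60 + 0x60);
    printf("unsigned: 96 + 96 = %u\n", (unsigned)sum1);   /* 192, correct    */
    printf("signed:   96 + 96 = %d\n", (int8_t)sum1);     /* -64, overflowed */

    /* Unsigned overflows but signed does not: 0xFF + 0x02 */
    uint8_t sum2 = (uint8_t)(0xFF + 0x02);
    printf("unsigned: 255 + 2 = %u\n", (unsigned)sum2);   /* 1, wrapped      */
    printf("signed:    -1 + 2 = %d\n", (int8_t)sum2);     /* 1, correct      */
    return 0;
}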

OTHER TIPS

  1. In two's complement notation, the highest bit always indicates the sign. However, you must know the field width, and you must also know if two's complement notation is being used. For example, if 11101100 is a 32-bit number, then the most significant bit is 0, so it is +236. If it is an unsigned 8-bit integer, then it is +236 because unsigned numbers do not use two's complement notation (only signed numbers do).

  2. Adding and subtracting in binary is how the computer does it. So it is useful to know in order to understand how the computer works.

How does the end user determine the difference between 11101100 being 236 and -20?

The user cannot determine it from the bit pattern alone. Some context must tell whether this byte is signed or unsigned. In most programming languages, that context is carried by the type. Thus in C or C++ you have signed char and unsigned char. (Plain old char could be either one.)
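
For example (a minimal C sketch of my own; reading 0xEC into a signed char relies on the usual wrap-around conversion of two's-complement machines):

#include <stdio.h>

int main(void) {
    unsigned char u = 0xEC;               /* the bit pattern 11101100            */
    signed char   s = (signed char)0xEC;  /* same bits; wraps to -20 on ordinary
                                             two's-complement machines           */

    printf("as unsigned char: %u\n", (unsigned)u);   /* 236 */
    printf("as signed char:   %d\n", (int)s);        /* -20 */
    return 0;
}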

The reason that binary subtraction works is that it (like some other operations) produces exactly the same bit pattern even if you got the type "wrong". One way to think of this is that (for these operations) you are doing arithmetic modulo 256, and in that modulus 236 and -20 are actually two names for the same number.
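
A small C check of that claim (my example): subtracting 20 and adding 236 land on the same byte value for every possible input, because 236 and -20 are congruent modulo 256.

#include <stdint.h>
#include <stdio.h>

int main(void) {
    for (int x = 0; x < 256; x++) {
        uint8_t minus_20 = (uint8_t)(x - 20);    /* conversion to uint8_t reduces mod 256 */
        uint8_t plus_236 = (uint8_t)(x + 236);
        if (minus_20 != plus_236) {
            printf("differ at %d\n", x);         /* never happens */
            return 1;
        }
    }
    puts("x - 20 and x + 236 agree for every byte x");
    return 0;
}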

The short answer is that it depends on how you are using it. Almost all modern compilers represent their integer values as two's complement. It's a convention. If you are writing your code in assembly then you have to pay closer attention to what is in memory or in a register, but in higher-level languages the value's data type tells you. If the type is signed then the MSB is the sign bit; otherwise it is not. The data type also tells you how many bits are in the value, so you will be able to identify which bit is the MSB. For example, an int8_t is always 8 bits and is signed, whereas a uint8_t is always 8 bits but is unsigned. As long as you know how to identify the data type, you know exactly how to interpret the value in memory when you see it represented in binary.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow