Question

Why does UTF-16 have a reserved range in the UCS database?

UTF-16 is just a way to represent a character's scalar value using one or two unsigned 16-bit code units. The layout of these code units shouldn't need to be related to the scalar value itself, because an algorithm is applied to recover the actual scalar value from the representation anyway.

Let's assume that the ranges D800-DBFF and DC00-DFFF were not reserved in the UCS database, and that there were an alternative UTF-16-like representation: every character in the range 0-7FFF fits in a single unsigned 16-bit unit, and when the high-order bit is set, another 16-bit unit follows carrying the remaining bits. For the byte order mark we would reserve the two possible values, and that's it.
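Roughly, the scheme I have in mind would look like this (a quick sketch in Python; the function names are my own, and I'm ignoring the BOM values for brevity):

```python
def encode(cp):
    # Scalar values 0x0000-0x7FFF fit in a single unit; anything larger sets
    # the high bit of the first unit and spills the remaining bits into a
    # second unit (15 + 16 = 31 payload bits in total).
    if cp <= 0x7FFF:
        return [cp]
    return [0x8000 | (cp >> 16), cp & 0xFFFF]

def decode(units):
    it = iter(units)
    for u in it:
        if u & 0x8000:                      # high bit set: a second unit follows
            yield ((u & 0x7FFF) << 16) | next(it)
        else:                               # standalone unit
            yield u
```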

If I'm wrong, could you explain why?

Thanks

Solution

One problem is that your proposed scheme is less efficient than the current surrogate-pair scheme.

Currently, only the 2048 code units 0xD800-0xDFFF are "out of bounds" as ordinary characters, leaving 63488 code units that map directly to single code points. Under your proposal, the 32768 code units 0x8000-0xFFFF are reserved for multi-code-unit sequences, leaving only the other 32768 for single-code-unit code points.
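To make the arithmetic concrete (just a back-of-the-envelope check):

```python
# Code units that can stand alone as a complete code point, per scheme:
utf16_single    = 0x10000 - 0x0800   # 65536 total - 2048 surrogates = 63488
proposed_single = 0x8000             # only 0x0000-0x7FFF           = 32768
```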

I don't know how many code points are currently assigned in the Basic Multilingual Plane, but I wouldn't be surprised if it were more than 32768, and of course the number can grow. As soon as it exceeds 32768, more characters would require two code units under your proposal than under UTF-16 as it stands.

Now I agree that none of this requires UCS to include a reserved range (and it's an ugly mix of meanings, in some ways) - but doing so makes it simple (in code) to map UTF-16 to UCS, while still maintaining a pretty efficient solution.
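For illustration, the mapping is only a shift, an OR and an offset; this is a minimal sketch assuming well-formed input (the constants are the standard UTF-16 ones):

```python
def utf16_to_scalar(units):
    it = iter(units)
    for u in it:
        if 0xD800 <= u <= 0xDBFF:            # lead surrogate: combine with trail
            trail = next(it)                 # assumes well-formed input
            yield 0x10000 + (((u - 0xD800) << 10) | (trail - 0xDC00))
        else:                                # any other BMP unit maps to itself
            yield u

# e.g. list(utf16_to_scalar([0xD801, 0xDC00])) == [0x10400]
```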

There are very few downsides to this - there's plenty of space in the UCS, so reserving this small block doesn't significantly reduce the room for future expansion.

Supposition

This bit is an informed guess. You could do the research to find out which characters were used in which versions of Unicode, but I believe it's at least a plausible explanation.

The true reason for this particular block being used is probably historical - for a long time Unicode really was just 16-bit, for everything... and characters were already assigned in the upper ranges (the parts your scheme deems off-limits). By taking a block of 2048 values which weren't previously assigned, all previous valid UCS-2 sequences were preserved as valid UTF-16 sequences with the same meaning, while extending the UCS range beyond the BMP. It's possible that some aspects might be easier if the range had been 0xF800-0xFFFF, but it was too late by then.

OTHER TIPS

Code points D800-DFFF are reserved because they cannot be represented as themselves in the current UTF-16 encoding scheme. Since they fall within the 0000-FFFF range, they would otherwise be encoded as-is using one UTF-16 code unit. If that were allowed, a processor decoding/seeking forwards through a UTF-16 sequence that encounters a code unit in the D800-DBFF range would have to decide whether it represents a standalone code point or the start of a surrogate pair, and the only way to decide would be to look at the next code unit to see whether it is in the DC00-DFFF range. Similarly, when decoding/seeking backwards through a sequence, on encountering a code unit in the DC00-DFFF range it would have to look at the preceding code unit to see whether it is in the D800-DBFF range. That makes decoding/seeking harder and more error prone.
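To illustrate (a small sketch, with function names of my own), the reservation means every code unit classifies itself in isolation, so a decoder can resynchronize from any position:

```python
def unit_kind(u):
    # No neighbour needs to be examined, because 0xD800-0xDFFF never
    # appear in the stream as ordinary characters.
    if 0xD800 <= u <= 0xDBFF:
        return "lead surrogate"      # always the first half of a pair
    if 0xDC00 <= u <= 0xDFFF:
        return "trail surrogate"     # always the second half of a pair
    return "standalone code point"

def codepoint_start(units, i):
    # Seeking backwards: step back at most one unit to reach a boundary.
    return i - 1 if 0xDC00 <= units[i] <= 0xDFFF else i
```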

Un-reserving code points D800-DFFF for actual character use would require a logic change to the UTF-16 encoding scheme to escape those specific code points in a different manner that does not cause ambiguity. Under the current encoding scheme, such a change is not possible, AFAIK, so they remain permanently reserved.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow