There's nothing logically wrong with boxing a value of a reference type. It is just a no-op; nothing changes.
But Ecma-335 isn't always an accurate description of what is really implemented in the .NET CLR. The JIT_Box() helper function that implements OpCodes.Box will actually throw an InvalidCastException when it is asked to box a value that's not a value type. It expects the compiler and the jitter to know when the boxing conversion is unnecessary and to suppress it. They do.
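
You can observe the no-op from C# through generics. This is a minimal sketch, assuming the usual C# compiler behavior of emitting `box !!T` when an unconstrained `T` is converted to `object`. `BoxingDemo` and `ToObject` are made-up names for illustration:

```csharp
using System;

static class BoxingDemo
{
    // Assumption: for an unconstrained T the compiler emits "box !!T" here,
    // whether T ends up being a value type or a reference type.
    public static object ToObject<T>(T value) => value;

    static void Main()
    {
        string s = "hello";
        object o = ToObject(s);

        // T is a reference type: the box does nothing, we get the same reference back.
        Console.WriteLine(ReferenceEquals(s, o));   // True

        // T is a value type: a real boxed copy is allocated on the heap.
        object b = ToObject(42);
        Console.WriteLine(b.GetType());             // System.Int32
    }
}
```

No InvalidCastException is thrown for the string case, which fits with the boxing being suppressed before it ever reaches the helper.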