Question

Consider the following code (C# syntax):

static void f(byte x)  { Console.WriteLine("byte"); }
static void f(short x) { Console.WriteLine("short"); }
static void f(int x)   { Console.WriteLine("int"); }

static void Main() {
    byte b1 = 1, b2 = 2;
    short s1 = 1, s2 = 2;

    f(b1 + b2); // byte + byte = int
    f(s1 + s2); // short + short = int
}

In C++, C#, D, and Java, both function calls resolve to the "int" overload. I already realize this is "in the specs", but why are languages designed this way? I'm looking for a deeper reason.

To me, it makes sense for the result to be the smallest type able to represent all possible values of both operands, for example:

byte + byte --> byte
sbyte + sbyte --> sbyte
byte + sbyte --> short
short + short --> short
ushort + ushort --> ushort
short + ushort --> int
// etc...

This would eliminate inconvenient code such as short s3 = (short)(s1 + s2), and would, IMO, be far more intuitive and easier to understand.
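
Under the current C# rules, for example, the narrowing cast is unavoidable even though both operands are shorts (a minimal illustration; the variable names are just placeholders):

short s1 = 1, s2 = 2;
short s3 = (short)(s1 + s2); // compile error without the explicit (short) cast: s1 + s2 is an int
int   i3 = s1 + s2;          // fine: the sum is already an int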

Is this a left-over legacy from the days of C, or are there better reasons for the current behavior?

Solution

Quoted from an MSDN blog post:

byte b = 32;
byte c = 240;
int i = b + c; // what is i?

In this fantasy world, the value of i would be 16! Why? Because the two operands to the + operator are both bytes, so the sum "b+c" is computed as a byte, which results in 16 due to integer overflow. (And, as I noted earlier, integer overflow is the new security attack vector.)

Similarly,

int j = -b;

would result in j having the value 224 and not -32, for the same reason.

Is that really what you want?

...

So no matter how you slice it, you're going to have to insert annoying casts. May as well have the language err on the side of safety (forcing you to insert the casts where you know that overflow is not an issue) than to err on the side of silence (where you may not notice the missing casts until your Payroll department asks you why their books don't add up at the end of the month).
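
The arithmetic in the quote can be reproduced in today's C# by simulating byte-sized arithmetic with explicit casts (a sketch only; the casts stand in for what byte-typed operations would do in that hypothetical world):

byte b = 32;
byte c = 240;
int i = b + c;                            // actual rules: i == 272, computed as int
byte wrapped = unchecked((byte)(b + c));  // simulated byte addition: 272 wraps around to 16
byte negated = unchecked((byte)(-b));     // simulated byte negation: -32 wraps around to 224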

Also, it's worth noting that adding these casts means nothing more than extra typing. Once the JIT (or possibly the static compiler itself) reduces the arithmetic operation to a basic processor instruction, there's nothing clever going on: it's just a question of whether the number is treated as an int or a byte.

This is a good question, though, and not at all an obvious one. Hopefully the reasons are clearer now.

OTHER TIPS

A better set of rules, IMHO, if the shift operators could only be used with constant shift amounts (with library functions handling variable shifts), would be for every arithmetic expression to evaluate as though it were computed in the largest available signed or unsigned type, provided either one could be statically guaranteed to give correct results (slightly trickier rules would apply in cases where even the largest signed type might not be sufficient). If shift amounts are restricted to constants, the compiler can determine fairly easily at compile time the largest meaningful value any operand could take, so I see no good reason why compilers shouldn't look at how an operator's result is used when deciding how to implement that operator.
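
A rough sketch of what "evaluate in the widest type, then narrow only when the result is known to fit" could look like if written by hand in today's C# (a hypothetical helper, not an existing language feature):

// Compute in long so the intermediate sum can never overflow,
// then narrow with a checked cast that throws if the result
// does not fit in the destination type.
static short AddToShort(short a, short b)
{
    long wide = (long)a + b;
    return checked((short)wide);
}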

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow