How does one safely static_cast between unsigned int and int?

https://stackoverflow.com/questions/7601731

05-02-2021
|

Question

I have an 8-character string representing a hexadecimal number and I need to convert it to an int. This conversion has to preserve the bit pattern for strings "80000000" and higher, i.e., those numbers should come out negative. Unfortunately, the naive solution:

int hex_str_to_int(const string hexStr)
{    
    stringstream strm;
    strm << hex << hexStr;
    unsigned int val = 0;
    strm >> val;
    return static_cast<int>(val);
}

doesn't work for my compiler if val > MAX_INT (the returned value is 0). Changing the type of val to int also results in a 0 for the larger numbers. I've tried several different solutions from various answers here on SO and haven't been successful yet.

Here's what I do know:

I'm using HP's C++ compiler on OpenVMS (using, I believe, an Itanium processor).
sizeof(int) will be at least 4 on every architecture my code will run on.
Casting from a number > INT_MAX to int is implementation-defined. On my machine, it usually results in a 0 but interestingly casting from long to int results in INT_MAX when the value is too big.

This is surprisingly difficult to do correctly, or at least it has been for me. Does anyone know of a portable solution to this?

Update:

Changing static_cast to reinterpret_cast results in a compiler error. A comment prompted me to try a C-style cast: return (int)val in the code above, and it worked. On this machine. Will that still be safe on other architectures?

Solution

While there are ways to do this using casts and conversions, most rely on undefined behavior that happen to have well-defined behaviors on some machines / with some compilers. Instead of relying on undefined behavior, copy the data:

int signed_val;
std::memcpy (signed_val, val, sizeof(int));
return signed_val;

OTHER TIPS

Quoting the C++03 standard, §4.7/3 (Integral Conversions):

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

Because the result is implementation-defined, by definition it is impossible for there to be a truly portable solution.

You can negate an unsigned twos-complement number by taking the complement and adding one. So let's do that for negatives:

if (val < 0x80000000) // positive values need no conversion
  return val;
if (val == 0x80000000) // Complement-and-addition will overflow, so special case this
  return -0x80000000; // aka INT_MIN
else
  return -(int)(~val + 1);

This assumes that your ints are represented with 32-bit twos-complement representation (or have similar range). It does not rely on any undefined behavior related to signed integer overflow (note that the behavior of unsigned integer overflow is well-defined - although that should not happen here either!).

Note that if your ints are not 32-bit, things get more complex. You may need to use something like ~(~0U >> 1) instead of 0x80000000. Further, if your ints are no twos-complement, you may have overflow issues on certain values (for example, on a ones-complement machine, -0x80000000 cannot be represented in a 32-bit signed integer). However, non-twos-complement machines are very rare today, so this is unlikely to be a problem.

Here's another solution that worked for me:

if (val <= INT_MAX) {
    return static_cast<int>(val);
}
else {
    int ret = static_cast<int>(val & ~INT_MIN);
    return ret | INT_MIN;
}

If I mask off the high bit, I avoid overflow when casting. I can then OR it back safely.

C++20 will have std::bit_cast that copies bits verbatim:

#include <bit>
#include <cassert>
#include <iostream>

int main()
{
    int i = -42;
    auto u = std::bit_cast<unsigned>(i);
    // Prints 4294967254 on two's compliment platforms where int is 32 bits
    std::cout << u << "\n";

    auto roundtripped = std::bit_cast<int>(u);
    assert(roundtripped == i);
    std::cout << roundtripped << "\n"; // Prints -42

    return 0;
}

cppreference shows an example of how one can implement their own bit_cast in terms of memcpy (under Notes).

While OpenVMS is not likely to gain C++20 support anytime soon, I hope this answer helps someone arriving at the same question via internet search.

unsigned int u = ~0U;
int s = *reinterpret_cast<int*>(&u); // -1

Сontrariwise:

int s = -1;
unsigned int u = *reinterpret_cast<unsigned int*>(&s); // all ones

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow