Question

I don't understand how the following C conversion functions work (and why they're written this way); I'm fairly certain that the original author knew what he was doing:

typedef union TValue {
  uint64_t u64;
  double n;
  struct {
    uint32_t lo;    /* Lower 32 bits of number. */
    uint32_t hi;    /* Upper 32 bits of number. */
  } u32;
  [...]
} TValue;


static int32_t num2bit(double n)
{
  TValue o;
  o.n = n + 6755399441055744.0;  /* 2^52 + 2^51 */
  return (int32_t)o.u32.lo;
}

static uint64_t num2u64(double n)
{
#ifdef _MSC_VER
  if (n >= 9223372036854775808.0)  /* They think it's a feature. */
    return (uint64_t)(int64_t)(n - 18446744073709551616.0);
  else
#endif
  return (uint64_t)n;
}
  • Does num2bit really just cast a double to int32_t? Why the addition? Why write it like this?
  • What is the "feature" alluded to in num2u64? (I believe _MSC_VER means it's the code path for Microsoft's C compiler.)

Note that these functions are not always used (it depends on the CPU architecture); this is the little-endian variant (I resolved some preprocessor macros to simplify).

Links to online browseable mirror (the code is from the LuaJIT project): Surrounding Header file (or whole project).

Every hint is appreciated.

Solution

num2bit is designed to implement the Lua BitOp semantics, especially w.r.t. modular arithmetic. The implementation-defined behavior is well under control, since LuaJIT only works on specific CPUs, platforms and compilers anyway. Don't use this code anywhere else.

num2u64 is a workaround for a bug/misfeature of MSVC, which always converts double to uint64_t via int64_t. That doesn't give the desired result for numbers >= 2^63. MS considers this abomination a 'feature'. Duh.

OTHER TIPS

num2bit: Adding 2^52 + 2^51 (i.e. setting bits 51 and 52) forces the result into a fixed exponent (otherwise the addition would overflow the mantissa), so after the addition the lower 32 bits of the double hold the integer value of n. Returning (int32_t)o.u32.lo then gives you back an integer with the same value (modulo 2^32) as the original double, since the exponent is fixed. So, this is a trick to get the integer value of most doubles quickly. Note that the addition rounds fractional values to the nearest integer rather than truncating them, and it has unexpected effects if the input is 2^51 or larger to begin with.

>>> math.frexp(1.0 + 6755399441055744.0)
(0.7500000000000001, 53)
>>> math.frexp(0.0 + 6755399441055744.0)
(0.75, 53)
>>> math.frexp(564563465 + 6755399441055744.0)
(0.7500000626791358, 53)
>>> math.frexp(-564563465 + 6755399441055744.0)
(0.7499999373208642, 53)
>>> math.frexp(1.5 + 6755399441055744.0)
(0.7500000000000002, 53)
>>> math.frexp(1.6 + 6755399441055744.0)
(0.7500000000000002, 53)
>>> math.frexp(1.4 + 6755399441055744.0)
(0.7500000000000001, 53)

EDIT: The reason both bit 51 and bit 52 are set is that if you only set bit 52, negative numbers would cause the exponent to change:

>>> math.frexp(0 + 4503599627370496.0)
(0.5, 53)
>>> math.frexp(-543635634 + 4503599627370496.0)
(0.9999998792886404, 52)

num2u64: No clue. But the first number is 2^63 and the second is 2^64. It's probably there to prevent overflow or a signedness failure when casting a double of 2^63 or more to its integer representation, but I can't tell you more.

num2bit manually converts the in-memory representation of an IEEE-754 double to a 32-bit, two's-complement signed format, rounding to the nearest integer.

Converting through a union like this is a portability hazard: in C++, reading a union member other than the one last written is undefined behaviour, and while C99 and later explicitly permit type punning through a union (see the footnote to §6.5.2.3), the result still depends on the representation and byte order. It would be more proper to do something like

static int32_t num2bit(double n)
{
  int32_t o;
  n += 6755399441055744.0;  /* 2^52 + 2^51 */
  memcpy(&o, &n, sizeof o); /* OK with strict aliasing, but must mind endianness. */
  return o;
}

This function is probably intended as an optimization, but its value as such is dubious: you would need to re-benchmark it on every new microprocessor and ensure it's only used on hardware where it's actually faster.

Note also that a plain C floating-to-integer conversion truncates (rounds toward zero), while the addition trick rounds to nearest. This function is perhaps not intended to handle fractional values at all.


num2u64 is a Windows-specific workaround (note the #ifdef). When converting a double value greater than or equal to 2^63 to an unsigned integer, "something bad" happens (perhaps saturation), so the author subtracts 2^64 to make it a negative number, casts that to a signed, negative integer, then casts the result to an unsigned integer, which will have a value greater than or equal to 2^63.

In any case, you can tell the intent is simply to convert a double to a uint64_t, since that's all it does on non-Windows platforms.

These functions "work" by magic.

This comes from §6.2.6.1p7 of n1570.pdf, which is the C standard draft: "When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values."

Note how the code presented relies on unspecified values by assigning to o.n and then reading o.u32.lo.

This comes from §6.3.1.3p3 of n1570.pdf, which is the C standard draft: "Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised."

Note how the code presented invokes implementation-defined behaviour, as it converts from unsigned to signed 32-bit integer multiple times. Suppose that it were to instead raise an implementation-defined computational exception signal. If the default signal handler were to return, this would also result in undefined behaviour. /* They think it's a feature. */

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow