Question

I am trying to extract the bits from a float without invoking undefined behavior. Here is my first attempt:

unsigned foo(float x)
{
    unsigned* u = (unsigned*)&x;
    return *u;
}

As I understand it, this is not guaranteed to work due to strict aliasing rules, right? Does it work if a take an intermediate step with a character pointer?

unsigned bar(float x)
{
    char* c = (char*)&x;
    unsigned* u = (unsigned*)c;
    return *u;
}

Or do I have to extract the individual bytes myself?

unsigned baz(float x)
{
    unsigned char* c = (unsigned char*)&x;
    return c[0] | c[1] << 8 | c[2] << 16 | c[3] << 24;
}

Of course this has the disadvantage of depending on endianness, but I could live with that.

The union hack is definitely undefined behavior, right?

unsigned uni(float x)
{
    union { float f; unsigned u; };
    f = x;
    return u;
}

Just for completeness, here is a reference version of foo. Also undefined behavior, right?

unsigned ref(float x)
{
    return (unsigned&)x;
}

So, is it possible to extract the bits from a float (assuming both are 32 bits wide, of course)?


EDIT: And here is the memcpy version as proposed by Goz. Since many compilers do not support static_assert yet, I have replaced static_assert with some template metaprogramming:

template <bool, typename T>
struct requirement;

template <typename T>
struct requirement<true, T>
{
    typedef T type;
};

unsigned bits(float x)
{
    requirement<sizeof(unsigned)==sizeof(float), unsigned>::type u;
    memcpy(&u, &x, sizeof u);
    return u;
}
Was it helpful?

Solution

About the only way to truly avoid any issues is to memcpy.

unsigned int FloatToInt( float f )
{
   static_assert( sizeof( float ) == sizeof( unsigned int ), "Sizes must match" );
   unsigned int ret;
   memcpy( &ret, &f, sizeof( float ) );
   return ret;
}

Because you are memcpying a fixed amount the compiler will optimise it out.

That said the union method is VERY widely supported.

OTHER TIPS

The union hack is definitely undefined behavior, right?

Yes and no. According to the standard, it is definitely undefined behavior. But it is such a commonly used trick that GCC and MSVC and as far as I know, every other popular compiler, explicitly guarantees that it is safe and will work as expected.

The following does not violate the aliasing rule, because it has no use of lvalues accessing different types anywhere

template<typename B, typename A>
B noalias_cast(A a) { 
  union N { 
    A a; 
    B b; 
    N(A a):a(a) { }
  };
  return N(a).b;
}

unsigned bar(float x) {
  return noalias_cast<unsigned>(x);
}

If you really want to be agnostic about the size of the float type and just return the raw bits, do something like this:

void float_to_bytes(char *buffer, float f) {
    union {
        float x;
        char b[sizeof(float)];
    };

    x = f;
    memcpy(buffer, b, sizeof(float));
}

Then call it like so:

float a = 12345.6789;
char buffer[sizeof(float)];

float_to_bytes(buffer, a);

This technique will, of course, produce output specific to your machine's byte ordering.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top