Question

I want to reinterpret data of one type as another type in a portable way (C99). I am not talking about casting; I want to reinterpret the underlying data. By portable I mean that it does not break C99 rules, not that the reinterpreted value is the same on all systems.

I know three different ways to reinterpret data, but only two of them are portable:

  1. This is not portable - it breaks the strict aliasing rule.

    /* #1 Type Punning */
    
    float float_value = 3.14;
    int *int_pointer = (int *)&float_value;
    int int_value = *int_pointer;
    
  2. This is platform dependent, because it reads an int value from the union after writing a float into it. But it does not break any C99 rules, so that should work (if sizeof(int) == sizeof(float)).

    /* #2 Union Punning */
    
    union data {
      float float_value;
      int int_value;
    };
    
    union data data_value;
    data_value.float_value = 3.14;
    int int_value = data_value.int_value;
    
  3. Should be fine, as long as sizeof(int) == sizeof(float)

    /* #3 Copying */
    
    float float_value = 3.14;
    int int_value = 0;
    memcpy(&int_value, &float_value, sizeof(int_value));
    

My Questions:

  1. Is this correct?
  2. Do you know other ways to reinterpret data in a portable way?

Solution

Solution 2 is portable - type punning through unions has always been legal in C99, and it was made explicit with TC3, which added the following footnote to section 6.5.2.3:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

Annex J still lists it as unspecified behaviour. This is a known defect that was corrected in C11, which changed

The value of a union member other than the last one stored into [is unspecified]

to

The values of bytes that correspond to union members other than the one last stored into [are unspecified]

It's not that big a deal as the annex is only informative, not normative.

Keep in mind that you can still end up with undefined behaviour, eg

  • by creating a trap representation
  • by violating aliasing rules in case of members with pointer type (which should not be converted via type-punning anyway as there need not be a uniform pointer representation)
  • if the union members have different sizes - only the bytes of the member last used in a store have specified value; in particular, storing values in a smaller member can also invalidate trailing bytes of a larger member
  • if a member contains padding bytes, which always take unspecified values
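
A minimal sketch of the union approach that guards against the size-mismatch caveat above, assuming a 32-bit float. The names float_bits, float_to_bits and bits_to_float are hypothetical. uint32_t is used instead of int because an exact-width unsigned type has no padding bits, so reading it cannot produce a trap representation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union: both members are expected to cover the same bytes. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Returns the object representation of a float as a 32-bit unsigned value. */
uint32_t float_to_bits(float value)
{
    union float_bits pun;
    /* Assumption: float is 32 bits wide; fail loudly if it is not. */
    assert(sizeof pun.f == sizeof pun.u);
    pun.f = value;   /* store through one member ...  */
    return pun.u;    /* ... and read through the other */
}

/* Inverse direction: reinterprets 32 bits as a float. */
float bits_to_float(uint32_t bits)
{
    union float_bits pun;
    assert(sizeof pun.f == sizeof pun.u);
    pun.u = bits;
    return pun.f;
}
```

Going from bits back to float can still yield a value such as a NaN, so the result depends on the platform's floating-point format.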

OTHER TIPS

  1. The union solution is as defined as the memcpy one in C (AFAIK, it is UB in C++), see DR283

  2. It is possible to cast an object pointer to a pointer to char, signed char, or unsigned char, so

    unsigned char *ptr = (unsigned char*)&floatVar;
    

    and then accessing ptr[0] to ptr[sizeof(floatVar)-1] is legal.
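
As a rough illustration of the byte-wise access just described, the sketch below walks an object through an unsigned char pointer and writes its bytes as hex digits in memory order. The helper name bytes_to_hex is hypothetical, and the code assumes 8-bit bytes; the caller must provide room for 2 * size + 1 characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the bytes of any object as lowercase hex,
   in memory order. Assumes CHAR_BIT == 8; out needs 2 * size + 1 chars. */
void bytes_to_hex(const void *object, size_t size, char *out)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *ptr = (const unsigned char *)object;
    size_t i;

    assert(object != NULL && out != NULL);

    for (i = 0; i < size; i++) {
        /* Reading ptr[i] is allowed for any object type. */
        out[2 * i]     = digits[(ptr[i] >> 4) & 0x0F];
        out[2 * i + 1] = digits[ptr[i] & 0x0F];
    }
    out[2 * size] = '\0';
}
```

Calling it as bytes_to_hex(&floatVar, sizeof floatVar, buffer) shows the float's representation; the digit order reflects the platform's byte order.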

To be safe, I'd go with a byte array (unsigned char) rather than an 'int' to hold the value.

The data type int is an example of a non-portable type, since its byte order (endianness) can differ between platforms.

If you want to be portable, you need to define your own types and implement them on each platform you want to port to, then define conversion methods for your data types. As far as I know, that is the only way to have full control of byte order and the like.
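
One way such conversion methods might look, as a sketch only: the hypothetical helpers store_u32_be and load_u32_be fix the byte order to big-endian by using shifts, so the result does not depend on the host's endianness. The choice of big-endian and of a 32-bit value is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a 32-bit value as four bytes,
   most significant byte first, regardless of host byte order. */
void store_u32_be(unsigned char out[4], uint32_t value)
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}

/* Hypothetical helper: rebuilds the value from four big-endian bytes. */
uint32_t load_u32_be(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because the shifts operate on values rather than on memory layout, the same four bytes come out on little-endian and big-endian hosts alike.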

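Note that inserting an intermediate cast to a char pointer does not avoid the strict aliasing rule. The final access still reads a float object through an int lvalue, which is undefined behavior:

    float float_value = 3.14;
    int *int_pointer = (int *)(char *)&float_value;
    int int_value = *int_pointer;

The exemption for character types only covers accesses made through the char pointer itself. Note also that you might have sizeof(int) > sizeof(float), in which case the read additionally runs past the end of the object.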

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow