Question

I have a routine that promotes an array of single precision data to double precision in place using that the array is appropriately sized to handle the extra bytes:

void dpromote(const int n, double *x)             
{
    for (int i = n; i --> 0 ;) {    
        x[i] = ((float *)x)[i];
    }
}

On entry, x should contain n floats and on exit it will contain n doubles:

void test_dpromote()
{
    double  e[]   = {1, 2, 3, 4, 5, 6, 7};
    float   x[]   = {1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 0};
    const int n = sizeof(e)/sizeof(e[0]);
    dpromote(n, (void *) x);
    /* memcmp(x, e, sizeof(e)) will return 0 when this works as expected */
}

Why am I doing this? Mixed precision iterative refinement within a numerics-heavy code. For the purposes of this question, you can ignore why as its genuinely irrelevant.

Multiple compilers are okay with the dpromote logic at a variety of aggressive optimization levels with strict aliasing enabled. Recently, one compiler vendor (who shall remain nameless) decided to re-index my loop so that it was a forward traversal through memory instead of a backwards traversal. If you stare at the code for half a minute, you'll see that loop transformation produces utter garbage.

Does the dpromote logic, with all the C99 strict-aliasing bells and whistles enabled, rely on undefined behavior? I can't figure out why a compiler would think it okay to change the loop indexing unless the code is doing undefined things.

Was it helpful?

Solution

Yes, you are breaking the strict aliasing rules. Use a union - it may not maintain your original layout, however it reflects your intent much better and is generally cleaner:

#include <stdio.h>

union value 
{
    double d;
    float f;
};

void dpromote (const int n, value* x)
{
    for (int i=0; i < n; ++i)
        x[i].d = x[i].f;
}

void test_dpromote()
{
    value x[] = {{.f=1}, {.f=2}, {.f=3}, {.f=4}, {.f=5}, {.f=6}, {.f=7}};
    const int n = sizeof(x) / sizeof(x[0]);

    for (int i=0; i < n; ++i)
        printf("float: %f\n", x[i].f);

    dpromote(n, x);

    for (int i=0; i < n; ++i)
        printf("double: %f\n", x[i].d);
}

int main ()
{
    test_dpromote();
    return 0;
}

If you must maintain the original layout then you'll need to manage the block of memory manually to satisfy strict aliasing rules:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

double* dpromote (const int n, char* x)
{
    for (int i=n; i-- > 0; )
    {
        float f;
        memcpy(&f, &x[i*sizeof(float)], sizeof(f));
        double d = f;
        memcpy(&x[i*sizeof(double)], &d, sizeof(d));
    }

    return (double*)x;
}

void test_dpromote()
{
    int const n = 7;
    char* block = (char*)malloc(n*sizeof(double));
    for (int i=0; i < n; ++i)
    {
        float const x = i+1;
        memcpy(&block[i*sizeof(float)], &x, sizeof(x));
    }

    // It is now safe to access block through x

    float* x = (float*)block;
    for (int i=0; i < n; ++i)
        printf("float: %f\n", x[i]);

    double* y = dpromote(n, block);
    for (int i=0; i < n; ++i)
        printf("double: %f\n", y[i]);

    // It is now safe to access block through y, however
    // subsequent access through x will violate strict aliasing rules
}

int main ()
{
    test_dpromote();
    return 0;
}

OTHER TIPS

The Standard mandates that compilers make pessimistic aliasing assumptions in some places that are not likely to be useful, but is interpreted as inviting compilers to ignore evidence of other kinds of aliasing. I think your code would have defined behavior if every element that was read as float had been previously written as float, and was sequenced before any operation that writes it as "double". I think your code invokes UB on the last iteration, however, because the assignment reads and writes overlapping objects.

Had your code been written as:

void dpromote(const int n, double *x)             
{
    float *fp = x;
    for (int i = n; i --> 0 ;) {    
        double d = fp[i];
        x[i] = d;
    }
}

I would not particularly condemn a compiler that didn't notice the aliasing since there would be no evidence within the loop to suggest that aliasing might occur. I'm not keen on the behavior of the compiler rearranging your loop, though I wouldn't fault a compiler that malfunctioned for other reasons given the overlapping read and write [e.g. if a compiler targeting a CPU without an FPU did something like clear some bits of the "double" and then overwrote parts of it with a bit-shifted versions of the original exponent and mantissa]. If splitting the assignment into separated read and write doesn't fix the compiler's behavior, however, I would regard it as dodgy since such behavior abides neither the Standard nor common sense.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top