Question

I've been debugging some SSE-optimised vector code and noticed some odd behaviour. To be fair, the code style is pretty bad, but what the compiler does still seems wrong to me. Here is the function in question:

#include <emmintrin.h>  // SSE2 intrinsics: __m128d, _mm_load_pd, _mm_store_pd, ...

inline void daxpy(int n, double alph, const double* x, int incx, double* y, int incy) {
    __m128d sse_alph = _mm_load1_pd(&alph);
    while (n >= 4) {
        n -= 4;
        __m128d y1 = _mm_load_pd(y+n), y2 = _mm_load_pd(y+n+2);
        __m128d x1 = _mm_load_pd(x+n), x2 = _mm_load_pd(x+n+2);
        y1 = _mm_add_pd(y1, _mm_mul_pd(x1, sse_alph));
        y2 = _mm_add_pd(y2, _mm_mul_pd(x2, sse_alph));
        _mm_store_pd(y+n, y1), _mm_store_pd(y+n+2, y2);
    }
}

The function computes y = y + alph * x. Both arrays are guaranteed to have the same length, n, which is a multiple of 4, and x and y are aligned on 16-byte boundaries (I've omitted the relevant assertions for clarity).

The last line of the loop has been written with a comma operator so that it looks like the two load lines. The problem is that the first _mm_store_pd call is not executed. Isn't that wrong? I guess the compiler might have decided that only the second call is necessary to evaluate the expression, but it seems pretty obvious that the intrinsic function has a side effect.

Have I misunderstood what is going on here? I realise that using a comma operator like this is pretty poor style - my question is whether the compiler is wrong. The compiler in question is Visual C++ 2010 SP 1.

Solution

Building this code with Microsoft Visual Studio 2008, 2010, and 2012 shows they all eliminate the left operand of the comma operator. This happens only if optimization is enabled. When this code is built using gcc 4.8.1, the left operand of the comma operator is not eliminated, even when full optimization is used.

The C99 specification (section 6.5.17, "Comma operator") states, "The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation. Then the right operand is evaluated".

In my opinion, the Microsoft optimizer is incorrect to remove this code. This is because the language specification says both operands are evaluated. The only differences between the two operands of the comma operator are the order of their evaluation and which one provides the result for the comma operator. In this case the result is void.
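To make the rule above concrete, here is a minimal sketch in plain C++ with no intrinsics involved. The names record and run_comma_expression are hypothetical and exist only for this illustration. Both calls have a visible side effect, so a conforming compiler has to perform both, left operand first.

```cpp
#include <vector>

// Hypothetical helper: appending to the log is the observable side effect.
static void record(std::vector<int>& log, int value) {
    log.push_back(value);
}

// Joins two side-effecting calls with a comma operator and returns the log
// so the caller can check which calls actually ran, and in which order.
std::vector<int> run_comma_expression() {
    std::vector<int> log;
    // Left operand is evaluated first and its (void) result is discarded;
    // after the sequence point the right operand is evaluated.
    record(log, 1), record(log, 2);
    return log;
}
```

If the left operand were dropped, the returned log would hold only the value 2, which is the analogue of the missing first store in the question.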

Work-around: replace the comma with a semicolon.
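As a sketch of that work-around, the loop from the question can be written with each store as its own statement. This assumes an x86 target with SSE2, 16-byte-aligned arrays, and n a multiple of 4, as stated in the question. The name daxpy_fixed is made up here to distinguish it from the original, and the incx and incy parameters are kept only to match the original signature.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// y = y + alph * x for n doubles, with n a multiple of 4 and both arrays
// 16-byte aligned. incx and incy are unused, as in the question's version.
inline void daxpy_fixed(int n, double alph, const double* x, int incx,
                        double* y, int incy) {
    (void)incx;
    (void)incy;
    __m128d sse_alph = _mm_load1_pd(&alph);  // broadcast alph to both lanes
    while (n >= 4) {
        n -= 4;
        __m128d y1 = _mm_load_pd(y + n);
        __m128d y2 = _mm_load_pd(y + n + 2);
        __m128d x1 = _mm_load_pd(x + n);
        __m128d x2 = _mm_load_pd(x + n + 2);
        y1 = _mm_add_pd(y1, _mm_mul_pd(x1, sse_alph));
        y2 = _mm_add_pd(y2, _mm_mul_pd(x2, sse_alph));
        // Two separate statements instead of one comma expression.
        _mm_store_pd(y + n, y1);
        _mm_store_pd(y + n + 2, y2);
    }
}
```

With the stores separated, there is no comma expression left for the optimizer to treat as having a discardable left operand.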

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow