Am I breaking strict aliasing rules?

Question 1

There is only one intrinsic that "extracts" the low-order double value from an xmm register:

double _mm_cvtsd_f64 (__m128d a)

You could use it this way:

return _mm_cvtsd_f64(x);

The references disagree on what it compiles to. MSDN says "This intrinsic does not map to any specific machine instruction", while the Intel Intrinsics Guide mentions the movsd instruction. Even in the latter case, the optimizer easily eliminates the additional instruction. At least gcc 4.8.1 with the -O2 flag generates code with no additional instruction.
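For completeness, here is a minimal self-contained sketch of that call. The names low_double and low_double_demo are made up for illustration and are not from the question; the only assumption is an x86 target with SSE2.

```cpp
#include <emmintrin.h> // SSE2: __m128d, _mm_set_pd, _mm_cvtsd_f64

// Hypothetical helper: returns the low-order lane of an SSE2 register.
inline double low_double(__m128d x)
{
    return _mm_cvtsd_f64(x);
}

// Hypothetical usage: _mm_set_pd takes the high lane first, then the low lane.
inline double low_double_demo()
{
    __m128d x = _mm_set_pd(2.0, 1.0); // high lane = 2.0, low lane = 1.0
    return low_double(x);             // yields the low lane
}
```

Since no pointer cast is involved, the aliasing question does not come up at all with this form.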

Question 2

I think the aggregate-or-union bullet point in the list below should allow your cast here, as we may consider __m128d as an aggregate of two doubles in a union with the full register. With regard to strict aliasing, compilers have always been quite lenient about unions, even though originally only a cast to (char*) was supposed to be valid.

§3.10: If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined (The intent of this list is to specify those circumstances in which an object may or may not be aliased):

the dynamic type of the object,

a cv-qualified version of the dynamic type of the object,

a type similar (as defined in 4.4) to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

a char or unsigned char type.
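To make the list a bit more concrete, here is a rough sketch of two accesses that I believe stay within it: reading an object's bytes through unsigned char (the last bullet), and copying the bytes with std::memcpy so that no glvalue of the wrong type is ever used. The names byte_of and bits_of are hypothetical, and the sketch assumes double is 64 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Allowed by the last bullet: inspect an object's bytes through unsigned char.
inline unsigned char byte_of(const double& value, std::size_t index)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    return bytes[index]; // caller must keep index < sizeof(double)
}

// Copying the bytes with std::memcpy never reads the double through a
// glvalue of type std::uint64_t, so the list above is not violated.
inline std::uint64_t bits_of(double value)
{
    static_assert(sizeof(std::uint64_t) == sizeof(double), "assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

By contrast, something like *reinterpret_cast<std::uint64_t*>(&value) reads a double through an unrelated type and matches none of the bullets.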

Question 3

Yes, I think this breaks strict aliasing. However, in practice this is usually fine.
(I'm mostly writing this as an answer because it's difficult to describe well in a comment.)

But, you could instead do something like this:

inline double plop() const // member function
{
    __m128d x = _mm_load_pd(v);
    ... // some stuff

    union {
        unsigned long long i; // 64-bit int
        double             d; // 64-bit double
    };

    // _mm_castpd_si128 reinterprets the register as an integer vector;
    // _mm_cvtsi128_si64 extracts its lowest 64 bits
    i = _mm_cvtsi128_si64(_mm_castpd_si128(x));

    // read the same bits back as a double through the union; GCC documents this
    // form of punning as supported, although ISO C++ itself does not guarantee it
    return d;
}
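If you would rather not rely on the union at all, a variant along the following lines should also work. This is only a sketch: low_double_via_int is a made-up free function rather than the member function from the question, it assumes a 64-bit x86 target (where _mm_cvtsi128_si64 is available), and it swaps the union for std::memcpy.

```cpp
#include <cstring>     // std::memcpy
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical free-function variant of plop(): same intrinsics, no union.
inline double low_double_via_int(__m128d x)
{
    // reinterpret the register as an integer vector and take its lowest 64 bits
    long long i = _mm_cvtsi128_si64(_mm_castpd_si128(x));

    double d;
    static_assert(sizeof d == sizeof i, "assumes a 64-bit double");
    std::memcpy(&d, &i, sizeof d); // copy the bit pattern instead of punning
    return d;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register move, so it should not cost anything over the union version.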

Question 4

What about return x.m128d_f64[0]; ?
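As far as I know, the m128d_f64 member only exists in MSVC's definition of __m128d, so that line will not compile with GCC or Clang. A portable way to get the same effect is to store the register into a small array first. The sketch below assumes SSE2, and the name lane is hypothetical.

```cpp
#include <emmintrin.h> // SSE2: __m128d, _mm_storeu_pd

// Hypothetical helper: read lane 0 (low) or lane 1 (high) of the register.
inline double lane(__m128d x, int index)
{
    double out[2];
    _mm_storeu_pd(out, x); // unaligned store of both lanes, low lane first
    return out[index];     // caller must pass 0 or 1
}
```

For the low lane only, _mm_cvtsd_f64 is simpler; the store is mainly useful when you also need the high lane.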