I suggest that you use the following code:
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Return the upper double of x. */
static inline double _mm_cvtsd_f64_h(__m128d x) {
    return _mm_cvtsd_f64(_mm_unpackhi_pd(x, x));
}
This is likely the fastest way to get the upper half of an XMM
register, and it is compatible with MSVC/icc/gcc/clang.
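
As a rough usage sketch, the helper could be combined with _mm_cvtsd_f64 to add the two lanes of a vector. The helper is repeated here so the snippet stands on its own; sum_lanes is a hypothetical name made up for illustration, not part of any library.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Helper from the answer: return the upper double of x. */
static inline double _mm_cvtsd_f64_h(__m128d x) {
    /* unpackhi(x, x) copies the upper lane of x into the lower lane. */
    return _mm_cvtsd_f64(_mm_unpackhi_pd(x, x));
}

/* Hypothetical helper: add the lower and upper lanes of v. */
static inline double sum_lanes(__m128d v) {
    return _mm_cvtsd_f64(v) + _mm_cvtsd_f64_h(v);
}
```

Note that _mm_set_pd takes the upper element first, so _mm_set_pd(2.0, 1.0) puts 2.0 in the upper lane, which is the value the helper returns for that vector.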