I would simply use the _mm_set_pd or _mm_set1_pd intrinsics and see what your compiler generates. It should be reasonably efficient, and if not, the generated code may give you an idea of how to improve on it with more explicit intrinsics. For example:
double d[2];
__m128d v0 = _mm_set_pd(d[0], 0.0);
__m128d v1 = _mm_set_pd(d[1], 0.0);
Alternatively, as pointed out by @Mysticial and @Anycorn, you can just use _mm_load_sd, which loads a single double into the low element and zeroes the high element:
double d[2];
__m128d v0 = _mm_load_sd(&d[0]);
__m128d v1 = _mm_load_sd(&d[1]);
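If it matters which half of the register the value ends up in, it is worth checking the lane order: _mm_set_pd takes its arguments high element first, whereas _mm_load_sd and _mm_set_sd fill the low element. Here is a minimal sketch that makes this visible. The helper names (lanes, high_via_set_pd and so on) are hypothetical and only there for illustration; the intrinsics themselves are the standard SSE2 ones from emmintrin.h.

```c
#include <emmintrin.h>  /* SSE2: __m128d, _mm_set_pd, _mm_load_sd, ... */

/* Hypothetical helper: write the low lane to out[0] and the high lane to out[1]. */
static void lanes(__m128d v, double out[2])
{
    _mm_storeu_pd(out, v);  /* unaligned store, so out needs no special alignment */
}

/* high lane = x, low lane = 0.0 (_mm_set_pd takes the high element first) */
static __m128d high_via_set_pd(double x)
{
    return _mm_set_pd(x, 0.0);
}

/* low lane = *p, high lane = 0.0 */
static __m128d low_via_load_sd(const double *p)
{
    return _mm_load_sd(p);
}

/* low lane = x, high lane = 0.0 (takes a value rather than a pointer) */
static __m128d low_via_set_sd(double x)
{
    return _mm_set_sd(x);
}

/* both lanes = x */
static __m128d both_via_set1_pd(double x)
{
    return _mm_set1_pd(x);
}
```

Calling lanes() on each result and printing out[0] and out[1] shows where the value landed. If you want the value in the low element, _mm_load_sd or _mm_set_sd is the direct way to say so; if you want it in the high element, _mm_set_pd(x, 0.0) does that. Either way, look at the generated assembly to confirm the compiler emits what you expect.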