Not sure if there's a single instruction for this, but something like this ought to work (untested):
; Assume that the 8 16-bit values are in xmm0
PSHUFLW xmm1,xmm0,0D8h ; Change word order to 3120 in the low qword
PSHUFHW xmm1,xmm1,0D8h ; Change word order to 3120 in the high qword
PSHUFD xmm1,xmm1,0D8h ; Change dword order to 3120
MOVDQA xmm0,xmm1 ; Copy to xmm0 (integer move; MOVAPD would also work but is meant for packed doubles)
PUNPCKLWD xmm0,xmm0 ; Expand even words to dwords
PUNPCKHWD xmm1,xmm1 ; Expand odd words to dwords
PSLLD xmm0,16 ; Sign-extend
PSRAD xmm0,16 ; ...
PSLLD xmm1,16 ; Same sign-extension for the odd words
PSRAD xmm1,16 ; ...
xmm0 should now contain the 4 even words sign-extended to 32 bits, and xmm1 should contain the odd words.
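If you'd rather check the sequence from C than step through it in a debugger, here is a rough sketch of the same steps using SSE2 intrinsics. The function name and signature are made up for illustration; each intrinsic is annotated with the instruction it corresponds to.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions, not part of the
   original answer): splits 8 int16 values into the even-indexed and
   odd-indexed elements, each sign-extended to int32. */
static void split_even_odd_sse2(const int16_t in[8],
                                int32_t even[4], int32_t odd[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);

    v = _mm_shufflelo_epi16(v, 0xD8);  /* PSHUFLW: w0 w2 w1 w3 in the low qword  */
    v = _mm_shufflehi_epi16(v, 0xD8);  /* PSHUFHW: w4 w6 w5 w7 in the high qword */
    v = _mm_shuffle_epi32(v, 0xD8);    /* PSHUFD:  w0 w2 w4 w6 | w1 w3 w5 w7     */

    __m128i e = _mm_unpacklo_epi16(v, v);  /* PUNPCKLWD: duplicate each even word */
    __m128i o = _mm_unpackhi_epi16(v, v);  /* PUNPCKHWD: duplicate each odd word  */

    /* PSLLD 16 followed by PSRAD 16 sign-extends the low word of each dword. */
    e = _mm_srai_epi32(_mm_slli_epi32(e, 16), 16);
    o = _mm_srai_epi32(_mm_slli_epi32(o, 16), 16);

    _mm_storeu_si128((__m128i *)even, e);
    _mm_storeu_si128((__m128i *)odd, o);
}
```

Calling it on an array such as {1, -2, 3, -4, 5, -6, 7, -8} should put the elements at indices 0, 2, 4, 6 in even and those at indices 1, 3, 5, 7 in odd.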
If you can use SSE4.1 instructions, it's possible to simplify the sign-extension part a bit. For the even words (xmm0) you could replace the unpack and the two shifts with:
PMOVSXWD xmm0,xmm0
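For the odd words, PMOVSXWD only reads the low 4 words of its source, so the high qword has to be moved down first. One way (my assumption, not something I've benchmarked) is a PSRLDQ by 8 bytes before the PMOVSXWD. Here is a sketch of the whole SSE4.1 variant with intrinsics; the function name is hypothetical, and the target attribute is only there so GCC/Clang accept the SSE4.1 intrinsic without -msse4.1.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics (pulls in the SSE2 ones too) */
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions): same result as
   the SSE2 sequence, but using PMOVSXWD for the sign extension. */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
static void split_even_odd_sse41(const int16_t in[8],
                                 int32_t even[4], int32_t odd[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);

    v = _mm_shufflelo_epi16(v, 0xD8);  /* PSHUFLW: w0 w2 w1 w3 in the low qword  */
    v = _mm_shufflehi_epi16(v, 0xD8);  /* PSHUFHW: w4 w6 w5 w7 in the high qword */
    v = _mm_shuffle_epi32(v, 0xD8);    /* PSHUFD:  w0 w2 w4 w6 | w1 w3 w5 w7     */

    /* PMOVSXWD: sign-extend the low 4 words (the even ones) to dwords. */
    __m128i e = _mm_cvtepi16_epi32(v);

    /* PSRLDQ 8 moves the odd words into the low qword, then PMOVSXWD. */
    __m128i o = _mm_cvtepi16_epi32(_mm_srli_si128(v, 8));

    _mm_storeu_si128((__m128i *)even, e);
    _mm_storeu_si128((__m128i *)odd, o);
}
```

This drops the register copy, both unpacks and all four shifts, at the cost of requiring an SSE4.1-capable CPU.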