Use _mm_mul_epu32 when the operands should be considered unsigned integers, and _mm_mul_epi32 otherwise.
In a 32-bit -> 64-bit widening multiplication, treating the operands as unsigned or as signed yields different results, so there are separate instructions. Add, sub and mov produce the same bit pattern under either interpretation (two's complement), so they don't need separate instructions. There is no separate __m128u type. Just use __m128i and remember it contains unsigned numbers.
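As a rough illustration of the point above, here is a minimal sketch that uses only SSE2. The helper names (mul_u32_lane0, mul_i32_scalar, add_lane0) are made up for this example. Because _mm_mul_epi32 requires SSE4.1, the signed product is computed with plain scalar code here as a stand-in for what a signed widening multiply returns for the same bits.

```c
#include <emmintrin.h>  /* SSE2: _mm_mul_epu32, _mm_add_epi32 */
#include <stdint.h>

/* Hypothetical helper: unsigned 32x32 -> 64 multiply using lane 0 of an __m128i.
   _mm_mul_epu32 multiplies the low 32 bits of each 64-bit element. */
static uint64_t mul_u32_lane0(uint32_t a, uint32_t b)
{
    __m128i va = _mm_cvtsi32_si128((int)a);  /* a in the low 32 bits, rest zero */
    __m128i vb = _mm_cvtsi32_si128((int)b);
    __m128i prod = _mm_mul_epu32(va, vb);    /* operands treated as unsigned */
    uint64_t out[2];
    _mm_storeu_si128((__m128i *)out, prod);
    return out[0];
}

/* Scalar reference: the result of treating the same operands as signed. */
static int64_t mul_i32_scalar(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Hypothetical helper: 32-bit add in lane 0. The same instruction serves
   signed and unsigned operands because the result bits are identical. */
static uint32_t add_lane0(uint32_t a, uint32_t b)
{
    __m128i sum = _mm_add_epi32(_mm_cvtsi32_si128((int)a),
                                _mm_cvtsi32_si128((int)b));
    return (uint32_t)_mm_cvtsi128_si32(sum);
}
```

With the bit pattern 0xFFFFFFFF and the value 2, mul_u32_lane0 gives 0x1FFFFFFFE, while the signed interpretation (-1 times 2) gives -2, which is 0xFFFFFFFFFFFFFFFE as a 64-bit pattern. add_lane0 returns 1 for the same inputs, which is correct both as a wrapped unsigned sum and as -1 + 2.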