Question

I'm wondering whether the following calculation can be done on four values in parallel within an MMX register:

(a*b)/256

where a is a signed word and b is an unsigned value (a blend factor) in the range 0-256.

I think my problem is that I'm not sure about how (or if) pmullw and pmulhw will help me with this task.


Solution

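If you know that a*b won't overflow a signed 16-bit field, then you can use pmullw (MMX intrinsic _mm_mullo_pi16, or SSE2 intrinsic _mm_mullo_epi16) and then shift right by 8 to do the division by 256. Because a is signed, the shift has to be arithmetic (psraw, i.e. _mm_srai_pi16 or _mm_srai_epi16). A logical shift (psrlw) would fill the upper bits with zeros and corrupt negative results. Note that an arithmetic shift rounds toward negative infinity, whereas integer division in C rounds toward zero.

MMX:

__m64 a, b;
...
a = _mm_mullo_pi16 (a, b);   /* low 16 bits of each product */
a = _mm_srai_pi16 (a, 8);    /* arithmetic shift right: divide by 256, keep the sign */

SSE2:

__m128i a, b;
...
a = _mm_mullo_epi16 (a, b);  /* low 16 bits of each product */
a = _mm_srai_epi16 (a, 8);   /* arithmetic shift right: divide by 256, keep the sign */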
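
If the product can exceed 16 bits (with b as large as 256, that already happens once a falls outside roughly -128..127), the low word alone is not enough. One possible approach, sketched here for SSE2, is to take the high half of each product with pmulhw and the low half with pmullw, then stitch bits 8..23 of the 32-bit product back together. The helper names scale_q8_epi16 and scale_q8_x8 are made up for this illustration, and the sketch assumes b is stored as a 16-bit value in 0..256, so it is also valid as a signed operand.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: per 16-bit lane, floor((a * b) / 256).
 * Assumes a is a signed word and b lies in 0..256. */
static __m128i scale_q8_epi16(__m128i a, __m128i b)
{
    __m128i lo = _mm_mullo_epi16(a, b);  /* bits 0..15 of each signed product  */
    __m128i hi = _mm_mulhi_epi16(a, b);  /* bits 16..31 of each signed product */

    /* Bits 8..23 of the product: the low byte of hi moves up into the
     * top byte, and the high byte of lo moves down (logical shift, so
     * no sign bits leak into the upper half). */
    return _mm_or_si128(_mm_slli_epi16(hi, 8), _mm_srli_epi16(lo, 8));
}

/* Hypothetical wrapper that processes 8 values from plain arrays. */
static void scale_q8_x8(const int16_t *a, const int16_t *b, int16_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, scale_q8_epi16(va, vb));
}
```

The result always fits in a signed word because |a*b| is at most 32768*256. The same sequence should carry over to MMX with _mm_mulhi_pi16, _mm_mullo_pi16, _mm_slli_pi16, _mm_srli_pi16 and _mm_or_si64. As with the arithmetic shift, negative results are rounded toward negative infinity.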
Licensed under: CC-BY-SA with attribution