How can I convert an XMM register of single-precision floats to integers?

https://stackoverflow.com/questions/18861499

29-06-2022

Question

I have a bunch of packed floats inside an XMM register (using SSE intrinsics):

__m128 xmm = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);

I'd like to convert all of these to integers in one go. I found an intrinsic that does what I want (_mm_cvtps_pi16()), but it yields four 16-bit shorts instead of full-blown ints. An intrinsic called _mm_cvtps_pi32() yields ints, but only for the two lower values in xmm. I could use it, extract the values, move things around and use it again, but is there a simpler way? Why wouldn't there be a straightforward packed 32-bit float -> 32-bit integer instruction? Surely both fit in the same space of an XMM register?

EDIT: Okay, I see now that _mm_cvtps_pi32() returns __m64 instead of __m128, which means it operates on an MMX-style MM... register. That would explain why it returns just two ints, but now I'm wondering:

Will I have trouble when compiling for x64? Reportedly, __m64 isn't supported there...
Why didn't they extend this instruction when SSE rolled out?

Thanks!

Solution

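According to this documentation, __m128i _mm_cvtps_epi32(__m128 a) generates a cvtps2dq instruction, which does what you want: it converts all four packed single-precision floats to four 32-bit integers, staying entirely in XMM registers. It is an SSE2 intrinsic declared in <emmintrin.h>, and it rounds according to the current MXCSR rounding mode (round-to-nearest-even by default). If you want truncation toward zero, like a C cast, use _mm_cvttps_epi32 (cvttps2dq) instead.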
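As a rough illustration of the conversion described above, here is a minimal sketch in C. The helper names convert4_round and convert4_trunc are hypothetical (they are not from the question or the documentation), and the code assumes an SSE2-capable target and the default MXCSR rounding mode.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_cvtps_epi32, _mm_cvttps_epi32 */

/* Hypothetical helper: converts 4 floats to 4 ints with cvtps2dq,
 * which rounds using the current MXCSR rounding mode
 * (round-to-nearest-even unless it has been changed). */
static void convert4_round(const float *in, int *out)
{
    __m128  f = _mm_loadu_ps(in);          /* unaligned load of 4 floats  */
    __m128i i = _mm_cvtps_epi32(f);        /* 4 x float -> 4 x int32      */
    _mm_storeu_si128((__m128i *)out, i);   /* unaligned store of 4 ints   */
}

/* Hypothetical helper: same conversion with cvttps2dq,
 * which always truncates toward zero (like a C cast). */
static void convert4_trunc(const float *in, int *out)
{
    __m128  f = _mm_loadu_ps(in);
    __m128i i = _mm_cvttps_epi32(f);
    _mm_storeu_si128((__m128i *)out, i);
}
```

Both helpers use only XMM registers, so no __m64 / MMX state is involved. The unaligned load and store are used here only so the sketch makes no assumption about the alignment of the caller's arrays.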

Other tips

Use the documentation (look up _mm_cvtps_epi32):

Magic documentation.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow