Finding a NEON instruction corresponding to an SSE instruction

https://stackoverflow.com/questions/19631506

01-07-2022

Question

I want to know the NEON equivalent of the following SSE instruction:

__m128i a,b,c;
c = _mm_packs_epi32(a, b);

It packs the 8 signed 32-bit integers from a and b into signed 16-bit integers, with saturation.

I checked the ARM site but didn't find an equivalent instruction: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204j/Bcfjicfj.html

Solution

There is no instruction that directly does what you want, but all the building blocks to build one are there:

The saturation/narrow instruction is:

int16x4_t vqmovn_s32 (int32x4_t)

This intrinsic saturates signed 32-bit integers to signed 16-bit integers, returning the four narrowed values in a 64-bit vector.

Combining these into your _mm_packs_epi32 is easy: Just do it for a and b, and combine the results:

  int32x4_t a,b;
  int16x8_t c;

  c = vcombine_s16 (vqmovn_s32(a), vqmovn_s32(b));

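vcombine_s16 places its first argument in the low half of the result, which matches _mm_packs_epi32 placing the values from a first, so the argument order shown should not need swapping.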
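
As a rough sketch of how the pieces fit together, the function below wraps the vcombine_s16/vqmovn_s32 combination behind a plain array interface. The name pack_s32_to_s16 and the array-based signature are assumptions made for illustration, not part of any NEON or SSE header. The scalar branch is only there so the sketch also builds on hosts without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes a[0..3] then b[0..3] to out[0..7],
   saturating each value to the signed 16-bit range. This mirrors
   the lane order of _mm_packs_epi32(a, b). */
static void pack_s32_to_s16(const int32_t a[4], const int32_t b[4],
                            int16_t out[8])
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__ARM_NEON)
    int32x4_t va = vld1q_s32(a);
    int32x4_t vb = vld1q_s32(b);
    /* vcombine_s16(low, high): narrowed a fills lanes 0..3, b fills 4..7. */
    int16x8_t vc = vcombine_s16(vqmovn_s32(va), vqmovn_s32(vb));
    vst1q_s16(out, vc);
#else
    /* Portable fallback with the same saturating behavior. */
    for (int i = 0; i < 4; ++i) {
        int32_t x = a[i];
        int32_t y = b[i];
        out[i]     = (int16_t)(x > INT16_MAX ? INT16_MAX
                             : x < INT16_MIN ? INT16_MIN : x);
        out[i + 4] = (int16_t)(y > INT16_MAX ? INT16_MAX
                             : y < INT16_MIN ? INT16_MIN : y);
    }
#endif
}
```

In real code you would normally keep the values in int32x4_t/int16x8_t registers rather than going through arrays; the loads and stores here only make the sketch easy to call and check.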

Other tips

This pack/saturate operation comes under the MOV instruction category in NEON:

VQMOVN (Vector Saturating Move and Narrow) copies each element of the operand vector to the corresponding element of the destination vector. The result element is half the width of the operand element, and values are saturated to the result width.
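
For the signed 32-bit to 16-bit case, the per-element behavior described above can be modeled in scalar C roughly as follows. The helper name saturate_s32_to_s16 is hypothetical and used only for illustration; it is not a NEON intrinsic.

```c
#include <stdint.h>

/* Scalar model of what VQMOVN.S32 does to a single element:
   values outside the int16_t range are clamped, others pass through. */
static int16_t saturate_s32_to_s16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;  /* clamp to  32767 */
    if (x < INT16_MIN) return INT16_MIN;  /* clamp to -32768 */
    return (int16_t)x;                    /* in range: value unchanged */
}
```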

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow