Question

I am trying to figure out how to use the SSSE3 intrinsic _mm_shuffle_epi8 to compact a 128-bit register.

Let's say I have an input variable

__m128i target

which holds eight 16-bit elements, denoted:

a[0], a[1] ... a[7].  // each slot is 16 bits

My output is called:

__m128i output

Now I have a bit-vector of size 8:

char bit_mask // 8 bits; the i-th bit indicates whether
              // the corresponding a[i] should be included

OK, how can I get the final result based on the bit_mask and the input target?

Assume my bit vector is:

[0 1 1 0 0 0 0 0]

Then I want the result to be:

output = [a[1], a[2], ... ]

Any known way to do this using _mm_shuffle_epi8?

Assume I use a lookup array: _mm_shuffle_epi8(a, mask_lookup[bitvector]);

How do I create the array?

No correct solution

OTHER TIPS

Simple and very fast, but requires 4 KB of table space (256 masks of 16 bytes each):

_mm_shuffle_epi8(a, mask_lookup[bitvector]);

Here mask_lookup stores all 256 possible shuffle masks, one per value of the bitvector, and the bitvector is used directly as the index.
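One way the table could be built is sketched below. This is only an illustration under a few assumptions that the question does not pin down: bit i of the mask (bit 0 being the least significant) selects a[i], the selected elements are packed toward the low end of the output in their original order, and the unused output lanes are zeroed by storing 0x80 in the shuffle mask (a shuffle-control byte with its high bit set makes _mm_shuffle_epi8 write zero). The names init_mask_lookup, compact_epi16, compact_u16x8 and the SSSE3_FN macro are hypothetical helpers, not part of any library.

```c
#include <stdint.h>
#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

/* Assumption: on GCC/Clang this attribute lets the file build without
 * -mssse3. Drop it if you already compile with -mssse3 (or use MSVC). */
#if defined(__GNUC__)
#define SSSE3_FN __attribute__((target("ssse3")))
#else
#define SSSE3_FN
#endif

/* 256 shuffle masks x 16 bytes = 4 KB, indexed by the 8-bit selection mask. */
static uint8_t mask_lookup[256][16];

/* Fill the table once, before the first call to compact_epi16(). */
static void init_mask_lookup(void)
{
    for (int m = 0; m < 256; ++m) {
        int out = 0;  /* next free byte position in the output register */
        for (int i = 0; i < 8; ++i) {
            if (m & (1 << i)) {
                /* Copy both bytes of 16-bit element a[i] (little-endian). */
                mask_lookup[m][out++] = (uint8_t)(2 * i);
                mask_lookup[m][out++] = (uint8_t)(2 * i + 1);
            }
        }
        while (out < 16)
            mask_lookup[m][out++] = 0x80;  /* high bit set: output byte is 0 */
    }
}

/* Pack the selected 16-bit elements of target toward the low lanes. */
SSSE3_FN
static __m128i compact_epi16(__m128i target, uint8_t bit_mask)
{
    __m128i shuf = _mm_loadu_si128((const __m128i *)mask_lookup[bit_mask]);
    return _mm_shuffle_epi8(target, shuf);
}

/* Convenience wrapper operating on plain arrays of eight 16-bit values. */
SSSE3_FN
static void compact_u16x8(const uint16_t in[8], uint8_t bit_mask,
                          uint16_t out[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, compact_epi16(v, bit_mask));
}
```

With this bit order, the example from the question ([0 1 1 0 0 0 0 0], i.e. a[1] and a[2] selected) corresponds to bit_mask = 0x06. Call init_mask_lookup() once at startup; alternatively the table could be generated offline and stored as a const array.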

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow