Question

I wrote a program that runs _mm_cmpistri to get the next \n (newline) character. While this works great on my computer, it fails on a server due to missing SSE 4.2 support.

Is there a good alternative that uses only SSE 4.1 or earlier instructions?

Solution

Ok, actual code it is. This hasn't been tested, it's just to give you the idea.

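// Needs <emmintrin.h> (SSE2 only) and <strings.h> for ffs.
// ffs() returns a 1-based bit position, hence the "- 1" to get a 0-based index.
__m128i lf = _mm_set1_epi8('\n');
// unaligned part (always reads 16 bytes starting at ptr)
__m128i data = _mm_loadu_si128((const __m128i *)ptr);
int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(data, lf));
if (mask != 0)
    return ffs(mask) - 1;
int index = 16 - (int)((size_t)ptr & 15);
// aligned part, possibly overlaps unaligned part but that's ok
for (; index < length; index += 16) {
    data = _mm_load_si128((const __m128i *)(ptr + index));
    mask = _mm_movemask_epi8(_mm_cmpeq_epi8(data, lf));
    if (mask != 0)
        return index + ffs(mask) - 1;
}
// A block can extend past length, so the caller should check result < length.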

For MSVC, ffs can be defined in terms of _BitScanForward (which yields a 0-based bit index, whereas ffs is 1-based).
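
For reference, here is a self-contained sketch of the same idea wrapped in a function. The names find_newline and lowest_set_bit are made up for this example, and the return convention (0-based offset, or -1 if there is no newline) is an assumption, not something from the question. It also swaps the aligned loop for unaligned loads over full 16-byte blocks plus a scalar tail, so it never reads past ptr + length. Treat it as a starting point rather than a tuned implementation.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

#if defined(_MSC_VER)
#include <intrin.h>
/* 0-based index of the lowest set bit; mask must be non-zero. */
static int lowest_set_bit(unsigned mask) {
    unsigned long pos;
    _BitScanForward(&pos, mask);
    return (int)pos;
}
#else
/* GCC/Clang builtin; mask must be non-zero. */
static int lowest_set_bit(unsigned mask) {
    return __builtin_ctz(mask);
}
#endif

/* Hypothetical helper: 0-based offset of the first '\n' in ptr[0..length),
   or -1 if there is none. */
static ptrdiff_t find_newline(const char *ptr, size_t length) {
    const __m128i lf = _mm_set1_epi8('\n');
    size_t index = 0;

    /* Full 16-byte blocks only, so no load goes past ptr + length. */
    for (; index + 16 <= length; index += 16) {
        __m128i data = _mm_loadu_si128((const __m128i *)(ptr + index));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(data, lf));
        if (mask != 0)
            return (ptrdiff_t)index + lowest_set_bit((unsigned)mask);
    }

    /* Scalar tail for the remaining 0..15 bytes. */
    for (; index < length; ++index) {
        if (ptr[index] == '\n')
            return (ptrdiff_t)index;
    }
    return -1;
}
```

Everything here is SSE2, which every x86-64 CPU supports, so it should run on the server without SSE 4.2. If profiling shows the unaligned loads matter on your hardware, the aligned loop from the snippet above can be put back in.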

License: CC BY-SA with attribution
Not affiliated with StackOverflow