Why does _mm_load_si128 cause an access violation even with a 16-byte aligned address?

StackOverflow https://stackoverflow.com/questions/22881224

  •  28-06-2023

Question

The following compiles without warnings on MSVC.

#include <iostream>
#include <emmintrin.h>

int main() 
{
    __declspec(align(16)) int x = 42;
    std::cout << &x << "\n";  // Print out the address that holds x

    __m128i v = _mm_load_si128((__m128i const*)(x));
}

Essentially, the code aligns a 32-bit integer to 16 bytes and tries to load it into a __m128i. _mm_load_si128 requires the input address to be 16-byte aligned; _mm_loadu_si128 does not. Both, however, cause the above code to raise an access violation when run. Why, and how do I fix it?

Solution

You have forgotten to take the address of x:

__m128i v = _mm_load_si128((__m128i const*)(&x));
//                                          ^
//                                          |
//                     Here ----------------+
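
To see why the original cast faults regardless of alignment: it converts the value of x (42) into a pointer, so the load dereferences address 0x2A rather than the address printed by the program. A minimal sketch that compares the two addresses without dereferencing either one; the helper names are hypothetical, and the integer-to-pointer conversion is implementation-defined, although mainstream x86 compilers preserve the numeric value.

```cpp
#include <cstdint>

// Hypothetical helper: the numeric address obtained by casting the VALUE of x
// to a pointer, which is what the question's code does. Nothing is dereferenced.
std::uintptr_t address_from_value(int x)
{
    const void* p = reinterpret_cast<const void*>(static_cast<std::uintptr_t>(x));
    return reinterpret_cast<std::uintptr_t>(p);
}

// Hypothetical helper: the numeric address of the object itself,
// which is what the load actually needs.
std::uintptr_t address_of_object(const int& x)
{
    return reinterpret_cast<std::uintptr_t>(&x);
}
```

For x = 42, the first helper yields 42 on typical x86 targets, an address that is not mapped in a normal process, while the second yields the aligned stack address of x.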

In addition, x does not provide enough space for the data: it is a 4-byte int, while _mm_load_si128 reads 16 bytes, so the load would read past the end of the object.
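
One possible way to address the size problem, sketched under the assumption that the goal is to get 42 into the lowest 32-bit lane of the register: back the load with a 16-byte aligned array of four ints. The function names are hypothetical, and the standard C++11 alignas(16) is used here in place of the MSVC-specific __declspec(align(16)).

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: load a full, 16-byte aligned block and return lane 0.
int load_low_lane()
{
    // Four ints occupy 16 bytes, so the load stays inside the object.
    alignas(16) int data[4] = {42, 0, 0, 0};
    __m128i v = _mm_load_si128(reinterpret_cast<const __m128i*>(data));
    return _mm_cvtsi128_si32(v);  // extract the lowest 32-bit lane
}

// Alternative when only a single int is needed: no memory load at all.
int set_low_lane(int x)
{
    __m128i v = _mm_cvtsi32_si128(x);  // x in lane 0, upper lanes zeroed
    return _mm_cvtsi128_si32(v);
}
```

Both helpers return the value that was placed in lane 0. The second avoids the alignment and size requirements entirely, which may be the simpler choice when the source is a lone scalar.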

Licensed under: CC-BY-SA with attribution