Question

I'm new to using intrinsic functions, so I'm not sure why my program is crashing. I'm able to build the program, but when I run it I just get the "programname.exe has stopped working" window.

    #include "stdafx.h"
    #include <stdio.h>
    #include <Windows.h>
    #include <intrin.h>

    int _tmain(int argc, _TCHAR* argv[])
    {
        const int N = 128;
        float x[N], y[N];
        float sum = 0;

        for (int i = 0; i < N; i++)
        {
            x[i] = rand() >> 1;
            y[i] = rand() >> 1;
        }

        float* ptrx = x;
        float* ptry = y;

        __m128 x1;

        x1 = _mm_load_ps(ptrx);

        return 0;
    }

If I comment out the 'x1 = _mm_load_ps(ptrx);' line, the program is able to run, so that is what is causing the crash.

Here is the output when building the solution...

1>------ Rebuild All started: Project: intrins2, Configuration: Debug Win32 ------
1>  stdafx.cpp
1>  intrins2.cpp
1>c:\...\visual studio 2013\projects\intrins2\intrins2\intrins2.cpp(20): warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
1>c:\...\visual studio 2013\projects\intrins2\intrins2\intrins2.cpp(21): warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
1>  intrins2.vcxproj -> c:\...\visual studio 2013\Projects\intrins2\Debug\intrins2.exe
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

Solution

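The problem is that your "source" (the array x) is not aligned to the 16-byte boundary that _mm_load_ps requires. A plain float array is only guaranteed 4-byte alignment, so whether it happens to start on a 16-byte boundary is up to the compiler.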

You can fix this either by using the "unaligned" load intrinsic, _mm_loadu_ps, or by aligning the arrays with __declspec(align(n)), e.g.:

    float __declspec(align(16)) x[N];
    float __declspec(align(16)) y[N];

Now your x and y arrays are aligned to 16 bytes and can be accessed from SSE instructions (at indices that are multiples of 4, of course). Note that unaligned access is not allowed for most SSE instructions that take a 16-byte memory operand, so, for example, the instruction behind _mm_max_ps requires that its second argument (in Intel order, first in AT&T order) is an aligned memory location whenever that argument comes from memory.
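
To make the two fixes concrete, here is a minimal sketch rather than a drop-in replacement for the program in the question. The names ALIGN16, is_aligned16, sum4_aligned and sum4_unaligned are made up for illustration. The macro falls back to the standard C++11 alignas on non-MSVC compilers, which is an assumption added here for portability; on Visual Studio it expands to the same __declspec(align(16)) shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // SSE: _mm_load_ps, _mm_loadu_ps, _mm_store_ps, _mm_storeu_ps

// ALIGN16 is a hypothetical helper macro for this sketch, not a library feature.
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 alignas(16)
#endif

// Returns true if p sits on a 16-byte boundary.
static bool is_aligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Fix 1: the unaligned load works for any float pointer.
static float sum4_unaligned(const float* p)
{
    __m128 v = _mm_loadu_ps(p);
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}

// Fix 2: the aligned load is only valid when p is 16-byte aligned,
// so the precondition is checked in debug builds.
static float sum4_aligned(const float* p)
{
    assert(is_aligned16(p));
    __m128 v = _mm_load_ps(p);
    ALIGN16 float out[4];
    _mm_store_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

With an array declared as ALIGN16 float a[8], sum4_aligned(a) and sum4_aligned(a + 4) are both safe, because every fourth float starts a new 16-byte block. An offset such as a + 1 is not on a 16-byte boundary and has to go through sum4_unaligned instead.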

Licensed under: CC-BY-SA with attribution