Possible OpenMP + SSE bug when using _mm_shuffle_ps in Xcode 4 (LLVM GCC)
Question
I have switched my compiler from GCC to LLVM GCC 4.2 in Xcode 4.2 and have run into a strange linker error for the _mm_shuffle_ps intrinsic under OpenMP. The intrinsic works elsewhere, but as soon as I put it inside an omp block it generates the following linker error:
"___builtin_ia32_shufps", referenced from:
__ZN7Annulus12traceFactorsEP9PrimitiveP8VFMatrix.omp_fn.0 in Annulus.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status
The basic structure of my code is as follows (the loop and its bound n are placeholders):
#pragma omp parallel
{
    // Some stuff
    #pragma omp for
    for (int i = 0; i < n; ++i) {
        // Do more stuff, including _mm_shuffle_ps
    }
}
The code works fine in GCC 4.2, so is this a bug in the LLVM GCC implementation of OpenMP, or do I need an exotic compiler flag?
Solution
Totally a bug. Please file it. Thanks.
OTHER TIPS
Just FYI:
I have the same problem here, but with the _mm_shuffle_pd intrinsic. Other intrinsics work just fine. I just filed that bug with Apple.
There may be a workaround that I have not tried yet: put all the SSE code into a separate function and call it from the OpenMP loop.
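A minimal sketch of what that untested workaround might look like, assuming the problem is limited to the shuffle builtin being referenced from inside the function the compiler outlines for the omp block. The names reverse4 and reverseBlocks are hypothetical, and the noinline attribute is an extra assumption: it is meant to stop the compiler from folding the helper back into the OpenMP body, which could reintroduce the same error.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_shuffle_ps, _MM_SHUFFLE

// Assumption: noinline keeps the helper out of the outlined OpenMP function.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// Hypothetical helper holding all the SSE code.
// Reverses the order of four consecutive floats.
NOINLINE static void reverse4(const float* in, float* out) {
    __m128 v = _mm_loadu_ps(in);                               // unaligned load
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));  // lanes 3,2,1,0
    _mm_storeu_ps(out, r);                                     // unaligned store
}

// Hypothetical caller: the OpenMP region only calls the helper,
// so no SSE intrinsic appears directly inside the omp block.
void reverseBlocks(const float* in, float* out, int blocks) {
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < blocks; ++i) {
            reverse4(in + 4 * i, out + 4 * i);
        }
    }
}
```

Whether this actually avoids the linker error under LLVM GCC 4.2 has not been confirmed here. If it does not, moving the helper into a separate translation unit compiled without OpenMP would be the next thing to try.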