SSE2 for double calculations with GCC
28-02-2021
Question
How can I use SSE2 in GCC? I want to work with double values.
I am looking for something like this: http://vrm-vrm.blogspot.com/2009/10/gcc-intrinsics.html but for double values.
Solution
If you want to use the SSE2 double insns, you have to compile with gcc -mfpmath=sse -msse2.
The option -msse2 alone will allow you to use SSE2 intrinsics, while -mfpmath=sse will cause GCC to emit SSE2 insns for all FP operations.
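As a rough illustration of what the intrinsics look like for doubles, here is a minimal sketch using `<emmintrin.h>`. The helper name `mul_add_pd` and its signature are made up for this example and are not from the linked post. It assumes nothing about alignment, so it uses the unaligned load and store variants.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d and the _mm_*_pd family */
#include <stddef.h>

/* Hypothetical helper: out[i] = a[i] * b[i] + c[i] for n doubles. */
void mul_add_pd(const double *a, const double *b, const double *c,
                double *out, size_t n)
{
    size_t i = 0;

    /* Two doubles fit in one 128-bit register, so step by 2. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);  /* loads a[i], a[i+1] (unaligned) */
        __m128d vb = _mm_loadu_pd(b + i);
        __m128d vc = _mm_loadu_pd(c + i);
        __m128d r  = _mm_add_pd(_mm_mul_pd(va, vb), vc);
        _mm_storeu_pd(out + i, r);         /* stores out[i], out[i+1] */
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}
```

For 32-bit targets this would be built with something like gcc -O2 -msse2 -mfpmath=sse file.c. On x86-64, SSE2 is part of the baseline instruction set, so the header works without extra flags.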
Also note that auto-vectorization is enabled at -O3.
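To make the auto-vectorization point concrete, here is a sketch of the kind of plain C loop GCC can usually turn into packed SSE2 insns on its own. The function name `scale_add` is hypothetical. Whether a given loop is actually vectorized depends on the GCC version and flags, so treat this as something to check rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += alpha * x[i].
   The restrict qualifiers promise that x and y do not overlap,
   which removes an aliasing obstacle for the vectorizer. */
void scale_add(double *restrict y, const double *restrict x,
               double alpha, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

Compiling with gcc -O3 -msse2 -mfpmath=sse -fopt-info-vec -c file.c makes GCC report which loops it vectorized, which is a quick way to confirm the result.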
The advantages of vectorized SSE2-4 insns are obvious, and the wider AVX extension goes further: Sandy Bridge processors can execute up to three 256-bit operations per cycle (for example, 4 double multiplies, 4 double additions, and a shuffle on top of that).
In addition, Intel's optimization manual recommends using SSE even for scalar operations, for reasons including the flat register model and shorter latencies compared to legacy x87 insns.
EDIT:
Forgot to mention: for 32-bit code, you may also add -msseregparm, which will cause FP arguments and return values to be passed via SSE registers. By default, arguments are passed in memory (on the stack) and return values in %st0. Naturally, this changes the ABI, so all interacting modules have to be compiled with this option.