SSE2 for double calculations with GCC

https://stackoverflow.com/questions/8125934

28-02-2021

Question

How can I use SSE2 in GCC? I want to work with double values.

I am looking for something like this: http://vrm-vrm.blogspot.com/2009/10/gcc-intrinsics.html but for double values.

Solution

If you want GCC to use SSE2 instructions for double arithmetic, you have to compile with gcc -mfpmath=sse -msse2.

The option -msse2 alone allows you to use SSE2 intrinsics; -mfpmath=sse additionally causes GCC to emit SSE instructions, rather than x87 ones, for floating-point operations.
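Since the question asks about intrinsics for doubles, here is a minimal sketch of what that can look like. The function name mul_add_pd and the file name in the comment are made up for illustration; the intrinsics themselves (_mm_loadu_pd, _mm_mul_pd, _mm_add_pd, _mm_storeu_pd) come from emmintrin.h. It assumes nothing about alignment, so it uses the unaligned load/store variants.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d and the _mm_*_pd family */
#include <stddef.h>

/* Hypothetical helper: out[i] = a[i] * b[i] + c[i] for n doubles.
   Build with, e.g.: gcc -O2 -msse2 -c mul_add_pd.c */
void mul_add_pd(const double *a, const double *b, const double *c,
                double *out, size_t n)
{
    size_t i = 0;

    /* Each __m128d register holds two doubles, so step by 2. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   /* unaligned load of 2 doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        __m128d vc = _mm_loadu_pd(c + i);
        __m128d r  = _mm_add_pd(_mm_mul_pd(va, vb), vc);
        _mm_storeu_pd(out + i, r);          /* unaligned store */
    }

    /* Scalar tail for the last element when n is odd. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}
```

If your arrays are known to be 16-byte aligned, _mm_load_pd and _mm_store_pd can be used instead of the unaligned variants.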

Also note that auto-vectorization (-ftree-vectorize) is enabled at -O3.
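To illustrate the auto-vectorization route, here is a sketch of a loop that GCC can usually vectorize on its own, without any intrinsics. The function name scale_add and the file name are hypothetical; restrict is there to tell the compiler that the arrays do not overlap, which makes vectorization easier.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += k * x[i].
   Build with, e.g.: gcc -O3 -msse2 -mfpmath=sse -c scale_add.c
   Adding -fopt-info-vec asks GCC to report which loops it vectorized. */
void scale_add(double *restrict y, const double *restrict x,
               double k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = y[i] + k * x[i];
}
```

Whether the loop actually gets vectorized depends on the GCC version and flags, so it is worth checking the -fopt-info-vec report or the generated assembly (gcc -S).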

The advantages of vectorized SSE2-4 instructions are obvious: each 128-bit operation works on two doubles at once. With the 256-bit AVX extension, Sandy Bridge processors can execute up to three 256-bit operations per cycle (for example 4 double multiplies, 4 double additions and a shuffle on top of that).

However, the Intel optimization manual recommends using SSE even for scalar operations, for reasons including the flat register model and shorter latencies compared to legacy x87 instructions.
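If you want to check which unit your scalar double code is being compiled for, GCC predefines the macro __SSE2_MATH__ when SSE2 is selected for floating-point math. A small sketch (the function name doubles_use_sse2 is made up for illustration):

```c
/* Hypothetical helper: returns 1 if this translation unit was compiled
   with SSE2 selected for floating-point math (GCC then predefines
   __SSE2_MATH__), and 0 otherwise, e.g. when x87 is used. */
int doubles_use_sse2(void)
{
#ifdef __SSE2_MATH__
    return 1;
#else
    return 0;
#endif
}
```

You can also see the macro directly with gcc -mfpmath=sse -msse2 -dM -E - < /dev/null | grep SSE2_MATH.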

EDIT:

Forgot to mention: for 32-bit code, you may also add -msseregparm, which causes FP arguments and return values to be passed in SSE registers. By default they are passed on the stack and in %st(0), respectively. Naturally, this changes the ABI, so all interacting modules have to be compiled with this option.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow