Question

When compiled for x64, the following function uses the XMM0 register for parameter passing:

void foo (double const scalar)
{
    __m256d vector = _mm256_broadcast_sd(&scalar);
}

In assembly, the vbroadcastsd instruction can take a register operand. The equivalent intrinsic appears to accept only a pointer to a memory operand. Is there a way to guarantee that compilers will optimise loads like this to avoid a store to memory?


Solution

I don't think anyone can guarantee it, but with at least some optimisation enabled, I'd be very disappointed if any modern compiler failed to remove an unnecessary pointer indirection like this one. I have certainly seen compilers simplify more intricate cases.

I take it you haven't looked at the generated code to see what it does (otherwise your question would have been phrased differently).
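One way to check is to compile two variants and compare the generated assembly (for example with -S on g++ or /FA on Visual C++). The sketch below is only an illustration: it puts the pointer-taking intrinsic from the question next to _mm256_set1_pd, which takes the double by value. The names AVX_TARGET, broadcast_via_pointer, broadcast_via_set1, all_lanes_equal and check_broadcasts are made up for this example, and whether either variant avoids the store to memory still depends on the compiler and the optimisation level.

```cpp
#include <cassert>
#include <immintrin.h>

// Hypothetical macro: lets GCC/Clang compile the AVX code without -mavx.
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Variant from the question: the intrinsic takes the address of the scalar.
AVX_TARGET void broadcast_via_pointer(double const scalar, double* out)
{
    __m256d vector = _mm256_broadcast_sd(&scalar);
    _mm256_storeu_pd(out, vector);   // store so the result is observable
}

// Variant using the by-value intrinsic.
AVX_TARGET void broadcast_via_set1(double const scalar, double* out)
{
    __m256d vector = _mm256_set1_pd(scalar);
    _mm256_storeu_pd(out, vector);
}

// Returns true if all four lanes hold the expected value.
bool all_lanes_equal(double const* lanes, double const expected)
{
    for (int i = 0; i < 4; ++i)
        if (lanes[i] != expected)
            return false;
    return true;
}

// Both variants must produce the same four lanes; only the code may differ.
void check_broadcasts(double const scalar)
{
    double a[4];
    double b[4];
    broadcast_via_pointer(scalar, a);
    broadcast_via_set1(scalar, b);
    assert(all_lanes_equal(a, scalar));
    assert(all_lanes_equal(b, scalar));
}
```

Both functions return the same values, so the comparison that matters is in the assembly listing: look for whether the scalar is spilled to the stack before the broadcast.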

Other tips

If you're worried about parameter passing on the stack, then your function is likely too short or too important to end up being called as a separate function. Use

__forceinline

with Visual C++ or

__attribute__((always_inline)) 

with g++.
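As a rough sketch of how the two keywords might be combined, the snippet below wraps them in a macro and applies it to a small broadcast helper. FORCE_INLINE, AVX_TARGET, broadcast_store and fill are hypothetical names chosen for this example, and the target attribute is only there so that GCC/Clang accept the AVX intrinsic without -mavx.

```cpp
#include <cassert>
#include <immintrin.h>

// Hypothetical macros selecting the compiler-specific spelling.
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#define AVX_TARGET
#else
#define FORCE_INLINE inline __attribute__((always_inline))
#define AVX_TARGET __attribute__((target("avx")))
#endif

// Small helper that is forced inline, so the scalar need not cross a
// real call boundary before it is broadcast.
FORCE_INLINE AVX_TARGET void broadcast_store(double const scalar, double* out)
{
    _mm256_storeu_pd(out, _mm256_broadcast_sd(&scalar));
}

// Caller: fills `blocks` groups of four doubles with `value`.
// It carries the same target attribute so the helper can be inlined into it.
AVX_TARGET void fill(double* out, int const blocks, double const value)
{
    assert(out != nullptr && blocks >= 0);
    for (int i = 0; i < blocks; ++i)
        broadcast_store(value, out + 4 * i);
}
```

Once the helper is inlined, the compiler sees the scalar and the broadcast in the same function, which gives it the best chance of keeping the value in a register.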

Licensed under: CC-BY-SA with attribution