Question

When compiled for x64, the following function uses the XMM0 register for parameter passing:

#include <immintrin.h>

void foo (double const scalar)
{
    __m256d vector = _mm256_broadcast_sd(&scalar);  // intrinsic takes a pointer
}

In assembly, the vbroadcastsd opcode can take a register operand. The equivalent intrinsic appears to only accept a pointer to a memory operand. Is there a way to guarantee that compilers will optimise loads like this to avoid a store to memory?


Solution

I don't think anyone can GUARANTEE it, but assuming you enable at least some optimisation, I'd be very disappointed if any modern compiler didn't remove unnecessary pointer indirections. I have certainly seen compilers simplify more intricate code than this.

I take it you haven't looked at the generated code to see what it does (otherwise your question would have been phrased differently).
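
One way to check is to compile both spellings and compare the output. The sketch below is an illustration rather than part of the original answer: the names broadcast_via_pointer and broadcast_via_value are made up, and the target attribute assumes GCC or Clang (it lets the file build without -mavx). _mm256_set1_pd takes the scalar by value, so the source never takes an address at all.

```cpp
#include <immintrin.h>

// Assumption: GCC or Clang. The target attribute enables AVX for these
// functions only; with MSVC you would drop it.
#define AVX_TARGET __attribute__((target("avx")))

// Original spelling: the intrinsic is handed the address of the parameter.
AVX_TARGET void broadcast_via_pointer(double const scalar, double out[4])
{
    __m256d vector = _mm256_broadcast_sd(&scalar);
    _mm256_storeu_pd(out, vector);  // store so the result is observable
}

// Alternative spelling: the scalar is passed by value, no pointer involved.
AVX_TARGET void broadcast_via_value(double const scalar, double out[4])
{
    __m256d vector = _mm256_set1_pd(scalar);
    _mm256_storeu_pd(out, vector);
}
```

Compiling with something like g++ -O2 -mavx2 -S and reading the assembly shows what your compiler actually does with each version. Keep in mind that the register-source form of vbroadcastsd is an AVX2 encoding, so with plain AVX you should expect a different instruction sequence.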

Other tips

If you're worried about parameter passing on the stack, then your function is likely too short or too important to end up being called as a separate function. Use

__forceinline

with Visual C++ or

__attribute__((always_inline))

with g++.
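
A minimal sketch of how the two spellings might be wrapped in one macro. FORCE_INLINE, AVX_TARGET, broadcast and scale4 are hypothetical names chosen for this example. On GCC and Clang, the caller of an always_inline function must enable at least the same target features as the callee, which is why both functions carry the target attribute here.

```cpp
#include <immintrin.h>

// Hypothetical helper macros, not from any library.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define AVX_TARGET
#else
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define AVX_TARGET   __attribute__((target("avx")))
#endif

// Small helper that should never be emitted as a separate call.
FORCE_INLINE AVX_TARGET __m256d broadcast(double const scalar)
{
    return _mm256_set1_pd(scalar);
}

// Multiplies four doubles in place by the same factor.
AVX_TARGET void scale4(double* data, double const factor)
{
    __m256d const f = broadcast(factor);     // inlined into scale4
    __m256d const v = _mm256_loadu_pd(data); // unaligned load of 4 doubles
    _mm256_storeu_pd(data, _mm256_mul_pd(v, f));
}
```

Once the helper is inlined, the scalar never crosses a function-call boundary, so how it would have been passed no longer matters.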

License: CC-BY-SA with attribution
Not affiliated with StackOverflow