How can I tell if the function's intrinsic version is used from the disassembly?
27-09-2019
Question
I'm trying to optimize my exercise application in VS2010. Basically, I have several sqrt, pow and memset calls in the core loop. More specifically, this is what I do:
// in a cpp file ...
#include <cmath>
#pragma intrinsic(sqrt, pow, memset)
void Simulator::calculate()
{
    for( int i=0; i<NUM; i++ )
    {
        ...
        float len = std::sqrt(lenSq);
        distrib[0] = std::pow(baseVal, expVal);
        ...
        clearQuad(i); // invokes memset
    }
}
After the build, the disassembly shows that, for example, the sqrt call still compiles as "call _CIsqrt(0x####)". Nothing changes regardless of whether the /Oi flag is enabled.
Can anybody kindly explain how I can enable the intrinsic version and how I can verify it from the disassembly? (I have also enabled /O2 in the project settings.)
Thank you
Edit: Problem solved by adding /fp:fast. For sqrt, as an example, the intrinsic version uses a single "fsqrt" instruction in place of the std version's "call __CIsqrt()". Sadly, in my case, the intrinsic version is 5% slower.
Many thanks to Zan Lynx and mch.
Solution
You are compiling to native machine code and not to the .NET CLR, right?
If you compile to .NET, the code won't be optimized until it is run through the JIT. At that point .NET applies its own intrinsics and other optimizations.
If you are compiling to native machine code, you might want to play with the /arch option and the /fp:fast option.
OTHER TIPS
The use of the C++ std namespace might be causing the compiler not to use the intrinsics. Try removing std:: from your sqrt, pow, and memset calls.
The MSDN Library documentation for #pragma intrinsic offers an example for testing whether the intrinsic truly is being used: compile with the -FAs flag and look at the resulting .asm file.
Looking at the disassembly in the debugger, as you seem to already be doing, should also show the intrinsic rather than a call.