Why does vsperfmon tell me the called function's inclusive time is taking longer than the root function's?

StackOverflow https://stackoverflow.com/questions/9322211

Question

So I've been experimenting with vsperfmon through the vsperfreport/vsperfcmd command line tools with VS 2010. I've constructed a really simple program to profile and try to understand the numbers these tools output:

#include <cmath>   // sin
#include <cstdio>  // printf
#include <tchar.h> // _tmain, _TCHAR

void DoStuff()
{
    double res = 0.0;
    for (double i = 0.0; i < 10000.0; ++i)
    {
        res += sin(i);
    }
    printf("res is %lf\n", res);
}


int _tmain(int argc, _TCHAR* argv[])
{
    DoStuff();  
    return 0;
}

I profile the executable at the command line, following the steps detailed here. The above gets compiled to perfPlay.exe, and then I run the following:

vsinstr perfPlay.exe
vsperfcmd /start:trace /output:perfPlay.vsp
perfPlay.exe
vsperfcmd /shutdown
vsperfreport perfPlay.vsp /output:singleFile /summary:All

Here's the odd thing I can't figure out. The elapsed inclusive time for DoStuff is less than the inclusive time for sin() in both the function and the caller/callee reports. Here's the caller/callee report for DoStuff(); notice the Elapsed Inclusive Time for THUNK:sin versus the Root function:

Type        Function Name       Elapsed Inclusive Time  Elapsed Exclusive Time
Root        DoStuffInLib(void)  2157487                 0
Caller      _wmain              2157487                 0
Callee      __RTC_CheckEsp      57                      57
Callee      _printf             347667                  347667
Callee      THUNK:sin           2282622                 81435

Elapsed inclusive time is defined as the amount of time taken to execute code in your function, including the functions you call. By that definition, DoStuff's inclusive time should always be greater than sin's inclusive time. The difference above is relatively small, but if I let the program run for a while, it grows larger. The difference exists in both Debug and Release builds.

Why is sin's inclusive time higher? I would expect it to represent a portion of the Root entry's time. I'm not quite sure what's going on, or whether I can even trust this tool if it's doing seemingly weird things. I suspect, though, that I'm just missing something that would make this clearer.


Solution

The performance monitor uses probes to collect data. It tries to subtract the time taken by its own probes from the collected data, but this correction is approximate and it will often be consistently off in one direction or the other. Generally, the smaller the function you are trying to probe, the more inaccurate the measurements because the time taken collecting the data is a higher percentage of the time elapsed.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow