*This is a more specific, better-formed version of a question I already asked; I deleted the other one.*
So I'm trying to collect kernel timing data from a CUDA library...
The library has benchmarks for different data types for each of its algorithms, and they work like this:
There is a 2D array holding pairs of (array size, number of test iterations). Example:
const int Tests[][2] = {
{ 10000, 10000 },
{ 50000, 5000 },
{ 100000, 5000 },
{ 200000, 2000 }
// ...
};
Then in main there will be a loop
// get context ptr
for(int test = 0; test < numTests; ++test)
BenchmarkMyAlg(Tests[test][0], Tests[test][1], *context);
BenchmarkMyAlg sets up the data, then launches the kernel in a loop (Tests[test][1] times).
What I want to do is get the "CUDA Launch Summary" data, specifically the "average duration for executing the device function in microseconds," for each test parameter pair, i.e. for each iteration of that loop in main.
As it is now, I can only get the average timing for the entire main loop. To put it another way, I get only one row of NSight data after the application executes, when I want numTests rows of data.
If a second, different algorithm is tested in main, NSight will produce another row of data, e.g.:
for(int test = 0; test < numTests; ++test)
BenchmarkMyAlg(Tests[test][0], Tests[test][1], *context);
for(int test = 0; test < numTests; ++test)
BenchmarkMyOtherAlg(Tests[test][0], Tests[test][1], *context);
But again, each row of data refers to a whole loop, giving me 2 rows of data when I want 2 * numTests rows of data.
I've tried digging through settings in NSight and I've also tinkered with nvprof some, but I haven't made any progress.
I'm thinking there may be a way to re-code the file so that NSight recognizes each test iteration as a new/different kernel, the way it does when I actually switch to a different kernel (as in my second example). Perhaps by creating numTests separate references to the BenchmarkMyAlg function and running through those? I'll go try that for now and comment back if I get anywhere.