Question

I am trying to profile the CUDA Rodinia benchmarks running on a GTX 650. I am using the event_sampling sample in /usr/local/cuda-5.0/extras/CUPTI/samples/event_sampling to read the instructions-executed counter. Strangely, the values reported by event_sampling do not change whether or not the CUDA benchmarks are running.

The event_sampling code also runs some computation of its own, for which it measures the instructions executed. Unlike with CPU performance counters, do I need to modify the application's source code to be able to read GPU counters such as instruction_executed?


Solution

CUPTI only reports counter updates for kernels launched in the same process as the CUPTI code, so event_sampling never sees kernels run by a separate benchmark process. However, you can get some of these values without modifying the code, though not at the same level of precision, by using the NVIDIA Visual Profiler or the related profiler environment variables.
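To illustrate the same-process requirement, here is a minimal sketch (not the event_sampling sample itself) in which the CUPTI event group and the kernel launch live in one process and one context. The `scale` kernel and the `countInstExecuted` helper are hypothetical names made up for this example. It assumes the device exposes an event named `inst_executed` through the CUPTI Event API, as Kepler-class parts such as the GTX 650 should; much newer GPUs may not support this API at all. As far as I understand, a default event group reads only one domain instance, so the number is best treated as a relative indicator rather than a device-wide total.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda.h>
#include <cuda_runtime.h>
#include <cupti.h>

// Bail out of the enclosing bool function if a CUPTI call fails.
#define CUPTI_OK(call)                                              \
    do {                                                            \
        CUptiResult r_ = (call);                                    \
        if (r_ != CUPTI_SUCCESS) {                                  \
            const char *s_ = NULL;                                  \
            cuptiGetResultString(r_, &s_);                          \
            fprintf(stderr, "%s failed: %s\n", #call,               \
                    s_ ? s_ : "unknown error");                     \
            return false;                                           \
        }                                                           \
    } while (0)

// Hypothetical kernel; it exists only to give the counter work to count.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Launches the kernel and reads "inst_executed" in the SAME process.
// Sketch only: resources are not released on the early-return error paths.
bool countInstExecuted(uint64_t *value) {
    const int n = 1 << 20;
    float *d = NULL;
    // The first runtime call also creates the context CUPTI attaches to.
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMemset(d, 0, n * sizeof(float)) != cudaSuccess) return false;

    CUcontext ctx;
    CUdevice dev;
    if (cuCtxGetCurrent(&ctx) != CUDA_SUCCESS) return false;
    if (cuCtxGetDevice(&dev) != CUDA_SUCCESS) return false;

    CUpti_EventID id;
    CUpti_EventGroup group;
    CUPTI_OK(cuptiEventGetIdFromName(dev, "inst_executed", &id));
    CUPTI_OK(cuptiEventGroupCreate(ctx, &group, 0));
    CUPTI_OK(cuptiEventGroupAddEvent(group, id));
    // Kernel mode: counters are collected while kernels of this context run.
    CUPTI_OK(cuptiSetEventCollectionMode(ctx, CUPTI_EVENT_COLLECTION_MODE_KERNEL));
    CUPTI_OK(cuptiEventGroupEnable(group));

    scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);
    if (cudaDeviceSynchronize() != cudaSuccess) return false;

    size_t bytes = sizeof(*value);
    CUPTI_OK(cuptiEventGroupReadEvent(group, CUPTI_EVENT_READ_FLAG_NONE,
                                      id, &bytes, value));
    CUPTI_OK(cuptiEventGroupDisable(group));
    CUPTI_OK(cuptiEventGroupDestroy(group));
    cudaFree(d);
    return true;
}

int main() {
    uint64_t count = 0;
    if (!countInstExecuted(&count)) return 1;
    printf("inst_executed (one domain instance): %llu\n",
           (unsigned long long)count);
    return 0;
}
```

Building this would typically require adding the CUPTI include and library directories under the toolkit's extras/CUPTI folder and linking with -lcupti -lcuda; the exact paths depend on the installation. If the same event group were set up in a separate process, as when event_sampling runs alongside a Rodinia binary, the kernel launch would not be visible to it, which matches the unchanged values described in the question.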

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow