Size of statically allocated shared memory per block with Compute Prof (CUDA/OpenCL)

https://stackoverflow.com/questions/3927800

profiling
cuda
nvidia
opencl

30-09-2019

Question

In NVIDIA's Compute Prof there is a column called "static private mem per work group", and its tooltip says "Size of statically allocated shared memory per block". My application shows 64 (bytes, I assume) per block. Does that mean I am using somewhere between 1 and 64 of those bytes, or is the profiler only telling me that this much shared memory was allocated, regardless of whether any of it was actually used?

Solution

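If it's allocated, it's probably because you used it. AFAIK CUDA passes kernel parameters via shared memory (that is the case on devices of compute capability 1.x; newer devices pass them via constant memory), so that must be where it comes from. On such devices the figure would include the kernel arguments even if your code declares no __shared__ variables itself.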
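
If you want to cross-check the profiler's number outside the profiler, one option is to ask the runtime directly. The following is only a minimal sketch, not taken from the original application: the kernels `reverse16` and `scale` and the helper `staticSharedBytes` are hypothetical names made up for illustration. It relies on `cudaFuncGetAttributes`, whose `sharedSizeBytes` field reports the statically allocated shared memory per block for a kernel.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: reverses 16 floats through a static shared buffer
// of 16 * sizeof(float) = 64 bytes.
__global__ void reverse16(float *data)
{
    __shared__ float tile[16];
    int i = threadIdx.x;
    tile[i] = data[i];
    __syncthreads();
    data[i] = tile[15 - i];
}

// Hypothetical kernel that declares no __shared__ variables at all.
__global__ void scale(float *data, float factor)
{
    data[threadIdx.x] *= factor;
}

// Hypothetical helper: returns the static shared memory per block that the
// runtime reports for the given kernel.
template <typename Kernel>
size_t staticSharedBytes(Kernel kernel)
{
    cudaFuncAttributes attr;
    cudaError_t err =
        cudaFuncGetAttributes(&attr, reinterpret_cast<const void *>(kernel));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
    return attr.sharedSizeBytes;
}

int main()
{
    // Only attributes are queried; neither kernel is launched.
    printf("reverse16: %zu bytes of static shared memory per block\n",
           staticSharedBytes(reverse16));
    printf("scale:     %zu bytes of static shared memory per block\n",
           staticSharedBytes(scale));
    return 0;
}
```

Compile with nvcc and compare the two values. If a kernel without any __shared__ declaration still reports a nonzero size, that is consistent with the arguments (and other per-launch data) living in shared memory on that device; if it reports zero, the number you see for your own kernel most likely comes from your declarations alone.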

Licensed under: CC-BY-SA with attribution
