Question

I am trying to get memory traces from cuda-gdb, but I cannot step into the kernel code. I compile with the nvcc flags -g -G and -keep, to no effect. I can set a breakpoint on the kernel function, but when I try to step to the next instruction, it jumps to the end of the kernel function. I have tried this on the SDK examples and I observe the same behaviour. I am working with the CUDA 5 toolkit. Any suggestions? Thanks!


Solution

This behavior is typical of a kernel launch failure. Make sure you check the return codes of your CUDA calls. For debugging, you may also want to add a call to cudaDeviceSynchronize immediately after the kernel launch and check its return code. Because kernel launches are asynchronous, this is the most precise way to find out why a kernel failed.
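As a minimal sketch of that check (the `CUDA_CHECK` macro and the `fill` kernel are hypothetical names made up for illustration, not part of the CUDA API or of the asker's code), the launch can be followed by `cudaGetLastError` to catch launch-time errors and by `cudaDeviceSynchronize` to catch errors raised while the kernel runs:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print the failing call and abort if a CUDA runtime
// call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,       \
                         __LINE__, #call, cudaGetErrorString(err_));       \
            std::exit(EXIT_FAILURE);                                       \
        }                                                                  \
    } while (0)

// Hypothetical example kernel: each thread writes its global index.
__global__ void fill(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

int main()
{
    const int n = 256;
    int *d_out = NULL;
    CUDA_CHECK(cudaMalloc((void **)&d_out, n * sizeof(int)));

    fill<<<(n + 127) / 128, 128>>>(d_out, n);

    // Errors detected at launch time (bad configuration, no usable device).
    CUDA_CHECK(cudaGetLastError());
    // Errors that occur while the kernel executes; debugging aid only,
    // since it blocks the host until the device is idle.
    CUDA_CHECK(cudaDeviceSynchronize());

    int h_out[n];
    CUDA_CHECK(cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost));
    std::printf("last element: %d\n", h_out[n - 1]);

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

If the kernel never actually launches, one of the two checks after the launch should report the reason instead of the program continuing silently.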

Update: When the code runs correctly outside the debugger but not in cuda-gdb, the most common cause is debugging on a single-GPU system from within a graphical environment. cuda-gdb cannot share the GPU with the X Window System, as this would hang the OS.

If your system has only one GPU, you need to exit the graphical environment (e.g. quit the X server) and debug from the console.

If you have a multi-GPU system, check your X configuration (xorg.conf) and make sure it does not use the GPU you reserve for debugging.
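To see which GPUs are present before editing the X configuration, a small program along these lines can list them. This is only a sketch: it treats the `kernelExecTimeoutEnabled` property as a hint that a device is driving a display, which is an assumption and a heuristic rather than a guarantee.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                         dev, cudaGetErrorString(err));
            return 1;
        }
        // A run-time limit on kernels usually means the display watchdog is
        // active, which suggests (but does not prove) that the GPU is used
        // by the graphical environment.
        std::printf("device %d: %s, PCI %04x:%02x:%02x, kernel time limit: %s\n",
                    dev, prop.name, prop.pciDomainID, prop.pciBusID,
                    prop.pciDeviceID,
                    prop.kernelExecTimeoutEnabled ? "yes" : "no");
    }
    return 0;
}
```

A device that reports no kernel time limit is a reasonable candidate to reserve for cuda-gdb; the printed PCI address can help match it against the device entries in the X configuration.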

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow