"The launch timed out and was terminated" error with Bumblebee on Linux

https://stackoverflow.com/questions/17335808

01-06-2022

Question

When running a long kernel (especially in debug mode with some memory checking) on a CUDA-enabled GeForce GPU with Bumblebee, I get the following error:

CUDA error 6: the launch timed out and was terminated

This seems to be caused by the NVIDIA driver's watchdog. A solution is available here. However, why does this happen when I use Bumblebee and optirun only to run a simple CUDA kernel (i.e. the NVIDIA GPU is not driving my display)?

The command I used to launch the program is:

optirun [cuda-memcheck] ./my_program program_options

Solution

The solution (found here) was to use the --no-xorg option for optirun, i.e.:

optirun --no-xorg [cuda-memcheck or cuda-gdb] ./my_program program_options

Indeed, by default optirun starts a secondary X server on the NVIDIA GPU, so kernels running on that GPU become subject to the driver's watchdog. The --no-xorg option skips this extra X server, and with it the watchdog timeout. The option has been available since Bumblebee 3.2.

The --no-xorg option also makes it possible to use cuda-gdb, avoiding the following error:

fatal: All CUDA devices are used for display and cannot be used while debugging. (error code = 24)
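
To check whether the watchdog applies in a given setup, one option is to query the CUDA runtime for the kernel execution timeout attribute of each device. The sketch below is not part of the original answer: the file name watchdog_check.cu and the helper watchdog_label are hypothetical, and it assumes a working nvcc toolchain.

```cuda
// watchdog_check.cu (hypothetical file name)
// Reports, for each CUDA device, whether the driver enforces a run-time
// limit on kernels (the watchdog discussed above).
// Build, assuming nvcc is installed: nvcc watchdog_check.cu -o watchdog_check
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: map the attribute value to a readable label.
const char* watchdog_label(int timeout_enabled) {
    return timeout_enabled ? "enabled" : "disabled";
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        int timeout_enabled = 0;
        // cudaDevAttrKernelExecTimeout is nonzero when kernels on this
        // device are subject to a run-time limit.
        err = cudaDeviceGetAttribute(&timeout_enabled,
                                     cudaDevAttrKernelExecTimeout, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }
        std::printf("device %d: kernel execution timeout %s\n",
                    dev, watchdog_label(timeout_enabled));
    }
    return 0;
}
```

Running the binary once with plain optirun and once with optirun --no-xorg should show whether the secondary X server changes the reported value on a particular machine. Based on the explanation above, the timeout would be expected to be enabled only in the first case, but this has not been verified here.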

Licensed under: CC-BY-SA with attribution
