The driver crash followed by API errors occurs when your app hits a windows TDR event due to kernel execution taking too long. You can work around this by modifying the system registry, or putting a Quadro or Tesla GPU in TCC mode, or reducing the run-time of your kernel(s).
When you debug with nsight, your kernel execution may get halted for various reasons (single step, breakpoints, and other reasons), and then restarted, depending on what you are doing exactly in your debug session. The halting of the kernel execution allows the windows watchdog to be satisfied without a TDR event.