I'm trying to call a CUDA kernel from another kernel, but I get the following error:
Traceback (most recent call last):
  File "C:\temp\GPU Program Shell.py", line 22, in <module>
    """)
  File "C:\Python33\lib\site-packages\pycuda\compiler.py", line 262, in __init__
    arch, code, cache_dir, include_dirs)
  File "C:\Python33\lib\site-packages\pycuda\compiler.py", line 252, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir)
  File "C:\Python33\lib\site-packages\pycuda\compiler.py", line 134, in compile_plain
    cmdline, stdout=stdout.decode("utf-8"), stderr=stderr.decode("utf-8"))
pycuda.driver.CompileError: nvcc compilation of c:\users\karste~1\appdata\local\temp\tmpgq8t45\kernel.cu failed
[command: nvcc --cubin -arch sm_35 -m64 -Ic:\python33\lib\site-packages\pycuda\cuda kernel.cu]
[stderr:
kernel.cu(14): error: kernel launch from __device__ or __global__ functions requires separate compilation mode
My understanding is that this has to do with Dynamic Parallelism, and that the other question about this error came from a user without appropriate hardware. I have a GTX Titan, however, so it should be compatible. What am I missing?
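To rule out the hardware side, here is a minimal sketch of the check I have in mind. Dynamic parallelism needs compute capability 3.5 or higher, which matches the `-arch sm_35` in the nvcc command above. The helper names `supports_dynamic_parallelism` and `check_device` are my own, not part of PyCUDA. `check_device` assumes PyCUDA is installed and a CUDA device is present.

```python
def supports_dynamic_parallelism(major, minor):
    """Return True if compute capability (major, minor) is at least 3.5."""
    # Tuple comparison is lexicographic, so (3, 5) <= (5, 0) and (3, 0) < (3, 5).
    return (major, minor) >= (3, 5)


def check_device(index=0):
    """Query the device through PyCUDA (hypothetical helper, needs a GPU)."""
    # Imported here so the pure-Python check above works without PyCUDA.
    import pycuda.driver as cuda

    cuda.init()
    major, minor = cuda.Device(index).compute_capability()
    return supports_dynamic_parallelism(major, minor)
```

If `check_device()` returns True, the card itself should not be the limiting factor, and the problem is more likely in how the module is compiled and loaded.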
EDIT
After adding "options=['--cubin','-rdc=true' ,'-lcudart', '-lcudadevrt,','-Ic:\python33\lib\site-packages\pycuda\cuda kernel.cu']" to SourceModule, I get the following error:
Traceback (most recent call last):
  File "C:\temp\GPU Program Shell.py", line 22, in <module>
    """, options=['--cubin','-rdc=true' ,'-lcudart', '-lcudadevrt,','-Ic:\python33\lib\site-packages\pycuda\cuda kernel.cu'])
  File "C:\Python33\lib\site-packages\pycuda\compiler.py", line 265, in __init__
    self.module = module_from_buffer(cubin)
pycuda._driver.LogicError: cuModuleLoadDataEx failed: not found -
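Looking at the options list again, two entries look malformed: `'-lcudadevrt,'` has a comma inside the string, and the last entry fuses the include path with the source file name. The sketch below flags such entries and builds a cleaned list. It is an assumption that these are the intended flags, and cleaning them up may not be enough on its own to make the module load. `--cubin` and the `-I` path are left out because the nvcc command in the first traceback shows PyCUDA already passing them.

```python
# Options exactly as passed in the EDIT (raw string keeps the backslashes literal).
raw_options = [
    '--cubin',
    '-rdc=true',
    '-lcudart',
    '-lcudadevrt,',
    r'-Ic:\python33\lib\site-packages\pycuda\cuda kernel.cu',
]

# Flag entries that end in a stray comma or contain a space (two arguments fused).
suspicious = [opt for opt in raw_options if opt.endswith(',') or ' ' in opt]

# Cleaned list: one flag per item, no trailing punctuation.
# Assumption: these are the flags that were meant; not verified to be sufficient.
options = [
    '-rdc=true',     # relocatable device code (separate compilation mode)
    '-lcudart',      # CUDA runtime library
    '-lcudadevrt',   # CUDA device runtime library
]
```

Here `suspicious` ends up containing exactly the two malformed entries, so at least those need fixing before the result of the compile can be trusted.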