Question

I know from my own projects that CUDA has, for example, file scope linkage when using __constant__ memory. That means I have to use the constant memory inside the same file where I did the cudaMemcpyToSymbol. For what other storage/memory types does CUDA use file scope linkage? I think it also uses it for textures. But what about __shared__ memory?
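For example, something like this works for me, but only because the definition, the copy, and the kernel all live in one file (a minimal sketch, names made up):

```cuda
// constants.cu -- definition, cudaMemcpyToSymbol, and kernel in ONE file
#include <cstdio>

__constant__ float coeffs[4];          // file-scope __constant__ array

__global__ void scale(float *out)      // kernel reading constant memory
{
    out[threadIdx.x] = coeffs[threadIdx.x] * 2.0f;
}

int main()
{
    float h_coeffs[4] = {1.f, 2.f, 3.f, 4.f};
    // Works because 'coeffs' is defined in this translation unit
    cudaMemcpyToSymbol(coeffs, h_coeffs, sizeof(h_coeffs));

    float *d_out;
    cudaMalloc(&d_out, 4 * sizeof(float));
    scale<<<1, 4>>>(d_out);
    cudaDeviceSynchronize();
    cudaFree(d_out);
    return 0;
}
```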


Solution

The first point is that in the classic CUDA compilation model, the scope requirement for all device symbols is translation unit scope, and "all" means at least:

  1. __global__ functions
  2. __device__ functions
  3. texture references (texture<> declarations)
  4. __constant__ memory
  5. __shared__ memory declared extern
  6. __device__ symbols

All of these have to be defined in the same translation unit where they are used in order for the runtime to work correctly. The underlying reason is that all of the objects on the list above must be emitted into the same CUDA module (a driver API concept) for everything to work.
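To illustrate the failure mode, the following split does not work under the classic (whole-program) compilation model; the file names are hypothetical, and the copy typically fails with cudaErrorInvalidSymbol:

```cuda
// symbols.cu -- the __constant__ array is defined (and emitted) here
__constant__ float table[16];

__global__ void use_table(float *out)
{
    out[threadIdx.x] = table[threadIdx.x];
}

// main.cu -- a *different* translation unit, classic compilation mode
extern __constant__ float table[16];   // declaration only

void upload(const float *h_table)
{
    // Fails: the runtime looks up 'table' in the module generated from
    // main.cu, but the symbol was emitted into the module generated
    // from symbols.cu.
    cudaMemcpyToSymbol(table, h_table, 16 * sizeof(float));
}
```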

CUDA 5 introduces a separate compilation mode, which is effectively a static linkage model. This means symbols don't need to be defined in the same translation unit; they can be forward declared, and the linker will handle symbol matching during a device code linkage pass.
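With separate compilation, the same split becomes legal; a sketch, assuming hypothetical file names and CUDA 5 or later:

```cuda
// a.cu -- defines the symbol
__constant__ float table[16];

// b.cu -- forward declaration only; the device linker resolves it
extern __constant__ float table[16];

__global__ void use_table(float *out)
{
    out[threadIdx.x] = table[threadIdx.x];
}
```

Built with relocatable device code enabled, e.g. `nvcc -rdc=true a.cu b.cu -o app`; nvcc then performs the device link step, and host code in b.cu can call cudaMemcpyToSymbol on the forward-declared symbol.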

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow