Question
As we know, Fermi supports only a single connection to the GPU, and as stated here: http://on-demand.gputechconf.com/gtc-express/2011/presentations/StreamsAndConcurrencyWebinar.pdf
Fermi architecture can simultaneously support
Up to 16 CUDA kernels on GPU
We also know that Hyper-Q allows up to 32 simultaneous connections from multiple CUDA streams, MPI processes, or threads within a process: http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf
But how many kernels can execute simultaneously on Kepler CC 3.0 and CC 3.5: 16 or 32 (one per stream)?
Answer
From the programming guide:
The maximum number of kernel launches that a device can execute concurrently is 32 on devices of compute capability 3.5 and 16 on devices of lower compute capability.
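To check which case applies to a particular card, the compute capability can be read at runtime. The following is only a minimal sketch: `maxConcurrentKernels` is a hypothetical helper (not part of the CUDA API) that encodes nothing more than the sentence quoted above, so it returns -1 for compute capabilities newer than 3.5, which that sentence does not cover. Device index 0 is also an assumption.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: encodes only the programming-guide sentence quoted above.
// 32 for compute capability 3.5, 16 for anything lower, and -1 for anything
// newer, because the quoted sentence says nothing about later devices.
static int maxConcurrentKernels(int major, int minor) {
    if (major > 3 || (major == 3 && minor > 5)) return -1;
    if (major == 3 && minor == 5) return 32;
    return 16;
}

int main() {
    const int device = 0;  // assumption: inspect the first device
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    std::printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);

    // concurrentKernels is 0 on devices that cannot overlap kernels at all.
    if (!prop.concurrentKernels) {
        std::printf("This device does not support concurrent kernel execution.\n");
        return 0;
    }

    int limit = maxConcurrentKernels(prop.major, prop.minor);
    if (limit < 0) {
        std::printf("Newer than CC 3.5: check the current programming guide.\n");
    } else {
        std::printf("Up to %d concurrent kernels per the quoted guide text.\n", limit);
    }
    return 0;
}
```

So a CC 3.0 Kepler part (GK104, for example) would report 16, and a CC 3.5 part (GK110) would report 32. These are upper bounds; whether kernels actually overlap also depends on the resources each kernel uses and on how the launches are distributed across streams.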