Question

I use cuFFT library calls in host code and they work fine, but I want to call the cuFFT library from a kernel. Earlier versions of CUDA didn't have this kind of support, but is it possible with dynamic parallelism?

It would be great if there are any examples of how to achieve this.


Solution

Despite the introduction of dynamic parallelism on Kepler (cc 3.5) cards, cuFFT remains a host API and there is currently no way of creating or executing FFT operations in device code using cuFFT.

OTHER TIPS

There is no way to call the cuFFT API from a GPU kernel; you must call it from the host. If you want to run an FFT without going DEVICE -> HOST -> DEVICE before continuing your computation, the only solution is to write a kernel that performs the FFT in a device function. I'm actually doing this myself because I need to run several FFTs in parallel without passing the data back to the host. If you find or have another solution, let me know. There are a lot of examples on the web of how to achieve this: https://hackage.haskell.org/package/pure-fft-0.2.0/docs/Numeric-FFT.html
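To make the "FFT in a device function" idea concrete, here is a minimal sketch of an in-place radix-2 (Cooley-Tukey) FFT. It is not taken from cuFFT or from any answer here. The names `Cplx`, `fft_radix2`, `fft_batch_kernel` and the `HOST_DEVICE` macro are made up for illustration. It assumes the length is a power of two and that the signals are stored contiguously one after another. The `__CUDACC__` guard lets the same function be checked on the host with a plain C++ compiler.

```cpp
#include <math.h>

// With nvcc the function is callable from kernels; with a plain C++ compiler
// the qualifiers vanish so the same code can be checked on the host.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical minimal complex type (not cuComplex).
struct Cplx { float re, im; };

// In-place forward FFT of n points. Assumes n is a power of two.
HOST_DEVICE inline void fft_radix2(Cplx* x, int n) {
    // Step 1: bit-reversal permutation of the input.
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) { Cplx t = x[i]; x[i] = x[j]; x[j] = t; }
    }
    // Step 2: log2(n) butterfly stages.
    const float pi = 3.14159265358979f;
    for (int len = 2; len <= n; len <<= 1) {
        const int half = len / 2;
        for (int start = 0; start < n; start += len) {
            for (int k = 0; k < half; ++k) {
                const float ang = -2.0f * pi * k / len;  // twiddle angle
                const float wr = cosf(ang), wi = sinf(ang);
                const Cplx a = x[start + k];
                const Cplx b = x[start + k + half];
                // t = b * w (complex multiply)
                const float tr = b.re * wr - b.im * wi;
                const float ti = b.re * wi + b.im * wr;
                x[start + k].re = a.re + tr;
                x[start + k].im = a.im + ti;
                x[start + k + half].re = a.re - tr;
                x[start + k + half].im = a.im - ti;
            }
        }
    }
}

#ifdef __CUDACC__
// One thread transforms one signal: `data` holds `batch` signals of n points
// each, and the results never leave device memory.
__global__ void fft_batch_kernel(Cplx* data, int n, int batch) {
    const int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < batch) fft_radix2(data + id * n, n);
}
#endif
```

A launch such as `fft_batch_kernel<<<(batch + 127) / 128, 128>>>(d_data, n, batch);` would run all the transforms in one go, with `d_data` assumed to be a device pointer you already allocated. One thread per FFT is the simplest layout, not the fastest: memory accesses are not coalesced and each twiddle factor is recomputed, so treat this as a starting point rather than a tuned implementation.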

I already answered this in the duplicate thread, "Is there a method of FFT that will run inside CUDA Kernel?". In short, since CUDA 11.0 there is cuFFTDx (cuFFT Device Extensions), which allows you to do exactly that.

Link to my answer there: https://stackoverflow.com/a/72403181/6924585.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow