I'm working on a project that needs to make use of FFTs on both Nvidia and AMD graphics cards. I initially looked for a library that would work on both (thinking this would be the OpenCL way) but I wasn't having any luck.
Someone suggested to me that I would have to use each vendor's FFT implementation and write a wrapper that chose what to do based on the platform. I found AMD's implementation pretty easily, but I'm actually working with an Nvidia card in the meantime (and this is the more important one for my particular application).
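The wrapper idea above could be sketched roughly as follows. This is only an illustration, not an existing API: the function name `choose_fft_backend` and the labels `"cuda_fft"` and `"opencl_fft"` are hypothetical, and the vendor strings are assumed to look like what the OpenCL platform vendor query typically returns (for example "NVIDIA Corporation" or "Advanced Micro Devices, Inc.").

```python
def choose_fft_backend(platform_vendor: str) -> str:
    """Pick an FFT backend label from an OpenCL platform vendor string.

    Hypothetical helper: the returned labels are placeholders for whatever
    vendor-specific FFT code the wrapper would actually call.
    """
    vendor = platform_vendor.lower()
    if "nvidia" in vendor:
        # Assumed: the Nvidia path goes through the CUDA-based FFT library.
        return "cuda_fft"
    # Assumed: AMD and every other vendor use the OpenCL FFT path.
    return "opencl_fft"


if __name__ == "__main__":
    for name in ("NVIDIA Corporation", "Advanced Micro Devices, Inc."):
        print(name, "->", choose_fft_backend(name))
```

The rest of the application would then call through whichever backend this returns, so the vendor check lives in one place.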
The only Nvidia implementation I can find is CUFFT. Does anyone know how I can actually use the CUFFT library from OpenCL? The only way I can think of is to have some CUDA code alongside my OpenCL code. I've read that I can't just use OpenCL buffers as CUDA pointers (see "Trying to mix in OpenCL with CUDA in NVIDIA's SDK template"). Would I instead have to copy the buffers back to the host after running my OpenCL kernels and then copy them to the GPU again using the CUDA memory transfer routines? I don't really like this approach, as it seems to involve pointless memory transfers. I would much prefer to use CUFFT directly from OpenCL.
In short: Yes, programs developed with the OpenCL headers from the Nvidia toolkit will also work on AMD and Intel GPUs. Programs for Nvidia GPUs are always written in OpenCL C 1.2, and AMD/Intel GPUs support OpenCL C 1.2/2.0/2.1/2.2, each of which is backwards-compatible with OpenCL C 1.2.
OpenCL™ (Open Computing Language) is a low-level API for heterogeneous computing that runs on CUDA-powered GPUs. Using the OpenCL API, developers can launch compute kernels written using a limited subset of the C programming language on a GPU.
If you have an Nvidia card, then use CUDA. It's considered faster than OpenCL much of the time. Note too that Nvidia cards do support OpenCL. The general consensus is that they're not as good at it as AMD cards are, but they're coming closer all the time.
NVIDIA has not done any work to support OpenCL libraries such as FFT. It also has not released the source for its CUDA libraries, so there is no way to run those from OpenCL.
AMD's FFT library is your best bet and will run on any other OpenCL-compliant device, including NVIDIA's GPUs. ArrayFire OpenCL leverages AMD's FFT library, and I've run that on Intel, NVIDIA, and AMD devices in our lab.
In addition to Ben's AMD suggestion, you could also investigate the Apple FFT example code. However, that code runs only on GPU devices, because it checks which device type the provided command queue was created for.
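To illustrate the kind of device-type check described above, here is a minimal sketch. It is not taken from the Apple sample, and whether that sample tests a bit or compares for equality is not something I have verified. The constants mirror the `CL_DEVICE_TYPE_*` bitfield values from the OpenCL headers; the helper `is_gpu_device` is a hypothetical name.

```python
# Bitfield values as defined in the OpenCL headers (cl.h).
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2
CL_DEVICE_TYPE_ACCELERATOR = 1 << 3


def is_gpu_device(device_type: int) -> bool:
    """Return True if the device-type bitfield has the GPU bit set.

    Hypothetical helper: in real code, device_type would come from querying
    CL_DEVICE_TYPE on the device behind the command queue.
    """
    return bool(device_type & CL_DEVICE_TYPE_GPU)


if __name__ == "__main__":
    print(is_gpu_device(CL_DEVICE_TYPE_GPU))
    print(is_gpu_device(CL_DEVICE_TYPE_CPU))
```

A library that rejects everything failing a check like this will refuse CPU and accelerator devices even when they are otherwise OpenCL-compliant.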