I switched to a new GPU, a GeForce GTX 980 with compute capability (cc) 5.2, so it should support dynamic parallelism. However, I was not able to compile even a simple example from the programming guide. I will not post the code here, since it is not necessary: it is just a __global__ kernel launching another __global__ kernel.
1) I use VS2013 for coding. In Property Pages -> CUDA C/C++ -> Device, I changed the Code Generation property to compute_35,sm_35, and here is the output:
1>------ Build started: Project: testCublas3, Configuration: Debug Win32 ------
1> Compiling CUDA source file kernel.cu...
1>
1> C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -cudart static -g -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu"
1>C:/programs/misha/cuda/Projects/test projects/testCublas3/testCublas3/kernel.cu(13): error : kernel launch from __device__ or __global__ functions requires separate compilation mode
1> kernel.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets(593,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -cudart static -g -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu"" exited with code 2.
I guess that I need another option for this compilation, -rdc=true, but I could not find where to set it in VS2013.
2) When I set the Code Generation property to compute_52,sm_52, there is an error: Unsupported gpu architecture 'compute_52'. But my cc is 5.2. So is cc 3.5 the maximum I can compile for?
Thanks
CUDA dynamic parallelism is an extension to the CUDA programming model that enables a CUDA kernel to create new thread grids by launching new kernels. Dynamic parallelism was introduced with the Kepler architecture, first appearing in the GK110 chip. In earlier CUDA systems, kernels could only be launched from host code.
Regarding item 1, CUDA dynamic parallelism requires separate compilation and linking (-rdc=true), as well as linking in the device cudart library (-lcudadevrt). Dynamic parallelism that also uses CUBLAS will additionally require linking in the device CUBLAS library (-lcublas_device). Possibly the simplest way to see where all of these go in a Visual Studio project is to start by looking at the Visual Studio project for the device CUBLAS sample.
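For reference, here is a minimal sketch of a parent kernel launching a child kernel, together with a command line that applies the flags mentioned above. The file name dp_min.cu and the kernel names parent and child are made up for illustration, and the command assumes nvcc is on the PATH and that the toolkit knows the target architecture. In a Visual Studio project the equivalent settings are, as far as I know, the "Generate Relocatable Device Code" property under CUDA C/C++ -> Common, plus cudadevrt.lib added to the linker's additional dependencies, but check this against the sample project.

```cuda
// dp_min.cu (hypothetical file name): minimal dynamic parallelism sketch.
// Assumed build command for a cc 3.5+ device:
//   nvcc -arch=sm_35 -rdc=true dp_min.cu -o dp_min -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes twice its index.
__global__ void child(int *out)
{
    out[threadIdx.x] = 2 * threadIdx.x;
}

// Parent kernel: a single thread launches the child grid from the device.
// Without -rdc=true, this launch is what produces the
// "requires separate compilation mode" error.
__global__ void parent(int *out, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        child<<<1, n>>>(out);
    }
}

int main()
{
    const int n = 8;
    int h_out[n];
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    parent<<<1, 1>>>(d_out, n);

    // The parent grid is not complete until its child grid has finished,
    // so synchronizing on the host covers both kernels.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) {
        printf("%d ", h_out[i]);
    }
    printf("\n");

    cudaFree(d_out);
    return 0;
}
```

If this builds and runs from the command line but not from Visual Studio, the difference is most likely in the project's compile and link settings rather than in the code.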
Regarding item 2, the reason your GTX 980 (compute capability 5.2) is not being recognized is that you need the latest update for the CUDA 6.5 toolkit, which is available here.
(Note that the cublas_device capability has been removed from recent versions of CUDA.)