Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

cuda 5.0 dynamic parallelism error: ptxas fatal . unresolved extern function 'cudaLaunchDevice

I am using tesla k20 with compute capability 35 on Linux with CUDA 5.With a simple child kernel call it gives a compile error : Unresolved extern function cudaLaunchDevice

My command line looks like:

nvcc --compile -G -O0 -g -gencode arch=compute_35 , code=sm_35 -x cu -o fill.cu fill.o

I see cudadevrt.a in lib64.. Do we need to add it or what coukd be done to resolve it? Without child kernel call everything works fine.

like image 836
Zahid Avatar asked Dec 15 '12 02:12

Zahid


2 Answers

You must explicitly compile with relocatable device code enabled and link the device runtime library in order to use dynamic parallelism. So your compilation command must include --relocatable-device-code true and the linking command (which you haven't shown us) should include -lcudadevrt.

This procedure is described in detail in the "TOOLKIT SUPPORT FOR DYNAMIC PARALLELISM" section of the Dynamic Parallelism Programming Guide pdf, available here.

like image 111
talonmies Avatar answered Oct 02 '22 19:10

talonmies


Perhaps I'm somewhat off topic, but I would like to mention that I had the same issue under Windows/Visual Studio 2010 and I have solved the problem using the last comment by talonmies in few steps.

1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib

I hope that this information is useful.

like image 39
Vitality Avatar answered Oct 02 '22 19:10

Vitality