cuda 5.0 dynamic parallelism error: ptxas fatal . unresolved extern function 'cudaLaunchDevice

Question

I am using tesla k20 with compute capability 35 on Linux with CUDA 5.With a simple child kernel call it gives a compile error : Unresolved extern function cudaLaunchDevice

My command line looks like:

nvcc --compile -G -O0 -g -gencode arch=compute_35 , code=sm_35 -x cu -o fill.cu fill.o

I see cudadevrt.a in lib64.. Do we need to add it or what coukd be done to resolve it? Without child kernel call everything works fine.

talonmies · Accepted Answer

You must explicitly compile with relocatable device code enabled and link the device runtime library in order to use dynamic parallelism. So your compilation command must include --relocatable-device-code true and the linking command (which you haven't shown us) should include -lcudadevrt.

This procedure is described in detail in the "TOOLKIT SUPPORT FOR DYNAMIC PARALLELISM" section of the Dynamic Parallelism Programming Guide pdf, available here.

Vitality · Answer

Perhaps I'm somewhat off topic, but I would like to mention that I had the same issue under Windows/Visual Studio 2010 and I have solved the problem using the last comment by talonmies in few steps.

1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib

I hope that this information is useful.

cuda 5.0 dynamic parallelism error: ptxas fatal . unresolved extern function 'cudaLaunchDevice

Tags:

parallel-processing

cuda

gpu

Zahid

2 Answers

talonmies

Vitality

Recent Activity

Donate For Us

cuda 5.0 dynamic parallelism error: ptxas fatal . unresolved extern function 'cudaLaunchDevice

Tags:

parallel-processing

cuda

gpu

Zahid

2 Answers

talonmies

Vitality

Related questions

Recent Activity

Donate For Us