Cuda Hello World printf not working even with -arch=sm_20

Question

I didn't think I was a complete newbie with Cuda, but apparently I am.

I recently upgraded my cuda device to one capable capability 1.3 to 2.1 (Geforce GT 630). I thought to do a full upgrade to Cuda toolkit 5.0 as well.

I can compile general cuda kernels, but printf is not working even with -arch=sm_20 set.

Code:

#include <stdio.h>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void test(){

    printf("Hi Cuda World");
}

int main( int argc, char** argv )
{

    test<<<1,1>>>();
        return 0;
}

Compiler:

Error   2   error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin
vcc.exe" -gencode=arch=compute_10,code=\"sm_20,compute_10\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include"  -G   --keep-dir "Debug" -maxrregcount=0  --machine 32 --compile -arch=sm_20  -g   -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o "Debug\main.cu.obj" "d:\userstore\documents\visual studio 2010\Projects	estCuda	estCuda\main.cu"" exited with code 2.  C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 5.0.targets  592 10  testCuda
Error   1   error : calling a __host__ function("printf") from a __global__ function("test") is not allowed d:\userstore\documents\visual studio 2010\Projects	estCuda	estCuda\main.cu    9   1   testCuda

I'm about done with life because of this problem...done done done. Please talk me down from the rooftops with an answer.

Tomasz Dzięcielewski · Accepted Answer

If you're using printf in kernel, you should use cudaDeviceSynchronize():

#include <stdio.h>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void test(){
    printf("Hi Cuda World");
}

int main( int argc, char** argv )
{
    test<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}

talonmies · Answer

In kernel printf is only supported in compute capability 2 or higher hardware. Because your project is set to build for both compute capability 1.0 and compute 2.1, nvcc compiles the code multiple times and builds a multi-architecture fatbinary object. It is during the compute capability 1.0 compilation cycle that the error is being generated, because the printf call is unsupported for that architecture.

If you remove the compute capability 1.0 build target from your project, the error will disappear.

You could alternatively, write the kernel like this:

__global__ void test()
{
#if __CUDA_ARCH__ >= 200
    printf("Hi Cuda World");
#endif
}

The __CUDA_ARCH__ symbol will only be >= 200 when building for compute capability 2.0 or high targets and this would allow you to compile this code for compute capability 1.x devices without encountering a syntax error.

When compiling for the correct architecture and getting no output, you also need to ensure that the kernel finishes and the driver flushes the output buffer. To do this add a synchronizing call after the kernel launch in the host code

for example:

int main( int argc, char** argv )
{

    test<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}

[disclaimer: all code written in browser, never compiled, use at own risk]

If you do both things, you should be able to compile, run and see output.

Cuda Hello World printf not working even with -arch=sm_20

Tags:

c++

cuda

PaulD

2 Answers

Tomasz Dzięcielewski

talonmies

Recent Activity

Donate For Us

Cuda Hello World printf not working even with -arch=sm_20

Tags:

c++

cuda

PaulD

2 Answers

Tomasz Dzięcielewski

talonmies

Related questions

Recent Activity

Donate For Us