
printf inside CUDA __global__ function

Tags: c++, c, cuda, gpu

I am currently writing a matrix multiplication kernel for a GPU and would like to debug my code. Since I cannot use printf inside a device function, is there something else I can do to see what is going on inside that function? This is my current function:

    __global__ void MatrixMulKernel(Matrix Ad, Matrix Bd, Matrix Xd){

        int tx = threadIdx.x;
        int ty = threadIdx.y;

        int bx = blockIdx.x;
        int by = blockIdx.y;

        float sum = 0;

        for( int k = 0; k < Ad.width ; ++k){
            float Melement = Ad.elements[ty * Ad.width + k];
            float Nelement = Bd.elements[k * Bd.width + tx];
            sum += Melement * Nelement;
        }

        Xd.elements[ty * Xd.width + tx] = sum;
    }

I would love to know whether Ad and Bd contain what I think they contain, and whether the function is actually being called.

Jose Vega asked Jan 31 '10 23:01



1 Answer

CUDA now supports printf calls directly in kernel code. For a formal description, see Appendix B.16 of the CUDA C Programming Guide.
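As a rough illustration of how this could look for the kernel in the question, here is a minimal sketch. The Matrix struct layout (width, height, elements) is an assumption, since the question does not show its definition, and the 2x2 test data and host code are made up for the example. The printf is restricted to one thread per block so the output stays readable, and the host calls cudaDeviceSynchronize() so the device-side output buffer is flushed before the program exits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed layout: the question does not show how Matrix is defined.
struct Matrix {
    int width;
    int height;
    float* elements;  // row-major; a device pointer when passed to the kernel
};

__global__ void MatrixMulKernel(Matrix Ad, Matrix Bd, Matrix Xd) {
    int tx = threadIdx.x;
    int ty = threadIdx.y;

    // Print from a single thread per block: this confirms the kernel ran
    // and shows a sample of what Ad and Bd actually contain.
    if (tx == 0 && ty == 0) {
        printf("block (%d,%d): Ad is %dx%d, Ad[0]=%f, Bd[0]=%f\n",
               blockIdx.x, blockIdx.y, Ad.height, Ad.width,
               Ad.elements[0], Bd.elements[0]);
    }

    float sum = 0.0f;
    for (int k = 0; k < Ad.width; ++k) {
        sum += Ad.elements[ty * Ad.width + k] * Bd.elements[k * Bd.width + tx];
    }
    Xd.elements[ty * Xd.width + tx] = sum;
}

int main() {
    const int n = 2;  // illustrative size only
    float hostA[n * n] = {1, 2, 3, 4};
    float hostB[n * n] = {5, 6, 7, 8};
    float hostX[n * n] = {0, 0, 0, 0};

    Matrix Ad{n, n, nullptr}, Bd{n, n, nullptr}, Xd{n, n, nullptr};
    cudaMalloc(&Ad.elements, sizeof(hostA));
    cudaMalloc(&Bd.elements, sizeof(hostB));
    cudaMalloc(&Xd.elements, sizeof(hostX));
    cudaMemcpy(Ad.elements, hostA, sizeof(hostA), cudaMemcpyHostToDevice);
    cudaMemcpy(Bd.elements, hostB, sizeof(hostB), cudaMemcpyHostToDevice);

    dim3 block(n, n);  // one thread per output element, single block
    MatrixMulKernel<<<1, block>>>(Ad, Bd, Xd);

    // Waiting for the kernel also flushes the device-side printf buffer.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hostX, Xd.elements, sizeof(hostX), cudaMemcpyDeviceToHost);
    for (int row = 0; row < n; ++row) {
        printf("%f %f\n", hostX[row * n], hostX[row * n + 1]);
    }

    cudaFree(Ad.elements);
    cudaFree(Bd.elements);
    cudaFree(Xd.elements);
    return 0;
}
```

Something like nvcc matmul_debug.cu -o matmul_debug (the file name is hypothetical) should build it on a device and toolkit recent enough to support in-kernel printf. Without the thread guard, every thread in the launch would print its own line, which gets noisy quickly for realistic matrix sizes.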

M. Tibbits answered Sep 24 '22 18:09