I am currently writing a matrix multiplication on a GPU and would like to debug my code, but since I cannot use printf inside a device function, is there something else I can do to see what is going on inside that function? This is my current function:
__global__ void MatrixMulKernel(Matrix Ad, Matrix Bd, Matrix Xd)
{
    int tx = threadIdx.x;
    int ty = threadIdx.y;
    int bx = blockIdx.x;
    int by = blockIdx.y;

    float sum = 0;
    for (int k = 0; k < Ad.width; ++k) {
        float Melement = Ad.elements[ty * Ad.width + k];
        float Nelement = Bd.elements[k * Bd.width + tx];
        sum += Melement * Nelement;
    }
    Xd.elements[ty * Xd.width + tx] = sum;
}
I would love to know whether Ad and Bd are what I think they are, and to see if that function is actually being called.
CUDA now supports printf directly in the kernel. For a formal description, see Appendix B.16 of the CUDA C Programming Guide.
__global__ is a CUDA C keyword (declaration specifier) which says that the function executes on the device (GPU) and is called from host (CPU) code.
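For example, here is a minimal sketch of that split (the kernel and variable names are made up for illustration):

#include <cuda_runtime.h>
#include <cstdio>

// Executes on the device (GPU); launched from host (CPU) code.
__global__ void fillKernel(int *out)
{
    out[threadIdx.x] = threadIdx.x;
}

int main()
{
    const int n = 8;
    int h_out[n];
    int *d_out = NULL;

    cudaMalloc((void **)&d_out, n * sizeof(int));
    fillKernel<<<1, n>>>(d_out);   // host code calls, device code executes
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i)
        printf("%d ", h_out[i]);
    printf("\n");

    cudaFree(d_out);
    return 0;
}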
From the CUDA C Programming Guide: printf prints formatted output from a kernel to a host-side output stream. The output buffer for printf() is set to a fixed size before kernel launch (see Associated Host-Side API).
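That associated host-side API boils down to the runtime's device-limit calls; a minimal sketch of querying and resizing the printf buffer before launching any kernels (the 10 MB value is just an arbitrary example):

#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    // Query the current size of the device-side printf FIFO buffer.
    size_t size = 0;
    cudaDeviceGetLimit(&size, cudaLimitPrintfFifoSize);
    printf("printf buffer size: %zu bytes\n", size);

    // Enlarge it before any kernel launch if a lot of output is expected.
    cudaDeviceSetLimit(cudaLimitPrintfFifoSize, 10 * 1024 * 1024);
    return 0;
}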
You can't do that on older devices, because from __global__ and __device__ functions you can only call other device functions, and the host's printf isn't one of them. Is there a way to use printf within a CUDA kernel? Before compute capability 2.0, no, not really. I think your question comes from a lack of understanding of the underlying hardware architecture.
Devices with compute capability 2.x or higher support calls to printf from within a CUDA kernel (you must be using CUDA version 3.1 or higher); compile with something like nvcc -arch=sm_20 printf.cu. An important thing to note is that every CUDA thread will call printf, so in the small example below we'll see 100 lines of output!
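A minimal sketch of that example (the launch of one block of 100 threads is an assumption chosen to match the 100 lines of output mentioned above):

#include <cstdio>

__global__ void helloKernel()
{
    // Every thread that runs this kernel prints one line.
    printf("Hello from thread %d\n", threadIdx.x);
}

int main()
{
    // One block of 100 threads -> 100 lines of output.
    helloKernel<<<1, 100>>>();

    // Device-side printf output is flushed to the host at synchronization points.
    cudaDeviceSynchronize();
    return 0;
}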
The required parts are:

- Using the __global__ keyword for the functions that will be called from the host and run on the device
- Using the <<< , >>> angle brackets to mark a call from host code to device code

The recommended parts are:

- Calling cudaSetDevice(int device); to specify which device should be used
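Putting those parts together, a minimal host-side sketch (device 0 and the grid/block dimensions are arbitrary choices; the error check at the end is an extra sanity check, not one of the parts listed above):

#include <cuda_runtime.h>
#include <cstdio>

// Marked __global__: runs on the device, called from the host.
__global__ void emptyKernel() { }

int main()
{
    // Recommended: select the device explicitly before any other CUDA work.
    cudaSetDevice(0);

    // Required: the <<< , >>> brackets mark the host-to-device call.
    dim3 grid(2, 2);
    dim3 block(16, 16);
    emptyKernel<<<grid, block>>>();

    // Confirm the launch was accepted by the runtime.
    cudaError_t err = cudaGetLastError();
    printf("launch status: %s\n", cudaGetErrorString(err));

    cudaDeviceSynchronize();
    return 0;
}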