CUDA: How to assert in kernel code?

What is the equivalent technique of an assertion in CUDA kernel code?

There does not seem to be an assert for CUDA kernel code. I want a way to catch programmer mistakes easily in kernel code. A mechanism where I can set conditions that need to be true and the kernel should bail out when the condition is false with an error message.

Ashwin Nanjappa asked Feb 25 '11 06:02


2 Answers

On devices of compute capability 2.x or higher, assert (declared as void assert(int expression)) can be used inside a kernel. Each thread for which expression == 0 sends a message to stderr once a host synchronization function is called.
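A minimal sketch of the device-side assert described above, assuming a reasonably recent toolkit (it uses cudaDeviceSynchronize) and a build without NDEBUG. The kernel checkNonNegative and the host helper runCheck are hypothetical names made up for this illustration.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every element of `data` is expected to be non-negative.
__global__ void checkNonNegative(const int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // A thread whose condition is false reports the failure on stderr.
        assert(data[i] >= 0);
    }
}

// Hypothetical host helper: copies the input to the device, launches the
// kernel and returns the status of the synchronization that follows it.
cudaError_t runCheck(const int *hostData, int n)
{
    int *devData = nullptr;
    cudaError_t err = cudaMalloc(&devData, n * sizeof(int));
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(devData, hostData, n * sizeof(int), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) { cudaFree(devData); return err; }

    checkNonNegative<<<(n + 255) / 256, 256>>>(devData, n);

    // The assertion message is only emitted once the host synchronizes.
    err = cudaDeviceSynchronize();
    cudaFree(devData);
    return err;
}
```

If an assertion fails, the synchronization call should return cudaErrorAssert rather than cudaSuccess, so the host can tell that something went wrong even without reading stderr.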

In other cases, or when assert cannot be used (e.g. on MacOS), you cannot return an error message or error code to the host directly from a kernel.

Instead, I would set an error state and check it from the host. Use device global memory or (better) mapped host memory to store the error state, and pass it as a parameter to each kernel call. Use if statements in the kernel, and if a check fails, set the error code and return. You can read the error code from the host after the kernel call, but keep in mind that you have to synchronize the host and device after the kernel launch before checking it. I guess this will work fine for development but not so much for production.
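The error-state approach could look roughly like the sketch below, which keeps the flag in device global memory. The error codes, the kernel halve and the host helper runHalve are hypothetical names chosen for this illustration, and most return values are left unchecked to keep it short.

```cuda
#include <cuda_runtime.h>

// Hypothetical error codes for this sketch.
enum KernelError { KERNEL_OK = 0, KERNEL_NEGATIVE_INPUT = 1 };

// Hypothetical kernel: halves each element, but bails out on negative input.
__global__ void halve(int *data, int n, int *errorFlag)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (data[i] < 0) {
        // All failing threads write the same value, so the race is harmless.
        *errorFlag = KERNEL_NEGATIVE_INPUT;
        return;  // this thread stops here
    }
    data[i] /= 2;
}

// Hypothetical host helper: returns the error code set by the kernel,
// or -1 if a CUDA runtime call itself fails.
int runHalve(int *hostData, int n)
{
    int *devData = nullptr;
    int *devFlag = nullptr;
    int flag = KERNEL_OK;

    if (cudaMalloc(&devData, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&devFlag, sizeof(int)) != cudaSuccess) {
        cudaFree(devData);
        return -1;
    }
    cudaMemcpy(devData, hostData, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(devFlag, &flag, sizeof(int), cudaMemcpyHostToDevice);

    halve<<<(n + 255) / 256, 256>>>(devData, n, devFlag);

    // Synchronize before looking at the flag, as the answer points out.
    if (cudaDeviceSynchronize() == cudaSuccess) {
        cudaMemcpy(&flag, devFlag, sizeof(int), cudaMemcpyDeviceToHost);
        cudaMemcpy(hostData, devData, n * sizeof(int), cudaMemcpyDeviceToHost);
    } else {
        flag = -1;
    }

    cudaFree(devFlag);
    cudaFree(devData);
    return flag;
}
```

The host then simply compares the returned value against KERNEL_OK after each launch.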

As for printing an error message straight from the device:

  • With CUDA 1.x, 2.x, and 3.0, you can use emulation mode to print an error message.
  • From CUDA 3.1 onward (on Fermi), apparently you can use printf in kernels to print the error message. It appears that it doesn't always work right away, e.g. http://forums.nvidia.com/index.php?showtopic=182448
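A small sketch of the printf route, assuming a device of compute capability 2.0 or higher. The kernel reportNegative and the host helper runReport are hypothetical names. Device-side printf output is buffered and only shown after certain host-side calls such as cudaDeviceSynchronize(), so the sketch synchronizes right after the launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: prints a message for every negative element.
__global__ void reportNegative(const int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] < 0) {
        printf("block %d thread %d: negative value %d at index %d\n",
               blockIdx.x, threadIdx.x, data[i], i);
    }
}

// Hypothetical host helper: launches the kernel and returns the status of
// the synchronization, which is also what flushes the printf buffer.
cudaError_t runReport(const int *hostData, int n)
{
    int *devData = nullptr;
    cudaError_t err = cudaMalloc(&devData, n * sizeof(int));
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(devData, hostData, n * sizeof(int), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) { cudaFree(devData); return err; }

    reportNegative<<<(n + 255) / 256, 256>>>(devData, n);

    err = cudaDeviceSynchronize();  // buffered messages are shown here
    cudaFree(devData);
    return err;
}
```

Unlike assert, printf does not stop the kernel, so the host still needs some other way to learn that a check failed.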

jmilloy answered Nov 07 '22 11:11


I would like to point out that an assert may fire in only one thread, but if you terminate that thread early, its absence may cause other bugs (and probably other asserts) later on, possibly leading to a complete kernel crash and the loss of all information on the GPU.

Also, the answer given at "Using assert within kernel invocation" will work only if the assert is used directly in the __global__ function and not deeper, somewhere inside a __device__ function.

My suggestion is that, even if an assert fails, you proceed normally with your code but leave an error message. You can use mapped, pinned memory (host RAM mapped into the GPU address space) to store error codes or messages. That way, even if your kernel crashes and the GPU is reset, you are likely to obtain valuable information from that mapped memory. If I am not mistaken, mapped, pinned memory is supported by almost all devices of Compute Capability 1.1 and higher.
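Here is a rough sketch of that idea, assuming a device and toolkit where mapped, pinned host memory is available. The ErrorRecord struct, the kernel scale and the host helper runScale are hypothetical names for this illustration. No atomics are used, so if several threads fail, the record holds whatever was written last.

```cuda
#include <cuda_runtime.h>

// Hypothetical error record kept in mapped, pinned host memory.
struct ErrorRecord {
    int code;   // 0 means no error was recorded
    int index;  // element index that triggered the error
};

// Hypothetical kernel: doubles each element and records a negative input,
// but keeps running instead of terminating the thread.
__global__ void scale(int *data, int n, ErrorRecord *err)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (data[i] < 0) {
        err->code = 1;   // leave a note for the host ...
        err->index = i;
    }
    data[i] *= 2;        // ... and proceed normally
}

// Hypothetical host helper: copies the error record to *out and returns 0
// if the kernel completed, -1 otherwise.
int runScale(int *hostData, int n, ErrorRecord *out)
{
    // Assumption: older setups may need cudaSetDeviceFlags(cudaDeviceMapHost)
    // before the first CUDA call for the mapping to be allowed.
    ErrorRecord *hostErr = nullptr;
    ErrorRecord *devErr = nullptr;
    int *devData = nullptr;

    if (cudaHostAlloc((void **)&hostErr, sizeof(ErrorRecord),
                      cudaHostAllocMapped) != cudaSuccess) return -1;
    hostErr->code = 0;
    hostErr->index = -1;

    if (cudaHostGetDevicePointer((void **)&devErr, hostErr, 0) != cudaSuccess ||
        cudaMalloc(&devData, n * sizeof(int)) != cudaSuccess) {
        cudaFreeHost(hostErr);
        return -1;
    }
    cudaMemcpy(devData, hostData, n * sizeof(int), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(devData, n, devErr);
    cudaError_t status = cudaDeviceSynchronize();

    // The record lives in host RAM, so it is read here regardless of status.
    *out = *hostErr;
    if (status == cudaSuccess) {
        cudaMemcpy(hostData, devData, n * sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(devData);
    cudaFreeHost(hostErr);
    return status == cudaSuccess ? 0 : -1;
}
```

Because the record is ordinary host memory, the host can inspect it after synchronization without an extra device-to-host copy.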

CygnusX1 answered Nov 07 '22 13:11