
Are CUDA kernel calls synchronous or asynchronous?

Tags: cuda, nvidia

I read that one can use kernel launches to synchronize different blocks, i.e., if I want all blocks to complete operation 1 before they move on to operation 2, I should place operation 1 in one kernel and operation 2 in another kernel. This way, I can achieve global synchronization between blocks. However, the CUDA C Programming Guide says that kernel calls are asynchronous, i.e., the CPU does not wait for the first kernel call to finish and can therefore call the second kernel before the first has finished. If this is true, then we cannot use kernel launches to synchronize blocks. Please let me know where I am going wrong.
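For reference, here is a minimal sketch of the pattern the question describes (the kernel names, launch sizes, and per-element work are placeholders, not taken from any particular source):

    #include <cuda_runtime.h>

    // Placeholder kernels standing in for "operation 1" and "operation 2".
    __global__ void operation1(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1.0f;               // some per-element work
    }

    __global__ void operation2(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;               // work that must see operation1's result
    }

    int main() {
        const int n = 1 << 20;
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        cudaMemset(d_data, 0, n * sizeof(float));

        // Both launches go to the default stream: on the GPU, operation2 will not
        // start until every block of operation1 has finished, which is the
        // "global synchronization between blocks" the question refers to.
        operation1<<<(n + 255) / 256, 256>>>(d_data, n);
        operation2<<<(n + 255) / 256, 256>>>(d_data, n);

        cudaDeviceSynchronize();                  // CPU waits here for both kernels
        cudaFree(d_data);
        return 0;
    }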

asked Dec 12 '11 by Programmer




1 Answer

Kernel calls are asynchronous from the point of view of the CPU, so if you call two kernels in succession the second one will be called without waiting for the first one to finish. It only means that control returns to the CPU immediately.
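A small sketch of that host-side view (the kernel body and the printed messages are illustrative assumptions, not from the answer):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void someKernel(int *out) {
        // Placeholder for a long-running kernel.
        if (blockIdx.x == 0 && threadIdx.x == 0) *out = 42;
    }

    int main() {
        int *d_out;
        cudaMalloc(&d_out, sizeof(int));

        someKernel<<<1, 32>>>(d_out);
        // Control is already back on the CPU here; the kernel may still be running.
        printf("launch returned, kernel may not have finished yet\n");

        cudaDeviceSynchronize();   // blocks the CPU until all queued GPU work is done
        printf("kernel has definitely finished\n");

        cudaFree(d_out);
        return 0;
    }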

On the GPU side, if you haven't specified different streams for the kernels, they will be executed in the order they were called (if you don't specify a stream, they both go to the default stream and are executed serially). The second kernel will execute only after the first one has finished.
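A sketch of both cases (kernel names and launch configurations are placeholders): launched into the default stream, the kernels run one after the other; launched into different non-default streams, the GPU is allowed to overlap them.

    #include <cuda_runtime.h>

    __global__ void kernelA() { /* ... */ }
    __global__ void kernelB() { /* ... */ }

    int main() {
        // Same (default) stream: kernelB starts only after kernelA has finished.
        kernelA<<<128, 256>>>();
        kernelB<<<128, 256>>>();

        // Separate streams: the GPU may overlap the two kernels
        // (actual overlap depends on the device and on free resources).
        cudaStream_t s1, s2;
        cudaStreamCreate(&s1);
        cudaStreamCreate(&s2);
        kernelA<<<128, 256, 0, s1>>>();
        kernelB<<<128, 256, 0, s2>>>();

        cudaDeviceSynchronize();
        cudaStreamDestroy(s1);
        cudaStreamDestroy(s2);
        return 0;
    }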

This behavior holds even on devices with compute capability 2.x, which support concurrent kernel execution (via different streams). On other devices, even though kernel calls are still asynchronous, kernel execution is always sequential.
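If you want to check at runtime whether a device supports concurrent kernel execution, the runtime API exposes it as a device property (a minimal sketch, assuming device 0):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // query device 0
        printf("compute capability %d.%d, concurrent kernels: %s\n",
               prop.major, prop.minor,
               prop.concurrentKernels ? "yes" : "no");
        return 0;
    }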

Check section 3.2.5 of the CUDA C Programming Guide, which every CUDA programmer should read.

answered Sep 20 '22 by jmsu