I know that NVIDIA GPUs with compute capability 2.x or greater can execute up to 16 kernels concurrently. However, my application spawns 7 "processes" and each of these 7 processes launches CUDA kernels.
My first question is: what is the expected behavior of these kernels? Will they execute concurrently as well, or, since they are launched by different processes, will they execute sequentially?
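For illustration, each process does roughly the following (the kernel and sizes here are placeholders, not my actual code):

```cpp
// Sketch of what each of the 7 processes does: it gets its own CUDA context
// implicitly through the runtime API and launches kernels into a non-default
// stream, hoping they run concurrently with kernels from the other processes.
#include <cuda_runtime.h>

__global__ void doWork(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;   // placeholder computation
}

int main()
{
    const int n = 1 << 20;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Kernels launched into different non-default streams *within one context*
    // are the ones eligible for concurrent execution on CC 2.x hardware.
    doWork<<<(n + 255) / 256, 256, 0, stream>>>(d_data, n);

    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
    cudaFree(d_data);
    return 0;
}
```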
I am confused because the CUDA C programming guide says:
"A kernel from one CUDA context cannot execute concurrently with a kernel from another CUDA context." This brings me to my second question, what are CUDA "contexts"?
Thanks!
A CUDA context is a virtual execution space that holds the code and data owned by a host thread or process. Only one context can ever be active on a GPU at a time with all current hardware.
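As a rough sketch (driver API, error checking omitted), every process that uses the GPU ends up owning a context along these lines, whether it creates one explicitly as below or lets the runtime API create one implicitly on the first CUDA call:

```cpp
// Minimal driver-API sketch: the context is the per-process sandbox that
// holds allocations, loaded modules, streams and kernel launches.
#include <cuda.h>

int main()
{
    CUdevice  dev;
    CUcontext ctx;

    cuInit(0);
    cuDeviceGet(&dev, 0);

    // Create and make current a context on device 0 for this process.
    cuCtxCreate(&ctx, 0, dev);

    // ... load modules, allocate device memory, launch kernels ...

    cuCtxDestroy(ctx);
    return 0;
}
```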
So to answer your first question, if you have seven separate threads or processes all trying to establish a context and run on the same GPU simultaneously, they will be serialised and any process waiting for access to the GPU will be blocked until the owner of the running context yields. There is, to the best of my knowledge, no time slicing and the scheduling heuristics are not documented and (I would suspect) not uniform from operating system to operating system.
You would be better off launching a single worker thread that holds the GPU context and using messaging from the other threads to push work onto the GPU. Alternatively, there is a context migration facility available in the CUDA driver API, but that only works with threads from the same process, and the migration mechanism has latency and host CPU overhead.
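A minimal sketch of that worker-thread pattern, assuming a made-up WorkItem type and ignoring error handling: one host thread owns the CUDA context (implicitly, via the runtime API) and drains a queue that the other threads fill.

```cpp
// Hypothetical sketch: one worker thread owns the CUDA context; other host
// threads enqueue work items instead of touching the GPU themselves.
#include <cuda_runtime.h>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct WorkItem { float *hostData; int n; };   // placeholder work description

std::queue<WorkItem>    workQueue;
std::mutex              queueMutex;
std::condition_variable queueCv;
bool                    done = false;

__global__ void process(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;                // placeholder computation
}

// All CUDA calls happen on this one thread, so only one context is ever used.
void gpuWorker()
{
    for (;;) {
        std::unique_lock<std::mutex> lock(queueMutex);
        queueCv.wait(lock, [] { return done || !workQueue.empty(); });
        if (workQueue.empty()) break;          // done was set and nothing left
        WorkItem item = workQueue.front();
        workQueue.pop();
        lock.unlock();

        float *d;
        cudaMalloc(&d, item.n * sizeof(float));
        cudaMemcpy(d, item.hostData, item.n * sizeof(float), cudaMemcpyHostToDevice);
        process<<<(item.n + 255) / 256, 256>>>(d, item.n);
        cudaMemcpy(item.hostData, d, item.n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(d);
    }
}

// Called from any other host thread: push work instead of touching the GPU.
void submit(float *hostData, int n)
{
    std::lock_guard<std::mutex> lock(queueMutex);
    workQueue.push({hostData, n});
    queueCv.notify_one();
}

int main()
{
    std::thread worker(gpuWorker);

    std::vector<float> data(1 << 20, 0.0f);
    submit(data.data(), (int)data.size());

    { std::lock_guard<std::mutex> lock(queueMutex); done = true; }
    queueCv.notify_one();
    worker.join();
    return 0;
}
```

The same idea works across your 7 processes if you replace the in-process queue with whatever IPC mechanism they already use, so that only one process ever talks to the GPU.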