CUDA synchronization kernels

Tags:

cuda

Hi, I have a question about programming in CUDA. I have the following code:

int main () {

    for (;;) {
        // Kernels are launched back to back, with no explicit stream
        kernel_1 (x1, x2, ....);
        kernel_2 (x1, x2 ...);
        kernel_3_Reduction (x1);

        // Copy x1 from the device to the host,
        // then manipulate host_x1 on the CPU
        cpy (host_x1, x1, DeviceToHost);
        cpu_code_x1_manipulation;
        kernel_ (x1, x2, ....);
    }

}

So when is the copy made, and how do I ensure that kernel_1, kernel_2 and kernel_3 have completed their tasks?


asked Sep 27 '12 19:09

1 Answer

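Operations launched on the same stream execute in the order in which they were issued. In the code above no stream is specified, so every kernel goes to the default stream and the kernels run one after another: kernel_2 starts only after kernel_1 has finished, and so on. Assuming cpy is a blocking cudaMemcpy, the copy likewise starts only after kernel_3_Reduction has finished and returns to the host only once the data has arrived, so no extra synchronization call is needed before the CPU code. You will have to explicitly specify streams if you need kernel_1 and kernel_2 to run in parallel.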
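To illustrate both cases, here is a minimal sketch rather than the asker's actual code. The kernels add_one and times_two, the CHECK macro and the run_demo function are hypothetical stand-ins, and it assumes the question's cpy corresponds to a plain cudaMemcpy and that the program is compiled with the default (legacy) default-stream behaviour. The first part relies on default-stream ordering only; the second part puts two independent kernels on separate streams so that they are allowed to overlap.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: leave run_demo() with 1 if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            fprintf(stderr, "%s\n", cudaGetErrorString(err_));   \
            return 1;                                            \
        }                                                        \
    } while (0)

// Hypothetical stand-ins for the kernels in the question.
__global__ void add_one(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

__global__ void times_two(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

// Returns 0 if every step succeeds and the results are as expected.
int run_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *host_x1 = (float *)malloc(bytes);
    float *host_x2 = (float *)malloc(bytes);
    if (!host_x1 || !host_x2) return 1;

    float *x1 = nullptr, *x2 = nullptr;
    CHECK(cudaMalloc(&x1, bytes));
    CHECK(cudaMalloc(&x2, bytes));
    CHECK(cudaMemset(x1, 0, bytes));  // all-zero bytes are 0.0f
    CHECK(cudaMemset(x2, 0, bytes));

    // Part 1: default stream only. times_two starts after add_one finishes.
    add_one<<<blocks, threads>>>(x1, n);
    times_two<<<blocks, threads>>>(x1, n);
    CHECK(cudaGetLastError());
    // cudaMemcpy waits for the kernels above, then blocks the host
    // until the data has been copied.
    CHECK(cudaMemcpy(host_x1, x1, bytes, cudaMemcpyDeviceToHost));
    for (int i = 0; i < n; ++i)
        if (host_x1[i] != 2.0f) return 1;  // (0 + 1) * 2

    // Part 2: two streams. The kernels touch different buffers,
    // so the GPU is free to run them concurrently.
    cudaStream_t s1, s2;
    CHECK(cudaStreamCreate(&s1));
    CHECK(cudaStreamCreate(&s2));
    add_one<<<blocks, threads, 0, s1>>>(x1, n);
    add_one<<<blocks, threads, 0, s2>>>(x2, n);
    CHECK(cudaGetLastError());
    // With separate streams the host must wait for each one explicitly.
    CHECK(cudaStreamSynchronize(s1));
    CHECK(cudaStreamSynchronize(s2));
    CHECK(cudaMemcpy(host_x1, x1, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(host_x2, x2, bytes, cudaMemcpyDeviceToHost));
    for (int i = 0; i < n; ++i)
        if (host_x1[i] != 3.0f || host_x2[i] != 1.0f) return 1;

    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaStreamDestroy(s2));
    CHECK(cudaFree(x1));
    CHECK(cudaFree(x2));
    free(host_x1);
    free(host_x2);
    return 0;
}

int main() {
    int rc = run_demo();
    printf(rc == 0 ? "ok\n" : "failed\n");
    return rc;
}
```

Kernel launches return to the host immediately, so if the CPU needs to wait for the GPU without copying anything, cudaDeviceSynchronize() (or cudaStreamSynchronize on a specific stream) is the explicit way to do it. Whether the two kernels in the second part really overlap depends on the device and on how many resources each kernel occupies.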


answered Sep 23 '22 20:09

Eugene

user1704397
