
Peer-to-Peer CUDA transfers

I have heard about peer-to-peer memory transfers and read a little about them, but I could not work out how fast they are compared to standard PCI-E bus transfers.

I have a CUDA application that uses more than one GPU, and I might be interested in P2P transfers. My question is: how fast is it compared to PCI-E? Can I use it frequently to have two devices communicate with each other?

Marco A. asked Jan 12 '23 21:01


1 Answer

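A CUDA "peer" is another GPU that can directly access the current GPU's memory. GPUs of compute capability 2.0 and greater support this feature, although whether two particular GPUs can act as peers also depends on the system; cudaDeviceCanAccessPeer reports this for a given pair.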

A peer-to-peer memory copy uses cudaMemcpy to move data between devices over PCI-E, as shown below.

cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);

Note that dst and src can be on different devices.
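To get a feel for how fast such a copy is on a particular machine, one option is to time it directly. The following is a minimal sketch, not a rigorous benchmark: it assumes at least two GPUs are present as devices 0 and 1, uses cudaMemcpyPeer (the explicit-device variant of the copy above), and picks an arbitrary 64 MiB buffer. The CHECK macro is a hypothetical helper defined here only for illustration.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed CUDA runtime call and exit main().
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            return 1;                                              \
        }                                                          \
    } while (0)

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    const size_t bytes = 64u << 20;  // 64 MiB, an arbitrary test size
    void* src = nullptr;
    void* dst = nullptr;

    // Allocate the source buffer on device 0 and the destination on device 1.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));

    // Report whether device 1 is able to address device 0's memory directly.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 1, 0));
    std::printf("Device 1 can access device 0 as a peer: %s\n",
                canAccess ? "yes" : "no");

    // Time one device-to-device copy with a host wall clock.
    // The copy may return before it finishes, so wait on both devices.
    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceSynchronize());
    auto t1 = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(t1 - t0).count();
    std::printf("Copied %zu bytes in %.3f ms (%.2f GB/s)\n", bytes,
                seconds * 1e3, bytes / seconds / 1e9);

    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(src));
    return 0;
}
```

A single timed copy includes one-time startup costs, so repeating the copy several times and averaging would give a steadier figure.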

cudaDeviceEnablePeerAccess lets a kernel running on one device directly use data that resides on another device. The memory accesses are still done over PCI-E and will have the same bottlenecks.
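As a rough illustration of that second use, the sketch below launches a kernel on device 1 that reads a buffer allocated on device 0. It assumes two GPUs that report peer capability to each other; the kernel name scale, the buffer names, and the CHECK macro are made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed CUDA runtime call and exit main().
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            return 1;                                              \
        }                                                          \
    } while (0)

// Launched on device 1; `in` points to memory allocated on device 0.
__global__ void scale(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Check that device 1 can address device 0's memory before enabling it.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 1, 0));
    if (!canAccess) {
        std::printf("Device 1 cannot access device 0 as a peer.\n");
        return 0;
    }

    const int n = 1 << 20;
    float* in0 = nullptr;   // lives on device 0
    float* out1 = nullptr;  // lives on device 1

    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&in0, n * sizeof(float)));
    CHECK(cudaMemset(in0, 0, n * sizeof(float)));
    CHECK(cudaDeviceSynchronize());  // make sure the input is initialised

    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&out1, n * sizeof(float)));

    // Grant the current device (1) access to device 0; flags must be 0.
    CHECK(cudaDeviceEnablePeerAccess(0, 0));

    // Each read of in0 inside the kernel goes to device 0's memory.
    scale<<<(n + 255) / 256, 256>>>(in0, out1, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    std::printf("Kernel on device 1 read device 0's buffer.\n");

    CHECK(cudaDeviceDisablePeerAccess(0));
    CHECK(cudaFree(out1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(in0));
    return 0;
}
```

Peer access is one-directional: the call above only lets device 1 reach device 0, so the reverse direction would need its own cudaDeviceEnablePeerAccess call made with device 0 current.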

A good example of this is simpleP2P from the CUDA samples.

Pavan Yalamanchili answered Feb 06 '23 08:02