Peer-to-Peer CUDA transfers

Question

I have heard about peer-to-peer memory transfers and read a bit about them, but I could not work out how fast they are compared to standard PCI-E bus transfers.

I have a CUDA application that uses more than one GPU, and I might be interested in P2P transfers. My question is: how fast are they compared to PCI-E? Can I use them often to have two devices communicate with each other?

Pavan Yalamanchili · Accepted Answer

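A CUDA "peer" is another GPU that is capable of accessing data from the current GPU. GPUs with compute capability 2.0 and greater support this feature, although whether a particular pair of devices can act as peers also depends on the system and can be queried with cudaDeviceCanAccessPeer.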
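As a minimal sketch of that query, the program below asks the runtime about every ordered pair of GPUs in the machine. It only uses cudaGetDeviceCount and cudaDeviceCanAccessPeer from the runtime API; the output format is made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Ask the runtime about every ordered pair (device -> peer).
    for (int dev = 0; dev < deviceCount; ++dev) {
        for (int peer = 0; peer < deviceCount; ++peer) {
            if (dev == peer) continue;

            int canAccess = 0;  // set to 1 if dev can access peer's memory
            err = cudaDeviceCanAccessPeer(&canAccess, dev, peer);
            if (err != cudaSuccess) {
                std::printf("cudaDeviceCanAccessPeer failed: %s\n",
                            cudaGetErrorString(err));
                return 1;
            }
            std::printf("GPU %d -> GPU %d: peer access %s\n", dev, peer,
                        canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}
```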

Peer-to-peer memory copies use cudaMemcpy to copy memory over PCI-E, as shown below.

cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);

Note that dst and src can be on different devices.
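Since the question is how fast this is, the most reliable answer is to measure it on your own machine. The sketch below allocates one buffer on device 0 and one on device 1, then times repeated cudaMemcpy calls between them with a host timer. The buffer size, repeat count, and the CHECK macro are assumptions chosen for illustration. Peer access is enabled first when the pair supports it; as far as I know, the runtime may otherwise stage the copy through host memory.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            std::fprintf(stderr, "%s failed: %s\n", #call,       \
                         cudaGetErrorString(err_));              \
            std::exit(1);                                        \
        }                                                        \
    } while (0)

// Wait until all outstanding work on both devices has finished.
static void syncBoth() {
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceSynchronize());
}

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    const size_t bytes = 64u << 20;  // 64 MiB per copy (arbitrary choice)
    const int repeats = 20;          // arbitrary choice

    // Enable direct access from device 0 to device 1 if the pair supports it.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 0, 1));
    CHECK(cudaSetDevice(0));
    if (canAccess) CHECK(cudaDeviceEnablePeerAccess(1, 0));
    std::printf("Peer access 0 -> 1: %s\n", canAccess ? "enabled" : "unavailable");

    void *src = nullptr, *dst = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));  // source lives on device 0
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));  // destination lives on device 1

    syncBoth();
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice));
    }
    syncBoth();
    auto t1 = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(t1 - t0).count();
    double gbPerSec = static_cast<double>(bytes) * repeats / seconds / 1e9;
    std::printf("Copied %d x %zu bytes in %.3f s (%.2f GB/s)\n",
                repeats, bytes, seconds, gbPerSec);

    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(src));
    return 0;
}
```

Comparing the printed figure against a host-to-device copy of the same size on the same machine shows how the two paths relate on your hardware.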

cudaDeviceEnablePeerAccess lets the user launch a kernel that uses data from multiple devices. The memory accesses still go over PCI-E and have the same bottlenecks, so P2P is bounded by PCI-E bandwidth rather than being a faster alternative to it.
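A minimal sketch of such a kernel launch, assuming a machine with two GPUs where device 0 can access device 1. The kernel name scaleFromPeer, the problem size, and the CHECK macro are hypothetical names introduced for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            std::fprintf(stderr, "%s failed: %s\n", #call,       \
                         cudaGetErrorString(err_));              \
            std::exit(1);                                        \
        }                                                        \
    } while (0)

// Runs on device 0; peerSrc points to memory allocated on device 1.
__global__ void scaleFromPeer(const float* peerSrc, float* localDst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) localDst[i] = 2.0f * peerSrc[i];  // each read crosses the bus
}

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 0, 1));
    if (!canAccess) {
        std::printf("GPU 0 cannot access GPU 1 as a peer.\n");
        return 0;
    }

    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *peerSrc = nullptr, *localDst = nullptr;

    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&peerSrc, bytes));   // input lives on device 1
    CHECK(cudaMemset(peerSrc, 0, bytes));
    CHECK(cudaDeviceSynchronize());       // make sure the memset has finished

    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&localDst, bytes));  // output lives on device 0

    // Let the current device (0) dereference pointers into device 1's memory.
    CHECK(cudaDeviceEnablePeerAccess(1, 0));

    scaleFromPeer<<<(n + 255) / 256, 256>>>(peerSrc, localDst, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    std::printf("Kernel on GPU 0 read %d floats from GPU 1.\n", n);

    CHECK(cudaDeviceDisablePeerAccess(1));
    CHECK(cudaFree(localDst));
    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(peerSrc));
    return 0;
}
```

Because every peerSrc[i] read travels over the interconnect, a kernel that touches peer data repeatedly is usually better served by copying the data to local memory once.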

A good example of this is simpleP2P from the CUDA samples.

Tags:

cuda

nvidia

bandwidth

p2p

pci-e

Marco A.
