
Why is copying memory from VRAM to RAM slower than from RAM to VRAM? (OpenGL)

Tags: c++, gpgpu, opengl

I am creating something similar to CUDA, and I noticed that copying memory from RAM to VRAM is very fast, about as fast as copying from RAM to RAM. Copying from VRAM to RAM, however, is much slower.

By the way, I am using glTexSubImage2D to copy from RAM to VRAM and glGetTexImage to copy from VRAM to RAM.

Why? Is there a way to make it as fast as copying from RAM to VRAM?

Deleted user asked Jan 14 '23 21:01



1 Answer

Transferring data from the GPU to the CPU has always been a very slow operation.

A GPU -> CPU readback introduces a "sync point" where the CPU must wait for the GPU to complete its calculations. During this time, the CPU stops feeding the GPU with data, causing it to stall.

Now, remember that a modern GPU is designed in a highly parallel manner, with thousands of threads in flight at any given moment. The sync point must wait for all those threads to finish processing before it can read back the result of their calculations. Once the readback is complete, the GPU has been drained of work and all those threads must be started up again from zero... bad!

Reading back the results asynchronously (after a few frames) lets the GPU continue execution without its threads starving (the stop-and-resume issue outlined above). This improves performance tremendously: the more parallel the GPU, the bigger the improvement.

Depending on your graphics chip and driver, you may get better performance by using PBOs (pixel buffer objects).
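To make the "read back a few frames later" idea concrete, here is a minimal sketch of the bookkeeping only, assuming a small ring of transfer buffers (one PBO per slot in an OpenGL program). It contains no OpenGL calls so that it compiles on its own; the class name `DelayedReadback` and the `issue`/`collect` callbacks are hypothetical names made up for this illustration, and the comments only indicate where the real GL calls would probably go.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper: spreads readbacks over a ring of `slots` buffers so
// that the copy issued in frame N is collected in frame N + slots - 1.
class DelayedReadback {
public:
    using SlotFn = std::function<void(std::size_t slot)>;

    // issue:   start a copy into the given slot. With OpenGL this would bind
    //          the slot's PBO to GL_PIXEL_PACK_BUFFER and call glGetTexImage
    //          with a null pointer (an offset into the PBO), so the call can
    //          return before the copy has finished.
    // collect: read the finished copy out of the given slot. With OpenGL this
    //          would map the PBO (glMapBuffer), copy the bytes out and unmap
    //          it again (glUnmapBuffer).
    DelayedReadback(std::size_t slots, SlotFn issue, SlotFn collect)
        : slots_(slots), issue_(std::move(issue)), collect_(std::move(collect)) {
        assert(slots_ > 0 && "at least one slot is required");
    }

    // Call once per frame, after that frame's GPU work has been submitted.
    // Returns true if the result of an earlier frame was collected.
    bool onFrame() {
        // Start this frame's copy in the slot that was emptied last frame.
        issue_(frame_ % slots_);

        // The oldest pending copy sits in the next slot of the ring; it only
        // exists once the ring has been filled for the first time.
        const bool ready = frame_ + 1 >= slots_;
        if (ready) {
            collect_((frame_ + 1) % slots_);
        }

        ++frame_;
        return ready;
    }

private:
    std::size_t slots_;
    SlotFn issue_;
    SlotFn collect_;
    std::size_t frame_ = 0;
};
```

With two slots, the first call to `onFrame()` only issues a copy, and every later call collects the result of the previous frame while the current one is still in flight. With a single slot the helper degenerates into the synchronous readback from the question. Note that collecting can still block if the GPU has not finished the copy yet; a fence (`glFenceSync` / `glClientWaitSync`) is one way to check before mapping.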

BЈовић answered Jan 18 '23 23:01
