What is the best pattern to get a GPU to efficiently calculate 'anti-functional' routines, which usually depend on positioned memory writes instead of reads? E.g. calculating a histogram, sorting, dividing a number by percentages, merging data of differing size into lists, etc.
The established terms are gather reads and scatter writes.
Gather reads: your program writes to a fixed position (like the target fragment position of a fragment shader), but has fast access to arbitrary data sources (textures, uniforms, etc.).
Scatter writes: your program receives a stream of input data which it cannot arbitrarily address, but it can do fast writes to arbitrary memory locations.
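To make the difference concrete, here is a minimal CPU-side sketch in Python of the same histogram written both ways. The function names are made up for illustration and the values are assumed to already be integer bin indices. It only shows the access pattern, not GPU code.

```python
def histogram_scatter(values, num_bins):
    """Scatter: iterate over the inputs; each input chooses where it writes."""
    bins = [0] * num_bins
    for v in values:
        bins[v] += 1  # write position depends on the data
    return bins


def histogram_gather(values, num_bins):
    """Gather: iterate over the outputs; each output reads all inputs it needs."""
    # Every output slot has a fixed position and scans the whole input.
    return [sum(1 for v in values if v == b) for b in range(num_bins)]


if __name__ == "__main__":
    data = [0, 2, 2, 1, 2]
    print(histogram_scatter(data, 3))
    print(histogram_gather(data, 3))
```

The gather version touches every input once per output slot, which is why histogram-like problems are awkward on a pure gather architecture.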
Clearly the shader architecture of OpenGL is a gather system. The latest OpenGL 4 also allows some scatter writes in the fragment shader, but they're slow.
So what is the most efficient way, these days, to emulate "scattering" with OpenGL? So far, it is to use a vertex shader operating on pixel-sized points. You send in as many points as you have data points to process, and scatter them in target memory by setting their positions accordingly. You can use geometry and tessellation shaders to emit the points from the data processed in the vertex stage. You can use texture buffers and UBOs for data input, using the vertex/point index for addressing.
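The point-scatter idea can be sketched on the CPU as follows. This is a plain Python simulation, not OpenGL code: `vertex_stage` and `rasterize_point` are hypothetical stand-ins for the vertex shader and the rasterizer. It assumes input values in [0, 1], a one-pixel-high render target as wide as the number of bins, and additive blending enabled so that points landing on the same pixel accumulate (the text above does not mention blending; it is an assumption needed for the histogram case).

```python
import math


def vertex_stage(value, num_bins):
    """Hypothetical 'vertex shader': place the point at the centre of its bin.

    Returns a clip-space x coordinate in [-1, 1). `value` is assumed in [0, 1].
    """
    bin_index = min(int(value * num_bins), num_bins - 1)
    return 2.0 * (bin_index + 0.5) / num_bins - 1.0


def rasterize_point(x_ndc, width):
    """Hypothetical rasterizer: map clip-space x to a pixel column."""
    return int(math.floor((x_ndc + 1.0) * 0.5 * width))


def scatter_histogram(values, num_bins):
    target = [0] * num_bins  # 1D render target, cleared to zero
    for v in values:  # one point per data element
        pixel = rasterize_point(vertex_stage(v, num_bins), num_bins)
        target[pixel] += 1  # stands in for additive blending
    return target


if __name__ == "__main__":
    print(scatter_histogram([0.0, 0.1, 0.5, 0.55, 0.99, 1.0], 4))
```

Aiming at the pixel centre (the `+ 0.5`) keeps each point safely inside one pixel, so small rounding errors in the position do not move it into a neighbouring bin.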