Which is preferable from an efficiency point of view (or from another point of view, if it matters)?
Situation
An OpenGL application that draws many lines at different positions every frame (60 fps). Let's say there are 10 lines. Or 100 000 lines. Would the answer be different?
1. Every frame would have one glDrawArrays call per line to draw, and in between there would be matrix transformations to position each line.
2. Every frame would have a single draw call.
The second option is far more efficient.
Changing state, particularly transformations and matrices, tends to cause recalculation of other state and generally more math.
Updating geometry, however, simply involves overwriting a buffer.
Modern video hardware sits on high-bandwidth buses, so sending a few floats across is trivial. These buses are designed to move large amounts of data quickly, and updating vertex buffers is exactly what they do often and fast. If we assume vertices of 32 bytes each (float4 position and color), 100000 line segments at two vertices each come to about 6.4 MB, and PCIe 2.0 x16 is about 8 GB/s, I believe.
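The arithmetic behind that estimate can be sketched as follows. This is a back-of-the-envelope calculation, not a measurement: it assumes independent segments (two vertices each; a connected strip would need roughly half), and it treats 8 GB/s as a theoretical peak that real transfers will not reach. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for `segments` independent line segments
// (two vertices per segment), given the size of one vertex in bytes.
constexpr std::size_t lineBufferBytes(std::size_t segments,
                                      std::size_t bytesPerVertex) {
    return segments * 2 * bytesPerVertex;
}

// Idealized transfer time in milliseconds at a given bandwidth
// (bytes per second). Ignores latency and driver overhead.
inline double transferMillis(std::size_t bytes, double bytesPerSecond) {
    assert(bytesPerSecond > 0.0);
    return static_cast<double>(bytes) / bytesPerSecond * 1000.0;
}

// 100000 segments * 2 vertices * 32 bytes = 6400000 bytes.
static_assert(lineBufferBytes(100000, 32) == 6400000,
              "100000 segments of 32-byte vertices");
```

Under these assumptions the whole buffer takes on the order of 0.8 ms to move at peak bandwidth, against a frame budget of about 16.7 ms at 60 fps.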
In some cases, depending on how the driver or card handles transforms, changing one may trigger matrix multiplications and recalculation of other values, including transforms, culling and clipping planes, etc. This isn't a problem if you change the state, draw a few thousand polys, and repeat, but when state changes are frequent, they carry a significant cost.
Batching is the established answer to this problem: it minimizes state changes so that more geometry can be drawn between them, which makes drawing large amounts of geometry more efficient.
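As a rough illustration of what minimizing state changes means, the sketch below sorts a list of pending draws by the state they need and counts how many state switches result. `DrawItem`, `stateId`, and both functions are hypothetical names for this example only; a real renderer would key on whatever state it actually switches (shader, texture, matrix, and so on).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical record of one pending draw: which state it needs
// and which vertex range it covers.
struct DrawItem {
    int stateId;
    int firstVertex;
    int vertexCount;
};

// Number of times the state would have to be (re)set if the items
// were drawn in the given order, counting the initial set.
std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].stateId != items[i - 1].stateId) {
            ++changes;
        }
    }
    return changes;
}

// Group draws that share a state so they can be issued back to back.
// stable_sort keeps the original order within each state.
void sortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.stateId < b.stateId;
                     });
}
```

After sorting, the number of state changes drops to the number of distinct states, regardless of how many draws there are.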
As a very clear example, consider the best case for #1: setting the transform triggers no additional calculation, and the driver buffers zealously and perfectly. To draw 100000 lines, you need 100000 matrix sets and 100000 draw calls.
The function call overhead alone is going to kill performance.
On the other hand, batching involves one matrix set and one draw call, plus a single bulk copy of all the point data into the VBO.
You do copy more data, but there's a good chance the VBO contents still aren't as expensive as copying the matrix data. Plus, you save a huge amount of CPU time in function calls (200000 down to 2). This simplifies life for you, the driver (which has to buffer everything and check for redundant calls and optimize and handle downloading) and probably the video card as well (which may have had to recalculate). To make it really clear, visualize simple code for it:
for (i = 0; i < 100000; ++i)
{
    matrix = calcMatrix(i);
    setMatrix(matrix);
    drawLines(1, vbo);
}
(Now imagine that loop unrolled.) Compare it with the batched version:
matrix = calcMatrix();
setMatrix(matrix);
for (i = 0; i < 100000; ++i)
{
    localVBO[i] = point[i];
}
setVBO(localVBO);
drawLines(100000, vbo);
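A slightly more concrete sketch of the batched path, in C++, might look like the following. It assumes each line is the same template segment placed by a 2D rotation and translation, applied on the CPU; `Vec2`, `LinePlacement`, `place`, and `buildBatch` are invented for this example. The OpenGL calls appear only in a comment, since they need a live context and a bound buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical per-line placement: rotate by `angle` (radians),
// then translate by `offset`.
struct LinePlacement {
    float angle;
    Vec2 offset;
};

// Apply one placement to a point of the shared template line.
inline Vec2 place(const LinePlacement& p, Vec2 v) {
    const float c = std::cos(p.angle);
    const float s = std::sin(p.angle);
    return { c * v.x - s * v.y + p.offset.x,
             s * v.x + c * v.y + p.offset.y };
}

// Build one contiguous vertex array for all lines: two vertices per
// line, already transformed, ready to upload in a single call.
std::vector<Vec2> buildBatch(const std::vector<LinePlacement>& placements,
                             Vec2 a, Vec2 b) {
    std::vector<Vec2> vertices;
    vertices.reserve(placements.size() * 2);
    for (const LinePlacement& p : placements) {
        vertices.push_back(place(p, a));
        vertices.push_back(place(p, b));
    }
    assert(vertices.size() == placements.size() * 2);
    return vertices;
}

// With a current GL context and the buffer bound to GL_ARRAY_BUFFER,
// the per-frame upload and draw would then be roughly:
//   glBufferData(GL_ARRAY_BUFFER, vertices.size() * sizeof(Vec2),
//                vertices.data(), GL_DYNAMIC_DRAW);
//   glDrawArrays(GL_LINES, 0, static_cast<GLsizei>(vertices.size()));
```

The per-line math still happens, but it runs in a tight CPU loop over contiguous memory instead of going through 100000 separate driver calls.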