What are some common guidelines for choosing a vertex buffer type? When should we use interleaved buffers for vertex data, and when separate ones? When should we use an index array, and when direct vertex data?
I'm looking for some general guidelines. I know some cases where one or the other fits better, but not all cases are easy to decide. What should one keep in mind when choosing the vertex buffer format with performance as the goal?
Links to web resources on the topic are also welcome.
First of all, you can find some useful information on the OpenGL wiki. Second, if in doubt, profile. There are some rules of thumb about this, but results can vary with the data set, hardware, drivers, and so on.
By default I would almost always use the indexed method for vertex buffers. The main reason is the so-called post-transform cache, a cache kept after the vertex processing stage of your graphics pipeline. Essentially, if you use a vertex multiple times, you have a good chance of hitting this cache and skipping the vertex computation. There is one condition for hitting this cache at all: you need to use indexed buffers. It won't work without them, because the index is part of this cache's key.
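To make the reuse argument concrete, here is a rough sketch that counts how often the vertex stage would run for a small index list. It assumes a simple FIFO cache keyed by index with a made-up size of 16 entries; real hardware differs and the actual policy and size are not something the sketch claims to know. The function name `count_vertex_shader_runs` is hypothetical.

```python
from collections import deque


def count_vertex_shader_runs(indices, cache_size=16):
    """Count misses in a toy FIFO model of a post-transform cache.

    Assumption: the cache is keyed by vertex index and evicts the
    oldest entry when full. Real GPUs may behave differently.
    """
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    runs = 0
    for index in indices:
        if index not in cache:
            runs += 1              # miss: the vertex has to be processed
            cache.append(index)
    return runs


# Two triangles forming a quad; vertices 0 and 2 are shared.
quad_indices = [0, 1, 2, 0, 2, 3]

indexed_runs = count_vertex_shader_runs(quad_indices)
# Without indices there is no cache key, so every vertex is processed.
non_indexed_runs = len(quad_indices)

print(indexed_runs, non_indexed_runs)
```

In this toy model the indexed quad needs four vertex computations instead of six. With a cache too small to hold the reused vertices the advantage disappears, which is why the order of the indices matters in practice.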
Also, you will likely save storage: an index can be small (1 or 2 bytes) and lets you reuse a full vertex specification. Suppose a vertex with all its attributes totals about 30 bytes and is shared by, say, 2 polygons. With indexed rendering (2-byte indices) this costs 2*index_size + attribute_size = 34 bytes. With non-indexed rendering it costs 60 bytes. Often your vertices will be shared more than twice.
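The arithmetic above can be written out as a small sketch. The helper names `indexed_bytes` and `non_indexed_bytes` are made up for illustration, and the model only counts raw vertex and index data, ignoring alignment or padding a driver might add.

```python
def indexed_bytes(unique_vertices, index_count, vertex_size, index_size=2):
    """Bytes for unique vertex data plus one index per vertex reference."""
    return unique_vertices * vertex_size + index_count * index_size


def non_indexed_bytes(vertex_count, vertex_size):
    """Bytes when every vertex reference stores the full vertex again."""
    return vertex_count * vertex_size


# The example from the text: one 30-byte vertex referenced by 2 polygons.
shared = indexed_bytes(unique_vertices=1, index_count=2, vertex_size=30)
duplicated = non_indexed_bytes(vertex_count=2, vertex_size=30)
print(shared, duplicated)  # 34 60

# The same vertex referenced 6 times, as is common inside a mesh.
print(indexed_bytes(1, 6, 30), non_indexed_bytes(6, 30))  # 42 180
```

The gap widens with every additional reference, since each one costs only an index instead of a full vertex.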
Is index-based rendering always better? No, there are scenarios where it's worse. For very simple applications it might not be worth the code overhead of setting up an index-based data model. Also, when your attributes are not shared between polygons (e.g. a normal per polygon instead of per vertex), there is likely no vertex sharing at all, and IBOs won't give a benefit, only overhead.
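The per-polygon normal case can be illustrated with a small deduplication sketch. Here a vertex is a tuple of (position, normal), and two vertices only count as the same if every attribute matches. The helper `build_index_buffer` and the normal values are illustrative assumptions, not taken from any particular library or mesh.

```python
def build_index_buffer(vertices):
    """Collapse identical vertices into a unique list plus indices."""
    unique = []
    lookup = {}   # vertex tuple -> position in `unique`
    indices = []
    for vertex in vertices:
        if vertex not in lookup:
            lookup[vertex] = len(unique)
            unique.append(vertex)
        indices.append(lookup[vertex])
    return unique, indices


p0, p1, p2, p3 = (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)

# Per-vertex normals: both triangles use the same normal at shared corners.
n = (0, 0, 1)
smooth = [(p0, n), (p1, n), (p2, n), (p0, n), (p2, n), (p3, n)]

# Per-polygon normals (illustrative values): each triangle has its own.
n_a, n_b = (0, 0, 1), (0, 1, 0)
flat = [(p0, n_a), (p1, n_a), (p2, n_a), (p0, n_b), (p2, n_b), (p3, n_b)]

smooth_unique, smooth_indices = build_index_buffer(smooth)
flat_unique, flat_indices = build_index_buffer(flat)
print(len(smooth_unique), smooth_indices)  # 4 [0, 1, 2, 0, 2, 3]
print(len(flat_unique), flat_indices)      # 6 [0, 1, 2, 3, 4, 5]
```

With per-polygon normals no vertex collapses, so the index buffer is pure extra data on top of the same six vertices.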
Besides that, while indexing enables the post-transform cache, it does make general memory cache performance worse. Because you access the attributes in a relatively random order, you may get considerably more cache misses, and memory prefetching (if the GPU does it) won't work well. So it may turn out (but measure) that, if you have enough memory and your vertex shader is extremely simple, the non-indexed version outperforms the indexed version.
The interleaved-versus-separate question is a bit more subtle, and I think it comes down to weighing some properties of your attributes.