How should you efficiently batch complex meshes?

Question

What is the best way to render complex meshes? I describe several solutions below and would like your opinion on them.

Let's take an example: how to render the 'Crytek-Sponza' mesh?

PS: I do not use an ubershader, only separate shaders.

If you download the mesh from the following link:

http://graphics.cs.williams.edu/data/meshes.xml

and load it in Blender, you'll see that the whole mesh is composed of about 400 sub-meshes, each with its own material/textures.

A naive renderer (version 1) will render each of the 400 sub-meshes separately! That means (to simplify) 400 draw calls, each with its own material/texture binding. Very bad for performance. Very slow!

pseudo-code version_1:

foreach mesh in meshList // 400 iterations :(
  mesh->BindVBO();

  Material material = mesh->GetMaterial();
  Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

  bsdf->Bind();
  bsdf->SetMaterial(material);
  bsdf->SetTexture(material->GetTexture()); // bind texture

  mesh->Render();

Now, if we look at the materials being loaded, we notice that the Sponza is really composed of ONLY (if I remember correctly :)) 25 different materials!

So a smarter solution (version 2) would be to gather all the vertex/index data into batches (25 in our example) and store the VBO/IBO not in the sub-mesh classes but in a new class called Batch.

pseudo-code version_2:

foreach batch in batchList // 25 iterations :)
  batch->BindVBO();

  Material material = batch->GetMaterial();
  Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

  bsdf->Bind();
  bsdf->SetMaterial(material);
  bsdf->SetTexture(material->GetTexture()); // bind texture

  batch->Render();

In this case each VBO contains data that share exactly the same texture/material settings!
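The grouping step behind version 2 could look roughly like the sketch below. It is CPU-side only and makes several assumptions that are not in the question: `SubMesh`, `Batch` and `BuildBatches` are hypothetical names, the material is identified by a plain string, and a vertex is just three floats.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical CPU-side types; none of these names come from a real library.
struct SubMesh {
    std::string material;                // material name, used as the grouping key
    std::vector<float> vertices;         // assumption: 3 floats (position) per vertex
    std::vector<std::uint32_t> indices;  // indices local to this sub-mesh
};

struct Batch {
    std::string material;
    std::vector<float> vertices;
    std::vector<std::uint32_t> indices;
};

constexpr std::size_t kFloatsPerVertex = 3;

// Merge all sub-meshes that share a material into one batch per material.
std::vector<Batch> BuildBatches(const std::vector<SubMesh>& meshes) {
    std::map<std::string, Batch> byMaterial;
    for (const SubMesh& m : meshes) {
        Batch& b = byMaterial[m.material];
        b.material = m.material;
        // Indices of the appended sub-mesh must be shifted past the
        // vertices that are already stored in the batch.
        const auto base =
            static_cast<std::uint32_t>(b.vertices.size() / kFloatsPerVertex);
        b.vertices.insert(b.vertices.end(), m.vertices.begin(), m.vertices.end());
        for (std::uint32_t i : m.indices) {
            b.indices.push_back(base + i);
        }
    }
    std::vector<Batch> out;
    for (auto& kv : byMaterial) {
        out.push_back(std::move(kv.second));
    }
    return out;
}
```

Each resulting `Batch` would then be uploaded once and drawn with a single call, which is where the drop from about 400 iterations to about 25 comes from.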

That's so much better! But now I think 25 VBOs to render the Sponza is too many! The problem is the number of buffer bindings needed to render the Sponza! I think a good solution would be to allocate a new VBO only when the current one is 'full' (for example, let's assume that the maximum size of a VBO, a value defined as an attribute of the VBO class, is 4MB or 8MB).

pseudo-code version_3:

foreach vbo in vboList // for example 5 VBOs (depends on the maxVBOSize)
  vbo->Bind();

  BatchList batchList = vbo->GetBatchList();

  foreach batch in batchList
    Material material = batch->GetMaterial();
    Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

    bsdf->Bind();
    bsdf->SetMaterial(material);
    bsdf->SetTexture(material->GetTexture()); // bind texture

    batch->Render();

In this case each VBO does not necessarily contain data that share exactly the same texture/material settings! It depends on the sub-mesh loading order!

So OK, there are fewer VBO/IBO bindings but not necessarily fewer draw calls! (Do you agree with this statement?) But in general I think version 3 is better than the previous one! What do you think?

Another optimization would be to store all the textures (or groups of textures) of the Sponza model in texture array(s)! But if you download the Sponza package you will see that the textures have different sizes! So I think they can't be bound together because of their format differences.

But if it's possible, version 4 of the renderer would need far fewer texture bindings than the 25 used for the whole mesh! Do you think it's possible?

So, in your opinion, what is the best way to render the Sponza mesh? Do you have another suggestion?

Nicol Bolas · Accepted Answer

You are focused on the wrong things. In two ways.

First, there's no reason you can't stick all of the mesh's vertex data into a single buffer object. Note that this has nothing to do with batching. Remember: batching is about the number of draw calls, not the number of buffers you use. You can render 400 draw calls out of the same buffer.

This "maximum size" that you seem to want to have is a fiction, based on nothing from the real world. You can have it if you want. Just don't expect it to make your code faster.

So when rendering this mesh, there is no reason to be switching buffers at all.
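As a rough illustration of the single-buffer idea, the sketch below packs every mesh into one vertex array and one index array and records, per mesh, the range a draw call would need. The names `MeshData`, `DrawRange`, `PackedGeometry` and `PackIntoSingleBuffer` are made up for this example, and it stays on the CPU side: uploading the two arrays and issuing the draws is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side mesh representation.
struct MeshData {
    std::vector<float> vertices;         // tightly packed vertex attributes
    std::vector<std::uint32_t> indices;  // indices local to this mesh
};

// Where one mesh lives inside the shared arrays.
struct DrawRange {
    std::size_t firstIndex;  // offset into the shared index array, in indices
    std::size_t indexCount;  // number of indices to draw
    std::size_t baseVertex;  // value added to every index at draw time
};

struct PackedGeometry {
    std::vector<float> vertices;
    std::vector<std::uint32_t> indices;
    std::vector<DrawRange> ranges;  // one entry per input mesh, same order
};

// Append all meshes to one vertex array and one index array.
// Indices are stored unmodified; the per-range baseVertex is meant to be
// passed to a base-vertex draw call such as glDrawElementsBaseVertex,
// with firstIndex * sizeof(std::uint32_t) as the byte offset.
PackedGeometry PackIntoSingleBuffer(const std::vector<MeshData>& meshes,
                                    std::size_t floatsPerVertex) {
    PackedGeometry out;
    for (const MeshData& m : meshes) {
        DrawRange r;
        r.firstIndex = out.indices.size();
        r.indexCount = m.indices.size();
        r.baseVertex = out.vertices.size() / floatsPerVertex;
        out.ranges.push_back(r);
        out.vertices.insert(out.vertices.end(), m.vertices.begin(), m.vertices.end());
        out.indices.insert(out.indices.end(), m.indices.begin(), m.indices.end());
    }
    return out;
}
```

With this layout the buffer is bound once, and each of the 400 sub-meshes (or 25 batches) is just a different `DrawRange` into it.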

Second, batching is not really about the number of draw calls (in OpenGL). It's really about the cost of the state changes between draw calls.

This video clearly spells out (about 31 minutes in) the relative cost of different state changes. Issuing two draw calls with no state changes between them is cheap (relatively speaking). But different kinds of state changes have different costs.

The cost of changing buffer bindings is quite small (assuming you're using separate vertex formats, so that changing buffers doesn't mean changing vertex formats). The cost of changing programs and even texture bindings is far greater. So even if you had to make multiple buffer objects (which again, you don't have to), that's not going to be the primary bottleneck.
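One way to act on that ordering is to sort the draw list so that the state assumed to be most expensive changes least often. The sketch below is a minimal illustration under that assumption: `DrawItem`, `SortByStateCost` and `CountStateChanges` are hypothetical names, the integers merely stand in for GL object names, and the program > texture > buffer ranking is taken from the paragraph above rather than measured.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw record; the integers stand in for GL object names.
struct DrawItem {
    unsigned program;
    unsigned texture;
    unsigned buffer;
};

struct StateChanges {
    std::size_t program = 0;
    std::size_t texture = 0;
    std::size_t buffer = 0;
};

// Sort by program first, then texture, then buffer, so the state assumed
// to be the most expensive to switch varies the slowest.
void SortByStateCost(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return std::tie(a.program, a.texture, a.buffer) <
                         std::tie(b.program, b.texture, b.buffer);
              });
}

// Count how often each kind of state differs between consecutive draws.
StateChanges CountStateChanges(const std::vector<DrawItem>& items) {
    StateChanges c;
    for (std::size_t i = 1; i < items.size(); ++i) {
        if (items[i].program != items[i - 1].program) ++c.program;
        if (items[i].texture != items[i - 1].texture) ++c.texture;
        if (items[i].buffer != items[i - 1].buffer) ++c.buffer;
    }
    return c;
}
```

Comparing `CountStateChanges` before and after sorting gives a quick feel for how many program switches a given draw order costs. Note that sorting on the program can increase the number of texture switches, which is the trade-off this ordering deliberately accepts.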

So if performance is your goal, you'd be better off focusing on the expensive state changes, not the cheap ones. Make a single shader that can handle all of the material settings for the entire mesh, so that you only need to change uniforms between draw calls. Use array textures so that you only have one texture binding call. This will turn a texture bind into a uniform setting, which is a much cheaper state change.
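On the question's worry about differently sized textures: all layers of an array texture share one width and height, so one possible approach (an assumption of this sketch, not something stated above) is to build one array per distinct size and give each material an (array, layer) pair to set as a uniform. `TextureInfo`, `ArraySlot`, `ArrayPlan` and `PlanTextureArrays` are hypothetical names, and the sketch ignores internal format and mipmap count, which would also have to match within an array.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a loaded texture.
struct TextureInfo {
    std::string name;
    int width;
    int height;
};

// Where a texture ends up: which array, and which layer inside it.
struct ArraySlot {
    std::size_t arrayIndex;
    std::size_t layer;
};

struct ArrayPlan {
    std::vector<std::pair<int, int>> arraySizes;  // (width, height) per array
    std::map<std::string, ArraySlot> slots;       // texture name -> slot
};

// Group textures by dimensions; each group becomes one array texture.
ArrayPlan PlanTextureArrays(const std::vector<TextureInfo>& textures) {
    ArrayPlan plan;
    std::map<std::pair<int, int>, std::size_t> arrayBySize;
    std::vector<std::size_t> layerCount;  // layers used so far, per array
    for (const TextureInfo& t : textures) {
        if (plan.slots.count(t.name) != 0) {
            continue;  // this texture was already placed
        }
        const std::pair<int, int> size(t.width, t.height);
        auto it = arrayBySize.find(size);
        if (it == arrayBySize.end()) {
            // First texture with these dimensions: open a new array.
            it = arrayBySize.emplace(size, plan.arraySizes.size()).first;
            plan.arraySizes.push_back(size);
            layerCount.push_back(0);
        }
        const std::size_t arrayIndex = it->second;
        plan.slots[t.name] = ArraySlot{arrayIndex, layerCount[arrayIndex]++};
    }
    return plan;
}
```

If the Sponza textures fall into only a handful of sizes, this gives a handful of texture bindings for the whole mesh. Resizing everything to a common size to get a single array is another option, at the cost of memory or quality.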

There are even fancier things you can do, involving base instance counts and the like. But that's overkill for a trivial example like this.

Tags:

opengl

glsl

shader

user1364743
