
How to optimize a VBO/IBO to maximize GPU cache usage

I am generating a mesh from volumetric data using the Marching Cubes algorithm running on CUDA.

I have tried saving the mesh and rendering it in 3 ways.

  1. Save a crude set of triangles as a contiguous array of vertex data. I estimate the size in the first pass, create an OpenGL VBO, map it to CUDA, and write the vertex data to it in the format below:

V0x, V0y, V0z, N0x, N0y, N0z, V1x, V1y, V1z, N1x, N1y, N1z, ...

and draw it using glDrawArrays().

Redundant Vertices in VBO, Redundant Vertices per Cube, No Indices.

  2. Take the mesh from method 1, use thrust::sort() and thrust::unique() to remove redundant vertices, and compute indices using thrust::lower_bound(). Save the results to an OpenGL VBO/IBO mapped to CUDA. Draw the model using glDrawElements().

No Redundant Vertices in VBO, Generated Indices.

  3. Generate a unique list of vertices per cube and store them in a VBO, along with the indices that form their triangles in the IBO. Render using glDrawElements().

Redundant Vertices in VBO, Unique Vertices per Cube, Generated Indices per Cube
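The sort / unique / lower_bound pipeline of method 2 can be illustrated on the host with the standard-library counterparts of the Thrust calls. This is only a minimal sketch, not the code from the question: the `Vec3` type, the `IndexedMesh` struct, and the `buildIndexedMesh` name are hypothetical, and normals are left out for brevity.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical position-only vertex, ordered lexicographically by (x, y, z).
struct Vec3 {
    float x, y, z;
};

inline bool operator<(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
}

inline bool operator==(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) == std::tie(b.x, b.y, b.z);
}

// Hypothetical container for the deduplicated result.
struct IndexedMesh {
    std::vector<Vec3> vertices;          // unique vertices, in sorted order
    std::vector<std::uint32_t> indices;  // one index per vertex of the input soup
};

// Host-side stand-in for the thrust::sort / thrust::unique /
// thrust::lower_bound steps: deduplicate a triangle soup and build indices.
IndexedMesh buildIndexedMesh(const std::vector<Vec3>& soup) {
    IndexedMesh mesh;
    mesh.vertices = soup;

    // Sort so that duplicates become adjacent, then drop them.
    std::sort(mesh.vertices.begin(), mesh.vertices.end());
    mesh.vertices.erase(
        std::unique(mesh.vertices.begin(), mesh.vertices.end()),
        mesh.vertices.end());

    // Each original vertex looks up its slot in the sorted unique array.
    mesh.indices.reserve(soup.size());
    for (const Vec3& v : soup) {
        auto it = std::lower_bound(mesh.vertices.begin(), mesh.vertices.end(), v);
        mesh.indices.push_back(
            static_cast<std::uint32_t>(it - mesh.vertices.begin()));
    }
    return mesh;
}
```

Note that the unique vertices end up in sorted coordinate order, while the indices keep the original triangle order, so consecutive triangles may reference vertices that are far apart in the VBO.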

The FPS I get for the same dataset at the same iso-value is:

Method 1 : 92  FPS, 30,647,016 Verts,          0 Indices
Method 2 : 122 FPS,  6,578,066 Verts, 30,647,016 Indices
Method 3 : 140 FPS, 20,349,880 Verts, 30,647,016 Indices

Even though Method 2 yields the fewest vertices, its FPS is lower than Method 3's. I believe this is because its indices are in an order that makes poor use of the GPU cache, whereas the index order of Method 3 uses the cache better, hence the higher FPS.
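One way to probe this hypothesis offline is to estimate the average cache miss ratio (ACMR, simulated vertex-shader invocations per triangle) of an index buffer. The sketch below assumes a simple fixed-size FIFO post-transform cache; real GPUs may behave differently, and the `simulateAcmr` name and the cache size passed to it are illustrative choices, not measured hardware parameters. It also says nothing about how well the vertex fetches themselves are laid out in memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Estimate the ACMR of a triangle-list index buffer under an assumed
// FIFO post-transform cache holding `cacheSize` vertices.
// Returns misses per triangle: 3.0 means no reuse at all, and lower is better.
double simulateAcmr(const std::vector<std::uint32_t>& indices,
                    std::size_t cacheSize) {
    const std::size_t triangleCount = indices.size() / 3;
    if (triangleCount == 0) {
        return 0.0;
    }

    std::deque<std::uint32_t> fifo;  // oldest entry at the front
    std::size_t misses = 0;

    for (std::size_t i = 0; i < triangleCount * 3; ++i) {
        const std::uint32_t idx = indices[i];
        const bool hit = std::find(fifo.begin(), fifo.end(), idx) != fifo.end();
        if (!hit) {
            ++misses;
            fifo.push_back(idx);
            if (fifo.size() > cacheSize) {
                fifo.pop_front();  // evict the oldest vertex
            }
        }
    }
    return static_cast<double>(misses) / static_cast<double>(triangleCount);
}
```

Running this on the index buffers of Method 2 and Method 3 with a few plausible cache sizes would show whether their post-transform reuse actually differs, or whether the gap is more likely to come from vertex fetch locality.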

How can I modify method 2 to yield a higher FPS?

Harish asked Jul 05 '15 19:07


1 Answer

Two things can help:

  • optimizing data cache usage by storing the vertices roughly in the order in which you will draw them
  • optimizing post-transform cache usage by reordering the indices (there are published algorithms for this, and implementations can probably be found on the net)
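The first suggestion can be sketched as a remapping pass that stores vertices in the order the index buffer first references them, so that consecutive triangles read nearby memory. This is a minimal host-side illustration under stated assumptions: the `Vertex` layout mirrors the interleaved position/normal format from the question, `reorderVerticesByFirstUse` is a hypothetical name, and every index is assumed to be in range.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Interleaved position + normal, matching the layout in the question.
struct Vertex {
    float px, py, pz;
    float nx, ny, nz;
};

// Reorder `vertices` so they appear in the order the index buffer first
// uses them, and rewrite `indices` to point at the new positions.
// Assumes every index is < vertices.size(). Vertices that are never
// referenced are dropped.
void reorderVerticesByFirstUse(std::vector<Vertex>& vertices,
                               std::vector<std::uint32_t>& indices) {
    constexpr std::uint32_t kUnassigned =
        std::numeric_limits<std::uint32_t>::max();

    std::vector<std::uint32_t> remap(vertices.size(), kUnassigned);
    std::vector<Vertex> reordered;
    reordered.reserve(vertices.size());

    for (std::uint32_t& idx : indices) {
        if (remap[idx] == kUnassigned) {
            // First time this vertex is drawn: give it the next free slot.
            remap[idx] = static_cast<std::uint32_t>(reordered.size());
            reordered.push_back(vertices[idx]);
        }
        idx = remap[idx];
    }
    vertices.swap(reordered);
}
```

The triangle order is left unchanged, so this pass only affects vertex fetch locality. If a post-transform optimization is also applied, it would normally run first, since it changes the order in which vertices are first used.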
Jerem answered Sep 17 '22 04:09