I am generating a mesh from volumetric data using the Marching Cubes algorithm running on CUDA.
I have tried saving the mesh and rendering it in three ways.
Method 1: Save the vertex data (positions and normals, interleaved) in the format
V0x, V0y, V0z, N0x, N0y, N0z, V1x, V1y, V1z, N1x, N1y, N1z, ...
and draw it using glDrawArrays().
Result: redundant vertices in the VBO, redundant vertices per cube, no indices.
Method 2: Use thrust::sort() and thrust::unique() to remove redundant vertices, compute indices using thrust::lower_bound(), and save the results to an OpenGL VBO/IBO mapped to CUDA. Draw the model using glDrawElements().
Result: no redundant vertices in the VBO, generated indices.
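For readers who want to see the sort / unique / lower_bound idea in isolation, here is a minimal CPU-side sketch using the C++ standard library equivalents of the Thrust calls. It is not my actual CUDA code: the names Vec3, IndexedMesh and weldVertices are made up for illustration, normals are left out, and it assumes that shared vertices are bit-for-bit identical so that exact float comparison is enough.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical vertex type: position only, to keep the sketch short.
struct Vec3 {
    float x, y, z;
};

// Lexicographic order, used for both the sort and the binary search.
inline bool operator<(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
}

// Exact equality; assumes duplicated vertices are bitwise identical.
inline bool operator==(const Vec3& a, const Vec3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

struct IndexedMesh {
    std::vector<Vec3> vertices;          // unique vertices, in sorted order
    std::vector<std::uint32_t> indices;  // one index per input vertex
};

// Turns a triangle soup (3 vertices per triangle, duplicates allowed)
// into a unique vertex list plus an index list.
IndexedMesh weldVertices(const std::vector<Vec3>& soup) {
    IndexedMesh mesh;

    // Step 1: sort a copy so that duplicates become adjacent.
    mesh.vertices = soup;
    std::sort(mesh.vertices.begin(), mesh.vertices.end());

    // Step 2: drop adjacent duplicates.
    mesh.vertices.erase(
        std::unique(mesh.vertices.begin(), mesh.vertices.end()),
        mesh.vertices.end());

    // Step 3: for every original vertex, binary-search its slot in the
    // unique list; the slot number is the index.
    mesh.indices.reserve(soup.size());
    for (const Vec3& v : soup) {
        auto it = std::lower_bound(mesh.vertices.begin(), mesh.vertices.end(), v);
        assert(it != mesh.vertices.end() && *it == v);
        mesh.indices.push_back(
            static_cast<std::uint32_t>(it - mesh.vertices.begin()));
    }
    return mesh;
}
```

Note that the unique vertices end up ordered by coordinate, not by the order in which the triangles use them, while the index buffer keeps the original triangle order.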
Method 3: Draw the model using glDrawElements().
Result: redundant vertices in the VBO, unique vertices per cube, generated indices per cube.
The FPS I get for the same dataset at the same iso-value is:
Method 1 : 92 FPS, 30,647,016 Verts, 0 Indices
Method 2 : 122 FPS, 6,578,066 Verts, 30,647,016 Indices
Method 3 : 140 FPS, 20,349,880 Verts, 30,647,016 Indices
Even though Method 2 yields the fewest vertices, its FPS is lower than Method 3's. I believe this is because its indices are in an order that makes poor use of the GPU cache. The index order in Method 3 gets better cache reuse, hence the higher FPS.
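One way to check the cache theory without a GPU profiler is to replay an index buffer through a simulated vertex cache and count misses per triangle (often called ACMR, average cache miss ratio). The sketch below assumes a simple FIFO cache; the function name averageCacheMissRatio and the cache size are my own choices, and real hardware does not necessarily behave like a FIFO, so the result is only a relative indicator for comparing two index orders.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Replays a triangle-list index buffer through a simulated FIFO vertex
// cache and returns the number of cache misses per triangle.
// 3.0 means no reuse at all; lower values mean better reuse.
double averageCacheMissRatio(const std::vector<std::uint32_t>& indices,
                             std::size_t cacheSize) {
    // A triangle list must hold a multiple of three indices.
    assert(indices.size() % 3 == 0);
    if (indices.empty()) {
        return 0.0;
    }

    std::deque<std::uint32_t> cache;  // front = oldest entry
    std::size_t misses = 0;

    for (std::uint32_t idx : indices) {
        const bool hit =
            std::find(cache.begin(), cache.end(), idx) != cache.end();
        if (!hit) {
            ++misses;
            cache.push_back(idx);
            // Evict the oldest entry once the cache is over capacity.
            if (cache.size() > cacheSize) {
                cache.pop_front();
            }
        }
    }

    const std::size_t triangleCount = indices.size() / 3;
    return static_cast<double>(misses) / static_cast<double>(triangleCount);
}
```

Running this on the index buffers from Method 2 and Method 3 with the same assumed cache size (for example 16 or 32 entries) would show whether the index order alone differs enough to matter.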
How can I modify Method 2 to yield a higher FPS?
Two things can help: