I am generating a mesh from volumetric data using the Marching Cubes algorithm running on CUDA.
I have tried saving the mesh and rendering it in 3 ways.
Method 1: Save the triangles to a VBO as interleaved positions and normals (V0x, V0y, V0z, N0x, N0y, N0z, V1x, V1y, V1z, N1x, N1y, N1z, ...) and draw it using glDrawArrays(). Redundant vertices in VBO, redundant vertices per cube, no indices.
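As a rough illustration of the interleaved layout in Method 1, the sketch below describes one vertex as a struct and derives the stride and normal offset from it. The names Vertex, kStride, kNormalOffset and triangleCount are hypothetical and not taken from the original code; the struct assumes six tightly packed 32-bit floats per vertex.

```cpp
#include <cstddef>

// Hypothetical interleaved vertex: position followed by normal,
// matching the V0x, V0y, V0z, N0x, N0y, N0z, ... layout.
struct Vertex {
    float px, py, pz;  // position
    float nx, ny, nz;  // normal
};

// Byte stride between consecutive vertices, and byte offset of the normal
// inside one vertex. These are the values a vertex attribute setup needs.
constexpr std::size_t kStride = sizeof(Vertex);
constexpr std::size_t kNormalOffset = offsetof(Vertex, nx);

// A non-indexed triangle list uses three vertices per triangle.
constexpr std::size_t triangleCount(std::size_t vertexCount) {
    return vertexCount / 3;
}
```

With this layout, the 30,647,016 vertices reported for Method 1 correspond to 10,215,672 triangles, and every shared vertex is stored and processed once per triangle that uses it.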
Method 2: Use thrust::sort() and thrust::unique() to remove redundant vertices, compute indices using thrust::lower_bound(), and save the results to an OpenGL VBO/IBO mapped to CUDA. Draw the model using glDrawElements(). No redundant vertices in VBO, generated indices.
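The sort / unique / lower_bound pipeline of Method 2 can be sketched on the CPU with the std:: counterparts of the Thrust calls. This is only an illustration under assumptions: Vec3, IndexedMesh and buildIndexedMesh are hypothetical names, vertices are keyed by position alone (normals are left out), and the real code would run the Thrust versions on device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Lexicographic ordering and exact equality, used by sort, unique and lower_bound.
inline bool operator<(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
}
inline bool operator==(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) == std::tie(b.x, b.y, b.z);
}

struct IndexedMesh {
    std::vector<Vec3> vertices;          // unique vertices (VBO contents)
    std::vector<std::uint32_t> indices;  // one index per input vertex (IBO contents)
};

// Turns a triangle soup (three vertices per triangle, duplicates allowed)
// into unique vertices plus an index list in the original triangle order.
IndexedMesh buildIndexedMesh(const std::vector<Vec3>& soup) {
    IndexedMesh mesh;
    mesh.vertices = soup;
    std::sort(mesh.vertices.begin(), mesh.vertices.end());
    mesh.vertices.erase(std::unique(mesh.vertices.begin(), mesh.vertices.end()),
                        mesh.vertices.end());

    mesh.indices.reserve(soup.size());
    for (const Vec3& v : soup) {
        // Binary search for the vertex's position in the sorted unique array.
        auto it = std::lower_bound(mesh.vertices.begin(), mesh.vertices.end(), v);
        assert(it != mesh.vertices.end() && *it == v);
        mesh.indices.push_back(static_cast<std::uint32_t>(it - mesh.vertices.begin()));
    }
    return mesh;
}
```

Note that the sort reorders the vertex array lexicographically by position, so the three vertices of one triangle may end up far apart in the VBO even though the triangle order in the index list is unchanged.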
Method 3: Draw the model using glDrawElements(). Redundant vertices in VBO, unique vertices per cube, generated indices per cube.
The FPS I get for the same dataset at the same iso-value is:
Method 1 : 92 FPS, 30,647,016 Verts, 0 Indices
Method 2 : 122 FPS, 6,578,066 Verts, 30,647,016 Indices
Method 3 : 140 FPS, 20,349,880 Verts, 30,647,016 Indices
Even though Method 2 yields the fewest vertices, its FPS is lower than that of Method 3. I believe this is because its indices are in an order that makes poor use of the GPU cache. The index order of Method 3 makes better use of the GPU cache, hence the higher FPS.
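One way to test this hypothesis offline is to simulate a small vertex cache over each index buffer and compare the average number of cache misses per triangle. The sketch below assumes a simple FIFO cache; the function name averageCacheMissRatio and the FIFO model are assumptions for illustration, and real GPUs may behave differently. It also only models index reuse, not how scattered the vertex fetches are in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Average cache miss ratio: simulated vertex cache misses per triangle,
// for a FIFO cache that holds up to `cacheSize` vertex indices.
double averageCacheMissRatio(const std::vector<std::uint32_t>& indices,
                             std::size_t cacheSize) {
    assert(indices.size() % 3 == 0 && cacheSize > 0);
    if (indices.empty()) return 0.0;

    std::deque<std::uint32_t> cache;  // front = oldest entry
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end()) {
            continue;  // hit: a FIFO cache does not reorder on reuse
        }
        ++misses;
        cache.push_back(idx);
        if (cache.size() > cacheSize) cache.pop_front();  // evict the oldest
    }
    return static_cast<double>(misses) /
           static_cast<double>(indices.size() / 3);
}
```

A value of 3.0 means no reuse at all (as in a non-indexed triangle list), and lower values mean more reuse. Running it over the index buffers of Methods 2 and 3 with a few assumed cache sizes (for example 16 or 32 entries) would show whether index reuse alone explains the FPS gap.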
How can I modify Method 2 to yield a higher FPS?
Two things can help: