OpenGL 4.0 added tessellation shaders. What is the difference between per-primitive tessellation and VAO vertex preloading in terms of performance? By VAO vertex preloading I mean tessellating the surface ahead of time, loading all the resulting vertex data into a VAO and rendering it while keeping the tessellated surface in memory, instead of creating the tessellated surface on the fly through the shader pipeline.
The answer is very definitely "it depends."
OK, you've got a bunch of control points for a surface (Bezier, NURBS, Catmull-Clark, ...). In classic OpenGL you'd tessellate the control points on the CPU and store the triangles in vertex buffer objects referenced by a VAO. With OpenGL 4 you can pass the control points themselves through a VAO, render as GL_PATCHES, and have the tessellation control and evaluation shaders generate the triangles.
If the mesh of control points is static or rarely changes, the CPU solution is better because you only do the tessellation once and store the results, instead of recalculating exactly the same tessellation each frame.
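As a rough illustration of the "tessellate once on the CPU" route, here is a minimal sketch that evaluates a single bicubic Bezier patch on a regular grid. The names `Vec3`, `Patch`, `bernstein` and `tessellatePatch` are made up for this example, and a bicubic Bezier patch is just one assumed surface type; the resulting vertex array is what you would upload to a vertex buffer once and then reuse every frame.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal vector type for this sketch.
struct Vec3 { float x, y, z; };

// 4x4 control points of one bicubic Bezier patch, stored row-major.
using Patch = std::array<Vec3, 16>;

// Cubic Bernstein basis weights at parameter t in [0, 1].
static std::array<float, 4> bernstein(float t) {
    const float s = 1.0f - t;
    return { s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t };
}

// Evaluate the patch on an (n+1) x (n+1) grid of surface points.
// This runs once; the caller keeps the result (e.g. in a VBO).
std::vector<Vec3> tessellatePatch(const Patch& p, int n) {
    assert(n >= 1);
    std::vector<Vec3> verts;
    verts.reserve(static_cast<std::size_t>(n + 1) * static_cast<std::size_t>(n + 1));
    for (int j = 0; j <= n; ++j) {
        const std::array<float, 4> bv = bernstein(static_cast<float>(j) / n);
        for (int i = 0; i <= n; ++i) {
            const std::array<float, 4> bu = bernstein(static_cast<float>(i) / n);
            Vec3 v{0.0f, 0.0f, 0.0f};
            // Weighted sum of all 16 control points.
            for (int r = 0; r < 4; ++r) {
                for (int c = 0; c < 4; ++c) {
                    const float w = bv[r] * bu[c];
                    const Vec3& cp = p[r * 4 + c];
                    v.x += w * cp.x;
                    v.y += w * cp.y;
                    v.z += w * cp.z;
                }
            }
            verts.push_back(v);
        }
    }
    return verts;
}
```

Triangle indices for the grid are left out to keep the sketch short. The point is only that this loop is paid once for a static mesh, whereas the tessellation shaders would redo the equivalent work every frame.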
If the mesh changes, then the GPU solution will be better. Copying new control points from CPU to GPU is quicker because there is less data to transfer, and the GPU tessellates in parallel, so it will usually be faster. GPU-side tessellation can also change the level of detail by distance (or whatever) at individual quads within the mesh, so you can implement ROAM-type algorithms with shaders.
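To put rough numbers on the "less data" point and on distance-based level of detail, here is a small sketch. Everything in it is an assumption for illustration: position-only vertices of three floats, 16 control points per patch, a linear distance-to-level mapping, and a default maximum level of 64. The function names are hypothetical. In a real renderer the level calculation would normally live in the tessellation control shader; this is just the same arithmetic written on the CPU side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed vertex layout: position only, three floats.
constexpr std::size_t kBytesPerVertex = 3 * sizeof(float);

// Bytes to upload when only the control points change
// (assumes 16 control points per bicubic patch).
constexpr std::size_t controlPointBytes(std::size_t patches) {
    return patches * 16 * kBytesPerVertex;
}

// Bytes to upload when each patch is pre-tessellated on the CPU
// into an (n+1) x (n+1) vertex grid (index data not counted).
constexpr std::size_t tessellatedBytes(std::size_t patches, std::size_t n) {
    return patches * (n + 1) * (n + 1) * kBytesPerVertex;
}

// Linear distance-based level of detail: maxLevel at or inside nearDist,
// falling to 1 at or beyond farDist. One possible mapping among many.
float tessLevelForDistance(float distance, float nearDist, float farDist,
                           float maxLevel = 64.0f) {
    assert(farDist > nearDist && maxLevel >= 1.0f);
    float t = (distance - nearDist) / (farDist - nearDist);
    t = std::max(0.0f, std::min(1.0f, t));  // clamp to [0, 1]
    return maxLevel + t * (1.0f - maxLevel);
}
```

With these assumptions, 100 patches cost 19,200 bytes as control points versus 346,800 bytes as vertices pre-tessellated at a level of 16, which is why re-uploading control points for a changing mesh is so much cheaper.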
The exception is when your GPU is already heavily loaded, say by a complex lighting and shadowing algorithm, while the CPU is lightly loaded. In that case, doing tessellation on the CPU gives better overall system performance even if the tessellation itself takes longer.
Which is you? Try it and see!
(And also be aware that it might not matter at all. Modern 3D systems are awesomely fast.)