I'm working on a Three.js WebGL scene and am noticing 60 FPS when I'm zoomed out so that all observations (~20,000 triangles) are in view, but very low FPS when I'm zoomed in so that only a small subset of the triangles are in view.
I'd like to figure out what's causing this discrepancy. My intuition is that the opposite should be true: I'd assume that when the user is zoomed in, the near and far clipping planes would remove many triangles from the scene, which would increase FPS. I want to figure out why this intuition is wrong in this scene.
How can one identify the full stack of calls used within a three.js program? Ideally I'd like to identify all the function/method calls and the time required to execute each one, so that I can figure out which portion of the shaders I'm working on is killing the FPS when the user is zoomed in.
GPUs spend their computing power in a few basic places, and two should be pretty obvious. One is running the vertex shader once per vertex. The other is running the fragment shader once per pixel/fragment.
There are almost always a ton more pixels than vertices. A single 1920x1080 screen is nearly 2 million pixels yet can be covered in a 3 vertex triangle or a 4 or 6 vertex quad (2 triangles). That means to cover the entire screen the vertex shader ran 3 to 6 times but the fragment shader ran 2 million times!!!
Sending too much work to the fragment shader is called being "fill bound". You've maxed out the fill rate (filling triangles with pixels), and that is what you're seeing. In the worst case, on my 2014 MacBook Pro, I might only be able to draw 6 or so screens' worth of pixels before I've hit the fill rate limit for updating the screen at 60 frames a second.
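To put rough numbers on this, here's the back-of-the-envelope math (the 6-screens figure is just the example budget above, not a real benchmark):

```javascript
// Fragment vs. vertex work for a fullscreen pass at 1920x1080.
const width = 1920, height = 1080;
const pixelsPerScreen = width * height;   // fragment shader invocations
console.log(pixelsPerScreen);             // 2073600, nearly 2 million

// A fullscreen quad drawn as 2 triangles uses 6 vertices.
const verticesForFullscreenQuad = 6;
console.log(pixelsPerScreen / verticesForFullscreenQuad); // 345600 fragments per vertex

// If a GPU can only fill ~6 screens' worth of pixels per frame at 60 fps,
// its usable fill rate works out to roughly:
const screensPerFrame = 6, fps = 60;
console.log(screensPerFrame * pixelsPerScreen * fps); // 746496000, ~750M pixels/second
```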
There are various solutions to this.
The first is the z-buffer. Before running the fragment shader for a pixel, the GPU tests the depth buffer; if the depth test fails, the GPU does not need to run the fragment shader at all. So, if you sort your opaque objects and draw them closest first to furthest last, then most of the pixels of those objects in the distance will fail the depth test. Note that this only works if your fragment shader does not write to gl_FragDepth and does not use the discard keyword.
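The sorting itself is simple: order opaque objects by their distance to the camera before issuing draw calls. Here's a minimal sketch of the idea, independent of three.js (the object names and positions are made up for illustration):

```javascript
// Sort opaque objects front to back so near objects fill the depth buffer
// first and the pixels of objects behind them fail the depth test.
const camera = { x: 0, y: 0, z: 10 };

// Squared distance is enough for ordering; it avoids a sqrt per object.
const distSq = (o) => {
  const dx = o.x - camera.x, dy = o.y - camera.y, dz = o.z - camera.z;
  return dx * dx + dy * dy + dz * dz;
};

const objects = [
  { name: 'farCube',    x: 0, y: 0, z: -50 },
  { name: 'nearSphere', x: 0, y: 0, z:   5 },
  { name: 'midWall',    x: 0, y: 0, z: -10 },
];

objects.sort((a, b) => distSq(a) - distSq(b));
console.log(objects.map(o => o.name)); // ['nearSphere', 'midWall', 'farCube']
```

With this order, the expensive fragment shader only runs for farCube's pixels where nothing closer already claimed the depth buffer.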
This is a method of "avoiding overdraw". Overdraw is any pixel that is drawn more than once. If you draw a cube in the distance and then draw a sphere up close such that it covers the cube then for every pixel that was rendered for the cube it was "overdrawn" by the sphere pixels. That was a waste of time.
If your fragment shaders are really complicated and therefore slow to run, some 3D engines will draw a "Z buffer pre-pass". They draw all the opaque geometry with the simplest possible vertex and fragment shader: the vertex shader only needs position, and the fragment shader just emits a constant value. They'll even turn off writes to the color buffer with gl.colorMask(false, false, false, false), or possibly use a depth-only framebuffer if that's supported by the hardware. This first pass fills out the depth buffer. When finished, they render everything again with the expensive shaders and the depth test set to LEQUAL (or whatever works for their engine). In this way every pixel's expensive shader runs at most once. Of course it's not free; it still takes GPU time to rasterize the triangles and test every pixel, but it can still be faster than the overdraw if the shaders are expensive.
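The two passes can be sketched against the raw WebGL API like this. The draw callbacks are placeholders for your own draw calls with the cheap and expensive shader programs bound; a real engine would manage shader switching and render state far more carefully:

```javascript
// Sketch of a depth pre-pass. `gl` is a WebGLRenderingContext;
// the two callbacks are placeholders for your actual draw calls.
function renderWithDepthPrepass(gl, drawWithCheapShader, drawWithExpensiveShader) {
  // Pass 1: depth only. Don't write color, just fill the depth buffer.
  gl.colorMask(false, false, false, false);
  gl.depthFunc(gl.LESS);
  drawWithCheapShader();

  // Pass 2: color. LEQUAL lets fragments at exactly the stored depth pass,
  // so each pixel's expensive fragment shader runs at most once.
  gl.colorMask(true, true, true, true);
  gl.depthFunc(gl.LEQUAL);
  drawWithExpensiveShader();
}
```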
Another way is to try to figure out which objects are going to be occluded by closer objects and not even submit them to the GPU. There are tons of ways to do this, usually involving bounding spheres and/or bounding boxes. Potentially visible set (PVS) techniques can also help with occlusion culling. You can even ask the GPU to compute some of this using occlusion queries, though those are only available in WebGL2.
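As one very simplified example of a bounding-sphere test, here's a conservative check for whether one sphere completely hides another from a point camera at the origin. The geometry here is my own sketch, not a three.js API, and it treats the camera as a point rather than accounting for the full perspective frustum:

```javascript
// Conservative occlusion test: is the candidate's bounding sphere entirely
// inside the cone the occluder sphere subtends at the camera (origin),
// and entirely behind it? If yes, the candidate can be skipped.
function isOccluded(occluder, candidate) {
  const len = (v) => Math.hypot(v.x, v.y, v.z);
  const dO = len(occluder.center), dC = len(candidate.center);

  // Candidate must be entirely behind the occluder.
  if (dC - candidate.radius <= dO + occluder.radius) return false;
  // Degenerate cases: camera inside one of the spheres.
  if (occluder.radius >= dO || candidate.radius >= dC) return false;

  // Angular radius of each sphere as seen from the camera.
  const angO = Math.asin(occluder.radius / dO);
  const angC = Math.asin(candidate.radius / dC);

  // Angle between the two view directions.
  const dot = (occluder.center.x * candidate.center.x +
               occluder.center.y * candidate.center.y +
               occluder.center.z * candidate.center.z) / (dO * dC);
  const between = Math.acos(Math.min(1, Math.max(-1, dot)));

  // Fully hidden if the candidate's cone fits inside the occluder's cone.
  return between + angC <= angO;
}

const wall    = { center: { x: 0, y: 0, z: -5 },  radius: 2 };
const hidden  = { center: { x: 0, y: 0, z: -10 }, radius: 0.5 };
const offAxis = { center: { x: 6, y: 0, z: -8 },  radius: 0.5 };
console.log(isOccluded(wall, hidden));  // true
console.log(isOccluded(wall, offAxis)); // false
```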
The easiest way to see if you're fill bound is to make your canvas tiny, like 2x1 pixels (or just size your browser window really small). If your app starts running fast, it's likely fill bound. If it's still running slow, it's either geometry bound (the vertex shader is doing too much work) or CPU bound (whatever work you're doing on the CPU, whether that's issuing WebGL commands or computing animation, collisions, physics, or whatever, is taking too long).
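When comparing the tiny canvas against the full-size one, it helps to turn raw frame timestamps (e.g. the values requestAnimationFrame hands you) into an average FPS. A small helper for that; the sample timestamps below are made up:

```javascript
// Average FPS from a list of frame timestamps in milliseconds,
// e.g. collected from requestAnimationFrame callbacks.
function averageFps(timestamps) {
  if (timestamps.length < 2) return 0;
  const elapsedMs = timestamps[timestamps.length - 1] - timestamps[0];
  const frames = timestamps.length - 1; // count intervals, not samples
  return (frames * 1000) / elapsedMs;
}

// Hypothetical samples: one frame every ~16.7 ms is ~60 fps.
const smooth = [0, 16.7, 33.4, 50.1, 66.8];
// One frame every 50 ms is 20 fps; a fill-bound app at full size might look like this.
const slow = [0, 50, 100, 150, 200];
console.log(averageFps(smooth).toFixed(1)); // "59.9"
console.log(averageFps(slow).toFixed(1));   // "20.0"
```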
In your case you're likely fill bound: when you're zoomed out all the triangles are small, so very few pixels are drawn and it runs fast, whereas when you're zoomed in lots of triangles cover the screen, so too many pixels are drawn and it runs slow.
There are no really "simple" solutions. It really just depends on what you're trying to do. Apparently you're using three.js; I know it can sort transparent objects, but I have no idea if it sorts opaque objects. The other techniques listed are, I believe, kind of outside the scope of three.js, and more up to your app to take things in and out of the scene or set their visibility to false, etc.
Note: here is a simple demo to show how little overdraw your GPU can handle. It just draws a bunch of fullscreen quads. By default it likely can't draw that many, especially at fullscreen size, before it can no longer hit 60fps. Turn on sorting front to back and it will be able to draw more and still hit 60fps.
Also note that drawing with blending enabled is slower than with blending disabled. This should be clear because without blending the GPU just writes the pixel, while with blending the GPU has to first read the destination pixel so that it can do the blending, and therefore it's slower.
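That read-then-write cost falls directly out of the blend equation itself. For the common SRC_ALPHA / ONE_MINUS_SRC_ALPHA setup, the math per color channel looks like this:

```javascript
// Standard alpha blending: out = src * srcAlpha + dst * (1 - srcAlpha).
// Note the `dst` term: the GPU must READ the existing framebuffer pixel
// before it can write the result, which is the extra bandwidth cost.
function blend(src, dst, srcAlpha) {
  return src * srcAlpha + dst * (1 - srcAlpha);
}

// 50% opaque white over black:
console.log(blend(1.0, 0.0, 0.5)); // 0.5
// Without blending the GPU would just write `src` and never read `dst`.
```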