I'm using GLSL fragment shaders for GPGPU calculations (I have my reasons).
In Nsight I see that I'm doing 1600 draw calls per frame.
There could be 3 bottlenecks:
How do I find which one it is?
If my algorithm were simple (e.g. a Gaussian blur or something), I could force the viewport of each draw call to be 1x1 and, depending on how much the speed changed, confirm or rule out a fillrate problem.
In my case, though, that would require changing the entire algorithm.
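For cases where the 1x1 viewport test is feasible, interpreting it comes down to comparing two frame times. The sketch below is only an illustration of that comparison: the function name `classify_viewport_test` and the 50% threshold are assumptions chosen for the example, not values from any tool or vendor guideline.

```python
def classify_viewport_test(full_ms: float, tiny_ms: float,
                           threshold: float = 0.5) -> str:
    """Interpret the 1x1 viewport experiment.

    full_ms   -- frame time with the normal viewport
    tiny_ms   -- frame time with every draw call forced to a 1x1 viewport
    threshold -- fraction of frame time that must disappear before we
                 call the workload fill-rate bound (assumed value)
    """
    if full_ms <= 0 or tiny_ms < 0:
        raise ValueError("frame times must be positive")

    # Fraction of the frame time that went away once almost no
    # fragments were shaded.
    saved = (full_ms - tiny_ms) / full_ms

    if saved >= threshold:
        return "likely fill-rate bound"
    return "fill rate unlikely; suspect draw-call overhead or stalls"


if __name__ == "__main__":
    print(classify_viewport_test(16.0, 2.0))
    print(classify_viewport_test(16.0, 15.0))
```

If the frame time barely changes with a 1x1 viewport, per-fragment work is not the dominant cost, which points toward the other candidate bottlenecks.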
Since you mention NVIDIA Nsight, you could follow the procedure explained in the NVIDIA blog post below.
It explains how to read hardware performance counters and use them to identify performance bottlenecks.
The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload:
https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/