
How do I see the GPU's bottleneck in a complex algorithm?

I'm using GLSL fragment shaders for GPGPU calculations (I have my reasons).

In Nsight I see that I'm issuing 1600 drawcalls per frame.

There could be 3 bottlenecks:

  • Fillrate
  • Just too many drawcalls
  • GPU stalls due to my GPU->CPU downloads and CPU->GPU uploads

How do I find which one it is?

If my algorithm were simple (e.g. a Gaussian blur), I could force the viewport of each drawcall to 1x1 and, depending on how much the speed changed, confirm or rule out a fillrate problem.

In my case, though, that would require changing the entire algorithm.
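To make the viewport test described above concrete, here is a minimal sketch of the comparison it relies on. The function names and the 50% cutoff are hypothetical and chosen only for illustration. The frame times are assumed to be measured separately (for example, read off the profiler), and nothing here talks to the GPU.

```python
def fraction_saved(baseline_ms, experiment_ms):
    """Fraction of the baseline frame time that the experiment removed."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (baseline_ms - experiment_ms) / baseline_ms


def fill_rate_suspected(baseline_ms, tiny_viewport_ms, min_saving=0.5):
    """Return True if shrinking the viewport to 1x1 saved at least min_saving.

    min_saving is an arbitrary illustrative cutoff, not a measured constant.
    """
    return fraction_saved(baseline_ms, tiny_viewport_ms) >= min_saving


# Made-up frame times: 20 ms normally, 4 ms with every viewport forced to 1x1.
print(fill_rate_suspected(20.0, 4.0))   # large saving: fillrate is a plausible culprit
print(fill_rate_suspected(20.0, 19.0))  # almost no saving: look at drawcalls or transfers
```

If the frame time barely moves, per-pixel work is probably not the limit, which leaves the drawcall count and the CPU/GPU transfers as candidates.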

Stefan Monov asked May 12 '26 20:05


1 Answer

Since you mention NVIDIA's Nsight tool, you could follow the procedure explained in the NVIDIA blog post linked below.

It explains how to read hardware performance counters and use them to identify performance bottlenecks.

The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload:

https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/
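As a rough illustration of counter-based triage, the sketch below picks the hardware unit with the highest percentage of peak throughput and suggests a direction. The function name, the sample values, and the 80%/60% cutoffs are assumptions made for this example (the cutoffs are a rule of thumb, not values confirmed against the post), so check the post itself before relying on them. The percentages are assumed to be copied by hand from the profiler.

```python
def triage(unit_percentages, high=80.0, low=60.0):
    """Pick the busiest unit and suggest where to look next.

    unit_percentages maps a unit name to its percentage of peak throughput.
    high and low are illustrative cutoffs, not authoritative values.
    """
    if not unit_percentages:
        raise ValueError("need at least one unit")
    top_unit = max(unit_percentages, key=unit_percentages.get)
    top = unit_percentages[top_unit]
    if top > high:
        advice = "remove work from this unit"
    elif top < low:
        advice = "look for stalls that keep this unit idle"
    else:
        advice = "try both: remove work and look for stalls"
    return top_unit, advice


# Made-up counter values for one range of drawcalls.
print(triage({"SM": 85.0, "L2": 40.0, "VRAM": 30.0}))
print(triage({"SM": 35.0, "L2": 45.0, "VRAM": 20.0}))
```

A low top percentage is consistent with the GPU waiting on something else, such as the uploads and downloads mentioned in the question, but the counters alone do not prove that.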

rotoglup answered May 14 '26 18:05


