I am currently using nvidia-smi, a tool shipped with NVIDIA's driver, for GPU performance monitoring. Running 'nvidia-smi -a' prints the current state of the GPU, including GPU core and memory usage, temperature and so on, like this:
==============NVSMI LOG==============

Timestamp                   : Tue Feb 22 22:39:09 2011
Driver Version              : 260.19.26

GPU 0:
    Product Name            : GeForce 8800 GTX
    PCI Device/Vendor ID    : 19110de
    PCI Location ID         : 0:4:0
    Board Serial            : 211561763875
    Display                 : Connected
    Temperature             : 55 C
    Fan Speed               : 47%
    Utilization
        GPU                 : 1%
        Memory              : 0%

I am curious how the GPU and memory utilization figures are defined. For example, suppose the GPU core's utilization is 47%. Does that mean 47% of the SMs are actively working, or that all the GPU cores are busy 47% of the time and idle the other 53%? For memory, does the utilization stand for the ratio between current bandwidth and maximum bandwidth, or for the fraction of the last time unit during which the memory was busy?
The performance state (Perf) ranges from P0 to P12, referring to maximum and minimum performance respectively. Persistence-M: the value of the Persistence Mode flag, where "On" means that the NVIDIA driver will remain loaded (persist) even when no active client such as nvidia-smi is running.
To monitor the overall GPU resource usage, click the Performance tab, scroll down the left pane, and find the "GPU" option. Here you can watch real-time usage. It displays separate graphs for the different kinds of work the system is doing, such as video encoding or gameplay.
Utilization rates report how busy each GPU is over time, and can be used to determine how much an application is using the GPUs in the system. GPU: percent of time over the past second during which one or more kernels was executing on the GPU.
GPU memory utilization: the percentage of time over the sample period during which the memory controller was busy.
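To make the time-based reading concrete, here is a minimal Python sketch that computes "percent of time during which one or more kernels was executing" from a list of kernel start/end times. The function name `busy_percent`, the millisecond timestamps, and the sample timelines are all made up for illustration; this is not how the driver computes the figure internally, only a model of the definition quoted above.

```python
def busy_percent(intervals, period_start, period_end):
    """Percent of the sample window during which at least one interval is active.

    intervals: iterable of (start, end) kernel execution times (hypothetical data).
    Overlapping kernels count only once: the result measures time, not the
    number of kernels or the fraction of SMs in use.
    """
    # Clip every interval to the sample window and drop those outside it.
    clipped = sorted(
        (max(s, period_start), min(e, period_end))
        for s, e in intervals
        if min(e, period_end) > max(s, period_start)
    )

    busy = 0.0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Gap before this interval: close the previous merged run.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlaps the current run: extend it.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        busy += cur_end - cur_start

    return 100.0 * busy / (period_end - period_start)


# One kernel running for 470 ms of a 1000 ms window, regardless of how many
# SMs it occupies, yields 47% under this definition.
print(busy_percent([(0, 470)], 0, 1000))

# Overlapping kernels are not double counted; time past the window is clipped.
print(busy_percent([(0, 300), (200, 470), (900, 1200)], 0, 1000))
```

Under this model a tiny kernel that keeps a single SM occupied for the whole window would still report 100%, which is why the figure says nothing about how many SMs are active.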
A post by a moderator on the NVIDIA forums says the GPU utilization and memory utilization figures are based on activity over the last second:
GPU busy is actually the percentage of time over the last second the SMs were busy, and the memory utilization is actually the percentage of bandwidth used during the last second. Full memory consumption statistics come with the next release.
You can refer to this official API document: http://docs.nvidia.com/deploy/nvml-api/structnvmlUtilization__t.html#structnvmlUtilization__t
It says: "Percent of time over the past sample period during which one or more kernels was executing on the GPU."
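If you want these two numbers programmatically rather than from the nvidia-smi text output, the same NVML structure can be read from Python. The sketch below assumes the pynvml bindings (distributed as the nvidia-ml-py package) are installed and that a GPU with a working driver is present; neither is mentioned in the original post. The helper name `read_utilization` is hypothetical, and the binding is passed in as a parameter so the function can be exercised without a GPU.

```python
def read_utilization(nvml, index=0):
    """Return (gpu_percent, memory_percent) for one device.

    nvml: a module or object exposing the pynvml-style functions
          nvmlInit, nvmlDeviceGetHandleByIndex,
          nvmlDeviceGetUtilizationRates and nvmlShutdown.
    """
    nvml.nvmlInit()
    try:
        handle = nvml.nvmlDeviceGetHandleByIndex(index)
        # Mirrors nvmlUtilization_t: .gpu and .memory are percentages
        # measured over the past sample period.
        rates = nvml.nvmlDeviceGetUtilizationRates(handle)
        return rates.gpu, rates.memory
    finally:
        nvml.nvmlShutdown()


if __name__ == "__main__":
    try:
        import pynvml  # assumption: installed separately (nvidia-ml-py)
    except ImportError:
        print("pynvml is not available; cannot query NVML")
    else:
        try:
            gpu, mem = read_utilization(pynvml)
            print(f"GPU: {gpu}%  Memory: {mem}%")
        except pynvml.NVMLError as err:
            print(f"NVML query failed: {err}")
```

Each call returns a single sample, so polling in a loop is needed to see how utilization changes over time.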