nvidia-smi Volatile GPU-Utilization explanation?

I know that nvidia-smi -l 1 reports the GPU usage once per second (output similar to the following). However, I would appreciate an explanation of what Volatile GPU-Util really means. Is it the number of used SMs over total SMs, or the occupancy, or something else?

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20c          Off  | 0000:03:00.0     Off |                    0 |
| 30%   41C    P0    53W / 225W |      0MiB /  4742MiB |     96%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20c          Off  | 0000:43:00.0     Off |                    0 |
| 36%   49C    P0    95W / 225W |   4516MiB /  4742MiB |     63%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1      5193    C   python                                        4514MiB |
+-----------------------------------------------------------------------------+

622

asked Dec 02 '16 17:12

user3813674

2 Answers

It is a sampled measurement over a time period. For a given sample period, it reports the percentage of time during which one or more GPU kernels were active (i.e. running).

It doesn't tell you anything about how many SMs were used, or how "busy" the code was, or what it was doing exactly, or in what way it may have been using memory.
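To make that definition concrete, here is a minimal host-only sketch of such a "percentage of time busy" calculation. It is a hypothetical model for illustration, not NVIDIA's implementation: the names gpu_util_percent and Interval are made up, and the kernel start/end times are assumed to be known in microseconds.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model, not NVIDIA's implementation: each pair is the
// [start, end) time in microseconds during which some kernel was running.
using Interval = std::pair<long long, long long>;

// Percentage of the window [win_start, win_end) during which at least one
// kernel was active. Overlapping kernels are counted once, and the number
// of threads or SMs a kernel used never enters the calculation.
double gpu_util_percent(std::vector<Interval> kernels,
                        long long win_start, long long win_end) {
    assert(win_end > win_start);
    std::sort(kernels.begin(), kernels.end());  // order by start time
    long long busy = 0;
    long long covered_until = win_start;  // time before this is already counted
    for (const Interval& k : kernels) {
        long long s = std::max(k.first, covered_until);  // skip counted time
        long long e = std::min(k.second, win_end);       // clip to the window
        if (e > s) {
            busy += e - s;
            covered_until = e;
        }
    }
    return 100.0 * static_cast<double>(busy) /
           static_cast<double>(win_end - win_start);
}
```

Under this model, a single one-thread kernel that runs for the whole window scores the same as a kernel that fills every SM, which is why the figure says nothing about occupancy or SM usage.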

The above claim(s) can be verified without too much difficulty using a microbenchmarking-type exercise (see below).

According to the NVIDIA documentation, the sample period may be between 1 second and 1/6 second, depending on the product. However, the period shouldn't make much difference to how you interpret the result.

Also, the word "Volatile" does not pertain to this data item in nvidia-smi. You are misreading the output format.

Here's a trivial program that supports my claim:

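#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

const long long tdelay = 1000000LL;  // kernel spin time, in GPU clock cycles
const int loops = 10000;             // number of kernel launches
const int hdelay = 1;                // default host sleep between launches, in microseconds

// Busy-wait on a single thread until tdelay clock cycles have elapsed.
__global__ void dkern(){

  long long start = clock64();
  while(clock64() < start+tdelay);
}

int main(int argc, char *argv[]){

  int my_delay = hdelay;
  if (argc > 1) my_delay = atoi(argv[1]);  // optional override of the host sleep
  for (int i = 0; i<loops; i++){
    dkern<<<1,1>>>();   // one block of one thread
    usleep(my_delay);}  // host idles before the next launch

  return 0;
}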

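On my system, when I run the above code with a command line parameter of 100, nvidia-smi will report 99% utilization. When I run with a command line parameter of 1000, nvidia-smi will report ~83% utilization. When I run it with a command line parameter of 10000, nvidia-smi will report ~9% utilization. The parameter is the host sleep, in microseconds, between kernel launches, so a longer sleep leaves the GPU idle for a larger share of each sample period, even though every launch uses only one thread in one block.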
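The trend in those numbers can be approximated with a back-of-the-envelope duty-cycle model. This is only a sketch under stated assumptions: every kernel runs for a fixed time kernel_us, launches are asynchronous, and each host loop iteration takes the requested sleep plus some overhead_us (usleep may sleep longer than requested, and a launch has its own cost). The function name and all parameters are hypothetical, and the real kernel time depends on the GPU clock, so do not expect it to reproduce the measurements exactly.

```cpp
#include <algorithm>
#include <cassert>

// Rough estimate of the reported utilization for a launch-then-sleep loop.
// All times are in microseconds. Assumption: this is a simplified model,
// not a description of how the driver actually samples.
double estimated_util_percent(double kernel_us, double host_delay_us,
                              double overhead_us = 0.0) {
    assert(kernel_us > 0.0 && host_delay_us >= 0.0 && overhead_us >= 0.0);
    // Time the host spends per loop iteration.
    double loop_period_us = host_delay_us + overhead_us;
    // If the host loops faster than a kernel runs, launches queue up and
    // the GPU runs kernels back to back; otherwise the GPU waits for the host.
    double gpu_period_us = std::max(kernel_us, loop_period_us);
    return 100.0 * kernel_us / gpu_period_us;
}
```

For example, with an assumed 1000 microsecond kernel, a 100 microsecond sleep keeps the GPU saturated, while a 10000 microsecond sleep leaves it busy for roughly a tenth of the time, matching the direction of the results above.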

170

answered Sep 19 '22 05:09

Robert Crovella

The 'Volatile' on nvidia-smi isn't part of GPU-Util, it's part of 'Volatile Uncorr. ECC', which shows the number of uncorrected errors that have occurred on the GPU since the last driver load. There's a good writeup of everything in nvidia-smi here:

https://medium.com/analytics-vidhya/explained-output-of-nvidia-smi-utility-fc4fbee3b124

answered Sep 20 '22 05:09

alexcwsmith

Tags:

cuda

gpgpu

gpu

nvidia
