I am frequently rerunning the same mxnet script while I try to iron out some bugs in a new script (and I am new to mxnet). Pretty often when I try to run my script I get an error that the GPU is out of memory, and when I use nvidia-smi to check, this is what I see:
Wed Dec  5 15:41:29 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24.02              Driver Version: 396.24.02                 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
|  0%   54C    P2    68W / 300W | 10891MiB / 11144MiB  |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1446      G   /usr/lib/xorg/Xorg                             40MiB |
|    0      1481      G   /usr/bin/gnome-shell                          114MiB |
|    0     10216      G   ...-token=8422C9FC67F51AEC1893FEEBE9DB68C6     31MiB |
|    0     18221      G   /usr/lib/xorg/Xorg                            458MiB |
|    0     18347      G   /usr/bin/gnome-shell                          282MiB |
+-----------------------------------------------------------------------------+
So it seems like most of the memory is in use (10891MiB of 11144MiB), BUT I don't see any process in the list taking up a large portion of the GPU, so there doesn't seem to be anything to kill. My mxnet script has already exited, so I assume it isn't the culprit. I would understand a lag of a few seconds, or even tens of seconds, if the driver doesn't know right away that the script no longer needs the memory, but I have been waiting many minutes and still see the same display. What gives, and is there some memory cleanup I should do? If so, how? Thank you for any tips to a newbie.
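(For context, a minimal sketch of how free GPU memory can be checked from inside an MXNet script before allocating anything. This assumes MXNet 1.4 or newer, where mx.context.gpu_memory_info is available:)

import mxnet as mx

# Report free and total device memory (in bytes) for GPU 0.
# gpu_memory_info is assumed to exist; it was added around MXNet 1.4,
# so on older versions this raises AttributeError.
free, total = mx.context.gpu_memory_info(device_id=0)
print("GPU 0: %d MiB free of %d MiB" % (free // 1024**2, total // 1024**2))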
To monitor the overall GPU resource usage on Windows, open Task Manager, click the Performance tab, scroll down the left pane, and find the "GPU" option. Here you can watch real-time usage, with separate graphs for different kinds of work on your system, such as video encoding or gameplay.
Right-click on the desktop and select [NVIDIA Control Panel]. In the toolbar, select [View] or [Desktop] (the option varies by driver version), then check [Display GPU Activity Icon in Notification Area]. In the Windows taskbar, hover over the "GPU Activity" icon to see the list of processes using the GPU.
nvidia-smi is the utility that allows administrators to query GPU device state and, with the appropriate privileges, to modify it.
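For example, nvidia-smi can be asked directly which compute processes are holding GPU memory. A minimal sketch of calling it from Python (it only assumes nvidia-smi is on the PATH; the query flags are standard nvidia-smi options):

import subprocess

# Ask nvidia-smi for the compute processes that currently hold GPU memory.
# --query-compute-apps and --format=csv,noheader are standard nvidia-smi options.
result = subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip() or "No compute processes are holding GPU memory.")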
The GPU memory usage is completely bound to the lifetime of the process. If you see GPU memory still in use, there must be a process that is still alive and holding on to it. Running ps aux | grep python will show all Python processes and tell you which one is still alive.
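As a rough illustration of that check, here is a small Python sketch (not part of the original answer; the PID to terminate is a placeholder) that lists running python processes so they can be matched against the PIDs nvidia-smi reports, and then terminates a leftover one:

import os
import signal
import subprocess

# List every running process and keep the lines that mention python.
ps = subprocess.run(["ps", "aux"], capture_output=True, text=True, check=True)
for line in ps.stdout.splitlines():
    if "python" in line:
        print(line)

# If one of these PIDs is a leftover training run, terminating it releases
# its GPU memory. 12345 below is a placeholder, not a real PID.
# os.kill(12345, signal.SIGTERM)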