
How can I flush GPU memory using CUDA (physical reset is unavailable)

My CUDA program crashed during execution, before memory was flushed. As a result, device memory remained occupied.

I'm running on a GTX 580, for which nvidia-smi --gpu-reset is not supported.

Calling cudaDeviceReset() at the beginning of the program only affects the current context created by that process; it does not free memory that was allocated before it.

I'm accessing a Fedora server with that GPU remotely, so physical reset is quite complicated.

So the question is: is there any way to flush the device memory in this situation?

timdim asked Mar 04 '13 08:03



2 Answers

Check what is using your GPU memory with

sudo fuser -v /dev/nvidia* 

Your output will look something like this:

                     USER        PID  ACCESS COMMAND
/dev/nvidia0:        root       1256  F...m  Xorg
                     username   2057  F...m  compiz
                     username   2759  F...m  chrome
                     username   2777  F...m  chrome
                     username   20450 F...m  python
                     username   20699 F...m  python

Then kill the processes you no longer need, either from htop or with

sudo kill -9 PID

In the example above, PyCharm was eating a lot of memory, so I killed PIDs 20450 and 20699 (the two python processes).
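
As a rough illustration of the step above, here is a small Python sketch that picks the PIDs for a given command name out of text shaped like the fuser listing in this answer. The function name pids_for_command and the SAMPLE text are hypothetical, chosen only for this example; real fuser output may be laid out differently, so treat the parsing as an assumption rather than a robust tool.

```python
def pids_for_command(fuser_text, command):
    """Return PIDs whose COMMAND column equals `command`.

    Assumes rows shaped like the listing above: an optional
    "/dev/...:" prefix, then USER, PID, ACCESS, COMMAND.
    """
    pids = []
    for line in fuser_text.splitlines():
        fields = line.split()
        # Drop the leading device path (e.g. "/dev/nvidia0:") if present.
        if fields and fields[0].endswith(":"):
            fields = fields[1:]
        # Skip blank lines, the header row, and anything malformed.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        if fields[3] == command:
            pids.append(int(fields[1]))
    return pids


# Sample text mirroring the listing in this answer (not live output).
SAMPLE = """\
                     USER        PID  ACCESS COMMAND
/dev/nvidia0:        root       1256  F...m  Xorg
                     username   2057  F...m  compiz
                     username   2759  F...m  chrome
                     username   2777  F...m  chrome
                     username   20450 F...m  python
                     username   20699 F...m  python
"""

if __name__ == "__main__":
    print(pids_for_command(SAMPLE, "python"))
```

Filtering by command name keeps Xorg and the desktop processes out of the kill list; it is still worth checking each PID by hand before running kill on it.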

Kenan answered Oct 27 '22 10:10



First run

nvidia-smi

then pick the PID you want to kill from the processes it lists and run

sudo kill -9 PID 
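
If reading the PIDs off the nvidia-smi table by eye is awkward, the same information can be requested in CSV form and parsed. The Python sketch below is one way to do that, under some assumptions: the helper names parse_compute_apps and list_gpu_processes are made up for this example, and the query relies on nvidia-smi's --query-compute-apps option, which older or consumer cards may not fully support (in that case the list may come back empty or without memory figures).

```python
import subprocess

# nvidia-smi query for compute processes, in header-less CSV.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into (pid, name, mem) tuples."""
    rows = []
    for line in csv_text.splitlines():
        try:
            # PID is the first field and memory the last, so a comma
            # inside the process name does not break the split.
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
        except ValueError:
            continue  # blank or unexpected line
        pid, name, mem = pid.strip(), name.strip(), mem.strip()
        if not pid.isdigit():
            continue
        # Memory may be a non-numeric placeholder on some GPUs.
        mem_mib = int(mem) if mem.isdigit() else None
        rows.append((int(pid), name, mem_mib))
    return rows


def list_gpu_processes():
    """Run nvidia-smi and return the parsed process list."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling list_gpu_processes() on the machine with the GPU returns one tuple per compute process, which can then be reviewed before passing a PID to kill.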
Ashiq Imran answered Oct 27 '22 10:10
