
GPU out of memory error message on Google Colab

I'm using a GPU on Google Colab to run some deep learning code.

I have got 70% of the way through the training, but now I keep getting the following error:

RuntimeError: CUDA out of memory. Tried to allocate 2.56 GiB (GPU 0; 15.90 GiB total capacity; 10.38 GiB already allocated; 1.83 GiB free; 2.99 GiB cached)

I'm trying to understand what this means. Is it talking about RAM? If so, the code should just run the same as it has been doing, shouldn't it? When I try to restart it, the out-of-memory message appears immediately. Why would it be using more RAM when I start it today than it did when I started it yesterday or the day before?

Or is this message about hard disk space? I could understand that, because the code saves things as it goes, so the hard disk usage would be cumulative.

Any help would be much appreciated.


So if it's just the GPU running out of memory, could someone explain why the error message says 10.38 GiB already allocated? How can there be memory already allocated when I start to run something? Could it be being used by someone else? Do I just need to wait and try again later?
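For reference, the error message format is PyTorch's. Assuming the training code uses PyTorch, a minimal check of what the current process itself holds on the GPU would be:

import torch

# Memory held by *this* PyTorch process only, on GPU 0.
# memory_allocated() counts live tensors; memory_reserved() counts the blocks
# PyTorch's caching allocator keeps for reuse (the "cached" figure in the
# error; older PyTorch versions name this memory_cached()).
print("allocated: {:.2f} GiB".format(torch.cuda.memory_allocated(0) / 1024**3))
print("reserved:  {:.2f} GiB".format(torch.cuda.memory_reserved(0) / 1024**3))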

Here is a screenshot of the GPU usage when I run the code, just before it runs out of memory:



I found this post in which people seem to be having similar problems. When I run the code suggested on that thread, I see:

Gen RAM Free: 12.6 GB  | Proc size: 188.8 MB
GPU RAM Free: 16280MB | Used: 0MB | Util   0% | Total 16280MB

which seems to suggest there are 16 GB of GPU RAM free.

I'm confused.

asked Jan 17 '20 by user1551817


1 Answer

You are running out of memory on the GPU, not system RAM or disk: every number in that error refers to the GPU's own memory (VRAM). If you are running Python code, try running the code below before yours. It will show how much GPU and system memory you have. Note that if you try to load images (or batches) bigger than the free GPU memory, the allocation will fail.

# memory footprint support libraries/code
!ln -sf /opt/bin/nvidia-smi /usr/bin/nvidia-smi
!pip install gputil psutil humanize

import os

import GPUtil as GPU
import humanize
import psutil

GPUs = GPU.getGPUs()
# Colab normally exposes exactly one GPU, but that isn't guaranteed.
gpu = GPUs[0]

def printm():
    process = psutil.Process(os.getpid())
    print("Gen RAM Free: " + humanize.naturalsize(psutil.virtual_memory().available),
          "| Proc size: " + humanize.naturalsize(process.memory_info().rss))
    print("GPU RAM Free: {0:.0f}MB | Used: {1:.0f}MB | Util {2:3.0f}% | Total {3:.0f}MB".format(
        gpu.memoryFree, gpu.memoryUsed, gpu.memoryUtil * 100, gpu.memoryTotal))

printm()
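
If printm() reports the GPU essentially free before the run (as in the output you posted), then the "10.38 GiB already allocated" in the error was allocated by your own process during training (model weights, activations, optimizer state), not by another user: a Colab session normally gets a dedicated GPU. The fix is to lower the peak memory of a single training step. Here is a minimal sketch of the usual levers, assuming a PyTorch workload; the model and batch are stand-ins, not your actual code:

import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-ins for the real network and data.
model = nn.Linear(4096, 10).to(device)
batch = torch.randn(16, 4096, device=device)  # smaller batches lower peak usage

# Skip autograd bookkeeping when gradients aren't needed (e.g. validation):
# activations are then freed immediately instead of being kept for backward.
with torch.no_grad():
    out = model(batch)

# Accumulate Python numbers, not CUDA tensors, across iterations;
# keeping loss tensors alive keeps their whole graphs alive too.
loss_value = out.sum().item()

# Release cached blocks back to the driver. This does not shrink live
# tensors, but it helps after deleting large ones.
torch.cuda.empty_cache()

If the error appears only partway through training, as in your case, a batch size right at the edge of capacity or tensors accumulated across steps are the usual culprits; restarting the runtime (Runtime > Restart runtime) also clears anything a previous run left allocated.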
answered Oct 28 '22 by Etore Marcari Jr.