
Caffe | Check failed: error == cudaSuccess (2 vs. 0) out of memory

I am trying to train a network in Caffe. My images are 512x640 and the batch size is 1. I'm trying to implement FCN-8s.

I am currently running this on an Amazon EC2 instance (g2.2xlarge) with 4GB of GPU memory. But when I run the solver, it immediately throws an error:

Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Aborted (core dumped)

Can someone help me proceed from here?

Abhilash Panigrahi asked Nov 18 '15 21:11



2 Answers

The error is indeed an out-of-memory error, but it refers to GPU memory rather than host RAM (note that the error comes from CUDA).
When Caffe runs out of memory, the first thing to try is usually reducing the batch size (at the cost of gradient accuracy), but you are already at a batch size of 1.
Are you sure the batch size is 1 for both the TRAIN and TEST phases?
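
One quick way to check this is to scan the network definition for every batch_size field. The snippet below is only a rough sketch: the embedded prototxt is a made-up example rather than the asker's actual net, the helper name find_batch_sizes is hypothetical, and a plain regex will miss nets whose data layers set the batch size some other way (for instance inside a custom Python layer).

```python
import re

# Matches fields such as "batch_size: 16" in prototxt text.
BATCH_RE = re.compile(r"\bbatch_size\s*:\s*(\d+)")


def find_batch_sizes(prototxt_text):
    """Return (line_number, batch_size) for every batch_size field found."""
    hits = []
    for lineno, line in enumerate(prototxt_text.splitlines(), start=1):
        # Drop trailing prototxt comments before matching.
        code = line.split("#", 1)[0]
        match = BATCH_RE.search(code)
        if match:
            hits.append((lineno, int(match.group(1))))
    return hits


# Hypothetical net definition, used only to illustrate the check.
EXAMPLE = """\
layer {
  name: "data"
  type: "Data"
  include { phase: TRAIN }
  data_param { batch_size: 1 }
}
layer {
  name: "data"
  type: "Data"
  include { phase: TEST }
  data_param { batch_size: 16 }
}
"""

if __name__ == "__main__":
    for lineno, size in find_batch_sizes(EXAMPLE):
        print(f"line {lineno}: batch_size = {size}")
```

In this made-up example the TEST-phase layer still uses a batch size of 16, which is the kind of mismatch worth ruling out. To check a real net, read the train/val prototxt file into a string and pass it to the same function.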

Shai answered Sep 21 '22 15:09



Caffe can use multiple GPUs. This is only supported in the C++ interface, not in the Python one. You could also enable cuDNN for a lower memory footprint.

https://github.com/BVLC/caffe/blob/master/docs/multigpu.md
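
To get a feel for why the footprint matters at this input size, here is a back-of-the-envelope sketch of how much memory a single activation blob needs. It assumes float32 values and a 64-channel feature map at the full 512x640 resolution (roughly what the first convolution stage of a VGG-16-style net such as FCN-8s produces). It ignores padding, weights, and any workspace the library allocates, so treat the result as a rough lower bound for one layer, not as a prediction of total usage. The helper names are hypothetical.

```python
def blob_bytes(num, channels, height, width, bytes_per_value=4):
    """Bytes needed for one N x C x H x W blob (float32 by default)."""
    return num * channels * height * width * bytes_per_value


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / 2**20


if __name__ == "__main__":
    # Assumed shape: batch 1, 64 channels, 512 x 640 spatial size.
    data = blob_bytes(1, 64, 512, 640)
    # During training each blob holds activations and gradients of equal size.
    data_and_diff = 2 * data
    print(f"activations only:  {to_mib(data):.0f} MiB")
    print(f"with gradients:    {to_mib(data_and_diff):.0f} MiB")
```

Under these assumptions a single full-resolution blob takes 80 MiB, or 160 MiB once its gradient is counted, and a deep network keeps many such blobs alive at once. That is why a 4GB card can run out of memory even at a batch size of 1.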

Simon answered Sep 17 '22 15:09
