
How to clear Cuda memory in PyTorch

Tags:

python

pytorch

I am trying to get the output of a neural network which I have already trained. The input is an image of size 300x300. I am using a batch size of 1, but I still get a CUDA out of memory error after successfully getting the output for 25 images.

I tried torch.cuda.empty_cache(), but this still doesn't seem to solve the problem. Code:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

train_x = torch.tensor(train_x, dtype=torch.float32).view(-1, 1, 300, 300)
train_x = train_x.to(device)
dataloader = torch.utils.data.DataLoader(train_x, batch_size=1, shuffle=False)

right = []
for i, left in enumerate(dataloader):
    print(i)
    temp = model(left).view(-1, 1, 300, 300)
    right.append(temp.to('cpu'))
    del temp
    torch.cuda.empty_cache()

This for loop runs 25 times before raising the memory error.

Every time, I am sending a new image into the network for computation. So I don't really need to keep the previous computation results on the GPU after each iteration of the loop. Is there any way to achieve this?

asked Mar 24 '19 by ntd

People also ask

How do you release GPU memory in PyTorch?

PyTorch uses a memory cache to avoid malloc/free calls and tries to reuse memory where possible, as described in the docs. To release memory from the cache so that other processes can use it, you can call torch.cuda.empty_cache().

Why does CUDA run out of memory?

The error "cuda runtime error (2): out of memory" occurs when GPU memory is exhausted. Because PyTorch typically handles large amounts of data, small oversights (such as unintentionally keeping tensors or computation graphs alive) can consume all available GPU memory and crash your program.

What does torch.cuda.empty_cache() do?

It releases all unoccupied cached memory currently held by the caching allocator, so that the memory can be used by other GPU applications and becomes visible in nvidia-smi.
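To see the difference between tensors being freed and the cache being released, one can compare torch.cuda.memory_allocated() with torch.cuda.memory_reserved(). A minimal sketch (the helper name report_cuda_memory is hypothetical; the GPU branch only runs when CUDA is available):

```python
import torch

def report_cuda_memory(tag=""):
    # Hypothetical helper: reports allocator stats when a GPU is present,
    # returns None on CPU-only machines.
    if not torch.cuda.is_available():
        return None
    allocated = torch.cuda.memory_allocated()  # memory held by live tensors
    reserved = torch.cuda.memory_reserved()    # memory held by the cache
    print(f"{tag} allocated={allocated} reserved={reserved}")
    return allocated, reserved

if torch.cuda.is_available():
    x = torch.randn(1000, 1000, device="cuda")
    report_cuda_memory("before del:")
    del x
    report_cuda_memory("after del:")          # allocated drops, reserved stays high
    torch.cuda.empty_cache()
    report_cuda_memory("after empty_cache:")  # reserved drops too
```

This also shows why empty_cache() alone does not fix the question above: it only returns cached (unoccupied) memory, not memory still held by live tensors or the autograd graph.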


1 Answer

I figured out where I was going wrong. I am posting the solution as an answer for others who might be struggling with the same problem.

Basically, PyTorch builds a computational graph whenever data passes through the network and keeps the intermediate computations in GPU memory, in case I want to calculate gradients during backpropagation. But since I only wanted to perform a forward pass, I simply needed to wrap the model call in torch.no_grad().
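The effect can be checked on the CPU: outputs computed under no_grad() carry no grad_fn, meaning no graph (and no extra memory for intermediate activations) is kept. A minimal sketch with a toy model:

```python
import torch

model = torch.nn.Linear(4, 2)  # toy stand-in for the trained network
x = torch.randn(1, 4)

y = model(x)
print(y.grad_fn)    # a backward node: a graph was recorded

with torch.no_grad():
    y2 = model(x)
print(y2.grad_fn)   # None: no graph was built for this forward pass
```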

Thus, the for loop in my code could be rewritten as:

for i, left in enumerate(dataloader):
    print(i)
    with torch.no_grad():
        temp = model(left).view(-1, 1, 300, 300)
    right.append(temp.to('cpu'))
    del temp
    torch.cuda.empty_cache()

Running the model under no_grad() tells PyTorch not to build the computational graph or keep the intermediate results, thus freeing my GPU memory.
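The same pattern can be sketched end to end with a toy convolution standing in for the trained model (the Conv2d layer and random data here are placeholders, and the code falls back to the CPU when no GPU is present):

```python
import torch

# Toy stand-in for the trained model; the real one maps 300x300 images.
model = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Fake batch of four 300x300 single-channel images.
train_x = torch.randn(4, 1, 300, 300)
dataloader = torch.utils.data.DataLoader(train_x, batch_size=1, shuffle=False)

right = []
for i, left in enumerate(dataloader):
    with torch.no_grad():                        # no graph -> no activation buildup
        temp = model(left.to(device)).view(-1, 1, 300, 300)
    right.append(temp.cpu())                     # move the result off the GPU

print(len(right), right[0].shape)
```

Because each result is moved to the CPU and no graph is retained, GPU usage stays flat across iterations regardless of how many images are processed.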

answered Oct 03 '22 by ntd