CUDA out of memory error when reloading Pytorch model

This is a common PyTorch error, but I'm seeing it under an unusual circumstance: when reloading a model, I get a CUDA out-of-memory error even though I haven't yet moved the model to the GPU.

# load_state_dict() loads in place and does not return the model or optimizer, so its result is not reassigned.
model.load_state_dict(torch.load(model_file_path))
optimizer.load_state_dict(torch.load(optimizer_file_path))
# Error happens here ^, before I send the model to the device.
model = model.to(device_id)
Jacob Stern asked Oct 20 '25 09:10

1 Answer

The issue was that I was loading onto a new GPU (cuda:2), but the model and optimizer had originally been saved from a different GPU (cuda:0). Even though I never explicitly asked for it, torch.load by default restores tensors to the device they were saved from, and cuda:0 happened to be occupied.

Adding map_location=device_id to each torch.load call fixed the problem:

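model.to(device_id)
# map_location remaps the saved tensors onto device_id instead of the GPU they were saved from.
# load_state_dict() loads in place; its return value is not the model or optimizer, so don't reassign it.
model.load_state_dict(torch.load(model_file_path, map_location=device_id))
optimizer.load_state_dict(torch.load(optimizer_file_path, map_location=device_id))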
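For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same save/reload pattern. The tiny `nn.Linear` model, the SGD optimizer, the file names, and the helpers `pick_device`, `save_checkpoint`, and `load_checkpoint` are all made up for illustration and are not from my original code. It also assumes the target device is given as a `torch.device` or a string such as "cuda:2", which is what `map_location` expects. On a machine without a GPU it falls back to the CPU, so it still runs.

```python
import os
import tempfile

import torch
import torch.nn as nn


def pick_device() -> torch.device:
    # Hypothetical helper: use the last visible GPU if there is one, else the CPU.
    if torch.cuda.is_available():
        return torch.device(f"cuda:{torch.cuda.device_count() - 1}")
    return torch.device("cpu")


def save_checkpoint(model, optimizer, directory):
    # Save the two state dicts to separate files (illustrative file names).
    model_path = os.path.join(directory, "model.pt")
    optim_path = os.path.join(directory, "optimizer.pt")
    torch.save(model.state_dict(), model_path)
    torch.save(optimizer.state_dict(), optim_path)
    return model_path, optim_path


def load_checkpoint(model, optimizer, model_path, optim_path, device):
    # Move the model first, then load both state dicts with map_location so
    # the saved tensors are remapped to `device` rather than restored to the
    # device they were saved from.
    model.to(device)
    model.load_state_dict(torch.load(model_path, map_location=device))
    optimizer.load_state_dict(torch.load(optim_path, map_location=device))
    return model, optimizer


device = pick_device()

# Source model/optimizer; one step so the optimizer has momentum buffers to save.
src_model = nn.Linear(4, 2)
src_optim = torch.optim.SGD(src_model.parameters(), lr=0.1, momentum=0.9)
src_model(torch.randn(3, 4)).sum().backward()
src_optim.step()

with tempfile.TemporaryDirectory() as tmp:
    model_path, optim_path = save_checkpoint(src_model, src_optim, tmp)
    new_model = nn.Linear(4, 2)
    new_optim = torch.optim.SGD(new_model.parameters(), lr=0.1, momentum=0.9)
    load_checkpoint(new_model, new_optim, model_path, optim_path, device)

print(new_model.weight.device)
```

Note that `load_state_dict` is called for its side effect only. The module version returns a report of missing and unexpected keys, and the optimizer version returns nothing useful, so neither result should be assigned back to `model` or `optimizer`.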
Jacob Stern answered Oct 22 '25 05:10
