
What is the difference between model.to(device) and model=model.to(device)?

Tags:

python

pytorch

Suppose the model is originally stored on CPU, and then I want to move it to GPU0, then I can do:

device = torch.device('cuda:0')
model = model.to(device)
# or
model.to(device)

What is the difference between those two lines?

asked Jan 02 '20 by Obsidian

People also ask

What is to device in PyTorch?

torch.device lets you specify the device on which a tensor should be allocated. It accepts a string naming the device type, optionally followed by an ordinal device index (e.g. 'cuda:0'); if you leave the index out, PyTorch uses the current default device of that type.
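As a small illustration of the accepted device specifications (this runs without a GPU, since it only constructs device objects):

```python
import torch

# Two equivalent ways to name the same CUDA device:
d1 = torch.device('cuda:0')      # device type plus ordinal in one string
d2 = torch.device('cuda', 0)     # device type and index as separate arguments
print(d1 == d2)                  # True

# With no index given, the device refers to the current default of that type.
cpu = torch.device('cpu')
print(cpu.type, cpu.index)       # cpu None
```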

What does model cuda do?

torch.cuda is used to set up and run CUDA operations. It keeps track of the currently selected GPU, and all CUDA tensors you allocate will by default be created on that device. The selected device can be changed with a torch.cuda.device context manager.

How do I send a model to cuda?

An alternative way to send the model to a specific device is model.to(torch.device('cuda:0')). This, of course, is subject to the device visibility specified in the environment variable CUDA_VISIBLE_DEVICES. You can check GPU usage with nvidia-smi.


2 Answers

There is no semantic difference: nn.Module.to moves the model's parameters and buffers to the device in place and returns the model itself, so both lines have the same effect.

But be cautious: to does not behave the same way on tensors as it does on models.

For tensors (documentation):

# tensor a is on the CPU
device = torch.device('cuda:0')
b = a.to(device)
# a is still on the CPU!
# b is on the GPU!
# a and b are different tensors

For models (documentation):

# model a is on the CPU
device = torch.device('cuda:0')
b = a.to(device)
# a and b are now both on the GPU
# a and b refer to the same model object
answered Sep 18 '22 by youkaichao


Citing the documentation on to:

When loading a model on a GPU that was trained and saved on GPU, simply convert the initialized model to a CUDA optimized model using model.to(torch.device('cuda')). Also, be sure to use the .to(torch.device('cuda')) function on all model inputs to prepare the data for the model. Note that calling my_tensor.to(device) returns a new copy of my_tensor on GPU. It does NOT overwrite my_tensor. Therefore, remember to manually overwrite tensors: my_tensor = my_tensor.to(torch.device('cuda')).

Mostly, when using to on a torch.nn.Module, it does not matter whether or not you save the return value (and, as a micro-optimization, it is actually slightly better not to). When used on a torch tensor, you must save the return value, since you actually receive a copy of the tensor.

Ref: Pytorch to()
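Putting both rules together gives the usual device-agnostic idiom. A minimal sketch (the CPU fallback guard is an addition here, not part of the answer above, so this also runs on machines without a GPU):

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no GPU is visible, so the script runs anywhere.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(4, 2)
model.to(device)          # in place for modules; saving the result is optional

x = torch.randn(1, 4)
x = x.to(device)          # tensors must be rebound: to() returns a copy

out = model(x)
print(out.shape)          # torch.Size([1, 2])
```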

answered Sep 18 '22 by Mano