I have a PyTorch variable that is used as a trainable input for a model. At some point I need to manually reassign all values in this variable.
How can I do that without breaking the connections to the loss function?
Suppose the current values are [1.2, 3.2, 43.2] and I simply want them to become [1, 2, 3].
At the time I asked this question, I hadn't realized that PyTorch doesn't have a static graph the way TensorFlow and Keras do.
In PyTorch, the training loop is written manually and you call everything yourself at each training step (there is no notion of a placeholder plus a static graph that data is fed into later).
Consequently, we can't "break the graph", since the new values are used to perform all further computations again on the next forward pass. I was worried about a problem that happens in Keras, not in PyTorch.
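A minimal sketch of that idea (the toy tensor and the squared-sum loss are my own stand-ins, not from the original question): the graph is rebuilt on every forward pass, so reassigning the values between steps doesn't invalidate anything.
import torch

# x is a trainable input; the squared-sum loss is just a stand-in
x = torch.tensor([1.2, 3.2, 43.2], requires_grad=True)

for step in range(2):
    loss = (x ** 2).sum()   # a fresh graph is built on every forward pass
    loss.backward()         # gradients flow to x through this step's graph
    print(step, x.grad)
    x.grad = None           # reset before the next step

# manually reassign the values; the next forward pass simply builds a new graph
x.data = torch.tensor([1.0, 2.0, 3.0])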
We can modify a tensor by using the assignment operator: assigning a new value to an element or slice modifies the tensor in place. Import torch and then create a PyTorch tensor, as in the sketch below.
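(A small illustration I am adding here, reusing the values from the question.)
import torch

t = torch.tensor([1.2, 3.2, 43.2])
t[0] = 1.0                        # assigning to an element modifies it in place
t[1:] = torch.tensor([2.0, 3.0])  # the same works for slices
print(t)                          # tensor([1., 2., 3.])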
Setting the requires_grad parameter allows for fine-grained exclusion of subgraphs from gradient computation. It takes effect in both the forward and backward passes: during the forward pass, an operation is only recorded in the backward graph if at least one of its input tensors requires grad.
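A short sketch of this behaviour (my own example):
import torch

a = torch.rand(3, requires_grad=True)
b = torch.rand(3)              # requires_grad is False by default

c = (a * b).sum()              # recorded: at least one input requires grad
c.backward()
print(a.grad)                  # gradient was computed for a
print(b.grad)                  # None, b is excluded from gradient computation

d = (b * 2).sum()
print(d.requires_grad)         # False: no input required grad, nothing was recorded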
For .view(), PyTorch expects the new shape to be provided as individual int arguments (represented in the docs as *shape). The asterisk (*) can be used in Python to unpack a list into its individual elements, thus passing view() the form of arguments it expects.
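For instance (a minimal example of my own):
import torch

x = torch.arange(6)
shape = [2, 3]
y = x.view(*shape)   # the * unpacks the list, equivalent to x.view(2, 3)
print(y.shape)       # torch.Size([2, 3])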
Each tensor has a grad_fn attribute that references the function that created it (except for tensors created by the user, which have None as their grad_fn).
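For example:
import torch

w = torch.rand(3, requires_grad=True)  # created by the user -> leaf tensor
print(w.grad_fn)                       # None

y = w * 2                              # created by an operation
print(y.grad_fn)                       # <MulBackward0 object at 0x...>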
You can use the data attribute of tensors to modify the values, since modifications of data do not affect the graph. The graph stays intact and changes to the data attribute itself have no influence on it: operations on data are not tracked by autograd and are therefore not present in the graph.
Since you haven't given an example, this example is based on your comment:
'Suppose I want to change the weights of a layer.'
I used normal tensors here, but this works the same for the weight.data and bias.data attributes of a layer.
Here is a short example:
import torch
import torch.nn.functional as F
# Test 1, random vector with CE
w1 = torch.rand(1, 3, requires_grad=True)
loss = F.cross_entropy(w1, torch.tensor([1]))
loss.backward()
print('w1.data', w1.data)
print('w1.grad', w1.grad)
print()
# Test 2, replacing values of w2 with w1, before CE
# to make sure that everything is exactly like in Test 1 after replacing the values
w2 = torch.zeros(1, 3, requires_grad=True)
w2.data = w1.data
loss = F.cross_entropy(w2, torch.tensor([1]))
loss.backward()
print('w2.data', w2.data)
print('w2.grad', w2.grad)
print()
# Test 3, replace data after computation
w3 = torch.rand(1, 3, requires_grad=True)
loss = F.cross_entropy(w3, torch.tensor([1]))
# setting values
# the graph of the previous computation is still intact, as you can see in the print-outs below
w3.data = w1.data
loss.backward()
# data was replaced with the values from w1
print('w3.data', w3.data)
# gradient still shows results from computation with w3
print('w3.grad', w3.grad)
Output:
w1.data tensor([[ 0.9367, 0.6669, 0.3106]])
w1.grad tensor([[ 0.4351, -0.6678, 0.2326]])
w2.data tensor([[ 0.9367, 0.6669, 0.3106]])
w2.grad tensor([[ 0.4351, -0.6678, 0.2326]])
w3.data tensor([[ 0.9367, 0.6669, 0.3106]])
w3.grad tensor([[ 0.3179, -0.7114, 0.3935]])
The most interesting part here is w3. By the time backward is called, the values have been replaced by the values of w1, but the gradients are calculated based on the cross-entropy function with the original values of w3. The replaced values have no effect on the graph: the graph connection is not broken, and the replacement had no influence on it. I hope this is what you were looking for!