I am a beginner in PyTorch and I am facing the following issue:
When I compute the gradient of the tensor below (note that I use a variable x in the way shown), I get the gradient just fine:
import torch

myTensor = torch.randn(2, 2, requires_grad=True)

with torch.enable_grad():
    x = myTensor.sum() * 10

x.backward()
print(myTensor.grad)
Now, if I try to modify an element of myTensor, I get the error "leaf variable has been moved into the graph interior". See this code:
import torch

myTensor = torch.randn(2, 2, requires_grad=True)
myTensor[0, 0] *= 5

with torch.enable_grad():
    x = myTensor.sum() * 10

x.backward()
print(myTensor.grad)
What is wrong with my latter code? And how do I correct it?
Any help would be highly appreciated. Thanks a lot!
The problem here is that this line is an in-place operation:

myTensor[0, 0] *= 5

And PyTorch, or more precisely autograd, is not very good at handling in-place operations, especially on tensors with the requires_grad flag set to True.
You can also take a look here:
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
Generally you should avoid in-place operations where possible. In some cases they can work, but you should always avoid in-place operations on tensors where requires_grad is set to True.
Unfortunately there are not many PyTorch functions that help with this problem, so you would have to use a helper tensor to avoid the in-place operation in this case:
Code:
import torch

myTensor = torch.randn(2, 2, requires_grad=True)
helper_tensor = torch.ones(2, 2)
helper_tensor[0, 0] = 5
new_myTensor = myTensor * helper_tensor  # new tensor, out-of-place operation

with torch.enable_grad():
    x = new_myTensor.sum() * 10  # of course you need to use the new tensor

x.backward()                     # for further calculation and backward
print(myTensor.grad)
Output:
tensor([[50., 10.],
        [10., 10.]])
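To see where these numbers come from: x = 10 * (myTensor * helper_tensor).sum(), so the gradient with respect to each entry is 10 * helper_tensor, i.e. 50 at position [0, 0] and 10 everywhere else. You can check this directly with a small sketch that reuses the tensors from the code above:

# Sanity check (sketch): d x / d myTensor[i, j] = 10 * helper_tensor[i, j]
expected = 10 * helper_tensor
print(torch.allclose(myTensor.grad, expected))  # True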
Unfortunately this is not very nice, and I would appreciate it if there were a better or nicer solution out there.
But as far as I know, in the current version (0.4.1) you will have to go with this workaround for tensors with gradients, i.e. requires_grad=True.
Hopefully future versions will bring a better solution.
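One more out-of-place variant that should give the same result is to build the modified tensor with torch.where and a boolean mask. This is only an untested sketch of my own (it assumes a PyTorch version where torch.where accepts a bool mask), not the only way to do it:

import torch

myTensor = torch.randn(2, 2, requires_grad=True)

# Boolean mask selecting the entry we want to scale
mask = torch.zeros(2, 2, dtype=torch.bool)
mask[0, 0] = True

# Out-of-place replacement for "myTensor[0, 0] *= 5":
# where the mask is True take myTensor * 5, elsewhere keep myTensor
new_myTensor = torch.where(mask, myTensor * 5, myTensor)

x = new_myTensor.sum() * 10
x.backward()
print(myTensor.grad)  # should again be tensor([[50., 10.], [10., 10.]])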
Btw. if you activate the gradient later you can see that it works just fine:
import torch

myTensor = torch.randn(2, 2, requires_grad=False)  # no gradient so far
myTensor[0, 0] *= 5            # in-place op, not included in the gradient
myTensor.requires_grad = True  # activate gradient here

with torch.enable_grad():
    x = myTensor.sum() * 10

x.backward()  # no problem here
print(myTensor.grad)
But of course this will yield a different result:
tensor([[10., 10.],
        [10., 10.]])
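This is expected: the in-place *= 5 runs before requires_grad is switched on, so it is simply not part of the graph, and x = 10 * myTensor.sum() has a constant gradient of 10 in every entry. A quick check (sketch, reusing the tensor from the snippet above):

# Every entry of the gradient is 10, since only the final sum() * 10 is recorded
print(torch.allclose(myTensor.grad, torch.full((2, 2), 10.0)))  # True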
Hope this helps!