Modifying a PyTorch tensor and then computing the gradient makes backward() fail

I am a beginner in PyTorch and I am facing the following issue:

When I compute the gradient of the tensor below (note that I use an intermediate variable x, as you can see), everything works and I get the gradient:

import torch
myTensor = torch.randn(2, 2, requires_grad=True)
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()
print(myTensor.grad)

Now, if I first modify an element of myTensor, I get the error leaf variable has been moved into the graph interior. See this code:

import torch
myTensor = torch.randn(2, 2, requires_grad=True)
myTensor[0, 0] *= 5
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()
print(myTensor.grad)

What is wrong with the latter code, and how do I correct it?

Any help would be highly appreciated. Thanks a lot!

asked Oct 29 '18 at 18:10 by Aly

1 Answer

The problem here is that this line represents an in-place operation:

myTensor[0,0]*=5

PyTorch, or more precisely autograd, does not handle in-place operations well, especially on leaf tensors that have the requires_grad flag set to True.

You can also take a look here:
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd

Generally, you should avoid in-place operations where possible. In some cases they can work, but you should always avoid in-place operations on tensors for which you have set requires_grad to True.

Unfortunately, there are not many PyTorch functions that help with this problem, so you have to use a helper tensor to avoid the in-place operation in this case:

Code:

import torch

myTensor = torch.randn(2, 2, requires_grad=True)
helper_tensor = torch.ones(2, 2)
helper_tensor[0, 0] = 5
new_myTensor = myTensor * helper_tensor  # new tensor, out-of-place operation
with torch.enable_grad():
    x = new_myTensor.sum() * 10  # of course you need to use the new tensor
x.backward()                     # for further calculation and backward
print(myTensor.grad)

Output:

tensor([[50., 10.],
        [10., 10.]])
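
This matches the analytic gradient: the scaled element enters the sum as 5 * myTensor[0, 0], so its gradient is 10 * 5 = 50, while every untouched element enters as-is and gets a gradient of 10.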

Unfortunately this is not very elegant, and I would appreciate a better or nicer solution.
But as far as I know, in the current version (0.4.1) you have to go with this workaround for tensors with gradients, i.e. requires_grad=True.

Hopefully for future versions there will be a better solution.
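
A minimal sketch of an alternative (assuming a newer PyTorch version than 0.4.1, and not tested against 0.4.1 itself): clone the leaf tensor and modify the clone. The clone is not a leaf, so autograd can record the in-place change on it while the leaf itself stays untouched:

import torch

myTensor = torch.randn(2, 2, requires_grad=True)
scaled = myTensor.clone()        # differentiable, out-of-place copy of the leaf
scaled[0, 0] = scaled[0, 0] * 5  # in-place change on the non-leaf clone
x = scaled.sum() * 10
x.backward()
print(myTensor.grad)             # should again give [[50., 10.], [10., 10.]]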


By the way, if you activate the gradient afterwards, you can see that it works just fine:

import torch
myTensor = torch.randn(2, 2, requires_grad=False)  # no gradient so far
myTensor[0, 0] *= 5                                 # in-place op not included in gradient
myTensor.requires_grad = True                       # activate gradient here
with torch.enable_grad():
    x = myTensor.sum() * 10
x.backward()                                        # no problem here
print(myTensor.grad)

But of course this yields a different result: the in-place scaling happened before autograd started tracking the tensor, so it is not part of the graph, and the gradient of sum() * 10 is simply 10 for every element:

tensor([[10., 10.],
        [10., 10.]])

Hope this helps!

answered Oct 21 '22 at 17:10 by MBT