I am using the autograd tool in PyTorch, and have found myself in a situation where I need to access the values in a 1D tensor by means of an integer index. Something like this:
import torch
from torch.autograd import Variable

def basic_fun(x_cloned):
    res = []
    for i in range(len(x)):
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return Variable(torch.FloatTensor(res))

def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad

x = Variable(torch.FloatTensor([1, 2, 3, 4, 5]), requires_grad=True)
x_cloned = x.clone()
print(get_grad(x_cloned, x))
I am getting the following error message:
[tensor(1., grad_fn=<ThMulBackward>), tensor(4., grad_fn=<ThMulBackward>), tensor(9., grad_fn=<ThMulBackward>), tensor(16., grad_fn=<ThMulBackward>), tensor(25., grad_fn=<ThMulBackward>)]
Traceback (most recent call last):
  File "/home/mhy/projects/pytorch-optim/predict.py", line 74, in <module>
    print(get_grad(x_cloned, x))
  File "/home/mhy/projects/pytorch-optim/predict.py", line 68, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
In general, I am a bit skeptical about how using the cloned version of a variable is supposed to keep that variable in the gradient computation. The variable itself is effectively not used in the computation of A, so when you call A.backward(), it should not be part of that operation.
I would appreciate help with this approach, or a better way to index through a 1D tensor with requires_grad=True without losing the gradient history!
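res is a list of zero-dimensional tensors containing the squares of 1 to 5. Wrapping it in Variable(torch.FloatTensor(res)) copies those values into a new tensor with no gradient history, which is why backward() reports that the result has no grad_fn. To concatenate the list into a single tensor containing [1.0, 4.0, ..., 25.0] while keeping the history, I changed return Variable(torch.FloatTensor(res)) to return torch.stack(res, dim=0), which produces tensor([ 1., 4., 9., 16., 25.], grad_fn=<StackBackward>).
However, I am now getting this new error, caused by the A.backward() line.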
Traceback (most recent call last):
  File "<project_path>/playground.py", line 22, in <module>
    print(get_grad(x_cloned, x))
  File "<project_path>/playground.py", line 16, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 84, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 28, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs
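This second error appears because backward() can only create the starting gradient on its own when the output has a single element. As a rough sketch (written against the current tensor API rather than Variable, and using a hypothetical helper name, squares), there are two common ways around it: reduce the stacked output to a scalar first, or pass an explicit gradient tensor of the same shape as the output.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)

def squares(t):
    # Hypothetical helper: index element by element, then stack the
    # zero-dimensional results so the autograd history is preserved.
    return torch.stack([t[i] * t[i] for i in range(len(t))], dim=0)

# Option 1: reduce the vector output to a scalar before calling backward().
squares(x.clone()).sum().backward()
grad_from_sum = x.grad.clone()

# Gradients accumulate across backward() calls, so reset before the next run.
x.grad.zero_()

# Option 2: keep the vector output and supply the "vector" of the
# vector-Jacobian product explicitly (all ones here).
out = squares(x.clone())
out.backward(gradient=torch.ones_like(out))
grad_from_ones = x.grad.clone()

print(grad_from_sum)
print(grad_from_ones)
```

With a gradient of all ones, both options compute the same quantity, the derivative of the sum of squares, so the two printed gradients should match.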
I changed my basic_fun to the following, which resolved my problem:
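def basic_fun(x_cloned):
    # Accumulate the squares into a single-element tensor.
    res = torch.FloatTensor([0])
    # Iterate over the argument itself rather than the global x.
    for i in range(len(x_cloned)):
        res += x_cloned[i] * x_cloned[i]
    return res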
This version returns a single-element tensor (the sum of the squares), so backward() can create the initial gradient implicitly.
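On the question of whether the clone keeps x in the computation: clone() is itself a differentiable operation, so the cloned tensor carries a grad_fn that points back to x, and gradients flow through it into x.grad. The sketch below illustrates this with a vectorized variant of the same sum of squares; the name sum_of_squares is an assumption for illustration, and the loop-based version above should yield the same gradient.

```python
import torch

def sum_of_squares(t):
    # Hypothetical vectorized equivalent of the loop: square every element
    # and reduce to a zero-dimensional (scalar) tensor.
    return (t * t).sum()

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)
x_cloned = x.clone()

# The clone is a non-leaf tensor whose history leads back to x.
has_history = x_cloned.grad_fn is not None

loss = sum_of_squares(x_cloned)
loss.backward()

print(has_history)
print(x.grad)  # gradient of sum(x**2) with respect to x, i.e. 2 * x
```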