Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PyTorch autograd -- grad can be implicitly created only for scalar outputs

I am using the autograd tool in PyTorch, and have found myself in a situation where I need to access the values in a 1D tensor by means of an integer index. Something like this:

def basic_fun(x_cloned):
    res = []
    for i in range(len(x)):
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return Variable(torch.FloatTensor(res))


def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad


x = Variable(torch.FloatTensor([1, 2, 3, 4, 5]), requires_grad=True)
x_cloned = x.clone()
print(get_grad(x_cloned, x))

I am getting the following error message:

[tensor(1., grad_fn=<ThMulBackward>), tensor(4., grad_fn=<ThMulBackward>), tensor(9., grad_fn=<ThMulBackward>), tensor(16., grad_fn=<ThMulBackward>), tensor(25., grad_fn=<ThMulBackward>)]
Traceback (most recent call last):
  File "/home/mhy/projects/pytorch-optim/predict.py", line 74, in <module>
    print(get_grad(x_cloned, x))
  File "/home/mhy/projects/pytorch-optim/predict.py", line 68, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I am in general, a bit skeptical about how using the cloned version of a variable is supposed to keep that variable in gradient computation. The variable itself is effectively not used in the computation of A, and so when you call A.backward(), it should not be part of that operation.

I appreciate your help with this approach or if there is a better way to avoid losing the gradient history and still index through a 1D tensor with requires_grad=True!

**Edit (September 15):**

res is a list of zero-dimensional tensors containing squared values of 1 to 5. To concatenate in a single tensor containing [1.0, 4.0, ..., 25.0], I changed return Variable(torch.FloatTensor(res)) to torch.stack(res, dim=0), which produces tensor([ 1., 4., 9., 16., 25.], grad_fn=<StackBackward>).

However, I am getting this new error, caused by the A.backward() line.

Traceback (most recent call last):
  File "<project_path>/playground.py", line 22, in <module>
    print(get_grad(x_cloned, x))
  File "<project_path>/playground.py", line 16, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 84, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 28, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs
like image 798
mhyousefi Avatar asked Sep 13 '18 15:09

mhyousefi


People also ask

How does torch Autograd grad work?

grad. Computes and returns the sum of gradients of outputs with respect to the inputs. grad_outputs should be a sequence of length matching output containing the “vector” in vector-Jacobian product, usually the pre-computed gradients w.r.t. each of the outputs.

What is Autograd module in PyTorch and why it's powerful?

autograd provides classes and functions implementing automatic differentiation of arbitrary scalar valued functions. It requires minimal changes to the existing code - you only need to declare Tensor s for which gradients should be computed with the requires_grad=True keyword.

What is Loss backward ()?

So, when we call loss. backward() , the whole graph is differentiated w.r.t. the loss, and all Variables in the graph will have their . grad Variable accumulated with the gradient. For illustration, let us follow a few steps backward: print(loss.

What does backward do in PyTorch?

Computes the gradient of current tensor w.r.t. graph leaves. The graph is differentiated using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying gradient .


1 Answers

I changed my basic_fun to the following, which resolved my problem:

def basic_fun(x_cloned):
    res = torch.FloatTensor([0])
    for i in range(len(x)):
        res += x_cloned[i] * x_cloned[i]
    return res

This version returns a scalar value.

like image 102
mhyousefi Avatar answered Sep 30 '22 04:09

mhyousefi