I'm trying to wrap my head around how gradients are represented and how autograd works:
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
z.backward()
print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]
print(y.grad)
#None
Why does it not produce a gradient for y? If y.grad = dz/dy, then shouldn't it at least produce a variable like y.grad = 2*y?
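(For concreteness: z = y*y = x^4, so dz/dx = 4*x^3 = 32 at x = 2, which matches x.grad above, and dz/dy = 2*y = 8 at y = 4, which is what I expected to see in y.grad.)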
Autograd is the PyTorch package for automatic differentiation of all operations on Tensors. It performs backpropagation starting from a variable; in deep learning, this variable often holds the value of the cost function. backward() executes the backward pass and computes all the backpropagation gradients automatically.
Autograd is a reverse-mode automatic differentiation system. Conceptually, autograd records a graph of all the operations that created the data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and whose roots are the output tensors.
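You can see this structure directly in the question's example (a minimal sketch in the same Variable style; the exact grad_fn names may differ between PyTorch versions):
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)  # user-created input -> leaf
y = x * x                                            # produced by an op  -> non-leaf
z = y * y                                            # produced by an op  -> non-leaf (root here)

print(x.is_leaf, x.grad_fn)   # True  None
print(y.is_leaf, y.grad_fn)   # False <MulBackward... object>
print(z.is_leaf, z.grad_fn)   # False <MulBackward... object>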
torch.autograd.grad computes and returns the sum of gradients of outputs with respect to the inputs. grad_outputs should be a sequence of length matching output, containing the “vector” in the vector-Jacobian product, usually the pre-computed gradients w.r.t. each of the outputs.
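As a side note, torch.autograd.grad can be used to get dz/dy directly, without touching y.grad at all. A minimal sketch using the question's setup:
import torch
from torch.autograd import Variable, grad

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

dz_dy, = grad(outputs=z, inputs=y)   # ask for dz/dy directly, no .backward() needed
print(dz_dy)                         # a tensor holding 8, i.e. 2*y with y = 4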
ctx is a context object that can be used to stash information for backward computation. You can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method.
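That ctx documentation refers to custom autograd Functions. A hedged sketch of a hypothetical Square op (using the newer tensor API), just to show where ctx.save_for_backward fits in:
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        ctx.save_for_backward(inp)    # stash the input for the backward pass
        return inp * inp

    @staticmethod
    def backward(ctx, grad_output):
        inp, = ctx.saved_tensors      # retrieve what was stashed in forward
        return grad_output * 2 * inp  # chain rule: d(inp^2)/d(inp) = 2*inp

x = torch.tensor([2.0], requires_grad=True)
y = Square.apply(x)
y.backward()
print(x.grad)   # tensor([4.])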
By default, gradients are only retained for leaf variables. Non-leaf variables' gradients are not retained to be inspected later. This was done by design, to save memory.
- Soumith Chintala
See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
Call y.retain_grad()
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.retain_grad()   # ask autograd to keep the gradient of this non-leaf variable
z.backward()
print(y.grad)
#Variable containing:
# 8
#[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
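For PyTorch 0.4+, where Variable was merged into Tensor, the same fix looks like this sketch:
import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y
y.retain_grad()    # keep the gradient of this non-leaf tensor
z.backward()
print(y.grad)      # tensor([8.])
print(x.grad)      # tensor([32.])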
Register a hook, which is basically a function called when that gradient is calculated. Then you can save it, assign it, print it, whatever...
from __future__ import print_function
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(print) ## this can be anything you need it to be
z.backward()
Output:
Variable containing:
 8
[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
Also see: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
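If you want to keep the gradient around rather than just print it, the hook can stash it somewhere, e.g. in a dict. A small sketch (grads and save_grad are just illustrative names):
import torch
from torch.autograd import Variable

grads = {}                       # somewhere to keep gradients of interest

def save_grad(g):
    grads['y'] = g               # hook returns None, so the gradient flows on unchanged

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(save_grad)
z.backward()
print(grads['y'])                # dz/dy = 8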