 

Why does autograd not produce gradient for intermediate variables?

I'm trying to wrap my head around how gradients are represented and how autograd works:

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

z.backward()

print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]

print(y.grad)
#None

Why does it not produce a gradient for y? If y.grad = dz/dy, then shouldn't it at least produce a variable like y.grad = 2*y?
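For reference, my own arithmetic: z = y*y = x**4, so dz/dx = 4*x**3 = 32, which matches the printed value, and dz/dy = 2*y = 8, which is the number I would have expected to see in y.grad.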

asked Aug 31 '17 by foobar


People also ask

What does Autograd Variable do?

Autograd is the PyTorch package for automatic differentiation of all operations on Tensors. It performs backpropagation starting from a variable. In deep learning, this variable often holds the value of the cost function. backward() executes the backward pass and computes all the backpropagation gradients automatically.
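As a rough sketch of that behaviour (using the current tensor API, where requires_grad lives directly on the tensor rather than on a Variable wrapper):

import torch

x = torch.tensor([2.0], requires_grad=True)  # leaf tensor
loss = (x * x).sum()                         # stand-in for a cost function
loss.backward()                              # backward pass starting from the "cost"
print(x.grad)                                # tensor([4.]), since d(x^2)/dx = 2x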

How does Autograd in PyTorch work?

Autograd is a reverse-mode automatic differentiation system. Conceptually, autograd keeps a graph recording all of the operations that created the data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and roots are the output tensors.
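You can peek at that graph through the grad_fn attributes. A small sketch, reusing the x**4 example from the question:

import torch

x = torch.tensor([2.0], requires_grad=True)  # leaf of the graph
y = x * x                                    # intermediate node
z = y * y                                    # root of the graph
print(z.grad_fn)                 # <MulBackward0 object at ...>
print(z.grad_fn.next_functions)  # edges pointing back toward the leaves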

What does Autograd Grad return?

torch.autograd.grad computes and returns the sum of gradients of outputs with respect to the inputs. grad_outputs should be a sequence of length matching outputs, containing the "vector" in the vector-Jacobian product, usually the pre-computed gradients w.r.t. each of the outputs.
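A short sketch of that call, again on the example from the question; note that grad returns the gradients instead of storing them in .grad:

import torch
from torch.autograd import grad

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

dz_dx, dz_dy = grad(z, (x, y))  # gradients come back as a tuple
print(dz_dx)  # tensor([32.])
print(dz_dy)  # tensor([8.])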

What is CTX in Autograd?

ctx is a context object that can be used to stash information for the backward computation. You can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method.
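For illustration, a minimal custom Function (the Square class here is purely hypothetical) showing the usual role of ctx:

import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        ctx.save_for_backward(inp)    # stash the input for the backward pass
        return inp * inp

    @staticmethod
    def backward(ctx, grad_output):
        inp, = ctx.saved_tensors
        return grad_output * 2 * inp  # d(inp^2)/d(inp) = 2*inp

x = torch.tensor([2.0], requires_grad=True)
Square.apply(x).backward()
print(x.grad)  # tensor([4.])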


1 Answer

By default, gradients are only retained for leaf variables. Non-leaf variables' gradients are not retained to be inspected later. This was done by design, to save memory.

-soumith chintala

See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
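You can check which side of that rule a tensor falls on via is_leaf. A quick sketch with the current tensor API:

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
print(x.is_leaf)  # True  -> backward() populates x.grad
print(y.is_leaf)  # False -> y.grad stays None unless you opt in (see the options below)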

Option 1:

Call y.retain_grad()

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

y.retain_grad()

z.backward()

print(y.grad)
#Variable containing:
# 8
#[torch.FloatTensor of size 1]

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16

Option 2:

Register a hook, which is essentially a function that gets called when that gradient is computed. You can then save it, assign it, print it, whatever...

from __future__ import print_function
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

y.register_hook(print) ## this can be anything you need it to be

z.backward()

output:

Variable containing:
 8
[torch.FloatTensor of size 1]

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2

Also see: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
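For what it's worth, the same fix works with the current tensor API, where Variable has been merged into Tensor. A sketch of Option 1 in that style:

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

y.retain_grad()   # opt in to keeping the intermediate gradient
z.backward()
print(y.grad)     # tensor([8.])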

answered Sep 21 '22 by T. Scharf