a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = torch.nn.Parameter(c, requires_grad=True,)
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
print(a.grad)
print(b.grad)
print(c.grad)
print(d.grad)
This prints out:
None
tensor([-0.8707])
None
tensor([-1.1125])
How do I get the gradient for a and c? Variable d needs to stay a parameter.
Basically, when you create a new tensor with torch.nn.Parameter() or torch.tensor(), you are creating a leaf node tensor.
When you do something like c = a + 1, c is an intermediate node. You can call print(c.is_leaf) to check whether a tensor is a leaf node. By default, PyTorch does not store gradients for intermediate nodes.
In your code snippet, a, b and d are all leaf node tensors, while c is an intermediate node. c.grad will be None because PyTorch doesn't keep the gradient for intermediate nodes. a.grad is also None because wrapping c in torch.nn.Parameter(c) creates a brand-new leaf tensor that merely copies c's values; it is detached from the computation graph, so when you call loss.backward() no gradient can flow back through c to a.
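You can verify this detachment yourself. Here is a minimal sketch (the tensors are standalone placeholders, not your training variables) showing that nn.Parameter applied to an intermediate tensor yields a fresh leaf with no history:

import torch

a = torch.nn.Parameter(torch.randn(1))   # leaf, requires_grad=True by default
c = a + 1                                 # intermediate node, still connected to a
d = torch.nn.Parameter(c)                 # new leaf that copies c's data, detached from a

print(a.is_leaf, c.is_leaf, d.is_leaf)    # True False True
print(c.grad_fn)                          # <AddBackward0 ...> -- part of a's graph
print(d.grad_fn)                          # None -- no history, so gradients stop here

loss = (d * 2).sum()
loss.backward()
print(a.grad)                             # None: the graph behind d was cut
print(d.grad)                             # tensor([2.])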
If you change the code to this
a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = c
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
print(a.grad) # Not None
print(b.grad) # Not None
print(c.grad) # None
print(d.grad) # None
You will find that a and b have gradients, but c.grad and d.grad are None, because they are intermediate nodes.
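If you also need the gradient of the intermediate tensor c itself, you can call Tensor.retain_grad() on it before the backward pass. Note that d cannot be both a genuine nn.Parameter and connected to a: an nn.Parameter is always a leaf, so gradients never flow through it to earlier tensors. A minimal sketch (x and y are placeholder data standing in for x_train_tensor and y_train_tensor):

import torch

a = torch.nn.Parameter(torch.randn(1))
b = torch.nn.Parameter(torch.randn(1))
x = torch.randn(5)                # placeholder for x_train_tensor
y = torch.randn(5)                # placeholder for y_train_tensor

c = a + 1                         # intermediate node
c.retain_grad()                   # ask autograd to keep its gradient
d = c                             # keep d in the same graph as a

yhat = d + b * x
loss = ((y - yhat) ** 2).mean()
loss.backward()

print(a.grad)                     # populated: gradient flows through c back to a
print(c.grad)                     # populated too, thanks to retain_grad()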