Pytorch second derivative returns None

I am unable to take a second derivative of the following function. When I take the second derivative with respect to u_s it works, but with respect to x_s it doesn't.

Does anyone know what I have done wrong here?

import torch as tr
from torch.autograd import grad

def cost(xt, x_goal, u, Q, R):
    return (xt - x_goal).matmul(Q).matmul((xt - x_goal).transpose(0, 1)) + u.matmul(R).matmul(u)

# x_Goal, Q and R are defined earlier in my code
x_s = tr.tensor([0.0000, -1.0000, 0.0000], dtype=tr.float64, requires_grad=True)
u_s = tr.tensor([-0.2749], dtype=tr.float64, requires_grad=True)
c = cost(x_s, x_Goal, u_s, tr.tensor(Q), tr.tensor(R))

c
   output: 
   tensor([[4.0076]], dtype=torch.float64, grad_fn=<ThAddBackward>)

Cu = grad(c, u_s, create_graph=True)[0]
Cu
   output:
   tensor([-0.0550], dtype=torch.float64, grad_fn=<ThAddBackward>)

Cuu = grad(Cu, u_s, allow_unused=True)[0]
Cuu
   output:
   tensor([0.2000], dtype=torch.float64)

Cux = grad(Cu, x_s, allow_unused=True)
Cux
    output:
    (None,)

I am guessing that Cu itself is completely independent of x_s, but then the derivative should at least be zero, not None!

asked Jan 26 '26 by azerila

1 Answer

You haven't done anything wrong.

Suppose I have variables x, y, and z=f(y). If I compute z.backward() and then try to ask for the gradient with respect to x, I get None. For example,

import torch

x = torch.randn(1, requires_grad=True)
y = torch.randn(1, requires_grad=True)

z = y**2
z.backward()
print(y.grad) # Outputs some non-zero tensor
print(x.grad) # None

So what does this have to do with your attempt to compute the second derivative Cux? When you pass create_graph=True, PyTorch keeps track of all the operations in the derivative computation that produced Cu, and since those derivatives are themselves made up of primitive operations, you can compute the gradient of the gradient, as you are doing. The problem is that the gradient Cu never encounters the variable x_s, so effectively Cu = f(u_s). This means that when you differentiate Cu, its computational graph never sees the variable x_s, so the gradient with respect to x_s stays None, just like in the example above.
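
Here is a minimal sketch of the same effect in the second-derivative setting (the names dz_dy, d2z_dy2 and d2z_dydx are my own, and the zero-substitution at the end is just one common workaround, not something grad does for you):

import torch
from torch.autograd import grad

x = torch.randn(1, requires_grad=True)   # never appears in z
y = torch.randn(1, requires_grad=True)

z = y ** 2
dz_dy = grad(z, y, create_graph=True)[0]  # first derivative; graph kept for higher-order grads

# Ask for both second derivatives in one call. allow_unused=True is required
# because the graph of dz_dy never touches x.
d2z_dy2, d2z_dydx = grad(dz_dy, [y, x], allow_unused=True)
print(d2z_dy2)   # tensor([2.])
print(d2z_dydx)  # None, not tensor([0.])

# If downstream code (e.g. assembling a Hessian) expects a zero tensor
# rather than None, substitute it explicitly:
d2z_dydx = torch.zeros_like(x) if d2z_dydx is None else d2z_dydx

In other words, None here means "this input is not part of the graph at all", and it is up to you to interpret that as a zero if that is what your application needs.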

answered Jan 28 '26 by Nick McGreivy


