I have a neural network that's computing a vector quantity u
. I'd like to compute first and second-order jacobians with respect to the input x
, a single element.
Would anybody know how to do that in PyTorch? Below, the code snippet from my project:
import torch
import torch.nn as nn
class PINN(torch.nn.Module):
def __init__(self, layers:list):
super(PINN, self).__init__()
self.linears = nn.ModuleList([])
for i, dim in enumerate(layers[:-2]):
self.linears.append(nn.Linear(dim, layers[i+1]))
self.linears.append(nn.ReLU())
self.linears.append(nn.Linear(layers[-2], layers[-1]))
def forward(self, x):
for layer in self.linears:
x = layer(x)
return x
I then instantiate my network:
n_in = 1
units = 50
q = 500
pinn = PINN([n_in, units, units, units, q+1])
pinn
Which returns
PINN(
(linears): ModuleList(
(0): Linear(in_features=1, out_features=50, bias=True)
(1): ReLU()
(2): Linear(in_features=50, out_features=50, bias=True)
(3): ReLU()
(4): Linear(in_features=50, out_features=50, bias=True)
(5): ReLU()
(6): Linear(in_features=50, out_features=501, bias=True)
)
)
Then I compute both FO and SO jacobians
x = torch.randn(1, requires_grad=False)
u_x = torch.autograd.functional.jacobian(pinn, x, create_graph=True)
print("First Order Jacobian du/dx of shape {}, and features\n{}".format(u_x.shape, u_x)
u_xx = torch.autograd.functional.jacobian(lambda _: u_x, x)
print("Second Order Jacobian du_x/dx of shape {}, and features\n{}".format(u_xx.shape, u_xx)
Returns
First Order Jacobian du/dx of shape torch.Size([501, 1]), and features
tensor([[-0.0310],
[ 0.0139],
[-0.0081],
[-0.0248],
[-0.0033],
[ 0.0013],
[ 0.0040],
[ 0.0273],
...
[-0.0197]], grad_fn=<ViewBackward>)
Second Order Jacobian du/dx of shape torch.Size([501, 1, 1]), and features
tensor([[[0.]],
[[0.]],
[[0.]],
[[0.]],
...
[[0.]]])
Should not u_xx
be a None
vector if it didn't depend on x
?
Thanks in advance
torch. autograd provides classes and functions implementing automatic differentiation of arbitrary scalar valued functions. It requires minimal changes to the existing code - you only need to declare Tensor s for which gradients should be computed with the requires_grad=True keyword.
To compute the gradients, a tensor must have its parameter requires_grad = true. The gradients are same as the partial derivatives. For example, in the function y = 2*x + 1, x is a tensor with requires_grad = True.
So as @jodag mentioned in his comment, ReLU
being null or linear, its gradient is constant (except on 0
, which is a rare event), so its second-order derivative is zero. I changed the activation function to Tanh
, which finally allows me to compute the jacobian twice.
Final code is
import torch
import torch.nn as nn
class PINN(torch.nn.Module):
def __init__(self, layers:list):
super(PINN, self).__init__()
self.linears = nn.ModuleList([])
for i, dim in enumerate(layers[:-2]):
self.linears.append(nn.Linear(dim, layers[i+1]))
self.linears.append(nn.Tanh())
self.linears.append(nn.Linear(layers[-2], layers[-1]))
def forward(self, x):
for layer in self.linears:
x = layer(x)
return x
def compute_u_x(self, x):
self.u_x = torch.autograd.functional.jacobian(self, x, create_graph=True)
self.u_x = torch.squeeze(self.u_x)
return self.u_x
def compute_u_xx(self, x):
self.u_xx = torch.autograd.functional.jacobian(self.compute_u_x, x)
self.u_xx = torch.squeeze(self.u_xx)
return self.u_xx
Then calling compute_u_xx(x)
on an instance of PINN
with x.require_grad
set to True
gets me there. How to get rid of useless dimensions introduced by torch.autograd.functional.jacobian
remains to be understood though...
The second order Jacobian is known as the Hessian and can be computed easily using PyTorch's builtin functions:
torch.autograd.functional.hessian(func, inputs)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With