I have implemented the following Jacobian function in PyTorch. Unless I have made a mistake, it computes the Jacobian of any tensor w.r.t. an input of any shape:
import torch
import torch.autograd as ag
def nd_range(stop, dims=None):
    if dims is None:
        dims = len(stop)
    if not dims:
        yield ()
        return
    for outer in nd_range(stop, dims - 1):
        for inner in range(stop[dims - 1]):
            yield outer + (inner,)

def full_jacobian(f, wrt):
    f_shape = list(f.size())
    wrt_shape = list(wrt.size())
    fs = []

    f_range = nd_range(f_shape)
    wrt_range = nd_range(wrt_shape)

    for f_ind in f_range:
        grad = ag.grad(f[tuple(f_ind)], wrt, retain_graph=True, create_graph=True)[0]
        for i in range(len(f_shape)):
            grad = grad.unsqueeze(0)
        fs.append(grad)

    fj = torch.cat(fs, dim=0)
    fj = fj.view(f_shape + wrt_shape)
    return fj
On top of this, I have tried to implement a recursive function to calculate nth order derivatives:
def nth_derivative(f, wrt, n):
    if n == 1:
        return full_jacobian(f, wrt)
    else:
        deriv = nth_derivative(f, wrt, n - 1)
        return full_jacobian(deriv, wrt)
I ran a simple test:
op = torch.ger(s, s)
deep_deriv = nth_derivative(op, s, 5)
Unfortunately, this gets me the Hessian, but no higher-order derivatives. I'm aware that many of the higher-order derivatives should be 0, but I'd prefer that PyTorch be able to compute that analytically.
One fix has been to change the gradient calculation to:
try:
    grad = ag.grad(f[tuple(f_ind)], wrt, retain_graph=True, create_graph=True)[0]
except:
    grad = torch.zeros_like(wrt)
Is this the accepted correct way to handle this? Or is there a better option? Or do I have the reason for my issue completely wrong to begin with?
torch.autograd is PyTorch's automatic differentiation engine that powers neural network training.

Autograd is a reverse-mode automatic differentiation system. Conceptually, autograd records a graph of all the operations that created the data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and whose roots are the output tensors.

When writing a custom autograd Function, ctx is a context object that can be used to stash information for the backward computation; you can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method. The gradient gives the derivatives of a function, i.e., its partial derivatives with respect to each input, evaluated at a point.
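As a minimal sketch of the ctx mechanism (the Square function below is illustrative and not part of the question's code), a custom autograd Function saves its input in forward and reuses it in backward:

import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # stash the input; it is needed to compute the gradient in backward
        ctx.save_for_backward(x)
        return x ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # d(x^2)/dx = 2x, chained with the incoming gradient
        return grad_output * 2 * x

x = torch.tensor([3.0], requires_grad=True)
Square.apply(x).backward()
print(x.grad)  # tensor([6.])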
You can just iterate calling the grad function:
import torch
from torch.autograd import grad

def nth_derivative(f, wrt, n):
    for i in range(n):
        grads = grad(f, wrt, create_graph=True)[0]
        f = grads.sum()
    return grads

# use a float dtype: integer tensors cannot require gradients
x = torch.arange(4.0, requires_grad=True).reshape(2, 2)
loss = (x ** 4).sum()

print(nth_derivative(f=loss, wrt=x, n=3))
which outputs

tensor([[ 0., 24.],
        [48., 72.]])

i.e. 24 * x, the elementwise third derivative of x ** 4.
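Regarding the try/except workaround from the question: if at some order the result no longer depends on wrt, grad can complain that wrt was not used in the graph. Rather than a bare except, one option (a sketch, not necessarily the canonical approach) is to pass allow_unused=True and return zeros once the derivative has become constant:

import torch
from torch.autograd import grad

def nth_derivative(f, wrt, n):
    for _ in range(n):
        if not f.requires_grad:
            # f has become a constant, so this and all higher derivatives are zero
            return torch.zeros_like(wrt)
        # allow_unused=True returns None instead of raising when f still has a
        # graph but that graph does not involve wrt
        grads = grad(f, wrt, create_graph=True, allow_unused=True)[0]
        if grads is None:
            return torch.zeros_like(wrt)
        f = grads.sum()
    return grads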
For the second-order derivative, you can use PyTorch's hessian function, torch.autograd.functional.hessian(). For higher-order derivatives, you can repeatedly call jacobian or grad while maintaining the computational graph via create_graph:

create_graph (bool, optional) – If True, graph of the derivative will be constructed, allowing to compute higher order derivative products.
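For example, a sketch using the torch.autograd.functional API, reusing the loss from the answer above (the function and variable names here are just for illustration):

import torch
from torch.autograd.functional import hessian, jacobian

def f(x):
    return (x ** 4).sum()

x = torch.arange(4.0).reshape(2, 2)

# full second-order derivative tensor, shape (2, 2, 2, 2)
h = hessian(f, x)

# third-order derivative by nesting jacobian; the inner call must keep its
# graph (create_graph=True) so the outer call can differentiate through it
third = jacobian(lambda y: jacobian(f, y, create_graph=True), x)  # shape (2, 2, 2, 2, 2, 2)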