I have a loss value/function and I would like to compute all the second derivatives with respect to a tensor f (of size n). I managed to use tf.gradients twice, but when applying it for the second time, it sums the derivatives across the first input (see second_derivatives in my code).
Also I managed to retrieve the Hessian matrix, but I would like to only compute its diagonal to avoid extra-computation.
import tensorflow as tf
import numpy as np
f = tf.Variable(np.array([[1., 2., 0]]).T)
loss = tf.reduce_prod(f ** 2 - 3 * f + 1)
first_derivatives = tf.gradients(loss, f)[0]
second_derivatives = tf.gradients(first_derivatives, f)[0]
hessian = [tf.gradients(first_derivatives[i,0], f)[0][:,0] for i in range(3)]
model = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(model)
    print "\nloss\n", sess.run(loss)
    print "\nloss'\n", sess.run(first_derivatives)
    print "\nloss''\n", sess.run(second_derivatives)
    hessian_value = np.array(map(list, sess.run(hessian)))
    print "\nHessian\n", hessian_value
My thinking was that tf.gradients(first_derivatives, f[0, 0])[0] would retrieve, for instance, the second derivative with respect to f_0, but it seems that TensorFlow does not allow differentiating with respect to a slice of a tensor.
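For reference, one workaround in the graph-mode API along those lines (a sketch, not code from the question) is to build f from separate scalar variables and stack them, so that each component is a valid target for tf.gradients:
# Sketch only: builds f from scalar variables so tf.gradients can target each one.
import tensorflow.compat.v1 as tf   # graph-mode API (or plain TensorFlow 1.x)
tf.disable_eager_execution()

f_parts = [tf.Variable(v) for v in (1., 2., 0.)]   # one scalar variable per component
f = tf.stack(f_parts)                              # shape (3,)
loss = tf.reduce_prod(f ** 2 - 3 * f + 1)

grads = tf.gradients(loss, f_parts)                # dloss/df_i, one scalar per component
# d/df_i of dloss/df_i gives the Hessian diagonal without the off-diagonal terms
hessian_diag = tf.stack([tf.gradients(g, v)[0] for g, v in zip(grads, f_parts)])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(hessian_diag))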
The Hessian matrix is a way of organizing all the second partial derivative information of a multivariable function.
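Concretely, for a scalar loss L of a vector f of size n, the entries are H_ij = d^2 L / (df_i df_j), and the diagonal the question asks for is just (d^2 L / df_1^2, ..., d^2 L / df_n^2).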
TensorFlow calculates derivatives using automatic differentiation. This is different from symbolic differentiation and from numerical differentiation (a.k.a. finite differences); it is less a clever mathematical trick than a clever programming technique.
To differentiate automatically, TensorFlow needs to remember which operations happen, and in what order, during the forward pass. Then, during the backward pass, TensorFlow traverses this list of operations in reverse order to compute gradients.
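A minimal sketch of that tape mechanism in TF 2.x (the values and names here are just for illustration):
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:      # records operations on watched tensors
    y = x ** 2 + 2.0 * x             # forward pass
dy_dx = tape.gradient(y, x)          # backward pass over the recorded ops
print(dy_dx)                         # 8.0, since dy/dx = 2x + 2 at x = 3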
The following function calculates the second derivatives (the diagonal of the Hessian matrix) in TensorFlow 2.x:
%tensorflow_version 2.x # Tells Colab to load TF 2.x
import tensorflow as tf
def calc_hessian_diag(f, x):
    """
    Calculates the diagonal entries of the Hessian of the function f
    (which maps rank-1 tensors to scalars) at coordinates x (rank-1
    tensors).

    Let k be the number of points in x, and n be the dimensionality of
    each point. For each point k, the function returns
    (d^2f/dx_1^2, d^2f/dx_2^2, ..., d^2f/dx_n^2) .

    Inputs:
        f (function): Takes a shape-(k,n) tensor and outputs a
            shape-(k,) tensor.
        x (tf.Tensor): The points at which to evaluate the Laplacian
            of f. Shape = (k,n).

    Outputs:
        A tensor containing the diagonal entries of the Hessian of f at
        points x. Shape = (k,n).
    """
    # Use the unstacking and re-stacking trick, which comes
    # from https://github.com/xuzhiqin1990/laplacian/
    with tf.GradientTape(persistent=True) as g1:
        # Turn x into a list of n tensors of shape (k,)
        x_unstacked = tf.unstack(x, axis=1)
        g1.watch(x_unstacked)

        with tf.GradientTape() as g2:
            # Re-stack x before passing it into f
            x_stacked = tf.stack(x_unstacked, axis=1)  # shape = (k,n)
            g2.watch(x_stacked)
            f_x = f(x_stacked)  # shape = (k,)

        # Calculate gradient of f with respect to x
        df_dx = g2.gradient(f_x, x_stacked)  # shape = (k,n)
        # Turn df/dx into a list of n tensors of shape (k,)
        df_dx_unstacked = tf.unstack(df_dx, axis=1)

    # Calculate 2nd derivatives
    d2f_dx2 = []
    for df_dxi, xi in zip(df_dx_unstacked, x_unstacked):
        # Take 2nd derivative of each dimension separately:
        #   d/dx_i (df/dx_i)
        d2f_dx2.append(g1.gradient(df_dxi, xi))

    # Stack 2nd derivatives
    d2f_dx2_stacked = tf.stack(d2f_dx2, axis=1)  # shape = (k,n)
    return d2f_dx2_stacked
Here's an example usage, with the function f(x) = ln(r^2), where x are 3D coordinates and r is the radius in spherical coordinates:
f = lambda q : tf.math.log(tf.math.reduce_sum(q**2, axis=1))
x = tf.random.uniform((5,3))
d2f_dx2 = calc_hessian_diag(f, x)
print(d2f_dx2)
The output will look something like this:
tf.Tensor(
[[ 1.415968 1.0215727 -0.25363517]
[-0.67299247 2.4847088 0.70901346]
[ 1.9416015 -1.1799507 1.3937857 ]
[ 1.4748447 0.59702784 -0.52290654]
[ 1.1786096 0.07442689 0.2396735 ]], shape=(5, 3), dtype=float32)
We can check the correctness of the implementation by calculating the Laplacian (i.e., by summing the diagonal of the Hessian matrix) and comparing it to the theoretical answer for our chosen function, 2 / r^2:
print(tf.reduce_sum(d2f_dx2, axis=1)) # Laplacian from summing above results
print(2./tf.math.reduce_sum(x**2, axis=1)) # Analytic expression for Laplacian
I get the following:
tf.Tensor([2.1839054 2.5207298 2.1554365 1.5489659 1.49271 ], shape=(5,), dtype=float32)
tf.Tensor([2.1839058 2.5207298 2.1554365 1.5489662 1.4927098], shape=(5,), dtype=float32)
They agree to within rounding error.
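If you don't mind paying for the extra computation once, you can also cross-check the diagonal against the full per-point Hessian; a sketch using nested tapes and batch_jacobian (reusing the f and x defined above):
with tf.GradientTape() as g1:
    g1.watch(x)
    with tf.GradientTape() as g2:
        g2.watch(x)
        f_x = f(x)                              # shape = (k,)
    df_dx = g2.gradient(f_x, x)                 # shape = (k,n)
full_hessian = g1.batch_jacobian(df_dx, x)      # shape = (k,n,n)
print(tf.linalg.diag_part(full_hessian))        # should match calc_hessian_diag(f, x)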