 

Improve performance of autograd jacobian

I'm wondering how the following code could be made faster. At the moment it seems unreasonably slow, and I suspect I may be using the autograd API wrong. The output I expect is the Jacobian of f evaluated across each element of timeline, which I do get, but it takes a long time:

import numpy as np
from autograd import jacobian


def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z


timeline = np.linspace(1, 100, 40000)

gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]))

I would expect the following:

  1. jacobian(f) returns a function that represents the gradient vector w.r.t. the parameters.
  2. jacobian(f)(np.array([1.0, 1.0])) is the Jacobian evaluated at the point (1, 1). To me, this should be like a vectorized numpy function, so it should execute very fast, even for 40k length arrays. However, this is not what is happening.
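For reference, this particular f is simple enough that its Jacobian can be written out by hand, which gives something to sanity-check autograd's output against (a quick plain-numpy sketch, nothing autograd-specific):

import numpy as np

timeline = np.linspace(1, 100, 40000)

def analytic_jacobian(params):
    mu_, log_sigma_ = params
    # Z = timeline * mu_ / log_sigma_, so
    #   dZ/dmu_        =  timeline / log_sigma_
    #   dZ/dlog_sigma_ = -timeline * mu_ / log_sigma_**2
    return np.column_stack([
        timeline / log_sigma_,
        -timeline * mu_ / log_sigma_ ** 2,
    ])

expected = analytic_jacobian(np.array([1.0, 1.0]))  # shape (40000, 2)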

Even something like the following has the same poor performance:

import numpy as np
from autograd import jacobian


def f(params, t):
    mu_, log_sigma_ = params
    Z = t * mu_ / log_sigma_
    return Z


timeline = np.linspace(1, 100, 40000)

gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]), timeline)


1 Answer

From https://github.com/HIPS/autograd/issues/439 I gathered that there is an undocumented function, autograd.make_jvp, which calculates the Jacobian with a fast forward mode.

The link states:

Given a function f, vectors x and v in the domain of f, make_jvp(f)(x)(v) computes both f(x) and the Jacobian of f evaluated at x, right multiplied by the vector v.

To get the full Jacobian of f you just need to write a loop to evaluate make_jvp(f)(x)(v) for each v in the standard basis of f's domain. Our reverse mode Jacobian operator works in the same way.

From your example:

import autograd.numpy as np
from autograd import make_jvp

def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z

timeline = np.linspace(1, 100, 40000)

gradient_at_mle = make_jvp(f)(np.array([1.0, 1.0]))

# loop through each basis
# [1, 0] evaluates (f(x), first column of jacobian)
# [0, 1] evaluates (f(x), second column of jacobian)
for basis in (np.array([1, 0]), np.array([0, 1])):
    val_of_f, col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)

Output:

[  1.           1.00247506   1.00495012 ...  99.99504988  99.99752494
 100.        ]
[  -1.           -1.00247506   -1.00495012 ...  -99.99504988  -99.99752494
 -100.        ]

This runs in ~0.005 seconds on Google Colab.
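If you want the full (40000, 2) Jacobian as a single array rather than printing one column at a time, you can stack the columns returned for each basis vector; a small sketch along the lines of the loop above:

import autograd.numpy as np
from autograd import make_jvp

def f(params):
    mu_, log_sigma_ = params
    return timeline * mu_ / log_sigma_

timeline = np.linspace(1, 100, 40000)

jvp_at_mle = make_jvp(f)(np.array([1.0, 1.0]))

# each standard-basis vector picks out one column of the Jacobian;
# make_jvp returns (f(x), J @ v), so keep the second element
columns = [jvp_at_mle(basis)[1] for basis in (np.array([1.0, 0.0]), np.array([0.0, 1.0]))]
full_jacobian = np.column_stack(columns)  # shape (40000, 2)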

Edit:

Functions like cdf aren't defined for the regular jvp yet, but you can use another undocumented function, make_jvp_reversemode, where it is defined. Usage is similar, except that the output is only the column and not the value of the function:

import autograd.numpy as np
from autograd.scipy.stats.norm import cdf
from autograd.differential_operators import make_jvp_reversemode


def f(params):
    mu_, log_sigma_ = params
    Z = timeline * cdf(mu_ / log_sigma_)
    return Z

timeline = np.linspace(1, 100, 40000)

gradient_at_mle = make_jvp_reversemode(f)(np.array([1.0, 1.0]))

# loop through each basis
# [1, 0] evaluates first column of jacobian
# [0, 1] evaluates second column of jacobian
for basis in (np.array([1, 0]), np.array([0, 1])):
    col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)

Output:

[0.05399097 0.0541246  0.05425823 ... 5.39882939 5.39896302 5.39909665]
[-0.05399097 -0.0541246  -0.05425823 ... -5.39882939 -5.39896302 -5.39909665]

Note that make_jvp_reversemode will be slightly faster than make_jvp by a constant factor due to its use of caching.
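If you want to check the difference on your own machine, here is a rough timing sketch using the simple f from the question (actual numbers will vary with hardware):

import timeit

import autograd.numpy as np
from autograd import make_jvp
from autograd.differential_operators import make_jvp_reversemode

def f(params):
    mu_, log_sigma_ = params
    return timeline * mu_ / log_sigma_

timeline = np.linspace(1, 100, 40000)
x = np.array([1.0, 1.0])
bases = (np.array([1.0, 0.0]), np.array([0.0, 1.0]))

def forward_mode():
    jvp = make_jvp(f)(x)
    return [jvp(b)[1] for b in bases]  # (value, column) -> keep the column

def reverse_mode():
    jvp = make_jvp_reversemode(f)(x)
    return [jvp(b) for b in bases]     # returns only the column

print("make_jvp:            ", timeit.timeit(forward_mode, number=100))
print("make_jvp_reversemode:", timeit.timeit(reverse_mode, number=100))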
