
Cholesky factor differentiation in TensorFlow

I would like to get the gradient of tf.cholesky with respect to its input. As of the moment, the tf.cholesky does not have a registered gradient:

LookupError: No gradient defined for operation 'Cholesky' (op type: Cholesky)

The code used to generate this error is:

import tensorflow as tf

A = tf.diag(tf.ones([3]))         # 3x3 identity matrix
chol = tf.cholesky(A)             # Cholesky factor of A
cholgrad = tf.gradients(chol, A)  # raises the LookupError above

While it is possible for me to compute the gradient myself and register it, the only existing implementations I've seen of the Cholesky gradient use for loops and require the shape of the input matrix. However, to the best of my knowledge, symbolic loops aren't currently available in TensorFlow.

One possible workaround to getting the shape of the input matrix A would probably be to use:

[int(elem) for elem in list(A.get_shape())]

But this approach doesn't work if the dimensions of A depend on a TensorFlow placeholder object with shape TensorShape([Dimension(None)]).
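As an aside, the static shape (get_shape()) is distinct from the runtime shape, which tf.shape returns as a tensor and which is defined even when the static dimensions are unknown. A small sketch of the difference, written against a recent TensorFlow 2.x API rather than the 1.x API used above (the function name leading_dim is just for illustration):

```python
import tensorflow as tf

# With an input_signature of [None, None], the static shape inside the
# traced function is unknown, mimicking a placeholder with undefined dims.
@tf.function(input_signature=[tf.TensorSpec([None, None], tf.float32)])
def leading_dim(A):
    # A.shape here is (None, None): the static size is unavailable,
    # so int(A.shape[0]) would fail. tf.shape(A) is evaluated at
    # runtime and always yields concrete values.
    return tf.shape(A)[0]
```

Calling leading_dim(tf.eye(4)) returns a tensor holding 4, even though the traced graph never saw a concrete shape.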

If anyone has any idea for how to compute and register a gradient of tf.cholesky, I would very much appreciate knowing about it.

asked Sep 26 '22 by Rui Shu


1 Answer

We discussed this a bit in the answers and comments to this question: TensorFlow cholesky decomposition. It might (?) be possible to port the Theano implementation of CholeskyGrad, provided its semantics are actually what you want. Theano's is based upon Smith's "Differentiation of the Cholesky Algorithm".
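For reference, the reverse-mode rule derived from Smith's paper can be expressed with only matrix products and triangular solves, avoiding explicit elementwise loops. A NumPy sketch of that formula (this is my own restatement, not the Theano code; it assumes A is symmetric positive definite and Lbar is the gradient with respect to the lower-triangular factor):

```python
import numpy as np

def _phi(X):
    # Lower-triangular part of X with the diagonal halved.
    return np.tril(X) - 0.5 * np.diag(np.diag(X))

def cholesky_grad(L, Lbar):
    """Reverse-mode Cholesky gradient.

    Given L = cholesky(A) (lower triangular) and Lbar = dF/dL,
    return the symmetric dF/dA.
    """
    P = _phi(L.T @ Lbar)
    # S = L^{-T} P L^{-1}, computed via two triangular solves.
    inner = np.linalg.solve(L.T, P)        # L^{-T} P
    S = np.linalg.solve(L.T, inner.T).T    # (L^{-T} P) L^{-1}
    return 0.5 * (S + S.T)                 # symmetrize
```

The result can be checked against a central finite difference of any scalar function of the Cholesky factor, which is a useful sanity test before registering the gradient with TensorFlow.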

If you implement it as a C++ operation that the Python just calls into, you have unrestricted access to all the looping constructs you could desire, and anything Eigen provides. If you wanted to do it in pure TensorFlow, you could use the control-flow ops, such as tf.control_flow_ops.While, to loop.
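To illustrate what an in-graph loop looks like, here is a toy example using tf.while_loop, the modern public successor of the control-flow ops mentioned above. It merely sums the rows of a matrix one at a time; it is not a Cholesky gradient, just a demonstration that looping inside TensorFlow is possible:

```python
import tensorflow as tf

def row_sum(A):
    # Sum the rows of A with an explicit in-graph loop.
    n = tf.shape(A)[0]

    def cond(i, acc):
        return i < n

    def body(i, acc):
        return i + 1, acc + A[i]

    acc0 = tf.zeros(tf.shape(A)[1:], A.dtype)
    _, total = tf.while_loop(cond, body, [0, acc0])
    return total
```

For example, row_sum(tf.ones((3, 2))) yields a vector of two 3s, matching tf.reduce_sum(A, axis=0).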

Once you know the actual formula you want to apply, the answer here: matrix determinant differentiation in tensorflow shows how to implement and register a gradient for an op in tensorflow.

You could also create an issue on github to request this feature, though, of course, you'll probably get it faster if you implement it yourself and then send in a pull request. :)

answered Sep 29 '22 by dga