How Can I Define Only the Gradient for a Tensorflow Subgraph?

Tags:

tensorflow

First: I am only a few days in with Tensorflow, so please bear with me.

I started out from the cifar10 tutorial code and I am now using a combination of convolutions and eigenvalue decompositions that breaks the symbolic differentiation. That is, the graph gets built, but upon calling train() the script halts with "No gradient defined for operation [...] (op type: SelfAdjointEig)". No surprise there.

The inputs to the subgraph in question are still only the input feature maps and the filters being used. I have the formulas for the gradients at hand, and they should be straightforward to implement given the subgraph's inputs and the gradient with respect to its output.

From what I can see in the docs, I can register a gradient function for custom ops with RegisterGradient, or override existing ones with the experimental gradient_override_map. Both of those should give me access to exactly the things I need. For example, searching on GitHub I find plenty of examples that access the op's inputs as op.inputs[0] and the like.
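For illustration, a minimal sketch of what such a registered gradient function might look like; the op type name and the formula below are made up, not the actual SelfAdjointEig gradient:

import tensorflow as tf

# Hypothetical op-type name; the gradient function receives the op
# (exposing op.inputs / op.outputs) and the gradient flowing in from
# above, and must return one gradient per input.
@tf.RegisterGradient("MyDecompositionGrad")
def _my_decomposition_grad(op, grad):
    x = op.inputs[0]         # the tensor that fed the op
    return grad * tf.cos(x)  # stand-in for the hand-derived formula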

The problem I have is that I essentially want to "shortcut" a whole subgraph, not a single op, so there is no single op to decorate. Since this happens in one of the convolutional layers of the cifar example, I tried using the scope object for that layer. Conceptually, what enters and exits that scope is exactly what I want, so if I could somehow override the whole scope's gradients, that would already do it.

I saw tf.Graph.create_op, which (I think) I could use to register a new type of operation and then override that operation type's gradient computation with the aforementioned methods. But I don't see a way of defining that op's forward pass without writing it in C++...

Maybe I am approaching this the wrong way entirely? Since all of my forward and backward operations can be implemented with the Python interface, I obviously want to avoid implementing anything in C++.

asked Apr 06 '16 by black_puppydog

2 Answers

Here's a trick from Sergey Ioffe:

Suppose you want a group of ops that behaves as f(x) in the forward pass, but as g(x) in the backward pass. You can implement it as:

t = g(x)
y = t + tf.stop_gradient(f(x) - t)

So in your case, g(x) could be an identity op with a custom gradient registered via gradient_override_map.
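A minimal sketch of how the two pieces could fit together, assuming the TF1-style graph API; the stand-in forward op and the gradient formula are placeholders, not part of the answer:

import tensorflow as tf

@tf.RegisterGradient("SubgraphGrad")
def _subgraph_grad(op, grad):
    # Hand-derived backward pass for the whole subgraph; op.inputs[0]
    # is whatever tensor was fed into the identity op below.
    x = op.inputs[0]
    return grad * tf.cos(x)                  # placeholder formula

def f_with_custom_grad(x):
    graph = tf.get_default_graph()
    with graph.gradient_override_map({"Identity": "SubgraphGrad"}):
        t = tf.identity(x)                   # g(x): identity with custom gradient
    fx = tf.round(x)                         # stand-in for the non-differentiable f(x)
    # Forward value equals f(x); only t contributes in the backward pass.
    return t + tf.stop_gradient(fx - t)

Because the f(x) - t term sits inside tf.stop_gradient, the forward value is exactly f(x), while tf.gradients flows only through the overridden identity.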

answered by Yaroslav Bulatov


From TensorFlow 1.7 onward, tf.custom_gradient is the way to go.
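A minimal sketch, assuming TF >= 1.7; the forward computation and the gradient formula are placeholders for the eigendecomposition subgraph and its hand-derived gradient:

import tensorflow as tf

@tf.custom_gradient
def eig_like_subgraph(x):
    y = tf.round(x)              # stand-in for the non-differentiable forward pass
    def grad(dy):
        return dy * tf.cos(x)    # hand-derived gradient, placeholder formula
    return y, grad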

answered by Stephane Bersier