 

Tensorflow: How to replace or modify gradient?

I would like to replace or modify the gradient of an op or a portion of the graph in TensorFlow. It would be ideal if I could use the existing gradient in the calculation.

In some ways this is the opposite of what tf.stop_gradient() does: instead of adding a calculation which is ignored when calculating gradients, I want a calculation which is only used when calculating gradients.

A simple example would be something that scales gradients by multiplying them by a constant (but does not multiply the forward calculation by that constant). Another example would be something that clips the gradients to a given range.

asked May 08 '17 by Alex I

People also ask

Does TensorFlow have Autograd?

Behind the scenes, TensorFlow is a tensor library with automatic differentiation capability. Hence you can easily use it to solve a numerical optimization problem with gradient descent. In this post, you will learn how TensorFlow's automatic differentiation engine, autograd, works.
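For instance, here is a minimal sketch (assuming TensorFlow 2.x eager execution) of using the autodiff engine to minimize (x - 2)^2 by gradient descent:

import tensorflow as tf

# Minimize (x - 2)^2 with gradient descent; the gradient comes from autodiff.
x = tf.Variable(0.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

for _ in range(100):
    with tf.GradientTape() as tape:
        loss = (x - 2.0) ** 2
    grads = tape.gradient(loss, [x])
    opt.apply_gradients(zip(grads, [x]))

print(x.numpy())  # should be close to 2.0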

How do you apply gradient clipping in TensorFlow?

Applying gradient clipping in TensorFlow models is quite straightforward. The only thing you need to do is pass the parameter to the optimizer function. All optimizers have `clipnorm` and `clipvalue` parameters that can be used to clip the gradients.
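For example, a minimal sketch with a tf.keras optimizer:

import tensorflow as tf

# clipvalue clips each gradient element to [-0.5, 0.5];
# clipnorm rescales each gradient tensor so its L2 norm is at most 1.0.
opt_value = tf.keras.optimizers.SGD(learning_rate=0.01, clipvalue=0.5)
opt_norm = tf.keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)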

Can TensorFlow compute gradients?

TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse-mode differentiation.
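A minimal sketch, assuming TensorFlow 2.x eager execution:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x  # recorded on the tape

# Reverse-mode differentiation over the recorded computation:
print(tape.gradient(y, x).numpy())  # 6.0, i.e. d(x^2)/dx at x = 3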

What is stop gradient?

stop_gradient() is an operation that acts as the identity function in the forward direction but stops the accumulated gradient from flowing through that operator in the backward direction.
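A small illustration (again assuming TensorFlow 2.x):

import tensorflow as tf

x = tf.Variable(2.0)
with tf.GradientTape() as tape:
    # Forward value is x * x = 4.0, but the second factor is treated
    # as a constant in the backward pass.
    y = x * tf.stop_gradient(x)

print(tape.gradient(y, x).numpy())  # 2.0 rather than 4.0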


2 Answers

For TensorFlow 1.7 and TensorFlow 2.0, see the edit below.


First define your custom gradient:

@tf.RegisterGradient("CustomGrad")
def _const_mul_grad(unused_op, grad):
    return 5.0 * grad

Since you want nothing to happen in the forward pass, override the gradient of an identity operation with your new gradient:

g = tf.get_default_graph()
with g.gradient_override_map({"Identity": "CustomGrad"}):
    output = tf.identity(input, name="Identity")

Here is a working example with a layer that clips gradients in the backward pass and does nothing in the forward pass, using the same method:

import tensorflow as tf

@tf.RegisterGradient("CustomClipGrad")
def _clip_grad(unused_op, grad):
    return tf.clip_by_value(grad, -0.1, 0.1)

input = tf.Variable([3.0], dtype=tf.float32)

g = tf.get_default_graph()
with g.gradient_override_map({"Identity": "CustomClipGrad"}):
    output_clip = tf.identity(input, name="Identity")
grad_clip = tf.gradients(output_clip, input)

# output without gradient clipping in the backward pass for comparison:
output = tf.identity(input)
grad = tf.gradients(output, input)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("with clipping:", sess.run(grad_clip)[0])
    print("without clipping:", sess.run(grad)[0])
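If this runs as expected, it should print a clipped gradient of 0.1 and an unclipped gradient of 1.0: the gradient of the identity op is 1.0, which falls outside the [-0.1, 0.1] range and gets clipped.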

Edit for TensorFlow 1.7 and TensorFlow 2.0

Since 1.7 there is a new way to redefine the gradient with shorter syntax, which also works with TensorFlow 2.0. It also allows redefining the gradient of multiple operations at the same time. Here are the examples from above, rewritten for TensorFlow 1.7 and TensorFlow 2.0:

Layer that scales gradients in the backward pass:

@tf.custom_gradient
def scale_grad_layer(x):
    def grad(dy):
        return 5.0 * dy
    return tf.identity(x), grad
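A quick sanity check (assuming TensorFlow 2.x eager execution, with scale_grad_layer defined as above):

x = tf.Variable([3.0])
with tf.GradientTape() as tape:
    y = scale_grad_layer(x)

print(tape.gradient(y, x).numpy())  # [5.0]: the identity gradient 1.0, scaled by 5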

Example with a layer that clips gradients in the backward pass:

@tf.custom_gradient
def clip_grad_layer(x):
    def grad(dy):
        return tf.clip_by_value(dy, -0.1, 0.1)
    return tf.identity(x), grad
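And the corresponding check (again assuming TensorFlow 2.x eager execution):

x = tf.Variable([3.0])
with tf.GradientTape() as tape:
    y = clip_grad_layer(x)

print(tape.gradient(y, x).numpy())  # [0.1]: the identity gradient 1.0, clipped to [-0.1, 0.1]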
answered Sep 21 '22 by BlueSun


Assuming the forward computation is

y = f(x) 

And you want it to backpropagate like

y = b(x) 

A simple hack is:

y = b(x) + tf.stop_gradient(f(x) - b(x)) 
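This works because tf.stop_gradient() is the identity in the forward pass, so the forward value is b(x) + f(x) - b(x) = f(x), while in the backward pass the stopped term contributes no gradient, so the gradient is that of b(x). A quick sanity check (a sketch assuming TensorFlow 2.x, replicating the scale-by-5 example with f(x) = x and b(x) = 5x):

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    f = x            # desired forward behaviour: f(x) = x
    b = 5.0 * x      # desired backward behaviour: gradient of b is 5
    y = b + tf.stop_gradient(f - b)

print(y.numpy())                    # 3.0 -- equals f(x)
print(tape.gradient(y, x).numpy())  # 5.0 -- equals b'(x)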
answered Sep 21 '22 by Bily