
Is there a way to clip intermediate exploded gradients in TensorFlow?

Problem: a very long RNN

N1 -- N2 -- ... -- N100

For an optimizer like AdamOptimizer, compute_gradients() returns gradients for all trainable variables.

However, the gradients might explode at some step during training.

A method like the one in how-to-effectively-apply-gradient-clipping-in-tensor-flow can clip the large final gradients.
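
For reference, that final-gradient clipping looks roughly like this; a minimal TF1-style sketch, with a toy variable and loss standing in for the real RNN graph:

import tensorflow as tf

# Toy setup (assumed for illustration): one variable and a scalar loss.
w = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(tf.square(w))

opt = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = opt.compute_gradients(loss)
grads, tvars = zip(*grads_and_vars)
# Clip by the global norm of all final gradients before the update.
clipped_grads, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)
train_op = opt.apply_gradients(zip(clipped_grads, tvars))

Note that this only touches the gradients after backprop has run all the way through the net, which is exactly the limitation the question is about.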

But how can the intermediate ones be clipped?

One way might be to do the backprop manually from N100 to N99, clip the gradients, then from N99 to N98, and so on, but that is far too complicated.

So my question is: is there an easier way to clip the intermediate gradients? (Of course, strictly speaking, once clipped they are no longer gradients in the mathematical sense.)

asked Oct 12 '16 by user1441268

1 Answer

import tensorflow as tf

@tf.custom_gradient
def gradient_clipping(x):
  # Identity in the forward pass; clip the incoming gradient's norm to 10 in the backward pass.
  return x, lambda dy: tf.clip_by_norm(dy, 10.0)
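
A possible way to use this (a sketch, not from the original answer): apply gradient_clipping to the state between timesteps in a manually unrolled RNN, so the backward pass clips the gradient at every step. The cell, shapes, and unrolling below are assumed for illustration:

# Assumed setup: SimpleRNNCell with state size 32, unrolled over
# 10 timesteps, batch size 8, input size 16.
cell = tf.keras.layers.SimpleRNNCell(32)
state = [tf.zeros([8, 32])]
for x_t in tf.unstack(tf.random.normal([10, 8, 16])):
    out, state = cell(x_t, state)
    # Clip the gradient flowing backward through each step's state.
    state = [gradient_clipping(s) for s in state]

Because the forward pass is the identity, this changes nothing about the computed outputs; only the backward signal is altered.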
answered Sep 28 '22 by Hanhan Li