Problem: a very long RNN net
N1 -- N2 -- ... -- N100
For an optimizer like AdamOptimizer, compute_gradients() returns gradients for all trainable variables. However, they might explode at some step.
A method like the one in how-to-effectively-apply-gradient-clipping-in-tensor-flow can clip the large final gradients. But how can the intermediate ones be clipped?
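For reference, that approach only clips the final, fully back-propagated gradients, roughly like this (a sketch using the TF1-style optimizer API; the toy variable, loss, and the clip norm of 5.0 are just placeholders):

import tensorflow as tf

w = tf.Variable([1.0, 2.0])                      # toy variable and loss
loss = tf.reduce_sum(tf.square(w))
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = optimizer.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
clipped, _ = tf.clip_by_global_norm(grads, 5.0)  # clip only the final gradients
train_op = optimizer.apply_gradients(list(zip(clipped, variables)))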
One way might be to do the backprop manually from "N100 --> N99", clip the gradients, then "N99 --> N98", and so on, but that's far too complicated.
So my question is: is there any easier method to clip the intermediate gradients? (Of course, strictly speaking, they are no longer gradients in the mathematical sense.)
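You can use tf.custom_gradient to define an op that is the identity on the forward pass but clips whatever gradient flows backward through it: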
import tensorflow as tf

@tf.custom_gradient
def gradient_clipping(x):
    # Identity on the forward pass; clip the incoming gradient to norm 10.0 on backprop.
    return x, lambda dy: tf.clip_by_norm(dy, 10.0)
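A rough sketch of how it could be inserted into a manually unrolled loop (cell, inputs, and initial_state are placeholders, and a cell with a single state tensor is assumed):

# Hypothetical unrolled RNN: clip the gradient of the state between steps.
state = initial_state
for x_t in inputs:                    # one input tensor per time step
    output, state = cell(x_t, state)  # any cell returning a single state tensor
    state = gradient_clipping(state)  # dL/d(state) gets clipped during backprop

Because gradient_clipping is the identity in the forward direction, the forward computation is unchanged; only the gradient passing backward through each step is limited in norm.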