
How to check for NaN in gradients in TensorFlow when updating?

All,

When you train a large model on a large number of samples, some samples may cause NaN gradients during the parameter update.

I want to find these samples. At the same time, I don't want that batch's gradients to update the model's parameters, because that could make the parameters themselves NaN.

So does anyone have a good idea for dealing with this problem?

My code looks like this:

    # Create an optimizer.
    params = tf.trainable_variables()
    opt = tf.train.AdamOptimizer(1e-3)

    # Gradients of the loss w.r.t. every trainable parameter.
    gradients = tf.gradients(self.loss, params)

    # Clip the gradients to a maximum global norm.
    max_gradient_norm = 10
    clipped_gradients, self.gradient_norms = tf.clip_by_global_norm(
        gradients, max_gradient_norm)

    self.optimizer = opt.apply_gradients(zip(clipped_gradients, params))
Asked Nov 20 '16 by Issac


1 Answer

You can check whether your gradients contain NaN with tf.check_numerics. It takes a single tensor and an error message, so apply it to each clipped gradient:

    grad_check = [tf.check_numerics(g, "NaN or Inf in gradient")
                  for g in clipped_gradients if g is not None]
    with tf.control_dependencies(grad_check):
        self.optimizer = opt.apply_gradients(zip(clipped_gradients, params))

Each of these check ops raises an InvalidArgumentError if its clipped gradient contains NaN or infinite values.
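In a feed_dict-based TF1 training loop you can catch that error to log and skip the offending batch. A minimal sketch; `sess`, `x`, `y` and `batches` are hypothetical placeholders for your own session and input pipeline, not names from the question:

    # Hypothetical training loop: `batches`, `x` and `y` stand in for your own
    # input pipeline and placeholders.
    for step, (batch_x, batch_y) in enumerate(batches):
        try:
            sess.run(self.optimizer, feed_dict={x: batch_x, y: batch_y})
        except tf.errors.InvalidArgumentError as e:
            # The check failed before apply_gradients ran (the control dependency
            # above guarantees the ordering), so this batch did not touch the
            # parameters; log it so the offending samples can be inspected.
            print("Skipping batch %d, non-finite gradient: %s" % (step, str(e)))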

tf.control_dependencies makes sure that the checks are evaluated before the gradients are applied.
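If you would rather skip such a batch silently instead of raising an error, one alternative is to zero out the gradients whenever any of them is non-finite. This is a sketch of my own, not part of this answer, and it assumes dense float32 gradients plus the variables from the question:

    # Scalar bool: True only if every clipped gradient is entirely finite.
    all_finite = tf.reduce_all(
        tf.stack([tf.reduce_all(tf.is_finite(g))
                  for g in clipped_gradients if g is not None]))

    # 1.0 when all gradients are finite, 0.0 otherwise. NaN/Inf entries are first
    # replaced by zeros with tf.where, because 0 * NaN would still be NaN.
    scale = tf.cast(all_finite, tf.float32)
    safe_gradients = [
        scale * tf.where(tf.is_finite(g), g, tf.zeros_like(g))
        if g is not None else None
        for g in clipped_gradients
    ]
    self.optimizer = opt.apply_gradients(zip(safe_gradients, params))

`all_finite` can be fetched in sess.run() to log which batches were skipped. Note that with a stateful optimizer such as Adam a zero gradient is not a perfect no-op (the moment estimates still decay), but the parameters can no longer pick up NaNs from the gradients.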

Also see tf.add_check_numerics_ops().
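That helper attaches a numeric check to every floating-point tensor in the default graph, which is useful for locating exactly which op first produces a NaN. A rough usage sketch, assuming the graph from the question has already been built and `sess` and `train_feed` are hypothetical placeholders:

    # Must be called after the whole graph (including the optimizer) is built.
    check_op = tf.add_check_numerics_ops()

    # Fetching the check op together with the train op makes any NaN/Inf anywhere
    # in the graph raise an InvalidArgumentError naming the op that produced it.
    sess.run([self.optimizer, check_op], feed_dict=train_feed)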

Answered Nov 10 '22 by yuefengz