I would like to monitor the gradient changes in TensorBoard with Keras to decide whether the gradients are vanishing or exploding. What should I do?
To check for vanishing / exploding gradients, pay attention to the gradients' distribution and absolute values in the layer of interest ("Distributions" tab): if the distribution is highly peaked and concentrated around 0, the gradients are probably vanishing. Here's a concrete example of how it looks in practice.
Gradient Clipping
Another popular technique to mitigate the exploding gradients problem is to clip the gradients during backpropagation so that they never exceed some threshold. This is called Gradient Clipping. In Keras it is configured on the optimizer; with a clip value of 1.0, the optimizer will clip every component of the gradient vector to a value between –1.0 and 1.0 (see the sketch below).
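Here is a minimal sketch of that setup, assuming the standard Keras API; the tiny model is only a placeholder to show where the optimizer goes, and clipnorm is the related option that rescales the whole gradient vector instead of clipping each component:

import keras

# clipvalue=1.0 caps every component of the gradient at +/-1.0 before the
# weight update; clipnorm=1.0 would instead rescale the whole gradient
# vector whenever its L2 norm exceeds 1.0.
optimizer = keras.optimizers.SGD(clipvalue=1.0)

# Placeholder model just to show where the optimizer is plugged in.
model = keras.models.Sequential([keras.layers.Dense(1, input_shape=(10,))])
model.compile(optimizer=optimizer, loss="mse")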
Method to overcome the problem
The vanishing gradient problem is caused by the derivative of the activation function used to build the neural network: saturating activations such as sigmoid have very small derivatives, so the gradient shrinks as it is propagated back through many layers. The simplest solution is to replace the activation function of the network. Instead of sigmoid, use an activation function such as ReLU (see the sketch below).
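As a minimal sketch (the layer sizes and input shape are arbitrary placeholders), the hidden layers use ReLU instead of sigmoid, keeping a sigmoid only at the output for a binary-classification setup:

import keras

# The sigmoid derivative is at most 0.25, so a deep stack of sigmoid layers
# multiplies many small factors and shrinks the gradient. ReLU has a
# derivative of 1 for positive inputs, so gradients pass through largely
# unchanged.
model = keras.models.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # instead of "sigmoid"
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # sigmoid only at the output
])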
To visualize the training in TensorBoard, add a keras.callbacks.TensorBoard callback to the model.fit call. Don't forget to set write_grads=True to see the gradients there. Right after training starts, you can run

tensorboard --logdir=/full_path_to_your_logs

from the command line and point your browser to http://localhost:6006. See the example code in this question.
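Here is a minimal end-to-end sketch of that setup. It assumes an older standalone Keras (2.x) where the TensorBoard callback still accepts write_grads (the argument was removed from recent tf.keras releases); gradient histograms are only written when histogram_freq > 0 and validation data is available. The data, model, and log directory are placeholders.

import numpy as np
import keras

# Placeholder data and model; replace with your own.
x_train = np.random.rand(256, 20)
y_train = np.random.randint(0, 2, size=(256, 1))

model = keras.models.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# write_grads=True logs gradient histograms to the "Distributions" and
# "Histograms" tabs; it only takes effect when histogram_freq > 0.
tb_callback = keras.callbacks.TensorBoard(
    log_dir="/full_path_to_your_logs",
    histogram_freq=1,
    write_grads=True,
)

model.fit(
    x_train, y_train,
    epochs=5,
    validation_split=0.2,  # histograms are computed on the validation data
    callbacks=[tb_callback],
)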
In the "Distributions" tab, if the gradient magnitudes keep growing very large over time, the gradients are probably exploding; in that case the values often turn into NaNs very quickly as well.