I think I might have a problem with dead ReLUs, but I don't really know how to check for it with TensorBoard or any other way. Your help would be really appreciated.
The dying ReLU problem refers to ReLU neurons becoming inactive and outputting 0 for every input. There are many empirical and heuristic explanations of why ReLU neurons die.

It can be mitigated by using smaller learning rates, so that a large gradient update doesn't leave a ReLU neuron with weights and a bias so negative that it never activates again. Another fix is the Leaky ReLU, which lets neurons outside the active interval leak some gradient backward.

Leaky ReLU is a common and effective way to address the dying ReLU problem: it adds a slight slope in the negative range, so the function produces small negative outputs (rather than a hard zero) when the input is less than 0.
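As a quick illustration, here is a minimal sketch in TensorFlow 1.x of swapping the two activations; the input shape, layer size, and alpha=0.01 are just placeholder values, not anything from the answers below:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 128])      # hypothetical input batch
layer = tf.layers.dense(x, 64, activation=None)        # pre-activation output of a dense layer
relu_out = tf.nn.relu(layer)                           # zero output and zero gradient for inputs < 0, so a neuron can get stuck
leaky_out = tf.nn.leaky_relu(layer, alpha=0.01)        # small negative slope keeps a little gradient flowing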
I had this same question myself and couldn't find an answer, so here's how I'm doing it with TensorBoard (this assumes some familiarity with TensorBoard).
activation = tf.nn.relu(layer)  # post-ReLU activations, shape [batch_size, num_neurons]
# Inner count: per-neuron number of nonzero activations across the batch.
# Outer count: number of neurons that fired at least once this step.
active = tf.count_nonzero(tf.count_nonzero(activation, axis=0))
# Fraction of neurons in the layer that were active for this batch.
tf.summary.scalar('pct-active-neurons', tf.cast(active, tf.float32) / tf.cast(tf.shape(layer)[1], tf.float32))
In this snippet, activation is my post-ReLU activation for this particular layer. The first call, tf.count_nonzero(activation, axis=0), counts how many nonzero activations each neuron produced across all training examples in the current batch. The second call, tf.count_nonzero( ... ), which wraps the first, counts how many neurons in the layer had at least one activation for the batch of training examples in this step. Finally, I convert that to a percentage by dividing the number of neurons that had at least one activation in the training step by the total number of neurons in the layer.
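To make the counting concrete, here's a tiny toy example (TensorFlow 1.x; the 3x4 activation matrix and layer size are made up):

import tensorflow as tf

# Hypothetical post-ReLU activations: a batch of 3 examples, a layer of 4 neurons.
activation = tf.constant([[0., 1., 0., 2.],
                          [0., 3., 0., 0.],
                          [0., 0., 0., 1.]])
per_neuron = tf.count_nonzero(activation, axis=0)   # -> [0, 2, 0, 2]: nonzero activations per neuron
active = tf.count_nonzero(per_neuron)               # -> 2: neurons that fired at least once
fraction = tf.cast(active, tf.float32) / 4.0        # 4 neurons in this toy layer -> 0.5
with tf.Session() as sess:
    print(sess.run([per_neuron, active, fraction])) # [array([0, 2, 0, 2]), 2, 0.5]

Here neurons 0 and 2 never fired for this batch, so only half the layer counts as active for the step.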
More information on setting up TensorBoard can be found here.
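For reference, a rough sketch of the usual TF 1.x plumbing that actually writes the scalar so it appears in TensorBoard's Scalars tab; sess, train_op, feed, step, and the ./logs directory are stand-ins for whatever your own training loop uses:

merged = tf.summary.merge_all()                           # picks up the 'pct-active-neurons' scalar above
writer = tf.summary.FileWriter('./logs', sess.graph)      # any log directory works
summary_str, _ = sess.run([merged, train_op], feed_dict=feed)
writer.add_summary(summary_str, global_step=step)         # then run: tensorboard --logdir ./logs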