When training an object detection DNN with TensorFlow's Object Detection API, its visualization platform TensorBoard plots a scalar named regularization_loss_1.
What is this? I know what regularization is (making the network better at generalizing through methods such as dropout), but it is not clear to me what this displayed loss could be.
Thanks!
Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are summed into the loss function that the network optimizes. Regularization penalties are applied on a per-layer basis.
Regularization is a set of techniques that can prevent overfitting in neural networks and thus improve the accuracy of a Deep Learning model when facing completely new data from the problem domain.
To add a regularizer to a layer, pass the preferred regularization technique to the layer's keyword argument 'kernel_regularizer'. The Keras regularizers take a parameter that sets the regularization hyperparameter value, that is, how strongly the penalty is weighted.
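As a minimal sketch of the above, assuming TensorFlow 2.x with its bundled Keras API: the toy model, the layer sizes, and the 0.01 strength are illustrative choices, not taken from the Object Detection API. It attaches an L2 regularizer to one layer and reads back the penalty that Keras collects for it.

```python
import tensorflow as tf

# Hypothetical toy model; layer sizes and the 0.01 strength are illustrative only.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(
        8,
        activation="relu",
        # Penalize large kernel weights in this layer: 0.01 * sum(w ** 2).
        kernel_regularizer=tf.keras.regularizers.L2(0.01),
    ),
    tf.keras.layers.Dense(1),  # no regularizer on this layer
])

# Each regularized layer contributes one penalty term to model.losses.
# Summing them gives the regularization loss that is added to the network loss.
regularization_loss = tf.add_n(model.losses)
print(float(regularization_loss))
```

Only the first layer has a regularizer here, so `model.losses` holds a single term; adding a regularizer to the second layer would add a second term to the sum.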
TL;DR: it's just the additional loss generated by the regularization function. Add that to the network's loss and optimize over the sum of the two.
As you correctly state, regularization methods are used to help a model generalize better. One way to achieve this is to add a regularization term to the loss function. This term is a generic function that modifies the "global" loss (as in, the sum of the network loss and the regularization loss) in order to drive the optimization algorithm in desired directions.
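Written as a formula, the "global" loss described above could look like the following. The symbols are my own notation, not something TensorFlow defines: w stands for the network weights, R for the regularization function, and \lambda for a scaling factor that sets how much the regularization term counts.

```latex
L_{\text{global}}(w) = L_{\text{network}}(w) + \lambda \, R(w)
```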
Let's say, for example, that for whatever reason I want to encourage solutions to the optimization that have weights as close to zero as possible. One approach, then, is to add a function of the network weights (for example, a scaled-down sum of all the absolute values of the weights) to the loss produced by the network. Since the optimization algorithm minimizes the global loss, my regularization term (which is high when the weights are far from zero) will push the optimization towards solutions that have weights close to zero.
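Here is a small pure-Python sketch of that idea. Everything in it is made up for illustration: the "network loss" is just a stand-in (squared distance to some target weights), and the names `l1_penalty`, `network_loss`, and `train` are hypothetical. It runs plain gradient descent on the global loss, once without and once with the penalty.

```python
def l1_penalty(weights, scale):
    """Regularization loss: scaled-down sum of the absolute values of the weights."""
    return scale * sum(abs(w) for w in weights)


def network_loss(weights, targets):
    """Stand-in for the network loss: squared distance to some target weights."""
    return sum((w - t) ** 2 for w, t in zip(weights, targets))


def train(targets, scale, steps=500, lr=0.05):
    """Gradient descent on network_loss + l1_penalty, starting at the targets."""
    weights = list(targets)
    for _ in range(steps):
        for i in range(len(weights)):
            w, t = weights[i], targets[i]
            grad_network = 2.0 * (w - t)           # d/dw of (w - t) ** 2
            sign = (w > 0) - (w < 0)               # subgradient of abs(w)
            grad_penalty = scale * sign            # d/dw of scale * abs(w)
            weights[i] = w - lr * (grad_network + grad_penalty)
    return weights


unregularized = train([2.0, -3.0], scale=0.0)
regularized = train([2.0, -3.0], scale=1.0)
print("without penalty:", unregularized)
print("with penalty:   ", regularized)
```

With the penalty switched on, the weights settle closer to zero than the values the stand-in network loss alone would pick, which is the trade-off the global loss encodes.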