Looking at an example 'solver.prototxt', posted on the BVLC/caffe git repository, there is a training meta parameter
weight_decay: 0.04
What does this meta parameter mean? And what value should I assign to it?
The most common type of regularization is L2, also called simply “weight decay,” with values often on a logarithmic scale between 0 and 0.1, such as 0.1, 0.001, 0.0001, etc. Reasonable values of lambda [regularization hyperparameter] range between 0 and 0.1.
Weight decay is a regularization technique that is used in machine learning to reduce the complexity of a model and prevent overfitting. It has been shown to improve the generalization performance of many types of machine learning models, including deep neural networks.
The weight_decay meta parameter governs the regularization term of the neural net.
During training, a regularization term is added to the network's loss to compute the backprop gradient. The weight_decay value determines how dominant this regularization term will be in the gradient computation.
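To make the role of the coefficient concrete, here is a minimal sketch in plain Python. It is not Caffe's actual code: it assumes the L2 penalty has the form (weight_decay / 2) * sum(w^2), so its gradient is weight_decay * w, and the helper names (l2_regularized_gradient, sgd_step) are made up for illustration. The data gradient is set to zero so that only the effect of the regularization term is visible.

```python
def l2_regularized_gradient(weights, data_grad, weight_decay):
    """Gradient of data_loss + (weight_decay / 2) * sum(w ** 2).

    The penalty contributes weight_decay * w to each weight's gradient
    (assumed form of the penalty, see the text above).
    """
    return [g + weight_decay * w for w, g in zip(weights, data_grad)]


def sgd_step(weights, data_grad, lr, weight_decay):
    """One plain SGD update using the regularized gradient."""
    grad = l2_regularized_gradient(weights, data_grad, weight_decay)
    return [w - lr * g for w, g in zip(weights, grad)]


weights = [0.5, -2.0, 0.0]
data_grad = [0.0, 0.0, 0.0]  # hypothetical: pretend the data loss is flat here

# With no data gradient, each weight is simply pulled toward zero
# by lr * weight_decay * w, so larger weights shrink by more.
new_weights = sgd_step(weights, data_grad, lr=0.1, weight_decay=0.04)
print(new_weights)
```

With a larger weight_decay the pull toward zero grows in proportion, which is what "more dominant in the gradient computation" means in practice.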
As a rule of thumb, the more training examples you have, the weaker this term should be. The more parameters you have (i.e., deeper net, larger filters, larger InnerProduct layers etc.) the higher this term should be.
Caffe also allows you to choose between L2 regularization (default) and L1 regularization, by setting
regularization_type: "L1"
However, since in most cases weights are small numbers (i.e., -1<w<1), the L2 penalty (the sum of squared weights) is significantly smaller than the L1 penalty (the sum of absolute values). Thus, if you choose to use regularization_type: "L1" you might need to tune weight_decay to a significantly smaller value.
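A small numeric sketch of this point, in plain Python with made-up weight values. It compares the two penalties and the per-weight gradient magnitudes they contribute, assuming the L2 penalty is the sum of squares (scaled by 1/2 for the gradient) and the L1 penalty is the sum of absolute values. The function names are illustrative and not part of Caffe.

```python
def l1_penalty(weights):
    """Sum of absolute values of the weights."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squared weights."""
    return sum(w * w for w in weights)


small_weights = [0.1, -0.05, 0.02, -0.2]  # hypothetical values with |w| < 1

l1 = l1_penalty(small_weights)
l2 = l2_penalty(small_weights)
print(l1, l2, l1 / l2)  # the L1 penalty is several times larger here

# Per-weight gradient magnitude of each penalty (before multiplying by
# weight_decay): |w| for (1/2) * w**2, and 1 for |w| whenever w != 0.
l2_grad_mag = [abs(w) for w in small_weights]
l1_grad_mag = [1.0 if w != 0 else 0.0 for w in small_weights]
print(l2_grad_mag, l1_grad_mag)
```

Because the L1 gradient magnitude does not shrink with the weight, the same weight_decay value pushes small weights much harder under L1 than under L2, which is why a smaller coefficient may be needed.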
While the learning rate may (and usually does) change during training, the regularization weight is fixed throughout.
Weight decay is a regularization term that penalizes big weights. When the weight decay coefficient is big, the penalty for big weights is also big; when it is small, weights can grow freely.
Look at this answer (not specific to caffe) for a better explanation: Difference between neural net "weight decay" and "learning rate".