I came across a snippet of TensorFlow 2.0 code that computes the loss. The total loss is composed of two parts: 1) regularization loss, 2) prediction loss. My question is: why is model.losses the regularization loss? (model here is an instance of tf.keras.Model.) I'm somewhat confused by the official TensorFlow API documentation for tf.keras.Model, which says:
Losses which are associated with this Layer. Variable regularization tensors are created when this property is accessed, so it is eager safe: accessing losses under a tf.GradientTape will propagate gradients back to the corresponding variables.
Why can we get the regularization loss by accessing the losses property? Also, what does eager safe mean? And if the losses property returns the regularization loss, why is it named losses instead of regularization_loss?
with tf.GradientTape() as tape:
    outputs = model(images, training=True)
    regularization_loss = tf.reduce_sum(model.losses)
    pred_loss = ...
    total_loss = pred_loss + regularization_loss
We get the regularization losses by accessing the losses property because these losses are created during model definition. Since the model is a Keras model, you built it out of Keras layers, and every Keras layer (Dense, Conv3D, ...) can be regularized; the regularization is a property of the layer itself. The model, being an ordered collection of layers, collects all of its layers' losses in its own losses property.
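
For instance, here is a minimal sketch of that behaviour (the layer sizes and the l2 factor are arbitrary choices for illustration): a kernel regularizer declared on a single Dense layer shows up in that layer's losses and, in turn, in the model's losses.

import tensorflow as tf

# Only the second layer declares a regularizer; sizes and the 0.01 factor are arbitrary.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(4, kernel_regularizer=tf.keras.regularizers.l2(0.01)),
])

# The regularization penalty lives on the layer that declares it...
print(model.layers[1].losses)  # a single l2 penalty tensor on the second kernel

# ...and the model aggregates the losses of all of its layers.
print(model.losses)            # the same tensor, collected at the model level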
Eager safe means that you can use the losses property of the model during eager training and be sure that the gradient is propagated only to the correct layers. For example, if you add an l2 regularization only to the second layer of a model, only the variables of that second layer are influenced (and updated) by that term of the loss, as the sketch below shows.
It is named losses instead of regularization_losses because it is not limited to regularization losses: any loss you register on a layer with add_loss (for example an activity penalty or another custom term) is collected in that property as well.