How can I calculate the loss without the weight decay in Keras?

Tags:

keras

I defined a convolutional layer in Keras and applied L2 weight decay to it.

When I define the loss for the model and call model.fit(), is the weight decay loss included in the reported loss? If it is, how can I get the loss without the weight decay term during training?

I want to inspect the loss without the weight decay, while still having the weight decay applied during training.

asked Sep 24 '17 08:09

Kevin Sun

1 Answer

Yes, weight decay losses are included in the loss value printed on the screen.

The value you want to monitor is the total loss minus the sum of regularization losses.

The total loss is just model.total_loss.
The regularization losses are collected in the list model.losses.

The following lines can be found in the source code of model.compile():

# Add regularization penalties
# and other layer-specific losses.
for loss_tensor in self.losses:
    total_loss += loss_tensor

To get the loss without weight decay, you can reverse the above operation: the value to monitor is model.total_loss - sum(model.losses).

Monitoring this value is a bit tricky. Fortunately, the list of metrics used by a Keras model is not fixed until model.fit() is called, so you can append this value to the list and it will be printed on the screen during model fitting.
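As a rough illustration of that bookkeeping, here is a plain-Python sketch with no Keras involved. The helper names (mse, l2_penalty) and all numbers are made up for the example; reg_losses only plays the role of model.losses. It assumes the L2 penalty has the form coeff * sum(w^2), which is the same formula used in the manual check further down.

```python
# Plain-Python illustration of the loss bookkeeping; no Keras involved.
# Helper names and all numbers are hypothetical, chosen only for this example.

def mse(y_true, y_pred):
    """Mean squared error over two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def l2_penalty(weights, coeff):
    """L2 penalty in the form coeff * sum(w^2)."""
    return coeff * sum(w ** 2 for w in weights)

y_true = [1.0, 0.0, 2.0]
y_pred = [0.5, 0.5, 1.0]
layer_weights = [[0.3, -0.2], [1.0, 0.5, -0.5]]  # two regularized "layers"

data_loss = mse(y_true, y_pred)

# One penalty per regularized layer; stands in for model.losses.
reg_losses = [l2_penalty(w, 0.01) for w in layer_weights]

# Same accumulation as the loop quoted from model.compile().
total_loss = data_loss
for reg in reg_losses:
    total_loss += reg

# Reversing the accumulation recovers the data term.
loss_no_weight_decay = total_loss - sum(reg_losses)

print(total_loss, loss_no_weight_decay)
```

The subtraction recovers the data term only up to floating-point rounding, which is also why the values printed by Keras can differ in the last digit.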

Here's a simple example:

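from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dense
from keras.models import Model
from keras.regularizers import l2

# A small model with one L2-regularized convolutional layer.
input_tensor = Input(shape=(64, 64, 3))
hidden = Conv2D(32, 1, kernel_regularizer=l2(0.01))(input_tensor)
hidden = GlobalAveragePooling2D()(hidden)
out = Dense(1)(hidden)
model = Model(input_tensor, out)
model.compile(loss='mse', optimizer='adam')

# Register the unregularized loss as an extra metric after compile().
# Note: this relies on model.metrics_tensors, which may not be available in newer Keras releases.
loss_no_weight_decay = model.total_loss - sum(model.losses)
model.metrics_tensors.append(loss_no_weight_decay)
model.metrics_names.append('loss_no_weight_decay')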

When you run model.fit(), something like this will be printed to the screen:

Epoch 1/1
100/100 [==================] - 0s - loss: 0.5764 - loss_no_weight_decay: 0.5178

You can also verify whether this value is correct by computing the L2 regularization manually:

import numpy as np

# model.layers[0] is the input layer, so layers[1] is the Conv2D layer.
conv_kernel = model.layers[1].get_weights()[0]
print(np.sum(0.01 * np.square(conv_kernel)))

In my case, the printed value is 0.0585, which matches the difference between loss and loss_no_weight_decay (up to rounding error).

answered Sep 21 '22 16:09

Yu-Yang
