Adding regularizer to an existing layer of a trained model without resetting weights?

Tags: python, tensorflow, keras

Let's say I'm doing transfer learning with InceptionV3: I add a few layers on top and train the model for a while.

Here is what my model topology looks like:

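from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu', name='Dense_1')(x)
predictions = Dense(12, activation='softmax', name='Predictions')(x)
# Keras 2 uses the plural keyword arguments `inputs` and `outputs`
model = Model(inputs=base_model.input, outputs=predictions)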

I train this model for a while, save it, and load it again for retraining. This time I want to add an L2 regularizer to Dense_1 without resetting its weights. Is this possible?

from keras.models import load_model

path = r'.\model.hdf5'
model = load_model(path)

The docs only show that a regularizer can be added as a parameter when you initialize a layer:

from keras import regularizers
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(0.01),
                activity_regularizer=regularizers.l1(0.01)))

This essentially creates a new layer, so my layer's weights would be reset.

EDIT:

I've been playing around with the code for the past couple of days, and something strange happens to my loss when I load the model (after training a bit with the new regularizer).

The first time I run this code (the first time with the new regularizer):

from keras import regularizers
from keras.models import Model, load_model
from keras.optimizers import SGD

base_model = load_model(path)
x = base_model.get_layer('dense_1').output
predictions = base_model.get_layer('dense_2')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.get_layer('dense_1').kernel_regularizer = regularizers.l2(0.02)

model.compile(optimizer=SGD(lr=.0001, momentum=0.90),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

My training output seems to be normal:

Epoch 43/50
 - 2918s - loss: 0.3834 - acc: 0.8861 - val_loss: 0.4253 - val_acc: 0.8723
Epoch 44/50
Epoch 00044: saving model to E:\Keras Models\testing_3\2018-01-18_44.hdf5
 - 2692s - loss: 0.3781 - acc: 0.8869 - val_loss: 0.4217 - val_acc: 0.8729
Epoch 45/50
 - 2690s - loss: 0.3724 - acc: 0.8884 - val_loss: 0.4169 - val_acc: 0.8748
Epoch 46/50
Epoch 00046: saving model to E:\Keras Models\testing_3\2018-01-18_46.hdf5
 - 2684s - loss: 0.3688 - acc: 0.8896 - val_loss: 0.4137 - val_acc: 0.8748
Epoch 47/50
 - 2665s - loss: 0.3626 - acc: 0.8908 - val_loss: 0.4097 - val_acc: 0.8763
Epoch 48/50
Epoch 00048: saving model to E:\Keras Models\testing_3\2018-01-18_48.hdf5
 - 2681s - loss: 0.3586 - acc: 0.8924 - val_loss: 0.4069 - val_acc: 0.8767
Epoch 49/50
 - 2679s - loss: 0.3549 - acc: 0.8930 - val_loss: 0.4031 - val_acc: 0.8776
Epoch 50/50
Epoch 00050: saving model to E:\Keras Models\testing_3\2018-01-18_50.hdf5
 - 2680s - loss: 0.3493 - acc: 0.8950 - val_loss: 0.4004 - val_acc: 0.8787

However, if I load the model after this mini-training session (I load the model from epoch 00050, so the new regularizer should already be in effect), I get a really high loss value.

Code:

from keras.models import load_model
from keras.optimizers import SGD

path = r'E:\Keras Models\testing_3\2018-01-18_50.hdf5'  # 50th epoch model

model = load_model(path)
model.compile(optimizer=SGD(lr=.0001, momentum=0.90),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Output:

Epoch 51/65
 - 3130s - loss: 14.0017 - acc: 0.8953 - val_loss: 13.9529 - val_acc: 0.8800
Epoch 52/65
Epoch 00052: saving model to E:\Keras Models\testing_3\2018-01-20_52.hdf5
 - 2813s - loss: 13.8017 - acc: 0.8969 - val_loss: 13.7553 - val_acc: 0.8812
Epoch 53/65
 - 2759s - loss: 13.6070 - acc: 0.8977 - val_loss: 13.5609 - val_acc: 0.8824
Epoch 54/65
Epoch 00054: saving model to E:\Keras Models\testing_3\2018-01-20_54.hdf5
 - 2748s - loss: 13.4115 - acc: 0.8992 - val_loss: 13.3697 - val_acc: 0.8824
Epoch 55/65
 - 2745s - loss: 13.2217 - acc: 0.9006 - val_loss: 13.1807 - val_acc: 0.8840
Epoch 56/65
Epoch 00056: saving model to E:\Keras Models\testing_3\2018-01-20_56.hdf5
 - 2752s - loss: 13.0335 - acc: 0.9014 - val_loss: 12.9951 - val_acc: 0.8840
Epoch 57/65
 - 2756s - loss: 12.8490 - acc: 0.9023 - val_loss: 12.8118 - val_acc: 0.8849
Epoch 58/65
Epoch 00058: saving model to E:\Keras Models\testing_3\2018-01-20_58.hdf5
 - 2749s - loss: 12.6671 - acc: 0.9032 - val_loss: 12.6308 - val_acc: 0.8849
Epoch 59/65
 - 2738s - loss: 12.4871 - acc: 0.9039 - val_loss: 12.4537 - val_acc: 0.8855
Epoch 60/65
Epoch 00060: saving model to E:\Keras Models\testing_3\2018-01-20_60.hdf5
 - 2765s - loss: 12.3086 - acc: 0.9059 - val_loss: 12.2778 - val_acc: 0.8868
Epoch 61/65
 - 2767s - loss: 12.1353 - acc: 0.9065 - val_loss: 12.1055 - val_acc: 0.8867
Epoch 62/65
Epoch 00062: saving model to E:\Keras Models\testing_3\2018-01-20_62.hdf5
 - 2757s - loss: 11.9637 - acc: 0.9061 - val_loss: 11.9351 - val_acc: 0.8883

Notice the really high loss values. Is this normal? I understand that the L2 regularizer would bring the loss up (if there are large weights), but wouldn't that already be reflected in the first mini-training session, where I first added the regularizer? The accuracy seems to stay consistent, though.
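
To make the loss arithmetic concrete: an L2 kernel regularizer adds factor * sum(w^2) to the reported loss, while accuracy is computed from the predictions alone, so the two can move independently. Below is a minimal pure-Python sketch of that decomposition. The weights, targets, and predictions are made-up illustrative values, not numbers from the model above, and the helper names are hypothetical.

```python
import math

def l2_penalty(weights, factor):
    """L2 penalty in the Keras convention: factor * sum of squared weights."""
    return factor * sum(w * w for row in weights for w in row)

def categorical_crossentropy(y_true, y_pred):
    """Mean cross-entropy over samples for one-hot targets."""
    per_sample = [
        -sum(t * math.log(p) for t, p in zip(ts, ps))
        for ts, ps in zip(y_true, y_pred)
    ]
    return sum(per_sample) / len(per_sample)

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted argmax matches the one-hot target."""
    hits = sum(
        ts.index(max(ts)) == ps.index(max(ps))
        for ts, ps in zip(y_true, y_pred)
    )
    return hits / len(y_true)

# Hypothetical toy data: 3 samples, 2 classes.
y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]

# Hypothetical 2x3 kernel with fairly large entries.
kernel = [[3.0, -4.0, 5.0], [-6.0, 2.0, 1.0]]

data_loss = categorical_crossentropy(y_true, y_pred)
penalty = l2_penalty(kernel, factor=0.02)
total_loss = data_loss + penalty  # what a training log would report as "loss"

print(f"data loss:  {data_loss:.4f}")
print(f"penalty:    {penalty:.4f}")
print(f"total loss: {total_loss:.4f}")
print(f"accuracy:   {accuracy(y_true, y_pred):.4f}")
```

The penalty depends only on the weights, so it raises the reported loss without changing which class each prediction picks.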

Thank you.

507

asked Jan 18 '18 20:01

Moondra

2 Answers

For TensorFlow 2.x, you can do this:

import tensorflow as tf

l2 = tf.keras.regularizers.l2(1e-4)
for layer in model.layers:
    # To cover every layer that has a kernel, use instead:
    #     if hasattr(layer, 'kernel'):
    # Here the penalty is applied to Conv2D layers only.
    if isinstance(layer, tf.keras.layers.Conv2D):
        model.add_loss(lambda layer=layer: l2(layer.kernel))
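
The `layer=layer` default argument in the lambda matters: Python closures look up loop variables when they are called, not when they are defined, so without it every deferred loss would refer to the last layer of the loop. Here is a small pure-Python illustration of the idiom. The layer names are placeholders rather than real Keras objects, and the helper functions are hypothetical.

```python
def make_late_bound(names):
    """Each lambda reads `name` when called, i.e. after the loop has ended."""
    callbacks = []
    for name in names:
        callbacks.append(lambda: name)
    return callbacks

def make_early_bound(names):
    """The default argument captures the current value at definition time."""
    callbacks = []
    for name in names:
        callbacks.append(lambda name=name: name)
    return callbacks

# Placeholder names standing in for layers.
layer_names = ["conv_a", "conv_b", "conv_c"]

print([f() for f in make_late_bound(layer_names)])   # ['conv_c', 'conv_c', 'conv_c']
print([f() for f in make_early_bound(layer_names)])  # ['conv_a', 'conv_b', 'conv_c']
```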

Hope this helps.

166

answered Sep 22 '22 02:09

Emilien Garreau

Try this:

import tensorflow as tf

# A utility function to add weight decay after the model is defined.
def add_weight_decay(model, weight_decay):
    if (weight_decay is None) or (weight_decay == 0.0):
        return

    # Recurse into nested models; add one loss per trainable weight on leaf layers.
    def add_decay_loss(m, factor):
        if isinstance(m, tf.keras.Model):
            for layer in m.layers:
                add_decay_loss(layer, factor)
        else:
            for param in m.trainable_weights:
                with tf.keras.backend.name_scope('weight_regularizer'):
                    # Bind `param` as a default argument so each lambda keeps
                    # its own weight instead of the last one in the loop.
                    regularizer = lambda param=param: tf.keras.regularizers.l2(factor)(param)
                    m.add_loss(regularizer)

    # Weight decay and L2 regularization differ by a factor of 2.
    add_decay_loss(model, weight_decay / 2.0)
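
The factor of 2 in the last comment comes from the gradient: an L2 penalty `factor * sum(w**2)` has gradient `2 * factor * w`, so passing `weight_decay / 2` yields a gradient of `weight_decay * w`. This can be checked numerically in pure Python. The weights and decay coefficient below are arbitrary illustrative choices, and the helper names are hypothetical.

```python
def l2_penalty(weights, factor):
    """L2 penalty in the Keras convention: factor * sum of squares."""
    return factor * sum(w * w for w in weights)

def numerical_gradient(fn, weights, eps=1e-6):
    """Central-difference gradient of fn with respect to each weight."""
    grads = []
    for i in range(len(weights)):
        plus = list(weights)
        minus = list(weights)
        plus[i] += eps
        minus[i] -= eps
        grads.append((fn(plus) - fn(minus)) / (2 * eps))
    return grads

weights = [0.5, -1.5, 2.0]  # arbitrary example weights
weight_decay = 1e-2         # arbitrary example decay coefficient

# Gradient of the penalty built with factor = weight_decay / 2 ...
grads = numerical_gradient(
    lambda w: l2_penalty(w, weight_decay / 2.0), weights
)
# ... compared with the plain weight decay term weight_decay * w.
expected = [weight_decay * w for w in weights]

for g, e in zip(grads, expected):
    print(f"numerical: {g:+.6f}  weight_decay * w: {e:+.6f}")
```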

answered Sep 18 '22 02:09

mathmanu
