I want to implement a two-step learning process where:
1) the model is first trained with a loss function loss_1,
2) the loss function is then changed to loss_2 and the training continues for fine-tuning.

Currently, my approach is:
model.compile(optimizer=opt, loss=loss_1, metrics=['accuracy'])
model.fit_generator(…)
model.compile(optimizer=opt, loss=loss_2, metrics=['accuracy'])
model.fit_generator(…)
Note that the optimizer remains the same, and only the loss function changes. I'd like to smoothly continue training, but with a different loss function. According to this post, re-compiling the model loses the optimizer state. Questions:
a) Will I lose the optimizer state even if I use the same optimizer, e.g. Adam?
b) If the answer to a) is yes, any suggestions on how to change the loss function to a new one without resetting the optimizer state?
EDIT:
As suggested by Simon Caby and based on this thread, I created a custom loss function with two loss computations that depend on the epoch number. However, it does not work for me. My approach:
from keras import backend as K

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        c_epoch = K.get_value(current_epoch)
        if c_epoch < t_change:
            ...  # compute loss_1
        else:
            ...  # compute loss_2
    return custom_loss
And I compile as follows, after initializing current_epoch:
current_epoch = K.variable(0.)
model.compile(optimizer=opt, loss=loss_wrapper(5, current_epoch), metrics=...)
To update current_epoch, I create the following callback:
from keras.callbacks import Callback

class NewCallback(Callback):
    def __init__(self, current_epoch):
        self.current_epoch = current_epoch

    def on_epoch_end(self, epoch, logs={}):
        K.set_value(self.current_epoch, epoch)
model.fit_generator(..., callbacks=[NewCallback(current_epoch)])
The callback updates self.current_epoch correctly at every epoch, but the update never reaches the custom loss function: current_epoch keeps its initialization value, and loss_2 is never executed.
Any suggestion is welcome, thanks!
My answers: a) Yes, and you should probably make your own learning rate scheduler in order to keep control of it:
keras.callbacks.LearningRateScheduler(schedule, verbose=0)
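For instance, a minimal sketch of how such a scheduler could be wired in; the rates and the 5-epoch switch point below are illustrative placeholders, not values from the question:

from keras.callbacks import LearningRateScheduler

# Illustrative schedule: keep a higher rate for the first phase, then drop it.
def schedule(epoch):
    return 1e-3 if epoch < 5 else 1e-4

lr_callback = LearningRateScheduler(schedule, verbose=1)
# model.fit_generator(..., callbacks=[lr_callback])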
b) Yes, you can create your own loss function, including one that fluctuates between two different loss methods. See "Advanced Keras — Constructing Complex Custom Losses and Metrics": https://towardsdatascience.com/advanced-keras-constructing-complex-custom-losses-and-metrics-c07ca130a618
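As a sketch of that idea, the branch can be expressed with K.switch so that both loss computations are part of the graph and the choice follows the current value of the current_epoch variable (loss_1_fn and loss_2_fn are hypothetical placeholders for the two computations):

import keras.backend as K

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        # Both branches are built into the graph; the condition is
        # re-evaluated from the variable's value at run time, unlike a
        # Python if on K.get_value, which is decided once at compile time.
        return K.switch(K.less(current_epoch, t_change),
                        loss_1_fn(y_true, y_pred),   # hypothetical loss_1
                        loss_2_fn(y_true, y_pred))   # hypothetical loss_2
    return custom_loss

Compiled as before with loss=loss_wrapper(5, current_epoch) and trained with the NewCallback above, the loss can then change once current_epoch passes t_change.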