I want to implement a two-step learning process where:
1) the model is first trained with a loss function loss_1,
2) the loss function is then changed to loss_2 and the training continues for fine-tuning.

Currently, my approach is:
model.compile(optimizer=opt, loss=loss_1, metrics=['accuracy'])
model.fit_generator(…)
model.compile(optimizer=opt, loss=loss_2, metrics=['accuracy'])
model.fit_generator(…)
Note that the optimizer remains the same, and only the loss function changes. I'd like to smoothly continue training, but with a different loss function. According to this post, re-compiling the model loses the optimizer state. Questions:
a) Will I lose the optimizer state even if I use the same optimizer, e.g. Adam?
b) If the answer to a) is yes, any suggestions on how to change the loss function to a new one without resetting the optimizer state?
EDIT:
As suggested by Simon Caby and based on this thread, I created a custom loss function with two loss computations that depend on the epoch number. However, it does not work for me. My approach:
from keras import backend as K

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        c_epoch = K.get_value(current_epoch)
        if c_epoch < t_change:
            ...  # compute loss_1
        else:
            ...  # compute loss_2
    return custom_loss
And I compile as follows, after initializing current_epoch:
current_epoch = K.variable(0.)
model.compile(optimizer=opt, loss=loss_wrapper(5, current_epoch), metrics=...)
To update current_epoch, I create the following callback:
from keras.callbacks import Callback

class NewCallback(Callback):
    def __init__(self, current_epoch):
        self.current_epoch = current_epoch

    def on_epoch_end(self, epoch, logs={}):
        K.set_value(self.current_epoch, epoch)
model.fit_generator(..., callbacks=[NewCallback(current_epoch)])
The callback updates self.current_epoch correctly at every epoch, but the update never reaches the custom loss function: current_epoch keeps its initialization value, and loss_2 is never executed.
Any suggestion is welcome, thanks!
My answers: a) Yes, and you should probably make your own learning rate scheduler in order to keep control of it:
keras.callbacks.LearningRateScheduler(schedule, verbose=0)
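For instance, a minimal sketch of how such a scheduler could be wired in; the rates and the 5-epoch switch point below are illustrative placeholders, not values from the question:

from keras.callbacks import LearningRateScheduler

# Illustrative schedule: keep a higher rate for the first phase, then drop it.
def schedule(epoch):
    return 1e-3 if epoch < 5 else 1e-4

lr_callback = LearningRateScheduler(schedule, verbose=1)
# model.fit_generator(..., callbacks=[lr_callback])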
b) Yes, you can create your own loss function, including one that fluctuates between two different loss methods. See "Advanced Keras — Constructing Complex Custom Losses and Metrics": https://towardsdatascience.com/advanced-keras-constructing-complex-custom-losses-and-metrics-c07ca130a618
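As a sketch of that idea, the branch can be expressed with K.switch so that both loss computations are part of the graph and the choice follows the current value of the current_epoch variable (loss_1_fn and loss_2_fn are hypothetical placeholders for the two computations):

import keras.backend as K

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        # Both branches are built into the graph; the condition is
        # re-evaluated from the variable's value at run time, unlike a
        # Python if on K.get_value, which is decided once at compile time.
        return K.switch(K.less(current_epoch, t_change),
                        loss_1_fn(y_true, y_pred),   # hypothetical loss_1
                        loss_2_fn(y_true, y_pred))   # hypothetical loss_2
    return custom_loss

Compiled as before with loss=loss_wrapper(5, current_epoch) and trained with the NewCallback above, the loss can then change once current_epoch passes t_change.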