I am using Keras with a custom loss function like the one below:
def custom_fn(y_true, y_pred):
    # changing y_true, y_pred values systematically
    return mean_absolute_percentage_error(y_true, y_pred)
Then I call model.compile(loss=custom_fn) and model.fit(X, y,..validation_data=(X_val, y_val)..). Keras then saves loss and val_loss in the model history. As a sanity check, when the model finishes training, I call model.predict(X_val) so that I can calculate the validation loss manually with my custom_fn using the trained model.
I save the model from the best epoch using this callback:
callbacks.append(ModelCheckpoint(path, save_best_only=True, monitor='val_loss', mode='min'))
so the manually calculated validation loss should match Keras' val_loss value for the best epoch. But it does not.
As another attempt to figure this out, I also tried this:
model.compile(loss=custom_fn, metrics=[custom_fn])
To my surprise, val_loss and val_custom_fn do not match (nor do loss and loss_custom_fn, for that matter).
This is really strange: my custom_fn is essentially Keras' built-in mape with y_true and y_pred slightly manipulated. What is going on here?
PS: the layers I am using are LSTM layers and a final Dense layer, but I think this information is not relevant to the problem. I am also using regularisation as a hyperparameter, but not dropout.
Even removing custom_fn and using Keras' built-in mape as both the loss function and a metric, like so:
model.compile(loss='mape', metrics=['mape'])
and, for simplicity, removing the ModelCheckpoint callback has the same effect: val_loss and val_mape differ in every epoch. This is extremely strange to me. Either I am missing something or there is a bug in the Keras code; the former is more likely.
The loss function is used to optimize your model. This is the function that will get minimized by the optimizer. A metric is used to judge the performance of your model. This is only for you to look at and has nothing to do with the optimization process.
Using the compile method: Keras losses can be specified for a deep learning model through the compile method of keras.Model, which is also where the metrics are specified.
A metric is a function that is used to judge the performance of your model. Metric functions are similar to loss functions, except that the results from evaluating a metric are not used when training the model. Note that you may use any loss function as a metric.
Loss: A scalar value that we attempt to minimize during training of the model. The lower the loss, the closer our predictions are to the true labels. Common choices are Mean Squared Error (MSE) or, often in Keras, Categorical Cross Entropy.
This blog post suggests that Keras adds any regularisation used in training to the validation loss as well. The metric of choice, on the other hand, is calculated without any regularisation term. This is why the mismatch occurs with any loss function, as stated in the question.
I could not find any Keras documentation on this. However, it seems to hold up: when I remove all regularisation hyperparameters, val_loss and val_custom_fn match exactly in each epoch.
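To make the arithmetic concrete, here is a minimal pure-Python sketch of the explanation above. It is not Keras' actual implementation. It assumes an L2 weight penalty, and the weights, the factor 0.01 and the validation data are all made up for illustration.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def l2_penalty(weights, factor):
    """L2 regularisation term: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)


# Hypothetical validation targets and predictions.
y_val = [10.0, 20.0, 40.0]
y_hat = [11.0, 18.0, 40.0]

# Hypothetical regularised weights and regularisation strength.
weights = [0.5, -1.0, 2.0]
l2_factor = 0.01

# The metric sees only the predictions.
val_metric = mape(y_val, y_hat)

# The reported loss is the same function plus the weight penalty.
val_loss = val_metric + l2_penalty(weights, l2_factor)

print(f"metric: {val_metric:.4f}  loss: {val_loss:.4f}")
```

With the penalty set to zero the two numbers coincide, which would be consistent with val_loss and val_custom_fn matching once the regularisation is removed.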
An easy workaround is to use custom_fn as a metric and save the best model based on that metric (val_custom_fn) rather than on val_loss. Alternatively, loop through the epochs manually and calculate the correct val_loss yourself after training each epoch. The latter seems to make more sense, since there is no reason to include custom_fn both as a metric and as a loss function.
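As a rough illustration of why the monitored quantity matters, the sketch below picks the best epoch from a dict shaped like the one Keras keeps in the model history. The helper best_epoch and all the numbers are hypothetical; they only show that the epoch with the lowest val_loss need not be the one with the lowest val_custom_fn when the gap between them (the penalty) changes from epoch to epoch.

```python
def best_epoch(history, name):
    """Return the 0-based index of the epoch with the smallest value of `name`."""
    values = history[name]
    return min(range(len(values)), key=values.__getitem__)


# Made-up per-epoch values, shaped like the history dict.
history = {
    "val_loss": [12.9, 11.1, 11.3, 11.6],       # includes the penalty
    "val_custom_fn": [12.1, 10.8, 10.5, 10.9],  # penalty-free metric
}

by_loss = best_epoch(history, "val_loss")
by_metric = best_epoch(history, "val_custom_fn")

print("best epoch by val_loss:", by_loss)
print("best epoch by val_custom_fn:", by_metric)
```

In this made-up run the two criteria disagree, so a checkpoint monitored on val_loss would keep a different epoch than one monitored on val_custom_fn.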
If anyone can find any evidence of this in the Keras documentation that would be helpful.