
What is the purpose of the add_loss function in Keras?

I recently stumbled across variational autoencoders and tried to make them work on MNIST using Keras. I found a tutorial on GitHub.

My question concerns the following lines of code:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.add_loss(vae_loss)
vae.compile(optimizer='rmsprop')

Why is add_loss used instead of specifying the loss as a compile option? Something like vae.compile(optimizer='rmsprop', loss=vae_loss) does not seem to work and throws the following error:

ValueError: The model cannot be compiled because it has no loss to optimize. 

What is the difference between this function and a custom loss function that I can pass as an argument to Model.compile()?

Thanks in advance!

P.S.: I know there are several issues concerning this on GitHub, but most of them were open and uncommented. If this has been resolved already, please share the link!


Edit 1

I removed the line which adds the loss to the model and used the loss argument of the compile function. It looks like this now:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

This throws a TypeError:

TypeError: Using a 'tf.Tensor' as a Python 'bool' is not allowed. Use 'if t is not None:' instead of 'if t:' to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor. 

Edit 2

Thanks to @MarioZ's efforts, I was able to figure out a workaround for this.

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss in separate function
def vae_loss(x, x_decoded_mean):
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

...

vae.fit(x_train,
        x_train,        # <-- did not need this previously
        shuffle=True,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(x_test, x_test))    # <-- worked with (x_test, None) before

For some strange reason, I had to explicitly specify the targets (the second argument to fit and the second element of validation_data) while fitting the model. Originally, I didn't need to do this. The produced samples seem reasonable to me.

Although I could resolve this, I still don't know what the differences and disadvantages of these two methods are (other than needing a different syntax). Can someone give me more insight?

DocDriven asked Apr 27 '18



2 Answers

I'll try to answer the original question of why model.add_loss() is being used instead of specifying a custom loss function to model.compile(loss=...).

All loss functions in Keras take exactly two parameters, y_true and y_pred. Have a look at the definitions of the various standard loss functions available in Keras: they all have these two parameters. They are the 'targets' (the Y variable in many textbooks) and the actual output of the model. Most standard loss functions can be written as an expression of these two tensors, but some more complex losses cannot. Your VAE example is such a case, because the loss also depends on additional tensors, namely z_log_var and z_mean, which are not available to a standard loss function. Using model.add_loss() has no such restriction and allows you to write much more complex losses that depend on many other tensors, but it has the inconvenience of being more tightly coupled to the model, whereas the standard loss functions work with just any model.
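As a minimal sketch of the add_loss route (the layer name KLLayer and the tiny dense "encoder"/"decoder" are made up for illustration, not a real VAE): in current Keras versions, add_loss is typically called from inside a layer's call(), where extra tensors like z_mean and z_log_var are naturally in scope:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Hypothetical helper layer: computes the KL term from tensors that a
# standard loss(y_true, y_pred) could never see, and registers it via add_loss.
class KLLayer(keras.layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var),
                          axis=-1))
        self.add_loss(kl)
        return z_mean

inputs = keras.Input(shape=(8,))
z_mean = keras.layers.Dense(2)(inputs)
z_log_var = keras.layers.Dense(2)(inputs)
z = KLLayer()([z_mean, z_log_var])
decoded = keras.layers.Dense(8)(z)
model = keras.Model(inputs, decoded)

# No `loss=` argument: the model already carries a loss to optimize.
model.compile(optimizer='rmsprop')
model.fit(np.random.rand(16, 8), epochs=1, verbose=0)
```

Because the loss lives inside the model, fit() can be called without targets, which matches the tutorial's original behaviour of training with (x_test, None).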

(Note: The code proposed in other answers here is somewhat cheating, inasmuch as it just uses global variables to sneak in the additional required dependencies. This makes the loss function not a true function in the mathematical sense. I consider this much less clean code and I expect it to be more error-prone.)

jlh answered Sep 18 '22


jlh's answer is right, of course, but maybe it is useful to add:

model.add_loss() has no restrictions, but it also removes the convenience of, for example, passing targets to model.fit().

If you have a loss that depends on additional parameters of the model, of other models, or on external variables, you can still use a Keras-style encapsulated loss function by writing a wrapper function that takes all the additional parameters:

def loss_carrier(extra_param1, extra_param2):
    def loss(y_true, y_pred):
        # x = complicated math involving extra_param1, extra_param2, y_true, y_pred
        # Remember to use tensor operations, e.g. K.sum, K.square, K.mean.
        # Also remember: if extra_param1 and extra_param2 are variable tensors
        # rather than simple floats, they need to be defined as inputs in your
        # keras.Model instantiation, i.e. inputs=(main, extra_param1, extra_param2),
        # and declared as keras.Input (or tf.placeholder) with the right shape.
        return x
    return loss

model.compile(optimizer='adam', loss=loss_carrier(extra_param1, extra_param2))

The trick is that loss_carrier returns the inner function: Keras expects a loss with exactly the two parameters y_true and y_pred, so you call loss_carrier once with the extra parameters and hand the returned closure to compile().

This possibly looks more complicated than the model.add_loss version, but the loss stays modular.
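The closure mechanics can be seen without any Keras machinery. Here is a hypothetical stand-in (the name weighted_mse_carrier and the weight parameter are made up for illustration) in plain Python:

```python
# Hypothetical stand-in for the pattern above, using plain Python lists so
# the closure mechanics are visible without TensorFlow/Keras.
def weighted_mse_carrier(weight):
    # 'weight' plays the role of extra_param1: it is captured by the closure.
    def loss(y_true, y_pred):
        return weight * sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return loss

# Call the carrier once with the extra parameter; the result is a
# two-argument function, which is the shape Keras expects.
loss_fn = weighted_mse_carrier(2.0)
print(loss_fn([1.0, 2.0], [1.0, 4.0]))  # 2.0 * (0 + 4) / 2 = 4.0
```

The inner function keeps access to `weight` even after `weighted_mse_carrier` has returned, which is exactly how the extra parameters reach the loss at training time.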

Nric answered Sep 16 '22