
How does Keras back propagate custom loss function?

I've studied plenty of examples of custom loss functions for Keras. All of them can be summarized as "just write it": you write a function taking the parameters (y_true, y_pred). But normally a CNN needs the derivative of the loss function for back propagation. For instance, if you implement a custom loss in Caffe, you have to write two functions: the loss itself, and its derivative for the backward pass. In Keras, it seems there is no need for the second one. How does this magic work?

asked Jan 04 '18 by Ilya Ovodov

People also ask

How does Keras differentiate a custom loss function?

A custom loss function can be created by defining a function that takes the true values and predicted values as required parameters. The function should return an array of losses. The function can then be passed at the compile stage.
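
For example, a minimal sketch of such a function, written here as a hand-rolled mean absolute error using TensorFlow ops (the name custom_mae is just for illustration):

    import tensorflow as tf

    # A hand-written mean absolute error. It takes (y_true, y_pred) and
    # returns one loss value per sample, as described above.
    def custom_mae(y_true, y_pred):
        # Reduce over the last axis so the result is an array of
        # per-sample losses rather than a single scalar.
        return tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)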

How do you implement a custom loss function in Keras?

We can create a custom loss function in Keras by writing a function that returns a scalar and takes two arguments: the true values and the predicted values. We then pass the custom loss function to model.compile as a parameter, just like any other loss function.
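
A short sketch of that compile step; the toy architecture below is an arbitrary example, not something Keras prescribes:

    import tensorflow as tf
    from tensorflow import keras

    def custom_mae(y_true, y_pred):
        return tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)

    # A toy model, only to show where the custom loss plugs in.
    model = keras.Sequential([
        keras.layers.Dense(32, activation="relu", input_shape=(10,)),
        keras.layers.Dense(1),
    ])

    # Pass the function object itself, just like a built-in loss.
    model.compile(optimizer="adam", loss=custom_mae)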

What is cross entropy loss function?

Cross entropy loss is a metric used to measure how well a classification model in machine learning performs. The loss (or error) is a non-negative number, with 0 corresponding to a perfect model. The goal is generally to get your model's loss as close to 0 as possible.
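
As a small illustration, here is categorical cross entropy computed by hand with NumPy on invented probabilities:

    import numpy as np

    # Categorical cross entropy computed by hand:
    # L = -sum_i y_true[i] * log(y_pred[i]), averaged over samples.
    y_true = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])    # one-hot labels
    y_pred = np.array([[0.9, 0.05, 0.05],
                       [0.2, 0.7, 0.1]])    # predicted probabilities

    loss = -np.mean(np.sum(y_true * np.log(y_pred), axis=-1))
    print(loss)  # about 0.23 -- low, since both predictions are confident and correct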

How do I create a custom loss function in Tensorflow?

To do this, first create an array of sample data and compute the target value (for example, the mean squared error) with NumPy. Next, define the same loss as a TensorFlow function, then use tf.keras.Sequential() to build a small model with a Dense layer and an input shape, passing the custom loss at compile time.
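
A hedged sketch of that recipe, checking a hand-written TensorFlow MSE against its NumPy counterpart (the sample values are invented for illustration):

    import numpy as np
    import tensorflow as tf

    # Sample data (values invented for illustration).
    y_true = np.array([1.0, 2.0, 3.0])
    y_pred = np.array([1.5, 1.5, 3.5])

    # Mean squared error via NumPy...
    numpy_mse = np.mean((y_true - y_pred) ** 2)

    # ...and the same loss written as a custom TensorFlow function.
    def custom_mse(y_true, y_pred):
        return tf.reduce_mean(tf.square(y_true - y_pred))

    tf_mse = custom_mse(tf.constant(y_true), tf.constant(y_pred))
    print(numpy_mse, float(tf_mse))  # both 0.25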


1 Answer

The magic is called automatic differentiation (AD). Keras is built on top of symbolic computational frameworks, namely Theano, TensorFlow, and CNTK. These frameworks let you define the loss as a symbolic expression, which can easily be differentiated at runtime, because the whole representation is symbolic.
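
To make this concrete, here is a minimal sketch using tf.GradientTape, the eager-mode form of the same idea in modern TensorFlow: only the forward loss expression is written by hand, and the framework derives the gradient.

    import tensorflow as tf

    y_true = tf.constant([1.0, 2.0, 3.0])
    y_pred = tf.Variable([1.5, 1.5, 3.5])

    with tf.GradientTape() as tape:
        # Only the forward loss expression is written by hand.
        loss = tf.reduce_mean(tf.square(y_true - y_pred))

    # No hand-coded backward pass: AD derives the gradient from the
    # recorded operations.
    grad = tape.gradient(loss, y_pred)
    print(grad.numpy())  # [ 0.333..., -0.333..., 0.333...]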

In contrast, Caffe is built in C++ and does not use any symbolic representation framework; as you mention, you need to specify both the loss function and its gradient analytically in code.
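
For contrast, a hypothetical Caffe-style pairing, sketched here in Python rather than C++: without automatic differentiation you must supply both the loss and its analytic gradient, which for MSE is dL/dy_pred = 2 * (y_pred - y_true) / N.

    import numpy as np

    # With no autodiff, both functions must be written by hand.
    def mse_forward(y_true, y_pred):
        return np.mean((y_pred - y_true) ** 2)

    def mse_backward(y_true, y_pred):
        # Analytic derivative: dL/dy_pred = 2 * (y_pred - y_true) / N.
        return 2.0 * (y_pred - y_true) / y_pred.size

    y_true = np.array([1.0, 2.0, 3.0])
    y_pred = np.array([1.5, 1.5, 3.5])
    print(mse_forward(y_true, y_pred))   # 0.25
    print(mse_backward(y_true, y_pred))  # matches the autodiff gradient above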

answered Sep 21 '22 by Dr. Snoopy