How does Keras (or any other ML framework) calculate the gradient of a lambda function layer for backpropagation?

Keras enables adding a layer which calculates a user-defined lambda function. What I don't get is how Keras knows to calculate the gradient of this user-defined function for backpropagation.
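For example, something like this (a minimal sketch; the squaring expression is just a placeholder for any user-defined function):

from keras.layers import Lambda

# A layer that applies an arbitrary user-defined expression;
# nowhere do we tell Keras what the derivative of x ** 2 is.
square_layer = Lambda(lambda x: x ** 2)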

asked Dec 26 '16 by Isaac Dorfman

People also ask

How are gradients computed in TensorFlow?

Gradient tapes: TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse-mode differentiation.
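A minimal sketch of that mechanism (plain tf.GradientTape usage; the function being differentiated is just an arbitrary example):

import tensorflow as tf

x = tf.Variable(3.0)

# Operations run inside the tape's context are recorded...
with tf.GradientTape() as tape:
    y = x ** 2 + tf.sin(x)

# ...and the tape is replayed in reverse to get dy/dx = 2*x + cos(x).
grad = tape.gradient(y, x)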

What does the Lambda layer do in Keras?

Lambda class. Wraps arbitrary expressions as a Layer object. The Lambda layer exists so that arbitrary expressions can be used as a Layer when constructing Sequential and Functional API models. Lambda layers are best suited for simple operations or quick experimentation.

What is Lambda in a CNN?

The Lambda layer applies a user-defined function to its input: placed inside a neural network, it transforms the data flowing through it with whatever expression or function it wraps.


1 Answer

That is one of the benefits of using Theano/TensorFlow and the libraries built on top of them: they give you automatic gradient calculation (automatic differentiation) for mathematical functions and operations.

Keras gets the gradients by calling:

# keras/theano_backend.py
def gradients(loss, variables):
    return T.grad(loss, variables)

# keras/tensorflow_backend.py
def gradients(loss, variables):
    '''Returns the gradients of `variables` (list of tensor variables)
    with regard to `loss`.
    '''
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)

which are in turn called by the optimizers (keras/optimizers.py) via grads = self.get_gradients(loss, params) to get the gradients used to write the update rule for all the params. The params here are the trainable weights of the layers. Layers created with Lambda don't have any trainable weights of their own, but they still affect the loss function through the forward pass, and hence indirectly affect the gradients computed for the trainable weights of the other layers.
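To see this concretely, here is a small sketch in the same Keras 1.x-era symbolic style as the backend code above (the toy model is made up for illustration; in eager TF 2.x you would use tf.GradientTape instead):

from keras.models import Sequential
from keras.layers import Dense, Lambda
import keras.backend as K

model = Sequential([
    Dense(4, input_shape=(3,)),
    Lambda(lambda x: x ** 2),  # contributes no trainable weights
    Dense(1),
])

# model.trainable_weights contains only the Dense kernels/biases, yet
# their gradients still depend on the Lambda layer, because autodiff
# differentiates straight through the squaring expression.
loss = K.mean(K.square(model.output))
grads = K.gradients(loss, model.trainable_weights)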

The only time you need to write a new gradient calculation is when you are defining a new basic mathematical operation/function. Also, when you write a custom loss function, autodiff almost always takes care of the gradient calculation. Optionally, you can speed up training (though not always) by implementing the analytical gradient of your custom function. For example, the softmax function can be expressed in terms of exp, sum and div, and autodiff can take care of it, but its analytical/symbolic gradient is usually implemented in Theano/TensorFlow anyway.
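To illustrate (a sketch using tf.GradientTape; the hand-rolled softmax is exactly the exp/sum/div composition mentioned above):

import tensorflow as tf

logits = tf.Variable([1.0, 2.0, 3.0])

with tf.GradientTape() as tape:
    # softmax written only with primitive ops: exp, sum, div
    e = tf.exp(logits)
    probs = e / tf.reduce_sum(e)
    loss = -tf.math.log(probs[2])  # toy cross-entropy-style loss

# Autodiff chains through exp/sum/div with no hand-written softmax
# derivative; the built-in tf.nn.softmax ships a symbolic gradient.
grads = tape.gradient(loss, logits)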

For implementing new ops, see:

http://deeplearning.net/software/theano/extending/extending_theano.html
https://www.tensorflow.org/versions/r0.12/how_tos/adding_an_op/index.html
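In more recent TensorFlow versions there is also a lightweight route: tf.custom_gradient lets you attach a hand-written backward rule to a Python function without registering a full op (a sketch; the gradient-clipping identity below is only an example):

import tensorflow as tf

@tf.custom_gradient
def clip_grad_identity(x):
    def grad(upstream):
        # Hand-written backward rule: clip the incoming gradient.
        return tf.clip_by_value(upstream, -1.0, 1.0)
    return tf.identity(x), grad

x = tf.Variable([5.0])
with tf.GradientTape() as tape:
    y = clip_grad_identity(x * 3.0)
grad = tape.gradient(y, x)  # uses the custom rule instead of autodiff's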

answered Sep 21 '22 by indraforyou