
Implementing a complicated activation function in keras

Tags:

keras

I just read an interesting paper: A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks.

I'd like to try to implement this activation function in Keras. I've implemented custom activations before, e.g. a sinusoidal activation:

import keras.backend as K
from keras.layers import Activation
from keras.utils.generic_utils import get_custom_objects

def sin(x):
  return K.sin(x)

get_custom_objects().update({'sin': Activation(sin)})

However, the activation function in this paper has 3 unique properties:

  1. It doubles the size of the input (the output is 2x the input)
  2. It's parameterized
  3. Its parameters should be regularized

I think once I have a skeleton for dealing with the above 3 issues, I can work out the math myself, but I'll take any help I can get!

asked Oct 30 '17 by Zach

People also ask

What are advanced activation functions in keras?

About "advanced activation" layersActivations that are more complex than a simple TensorFlow function (eg. learnable activations, which maintain a state) are available as Advanced Activation layers, and can be found in the module tf. keras. layers.

Which is the most popular activation function for deep neural networks?

ReLU (Rectified Linear Unit). ReLU is currently the most widely used activation function, since it appears in almost all convolutional neural networks and deep learning models.


3 Answers

Here, we will need one of these two:

  • A Lambda layer - If your parameters are not trainable (you don't want them to change with backpropagation)
  • A custom layer - If you need custom trainable parameters.

The Lambda layer:

If your parameters are not trainable, you can define your function for a lambda layer. The function takes one input tensor, and it can return anything you want:

import keras.backend as K

def customFunction(x):

    #x can be either a single tensor or a list of tensors
    #if a list, use the elements x[0], x[1], etc.

    #Perform your calculations here using the keras backend
    #If you could share which formula exactly you're trying to implement, 
        #it's possible to make this answer better and more to the point    

    #dummy example
    alphaReal = K.variable([someValue])    
    alphaImag = K.variable([anotherValue]) #or even an array of values   

    realPart = alphaReal * K.someFunction(x) + ... 
    imagPart = alphaImag * K.someFunction(x) + ....

    #You can return them as two outputs in a list (requires the functional API model)
    #Or you can find backend functions that join them together, such as K.stack

    return [realPart,imagPart]

    #I think the separate approach will give you a better control of what to do next. 

For what you can do, explore the backend functions.

For the parameters, you can define them as Keras constants or variables (K.constant or K.variable), either inside or outside the function above, or even turn them into model inputs. See details in this answer.
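
For example, a minimal sketch of defining such non-trainable parameters with the backend (the numbers are arbitrary placeholders, not values from the paper):

import keras.backend as K

alphaReal = K.constant(0.5)                    #fixed, never changes
alphaImag = K.variable(0.1, name='alphaImag')  #can be read/set manually, but is not trained by a Lambda layer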

In your model, you just add a lambda layer that uses that function.

  • In a Sequential model: model.add(Lambda(customFunction, output_shape=someShape))
  • In a functional API model: output = Lambda(customFunction, ...)(inputOrListOfInputs)

If you're going to pass more inputs to the function, you'll need the functional model API.
If you're using TensorFlow, the output_shape will be computed automatically; I believe only Theano requires it (not sure about CNTK).
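
For illustration, here is a minimal self-contained sketch of the Lambda-layer route; the math inside is a placeholder (not the formula from the paper), and the two halves are joined with K.concatenate so the output is twice the size of the input:

import keras.backend as K
from keras.layers import Input, Dense, Lambda
from keras.models import Model

def customFunction(x):
    #placeholder math -- replace with the paper's formula
    realPart = K.log(1.0 + K.exp(x))
    imagPart = K.sin(x)
    #joining the two halves doubles the last dimension (property 1 in the question)
    return K.concatenate([realPart, imagPart], axis=-1)

inp = Input(shape=(10,))
act = Lambda(customFunction, output_shape=(20,))(inp)
out = Dense(1)(act)
model = Model(inp, out)
model.summary()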

The custom layer:

A custom layer is a new class you create. This approach will only be necessary if you're going to have trainable parameters in your function. (Such as: optimize alpha with backpropagation)

Keras teaches it here.

Basically, you have an __init__ method where you pass the constant parameters, a build method where you create the trainable parameters (weights), a call method that will do the calculations (exactly what would go in the lambda layer if you didn't have trainable parameters), and a compute_output_shape method so you can tell the model what the output shape is.

class CustomLayer(Layer):

    def __init__(self, alphaReal, alphaImag, **kwargs):

        self.alphaReal = alphaReal
        self.alphaImag = alphaImag
        super(CustomLayer, self).__init__(**kwargs)

    def build(self,input_shape):

        #weights may or may not depend on the input shape
        #you may use it or not...

        #suppose we want just two trainable values:
        weightShape = (2,)

        #create the weights:
        self.kernel = self.add_weight(name='kernel', 
                                  shape=weightShape,
                                  initializer='uniform',
                                  trainable=True)

        super(CustomLayer, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self,x):

        #all the calculations go here:

        #dummy example using the constant inputs
        realPart = self.alphaReal * K.someFunction(x) + ... 
        imagPart = self.alphaImag * K.someFunction(x) + ....

        #dummy example taking elements of the trainable weights
        realPart = self.kernel[0] * realPart    
        imagPart = self.kernel[1] * imagPart

        #all the comments for the lambda layer above are valid here

        #example returning a list
        return [realPart,imagPart]

    def compute_output_shape(self,input_shape):

        #if you decide to return a list of tensors in the call method, 
        #return a list of shapes here, twice the input shape:
        return [input_shape,input_shape]    

        #if you stacked your results somehow into a single tensor, return a single
        #tuple instead, maybe with an additional dimension equal to 2:
        #return input_shape + (2,)
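
One piece of the question that is missing above is regularization of the parameters (property 3). add_weight accepts a regularizer argument, so a hedged sketch of that part could look like this (the L2 penalty and its strength are arbitrary choices, not from the paper):

from keras import regularizers
from keras.layers import Layer

class RegularizedAlphaLayer(Layer):
    #same skeleton as above, but the trainable values get an L2 penalty

    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel',
                                      shape=(2,),
                                      initializer='uniform',
                                      regularizer=regularizers.l2(0.01),  #arbitrary example penalty
                                      trainable=True)
        super(RegularizedAlphaLayer, self).build(input_shape)

    def call(self, x):
        #placeholder math -- not the paper's formula
        return self.kernel[0] * x + self.kernel[1]

    def compute_output_shape(self, input_shape):
        return input_shape
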
answered Sep 28 '22 by Daniel Möller

You need to implement a "Layer", not a common activation function.

I think the implementation of PReLU in Keras would be a good example for your task. See PReLU.
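
For reference, a small hedged usage sketch of that layer: PReLU already exposes a trainable, regularizable parameter (alpha_regularizer), which matches properties 2 and 3 from the question (the L2 strength here is an arbitrary example):

from keras import regularizers
from keras.layers import Dense, PReLU
from keras.models import Sequential

model = Sequential()
model.add(Dense(64, input_shape=(10,)))
#PReLU's alpha is trainable and can be regularized
model.add(PReLU(alpha_regularizer=regularizers.l2(0.01)))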

answered Sep 28 '22 by Moyan

A lambda function in the activation worked for me. Maybe not what you want but it's one step more complicated than the simple use of a built-in activation function.

encoder_outputs = Dense(units=latent_vector_len,
                        activation=k.layers.Lambda(lambda z: k.backend.round(k.layers.activations.sigmoid(x=z))),
                        kernel_initializer="lecun_normal")(x)

This code changes the output of a Dense layer from real values to 0/1 (i.e., binary).

Keras throws a warning, but the code still works.
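
If you'd rather not wrap a Lambda layer inside the activation argument, one alternative (following the get_custom_objects pattern from the question; the name binary_sigmoid is just an example) is to register a plain backend function and pass it by name. latent_vector_len and x are as in the snippet above:

import keras.backend as K
from keras.layers import Dense
from keras.utils.generic_utils import get_custom_objects

def binary_sigmoid(z):
    #round the sigmoid output to 0/1, as in the answer above
    return K.round(K.sigmoid(z))

get_custom_objects().update({'binary_sigmoid': binary_sigmoid})

encoder_outputs = Dense(units=latent_vector_len, activation='binary_sigmoid',
                        kernel_initializer="lecun_normal")(x)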

answered Sep 28 '22 by Geoffrey Anderson