
Can someone explain to me the difference between activation and recurrent activation arguments passed in initialising keras lstm layer?


According to my understanding, an LSTM has 4 layers. Please explain what the default activation functions of each layer are if I do not pass any activation argument to the LSTM constructor.

asked Jul 06 '17 by Mayank Uniyal

People also ask

What is activation and recurrent activation?

activation: Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e., "linear" activation: a(x) = x). recurrent_activation: Activation function to use for the recurrent step.
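As a quick illustration (a NumPy sketch, not Keras code), the two default functions behave as follows; `hard_sigmoid` here follows the piecewise-linear definition used as the `recurrent_activation` default in older Keras versions:

```python
import numpy as np

def tanh(x):
    # Default for `activation`: squashes values into (-1, 1)
    return np.tanh(x)

def hard_sigmoid(x):
    # Default for `recurrent_activation` (older Keras): a cheap,
    # piecewise-linear approximation of sigmoid, squashing into [0, 1]
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

x = np.array([-10.0, 0.0, 10.0])
print(tanh(x))          # approximately [-1, 0, 1]
print(hard_sigmoid(x))  # [0.0, 0.5, 1.0]
```

The gate functions output values in [0, 1] so they can act as soft switches, while tanh keeps the state values centered around zero.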

What is the best activation function for LSTM?

Recurrent networks still commonly use Tanh or sigmoid activation functions, or even both. For example, the LSTM commonly uses the Sigmoid activation for recurrent connections and the Tanh activation for output.

How many dense layers does LSTM have?

The vanilla LSTM network has three layers: an input layer, a single hidden layer, and a standard feedforward output layer.

What is output of LSTM layer?

An LSTM cell in Keras gives you three outputs:

an output state o_t (1st output)

a hidden state h_t (2nd output)

a cell state c_t (3rd output)


2 Answers

In the Keras source code, around line 1932 of keras/layers/recurrent.py, the LSTM cell step computes:

i = self.recurrent_activation(z0)        # input gate
f = self.recurrent_activation(z1)        # forget gate
c = f * c_tm1 + i * self.activation(z2)  # new cell state from candidate
o = self.recurrent_activation(z3)        # output gate
h = o * self.activation(c)               # hidden state / output

recurrent_activation is used to activate the input, forget, and output gates.

activation is used for the candidate cell state and the hidden state.
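To make the split concrete, here is a minimal NumPy sketch of a single LSTM step mirroring the code quoted above. The gate pre-activations z0..z3 are taken as given (in Keras they come from the input and recurrent weight matrices), and hard_sigmoid stands in for the older Keras default:

```python
import numpy as np

def hard_sigmoid(x):
    # Older Keras default for recurrent_activation
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def lstm_step(z0, z1, z2, z3, c_tm1,
              activation=np.tanh, recurrent_activation=hard_sigmoid):
    """One LSTM step (a sketch): recurrent_activation drives the three
    gates; activation squashes the candidate and the cell state."""
    i = recurrent_activation(z0)          # input gate, in [0, 1]
    f = recurrent_activation(z1)          # forget gate, in [0, 1]
    c = f * c_tm1 + i * activation(z2)    # new cell state
    o = recurrent_activation(z3)          # output gate, in [0, 1]
    h = o * activation(c)                 # hidden state, in (-1, 1)
    return h, c

rng = np.random.default_rng(0)
z = rng.standard_normal((4, 3))           # hypothetical pre-activations
h, c = lstm_step(z[0], z[1], z[2], z[3], c_tm1=np.zeros(3))
```

Swapping in different functions for the two keyword arguments corresponds exactly to passing activation and recurrent_activation to the Keras constructor.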

answered Oct 07 '22 by peikuo


An LSTM unit has 3 gates called the input, forget, and output gates, in addition to a candidate hidden state (g) and a cell state (c).

The call method in the LSTMCell class contains the implementation where these activations are applied (https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1892).

The recurrent_activation argument applies to the input, forget, and output gates. The default value for this argument is a hard-sigmoid function. The activation argument applies to the candidate hidden state and output hidden state. The default value for this argument is a hyperbolic tangent function.
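For completeness, a usage sketch passing both arguments explicitly when constructing the layer. Note that newer Keras/TensorFlow 2.x versions changed the recurrent_activation default from hard_sigmoid to plain sigmoid, so spelling the values out avoids version-dependent behavior:

```python
import tensorflow as tf

# Construct an LSTM layer with both activations spelled out explicitly
# (these values match the defaults described in this answer).
layer = tf.keras.layers.LSTM(
    units=4,
    activation="tanh",                    # candidate / cell-state squashing
    recurrent_activation="hard_sigmoid",  # input, forget, and output gates
)

config = layer.get_config()
print(config["activation"], config["recurrent_activation"])
```

get_config() echoes back the activation names, which is a convenient way to check what a layer was actually built with.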

answered Oct 07 '22 by Sam - Founder of AceAINow.com