According to the LSTM's mathematical formulation (the equations on the Wikipedia LSTM article), there should be only a hidden state h_t and a cell state c_t. However, when I write LSTM code in Keras, the layer returns three tensors: lstm_output, state_h and state_c.
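For reference, the standard LSTM equations (as given in the Wikipedia article) are:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```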
I am now wondering what is the mathematical formulation of lstm_output?
Here is my code:
from keras.layers import Input, LSTM
lstm_input = Input(shape=(28, 10))
lstm_output, state_h, state_c = LSTM(units=32,
                                     return_sequences=True,
                                     return_state=True,
                                     unroll=True)(lstm_input)
print(lstm_output, state_h, state_c)
and it gives
Using TensorFlow backend.
(<tf.Tensor 'lstm_1/transpose_1:0' shape=(?, 28, 32) dtype=float32>, <tf.Tensor 'lstm_1/mul_167:0' shape=(?, 32) dtype=float32>, <tf.Tensor 'lstm_1/add_221:0' shape=(?, 32) dtype=float32>)
Let's break it down, looking at this line from the Keras source code - return h, [h, c]:

- lstm_output is h at every time step, so it has shape (batch_size, sequence_length, hidden_size) - in your case (?, 28, 32). As the documentation says, it is returned as a sequence because you set return_sequences=True.
- state_h is the last time step's h, and you can check that it equals lstm_output[:, -1]. That is why its shape is (?, 32): it is the output at the last time step only, not at every time step.
- state_c is the last time step's cell state c, also of shape (?, 32).

The equations are often implemented in different ways to optimise for certain features, but they all follow the original paper. Note that there may be variations in the activations, such as using hard_sigmoid for the recurrent activation, and these should be clearly noted in the documentation.
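To make the relationship between the three returned tensors concrete, here is a minimal NumPy sketch of the LSTM recurrence for a single sample. This is not Keras's actual implementation - the weights are random stand-ins for trained parameters, and the lstm_forward helper is hypothetical - but it shows why lstm_output collects h at every time step while state_h and state_c are just the final h and c:

```python
import numpy as np

def lstm_forward(x, hidden_size, rng):
    """Minimal NumPy sketch of the LSTM recurrence (single sample, no batch dim)."""
    seq_len, input_size = x.shape
    # Random weights stand in for trained parameters: one input kernel W,
    # recurrent kernel U, and bias b per gate (input, forget, output, candidate).
    W = rng.standard_normal((4, hidden_size, input_size)) * 0.1
    U = rng.standard_normal((4, hidden_size, hidden_size)) * 0.1
    b = np.zeros((4, hidden_size))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for t in range(seq_len):
        i = sigmoid(W[0] @ x[t] + U[0] @ h + b[0])   # input gate i_t
        f = sigmoid(W[1] @ x[t] + U[1] @ h + b[1])   # forget gate f_t
        o = sigmoid(W[2] @ x[t] + U[2] @ h + b[2])   # output gate o_t
        g = np.tanh(W[3] @ x[t] + U[3] @ h + b[3])   # candidate cell state
        c = f * c + i * g                            # cell state c_t
        h = o * np.tanh(c)                           # hidden state h_t
        outputs.append(h)                            # keep h at every step
    # Same structure as the Keras source line quoted above: return h, [h, c]
    return np.stack(outputs), h, c

rng = np.random.default_rng(0)
x = rng.standard_normal((28, 10))                  # 28 time steps, 10 features
lstm_output, state_h, state_c = lstm_forward(x, 32, rng)
print(lstm_output.shape)                           # (28, 32)
print(np.allclose(lstm_output[-1], state_h))       # True
```

With a batch dimension added, the shapes become (batch, 28, 32) and (batch, 32), matching the (?, 28, 32) and (?, 32) tensors printed above, and the same check reads np.allclose(lstm_output[:, -1], state_h).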