How can I get both the final hidden state and the full output sequence from an LSTM layer when using a Bidirectional wrapper?

I have followed the steps in https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/, but when it comes to the Bidirectional LSTM, I tried this:

lstm, state_h, state_c = Bidirectional(LSTM(128, return_sequences=True, return_state=True))(input)

but it doesn't work.

Is there a way to get both the final hidden state and the full output sequence from an LSTM layer when using the Bidirectional wrapper?

asked Mar 16 '18 by jessie tio



1 Answer

The call Bidirectional(LSTM(128, return_sequences=True, return_state=True))(input) returns 5 tensors:

  1. The entire sequence of hidden states; by default, this is the concatenation of the forward and backward sequences at each timestep.
  2. The last hidden state h for the forward LSTM
  3. The last cell state c for the forward LSTM
  4. The last hidden state h for the backward LSTM
  5. The last cell state c for the backward LSTM
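
As a minimal sketch (assuming TensorFlow's Keras; the sequence length, feature size, and unit count are illustrative), you can inspect the five tensors and their shapes like this:

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Bidirectional
from tensorflow.keras.models import Model

inputs = Input(shape=(10, 8))  # 10 timesteps, 8 features
outputs = Bidirectional(LSTM(128, return_sequences=True, return_state=True))(inputs)
lstm, forward_h, forward_c, backward_h, backward_c = outputs

model = Model(inputs, [lstm, forward_h, forward_c, backward_h, backward_c])
seq, fh, fc, bh, bc = model.predict(np.zeros((1, 10, 8)))
print(seq.shape)  # (1, 10, 256): forward and backward outputs concatenated
print(fh.shape, fc.shape, bh.shape, bc.shape)  # each (1, 128)
```

Note that the sequence output is 256-wide (2 × 128) because the default `merge_mode` of `Bidirectional` is `'concat'`, while each returned state is only 128-wide (one direction).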

The line you've posted raises an error because it tries to unpack the five returned values into just three variables (lstm, state_h, state_c).

To correct it, simply unpack the returned value into 5 variables. If you want to merge the states, you can concatenate the forward and backward states with a Concatenate layer.

from keras.layers import Bidirectional, LSTM, Concatenate

lstm, forward_h, forward_c, backward_h, backward_c = Bidirectional(LSTM(128, return_sequences=True, return_state=True))(input)
state_h = Concatenate()([forward_h, backward_h])  # merged final hidden state
state_c = Concatenate()([forward_c, backward_c])  # merged final cell state
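
A common reason to want the merged states is to initialize a decoder in a seq2seq model. As a usage sketch (my own illustration, not part of the question; the layer sizes and feature dimension are assumptions), note that the decoder must be 256 units wide to accept the concatenated 2 × 128 states:

```python
from tensorflow.keras.layers import Input, LSTM, Bidirectional, Concatenate
from tensorflow.keras.models import Model

encoder_in = Input(shape=(None, 8))
lstm, fh, fc, bh, bc = Bidirectional(
    LSTM(128, return_sequences=True, return_state=True))(encoder_in)
state_h = Concatenate()([fh, bh])  # shape (batch, 256)
state_c = Concatenate()([fc, bc])  # shape (batch, 256)

# Decoder LSTM seeded with the encoder's merged states.
decoder_in = Input(shape=(None, 8))
decoder_out = LSTM(256, return_sequences=True)(
    decoder_in, initial_state=[state_h, state_c])
model = Model([encoder_in, decoder_in], decoder_out)
print(model.output_shape)  # (None, None, 256)
```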
answered Sep 25 '22 by Yu-Yang