
What's the difference between "hidden" and "output" in PyTorch LSTM?

I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says:

Outputs: output, (h_n, c_n)

  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
  • c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len

It seems that the variables output and h_n both give the values of the hidden state. Does h_n just redundantly provide the last time step that's already included in output, or is there something more to it than that?

N. Virgo asked Jan 17 '18 13:01




1 Answer

I made a diagram. The names follow the PyTorch docs, although I renamed num_layers to w.

output contains the hidden state of the last layer ("last" depth-wise, not time-wise) at every time step. (h_n, c_n) contains the hidden and cell states of every layer after the last time step, t = n, so you can pass them to another LSTM as its initial state.

LSTM diagram

The batch dimension is not included.
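
To make the relationship concrete, here is a minimal sketch that checks it numerically. The sizes (seq_len=5, batch=3, and so on) and variable names are arbitrary choices for illustration, and it assumes the default batch_first=False layout quoted in the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary illustrative sizes (assumptions, not from the docs).
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2

# Unidirectional, stacked LSTM with the default (seq_len, batch, feature) layout.
lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers)
x = torch.randn(seq_len, batch, input_size)

output, (h_n, c_n) = lstm(x)

# output: last layer only, every time step -> (seq_len, batch, hidden_size)
# h_n, c_n: every layer, last time step only -> (num_layers, batch, hidden_size)
print(output.shape, h_n.shape, c_n.shape)

# The overlap: the last time step of output is the last layer of h_n.
same_last = torch.allclose(output[-1], h_n[-1])

# (h_n, c_n) can seed a later call, which continues the sequence
# as if both chunks had been processed in one go.
x2 = torch.randn(4, batch, input_size)
out_cont, _ = lstm(x2, (h_n, c_n))
out_full, _ = lstm(torch.cat([x, x2], dim=0))
continues = torch.allclose(out_cont, out_full[seq_len:], atol=1e-5)

# Bidirectional case: output concatenates both directions on the feature axis,
# and h_n stacks (layer, direction) pairs along dim 0.
bi = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
out_bi, (h_bi, _) = bi(x)

# Forward direction finishes at the last time step, backward at the first.
fwd_ok = torch.allclose(out_bi[-1, :, :hidden_size], h_bi[-2])
bwd_ok = torch.allclose(out_bi[0, :, hidden_size:], h_bi[-1])

print(same_last, continues, fwd_ok, bwd_ok)
```

So h_n is not purely redundant: only h_n[-1] overlaps with output. The lower layers' final hidden states and all of c_n appear nowhere in output, and in the bidirectional case output[-1] holds the final state of the forward direction only.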

nnnmmm answered Sep 23 '22 14:09
