Does batch_first affect the hidden tensors in PyTorch LSTMs?
That is, if the batch_first parameter is True,
will the hidden state be (num_layers * num_directions, num_batch, encoding_dim)
or (num_batch, num_layers * num_directions, encoding_dim)?
I've tested both, and neither gives an error.
If bidirectional is True, the number of directions will be 2; otherwise it will be 1. batch_first=True means the batch should be the first dimension, so the input has shape (Batch Size, Sequence Length, Input Dimension); if we do not set batch_first=True, the RNN expects data of shape (Sequence Length, Batch Size, Input Dimension).
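A minimal sketch of the two input layouts (the sizes batch=3, seq=5, features=10, hidden=20 are arbitrary assumptions, not from the question):

    import torch
    import torch.nn as nn

    lstm_bf = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)
    lstm_tf = nn.LSTM(input_size=10, hidden_size=20, batch_first=False)  # the default

    x_bf = torch.randn(3, 5, 10)  # (batch, seq, feature)  for batch_first=True
    x_tf = torch.randn(5, 3, 10)  # (seq, batch, feature)  for batch_first=False

    out_bf, _ = lstm_bf(x_bf)  # out_bf.shape: (3, 5, 20)
    out_tf, _ = lstm_tf(x_tf)  # out_tf.shape: (5, 3, 20)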
The output of the PyTorch LSTM layer is a tuple with two elements. The first element is the LSTM's output for all timesteps (hᵗ : ∀t = 1, 2, … T), with shape (timesteps, batch, output_features) when batch_first=False, or (batch, timesteps, output_features) when batch_first=True. The second element is itself a tuple of two tensors: the final hidden state h_n and the final cell state c_n.
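To make that structure concrete, here is a short sketch of unpacking the return value (the sizes T=7, B=4, F=10, H=20 are arbitrary assumptions):

    import torch
    import torch.nn as nn

    T, B, F, H = 7, 4, 10, 20                    # timesteps, batch, input features, hidden size
    lstm = nn.LSTM(input_size=F, hidden_size=H)  # batch_first=False (the default)

    x = torch.randn(T, B, F)
    output, (h_n, c_n) = lstm(x)

    print(output.shape)  # torch.Size([7, 4, 20])  -- hᵗ for every t = 1..T
    print(h_n.shape)     # torch.Size([1, 4, 20])  -- final hidden state
    print(c_n.shape)     # torch.Size([1, 4, 20])  -- final cell state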
I was thinking about the same question some time ago. As laydog outlined, the documentation says:
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature)
As I understand the question, we are talking about the hidden/cell state tuple, not the actual inputs and outputs.
To me it seems pretty clear that this does not affect the hidden state, since the docs mention:
(batch, seq, feature)
This clearly refers to the inputs and outputs, not the state tuple, which consists of two tensors, each with shape:
(num_layers * num_directions, batch, hidden_size)
So I'm pretty certain the hidden and cell states are not affected by batch_first; it would also make little sense to change the order of the state tuple's dimensions. A quick check confirms this (see the sketch below).
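Here is that check as a sketch (layer sizes are arbitrary assumptions): toggling batch_first swaps the first two dimensions of output, but h_n and c_n stay at (num_layers * num_directions, batch, hidden_size) either way.

    import torch
    import torch.nn as nn

    B, T, F, H = 3, 5, 10, 20
    for batch_first in (False, True):
        lstm = nn.LSTM(input_size=F, hidden_size=H, num_layers=2,
                       bidirectional=True, batch_first=batch_first)
        x = torch.randn(B, T, F) if batch_first else torch.randn(T, B, F)
        output, (h_n, c_n) = lstm(x)
        print(batch_first, output.shape, h_n.shape, c_n.shape)

    # False torch.Size([5, 3, 40]) torch.Size([4, 3, 20]) torch.Size([4, 3, 20])
    # True  torch.Size([3, 5, 40]) torch.Size([4, 3, 20]) torch.Size([4, 3, 20])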
Hope this helps.