
Does batch_first affect hidden tensors in Pytorch LSTMs?

Tags:

lstm

pytorch


That is, if the batch_first parameter is True, will the hidden state be (numlayer*direction, num_batch, encoding_dim) or (num_batch, numlayer*direction, encoding_dim)?

I've tested both, and neither gives an error.

hhoomn asked Apr 25 '18 00:04




1 Answer

I wondered about the same thing some time ago. As laydog pointed out, the documentation says:

batch_first – If True, then the input and output tensors are provided as (batch, seq, feature)

As I understand the question, it is about the hidden / cell state tuple, not the actual inputs and outputs.

To me it seems fairly clear that batch_first does not affect the hidden state, since the documentation only mentions:

(batch, seq, feature)

This clearly refers to the inputs and outputs, not the state tuple, which consists of two tensors (hidden state and cell state), each with shape:

(num_layers * num_directions, batch, hidden_size)

So I'm fairly certain that the hidden and cell states are not affected by batch_first. Changing the dimension order of the state tuple would not make much sense to me either.
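A quick way to check this is to run an LSTM with batch_first=True and print the resulting shapes. The following is a minimal sketch, not taken from the original post: the sizes are arbitrary illustrative values, chosen so that the batch size (3) differs from num_layers * num_directions (4) and the two layouts cannot be confused.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the question).
num_layers, num_directions = 2, 2
batch, seq_len, input_size, hidden_size = 3, 5, 4, 6

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
               bidirectional=True, batch_first=True)

# With batch_first=True the input is (batch, seq, feature).
x = torch.randn(batch, seq_len, input_size)

# The initial states are still (num_layers * num_directions, batch, hidden_size).
h0 = torch.zeros(num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(num_layers * num_directions, batch, hidden_size)

output, (h_n, c_n) = lstm(x, (h0, c0))

output_shape = tuple(output.shape)
h_n_shape = tuple(h_n.shape)
c_n_shape = tuple(c_n.shape)

print(output_shape)  # (3, 5, 12): batch-first, features = num_directions * hidden_size
print(h_n_shape)     # (4, 3, 6): layers * directions first, batch second
print(c_n_shape)     # (4, 3, 6)
```

With these sizes, a state shaped (batch, num_layers * num_directions, hidden_size) should be rejected as a shape mismatch. If the batch size happens to equal num_layers * num_directions, both layouts have the same shape, which may be why testing both produced no error.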

Hope this helps.

MBT answered Sep 25 '22 01:09