Does <code>batch_first</code> affect hidden tensors in PyTorch LSTMs? That is, if the <code>batch_first</code> parameter is true, will the hidden state be <code>(numlayer*direction,num_batch,encoding_dim)</code> or <code>(num_batch,numlayer*direction,encoding_dim)</code>? I've tested both, and neither gives an error.

I was wondering about the same question some time ago. As laydog outlined, the documentation says: <blockquote> batch_first – If True, then the input and output tensors are provided as (batch, seq, feature) </blockquote> As I understand it, the question is about the hidden / cell state tuple, not the actual inputs and outputs. To me the wording makes it fairly clear that <code>batch_first</code> does not affect the hidden state, since the layout it mentions is: <blockquote> (batch, seq, feature) </blockquote> This refers to the inputs and outputs, not the state tuple, which consists of two tensors, each with shape: <blockquote> (num_layers * num_directions, batch, hidden_size) </blockquote> So I'm pretty certain the hidden and cell states are not affected by <code>batch_first</code>; changing the dimension order of the state tuple would not make sense to me either. Hope this helps.
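One way to check this empirically is to build the same LSTM with and without <code>batch_first</code> and compare the shapes that come back. The sketch below is only an illustration: the helper name `lstm_shapes` and all sizes (8 input features, 16 hidden units, 2 layers, bidirectional, batch of 3, sequence length 5) are arbitrary assumptions, not anything taken from the question.

```python
import torch
import torch.nn as nn


def lstm_shapes(batch_first):
    """Run a small LSTM and return the shapes of (output, h_n, c_n).

    Hypothetical helper for illustration; all sizes are arbitrary.
    """
    input_size, hidden_size = 8, 16
    num_layers, num_directions = 2, 2  # bidirectional => 2 directions
    batch, seq_len = 3, 5

    lstm = nn.LSTM(
        input_size=input_size,
        hidden_size=hidden_size,
        num_layers=num_layers,
        bidirectional=True,
        batch_first=batch_first,
    )

    # Only the input layout depends on batch_first.
    if batch_first:
        x = torch.randn(batch, seq_len, input_size)
    else:
        x = torch.randn(seq_len, batch, input_size)

    # The initial states use the same layout in both cases:
    # (num_layers * num_directions, batch, hidden_size)
    h0 = torch.zeros(num_layers * num_directions, batch, hidden_size)
    c0 = torch.zeros(num_layers * num_directions, batch, hidden_size)

    output, (h_n, c_n) = lstm(x, (h0, c0))
    return tuple(output.shape), tuple(h_n.shape), tuple(c_n.shape)


if __name__ == "__main__":
    for flag in (False, True):
        out_shape, h_shape, c_shape = lstm_shapes(flag)
        print(f"batch_first={flag}: output={out_shape}, h_n={h_shape}, c_n={c_shape}")
```

With these sizes, the output shape switches between `(5, 3, 32)` and `(3, 5, 32)`, while `h_n` and `c_n` stay at `(4, 3, 16)` in both runs. If you need the states batch-first for later processing, transposing them yourself, for example with `h_n.transpose(0, 1)`, is one option.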

answered Sep 25 '22 01:09

MBT

Tags:

lstm

pytorch

hhoomn
