Confusion about Keras RNN Input shape requirement

Tags: keras, lstm

I have read plenty of posts on this point. They are inconsistent with each other, and every answer seems to give a different explanation, so I thought I would ask based on my analysis of all of them.

As the Keras RNN documentation states, the input shape is always of the form (batch_size, timesteps, input_dim). I am a bit confused about that, but my guess (I am not sure) is that input_dim is always 1 while timesteps depends on the problem (it could also be the data dimension). Is that roughly correct?

The reason I ask is that I always get an error when I try to set input_dim to my dataset's dimension (which is what the name input_dim suggests), so I assumed that input_dim represents the shape of the input vector fed to the LSTM at each time step. Am I wrong again?

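from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.model_selection import train_test_split

# C is my 2-D data array (samples, columns) and r holds the labels.
# Add a trailing axis so that C becomes (samples, timesteps, 1).
C = C.reshape((C.shape[0], C.shape[1], 1))
tr_C, ts_C, tr_r, ts_r = train_test_split(C, r, train_size=.8)
batch_size = 1000

print('Build model...')
model = Sequential()

# stateful=True requires the full batch shape: (batch, timesteps, features)
model.add(LSTM(8, batch_input_shape=(batch_size, C.shape[1], 1), stateful=True, activation='relu'))
model.add(Dense(1, activation='relu'))

print('Training...')
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(tr_C, tr_r,
          batch_size=batch_size, epochs=1,
          shuffle=True, validation_data=(ts_C, ts_r))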

Thanks!

asked Nov 13 '17 16:11

Kristofer

1 Answer

Indeed, input_dim is the shape of the input vector at each time step. In other words, input_dim is the number of input features.

It's not necessarily 1, though. If you're working with more than one variable, it can be any number.

Suppose you have 10 sequences, each with 200 time steps, and you're measuring only temperature. Then you have one feature:

input_shape = (200,1) -- notice that the batch size (the number of sequences) is left out here
batch_input_shape = (10,200,1) -- you need a batch input shape only in specific cases, such as stateful = True.
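
As a rough illustration of the single-feature case, here is a minimal NumPy sketch. The array name temperature and the zero-filled data are made up for the example; only the shapes matter.

```python
import numpy as np

# Hypothetical data: 10 sequences, 200 temperature readings each.
n_sequences, n_timesteps = 10, 200
temperature = np.zeros((n_sequences, n_timesteps))

# Add a trailing feature axis so each time step holds a length-1 vector.
x = temperature.reshape((n_sequences, n_timesteps, 1))

input_shape = x.shape[1:]    # (timesteps, features); batch size left out
batch_input_shape = x.shape  # (batch, timesteps, features)

print(input_shape)
print(batch_input_shape)
```

This is the same reshape the question applies to C: the data stays the same, it just gains a final axis of size 1.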

Now suppose you're measuring not only temperature but also pressure and volume. Now you have three input features:

input_shape = (200,3)
batch_input_shape = (10,200,3)
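
If the three measurements start out as separate 2-D arrays, one way to get the three-feature layout is to stack them along a new last axis. This is only a sketch with random placeholder data; the variable names are assumptions for the example.

```python
import numpy as np

n_sequences, n_timesteps = 10, 200
rng = np.random.default_rng(0)

# Hypothetical measurements, one (sequences, timesteps) array per variable.
temperature = rng.random((n_sequences, n_timesteps))
pressure = rng.random((n_sequences, n_timesteps))
volume = rng.random((n_sequences, n_timesteps))

# Stack along a new last axis: each time step becomes a 3-element vector.
x = np.stack([temperature, pressure, volume], axis=-1)

input_shape = x.shape[1:]    # (timesteps, features)
batch_input_shape = x.shape  # (batch, timesteps, features)

print(input_shape)
print(batch_input_shape)
```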

In other words, the first dimension is the number of sequences, the second is the length of each sequence (how many measurements over time), and the last is how many variables are measured at each time step.
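
To check an array against this layout before passing it to a recurrent layer, a small helper along these lines may be useful. The function describe_rnn_input is hypothetical (it is not part of Keras) and only inspects the array's shape.

```python
import numpy as np

def describe_rnn_input(x):
    """Name the three axes of an RNN input array (hypothetical helper)."""
    if x.ndim != 3:
        raise ValueError(
            f"expected a 3-D array (sequences, timesteps, features), got {x.ndim}-D"
        )
    n_sequences, n_timesteps, n_features = x.shape
    return {
        "sequences": n_sequences,   # first axis: how many sequences
        "timesteps": n_timesteps,   # second axis: length of each sequence
        "features": n_features,     # last axis: variables per time step
    }

info = describe_rnn_input(np.zeros((10, 200, 3)))
print(info)
```

A 2-D array such as the unreshaped C from the question would raise the ValueError here, which is a hint that the feature axis is still missing.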

answered Sep 21 '22 19:09

Daniel Möller
