I can successfully train my stateful LSTM using Keras. My batch size is 60, and the number of samples in every input I send to the network is divisible by batch_size. Here is my snippet:
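from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.optimizers import RMSprop

model = Sequential()
# A stateful layer needs a fixed batch size, so batch_input_shape replaces input_shape.
model.add(LSTM(80, batch_input_shape=(60, trainx.shape[1], trainx.shape[2]),
               stateful=True, return_sequences=True))
model.add(Dropout(0.15))
model.add(LSTM(40, return_sequences=False))
model.add(Dense(40))
model.add(Dropout(0.3))
model.add(Dense(output_dim=1))
model.add(Activation("linear"))
# Pass the optimizer object to compile(); the string "rmsprop" would use the default settings.
rmsprop = RMSprop(lr=0.005, rho=0.9, epsilon=1e-08, decay=0.0)
model.compile(loss="mse", optimizer=rmsprop)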
My training call, which runs successfully:
model.fit(trainx[:3000,:],trainy[:3000],validation_split=0.1,shuffle=False,nb_epoch=9,batch_size=60)
Now I try to predict on the test set, whose size is again divisible by 60, but I get this error:
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 240 samples. Batch size: 32.
Can anyone tell me what is wrong here? I am confused; I have tried many things, but nothing helps.
In my experience, a batch size of 64 is a good starting point in most cases. Other common choices are 32 or 128; these values are all multiples of 8. Note that the batch size should be tuned based on observed performance.
Setting an RNN to be stateful means that it can build up state across its training sequence and even maintain that state when making predictions. The potential benefits of stateful RNNs are smaller network sizes and/or lower training times.
All RNN and LSTM models are stateful in theory. These models are meant to remember the entire sequence for prediction or classification tasks. In practice, however, you need to split the data into batches to train a model with the backpropagation algorithm, and the gradient can't backpropagate between batches.
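Stateless works best when the sequences you're learning aren't dependent on one another. Sentence-level prediction of a next word might be a good example of when to use stateless. In the stateless configuration, Keras resets the LSTM cell memory after every batch. In the stateful configuration, the memory carries over from one batch to the next until you call reset_states(), which is typically done at the end of every epoch.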
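To make the state handling concrete, here is a minimal sketch of a manual epoch loop for a stateful model. The function name fit_stateful is hypothetical and not part of Keras; it assumes a model object that exposes fit() and reset_states(), and it uses the nb_epoch argument from the Keras version in the question (newer releases call it epochs).

```python
def fit_stateful(model, x, y, n_epochs, batch_size):
    """Train one epoch at a time, clearing the LSTM state between epochs.

    Hypothetical helper: assumes `model` exposes fit() and reset_states().
    """
    for _ in range(n_epochs):
        # shuffle=False keeps batches in their original order, so the state
        # carried from one batch to the next stays meaningful.
        model.fit(x, y, nb_epoch=1, batch_size=batch_size, shuffle=False)
        # A stateful layer keeps its state until it is reset explicitly.
        model.reset_states()
```

Whether to reset after every epoch or at some other boundary depends on how the sequences are laid out across batches, so treat the loop above as one reasonable choice rather than a rule.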
I suspect that the reason for the error is that you did not specify the batch size in model.predict. As you can see in the documentation in the "predict" section, the default parameters are
model.predict(self, x, batch_size=32, verbose=0)
which is why 32 appears in your error message. So you need to specify batch_size=60 in model.predict.
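A short sketch of the fix, under a few assumptions: model is the trained network from the question, and the helper names trim_to_batch_multiple and predict_stateful are hypothetical, not part of Keras. The only Keras call used is model.predict with an explicit batch_size. The trimming step is optional and only matters if the test set is not already a multiple of the batch size.

```python
BATCH_SIZE = 60  # must match the batch size given in batch_input_shape


def trim_to_batch_multiple(samples, batch_size=BATCH_SIZE):
    """Drop trailing samples so that len(samples) is divisible by batch_size."""
    usable = (len(samples) // batch_size) * batch_size
    return samples[:usable]


def predict_stateful(model, samples, batch_size=BATCH_SIZE):
    """Call model.predict with an explicit batch size instead of the default 32."""
    trimmed = trim_to_batch_multiple(samples, batch_size)
    return model.predict(trimmed, batch_size=batch_size)
```

With the 240 test samples from the error message, 240 % 60 is 0 while 240 % 32 is 16, which is why the default batch size fails and batch_size=60 does not. Calling predict_stateful(model, testx), where testx stands for your test array, is then equivalent to model.predict(testx, batch_size=60).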