
Stateful LSTM fails to predict due to batch_size issue

I am able to successfully train my stateful LSTM using Keras. My batch size is 60, and the number of samples in every input I send to the network is divisible by the batch size. Here is my snippet:

import keras
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
# batch_input_shape already fixes the input shape, so a separate
# input_shape argument is redundant.
model.add(LSTM(80, batch_input_shape=(60, trainx.shape[1], trainx.shape[2]),
               stateful=True, return_sequences=True))
model.add(Dropout(0.15))
model.add(LSTM(40, return_sequences=False))
model.add(Dense(40))
model.add(Dropout(0.3))
model.add(Dense(output_dim=1))
model.add(Activation("linear"))
# Assign the optimizer and pass the instance to compile; otherwise the
# custom learning rate is discarded and the default "rmsprop" is used.
optimizer = keras.optimizers.RMSprop(lr=0.005, rho=0.9, epsilon=1e-08, decay=0.0)
model.compile(loss="mse", optimizer=optimizer)

My training line, which runs successfully:

  model.fit(trainx[:3000,:],trainy[:3000],validation_split=0.1,shuffle=False,nb_epoch=9,batch_size=60)

Now I try to predict on the test set, whose size is again divisible by 60, but I get this error:

ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 240 samples. Batch size: 32.

Can anyone tell me what is wrong above? I am confused; I have tried so many things, but nothing helps.

Harshit asked Jul 14 '17 13:07


People also ask

What is Batch_size in LSTM?

From experience, a batch size of 64 is often a good default. In some cases you may instead choose 32, 64, or 128; these common choices are powers of two. Note that batch-size fine-tuning should be guided by observed performance.

What is the advantage of stateful RNNs over stateless RNNs?

Setting an RNN to be stateful means that it can build a state across its training sequence and even maintain that state when doing predictions. The benefits of using stateful RNNs are smaller network sizes and/or lower training times.

What is stateful LSTM?

All RNN and LSTM models are stateful in theory; these models are meant to remember the entire sequence for prediction or classification tasks. In practice, however, you train on batches with the backpropagation algorithm, and gradients cannot backpropagate between batches.

What is stateful and stateless LSTM?

Stateless works best when the sequences you're learning aren't dependent on one another; sentence-level prediction of the next word is a good example of when to use stateless. The stateless configuration resets LSTM cell memory after every batch, whereas the stateful configuration preserves it across batches until you reset it explicitly, as the sketch below illustrates.
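
To make the stateful/stateless distinction concrete, here is a minimal self-contained training sketch (toy random data; all sizes and names are illustrative, using the same Keras-1-era API as the question):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size = 60
# Toy data: 600 samples, 10 timesteps, 1 feature (all illustrative).
x = np.random.rand(600, 10, 1)
y = np.random.rand(600, 1)

model = Sequential()
# Stateful layers require a fixed batch size, hence batch_input_shape.
model.add(LSTM(32, batch_input_shape=(batch_size, 10, 1), stateful=True))
model.add(Dense(1))
model.compile(loss="mse", optimizer="rmsprop")

# State carries across batches within an epoch, so train one epoch at a
# time and reset the state at each epoch boundary.
for epoch in range(5):
    model.fit(x, y, batch_size=batch_size, shuffle=False,
              nb_epoch=1, verbose=0)  # `epochs=1` in Keras 2
    model.reset_states()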


1 Answer

I suspect the reason for the error is that you did not specify the batch size in model.predict. As you can see in the documentation for the predict method, the default parameters are

model.predict(self, x, batch_size=32, verbose=0)

which is why 32 appears in your error message. So you need to specify batch_size=60 in model.predict, as in the sketch below.
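
A minimal sketch of the fixed call (assuming the test inputs are in an array named testx, a name carried over from the question's trainx):

# Pass the training batch size; stateful layers require every batch,
# including prediction batches, to have exactly this many samples.
predictions = model.predict(testx, batch_size=60)

Since the network is stateful, you may also want to call model.reset_states() before predicting so that leftover training state does not leak into the test sequence.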

Miriam Farber answered Oct 24 '22 12:10