
batch_input_shape tuple on Keras LSTM

I have the following feature vector, which consists of a single feature for each sample and 32 samples in total:

X = [[0.1], [0.12], [0.3] ... [0.10]]

and a label vector that consists of binary values

Y = [0, 1, 0, 0, ..., 1] (with 32 samples as well)

I'm trying to use a Keras LSTM to predict the next value of the sequence based on a new entry. What I can't figure out is what the "batch_input_shape" tuple means, for instance:

 model.add(LSTM(neurons, batch_input_shape=(?, ?, ?), return_sequences=False, stateful=True))

According to this article the first one is the batch size, but what about the other two? Are they the number of features for each sample and the number of samples? What should be the value of batch_size in this case?

At the moment I'm receiving the error message:

ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (32, 1)

Edit: Here is the model declaration:

def create_lstm(batch_size, n_samples, neurons, dropout):
    model = Sequential()
    model.add(LSTM(neurons, batch_size=batch_size, input_shape=(n_samples, 1), return_sequences=False, stateful=True))
    model.add(Dropout(dropout))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
asked Oct 27 '17 by André Heringer


1 Answer

According to this Keras Sequential Model guide on "stateful" LSTM (at the very bottom), we can see what those three elements mean:

Expected input batch shape: (batch_size, timesteps, data_dim). Note that we have to provide the full batch_input_shape since the network is stateful. The sample of index i in batch k is the follow-up for the sample i in batch k-1.
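
For concreteness, here is a minimal sketch of a stateful layer built with that tuple (the neuron count, batch size, timesteps, and feature count below are made-up illustration values, not taken from your problem):

from keras.models import Sequential
from keras.layers import LSTM, Dense

# Hypothetical values for illustration: batches of 8 samples,
# sequences of 1 timestep, 1 feature per timestep.
batch_size, timesteps, data_dim = 8, 1, 1

model = Sequential()
# A stateful LSTM needs the full batch_input_shape, including batch_size.
model.add(LSTM(16, batch_input_shape=(batch_size, timesteps, data_dim),
               return_sequences=False, stateful=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])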

The first one, as you already discovered, is the size of the batches to be used during training. What value you should choose depends in part on your specific problem, but it is mostly determined by the size of your dataset. If you specify a batch size of x and your dataset contains N samples, during training your data will be split into N/x groups (batches) of size x each.

Therefore, you probably want your batch size to be smaller than the size of your dataset. There is no single correct value, but you want it to be proportionally smaller (say, one or two orders of magnitude) than all your data. Some people prefer to use powers of 2 (32, 128, etc.) as their batch sizes. In some cases it is also possible to not use batches at all and to train with all your data at once (although that is not necessarily better).
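
A quick sketch of that split (N and x here are made-up numbers; the commented model.fit call shows the standard Keras way of letting the library do the batching for you):

import numpy as np

# Hypothetical numbers: N = 3200 samples, batch size x = 32,
# i.e. two orders of magnitude smaller than the dataset.
N, x = 3200, 32
print(N // x)  # 100 batches per epoch

# Keras performs this split when you pass batch_size to fit():
# model.fit(X, Y, batch_size=x, epochs=10, shuffle=False)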

The other two values are the timesteps (the size of your temporal dimension, that is, the number of "frames" each sample sequence has) and the data dimension (the size of your data vector at each timestep).

For example, say one of your input sequences looks like X = [[0.54, 0.3], [0.11, 0.2], [0.37, 0.81]]. We can see that this sequence has 3 timesteps and a data_dim of 2.
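
You can verify those axes directly with NumPy (the batch built below is a made-up example just to show where the samples axis goes):

import numpy as np

# The sequence from the example above: 3 timesteps, 2 features each.
x = np.array([[0.54, 0.3], [0.11, 0.2], [0.37, 0.81]])
print(x.shape)  # (3, 2), i.e. (timesteps, data_dim)

# A batch of such sequences adds a leading samples axis:
batch = np.stack([x, x])  # a made-up batch of two copies
print(batch.shape)        # (2, 3, 2), i.e. (samples, timesteps, data_dim)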

So, the ValueError you are getting is most probably due to this (the error even hints that it expected 3 dimensions): your X has shape (32, 1), which is (samples, features), while the LSTM expects (samples, timesteps, data_dim). Also, make sure your array is a NumPy array.
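
As a sketch of that fix (the data below is randomly generated to stand in for your X and Y, which I don't have), reshaping the (32, 1) array to three dimensions resolves the mismatch:

import numpy as np

# Stand-in data with the shapes from the question: 32 samples, 1 feature each.
X = np.random.rand(32, 1)
Y = np.random.randint(0, 2, size=(32,))

# Reshape X to the 3D layout the LSTM expects:
# (samples, timesteps, data_dim) = (32, 1, 1)
X = X.reshape((32, 1, 1))
print(X.shape)  # (32, 1, 1)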

As a last comment, given that you say you have 32 samples in total (that is, your whole dataset contains 32 samples), I consider that too little data to be using batches; the smallest batch size I have usually seen is 32, so consider obtaining more data before trying to use batch training. Hope this helps.

answered Nov 11 '22 by DarkCygnus