
Why do I get a Keras LSTM RNN input_shape error?

I keep getting an input_shape error from the following code.

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM

def _load_data(data):
    """
    data should be pd.DataFrame()
    """
    n_prev = 10
    docX, docY = [], []
    for i in range(len(data)-n_prev):
        docX.append(data.iloc[i:i+n_prev].as_matrix())
        docY.append(data.iloc[i+n_prev].as_matrix())
    if not docX:
        pass
    else:
        alsX = np.array(docX)
        alsY = np.array(docY)
        return alsX, alsY

X, y = _load_data(dframe)
poi = int(len(X) * .8)
X_train = X[:poi]
X_test = X[poi:]
y_train = y[:poi]
y_test = y[poi:]

input_dim = 3

All of the above runs smoothly. This is where it goes wrong.

in_out_neurons = 2
hidden_neurons = 300
model = Sequential()
#model.add(Masking(mask_value=0, input_shape=(input_dim,)))
model.add(LSTM(in_out_neurons, hidden_neurons, return_sequences=False, input_shape=(len(full_data),)))
model.add(Dense(hidden_neurons, in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, validation_split=0.05)

It returns this error.

Exception: Invalid input shape - Layer expects input ndim=3, was provided with input shape (None, 10320)

The Keras documentation says to specify a tuple "(e.g. (100,) for 100-dimensional inputs)."

My data set consists of one column with a length of 10320. I assume that means I should pass (10320,) as the input_shape, but I get the error anyway. Does anyone have a solution?

Ravaal asked Mar 17 '16 18:03


2 Answers

The following is a working version for Keras 2.0.0, adapted from radix's code.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
import numpy as np

X = np.random.rand(1000)
y = 2 * X

poi = int(len(X) * .8)
X_train = X[:poi]
y_train = y[:poi]

X_test = X[poi:]
y_test = y[poi:]

# Reshape the inputs to 3D: (nb_samples, timesteps, input_dim)
X_train = X_train.reshape(len(X_train), 1, 1)
# Reshape the targets to 2D: (nb_samples, output_dim)
y_train = y_train.reshape(len(y_train), 1)

# Reshape the test data the same way
X_test = X_test.reshape(len(X_test), 1, 1)
y_test = y_test.reshape(len(y_test), 1)


#in_out_neurons = 2
in_out_neurons = 1

hidden_neurons = 300
model = Sequential()
# model.add(Masking(mask_value=0, input_shape=(input_dim,)))
# Important change for Keras 2.0.0: drop batch_input_shape and pass
# input_shape=(timesteps, input_dim), which is (1, 1) here
model.add(LSTM(hidden_neurons, return_sequences=False, input_shape=(X_train.shape[1],X_train.shape[2])))
# only specify the output dimension
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.summary()
model.fit(X_train, y_train, epochs=10, validation_split=0.05)

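# Calculate the test-set MSE; flatten both arrays so they subtract element-wise
# instead of broadcasting (n,) against (n, 1) into an (n, n) matrix
preds = model.predict(X_test).reshape(len(y_test))
print(preds)
MSE = np.mean((preds - y_test.reshape(len(y_test)))**2)
print('MSE ', MSE)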
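
The version above feeds the LSTM one timestep per sample. If you want windows of 10 previous values, as in the question's `_load_data`, the same 3D layout applies. The following is a minimal NumPy-only sketch of that windowing; `make_windows` is a hypothetical helper, not part of Keras, and it assumes a single-column series.

```python
import numpy as np

def make_windows(series, n_prev=10):
    """Turn a 1-D series into (samples, n_prev, 1) inputs and (samples, 1) targets."""
    series = np.asarray(series, dtype="float32")
    n_samples = len(series) - n_prev
    if n_samples <= 0:
        raise ValueError("series must be longer than n_prev")
    # Each row holds n_prev consecutive values; the target is the value that follows.
    X = np.stack([series[i:i + n_prev] for i in range(n_samples)])
    y = series[n_prev:]
    # Add a trailing feature axis so X is (samples, timesteps, features).
    return X[..., np.newaxis], y.reshape(-1, 1)

series = np.arange(10320)          # stand-in for the question's single column
X, y = make_windows(series)
print(X.shape, y.shape)            # (10310, 10, 1) (10310, 1)
```

With data in this layout, the LSTM's `input_shape` would be `X.shape[1:]`, that is `(10, 1)`, rather than the length of the whole column.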
Shakti answered Nov 15 '22 02:11


Some more information: when using an RNN (such as an LSTM) with sequences of variable length, you have to take care of the format of your data.

When you group sequences in order to pass them to the fit method, Keras will try to build a matrix of samples. This means all input sequences must have the same length; otherwise the result is not an array with the correct dimensions.

There are several possible solutions:

  1. Train your network on samples one by one (using fit_generator, for example).
  2. Pad all your sequences so they have the same length.
  3. Group sequences by length (padding them if necessary) and train your network group by group (again using a generator-based fit).

The third solution is the most common strategy for variable-length sequences. If you pad sequences (second or third solution), you may want to add a masking layer as the first layer of the model.
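
As a rough illustration of the third option, here is a NumPy-only sketch that groups sequences by length and yields one batch per group. `batches_by_length` is a hypothetical helper, and it assumes univariate sequences with one scalar target each.

```python
from collections import defaultdict

import numpy as np

def batches_by_length(sequences, targets):
    """Yield (X, y) batches in which every sequence has the same length."""
    buckets = defaultdict(list)
    for seq, target in zip(sequences, targets):
        buckets[len(seq)].append((seq, target))
    for length in sorted(buckets):
        seqs, ys = zip(*buckets[length])
        # All sequences in a bucket share one length, so they stack cleanly.
        X = np.asarray(seqs, dtype="float32").reshape(len(seqs), length, 1)
        y = np.asarray(ys, dtype="float32").reshape(len(ys), 1)
        yield X, y

sequences = [[1, 2, 3], [4, 5], [6, 7, 8], [9, 10]]
targets = [6, 9, 21, 19]
shapes = [(X.shape, y.shape) for X, y in batches_by_length(sequences, targets)]
print(shapes)  # [((2, 2, 1), (2, 1)), ((2, 3, 1), (2, 1))]
```

Note that fit_generator expects a generator that loops over its data indefinitely, so in practice this would be wrapped in a `while True:` loop. The model's timestep dimension would also have to be left unspecified (`None`) to accept batches of different lengths.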

If you're not sure, print the shape of your data (using the shape attribute of the NumPy array).
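
A small sketch of that check, using only NumPy: `check_lstm_input` is a hypothetical helper that verifies the array is 3D before it reaches the model and returns the tuple you would pass as `input_shape`.

```python
import numpy as np

def check_lstm_input(X):
    """Return (timesteps, features) if X is 3-D, otherwise raise ValueError."""
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError(
            "expected (samples, timesteps, features), got shape %s" % (X.shape,)
        )
    return X.shape[1:]

flat = np.random.rand(100)           # a single column, as in the question
try:
    check_lstm_input(flat)
except ValueError as err:
    print(err)                       # reports the offending shape (100,)

input_shape = check_lstm_input(flat.reshape(100, 1, 1))
print(input_shape)                   # (1, 1)
```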

You may need to look at: https://keras.io/preprocessing/sequence/ (pad_sequences) and https://keras.io/layers/core/#masking
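
To show what padding produces, here is a NumPy-only sketch rather than a call into Keras. `pad_to_same_length` is a hypothetical helper that pads at the end of each sequence; Keras's pad_sequences pads at the start by default, so treat this only as an illustration of the resulting array layout.

```python
import numpy as np

def pad_to_same_length(sequences, pad_value=0.0):
    """Pad univariate sequences at the end and return a (samples, max_len, 1) array."""
    lengths = [len(seq) for seq in sequences]
    padded = np.full((len(sequences), max(lengths), 1), pad_value, dtype="float32")
    for i, seq in enumerate(sequences):
        padded[i, :len(seq), 0] = seq
    return padded, np.array(lengths)

padded, lengths = pad_to_same_length([[1, 2, 3], [4, 5], [6]])
print(padded.shape)       # (3, 3, 1)
print(padded[:, :, 0])    # rows: [1 2 3], [4 5 0], [6 0 0]
```

A masking layer with `mask_value=0` would then skip the padded timesteps. Keep in mind that it also skips any real timestep whose features all equal the mask value, so pick a pad value that does not occur in your data.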

Marwan Burelle answered Nov 15 '22 02:11