I am trying to train a Seq2Seq LSTM model with Keras in Python. I want to use TF-IDF vector representations of sentences as input to the model, but I am getting an error.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM
from keras import optimizers

X = ["Good morning", "Sweet Dreams", "Stay Awake"]
Y = ["Good morning", "Sweet Dreams", "Stay Awake"]
vectorizer = TfidfVectorizer()
vectorizer.fit(X)
vectorizer.transform(X)
vectorizer.transform(Y)
tfidf_vector_X = vectorizer.transform(X).toarray() #shape - (3,6)
tfidf_vector_Y = vectorizer.transform(Y).toarray() #shape - (3,6)
tfidf_vector_X = tfidf_vector_X[:, :, None] #shape - (3, 6, 1), since the LSTM expects input with ndim = 3
tfidf_vector_Y = tfidf_vector_Y[:, :, None] #shape - (3, 6, 1)
X_train, X_test, y_train, y_test = train_test_split(tfidf_vector_X, tfidf_vector_Y, test_size = 0.2, random_state = 1)
model = Sequential()
model.add(LSTM(output_dim = 6, input_shape = X_train.shape[1:], return_sequences = True, init = 'glorot_normal', inner_init = 'glorot_normal', activation = 'sigmoid'))
model.add(LSTM(output_dim = 6, input_shape = X_train.shape[1:], return_sequences = True, init = 'glorot_normal', inner_init = 'glorot_normal', activation = 'sigmoid'))
model.add(LSTM(output_dim = 6, input_shape = X_train.shape[1:], return_sequences = True, init = 'glorot_normal', inner_init = 'glorot_normal', activation = 'sigmoid'))
model.add(LSTM(output_dim = 6, input_shape = X_train.shape[1:], return_sequences = True, init = 'glorot_normal', inner_init = 'glorot_normal', activation = 'sigmoid'))
adam = optimizers.Adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999, epsilon = None, decay = 0.0, amsgrad = False)
model.compile(loss = 'cosine_proximity', optimizer = adam, metrics = ['accuracy'])
model.fit(X_train, y_train, nb_epoch = 100)
The above code throws:
Error when checking target: expected lstm_4 to have shape (6, 6) but got array with shape (6, 1)
Could someone tell me what's wrong and how to fix it?
Currently, your final LSTM layer returns sequences of dimension 6, so the model's output per sample has shape (6, 6), while your target sequences have shape (6, 1). You probably want the last layer to return a sequence of dimensionality 1 to match your target sequence, as in the code below. I'm not 100% sure here because I'm not experienced with seq2seq models, but at least the code runs this way. Maybe have a look at the seq2seq tutorial on the Keras blog.
Besides that, two small points: when using the Sequential API, you only need to specify an input_shape for the first layer of your model. Also, the output_dim argument of the LSTM layer is deprecated and should be replaced by the units argument:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from keras import Sequential
from keras.layers import LSTM
X = ["Good morning", "Sweet Dreams", "Stay Awake"]
Y = ["Good morning", "Sweet Dreams", "Stay Awake"]
vectorizer = TfidfVectorizer().fit(X)
tfidf_vector_X = vectorizer.transform(X).toarray() #shape - (3, 6)
tfidf_vector_Y = vectorizer.transform(Y).toarray() #shape - (3, 6)
tfidf_vector_X = tfidf_vector_X[:, :, None] #shape - (3, 6, 1)
tfidf_vector_Y = tfidf_vector_Y[:, :, None] #shape - (3, 6, 1)
X_train, X_test, y_train, y_test = train_test_split(tfidf_vector_X, tfidf_vector_Y, test_size = 0.2, random_state = 1)
model = Sequential()
model.add(LSTM(units=6, input_shape=X_train.shape[1:], return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=1, return_sequences=True, name='output'))
model.compile(loss='cosine_proximity', optimizer='sgd', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=1, verbose=1)
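To sanity-check the fix, you can compare the model's output shape with the target shape. This is just a minimal sketch, assuming the X_test and y_test arrays from the snippet above (with 3 samples and test_size = 0.2, the test split holds a single sample):

preds = model.predict(X_test) # final layer has units=1 and return_sequences=True
print(preds.shape)  # (1, 6, 1) - one test sample, 6 timesteps, 1 feature
print(y_test.shape) # (1, 6, 1) - matches, so the shape error is gone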