
Input 0 of layer lstm_5 is incompatible with the layer: expected ndim=3, found ndim=2

I am trying to create an image captioning model. Could you please help with this error? input1 is the image vector, input2 is the caption sequence. 32 is the caption length. I want to concatenate the image vector with the embedding of the sequence and then feed it to the decoder model.

    def define_model(vocab_size, max_length):
      input1 = Input(shape=(512,))
      input1 = tf.keras.layers.RepeatVector(32)(input1)

      input2 = Input(shape=(max_length,))
      e1 = Embedding(vocab_size, 512, mask_zero=True)(input2)

      dec1 = tf.concat([input1,e1], axis=2)

      dec2 = LSTM(512)(dec1)
      dec3 = LSTM(256)(dec2)
      dec4 = Dropout(0.2)(dec3)
      dec5 = Dense(256, activation="relu")(dec4)
      output = Dense(vocab_size, activation="softmax")(dec5)
      model = tf.keras.Model(inputs=[input1, input2], outputs=output)
      model.compile(loss="categorical_crossentropy", optimizer="adam")
      return model

ValueError: Input 0 of layer lstm_5 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 512]

Satashree Roy asked Jun 15 '20 19:06

1 Answer

This error occurs when an LSTM layer receives a 2D input instead of the 3D input it expects. For instance:

(64, 100)

The correct format is (n_samples, time_steps, features):

(64, 5, 100)

In this case, the mistake is that dec3, an LSTM layer, receives the output of dec2, which is also an LSTM layer. By default, the argument return_sequences in an LSTM layer is False, so the first LSTM returns only its final output as a 2D tensor of shape (n_samples, units), here (None, 512), which the next LSTM layer cannot accept. Setting return_sequences=True in the first LSTM layer makes it return the full 3D sequence (n_samples, time_steps, units), which resolves the error.
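The shape rule described above can be sketched in plain Python. This is only an illustration: `lstm_output_shape` is a hypothetical helper that mimics the shape bookkeeping of an LSTM layer, not a Keras function, and the input shape assumes max_length is 32 as stated in the question.

```python
# Hypothetical helper: mimics only the shape rule of an LSTM layer,
# not its actual computation. It is not part of Keras.
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Return the output shape an LSTM would produce for `input_shape`."""
    if len(input_shape) != 3:
        raise ValueError(
            f"expected ndim=3, found ndim={len(input_shape)}. "
            f"Full shape received: {list(input_shape)}"
        )
    batch, time_steps, _features = input_shape
    if return_sequences:
        # One output vector per time step: the time axis is kept.
        return (batch, time_steps, units)
    # Only the last output: the time axis is dropped.
    return (batch, units)


# Shape of dec1 in the question: 32 time steps, 512 + 512 concatenated features
# (assumes max_length == 32).
dec1_shape = (None, 32, 1024)

# Original stack: the first LSTM drops the time axis, so the second one fails.
dec2_shape = lstm_output_shape(dec1_shape, 512)
try:
    lstm_output_shape(dec2_shape, 256)
except ValueError as err:
    print("stacking fails:", err)

# Fixed stack: the first LSTM keeps the time axis.
dec2_fixed = lstm_output_shape(dec1_shape, 512, return_sequences=True)
dec3_fixed = lstm_output_shape(dec2_fixed, 256)
print(dec2_fixed, dec3_fixed)
```

The failing call reproduces the shape from the traceback, [None, 512], while the fixed stack ends in a 2D tensor only after the last LSTM, which is what the Dense layers expect.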

There was also an error in this line:

    model = tf.keras.Model(inputs=[input1, input2], outputs=output)

By that point input1 no longer referred to the Input layer, because you had reassigned it to the output of RepeatVector:

    input1 = Input(shape=(512,))
    input1 = tf.keras.layers.RepeatVector(32)(input1)

I renamed the second one e0, consistent with how you're naming your variables, so that input1 still refers to the Input layer when the model is built.

Now everything works. Note that the example below uses smaller layer sizes (128, 16, 32) than your code:

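    import tensorflow as tf
    from tensorflow.keras import Input
    from tensorflow.keras.layers import Concatenate, Dense, Dropout, Embedding, LSTM

    vocab_size, max_length = 1000, 32

    # Image feature vector, repeated once per caption time step.
    # Keep input1 bound to the Input layer; the repeated tensor gets its own name.
    input1 = Input(shape=(128,))
    e0 = tf.keras.layers.RepeatVector(max_length)(input1)

    # Caption token ids, embedded into the same feature size.
    input2 = Input(shape=(max_length,))
    e1 = Embedding(vocab_size, 128, mask_zero=True)(input2)

    # Concatenate along the last (feature) axis: (None, 32, 128 + 128).
    dec1 = Concatenate()([e0, e1])

    # The first LSTM must return the full sequence so the second LSTM gets 3D input.
    dec2 = LSTM(16, return_sequences=True)(dec1)
    dec3 = LSTM(16)(dec2)
    dec4 = Dropout(0.2)(dec3)
    dec5 = Dense(32, activation="relu")(dec4)
    output = Dense(vocab_size, activation="softmax")(dec5)
    model = tf.keras.Model(inputs=[input1, input2], outputs=output)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    model.summary()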

Model summary:

    Model: "model_2"
    Layer (type)                     Output Shape      Param #   Connected to
    input_24 (InputLayer)            [(None, 128)]     0
    input_25 (InputLayer)            [(None, 32)]      0
    repeat_vector_12 (RepeatVector)  (None, 32, 128)   0         input_24[0][0]
    embedding_11 (Embedding)         (None, 32, 128)   128000    input_25[0][0]
    concatenate_7 (Concatenate)      (None, 32, 256)   0         repeat_vector_12[0][0]
    lstm_12 (LSTM)                   (None, 32, 16)    17472     concatenate_7[0][0]
    lstm_13 (LSTM)                   (None, 16)        2112      lstm_12[0][0]
    dropout_2 (Dropout)              (None, 16)        0         lstm_13[0][0]
    dense_4 (Dense)                  (None, 32)        544       dropout_2[0][0]
    dense_5 (Dense)                  (None, 1000)      33000     dense_4[0][0]
    Total params: 181,128
    Trainable params: 181,128
    Non-trainable params: 0
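
As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below is plain Python and assumes the standard layer layouts with default settings (an LSTM with four gates and one bias vector per gate, Dense layers with a bias); the three helper functions are hypothetical names introduced here, not Keras APIs.

```python
def embedding_params(vocab_size, dim):
    # One dim-sized vector per vocabulary entry.
    return vocab_size * dim


def lstm_params(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit.
    return input_dim * units + units


counts = {
    "embedding_11": embedding_params(1000, 128),
    "lstm_12": lstm_params(256, 16),   # input is the 256-feature concatenation
    "lstm_13": lstm_params(16, 16),    # input is the 16-unit sequence from lstm_12
    "dense_4": dense_params(16, 32),
    "dense_5": dense_params(32, 1000),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n}")
print("total:", total)
```

The 256 input features seen by lstm_12 confirm that the repeated image vector and the embedding were concatenated along the feature axis rather than the time axis.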

Nicolas Gervais answered Nov 15 '22 20:11