I'm trying to use the following ConvLSTM2D architecture to estimate high-resolution image sequences from low-resolution ones:
import numpy as np, scipy.ndimage, matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, ConvLSTM2D, MaxPooling2D, UpSampling2D
from sklearn.metrics import accuracy_score, confusion_matrix, cohen_kappa_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler
np.random.seed(123)
raw = np.arange(96).reshape(8,3,4)
data1 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=1, mode='nearest') #low res
print (data1.shape)
#(8, 300, 400)
data2 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=3, mode='nearest') #high res
print (data2.shape)
#(8, 300, 400)
X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)
#(samples,time, rows, cols, channels)
model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
#samples, time, rows, cols, channels
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
print (model.summary())
model.compile(loss='mean_squared_error',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, Y_train,
          batch_size=1, epochs=10, verbose=1)
x,y = model.evaluate(X_train, Y_train, verbose=0)
print (x,y)
This declaration results in the following ValueError:
ValueError: Input 0 is incompatible with layer conv_lst_m2d_2: expected ndim=5, found ndim=4
How can I correct this ValueError? I think the problem is with the input shapes, but I could not figure out what exactly is wrong.
Note that the output should also be sequences of images, not a classification result.
This is happening because LSTMs require temporal data, but your first one was declared as a many-to-one model (return_sequences defaults to False), which outputs a tensor of shape (batch_size, 300, 400, 16). That is, batches of images rather than batches of image sequences:
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
You want the first layer's output to be a tensor of shape (batch_size, 8, 300, 400, 16) (i.e. sequences of images), so that it can be consumed by the second LSTM. The way to fix this is to set return_sequences=True in the first LSTM definition:
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
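To make the shape bookkeeping concrete, here is a minimal sketch in plain Python. The helper convlstm2d_output_shape is hypothetical (it is not part of Keras) and assumes channels_last data, padding='same' and a stride of 1, as in the layers above:

```python
def convlstm2d_output_shape(input_shape, filters, return_sequences=False):
    """Hypothetical helper (not a Keras function): output shape of a ConvLSTM2D layer.

    Assumes channels_last data, padding='same' and stride 1, so rows and
    cols are unchanged and only the channel axis becomes `filters`.
    """
    if len(input_shape) != 5:
        # Mirrors the kind of check that produced the error in the question.
        raise ValueError(f"expected ndim=5, found ndim={len(input_shape)}")
    batch, time, rows, cols, _channels = input_shape
    if return_sequences:
        return (batch, time, rows, cols, filters)
    return (batch, rows, cols, filters)


inp = (None, 8, 300, 400, 1)  # (batch, time, rows, cols, channels)

many_to_one = convlstm2d_output_shape(inp, 16)
many_to_many = convlstm2d_output_shape(inp, 16, return_sequences=True)
print(many_to_one)    # (None, 300, 400, 16): 4-D, cannot feed another ConvLSTM2D
print(many_to_many)   # (None, 8, 300, 400, 16): 5-D, can feed another ConvLSTM2D

second = convlstm2d_output_shape(many_to_many, 8)
print(second)         # (None, 300, 400, 8)
```

Passing many_to_one to the helper a second time raises a ValueError, which is the same situation as stacking the second layer without return_sequences=True.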
You mentioned classification. If what you intend is to classify entire sequences, then you need a classifier at the end:
from keras.layers import GlobalAveragePooling2D  # not among the question's imports
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
model.add(GlobalAveragePooling2D())
model.add(Dense(10, activation='softmax')) # output shape: (None, 10)
But if you are trying to classify each image within the sequences, then you can simply reapply the classifier using TimeDistributed:
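from keras.models import Model
from keras.layers import Input, Dense, ConvLSTM2D, GlobalAveragePooling2D, TimeDistributed
# per-image classifier: (300, 400, 8) feature maps -> 10 class probabilities
x = Input(shape=(300, 400, 8))
y = GlobalAveragePooling2D()(x)
y = Dense(10, activation='softmax')(y)
classifier = Model(inputs=x, outputs=y)
# sequence model: both ConvLSTM2D layers return full sequences
x = Input(shape=(data1.shape[0], data1.shape[1], data1.shape[2], 1))
y = ConvLSTM2D(16, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(x)
y = ConvLSTM2D(8, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(y)
y = TimeDistributed(classifier)(y) # output shape: (None, 8, 10)
model = Model(inputs=x, outputs=y)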
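Conceptually, TimeDistributed applies the same classifier, with the same weights, to every time step and stacks the results along the time axis. The NumPy sketch below only illustrates that idea and is not how Keras implements it; the small array sizes, the random weights W and b, and the helper names are all assumptions made for the example:

```python
import numpy as np


def global_average_pool(frames):
    # (batch, rows, cols, channels) -> (batch, channels)
    return frames.mean(axis=(-3, -2))


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
# Small stand-in for the second ConvLSTM2D output: (batch, time, rows, cols, channels)
features = rng.random((2, 8, 30, 40, 8))
W = rng.normal(size=(8, 10))  # hypothetical dense weights, shared by all time steps
b = np.zeros(10)


def classifier(frame_batch):
    # (batch, rows, cols, channels) -> (batch, 10) class probabilities
    return softmax(global_average_pool(frame_batch) @ W + b)


# Apply the same classifier to each time step, then stack along the time axis.
per_step = [classifier(features[:, t]) for t in range(features.shape[1])]
out = np.stack(per_step, axis=1)
print(out.shape)  # (2, 8, 10)
```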
Finally, take a look at the examples in the Keras repository. There's one for a generative model using ConvLSTM2D.
Edit: to estimate data2 from data1...
If I got it right this time, X_train should be 1 sample of a stack of 8 (300, 400, 1) images, not 8 samples of a stack of 1 image of shape (300, 400, 1). If that's true, then:
X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)
should be updated to:
X_train = data1.reshape(1, data1.shape[0], data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(1, data2.shape[0], data2.shape[1], data2.shape[2], 1)
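The difference between the two layouts is easy to check with NumPy alone. This sketch uses the small raw array from the question as a stand-in for data1 (an assumption made to keep it fast; the axis logic is identical for the 300x400 frames):

```python
import numpy as np

raw = np.arange(96).reshape(8, 3, 4)  # 8 frames of 3x4, stand-in for data1

# Original layout: 8 samples, each a sequence of length 1
as_samples = raw.reshape(raw.shape[0], 1, raw.shape[1], raw.shape[2], 1)
# Updated layout: 1 sample, a sequence of 8 frames
as_sequence = raw.reshape(1, raw.shape[0], raw.shape[1], raw.shape[2], 1)

print(as_samples.shape)   # (8, 1, 3, 4, 1)
print(as_sequence.shape)  # (1, 8, 3, 4, 1)

# The reshape only relabels the axes: frame t of the single sequence is raw[t].
frames_preserved = all(
    np.array_equal(as_sequence[0, t, :, :, 0], raw[t]) for t in range(raw.shape[0])
)
print(frames_preserved)   # True
```

With the first layout the recurrent layer sees 8 independent one-frame sequences, so it has no temporal context to carry between frames.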
Also, accuracy doesn't usually make sense as a metric when your loss is mse, since this is a regression problem. You can use other metrics such as mae.
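As a rough illustration of why, the sketch below compares mse and mae with a naive exact-match score on made-up continuous values. The numbers are invented for the example, and the exact-match score is only a stand-in for the idea of accuracy, not the formula Keras uses for its accuracy metric:

```python
import numpy as np

# Made-up pixel intensities: every prediction is off by exactly 0.5
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([10.5, 19.5, 30.5, 39.5])

mse = np.mean((y_pred - y_true) ** 2)
mae = np.mean(np.abs(y_pred - y_true))
exact_match = np.mean(y_pred == y_true)  # naive "accuracy" on continuous values

print(mse)          # 0.25
print(mae)          # 0.5
print(exact_match)  # 0.0
```

The predictions are close, and mae reports that in the same units as the data, while the match-based score is zero. In Keras, mae can be requested with metrics=['mae'] in model.compile.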
Now you just need to update your model to return sequences and to have a single unit in the last layer (because the images you are trying to estimate have a single channel):
model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
model.add(ConvLSTM2D(16, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(1, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     return_sequences=True))
model.compile(loss='mse', optimizer='adam')
After that, model.fit(X_train, Y_train, ...) will start training normally:
Using TensorFlow backend.
(8, 300, 400)
(8, 300, 400)
Epoch 1/10
1/1 [==============================] - 5s 5s/step - loss: 2993.8701
Epoch 2/10
1/1 [==============================] - 5s 5s/step - loss: 2992.4492
Epoch 3/10
1/1 [==============================] - 5s 5s/step - loss: 2991.4536
Epoch 4/10
1/1 [==============================] - 5s 5s/step - loss: 2989.8523