
Keras shape mismatch when using UpSampling

I'm trying to run this convolutional autoencoder sample with my own data, so I modified its Input layer according to my images. However, there is a problem with the dimensions at the output layer. I'm sure the problem is with UpSampling, but I'm not sure why it is happening. Here is the code:

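from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.callbacks import TensorBoard

N, H, W = X_train.shape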
input_img = Input(shape=(H,W,1))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.summary()

[Image: model summary for my images of 150x81]

Then, when I run fit, it throws this error:

i+=1
autoencoder.fit(x_train, x_train,
            epochs=50,
            batch_size=128,
            shuffle=True,
            validation_data=(x_test, x_test),
            callbacks= [TensorBoard(log_dir='/tmp/autoencoder/{}'.format(i))])

ValueError: Error when checking target: expected conv2d_23 to have shape (148, 84, 1) but got array with shape (150, 81, 1)

I went back to the tutorial code and looked at its model summary, which shows the following:

[Image: model summary for images of 28x28]

I'm sure there is a problem when reconstructing the output in the decoder, but I'm not sure why. Why does it work for 28x28 images but not for mine of 150x81?

I guess I can solve this by slightly changing my images' dimensions, but I'd like to understand what is happening and how I can avoid it.

Rodrigo Laguna asked Nov 08 '22 06:11


2 Answers

This is a typical problem you face when dealing with decoders or some form of upsampling.

What you might not be aware of is that upsampling or deconvolution does not simply undo pooling: the resulting height and width can differ from the dimensions you would expect.
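The size rules can be checked with a few lines of plain Python before building the model. This is a rough sketch that mirrors the layer sequence from the question, not Keras code. The helper names are made up for illustration, and it assumes that MaxPooling2D with padding='same' yields ceil(n / 2), that UpSampling2D((2, 2)) doubles the size, and that a 3x3 Conv2D without padding removes 2 pixels:

```python
import math


def pool_same(n, pool=2):
    """Size after MaxPooling2D(pool, padding='same'): odd sizes round up."""
    return math.ceil(n / pool)


def upsample(n, factor=2):
    """Size after UpSampling2D(factor): always an exact multiple."""
    return n * factor


def conv(n, kernel=3, padding="same"):
    """Size after a stride-1 Conv2D: unchanged for 'same', shrinks for 'valid'."""
    return n if padding == "same" else n - kernel + 1


def autoencoder_output_size(n):
    """Trace one spatial dimension through the question's encoder and decoder."""
    # Encoder: three (Conv2D 'same' -> MaxPooling2D 'same') stages.
    for _ in range(3):
        n = pool_same(conv(n))
    # Decoder: the third Conv2D has no padding='same' in the question's code.
    n = upsample(conv(n))
    n = upsample(conv(n))
    n = upsample(conv(n, padding="valid"))
    return conv(n)  # final Conv2D with padding='same'


print(autoencoder_output_size(28), autoencoder_output_size(28))   # 28 28
print(autoencoder_output_size(150), autoencoder_output_size(81))  # 148 84
```

For 28 the trace returns 28 again, but 150 and 81 come out as 148 and 84, which is the shape reported in the error message.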

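To be more precise:
In your case the model outputs shape (148, 84, 1), while your target images have shape (150, 81, 1). The width is too large because MaxPooling2D with padding='same' rounds odd sizes up (81 -> 41 -> 21 -> 11) and each upsampling then doubles the result. The height is too small because the decoder's Conv2D(16, (3, 3)) has no padding='same', so it removes 2 pixels from each dimension before the last upsampling. The tutorial relies on this to turn 16x16 into 14x14 and then 28x28, but it does not carry over to other image sizes.

Cropping can only make the output smaller, so first add padding='same' to that Conv2D layer, which makes the output (152, 88, 1). Then crop it with a cropping layer after the last upsampling:

tf.keras.layers.Cropping2D(cropping=((top_crop, bottom_crop), (left_crop, right_crop)))

In your case, for example:

tf.keras.layers.Cropping2D(cropping=((1, 1), (3, 4)))
# or
tf.keras.layers.Cropping2D(cropping=((0, 2), (7, 0)))
# or
tf.keras.layers.Cropping2D(cropping=((2, 0), (0, 7)))

This crops the output from (152, 88, 1) to the expected shape (150, 81, 1).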
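The cropping amounts do not have to be worked out by hand. The following is a minimal sketch in plain Python, assuming the decoder output is at least as large as the target in both dimensions, for example (152, 88) when every Conv2D uses padding='same'. `cropping_for` is a hypothetical helper, not part of Keras:

```python
def cropping_for(output_hw, target_hw):
    """Return ((top, bottom), (left, right)) that trims output_hw to target_hw.

    Hypothetical helper: the surplus is split as evenly as possible, with any
    odd pixel removed from the bottom/right side.
    """
    crops = []
    for out, target in zip(output_hw, target_hw):
        surplus = out - target
        if surplus < 0:
            # Cropping can only shrink a tensor, never grow it.
            raise ValueError(f"output size {out} is smaller than target {target}")
        crops.append((surplus // 2, surplus - surplus // 2))
    return tuple(crops)


print(cropping_for((152, 88), (150, 81)))  # ((1, 1), (3, 4))
```

The returned tuple has the ((top_crop, bottom_crop), (left_crop, right_crop)) layout that the `cropping` argument of Cropping2D accepts. Calling it with an output of (148, 84) raises an error, because the height would have to grow.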

For more details please refer to: tf.keras.layers.Cropping2D

Anel Music answered Nov 14 '22 20:11


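You can use ZeroPadding2D to pad the input image to 32x32, then use Cropping2D to crop the decoded image back to 28x28. Because 32 is divisible by 8, the three 2x2 pooling layers halve it without rounding and the three upsampling layers restore it exactly.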

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, ZeroPadding2D, Cropping2D
from keras.models import Model


input_img = Input(shape=(28,28,1))  # adapt this if using `channels_first` image data format
input_img_padding = ZeroPadding2D((2,2))(input_img)  #zero padding image to shape 32X32
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img_padding)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoded_cropping = Cropping2D((2,2))(decoded)

autoencoder = Model(input_img, decoded_cropping) #cropping image from 32X32 to 28X28
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.summary()

[Image: model summary]
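For other image sizes the padding can be computed instead of hard-coded. This is a small sketch in plain Python, assuming three 2x2 pooling layers as above, so that each spatial size must be a multiple of 8. `pad_to_multiple` is a hypothetical helper, not a Keras function:

```python
def pad_to_multiple(size, multiple=8):
    """Return (before, after) zero padding that lifts size to a multiple.

    Hypothetical helper: the padding is split as evenly as possible, with any
    odd pixel added on the after (bottom/right) side.
    """
    total = -size % multiple  # pixels missing up to the next multiple
    return (total // 2, total - total // 2)


print(pad_to_multiple(28))  # (2, 2), i.e. 28 -> 32

# The 150x81 images from the question would be padded to 152x88.
height, width = 150, 81
padding = (pad_to_multiple(height), pad_to_multiple(width))
print(padding)  # ((1, 1), (3, 4))
```

Both ZeroPadding2D and Cropping2D accept a ((top, bottom), (left, right)) tuple, so the same value can be passed as `padding` on the input and as `cropping` on the decoded output.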

dassh answered Nov 14 '22 22:11