I want to reuse a convolutional autoencoder (built for the MNIST dataset with 10 digits/categories) from https://blog.keras.io/building-autoencoders-in-keras.html and put it into a modified version where images are loaded from directories with ImageDataGenerator. My data has only two classes; maybe that is the problem, but I do not know how to solve it...
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
root_dir = '/opt/data/pets'
epochs = 10 # few epochs for testing
batch_size = 32 # No. of images to be yielded from the generator per batch.
seed = 4321 # constant seed for constant conditions
img_channel = 1 # only grayscale image: 1x8bit
img_x, img_y = 128, 128 # image x- and y-dimensions
input_img = Input(shape = (img_x, img_y, img_channel)) # keras image input type
# this is the augmentation configuration we will use for training: do only flips
train_datagen = ImageDataGenerator(
        rescale=1./255,
        horizontal_flip=True)
# this is the augmentation configuration we will use for testing: only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
        root_dir + '/train',  # this is the target directory
        target_size=(img_x, img_y),  # all images will be resized
        batch_size=batch_size,
        color_mode='grayscale',
        seed = seed)
# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
        root_dir + '/validate',
        target_size=(img_x, img_y),
        batch_size=batch_size,
        color_mode='grayscale',
        seed = seed)
# create Convolutional autoencoder from https://blog.keras.io/building-autoencoders-in-keras.html
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu',padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.summary() # show model data
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder_train = autoencoder.fit_generator(
        train_generator,
        validation_data=validation_generator,
        epochs=epochs,
        shuffle=True)
The error is: expected conv2d_121 to have 4 dimensions, but got array with shape (32, 2).
I do not understand the problem. Other people with similar errors have CNNs with very few outputs in the last layer, which must match the number of classes, but mine does not.
Here is the output of the model summary and the error:
Found 3784 images belonging to 2 classes.
Found 1074 images belonging to 2 classes.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_24 (InputLayer)        (None, 128, 128, 1)       0         
_________________________________________________________________
conv2d_115 (Conv2D)          (None, 128, 128, 16)      160       
_________________________________________________________________
max_pooling2d_55 (MaxPooling (None, 64, 64, 16)        0         
_________________________________________________________________
conv2d_116 (Conv2D)          (None, 64, 64, 8)         1160      
_________________________________________________________________
max_pooling2d_56 (MaxPooling (None, 32, 32, 8)         0         
_________________________________________________________________
conv2d_117 (Conv2D)          (None, 32, 32, 8)         584       
_________________________________________________________________
max_pooling2d_57 (MaxPooling (None, 16, 16, 8)         0         
_________________________________________________________________
conv2d_118 (Conv2D)          (None, 16, 16, 8)         584       
_________________________________________________________________
up_sampling2d_46 (UpSampling (None, 32, 32, 8)         0         
_________________________________________________________________
conv2d_119 (Conv2D)          (None, 32, 32, 8)         584       
_________________________________________________________________
up_sampling2d_47 (UpSampling (None, 64, 64, 8)         0         
_________________________________________________________________
conv2d_120 (Conv2D)          (None, 64, 64, 16)        1168      
_________________________________________________________________
up_sampling2d_48 (UpSampling (None, 128, 128, 16)      0         
_________________________________________________________________
conv2d_121 (Conv2D)          (None, 128, 128, 1)       145       
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
Traceback (most recent call last):
.....
  File "/opt/anaconda/lib/python3.6/site-packages/keras/engine/training_utils.py", line 126, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected conv2d_121 to have 4 dimensions, but got array with shape (32, 2)
The error is not about the model but about the image generator. By default, flow_from_directory is set up for classification: it derives a class label for each image from its directory, so the target batch has shape (32, 2), that is, one one-hot label per image, while the model expects an actual image as the target.
class_mode: One of "categorical", "binary", "sparse", "input", or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels.
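To make the shape mismatch concrete, here is a small pure-Python illustration (not Keras code). It builds one-hot labels for a hypothetical batch of 32 images from 2 class folders and compares their shape with the autoencoder's output shape from the summary above. The helper names `one_hot` and `shape_of` and the alternating class indices are made up for this sketch.

```python
def one_hot(class_index, num_classes):
    """Return a one-hot list such as [0.0, 1.0] for class 1 of 2."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


batch_size, num_classes = 32, 2

# Hypothetical class indices for one batch (images alternating between 2 folders).
class_indices = [i % num_classes for i in range(batch_size)]

# This mimics what class_mode="categorical" hands to the model as the target.
labels = [one_hot(c, num_classes) for c in class_indices]
label_shape = shape_of(labels)

# Output shape of conv2d_121 for one batch, taken from the model summary.
model_output_shape = (batch_size, 128, 128, 1)

print("target from generator:", label_shape, "->", len(label_shape), "dimensions")
print("model output:", model_output_shape, "->", len(model_output_shape), "dimensions")
```

The target has 2 dimensions while the last layer produces 4, which is exactly what the ValueError reports.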
So set class_mode="input" in both flow_from_directory calls so that the generator returns the input image itself as the target. See the documentation for more details.
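As a rough sketch of that change, the two generator calls from the question could be wrapped in one helper so that the train and validation generators are configured the same way. The helper name `make_autoencoder_generator` is made up for this example; `datagen` is assumed to be one of the ImageDataGenerator objects from the question, and the defaults mirror the question's settings.

```python
def make_autoencoder_generator(datagen, directory, img_size=(128, 128),
                               batch_size=32, seed=4321):
    """Build a directory generator whose targets are the input images.

    `datagen` is assumed to be a keras ImageDataGenerator, as in the question.
    With class_mode='input' each batch is (images, images) instead of
    (images, one-hot labels), which is what an autoencoder needs.
    """
    return datagen.flow_from_directory(
        directory,
        target_size=img_size,    # all images are resized to this size
        batch_size=batch_size,
        color_mode='grayscale',
        class_mode='input',      # the only change compared to the question
        seed=seed)
```

With the objects from the question this would be called as train_generator = make_autoencoder_generator(train_datagen, root_dir + '/train') and validation_generator = make_autoencoder_generator(test_datagen, root_dir + '/validate'). The fit_generator call can then stay as it is.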