I am implementing SegNet in Keras (Python). The code is below.
from keras import models
from keras.layers import (Activation, BatchNormalization, Conv2D,
                          MaxPooling2D, Permute, Reshape, UpSampling2D)

img_w = 480
img_h = 360
pool_size = 2

def build_model(img_w, img_h, pool_size):
    n_labels = 12
    kernel = 3

    # Encoder: VGG-style convolution blocks, each followed by max pooling.
    encoding_layers = [
        Conv2D(64, (kernel, kernel), input_shape=(img_h, img_w, 3), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Conv2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),
    ]

    autoencoder = models.Sequential()
    for l in encoding_layers:
        autoencoder.add(l)

    # Decoder: upsampling followed by convolution blocks, mirroring the encoder.
    decoding_layers = [
        UpSampling2D(),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Conv2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Conv2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Conv2D(n_labels, (1, 1), padding='valid', activation='sigmoid'),
        BatchNormalization(),
    ]

    for l in decoding_layers:
        autoencoder.add(l)

    # Flatten the spatial dimensions and apply a per-pixel softmax.
    autoencoder.add(Reshape((n_labels, img_h * img_w)))
    autoencoder.add(Permute((2, 1)))
    autoencoder.add(Activation('softmax'))
    return autoencoder

model = build_model(img_w, img_h, pool_size)
But it returns an error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-051f06a53a14> in <module>()
----> 1 model = build_model(img_w, img_h, pool_size)
<ipython-input-20-c37fd94c8641> in build_model(img_w, img_h, pool_size)
119 autoencoder.add(l)
120
--> 121 autoencoder.add(Reshape((n_labels, img_h * img_w)))
122 autoencoder.add(Permute((2, 1)))
123 autoencoder.add(Activation('softmax'))
ValueError: total size of new array must be unchanged
I can't see any reason for the error. When I set img_w and img_h to 256, the error goes away, but the problem is that 256 × 256 is not the image size of the original dataset, so I can't use that. How do I resolve this?
The problem is that you are performing (2, 2) downsampling 5 times, so let's track the shape:

(360, 480) -> (180, 240) -> (90, 120) -> (45, 60) -> (22, 30) -> (11, 15)

Note the (45, 60) -> (22, 30) step: MaxPooling2D floors odd dimensions. And now upsampling:

(11, 15) -> (22, 30) -> (44, 60) -> (88, 120) -> (176, 240) -> (352, 480)

So when you try to reshape the output using the original shape, the error is raised: the decoder ends at (352, 480) rather than (360, 480), and Reshape requires the total number of elements to stay unchanged.
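You can check this arithmetic without building the model at all; a minimal sketch in plain Python:

h, w = 360, 480
for _ in range(5):      # five (2, 2) max-pooling steps: integer (floor) division
    h, w = h // 2, w // 2
print((h, w))           # (11, 15)
for _ in range(5):      # five (2, 2) upsampling steps: plain doubling
    h, w = h * 2, w * 2
print((h, w))           # (352, 480) -- the rows floored away at (45, 60) never come back

The row lost at the (45, 60) -> (22, 30) pooling step is exactly what plain doubling cannot recover.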
Possible solutions:

1. Resize your images so that both input dimensions are divisible by 32 (= 2^5, one factor of 2 per pooling step), e.g. (352, 480) or (384, 480).
2. Add ZeroPadding2D(((1, 0), (0, 0))) after the 2nd upsampling to change the shape from (44, 60) to (45, 60), which will make your network finish with the expected output shape (a sketch follows this list).
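A runnable sketch of the second option, tracking only the spatial dimensions (the Conv2D/BatchNormalization/Activation layers are omitted because, with padding='same', they do not change the spatial size); in the full model, the ZeroPadding2D goes right after the decoder's second UpSampling2D:

from keras import models
from keras.layers import UpSampling2D, ZeroPadding2D

m = models.Sequential()
m.add(UpSampling2D(input_shape=(11, 15, 512)))  # (11, 15)   -> (22, 30)
m.add(UpSampling2D())                           # (22, 30)   -> (44, 60)
m.add(ZeroPadding2D(((1, 0), (0, 0))))          # (44, 60)   -> (45, 60): one extra row
m.add(UpSampling2D())                           # (45, 60)   -> (90, 120)
m.add(UpSampling2D())                           # (90, 120)  -> (180, 240)
m.add(UpSampling2D())                           # (180, 240) -> (360, 480)
print(m.output_shape)                           # (None, 360, 480, 512)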
Other issues:

Note that the last MaxPooling2D of your encoder is immediately followed by the first UpSampling2D of your decoder. This is a useless bottleneck: the pooling discards three quarters of the activations and the upsampling merely repeats what is left, with no layers in between to exploit the reduced resolution (see the demonstration below).
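A small demonstration of why the back-to-back pair is wasteful (a sketch, assuming a TensorFlow backend):

import numpy as np
from keras import models
from keras.layers import MaxPooling2D, UpSampling2D

# MaxPooling2D immediately followed by UpSampling2D restores the spatial
# size, but every 2x2 block now holds four copies of its former maximum;
# the other three original values are gone for good.
m = models.Sequential()
m.add(MaxPooling2D(pool_size=(2, 2), input_shape=(4, 4, 1)))
m.add(UpSampling2D())

x = np.arange(16, dtype='float32').reshape(1, 4, 4, 1)
print(m.predict(x)[0, :, :, 0])
# [[ 5.  5.  7.  7.]
#  [ 5.  5.  7.  7.]
#  [13. 13. 15. 15.]
#  [13. 13. 15. 15.]]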