I am building an autoencoder to compress images. My input is the MNIST dataset, which contains (28, 28, 1) images, and I want my latent space (the encoded image) to have the shape (10, 10, 1) to get a high compression ratio. I don't have any problem in the encoder part, but in the decoder part I can't return the image to its original shape (28, 28, 1).
My code:
from tensorflow import keras
from tensorflow.keras import layers

# Encoder
input_img = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
# 3x3 pooling with 'same' padding: 28x28 -> 10x10 (ceil(28 / 3) = 10)
x = layers.MaxPooling2D((3, 3), padding='same')(x)
x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = layers.Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = layers.Conv2D(512, (3, 3), activation='relu', padding='same')(x)
encoded = layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x)
Encoded shape: (10, 10, 1)
# Decoder
x = layers.Conv2D(512, (3, 3), activation='relu', padding='same')(encoded)
x = layers.Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
# 2x upsampling: 10x10 -> 20x20, which falls short of 28x28
x = layers.UpSampling2D((2, 2), interpolation="bilinear")(x)
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
decoded = layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x)
Decoded shape: (20, 20, 1)
How can I return the image to its original shape?
There are multiple ways to upscale a 2D tensor, or alternatively, to project a smaller vector into a larger one.
Here's a non-exhaustive list:
I can see that you have already attempted the UpSampling + Conv direction. What you want to do next is apply a Flatten layer, followed by a Dense projection layer with 784 output units (28 × 28 × 1 = 784), and then reshape the result into (batch, 28, 28, 1) to get the shape you need.
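As a rough illustration of the Flatten, projection, and reshape steps, here is a minimal sketch. It reuses the pooling setup from the question but trims the encoder to two convolutions for brevity. The function name `build_autoencoder` is made up for this example, and the sigmoid on the projection layer is an assumption that the MNIST pixels have been scaled to [0, 1]; neither comes from the original code.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_autoencoder():
    """Conv encoder (shortened) plus a Flatten/Dense/Reshape decoder."""
    input_img = keras.Input(shape=(28, 28, 1))

    # Encoder: 3x3 pooling with 'same' padding maps 28x28 to 10x10.
    x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(input_img)
    x = layers.MaxPooling2D((3, 3), padding="same")(x)
    # Single-channel latent image of shape (10, 10, 1).
    encoded = layers.Conv2D(1, (3, 3), activation="relu", padding="same")(x)

    # Decoder: flatten the 10 * 10 * 1 = 100 latent values ...
    x = layers.Flatten()(encoded)
    # ... project them to 28 * 28 * 1 = 784 units.
    # Assumption: inputs are scaled to [0, 1], hence the sigmoid output.
    x = layers.Dense(28 * 28 * 1, activation="sigmoid")(x)
    # ... and reshape back to the image layout (batch dimension is implicit).
    decoded = layers.Reshape((28, 28, 1))(x)

    return keras.Model(input_img, decoded)


autoencoder = build_autoencoder()
print(autoencoder.output_shape)
```

The printed output shape should be (None, 28, 28, 1), where None stands for the batch dimension. Note that the dense projection discards the spatial structure of the latent image, so this trades some of the convolutional decoder's inductive bias for an exact output size.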