I noticed in a number of places that people use something like this, usually in fully convolutional networks, autoencoders, and similar:
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2DTranspose(kernel_size=k, padding='same', strides=(1,1)))
I am wondering what is the difference between that and simply:
model.add(Conv2DTranspose(kernel_size=k, padding='same', strides=(2,2)))
Links towards any papers that explain this difference are welcome.
Two common types of layers that can be used in the generator model are an upsampling layer (UpSampling2D), which simply doubles the spatial dimensions of the input, and the transposed convolutional layer (Conv2DTranspose), which performs a learned upsampling that is often loosely described as an inverse convolution operation.
It is also known as an upsampled convolution, a name that reflects the task it performs, i.e., upsampling the input feature map. It is also referred to as a fractionally strided convolution, since a stride over the output is equivalent to a fractional stride over the input.
Its role is to bring the resolution back up to that of a previous layer. Theoretically, we could eliminate the down/up sampling layers altogether; however, to reduce the number of computations, we can downsample the input before a layer and then upsample its output.
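As a minimal sketch of this (assuming tensorflow.keras and an arbitrary 8x8 feature map with 16 channels, both hypothetical choices), the two layers below double the spatial resolution, but only Conv2DTranspose carries trainable weights:
import numpy as np
from tensorflow.keras.layers import Conv2DTranspose, UpSampling2D

x = np.random.rand(1, 8, 8, 16).astype("float32")    # one 8x8 feature map with 16 channels

up = UpSampling2D(size=(2, 2))                        # no weights, simply repeats pixels
tconv = Conv2DTranspose(filters=16, kernel_size=3,
                        strides=(2, 2), padding="same")  # learned upsampling

print(up(x).shape)       # (1, 16, 16, 16)
print(tconv(x).shape)    # (1, 16, 16, 16)
print(len(up.weights), len(tconv.weights))            # 0 vs 2 (kernel + bias)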
A strided convolution is another basic building block used in convolutional neural networks. Say we want to convolve a 7x7 image with a 3x3 filter, except that instead of doing it the usual way, we do it with a stride of 2: the filter jumps two pixels at a time, so the output is smaller than the input.
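To check the arithmetic of that example (a sketch using tensorflow.keras and a random 7x7 input), a stride-2 convolution with 'valid' padding shrinks the 7x7 input to 3x3, following floor((7 - 3) / 2) + 1 = 3:
import numpy as np
from tensorflow.keras.layers import Conv2D

image = np.random.rand(1, 7, 7, 1).astype("float32")  # one 7x7 single-channel image
conv = Conv2D(filters=1, kernel_size=3, strides=(2, 2), padding="valid")
print(conv(image).shape)                               # (1, 3, 3, 1)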
Here and here you can find a really nice explanation of how transposed convolutions work. To sum up both of these approaches:
In your first approach, you are first upsampling your feature map:
[[1, 2], [3, 4]] -> [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
and then you apply a classical convolution (as Conv2DTranspose with stride=1 and padding='same' is equivalent to Conv2D).
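Here is a sketch of this first approach on the tiny [[1, 2], [3, 4]] map (assuming tensorflow.keras): UpSampling2D repeats each value into a 2x2 block, and the stride-1, padding='same' Conv2DTranspose then behaves like an ordinary convolution over the enlarged map:
import numpy as np
from tensorflow.keras.layers import Conv2DTranspose, UpSampling2D

fmap = np.array([[1, 2], [3, 4]], dtype="float32").reshape(1, 2, 2, 1)

up = UpSampling2D(size=(2, 2))(fmap)                  # nearest-neighbour repetition
print(up.numpy()[0, :, :, 0])
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]

out = Conv2DTranspose(filters=1, kernel_size=3, strides=(1, 1), padding="same")(up)
print(out.shape)  # (1, 4, 4, 1) -- resolution fixed by the upsampling, weights are learned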
In your second approach you are first un(max)pooling your feature map:
[[1, 2], [3, 4]] -> [[1, 0, 2, 0], [0, 0, 0, 0], [3, 0, 4, 0], [0, 0, 0, 0]]
and then apply a classical convolution with the given filter_size, filters, etc.
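The zero-insertion picture can be illustrated as follows (a sketch with numpy and tensorflow.keras; Conv2DTranspose performs the equivalent spreading implicitly rather than materialising the zero-filled map):
import numpy as np
from tensorflow.keras.layers import Conv2DTranspose

fmap = np.array([[1, 2], [3, 4]], dtype="float32")

unpooled = np.zeros((4, 4), dtype="float32")          # spread the values onto a grid of zeros
unpooled[::2, ::2] = fmap
print(unpooled)
# [[1. 0. 2. 0.]
#  [0. 0. 0. 0.]
#  [3. 0. 4. 0.]
#  [0. 0. 0. 0.]]

out = Conv2DTranspose(filters=1, kernel_size=3, strides=(2, 2),
                      padding="same")(fmap.reshape(1, 2, 2, 1))
print(out.shape)  # (1, 4, 4, 1) -- the single-layer form from the question's second snippet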
A fun fact is that, although these approaches are different, they do share something in common. A transposed convolution is meant to approximate the gradient of a convolution, so the first approach approximates the gradient of sum pooling, whereas the second approximates the gradient of max pooling. This makes the first approach produce slightly smoother results.
Other reasons why you might see the first approach are:
- Conv2DTranspose (and its equivalents) are relatively new in keras, so for a while the only way to perform learnable upsampling was to combine UpSampling2D with a convolution,
- the author of keras, Francois Chollet, used this approach in one of his tutorials,
- in the past, the keras equivalents of transposed convolution appeared to be problematic due to some API inconsistencies.