I didn't find a clearly answer to this question online (sorry if it exists). I would like to understand the differences between the two functions (SeparableConv2D and Conv2D), step by step with, for example a input dataset of (3,3,3) (as RGB image).
Running this script based on Keras-Tensorflow :
import numpy as np
from keras.layers import Conv2D, SeparableConv2D
from keras.models import Model
from keras.layers import Input
red = np.array([1]*9).reshape((3,3))
green = np.array([100]*9).reshape((3,3))
blue = np.array([10000]*9).reshape((3,3))
img = np.stack([red, green, blue], axis=-1)
img = np.expand_dims(img, axis=0)
inputs = Input((3,3,3))
conv1 = SeparableConv2D(filters=1,
strides=1,
padding='valid',
activation='relu',
kernel_size=2,
depth_multiplier=1,
depthwise_initializer='ones',
pointwise_initializer='ones',
bias_initializer='zeros')(inputs)
conv2 = Conv2D(filters=1,
strides=1,
padding='valid',
activation='relu',
kernel_size=2,
kernel_initializer='ones',
bias_initializer='zeros')(inputs)
model1 = Model(inputs,conv1)
model2 = Model(inputs,conv2)
print("Model 1 prediction: ")
print(model1.predict(img))
print("Model 2 prediction: ")
print(model2.predict(img))
print("Model 1 summary: ")
model1.summary()
print("Model 2 summary: ")
model2.summary()
I have the following output :
Model 1 prediction:
[[[[40404.]
[40404.]]
[[40404.]
[40404.]]]]
Model 2 prediction:
[[[[40404.]
[40404.]]
[[40404.]
[40404.]]]]
Model 1 summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 3, 3, 3) 0
_________________________________________________________________
separable_conv2d_1 (Separabl (None, 2, 2, 1) 16
=================================================================
Total params: 16
Trainable params: 16
Non-trainable params: 0
_________________________________________________________________
Model 2 summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 3, 3, 3) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 2, 2, 1) 13
=================================================================
Total params: 13
Trainable params: 13
Non-trainable params: 0
I understand how Keras compute the Conv2D prediction of model 2 thanks to this post, but can someone explains the SeperableConv2D computation of model 1 prediction please and its number of parameters (16) ?
As Keras uses Tensorflow, you can check in the Tensorflow's API the difference.
The conv2D is the traditional convolution. So, you have an image, with or without padding, and filter that slides through the image with a given stride.
On the other hand, the SeparableConv2D is a variation of the traditional convolution that was proposed to compute it faster. It performs a depthwise spatial convolution followed by a pointwise convolution which mixes together the resulting output channels. MobileNet, for example, uses this operation to compute the convolutions faster.
I could explain both operations here, however, this post has a very good explanation using images and videos that I strongly recommend you to read.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With