
What is the difference between SeparableConv2D and Conv2D layers?

I could not find a clear answer to this question online (apologies if one exists). I would like to understand the differences between the two layers (SeparableConv2D and Conv2D), step by step, using for example an input of shape (3,3,3) (such as an RGB image).

Running this script, based on Keras with the TensorFlow backend:

import numpy as np
from keras.layers import Conv2D, SeparableConv2D
from keras.models import Model
from keras.layers import Input

red   = np.array([1]*9).reshape((3,3))
green = np.array([100]*9).reshape((3,3))
blue  = np.array([10000]*9).reshape((3,3))

img = np.stack([red, green, blue], axis=-1)
img = np.expand_dims(img, axis=0)

inputs = Input((3,3,3))
conv1 = SeparableConv2D(filters=1, 
              strides=1, 
              padding='valid', 
              activation='relu',
              kernel_size=2, 
              depth_multiplier=1,
              depthwise_initializer='ones',
              pointwise_initializer='ones',
              bias_initializer='zeros')(inputs)

conv2 = Conv2D(filters=1, 
              strides=1, 
              padding='valid', 
              activation='relu',
              kernel_size=2, 
              kernel_initializer='ones', 
              bias_initializer='zeros')(inputs)

model1 = Model(inputs,conv1)
model2 = Model(inputs,conv2)
print("Model 1 prediction: ")
print(model1.predict(img))
print("Model 2 prediction: ")
print(model2.predict(img))
print("Model 1 summary: ")
model1.summary()
print("Model 2 summary: ")
model2.summary()

I get the following output:

Model 1 prediction:
 [[[[40404.]
   [40404.]]
  [[40404.]
   [40404.]]]]
Model 2 prediction: 
[[[[40404.]
   [40404.]]
  [[40404.]
   [40404.]]]]
Model 1 summary: 
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 3, 3, 3)           0         
_________________________________________________________________
separable_conv2d_1 (Separabl (None, 2, 2, 1)           16        
=================================================================
Total params: 16
Trainable params: 16
Non-trainable params: 0
_________________________________________________________________
Model 2 summary: 
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 3, 3, 3)           0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 2, 2, 1)           13        
=================================================================
Total params: 13
Trainable params: 13
Non-trainable params: 0

I understand how Keras computes the Conv2D prediction of model 2 thanks to this post, but can someone please explain how SeparableConv2D computes the prediction of model 1, and where its number of parameters (16) comes from?

etiennedm asked Feb 15 '19 08:02



1 Answer

Since Keras runs on top of TensorFlow, you can check the difference in the TensorFlow API documentation.

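Conv2D is the traditional convolution: you have an image, with or without padding, and a filter that slides across the image with a given stride. Each filter spans all input channels, so every output value is a weighted sum over the whole kernel-height x kernel-width x channels window, plus a bias.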
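To make this concrete for the input in the question, here is a minimal NumPy sketch of what a single Conv2D filter does with `padding='valid'` and `strides=1`. It is only an illustration, not how Keras implements the layer; the helper name `conv2d_single_filter` is made up for this example.

```python
import numpy as np

# Same input as in the question: a 3x3 RGB image with constant channels.
red = np.full((3, 3), 1.0)
green = np.full((3, 3), 100.0)
blue = np.full((3, 3), 10000.0)
img = np.stack([red, green, blue], axis=-1)  # shape (3, 3, 3)

# One 2x2 filter that spans all 3 input channels, initialised to ones.
kernel = np.ones((2, 2, 3))
bias = 0.0


def conv2d_single_filter(image, kernel, bias):
    """Valid-padding, stride-1 convolution with one filter (hypothetical helper)."""
    kh, kw, _ = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The window covers kh x kw pixels across ALL channels at once.
            patch = image[i:i + kh, j:j + kw, :]
            out[i, j] = np.sum(patch * kernel) + bias
    return np.maximum(out, 0.0)  # ReLU activation


conv_out = conv2d_single_filter(img, kernel, bias)
print(conv_out)  # every entry is 4*1 + 4*100 + 4*10000 = 40404
```

Each output position sums 4 red, 4 green and 4 blue values in one step, which matches the 40404 reported for model 2.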

SeparableConv2D, on the other hand, is a variation of the traditional convolution that was proposed to make it cheaper to compute. It performs a depthwise spatial convolution (each input channel is convolved with its own filter) followed by a pointwise (1x1) convolution that mixes the resulting channels together. MobileNet, for example, uses this operation to speed up its convolutions.
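For the model 1 prediction in the question, the two steps can be sketched in NumPy as follows. This is an illustration under the question's settings (`depth_multiplier=1`, all-ones kernels, zero bias, `padding='valid'`, `strides=1`), not the actual Keras implementation; the helper name `separable_conv2d` is made up, and the sketch assumes the bias and activation are applied only after the pointwise step.

```python
import numpy as np

# Same input as in the question: constant red, green and blue channels.
img = np.stack(
    [np.full((3, 3), 1.0), np.full((3, 3), 100.0), np.full((3, 3), 10000.0)],
    axis=-1,
)  # shape (3, 3, 3)

depthwise_kernel = np.ones((2, 2, 3))  # one 2x2 filter per input channel
pointwise_kernel = np.ones((3, 1))     # 1x1 convolution: 3 channels -> 1 filter
bias = np.zeros(1)


def separable_conv2d(image, depthwise_kernel, pointwise_kernel, bias):
    """Depthwise then pointwise convolution (hypothetical helper)."""
    kh, kw, channels = depthwise_kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1

    # Step 1: depthwise. Each channel is convolved on its own; no mixing yet.
    depthwise_out = np.zeros((out_h, out_w, channels))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw, :]
            depthwise_out[i, j, :] = np.sum(patch * depthwise_kernel, axis=(0, 1))

    # Step 2: pointwise. A 1x1 convolution mixes the channels at each position.
    pointwise_out = depthwise_out @ pointwise_kernel + bias
    return depthwise_out, np.maximum(pointwise_out, 0.0)  # ReLU at the end


depthwise_out, separable_out = separable_conv2d(
    img, depthwise_kernel, pointwise_kernel, bias
)
print(depthwise_out[0, 0])  # per-channel sums: 4, 400, 40000
print(separable_out)        # every entry is 4 + 400 + 40000 = 40404
```

With all-ones kernels the two layers give the same number, which is why both models print 40404. With trained weights they generally differ, because the separable layer can only represent kernels that factor into a per-channel spatial filter and a channel-mixing vector.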

I could explain both operations here; however, this post has a very good explanation with images and videos that I strongly recommend reading.
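As for the parameter counts in the question (13 and 16), they can be reproduced with a short calculation. The two functions below are hypothetical helpers written for this answer, not part of Keras, and they assume square kernels and one bias per output filter.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Weights of a standard convolution: every filter spans all input channels."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


def separable_conv2d_params(kernel_size, in_channels, filters,
                            depth_multiplier=1, use_bias=True):
    """Depthwise weights plus pointwise (1x1) weights plus one bias per filter."""
    depthwise = kernel_size * kernel_size * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * filters
    return depthwise + pointwise + (filters if use_bias else 0)


# The layers from the question: 2x2 kernel, 3 input channels, 1 filter.
print(conv2d_params(2, 3, 1))            # 2*2*3*1 + 1 = 13
print(separable_conv2d_params(2, 3, 1))  # 2*2*3 + 3*1 + 1 = 16

# A larger, made-up layer: 3x3 kernel, 64 input channels, 128 filters.
print(conv2d_params(3, 64, 128))            # 73856
print(separable_conv2d_params(3, 64, 128))  # 8896
```

With a single filter the separable layer actually has more parameters (16 versus 13), because the pointwise step adds 3 weights. The saving only appears once there are many filters, as in the second example.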

André Pacheco answered Oct 16 '22 07:10
