
Difference between DepthwiseConv2D and SeparableConv2D

Tags: keras

From the documentation, I know that SeparableConv2D is a combination of a depthwise and a pointwise operation. However, when I call

SeparableConv2D(100, 5, input_shape=(416,416,10))

# total parameters is 1350

model.add(DepthwiseConv2D(5, input_shape=(416,416,10)))
model.add(Conv2D(100, 1))

# total parameters is 1360

Does this mean SeparableConv2D does not use a bias in the depthwise phase by default?

Thanks.

michaelowenliu asked Jun 26 '19 02:06





1 Answer

Correct. Checking the source code (I did this for tf.keras, but I suppose it is the same for standalone Keras) shows that SeparableConv2D applies the depthwise and pointwise kernels without any biases and then adds a single bias vector at the end. The second version, on the other hand, has a bias in both DepthwiseConv2D and Conv2D. The 10 extra parameters are that depthwise bias: one value per input channel.
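To make the two totals concrete, here is a small sketch of the arithmetic in plain Python. The helper names are mine, not part of Keras, and the sketch assumes the default depth_multiplier of 1 and the bias layout described above.

```python
def depthwise_params(kernel_size, in_channels, use_bias, depth_multiplier=1):
    # One kernel_size x kernel_size filter per input channel (times depth_multiplier).
    weights = kernel_size * kernel_size * in_channels * depth_multiplier
    biases = in_channels * depth_multiplier if use_bias else 0
    return weights + biases


def pointwise_params(in_channels, filters, use_bias):
    # A 1x1 convolution: one weight per (input channel, output filter) pair.
    weights = in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


# SeparableConv2D(100, 5) on 10 channels: no depthwise bias, one bias at the end.
separable_total = (depthwise_params(5, 10, use_bias=False)
                   + pointwise_params(10, 100, use_bias=True))

# DepthwiseConv2D(5) followed by Conv2D(100, 1): each layer has its own bias.
stacked_total = (depthwise_params(5, 10, use_bias=True)
                 + pointwise_params(10, 100, use_bias=True))

print(separable_total, stacked_total)  # 1350 1360
```

The depthwise kernel contributes 5 * 5 * 10 = 250 weights and the pointwise kernel 10 * 100 = 1000 in both cases, so the only difference is the bias terms.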

Given that convolution is a linear operation and there is no non-linearity in between the depthwise and the 1x1 convolution, I would expect the depthwise bias to be unnecessary in this case, similar to how you don't use biases in a layer that is followed by batch normalization. As such, the extra 10 parameters wouldn't actually improve the model (nor should they really hurt).
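As a rough illustration of why the depthwise bias is redundant, here is a toy sketch in plain Python that looks at a single pixel, where a 1x1 convolution is just a matrix-vector product. All the numbers and the `pointwise` helper are made up for the example. It relies on there being no activation between the two layers.

```python
def pointwise(weights, bias, pixel):
    # 1x1 convolution at one pixel: each output channel is a weighted
    # sum over the input channels, plus that output channel's bias.
    return [sum(w * x for w, x in zip(row, pixel)) + b
            for row, b in zip(weights, bias)]


depthwise_out = [3, -1, 4]         # depthwise result at one pixel, before its bias
depthwise_bias = [2, 5, -3]        # one bias per input channel
W = [[1, 0, 2], [-1, 3, 1]]        # pointwise weights: 2 outputs x 3 inputs
pointwise_bias = [10, 20]

# Version with two biases: add the depthwise bias, then apply the 1x1 convolution.
biased_pixel = [d + b for d, b in zip(depthwise_out, depthwise_bias)]
with_two_biases = pointwise(W, pointwise_bias, biased_pixel)

# Version with one bias: push the depthwise bias through W and fold it
# into the final bias, then skip the depthwise bias entirely.
folded_bias = pointwise(W, pointwise_bias, depthwise_bias)
with_one_bias = pointwise(W, folded_bias, depthwise_out)

print(with_two_biases, with_one_bias)  # [17, 28] [17, 28]
```

Since W(d + b1) + b2 = Wd + (W b1 + b2), any model with two biases can be expressed with a single final bias, which is what SeparableConv2D learns directly.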

xdurch0 answered Oct 01 '22 13:10
