From the document, I know SeparableConv2D
is a combination of depthwise and pointwise operation. However, when I call
SeparableConv2D(100, 5, input_shape=(416,416,10)
# total parameters is 1350
model.add(DepthwiseConv2D(5, input_shape=(416,416,10)))
model.add(Conv2D(100, 1))
# total parameters is 1360
Does it mean SeparableConv2D
does not use bias in depthwise phase by default?
Thanks.
Depthwise convolution is a type of convolution in which each input channel is convolved with a different kernel (called a depthwise kernel). You can understand depthwise convolution as the first step in a depthwise separable convolution.
On the other hand, the SeparableConv2D is a variation of the traditional convolution that was proposed to compute it faster. It performs a depthwise spatial convolution followed by a pointwise convolution which mixes together the resulting output channels.
Correct, checking the source code (I did this for tf.keras
but I suppose it is the same for standalone keras
) shows that in SeparableConv2D
, the separable convolution works using only filters, no biases, and a single bias vector is added at the end. The second version, on the other hand, has biases for both DepthwiseConv2D
and Conv2D
.
Given that convolution is a linear operation and you are using no non-linearity inbetween depthwise and 1x1 convolution, I would suppose that having two biases is unnecessary in this case, similar to how you don't use biases in a layer that is followed by batch normalization, for example. As such, the extra 10 parameters wouldn't actually improve the model (nor should they really hurt either).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With