
Convolving Across Channels in Keras CNN: Conv1D, Depthwise Separable Conv, CCCP?

I am developing a CNN in Keras to classify satellite imagery that has 10 spectral bands. I'm getting decent accuracy with the network below (~60% val accuracy across 15 classes), but I want to better incorporate the relationships between spectral bands at a single pixel, which can yield a lot of information about the pixel's class. I see a lot of papers doing this, but it is often called different things. For example:

  • Cascaded cross-channel parametric pooling
  • Conv1D
  • Depthwise Separable Convolution
  • Conv2D(num_filters, (1, 1))

And I'm not certain about the differences between these approaches (if there are any) and how I should implement this in my simple CNN below. I'm also not clear if I should do this at the very beginning or towards the end. I'm inclined to do it right at the start when the channels are still the raw spectral data rather than the feature maps.

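from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dropout, Flatten, Dense

# 32x32 pixel patches with 10 spectral bands
input_shape = (32, 32, 10)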
num_classes = 15

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape))
model.add(Activation('relu'))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
asked Apr 30 '19 at 18:04 by clifgray



1 Answer

Let me explain the operations you mentioned in a bit of detail so that the differences in their intuition and usage are clear:

Cascaded cross-channel parametric pooling:

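This is introduced in the Network-in-Network paper as part of its mlpconv layer: a small multilayer perceptron applied across the channels at every pixel. The paper notes that this is equivalent to a convolution with a 1x1 kernel, so in Keras it corresponds to stacked Conv2D(num_filters, (1, 1)) layers rather than to a pooling layer.

The same paper also introduces global average pooling, which is implemented in Keras as GlobalAveragePooling2D(). This operation averages each feature map of the previous layer over its spatial dimensions.

Global average pooling is a structural regularizer that enforces correspondence between feature maps and categories, so feature maps can be interpreted as category confidence maps. It reduces the parameter count and sums up spatial information, which makes it more robust to spatial translations of the input.

GlobalAveragePooling2D() is generally used without Dense() layers in the model before it.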
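
To make the averaging concrete, here is a minimal NumPy sketch of what global average pooling computes. The array shape and the random values are assumptions chosen for illustration; this is not the Keras implementation itself.

```python
import numpy as np

# Hypothetical stack of feature maps: height 8, width 8, 15 channels
# (one channel per class, as in the Network-in-Network setup).
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(8, 8, 15))

# Global average pooling: collapse the two spatial axes, keep the channel axis.
pooled = feature_maps.mean(axis=(0, 1))

print(pooled.shape)  # (15,)
```

Each of the 15 values is the spatial mean of one feature map, which is why no extra weights are needed at this step.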

Conv1D:

Conv1D() is a convolution operation analogous to Conv2D(), but the kernel slides along only one dimension. Conv1D() is generally used on sequences or other 1D data, and much less often on images.
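
As a rough illustration of a 1D convolution, the sketch below slides a small kernel along the spectral bands of a single pixel. The reflectance values and the kernel are made up, and it assumes the bands are stored in a meaningful order (for example by wavelength). The helper name conv1d_valid is hypothetical and not part of Keras.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` along `signal` without padding (cross-correlation,
    which is what deep learning libraries call convolution)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Made-up reflectance values for the 10 spectral bands of one pixel.
pixel_spectrum = [0.10, 0.12, 0.15, 0.30, 0.45, 0.50, 0.48, 0.40, 0.22, 0.18]

# A simple difference kernel that responds to changes between nearby bands.
kernel = [-1.0, 0.0, 1.0]

response = conv1d_valid(pixel_spectrum, kernel)
print(len(response))  # 8
```

With 10 bands and a kernel of width 3, there are 10 - 3 + 1 = 8 output positions, and each output only sees three neighbouring bands.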

Depthwise Separable Convolution:

Quoting from the Keras documentation:

Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step.

This blog explains the depthwise separable convolution pretty well.
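
One way to see the practical difference is to count parameters. The sketch below compares a standard 3x3 convolution with a depthwise separable one for 10 input channels and 32 filters, matching the first layer of the question's model. The helper names are hypothetical, and the bias layout (a single bias per output filter) is an assumption about how the separable layer is parameterized.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return kernel_size * kernel_size * in_channels * filters + filters


def separable_conv2d_params(kernel_size, in_channels, filters, depth_multiplier=1):
    """Depthwise kernel + pointwise (1x1) kernel + one bias per output filter."""
    depthwise = kernel_size * kernel_size * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * filters
    return depthwise + pointwise + filters


standard = conv2d_params(3, 10, 32)
separable = separable_conv2d_params(3, 10, 32)
print(standard, separable)  # 2912 442
```

The separable version splits the work into a spatial step (one 3x3 filter per channel) and a channel-mixing step (the 1x1 pointwise convolution), which is where the saving comes from.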

Conv2D(num_filters, (1, 1)):

This is generally known as 1x1 convolution, introduced in the Network-in-Network paper.

The 1x1 convolutional filters are used to reduce/increase dimensionality in the filter dimension, without affecting the spatial dimensions. This is also used in the Google Inception architecture for dimensionality reduction in filter space.
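
A 1x1 convolution can be read as the same small dense layer applied to the channel vector of every pixel. The NumPy sketch below checks that equivalence on random data. The shapes (a 32x32 patch with 10 bands mapped to 16 filters) and the weights are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32, 10))   # height, width, spectral bands
weights = rng.normal(size=(10, 16))     # hypothetical 1x1 kernel: 10 bands -> 16 filters
bias = np.zeros(16)

# 1x1 convolution: each output pixel depends only on the same input pixel.
conv_1x1 = np.einsum("hwc,cf->hwf", image, weights) + bias

# The same computation written as a dense layer applied to every pixel.
per_pixel_dense = (image.reshape(-1, 10) @ weights + bias).reshape(32, 32, 16)

print(conv_1x1.shape)                          # (32, 32, 16)
print(np.allclose(conv_1x1, per_pixel_dense))  # True
```

Because no neighbouring pixels are involved, this mixes information across channels only, leaving the spatial dimensions untouched.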

In your particular case, I am not exactly sure which of these techniques you should use. I do not think Conv1D would be of much use. You can definitely use GlobalMaxPooling2D or GlobalAveragePooling2D as long as you do not use Dense before them; they summarize the spatial information in each feature map. Depthwise separable convolution can also be used in place of your Conv2D layers. Conv2D(num_filters, (1, 1)) is very helpful for dimensionality reduction in filter space, mostly towards the end of your model architecture.

If you follow the resources above, you may get a better understanding of these operations and see how they apply to your problem.

answered Sep 30 '22 at 09:09 by Anakin