From this example: https://github.com/fchollet/keras/blob/master/examples/imdb_cnn.py
comes the snippet below. The embedding layer outputs a 400 x 50 matrix for each example in a batch. My question is: how does the 1D convolution work across that 400 x 50 matrix?
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
                    embedding_dims,
                    input_length=maxlen,
                    dropout=0.2))

# we add a Convolution1D, which will learn nb_filter
# word group filters of size filter_length:
model.add(Convolution1D(nb_filter=nb_filter,
                        filter_length=filter_length,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1))
The 1D block consists of a configurable number of filters, each with a fixed size; a convolution operation is performed between the input and each filter, producing as output a new vector with as many channels as there are filters.
A 1 x 1 convolution is a convolution with some special properties: it can be used for dimensionality reduction, for efficient low-dimensional embeddings, and for applying a non-linearity after a convolution. It maps an input pixel with all its channels to an output pixel, which can be squeezed down to a desired output depth.
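A minimal NumPy sketch of the idea, with hypothetical sizes (an 8 x 8 feature map with 64 channels reduced to 16): a 1 x 1 convolution is just the same linear map from input channels to output channels applied independently at every pixel.

```python
import numpy as np

h, w, c_in, c_out = 8, 8, 64, 16
feature_map = np.random.randn(h, w, c_in)

# The entire bank of 1x1 kernels is one (c_in, c_out) weight matrix.
kernel = np.random.randn(c_in, c_out)

# Applied per pixel: each (64,) channel vector is mapped to a (16,) vector.
reduced = feature_map @ kernel
reduced = np.maximum(reduced, 0)   # non-linearity after the convolution

print(reduced.shape)  # (8, 8, 16)
```

Nothing spatial happens here; only the channel depth changes, which is why 1 x 1 convolutions are a cheap way to shrink (or grow) the number of channels.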
A 1-D convolutional layer applies sliding convolutional filters to 1-D input. The layer convolves the input by moving the filters along it, computing the dot product of the weights and the input at each position, and then adding a bias term.
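That sliding dot product can be sketched in plain NumPy. The sizes below are hypothetical but mirror the question (a 400-token sequence with 50-dimensional embeddings); with 'valid' padding and stride 1, a filter of length 5 fits at 400 - 5 + 1 = 396 positions.

```python
import numpy as np

seq_len, emb_dim = 400, 50      # embedding output for one example
n_filters, filter_len = 3, 5    # hypothetical small filter bank

x = np.random.randn(seq_len, emb_dim)
w = np.random.randn(n_filters, filter_len, emb_dim)  # each filter spans all 50 dims
b = np.zeros(n_filters)

out_len = seq_len - filter_len + 1   # 396 positions ('valid', stride 1)
out = np.empty((out_len, n_filters))
for t in range(out_len):
    window = x[t:t + filter_len]     # (filter_len, emb_dim) slice of the input
    # dot product of each filter with the window, plus bias
    out[t] = np.tensordot(w, window, axes=([1, 2], [0, 1])) + b
out = np.maximum(out, 0)             # ReLU, as in the Keras snippet

print(out.shape)  # (396, 3)
```

Each filter slides along the time axis only, but at every step it covers the full embedding depth, which is the key point for the question above.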
A convolutional neural network (CNN) takes an image, assigns learnable weights to the different objects in the image, and uses them to distinguish those objects from one another. A CNN requires very little pre-processing compared with other deep learning algorithms.
Coming from a signal-processing background, it also took me a while to understand the concept, and that seems to be the case for many people in the community.
Pyan gave a very good explanation. Since it is often explained only in words on many forums, I made a little animation that I hope will help.
Below you can see the input tensor, the filter (or weights), and the output tensor. You can also see how the size of the output tensor depends on the number of filters used (represented with different colours).
Visual representation of the 1D convolution (simplified)
Note that to perform the scalar multiplication between the input and the filter, the filter should be transposed. There are also different implementations (Keras, TensorFlow, PyTorch, ...), but I think this animation gives a good representation of what is happening.
Hope it can help someone.
In convolutional neural networks (CNNs), 1D and 2D filters are not really one- and two-dimensional; that is just a naming convention.
In your example, each 1D filter is actually an L x 50 filter, where L is the filter length. The convolution is performed along one dimension only, which is why it is called 1D. So, with proper padding, each 1D filter convolution gives a 400 x 1 vector, and the Convolution1D layer as a whole outputs a 400 x nb_filter matrix.
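A quick NumPy check of those shapes, with hypothetical values (L = 3, nb_filter = 250): using 'same' padding along the time axis only, each L x 50 filter produces a 400 x 1 vector, and stacking all filters gives 400 x nb_filter.

```python
import numpy as np

maxlen, emb_dim = 400, 50
nb_filter, L = 250, 3            # hypothetical filter count and length

x = np.random.randn(maxlen, emb_dim)
pad = L // 2
x_padded = np.pad(x, ((pad, pad), (0, 0)))   # 'same' padding, time axis only
filters = np.random.randn(nb_filter, L, emb_dim)

# Each filter slides over the 400 positions; each step sums an L x 50 product.
out = np.stack([
    np.array([np.sum(f * x_padded[t:t + L]) for t in range(maxlen)])
    for f in filters
], axis=1)

print(out.shape)  # (400, 250)
```

Each column of the result is one filter's 400 x 1 response, matching the shapes described above.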