In a CNN, the convolution operation 'convolves' a kernel matrix over an input matrix. Now, I know how a fully connected layer makes use of gradient descent and backpropagation to get trained. But how does the kernel matrix change over time?
There are multiple ways in which the kernel matrix is initialized as mentioned here, in the Keras documentation. However, I am interested to know how it is trained? If it uses backpropagation too, then is there any paper that describes in detail the training process?
This post also raises a similar question, but it is unanswered.
In Convolutional neural network, the kernel is nothing but a filter that is used to extract the features from the images. The kernel is a matrix that moves over the input data, performs the dot product with the sub-region of input data, and gets the output as the matrix of dot products.
Instead of using manually made kernels for feature extraction, through Deep CNNs we can learn these kernel values which can extract latent features.
Convolutional neural networks do not learn a single filter; they, in fact, learn multiple features in parallel for a given input. For example, it is common for a convolutional layer to learn from 32 to 512 filters in parallel for a given input.
The convolutional Neural Network CNN works by getting an image, designating it some weightage based on the different objects of the image, and then distinguishing them from each other. CNN requires very little pre-process data as compared to other deep learning algorithms.
Here you have a well explained post about backpropagation for Convolutional layer. In short, it is also gradient descent just like with FC layer. In fact, you can effectively turn a Convolutional layer into a Fuly Connected layer as explained here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With