How a Convolutional Neural Net handles channels


I've looked through a lot of explanations of the way a CNN conventionally handles multiple channels (such as 3 in an RGB image) and am still at a loss.

When a 5x5x3 filter (say) is applied to a patch of an RGB image what exactly happens? Is it in fact 3 different 2D convolutions (with independent weights) that happen separately to each channel? And then the results get simply added together to produce the final output to pass to the next layer? Or is it a truly 3D convolution?


asked Dec 26 '17 18:12

Swazy

1 Answer

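[Image: a 6 x 6 x 3 input convolved with a 3 x 3 x 3 filter to give a 4 x 4 output; the filter's current position on the input is drawn as a yellow cube]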

This image is from Andrew Ng's deeplearning.ai course. The input is 6 x 6 x 3, where 6 x 6 is the height and width of the image and 3 is the number of color channels. In the convolution step, the input is convolved with a 3 x 3 x 3 filter (kernel). The filter has the same depth as the input, one 3 x 3 slice per channel, so it slides only across height and width, never along the channel axis. The output is therefore 4 x 4 x 1.

The 3 x 3 x 3 filter has 27 weights (parameters). Place the filter over the top-left corner of the image (the yellow cube), multiply each weight by the corresponding value in the red, green and blue channels, and add up all 27 products to get the value at [0,0] of the 4 x 4 output. This is the same as running three separate 2D convolutions with independent weights, one per channel, and adding the three results, so it is not a 3D convolution in the sense of a filter that also moves through depth. Then slide the yellow cube one box to the right; when it reaches the right edge, move it one row down and back to the left edge, and repeat the multiply-and-add until the 4 x 4 output is filled. I suggest taking paper and pencil, filling the input and kernel cubes with random values, and working through the multiplication by hand.
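If you would rather check this in code than on paper, here is a minimal NumPy sketch of the same 6 x 6 x 3 input and 3 x 3 x 3 filter. It assumes stride 1, no padding and no bias term, and the function names `conv_multichannel` and `conv_per_channel_then_sum` are made up for this illustration; they are not from any library. As in most deep learning frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv_multichannel(image, kernel):
    """Slide a k x k x C kernel over an H x W x C image (stride 1, no padding).

    Hypothetical helper for illustration. The kernel is not flipped, so this
    is the cross-correlation that CNN layers usually call "convolution".
    """
    H, W, C = image.shape
    k = kernel.shape[0]
    assert kernel.shape == (k, k, C), "kernel depth must match image channels"
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = image[i:i + k, j:j + k, :]   # k x k x C window
            out[i, j] = np.sum(patch * kernel)   # k*k*C products, one sum
    return out

def conv_per_channel_then_sum(image, kernel):
    """Run one 2D pass per channel with that channel's weights, then add."""
    C = image.shape[2]
    return sum(
        conv_multichannel(image[:, :, c:c + 1], kernel[:, :, c:c + 1])
        for c in range(C)
    )

rng = np.random.default_rng(0)
image = rng.integers(0, 10, size=(6, 6, 3)).astype(float)   # random "RGB" input
kernel = rng.integers(-2, 3, size=(3, 3, 3)).astype(float)  # 27 random weights

out_a = conv_multichannel(image, kernel)
out_b = conv_per_channel_then_sum(image, kernel)

print(out_a.shape)                # (4, 4)
print(np.allclose(out_a, out_b))  # True
```

Both functions give the same 4 x 4 result, which is the point: summing 27 products in one go and summing three per-channel 2D results are the same computation.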

For more details, watch these lectures on YouTube:

https://www.youtube.com/watch?v=KTB_OFoAQcc&index=6&list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF

https://www.youtube.com/watch?v=7g8jpK4llkc&t=1s


answered Sep 20 '22 22:09

self.Fool


Tags: machine-learning, computer-vision, convolution
