In a Convolutional Neural Network (CNN), a filter is selected for weight sharing. For example, in the following pictures, a 3x3 window with a stride (the distance between adjacent neurons) of 1 is chosen. <img src="https://i.stack.imgur.com/P5eh4.jpg" alt=""> <img src="https://i.stack.imgur.com/4eMG8.png" alt=""> So my question is: how do I choose the window size? If I use a 4x4 window with a stride of 2, how much difference will it make? Thanks a lot in advance!


How to choose the window size of CNN in deep learning?

1 Answer

There's no definitive answer to this: filter size is one of the hyperparameters you generally need to tune. However, there are some useful observations that may help you. It's often preferable to choose smaller filters, but to use a greater number of them.

Example: on a single-channel input, four 5x5 filters have 100 parameters (ignoring bias), while ten 3x3 filters have 90 parameters. With the larger number of filters you can still capture a variety of features in the image, but with fewer parameters. More on this here.
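The arithmetic behind these counts can be checked with a few lines of Python. This is only a sketch: `conv_params` is a hypothetical helper written for this illustration, not a function from any library, and it assumes a single input channel unless told otherwise.

```python
def conv_params(kernel_h, kernel_w, num_filters, in_channels=1, bias=False):
    """Count trainable parameters in one convolutional layer.

    Hypothetical helper for illustration; assumes a plain (non-grouped)
    convolution where every filter spans all input channels.
    """
    weights = kernel_h * kernel_w * in_channels * num_filters
    return weights + (num_filters if bias else 0)


four_5x5 = conv_params(5, 5, num_filters=4)   # 5 * 5 * 4
ten_3x3 = conv_params(3, 3, num_filters=10)   # 3 * 3 * 10
print(four_5x5, ten_3x3)
```

With more than one input channel both counts scale by the same factor, so the comparison between the two layers stays the same.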

Modern CNNs go even further with this idea and use consecutive 3x1 and 1x3 convolutional layers in place of a single 3x3 layer. This reduces the number of parameters even more (6 weights instead of 9 per pair of input and output channels), with little effect on accuracy. See the evolution of the Inception network.
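To see where the saving comes from, here is a small sketch comparing a full 3x3 layer with a 3x1 layer followed by a 1x3 layer. The function names and the channel count of 64 are assumptions chosen for illustration, and the sketch assumes the channel count stays the same through both factorized layers and that biases are ignored.

```python
def full_3x3_params(channels):
    """Weights in one 3x3 convolution mapping `channels` -> `channels`."""
    return 3 * 3 * channels * channels


def factorized_params(channels):
    """Weights in a 3x1 convolution followed by a 1x3 convolution,
    each mapping `channels` -> `channels` (an assumed configuration)."""
    return 3 * 1 * channels * channels + 1 * 3 * channels * channels


channels = 64  # illustrative value, not taken from any particular network
full = full_3x3_params(channels)
factorized = factorized_params(channels)
print(full, factorized, factorized / full)
```

Under these assumptions the factorized pair needs two thirds of the weights of the single 3x3 layer, while still covering a 3x3 neighbourhood of the input.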

The choice of stride is also important, because it affects the tensor shape after the convolution, and hence the whole network. The general rule is to use stride=1 in ordinary convolutions and preserve the spatial size with padding, and to use stride=2 when you want to downsample the image.
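The effect on the tensor shape follows from the standard output-size formula, floor((n + 2p - k) / s) + 1, for input size n, kernel size k, stride s and padding p per side. The sketch below applies it to the two settings from the question; the 32x32 input and the helper name `conv_output_size` are assumptions made for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension.

    Uses floor((n + 2p - k) / s) + 1, where `padding` is the number of
    pixels added on each side. Hypothetical helper for illustration.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


n = 32  # assumed input width/height; the question does not state one

same_3x3 = conv_output_size(n, kernel_size=3, stride=1, padding=1)   # size preserved
valid_3x3 = conv_output_size(n, kernel_size=3, stride=1, padding=0)  # shrinks by 2
down_4x4 = conv_output_size(n, kernel_size=4, stride=2, padding=1)   # halved
valid_4x4 = conv_output_size(n, kernel_size=4, stride=2, padding=0)

print(same_3x3, valid_3x3, down_4x4, valid_4x4)
```

So under these assumptions, switching from 3x3 with stride 1 to 4x4 with stride 2 roughly halves each spatial dimension of the feature map, which changes the shapes every later layer has to work with.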

117

answered Oct 09 '22 14:10

Maxim


Tags:

machine-learning

deep-learning

conv-neural-network

data-science

hyperparameters

Fitz999
