Does MaxPooling reduce overfitting?

I have trained the following CNN model on a relatively small dataset, and it overfits:

# Imports assumed by the snippet (not shown in the original post)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPooling2D, Dropout, Flatten, Dense)
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), input_shape=(28,28,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(Conv2D(32, kernel_size=(3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))   # 28x28 -> 14x14
model.add(Dropout(0.4))

model.add(Flatten())                       # 14*14*32 = 6272 features
model.add(Dense(512))                      # 6272*512 + 512 weights: ~3.2M parameters
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer=Adam(), metrics=['accuracy'])

The model has a lot of trainable parameters (more than 3 million), which is why I wonder whether I should reduce the number of parameters with an additional MaxPooling layer, like this:

Conv - BN - Act - MaxPooling - Conv - BN - Act - MaxPooling - Dropout - Flatten

or with an additional MaxPooling and Dropout, like this (see the sketch below):

Conv - BN - Act - MaxPooling - Dropout - Conv - BN - Act - MaxPooling - Dropout - Flatten
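Here is a minimal sketch of that second variant, keeping the same filter counts and layer sizes as the model above (the 0.25 dropout rates after the conv blocks are illustrative, not from the original post):

# Variant 2: pooling + dropout after each conv block (dropout rates illustrative)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), input_shape=(28,28,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))   # 28x28 -> 14x14
model.add(Dropout(0.25))

model.add(Conv2D(32, kernel_size=(3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))   # 14x14 -> 7x7
model.add(Dropout(0.25))

model.add(Flatten())                       # 7*7*32 = 1568 features instead of 6272
model.add(Dense(512))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer=Adam(), metrics=['accuracy'])

Each extra pooling step halves the spatial dimensions, so the Flatten output drops from 14*14*32 = 6272 to 7*7*32 = 1568 features, shrinking the Dense(512) weight matrix from roughly 3.2M to roughly 0.8M parameters.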

I am trying to understand the purpose of MaxPooling and whether it can help against overfitting.

asked Jan 13 '20 by Code Now

People also ask

Does Maxpooling help with overfitting?

Max pooling is done in part to help prevent over-fitting by providing an abstracted form of the representation. It also reduces the computational cost by reducing the number of parameters to learn, and provides basic translation invariance to the internal representation.
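A quick way to see the parameter reduction (a toy comparison, not from the original page) is to count parameters with and without a pooling layer in front of a dense head:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def head(with_pool):
    m = Sequential()
    m.add(Conv2D(32, kernel_size=(3,3), padding='same', input_shape=(28,28,1)))
    if with_pool:
        m.add(MaxPooling2D(pool_size=(2,2)))  # 28x28 -> 14x14
    m.add(Flatten())
    m.add(Dense(10))
    return m

print(head(with_pool=False).count_params())  # 251,210 (Dense sees 28*28*32 = 25088 inputs)
print(head(with_pool=True).count_params())   # 63,050  (Dense sees 14*14*32 = 6272 inputs)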

Which technique reduces overfitting?

Data augmentation: one of the best techniques for reducing overfitting is to increase the size of the training dataset. When the training dataset is small, the network can more easily memorize the training data.
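For example, a minimal Keras augmentation setup might look like this (the transform ranges and the x_train/y_train names are illustrative):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Randomly perturb each training batch so the network never sees the
# exact same image twice; ranges here are illustrative for 28x28 digits.
datagen = ImageDataGenerator(
    rotation_range=10,       # rotate up to +/- 10 degrees
    width_shift_range=0.1,   # shift horizontally by up to 10%
    height_shift_range=0.1,  # shift vertically by up to 10%
    zoom_range=0.1,          # zoom in/out by up to 10%
)
model.fit(datagen.flow(x_train, y_train, batch_size=128), epochs=20)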

Does pooling in CNN control overfitting?

Pooling also provides the ability to learn invariant features and acts as a regularizer, further reducing the problem of overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, which are equally important to consider.

What is Maxpooling used for?

Max Pooling is a pooling operation that calculates the maximum value for patches of a feature map, and uses it to create a downsampled (pooled) feature map. It is usually used after a convolutional layer.
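Concretely, a 2x2 max pool keeps only the largest value in each non-overlapping 2x2 patch; a small sketch:

import numpy as np
import tensorflow as tf

# One 4x4 single-channel feature map, shaped (batch, height, width, channels)
x = np.array([[1, 3, 2, 1],
              [4, 2, 1, 5],
              [0, 1, 3, 2],
              [2, 8, 1, 0]], dtype=np.float32).reshape(1, 4, 4, 1)

pooled = tf.nn.max_pool2d(x, ksize=2, strides=2, padding='VALID')
print(pooled[0, :, :, 0].numpy())
# [[4. 5.]
#  [8. 3.]]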


1 Answer

Overfitting can happen when your dataset is not large enough to accommodate your number of features. Max pooling uses a max operation to pool sets of features, leaving you with a smaller number of them. Therefore, max pooling should logically reduce overfitting.
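As a shape check (a toy illustration, not from the answer itself), a 2x2 pool quarters the number of activations passed on to later layers:

import numpy as np
from tensorflow.keras.layers import MaxPooling2D

feature_maps = np.random.rand(1, 28, 28, 32).astype('float32')
pooled = MaxPooling2D(pool_size=(2, 2))(feature_maps)
print(pooled.shape)  # (1, 14, 14, 32): 6272 features instead of 25088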

Dropout reduces reliance on any single feature by ensuring that the feature is not always available, forcing the model to look for different potential hints rather than just sticking with one, which would easily allow the model to overfit on any apparently good hint. Therefore, dropout should also help reduce overfitting.
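A tiny sketch of that behaviour (illustrative; the 0.5 rate matches the model above):

import tensorflow as tf

tf.random.set_seed(0)
x = tf.ones((1, 8))                     # eight "features", all present
drop = tf.keras.layers.Dropout(0.5)

print(drop(x, training=True).numpy())   # a random subset zeroed; survivors scaled by 1/(1-0.5) = 2
print(drop(x, training=False).numpy())  # at inference dropout is a no-op: all ones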

answered Oct 05 '22 by Kiara Grouwstra