 

How to overcome overfitting in CNN - standard methods don't work

I've recently been playing around with the Stanford cars dataset (http://ai.stanford.edu/~jkrause/cars/car_dataset.html). From the very beginning I had an overfitting problem, so I decided to:

  1. Add regularization (L2, dropout, batch norm, ...)
  2. Try different architectures (VGG16, VGG19, InceptionV3, DenseNet121, ...)
  3. Try transfer learning using models trained on ImageNet
  4. Use data augmentation (a rough sketch of my setup is below)
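
Roughly what my current setup looks like (a minimal sketch only; the paths, image size and hyperparameters below are illustrative, not my exact values):

```python
# Minimal sketch: transfer learning from ImageNet with augmentation,
# dropout and L2 regularization (paths and hyperparameters are illustrative).
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.regularizers import l2

NUM_CLASSES = 196  # Stanford Cars has 196 classes

base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # freeze the pretrained backbone first

x = GlobalAveragePooling2D()(base.output)
x = Dropout(0.5)(x)
out = Dense(NUM_CLASSES, activation="softmax", kernel_regularizer=l2(1e-4))(x)
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Data augmentation on the training set only
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
).flow_from_directory("data/train", target_size=(299, 299), batch_size=32)

val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/val", target_size=(299, 299), batch_size=32
)

model.fit(train_gen, validation_data=val_gen, epochs=30)
```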

Every step moved me a little bit forward. However, I ended up with 50% validation accuracy (having started below 20%) compared to 99% training accuracy.

Do you have any ideas about what more I can do to get to around 80-90% accuracy?

Hope this can help some people! :)

asked Feb 04 '18 by Michał Gdak


1 Answer

Things you should try include:

  • Early stopping, i.e. use a portion of your data to monitor validation loss and stop training if performance does not improve for a number of epochs (see the sketch after this list).
  • Check whether you have unbalanced classes; if so, use class weighting so that each class is equally represented during training.
  • Regularization parameter tuning: try different L2 coefficients, different dropout values, and different regularization constraints (e.g. L1).
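
For the first two points, a minimal sketch in Keras (assuming your model and data generators already exist; `model`, `train_gen` and `val_gen` below are placeholders for them, and the numbers are illustrative):

```python
# Sketch: early stopping on validation loss plus class weighting.
# `model`, `train_gen` and `val_gen` stand in for your existing compiled
# model and flow_from_directory generators.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor="val_loss",          # watch validation loss
    patience=5,                  # stop after 5 epochs without improvement
    restore_best_weights=True,   # roll back to the best epoch seen
)

# Weight each class inversely to its frequency in the training set.
train_labels = train_gen.classes  # integer label per training image
weights = compute_class_weight(
    class_weight="balanced", classes=np.unique(train_labels), y=train_labels
)
class_weights = dict(enumerate(weights))

model.fit(
    train_gen,
    validation_data=val_gen,
    epochs=100,                  # upper bound; early stopping ends training sooner
    callbacks=[early_stop],
    class_weight=class_weights,
)
```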

Another general suggestion is to try to replicate the state-of-the-art models on this particular dataset and see whether they perform as they should.
Also make sure to have all implementation details ironed out (e.g. that convolution is performed along the width and height dimensions, and not along the channels dimension - this is a classic rookie mistake when starting out with Keras, for instance).
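
A quick sanity check for that last point, assuming a TensorFlow backend (shapes here are just an example):

```python
# Quick sanity check that convolutions run over height/width, not channels.
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model

print(K.image_data_format())  # should be "channels_last" for (height, width, channels) inputs

# A 3x3 convolution with "same" padding over a 224x224 RGB image should keep
# the spatial dimensions and change only the channel count.
inp = Input(shape=(224, 224, 3))
out = Conv2D(16, (3, 3), padding="same")(inp)
print(Model(inp, out).output_shape)  # expected: (None, 224, 224, 16)
```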

It would also help to have some more details on the code that you are using, but for now these suggestions will do.
50% accuracy on a 200-class problem doesn't sound so bad anyway.

Cheers

answered Sep 24 '22 by Daniele Grattarola