How to increase validation accuracy with deep neural net?

Tags: deep-learning, caffe, mxnet

I am trying to build an 11-class image classifier with 13,000 training images and 3,000 validation images. I am using a deep neural network trained with MXNet. Training accuracy keeps increasing and has reached above 80%, but validation accuracy stays in the range of 54-57% and is not increasing. What could be the issue here? Should I increase the number of images?

464

asked May 04 '16 07:05

2 Answers

The issue here is that your network stops learning useful general features at some point and starts adapting to peculiarities of your training set (overfitting it as a result). You want to 'force' your network to keep learning useful features, and you have a few options here:

Use weight regularization. It tries to keep weights low, which very often leads to better generalization. Experiment with different regularization coefficients: try 0.1, 0.01, and 0.001 and see what impact they have on accuracy.
Corrupt your input (e.g., randomly substitute some pixels with black or white). This way you remove information from your input and 'force' the network to pick up on important general features. Experiment with the noising coefficient, which determines how much of your input should be corrupted. Corruption levels in the range of 15% - 45% have been reported to work well.
Expand your training set. Since you're dealing with images, you can expand your set by rotating, scaling, etc. your existing images (as suggested). You could also experiment with pre-processing your images (e.g., mapping them to black and white or grayscale), but the effectiveness of this technique will depend on your exact images and classes.
Pre-train your layers with denoising criteria. Here you pre-train each layer of your network individually before fine-tuning the entire network. Pre-training 'forces' layers to pick up on important general features that are useful for reconstructing the input signal. Look into auto-encoders, for example (they've been applied to image classification in the past).
Experiment with network architecture. Your network might not have sufficient learning capacity. Experiment with different neuron types, number of layers, and number of hidden neurons. Make sure to try compressing architectures (fewer neurons than inputs) and sparse architectures (more neurons than inputs).
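
The input-corruption option above can be illustrated with a minimal NumPy sketch. It assumes images are uint8 arrays of shape (H, W) or (H, W, C) with values 0-255. The function name `corrupt_pixels` and its `fraction` parameter are made up for this example and are not part of MXNet.

```python
import numpy as np

def corrupt_pixels(image, fraction, rng):
    """Return a copy of `image` with roughly `fraction` of its pixels
    replaced by black (0) or white (255)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    # Decide per pixel position whether it gets corrupted.
    mask = rng.random(image.shape[:2]) < fraction
    # Pick black or white for every position; only masked ones are used.
    values = rng.choice(np.array([0, 255], dtype=image.dtype),
                        size=image.shape[:2])
    if image.ndim == 3:
        # Apply the same mask and value to every channel of a pixel.
        mask, values = mask[..., None], values[..., None]
    return np.where(mask, values, image)

rng = np.random.default_rng(0)
image = np.full((32, 32, 3), 128, dtype=np.uint8)  # dummy gray image
noisy = corrupt_pixels(image, fraction=0.3, rng=rng)
```

Corruption of this kind would normally be applied only to training batches, with a fresh random mask each time an image is seen, so that the validation images stay clean.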

Unfortunately, the process of training a network that generalizes well involves a lot of experimentation and almost brute-force exploration of the parameter space with a bit of human supervision (you'll see many research works employing this approach). It's good to try 3-5 values for each parameter and see if it leads you somewhere.
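
As a rough sketch of that kind of exploration, the loop below tries every combination of a few candidate values and keeps the best one. The `fake_evaluate` function is a hypothetical stand-in for "train the network with these settings and return validation accuracy"; the parameter names and the scoring formula are invented purely so the example runs.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every combination in `grid`; return (best_score, best_params)."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)  # expected: validation accuracy
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Hypothetical stand-in for a real training run (assumption, not real data).
def fake_evaluate(weight_decay, noise_fraction):
    return 1.0 - abs(weight_decay - 0.01) - abs(noise_fraction - 0.3)

grid = {
    "weight_decay": [0.1, 0.01, 0.001],
    "noise_fraction": [0.15, 0.30, 0.45],
}
best_score, best_params = grid_search(fake_evaluate, grid)
```

With three values for each of two parameters this is nine training runs, so the cost grows quickly as parameters are added; that is why a handful of values per parameter is a practical limit.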

When you experiment, plot accuracy / cost / F1 as a function of the number of iterations and see how it behaves. Often you'll notice a peak in accuracy for your test set, followed by a continuous drop. So apart from a good architecture, regularization, corruption, etc., you're also looking for the number of iterations that yields the best results.
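
One simple way to pick that number of iterations is to track validation accuracy per epoch and stop once it has not improved for a while. The sketch below assumes you already have a list of per-epoch validation accuracies; the `patience` parameter and the numbers in `history` are illustrative, not measurements.

```python
def best_epoch(val_accuracies, patience):
    """Return (epoch_index, accuracy) of the best validation accuracy seen
    before `patience` epochs in a row fail to improve on it."""
    best_idx, best_acc = -1, float("-inf")
    for idx, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_idx, best_acc = idx, acc
        elif idx - best_idx >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_idx, best_acc

# Made-up accuracy curve: rises, peaks at epoch 4, then drifts down.
history = [0.40, 0.48, 0.53, 0.56, 0.57, 0.55, 0.54, 0.52, 0.58]
stop_at = best_epoch(history, patience=3)
```

In a real training loop you would save the model weights whenever a new best is found, so that the weights from the peak can be restored after stopping.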

One more hint: make sure each training epoch randomizes the order of the images.
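
A minimal sketch of per-epoch shuffling, assuming the dataset can be addressed by integer index. The helper name `epoch_batches` is hypothetical; most data-loading utilities expose an equivalent shuffle option, so check your framework's iterator before writing your own.

```python
import random

def epoch_batches(num_samples, batch_size, rng):
    """Yield lists of sample indices that cover the dataset exactly once,
    in a fresh random order on every call."""
    order = list(range(num_samples))
    rng.shuffle(order)
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]

rng = random.Random(0)
# Calling the generator once per epoch reshuffles the data each time.
epoch_1 = list(epoch_batches(10, 4, rng))
epoch_2 = list(epoch_batches(10, 4, rng))
```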

159

answered Sep 21 '22 08:09

krychu

This clearly looks like a case where the model is overfitting the training set, as the validation accuracy improved step by step until it plateaued at a particular value. If the learning rate were a bit higher, you would have ended up seeing validation accuracy decrease while training accuracy kept increasing.

Increasing the size of the training set is the best solution to this problem. You could also try applying different transformations (flipping, cropping random portions from a slightly bigger image) to the existing image set and see if the model learns better.
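
The two transformations mentioned above can be sketched in a few lines of NumPy. This assumes images are arrays of shape (H, W, C) that are slightly larger than the network's input size; the function name `random_flip_and_crop` and the sizes used (36 and 32) are chosen only for illustration.

```python
import numpy as np

def random_flip_and_crop(image, crop_size, rng):
    """Randomly mirror `image` left-right, then cut out a random
    crop_size x crop_size window."""
    height, width = image.shape[:2]
    if crop_size > min(height, width):
        raise ValueError("crop_size must fit inside the image")
    if rng.random() < 0.5:
        image = image[:, ::-1]  # horizontal flip
    # Upper bound is exclusive, so the window always fits in the image.
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    return image[top:top + crop_size, left:left + crop_size].copy()

rng = np.random.default_rng(0)
image = np.arange(36 * 36 * 3).reshape(36, 36, 3)  # dummy 36x36 image
crop = random_flip_and_crop(image, crop_size=32, rng=rng)
```

Drawing a new flip and crop each time an image is loaded means the network rarely sees exactly the same input twice, without any extra images being stored on disk.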

answered Sep 21 '22 08:09

Anoop K. Prabhu
