Tensorflow: loss decreasing, but accuracy stable

Tags: neural-network, tensorflow, deep-learning, conv-neural-network, convolution

My team is training a CNN in TensorFlow for binary classification of damaged/acceptable parts. We created our code by modifying the cifar10 example code. In my prior experience with neural networks, I always trained until the loss was very close to 0 (well below 1). However, we are now evaluating our model on a validation set during training (on a separate GPU), and the precision appears to have stopped increasing after about 6.7k steps, while the loss is still dropping steadily after more than 40k steps. Is this due to overfitting? Should we expect another jump in accuracy once the loss is very close to zero? The current maximum accuracy is not acceptable. Should we kill the run and keep tuning? What do you recommend? Here is our modified code and graphs of the training process.

https://gist.github.com/justineyster/6226535a8ee3f567e759c2ff2ae3776b

Precision and Loss Images

220

asked Apr 19 '17 14:04

Justin Eyster

2 Answers

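A decrease in binary cross-entropy loss does not imply an increase in accuracy. Consider an example with label 1, predictions 0.2, 0.4 and 0.6 at timesteps 1, 2 and 3, and a classification threshold of 0.5. From timestep 1 to timestep 2 the loss decreases, but accuracy does not increase, because both predictions are still below the threshold. Only the prediction at timestep 3 is classified correctly.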
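To make the numbers above concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper name `binary_cross_entropy` is my own for illustration; it just applies the standard per-example formula to the three predictions.

```python
import math

def binary_cross_entropy(label, prob):
    """Cross-entropy loss for a single example with a 0/1 label."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

label = 1
threshold = 0.5
predictions = [0.2, 0.4, 0.6]  # model output at timesteps 1, 2, 3

losses = [binary_cross_entropy(label, p) for p in predictions]
# 1 if the thresholded prediction matches the label, else 0
correct = [int((p >= threshold) == bool(label)) for p in predictions]

for step, (p, loss, ok) in enumerate(zip(predictions, losses, correct), start=1):
    print(f"step {step}: prediction={p:.1f} loss={loss:.3f} correct={ok}")
```

The loss falls at every step (roughly 1.609, 0.916, 0.511), but the example only counts as correct at step 3, once the prediction crosses 0.5.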

First, make sure your model has enough capacity by checking that it can overfit the training data. Once it does overfit the training data, reduce the overfitting with regularization techniques such as dropout, L1 and L2 regularization, and data augmentation.
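As a rough illustration of what L1 and L2 regularization do to the objective, here is a framework-free sketch. The function name `regularized_loss`, the weight values, and the coefficients are all made up for the example; in a real TensorFlow model you would let the framework add these penalty terms for you.

```python
def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    """Add L1 and L2 weight penalties to a data loss (e.g. cross-entropy)."""
    l1_penalty = l1 * sum(abs(w) for w in weights)   # pushes weights toward exact zero
    l2_penalty = l2 * sum(w * w for w in weights)    # discourages large weights
    return data_loss + l1_penalty + l2_penalty

weights = [0.5, -2.0, 3.0]          # hypothetical model weights
plain = regularized_loss(0.10, weights)
penalized = regularized_loss(0.10, weights, l1=0.01, l2=0.001)

print(f"data loss only:      {plain:.5f}")
print(f"with L1/L2 penalties: {penalized:.5f}")
```

With the penalties switched on, driving the training loss toward zero by growing the weights is no longer free, which is the sense in which regularization works against memorizing the training set.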

Last, confirm your validation data and training data come from the same distribution.

112

answered Sep 23 '22 02:09

rafaelvalle

Here are my suggestions. One possible problem is that your network has started to memorize the training data, so yes, you should increase regularization.

Update: one more problem that may cause this is that the class balance ratio in the validation set is far from the one in the training set. As a first step, I would recommend trying to understand what your test data (real-world data, the data your model will face at inference time) looks like descriptively: its balance ratio and other similar characteristics. Then try to build train and validation sets with almost the same descriptive statistics as the real data.
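A quick way to check the balance ratio is to compare class fractions between the two splits. This is only a sketch: the label lists are invented, and `class_fractions` and `max_balance_gap` are hypothetical helper names, not part of any library.

```python
from collections import Counter

def class_fractions(labels):
    """Return the fraction of examples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

def max_balance_gap(train_labels, val_labels):
    """Largest absolute difference in class fraction between two splits."""
    train = class_fractions(train_labels)
    val = class_fractions(val_labels)
    classes = set(train) | set(val)
    return max(abs(train.get(c, 0.0) - val.get(c, 0.0)) for c in classes)

# Made-up labels: 0 = acceptable, 1 = damaged
train_labels = [0] * 80 + [1] * 20
val_labels = [0] * 50 + [1] * 50

print("train:", class_fractions(train_labels))
print("val:  ", class_fractions(val_labels))
print("max gap:", max_balance_gap(train_labels, val_labels))
```

A large gap, as in this toy data, suggests the splits were not drawn the same way and that validation accuracy may not reflect what the model learned from the training set.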

answered Sep 24 '22 02:09

Ali Abbasi
