Training on imbalanced data using TensorFlow

The Situation:

I am wondering how to use TensorFlow optimally when my training data has an imbalanced label distribution between two labels. For instance, suppose the MNIST tutorial is simplified to distinguish only between 1's and 0's, where all images available to us are either 1's or 0's. This is straightforward to train using the provided TensorFlow tutorials when we have roughly 50% of each type of image to train and test on. But what about the case where 90% of the images available in our data are 0's and only 10% are 1's? I observe that in this case, TensorFlow routinely predicts my entire test set to be 0's, achieving a meaningless accuracy of 90%.

One strategy I have used with some success is to pick random training batches that have an even distribution of 0's and 1's. This approach lets me still use all of my training data, and it has produced decent results: accuracy below 90%, but a much more useful classifier. Since accuracy is of little use to me in this case, my metric of choice is typically area under the ROC curve (AUROC), and this approach produces results respectably higher than 0.50.
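
A minimal sketch of this balanced-batch selection (assuming NumPy arrays images and labels with integer class labels, and that each class has at least batch_size // 2 examples; all names here are hypothetical):

import numpy as np

def balanced_batch(images, labels, batch_size):
    # Indices of each class in the full training set.
    zero_idx = np.where(labels == 0)[0]
    one_idx = np.where(labels == 1)[0]
    half = batch_size // 2
    # Draw half the batch from each class, then shuffle the combined batch.
    chosen = np.concatenate([
        np.random.choice(zero_idx, half, replace=False),
        np.random.choice(one_idx, half, replace=False),
    ])
    np.random.shuffle(chosen)
    return images[chosen], labels[chosen]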

Questions:

(1) Is the strategy I have described an accepted or optimal way of training on imbalanced data, or is there one that might work better?

(2) Since the accuracy metric is not as useful in the case of imbalanced data, is there another metric that can be maximized by altering the cost function? I can certainly calculate AUROC post-training, but can I train in such a way as to maximize AUROC?

(3) Is there some other alteration I can make to my cost function to improve my results for imbalanced data? Currently, I am using the default suggestion given in the TensorFlow tutorials:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

I have heard this may be possible by up-weighting the cost of misclassifying the minority class, but I am unsure of how to do this.

asked Jan 27 '16 by MJoseph




1 Answer

(1) It's OK to use your strategy. I work with imbalanced data as well, and I usually try down-sampling and up-sampling methods first to make the training set evenly distributed, or use an ensemble method and train each classifier on an evenly distributed subset.
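
For example, a rough up-sampling sketch (assuming NumPy arrays images and labels with integer class labels; all names are hypothetical) that replicates minority-class examples with replacement until both classes have equal counts:

import numpy as np

minority_idx = np.where(labels == 1)[0]
majority_idx = np.where(labels == 0)[0]
# Sample extra minority examples (with replacement) to match the majority count.
extra = np.random.choice(minority_idx, len(majority_idx) - len(minority_idx), replace=True)
balanced_idx = np.concatenate([majority_idx, minority_idx, extra])
np.random.shuffle(balanced_idx)
images_bal, labels_bal = images[balanced_idx], labels[balanced_idx]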

(2) I haven't seen a method that directly maximises AUROC. My understanding is that AUROC is based on the true positive and false positive rates, which do not tell you how well the model does on each individual instance, so optimising it would not necessarily maximise the model's ability to separate the classes.
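
That said, AUROC is easy to compute after training, e.g. with scikit-learn (a sketch; y_true holds the true binary labels and probs the predicted probability of class 1, such as the second column of the softmax output; both names are hypothetical):

from sklearn.metrics import roc_auc_score

# AUROC over the test set; 0.5 corresponds to a random-guessing classifier.
auroc = roc_auc_score(y_true, probs)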

(3) Regarding weighting the cost by the ratio of class instances, see the question "Loss function for class imbalanced binary classifier in Tensor flow" and its answer, which is similar to what you describe.
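
The idea can be sketched as follows, building on the cost from the question (the 9.0 weight is a hypothetical choice matching the 90/10 split; pred, y, and learning_rate are as in the question):

import tensorflow as tf

# Weight per class: up-weight errors on the rare class (the 1's).
class_weights = tf.constant([1.0, 9.0])
# Per-example cross-entropy (no reduction yet).
per_example_loss = tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y)
# Look up each example's weight from its one-hot label, then take the weighted mean.
example_weights = tf.gather(class_weights, tf.argmax(y, 1))
cost = tf.reduce_mean(example_weights * per_example_loss)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)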

answered Oct 02 '22 by Young