I am trying to apply deep learning to a binary classification problem with high class imbalance between the target classes (500k vs. 31k examples). I want to write a custom loss function along the lines of: minimize(100 - ((predicted_smallerclass / total_smallerclass) * 100))
Appreciate any pointers on how I can build this logic.
A Loss Function Suitable for Class Imbalanced Data: “Focal Loss”
A widely adopted technique for dealing with highly unbalanced datasets is called resampling. It consists of removing samples from the majority class (under-sampling) and/or adding more examples from the minority class (over-sampling).
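As a minimal illustration (a sketch in plain numpy; the helper oversample_minority is my own name, not a library function), random over-sampling of the minority class could look like this:

import numpy as np

def oversample_minority(X, y, seed=0):
    # Randomly repeat minority-class rows until both classes have equal counts.
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    minority_idx = np.where(y == minority)[0]
    n_extra = counts.max() - counts.min()
    extra = rng.choice(minority_idx, size=n_extra, replace=True)
    idx = np.concatenate([np.arange(len(y)), extra])
    rng.shuffle(idx)
    return X[idx], y[idx]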
For binary classification, focal loss can be interpreted as a binary cross-entropy function multiplied by a modulating factor (1 - pₜ)^γ, which reduces the contribution of easy-to-classify samples. The weighting factor αₜ balances the modulating factor.
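For reference, here is a minimal TensorFlow sketch of binary focal loss (the function name binary_focal_loss is illustrative; gamma=2.0 and alpha=0.25 follow the defaults suggested in the focal loss paper):

import tensorflow as tf

def binary_focal_loss(labels, logits, gamma=2.0, alpha=0.25):
    # labels: 0/1 tensor, shape [batch_size]; logits: raw scores, shape [batch_size]
    labels = tf.cast(labels, logits.dtype)
    bce = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
    p = tf.sigmoid(logits)
    p_t = labels * p + (1.0 - labels) * (1.0 - p)               # probability of the true class
    alpha_t = labels * alpha + (1.0 - labels) * (1.0 - alpha)   # class-balancing factor
    modulating = tf.pow(1.0 - p_t, gamma)                       # down-weights easy examples
    return tf.reduce_mean(alpha_t * modulating * bce)

With gamma=0.0 and alpha=0.5 this reduces to ordinary binary cross-entropy (up to a constant factor of 0.5).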
You can add class weights to the loss function by multiplying the logits. Regular cross-entropy loss looks like this:
loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j]))) = -x[class] + log(\sum_j exp(x[j]))
In the weighted case it becomes:
loss(x, class) = weights[class] * -x[class] + log(\sum_j exp(weights[class] * x[j]))
So by multiplying the logits, you re-scale each class's predictions by its class weight.
For example:
ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([ratio, 1.0 - ratio])
logits = ...   # shape [batch_size, 2]
labels = ...   # one-hot labels, shape [batch_size, 2]
weighted_logits = tf.multiply(logits, class_weight)   # shape [batch_size, 2]
xent = tf.nn.softmax_cross_entropy_with_logits(
    labels=labels, logits=weighted_logits, name="xent_raw")
There is now a standard loss function that supports a per-example weight within each batch:
tf.losses.sparse_softmax_cross_entropy(labels=label, logits=logits, weights=weights)
Here weights should be transformed from class weights into a per-example weight (with shape [batch_size]); see the TensorFlow documentation for details.
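For example, a sketch of that transformation using tf.gather (the class weight values below are illustrative, roughly the inverse class frequencies from the question, assuming class 1 is the minority class):

class_weights = tf.constant([1.0, 500.0 / 31.0])   # illustrative: up-weight the minority class
labels = ...    # int class indices, shape [batch_size]
logits = ...    # shape [batch_size, 2]
weights = tf.gather(class_weights, labels)          # per-example weight, shape [batch_size]
loss = tf.losses.sparse_softmax_cross_entropy(
    labels=labels, logits=logits, weights=weights)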
The code you proposed seems wrong to me. I agree that the loss should be multiplied by the weight, but if you multiply the logits by the class weights, you end up with:
weights[class] * -x[class] + log( \sum_j exp(x[j] * weights[class]) )
The second term is not equal to:
weights[class] * log(\sum_j exp(x[j]))
To see this, we can rewrite the latter as:
log( (\sum_j exp(x[j])) ^ weights[class] )
So here is the code I'm proposing:
ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([[ratio, 1.0 - ratio]])
logits = ...   # shape [batch_size, 2]
labels = ...   # one-hot labels (float), shape [batch_size, 2]
# weight for each datapoint, depending on its label
weight_per_label = tf.transpose(tf.matmul(labels, tf.transpose(class_weight)))   # shape [1, batch_size]
xent = tf.multiply(weight_per_label,
                   tf.nn.softmax_cross_entropy_with_logits(
                       labels=labels, logits=logits, name="xent_raw"))   # shape [1, batch_size]
loss = tf.reduce_mean(xent)   # scalar