I am implementing a CNN for a highly unbalanced classification problem and I would like to implement custom metrics in TensorFlow to use with the Select Best Model callback. Specifically, I would like to implement the balanced accuracy score, which is the average of the recall of each class (see sklearn implementation here). Does someone know how to do it?
Balanced accuracy is a metric we can use to assess the performance of a classification model. It is calculated as: Balanced accuracy = (Sensitivity + Specificity) / 2, where Sensitivity is the "true positive rate" (the percentage of positive cases the model is able to detect) and Specificity is the "true negative rate" (the percentage of negative cases the model is able to detect).
Balanced Accuracy = (Sensitivity + Specificity) / 2. F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
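As a quick illustration of the two formulas above, here is a minimal pure-Python sketch for the binary case. The function names and the confusion-matrix counts are made up for the example; they do not come from any library.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that were detected."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Average of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * (precision * recall) / (precision + recall)


# Hypothetical counts for an unbalanced problem: 10 positives, 990 negatives.
tp, fn, tn, fp = 5, 5, 980, 10

plain_accuracy = (tp + tn) / (tp + fn + tn + fp)
print("accuracy:", plain_accuracy)
print("balanced accuracy:", balanced_accuracy(tp, fn, tn, fp))
print("F1:", f1_score(tp, fp, fn))
```

With these counts the plain accuracy looks high because the negatives dominate, while the balanced accuracy and F1 score are pulled down by the poorly detected positive class.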
Class Accuracy. Defined in tensorflow/python/keras/metrics.py. Calculates how often predictions match labels. For example, if y_true is [1, 2, 3, 4] and y_pred is [0, 2, 3, 4], then the accuracy is 3/4, or 0.75.
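The example in that description can be reproduced in a few lines of plain Python. This is only a sketch of the definition, not the Keras implementation; the `accuracy` helper and the unbalanced labels are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# The example from the description: three of the four predictions match.
print(accuracy([1, 2, 3, 4], [0, 2, 3, 4]))

# Hypothetical unbalanced data: always predicting the majority class
# still yields a high plain accuracy.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
print(accuracy(y_true, y_pred))
```

The second call shows why plain accuracy is a weak model-selection criterion on unbalanced data: the minority class is never detected, yet the score stays high.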
I was facing the same issue, so I implemented a custom class based on SparseCategoricalAccuracy:
    import tensorflow as tf
    from tensorflow import keras

    class BalancedSparseCategoricalAccuracy(keras.metrics.SparseCategoricalAccuracy):
        def __init__(self, name='balanced_sparse_categorical_accuracy', dtype=None):
            super().__init__(name, dtype=dtype)

        def update_state(self, y_true, y_pred, sample_weight=None):
            # Drop the trailing axis of the labels when they have the same rank as y_pred.
            y_flat = y_true
            if y_true.shape.ndims == y_pred.shape.ndims:
                y_flat = tf.squeeze(y_flat, axis=[-1])
            y_true_int = tf.cast(y_flat, tf.int32)

            # Weight each sample by 1 / (number of samples of its class in this batch).
            cls_counts = tf.math.bincount(y_true_int)
            cls_counts = tf.math.reciprocal_no_nan(tf.cast(cls_counts, self.dtype))
            weight = tf.gather(cls_counts, y_true_int)
            # The incoming sample_weight is ignored and replaced by the class-balancing weights.
            return super().update_state(y_true, y_pred, sample_weight=weight)
The idea is to set each class weight inversely proportional to its size.
This code produces some warnings from AutoGraph, but I believe those are AutoGraph bugs, and the metric seems to work fine.
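To see why the inverse class size weighting gives the balanced accuracy, here is a small pure-Python check, independent of TensorFlow. It assumes the weighted accuracy is computed as the sum of the weights of the correct samples divided by the sum of all weights; the helper names and the data are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(y_true):
    """Weight each sample by 1 / (number of samples sharing its label)."""
    counts = Counter(y_true)
    return [1.0 / counts[t] for t in y_true]


def weighted_accuracy(y_true, y_pred, weights):
    """Sum of weights of the correct predictions over the sum of all weights."""
    hits = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return hits / sum(weights)


def mean_recall(y_true, y_pred):
    """Balanced accuracy: average of the per-class recalls."""
    counts = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / n for c, n in counts.items()) / len(counts)


# Hypothetical batch: 8 samples of class 0, 2 samples of class 1.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

weights = inverse_frequency_weights(y_true)
print(weighted_accuracy(y_true, y_pred, weights))
print(mean_recall(y_true, y_pred))
```

Both values agree on a single batch. Note that the class above counts the classes per batch, so over many batches the accumulated value is presumably an approximation of the balanced accuracy on the full dataset, in particular when some batches contain no samples of a minority class.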
There are 3 ways I can think of to tackle the situation:
1) Random under-sampling: randomly remove samples from the majority classes.
2) Random over-sampling: increase the number of samples in the minority classes by replicating them.
3) Weighted cross entropy: weight the loss so that the minority classes are compensated for. See here
I have personally tried method 2 and it did increase my accuracy significantly, but the effect may vary from dataset to dataset.
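Methods 2 and 3 could be sketched roughly as follows in plain Python. This is an illustrative sketch, not the code used for the results mentioned above: the function names are hypothetical, and the class-weight formula n_samples / (n_classes * class_count) is one common choice that is assumed here.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Replicate random samples of the smaller classes until every class
    has as many samples as the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())

    pairs = []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        pairs.extend((x, y) for x in xs + extra)
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def inverse_frequency_class_weights(labels):
    """Class weights inversely proportional to class frequency
    (assumed formula: n_samples / (n_classes * class_count))."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


# Hypothetical dataset: sample ids 0..9, the last two belong to class 1.
samples = list(range(10))
labels = [0] * 8 + [1] * 2

new_samples, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))
print(inverse_frequency_class_weights(labels))
```

Oversampling should normally be applied to the training split only, so that replicated samples do not leak into validation data. A dictionary like the one returned by `inverse_frequency_class_weights` has the shape expected by the `class_weight` argument of Keras `Model.fit`, which is one way to get a weighted cross entropy.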