I am trying to implement a classification problem with three classes, 'A', 'B' and 'C', where I would like to incorporate a penalty for different types of misclassification into my model's loss function (similar to weighted cross entropy). Class weights are not suited, since a class weight applies to all data belonging to that class. E.g. the true label 'B' being misclassified as 'C' should incur a higher loss than being misclassified as 'A'. The weight table is as follows (rows are the true class, columns the predicted class):
      A    B    C
A     1    1    1
B     1    1    1.2
C     1    1    1
With the current categorical_crossentropy loss, for true class 'B', the two prediction softmax outputs
0.5 0.4 0.1 vs. 0.1 0.4 0.5
give the same categorical_crossentropy, since only the probability assigned to the true class matters. It makes no difference whether 'B' is misclassified as 'A' or as 'C'. I want the second prediction to incur a higher loss than the first.
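For example, here is a plain-numpy sketch of the behaviour I'm after (the `weighted_cce` helper is just illustrative, not existing Keras API):

```python
import numpy as np

# Cost matrix from the question: misclassifying B as C costs 1.2x.
cost_mat = np.array([[1.0, 1.0, 1.0],
                     [1.0, 1.0, 1.2],
                     [1.0, 1.0, 1.0]])

y_true = np.array([0.0, 1.0, 0.0])   # true class: B
pred_1 = np.array([0.5, 0.4, 0.1])   # leaning towards A
pred_2 = np.array([0.1, 0.4, 0.5])   # leaning towards C

def weighted_cce(y_true, y_pred, cost_mat):
    # Plain categorical crossentropy: -log(probability of the true class)...
    cce = -np.sum(y_true * np.log(y_pred))
    # ...scaled by the cost of the (true class, predicted class) pair.
    weight = cost_mat[np.argmax(y_true), np.argmax(y_pred)]
    return weight * cce

print(weighted_cce(y_true, pred_1, cost_mat))  # ~0.916 (plain CCE, weight 1.0)
print(weighted_cce(y_true, pred_2, cost_mat))  # ~1.100 (same CCE scaled by 1.2)
```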
I have tried https://github.com/keras-team/keras/issues/2115, but none of the code there works with Keras v2. Any help on how to directly enforce the weight matrix in a Keras loss function would be highly appreciated.
Building on issue #2115, I've coded the following solution and posted it there too. I only tested it in TensorFlow 1.14, but I expect it should work with Keras v2 as well. Adding to the class-based solution in #2115 (comment), here's a more robust and vectorized implementation:
import tensorflow.keras.backend as K
from tensorflow.keras.losses import CategoricalCrossentropy


class WeightedCategoricalCrossentropy(CategoricalCrossentropy):

    def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):
        assert cost_mat.ndim == 2
        assert cost_mat.shape[0] == cost_mat.shape[1]
        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def __call__(self, y_true, y_pred, sample_weight=None):
        # The per-sample weight is derived from the cost matrix, so an
        # externally supplied sample_weight is not supported here.
        assert sample_weight is None, 'sample_weight is derived from the cost matrix'
        return super().__call__(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )
def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    y_pred.shape[1].assert_is_compatible_with(num_classes)
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    # Replace the softmax output with a one-hot vector of the predicted class.
    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    # Broadcast (1, k, k) * (n, k, 1) * (n, 1, k) and sum out both class axes,
    # which picks out cost_m[true_class, predicted_class] for each sample.
    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n
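To see what the broadcasting in get_sample_weights computes, here is a hypothetical numpy re-derivation (the data below is mine, reusing the question's cost matrix): the resulting weight for each sample is simply cost_m[true_class, predicted_class].

```python
import numpy as np

cost_m = np.array([[1.0, 1.0, 1.0],
                   [1.0, 1.0, 1.2],
                   [1.0, 1.0, 1.0]])

y_true = np.array([[0, 1, 0], [0, 1, 0]], dtype=float)        # both samples: class B
y_pred = np.array([[0.5, 0.4, 0.1], [0.1, 0.4, 0.5]], dtype=float)

# One-hot encode the predicted class, as K.one_hot(K.argmax(y_pred), ...) does.
y_pred_oh = np.eye(3)[y_pred.argmax(axis=1)]

# Broadcast (1, k, k) * (n, k, 1) * (n, 1, k) -> (n, k, k), sum out both class axes.
w = (cost_m[None, :, :] * y_true[:, :, None] * y_pred_oh[:, None, :]).sum(axis=(1, 2))

print(w)  # [1.  1.2] -- i.e. cost_m[y_true.argmax(1), y_pred.argmax(1)]
```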
Usage:
model.compile(loss=WeightedCategoricalCrossentropy(cost_matrix), ...)
Similarly, this can be applied to the CategoricalAccuracy metric too:
from tensorflow.keras.metrics import CategoricalAccuracy


class WeightedCategoricalAccuracy(CategoricalAccuracy):

    def __init__(self, cost_mat, name='weighted_categorical_accuracy', **kwargs):
        assert cost_mat.ndim == 2
        assert cost_mat.shape[0] == cost_mat.shape[1]
        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Ignore any externally supplied sample_weight; derive it from the cost matrix.
        return super().update_state(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )
Usage:
model.compile(metrics=[WeightedCategoricalAccuracy(cost_matrix), ...], ...)