Keras supports the class_weight feature, which allows giving different classes different weights - for example, when the number of samples per class is imbalanced.
I want to do something similar, but with dynamic weights based on the class imbalance in each batch.
Is this possible?
As mentioned in the official Keras docs, class_weight is an optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
We will search for weights between 0 and 1. The idea is: if we give the minority class a weight of n, the majority class gets 1 - n. The magnitude of the weights stays small, but the ratio between the minority and majority weights can still be made very high.
The class_weight argument is a dictionary that maps each class label (e.g. 0 and 1) to the weighting applied in the calculation of the negative log likelihood when fitting the model.
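For example, a minimal sketch of the static version (assuming a compiled binary classifier model and a hand-picked minority weight n = 0.9, both hypothetical):

n = 0.9  # hypothetical weight for the minority class (label 1)
class_weight = {0: 1 - n, 1: n}  # majority class gets 1 - n
model.fit(x_train, y_train, epochs=10, class_weight=class_weight)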
Option 1:
Make a manual loop over epochs and batches and use the method train_on_batch, which also accepts class_weight:
for epoch in range(epochs):
    for batchX, batchY in batches:  # adapt this loop to your way of creating/getting batches
        weights = calculateOrGetTheWeights(batchY)  # weights computed from this batch's labels
        model.train_on_batch(batchX, batchY, class_weight=weights)
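calculateOrGetTheWeights above is a placeholder; one possible sketch, assuming one-hot labels of shape (samples, classes) and inverse-frequency weighting:

import numpy as np

def calculateOrGetTheWeights(batchY):
    counts = np.maximum(batchY.sum(axis=0), 1)       # samples per class, floored to avoid division by zero
    weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weighting
    return {i: w for i, w in enumerate(weights)}     # class_weight expects a dict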
Option 2:
Create a custom loss. This may be trickier, and depends on the data format, the number of classes, the type of loss function, etc.
Assuming 2D data (samples, classes) and a multiclass problem:
import keras.backend as K

def customLoss(yTrue, yPred):
    classes = K.argmax(yTrue, axis=-1)           # integer class index of each sample
    classCount = K.sum(yTrue, axis=0)            # samples per class in this batch
    loss = K.some_loss_function(yTrue, yPred)    # placeholder: use your actual per-sample loss here
    return loss / K.gather(classCount, classes)  # scale each sample's loss by 1 / (its class count)
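This loss is then passed to compile like any built-in loss, e.g. (assuming an already built model):

model.compile(optimizer='adam', loss=customLoss)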
Assuming binary classification (a single output unit) with 1D or 2D data:
import keras.backend as K

def binaryCustomLoss(yTrue, yPred):
    positives = yTrue                  # 1 where the sample is positive
    negatives = 1 - yTrue              # 1 where the sample is negative
    positiveRatio = K.mean(positives)  # fraction of positives in this batch
    negativeRatio = 1 - positiveRatio  # or K.mean(negatives)
    weights = (positives / positiveRatio) + (negatives / negativeRatio)
    # you may need K.squeeze(weights) here
    return weights * K.some_loss_function(yTrue, yPred)  # placeholder: use your actual per-sample loss here
Warning: both loss functions will return NaN (or infinity) if any class count in the batch is zero.
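One way to guard against that is to clip the denominator away from zero with K.epsilon(); a sketch for the multiclass version:

def safeCustomLoss(yTrue, yPred):
    classes = K.argmax(yTrue, axis=-1)
    classCount = K.maximum(K.sum(yTrue, axis=0), K.epsilon())  # floor counts so the division stays finite
    loss = K.some_loss_function(yTrue, yPred)  # placeholder: use your actual per-sample loss here
    return loss / K.gather(classCount, classes)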
Another option is, instead of using class_weight, to use sample weights.
If you want your sample weights to be dynamic, you'll need to use fit_generator instead of fit, so you can change the weights on the run.
So, in pseudo code:
def gen(x, y):
    while True:
        for x_batch, y_batch in make_batches(x, y):
            weights = make_weights(y_batch)  # per-sample weights for this batch
            yield x_batch, y_batch, weights  # the third element is treated as sample_weight

model.fit_generator(gen(x_train, y_train))
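make_weights is a placeholder; one possible sketch, assuming binary 0/1 labels and mirroring the inverse-frequency idea per batch:

import numpy as np

def make_weights(y_batch):
    positive_ratio = np.clip(y_batch.mean(), 1e-7, 1 - 1e-7)  # fraction of positives, clipped to avoid division by zero
    # rare labels get proportionally larger weights
    return np.where(y_batch == 1, 1 / positive_ratio, 1 / (1 - positive_ratio))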
In this code, make_weights should return an array with the same length as y_batch. Each element is the weight to be applied to the respective sample.
If you're unsure whether class_weight and sample weights behave the same, notice how Keras standardizes class weights: class weights are actually translated into sample weights in the end :)