I'd like to use the class_weight argument of Keras model.fit to handle imbalanced training data. From the documentation, I understand that we can pass a dictionary like this:
class_weight = {0 : 1, 1: 1, 2: 5}
(In this example, class-2 will get higher penalty in the loss function.)
The problem is that my network's output is one-hot encoded, i.e. class-0 = (1, 0, 0), class-1 = (0, 1, 0), and class-2 = (0, 0, 1).
How can we use the class_weight for one-hot encoded output?
Looking at the Keras code, it seems that _feed_output_names contains a list of output classes, but in my case model.output_names / model._feed_output_names returns ['dense_1'].
Related: How to set class weights for imbalanced classes in Keras?
sample_weight is used to provide a weight for each training sample. That means you should pass a 1D array with the same number of elements as your training samples, giving the weight of each sample. class_weight is used to provide a weight for each output class.
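To make the relationship between the two arguments concrete, here is a minimal sketch that expands a per-class dictionary into a per-sample weight array for one-hot labels. The label array and the variable names (y_onehot, weight_lookup) are made up for illustration; only NumPy is used.

```python
import numpy as np

# Hypothetical one-hot labels for 5 samples and 3 classes.
y_onehot = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
])

# Class weights from the question: class 2 is penalized five times more.
class_weight = {0: 1.0, 1: 1.0, 2: 5.0}

# Recover integer class ids from the one-hot rows.
y_integers = np.argmax(y_onehot, axis=1)

# Build a lookup array indexed by class id, then pick one weight per sample.
weight_lookup = np.array([class_weight[c] for c in range(y_onehot.shape[1])])
sample_weight = weight_lookup[y_integers]

print(sample_weight)  # one weight per training sample
```

An array built this way has one entry per training sample, which is the shape that the sample_weight argument described above expects.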
Generating class weights
In binary classification, class weights can be obtained by calculating the frequency of the positive and the negative class and then inverting it, so that, when multiplied with the per-class loss, errors on the underrepresented class count for much more than errors on the majority class.
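The inverse-frequency idea can be sketched in a few lines of NumPy. The labels and the constant predicted probability below are invented for illustration, and the weighted binary cross-entropy is written out by hand rather than taken from any library.

```python
import numpy as np

# Hypothetical binary labels: 8 negatives, 2 positives.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# Hypothetical predicted probability of the positive class for every sample.
p = np.full(y.shape, 0.3)

# Frequency of each class, then its inverse as the class weight.
freq = np.bincount(y) / y.size   # fraction of samples in class 0 and class 1
class_weight = 1.0 / freq        # the rarer class gets the larger weight

# Per-sample binary cross-entropy, scaled by the weight of the true class.
bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
weighted_bce = class_weight[y] * bce

print(class_weight)
print(weighted_bce)
```

With these made-up numbers, a misclassified positive sample contributes far more to the total loss than a negative one, which is the effect the paragraph above describes.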
scikit-learn's LogisticRegression class provides the class_weight argument, which can be specified as a model hyperparameter. class_weight is a dictionary that maps each class label (e.g. 0 and 1) to the weighting applied in the calculation of the negative log likelihood when fitting the model.
Balanced class weights can also be calculated automatically. In scikit-learn, set class_weight = 'balanced' to adjust weights inversely proportional to class frequencies in the input data.
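For reference, the scikit-learn documentation gives the 'balanced' heuristic as n_samples / (n_classes * np.bincount(y)). The sketch below reproduces that formula with NumPy alone. The label array is hypothetical, and the code assumes integer labels 0..K-1 with every class present at least once (otherwise the division would hit a zero count).

```python
import numpy as np

# Hypothetical integer labels: 6 samples of class 0, 3 of class 1, 1 of class 2.
y_integers = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])

counts = np.bincount(y_integers)   # number of samples per class
n_samples = y_integers.size
n_classes = counts.size

# 'balanced' heuristic: weights inversely proportional to class frequency.
balanced = n_samples / (n_classes * counts)

# Same dictionary layout as the class_weight example in the question.
d_class_weights = dict(enumerate(balanced))
print(d_class_weights)
```

A useful property of this scaling is that the weights, multiplied by the class counts, sum back to the number of samples, so the overall magnitude of the loss stays roughly comparable to the unweighted case.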
Here's a solution that's a bit shorter and faster. If your one-hot encoded y is a np.array:
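import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Convert the one-hot rows back to integer class labels.
y_integers = np.argmax(y, axis=1)
# Recent scikit-learn versions require keyword arguments here.
class_weights = compute_class_weight(class_weight='balanced', classes=np.unique(y_integers), y=y_integers)
# Map class index to weight; assumes the labels are 0..K-1 and all present in y.
d_class_weights = dict(enumerate(class_weights))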
d_class_weights can then be passed to class_weight in .fit.