I have a dataset of images that has the following distribution:
I think I need to add class weights to make up for the small number of images in classes 1, 2, 3 and 4.
I have tried calculating the class weights by dividing the image count of class 0 by that of class 1, the count of class 0 by that of class 2, and so forth.
I'm assuming that class 0 gets a weight of 1, as it doesn't need to be scaled? Not sure if that is correct though.
class_weights = np.array([1, 10.5, 4.9, 29.4, 36.75])
and added them to my fit function:
model.fit(x_train, y_train, batch_size=batch_size, class_weight=class_weights, epochs=epochs, validation_data=(x_test, y_test))
I'm unsure whether I have calculated the weights correctly, and whether this is even how it is supposed to be done.
Hopefully someone can help clarify it.
First of all, make sure to pass a dictionary, since the class_weight parameter of fit takes a dictionary that maps integer class indices to weights, not a NumPy array.
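As a minimal sketch of that dictionary form, assuming the five weights from the question are listed in class-index order (class 0 first), the array can be converted like this. The variable names are only illustrative.

```python
# Weights from the question, in class-index order (class 0 first).
weights = [1, 10.5, 4.9, 29.4, 36.75]

# Build a dict mapping each integer class index to its weight.
class_weight = {index: float(w) for index, w in enumerate(weights)}

print(class_weight)
# The dict would then replace the array in the call from the question:
# model.fit(..., class_weight=class_weight, ...)
```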
Second, the point of weighting the classes is as follows. Let's say that you have a binary classification problem where class_1 has 1000 instances and class_2 has 100 instances. Since you want to make up for the imbalanced data, you can set the weights as:
class_weights = {0: 1, 1: 10}  # keys are integer class indices: 0 -> class_1, 1 -> class_2
In other words, if the model makes a mistake on a sample whose true label is class_2, it is penalized 10 times more than for a mistake on a sample whose true label is class_1. You want something like this because, given the class distribution in the data, the model has an inherent tendency to favor class_1, since that class is overrepresented. By setting the class weights you impose an implicit constraint on the model: 10 wrong predictions on instances of class_1 are as bad as 1 wrong prediction on an instance of class_2.
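The binary example above can be worked through in a few lines. This is only a sketch of the ratio approach used in the question (largest class count divided by each class count); ratio_class_weights is a hypothetical helper name, and the counts are the 1000 and 100 from the example.

```python
def ratio_class_weights(counts):
    """Weight each class by (largest class count) / (its own count)."""
    largest = max(counts.values())
    return {label: largest / n for label, n in counts.items()}

# Counts from the example: index 0 is class_1, index 1 is class_2.
counts = {0: 1000, 1: 100}
class_weight = ratio_class_weights(counts)

# With these weights, both classes carry the same total weight.
total_per_class = {label: counts[label] * class_weight[label] for label in counts}
print(class_weight, total_per_class)
```

The most frequent class always ends up with a weight of 1, which matches the assumption in the question that class 0 does not need to be scaled.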
With that said, you can set the class weights however you want; there is no single right or wrong way to do it. The way you set the weights seems reasonable to me.
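As an illustration that other schemes are possible, one common alternative is the "balanced" heuristic that scikit-learn documents for class_weight="balanced": n_samples / (n_classes * count). The sketch below applies it to the 1000 and 100 counts from the example. balanced_class_weights is a hypothetical helper name, not a library function.

```python
def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}

# Counts from the example: index 0 is class_1, index 1 is class_2.
counts = {0: 1000, 1: 100}
balanced = balanced_class_weights(counts)

# The ratio between the two weights is still 10, as in the simple ratio scheme;
# only the overall scale of the weights differs.
ratio = balanced[1] / balanced[0]
print(balanced, ratio)
```

Because only the overall scale changes, the relative penalty between classes is the same as with the ratio scheme, although the scale can still interact with the learning rate.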
Please visit this answer for a proper solution https://datascience.stackexchange.com/a/18722
I understand that you are trying to set class weights, but also consider image augmentation to generate more images for the underrepresented classes.