I am interested in how sklearn applies the class weights we supply. The documentation doesn't state explicitly where and how the class weights are applied, and reading the source code doesn't help either (it seems sklearn.svm.liblinear is used for the optimization, and I can't read the source since it is a .pyd file...).
But my guess is that it works on the cost function: when class weights are specified, the cost contribution of each class is multiplied by its class weight. For example, if I have 2 observations, one from class 0 (weight=0.5) and one from class 1 (weight=1), then the cost function would be:
Cost = 0.5*log(...X_0,y_0...) + 1*log(...X_1,y_1...) + penalization
Does anyone know whether this is correct?
Logistic Regression (manual class weights): the idea is that if we give n as the weight for the minority class, the majority class gets 1 - n as its weight. The magnitudes of the weights stay small, but the ratio of the minority weight to the majority weight becomes very high.
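As a minimal sketch (the imbalanced toy dataset and the value of n are illustrative, not from the original post), this is how such a weight pair can be passed to LogisticRegression:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy imbalanced data: class 1 is the minority (~10% of samples)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

n = 0.9  # hypothetical choice; in practice tune n with cross-validation
clf = LogisticRegression(class_weight={0: 1 - n, 1: n})
clf.fit(X, y)

A common way to pick n is a grid search over candidate values, scored with a metric such as F1 that is sensitive to the minority class.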
Logistic regression computes the probability of an event occurring. It is a generalized linear model for the case where the target variable is categorical: it uses the log of the odds as the dependent variable, and it predicts the probability of a binary event through the logistic (sigmoid) function.
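Concretely (a small sketch with made-up coefficient values), the model computes a linear score and maps it to a probability with the sigmoid, so the log-odds of the prediction are linear in the inputs:

import numpy as np

w, b = np.array([1.5, -2.0]), 0.3  # hypothetical coefficients
x = np.array([0.4, 0.1])

z = w @ x + b                  # linear score, i.e. the log-odds
p = 1.0 / (1.0 + np.exp(-z))   # sigmoid maps log-odds to a probability
print(p, np.log(p / (1 - p)))  # the second value recovers z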
In logistic regression, another technique that comes in handy for imbalanced distributions is to set class weights in accordance with the class distribution. A class weight is the extent to which the algorithm is penalized for a wrong prediction on that class.
Class weights can give all the classes equal importance in gradient updates on average, regardless of how many samples we have from each class in the training data. This prevents the model from predicting the more frequent class more often just because it's more common; sklearn's 'balanced' mode computes such weights automatically.
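For instance (toy labels of my own; the formula is the one sklearn documents for 'balanced'), each class gets the weight n_samples / (n_classes * n_samples_in_class):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 8 + [1] * 2)  # 8 majority vs. 2 minority samples
weights = compute_class_weight(class_weight='balanced',
                               classes=np.array([0, 1]), y=y)
print(weights)  # [0.625 2.5], i.e. 10/(2*8) and 10/(2*2)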
Check the following lines in the source code:
le = LabelEncoder()
if isinstance(class_weight, dict) or multi_class == 'multinomial':
    class_weight_ = compute_class_weight(class_weight, classes, y)
    sample_weight *= class_weight_[le.fit_transform(y)]
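The fancy indexing in the last line expands the per-class weights into per-sample weights. A small sketch with made-up labels and weights:

import numpy as np
from sklearn.preprocessing import LabelEncoder

y = np.array(['a', 'b', 'a', 'a'])
class_weight_ = np.array([0.5, 2.0])  # one weight per class, in label order
le = LabelEncoder()
sample_weight = np.ones(len(y))
sample_weight *= class_weight_[le.fit_transform(y)]  # index by encoded label
print(sample_weight)  # [0.5 2.  0.5 0.5]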
Here is the source code for the compute_class_weight() function:
...
else:
    # user-defined dictionary
    weight = np.ones(classes.shape[0], dtype=np.float64, order='C')
    if not isinstance(class_weight, dict):
        raise ValueError("class_weight must be dict, 'balanced', or None,"
                         " got: %r" % class_weight)
    for c in class_weight:
        i = np.searchsorted(classes, c)
        if i >= len(classes) or classes[i] != c:
            raise ValueError("Class label {} not present.".format(c))
        else:
            weight[i] = class_weight[c]
...
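So for a user-supplied dict, the function simply copies each value into the weight array at the position of its class. A quick check with toy values:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

print(compute_class_weight(class_weight={0: 0.5, 1: 1.0},
                           classes=np.array([0, 1]),
                           y=np.array([0, 0, 1])))  # [0.5 1. ]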
In the snippet above, class_weight is applied to sample_weight, which is used in a few internal functions like _logistic_loss_and_grad, _logistic_loss, etc.:
# Logistic loss is the negative of the log of the logistic function.
out = -np.sum(sample_weight * log_logistic(yz)) + .5 * alpha * np.dot(w, w)
# NOTE: ---> ^^^^^^^^^^^^^^^
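Since class_weight is simply folded into sample_weight, weighting by class should be equivalent to passing the corresponding per-sample weights to fit. A sanity check on toy data (the dataset and the weight values are mine, not from the post):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

a = LogisticRegression(class_weight={0: 0.5, 1: 1.0}).fit(X, y)
b = LogisticRegression().fit(X, y, sample_weight=np.where(y == 1, 1.0, 0.5))
print(np.allclose(a.coef_, b.coef_))  # True, up to solver tolerance

So, in line with the guess in the question, each observation's log-loss term is multiplied by the weight of its class before the penalty is added.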