 

How are class_weights being applied in sklearn logistic regression?

I am interested in how sklearn applies the class weights we supply. The documentation doesn't state explicitly where and how the class weights are applied, and reading the source code doesn't help either (it seems sklearn.svm.liblinear is used for the optimization, and I can't read the source since it is a .pyd file...)

But I guess it works on the cost function: when class weights are specified, the loss term of each observation is multiplied by the weight of its class. For example, if I have two observations, one from class 0 (weight = 0.5) and one from class 1 (weight = 1), then the cost function would be:

Cost = 0.5*log(...X_0,y_0...) + 1*log(...X_1,y_1...) + penalization
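In code, what I have in mind is something like the following minimal NumPy sketch (all numbers and the helper sample_loss are placeholders for illustration, not sklearn internals):

import numpy as np

def sample_loss(w, x, y):
    # negative log of the logistic function for one sample, with y in {-1, +1}
    return np.log(1.0 + np.exp(-y * np.dot(w, x)))

w = np.array([0.3, -0.2])              # some coefficient vector
x0, y0 = np.array([1.0, 2.0]), -1      # the class-0 sample (weight 0.5)
x1, y1 = np.array([0.5, 1.5]), +1      # the class-1 sample (weight 1.0)
alpha = 1.0                            # regularization strength

cost = (0.5 * sample_loss(w, x0, y0)
        + 1.0 * sample_loss(w, x1, y1)
        + 0.5 * alpha * np.dot(w, w))  # penalization term
print(cost)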

Does anyone know whether this is correct?

asked May 20 '18 by lizardfireman

People also ask

How do you use class weight in logistic regression?

Logistic Regression (manual class weights): The idea is that if we give a weight of n to the minority class, the majority class gets a weight of 1-n. The magnitudes of the weights are not very large, but the ratio between the minority- and majority-class weights can be very high.
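For instance, a manual weighting like the one described can be passed directly as a dict; the 0.1/0.9 split and the toy data below are just an assumed illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = np.array([0] * 90 + [1] * 10)              # 10% minority class

# n = 0.9 for the minority class, 1 - n = 0.1 for the majority class
clf = LogisticRegression(class_weight={0: 0.1, 1: 0.9})
clf.fit(X, y)
print(clf.coef_, clf.intercept_)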

How does Sklearn logistic regression work?

It computes the probability of an event occurring. It is closely related to linear regression, but the target variable is categorical and the model uses the log of the odds as the dependent variable: Logistic Regression predicts the probability of a binary outcome by passing a linear combination of the features through the logistic (sigmoid) function.
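A minimal sketch of that relationship, on made-up toy data: the predicted probability is the sigmoid of the linear score returned by decision_function.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)

z = clf.decision_function(X)                   # log-odds: X @ coef_.T + intercept_
print(1.0 / (1.0 + np.exp(-z)))                # sigmoid of the log-odds
print(clf.predict_proba(X)[:, 1])              # same values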

Can logistic regression be used for an imbalanced classification problem?

In logistic regression, another technique that comes in handy for working with an imbalanced distribution is to use class weights in accordance with the class distribution. A class weight is the extent to which the algorithm is penalized for a wrong prediction on that class.
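A small sketch of that idea, assuming the usual 'balanced' heuristic (n_samples / (n_classes * bincount(y))) and a made-up 90/10 split:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = np.r_[rng.randn(90, 1) - 1, rng.randn(10, 1) + 1]
y = np.array([0] * 90 + [1] * 10)

clf = LogisticRegression(class_weight='balanced').fit(X, y)

# weights actually used: n_samples / (n_classes * bincount(y))
print(len(y) / (2 * np.bincount(y)))           # [0.555...  5.0]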

How do class weights work?

Balanced class weights give all the classes equal importance in gradient updates, on average, regardless of how many samples of each class appear in the training data. This prevents the model from predicting the more frequent class more often just because it is more common.
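One way to see the "equal importance" point, using sklearn's compute_class_weight utility on an assumed 90/10 split: after expanding the class weights to per-sample weights, each class contributes the same total weight.

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 90 + [1] * 10)
classes = np.array([0, 1])

w = compute_class_weight('balanced', classes=classes, y=y)
per_sample = w[y]                              # expand class weights to per-sample weights
print(per_sample[y == 0].sum(), per_sample[y == 1].sum())   # both equal 50.0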


1 Answer

Check the following lines in the source code:

le = LabelEncoder()
if isinstance(class_weight, dict) or multi_class == 'multinomial':
    class_weight_ = compute_class_weight(class_weight, classes, y)
    sample_weight *= class_weight_[le.fit_transform(y)]
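To see what those four lines do, here is a small standalone reproduction of the expansion step (the labels and the weight dict are made up; LabelEncoder and compute_class_weight are public sklearn utilities):

import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight

y = np.array(['cat', 'dog', 'dog', 'dog'])
classes = np.array(['cat', 'dog'])
sample_weight = np.ones(len(y))

le = LabelEncoder()
class_weight_ = compute_class_weight({'cat': 2.0, 'dog': 0.5}, classes=classes, y=y)
sample_weight *= class_weight_[le.fit_transform(y)]
print(sample_weight)                           # [2.  0.5 0.5 0.5]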

Here is the source code for the compute_class_weight() function:

...
else:
    # user-defined dictionary
    weight = np.ones(classes.shape[0], dtype=np.float64, order='C')
    if not isinstance(class_weight, dict):
        raise ValueError("class_weight must be dict, 'balanced', or None,"
                         " got: %r" % class_weight)
    for c in class_weight:
        i = np.searchsorted(classes, c)
        if i >= len(classes) or classes[i] != c:
            raise ValueError("Class label {} not present.".format(c))
        else:
            weight[i] = class_weight[c]
...
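A quick check of that dict branch (toy classes assumed): labels missing from the dict keep weight 1, and a key that is not a known class raises ValueError (the exact message depends on the sklearn version).

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.array([0, 1, 2])
y = np.array([0, 0, 1, 2])

print(compute_class_weight({0: 5.0}, classes=classes, y=y))   # [5. 1. 1.]

try:
    compute_class_weight({3: 2.0}, classes=classes, y=y)
except ValueError as exc:
    print(exc)                                 # unknown class label 3 is rejected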

Back in the first snippet, class_weight is applied to sample_weight, which is then used in a few internal functions like _logistic_loss_and_grad, _logistic_loss, etc.:

# Logistic loss is the negative of the log of the logistic function.
out = -np.sum(sample_weight * log_logistic(yz)) + .5 * alpha * np.dot(w, w)
# NOTE: --->  ^^^^^^^^^^^^^^^
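Finally, as a sanity check of the intuition from the question (made-up data; the equivalence holds up to solver tolerance): passing class_weight={0: 0.5, 1: 1.0} gives the same fit as passing the expanded per-sample weights directly.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y = (X[:, 0] + 0.5 * rng.randn(200) > 0.8).astype(int)   # imbalanced target

clf_class = LogisticRegression(class_weight={0: 0.5, 1: 1.0}).fit(X, y)
clf_sample = LogisticRegression().fit(X, y, sample_weight=np.where(y == 1, 1.0, 0.5))

print(np.allclose(clf_class.coef_, clf_sample.coef_))     # expected True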
answered Oct 18 '22 by MaxU - stop WAR against UA