I am interested in how sklearn applies the class weights we supply. The documentation doesn't state explicitly where and how the class weights are applied, and reading the source code doesn't help either (it seems sklearn.svm.liblinear is used for the optimization, and I can't read the source since it is a .pyd file...).
But my guess is that it works on the cost function: when class weights are specified, the cost contribution of each class is multiplied by its class weight. For example, if I have 2 observations, one from class 0 (weight=0.5) and one from class 1 (weight=1), then the cost function would be:
Cost = 0.5*log(...X_0,y_0...) + 1*log(...X_1,y_1...) + penalization
Does anyone know whether this is correct?
Logistic Regression (manual class weights): the idea is that if we give n as the weight for the minority class, the majority class gets 1 - n as its weight. The magnitudes of the weights stay small, but the ratio of the minority weight to the majority weight becomes very high.
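As a minimal sketch (the imbalanced toy dataset and the value of n are illustrative, not from the original post), this is how such a weight pair can be passed to LogisticRegression:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy imbalanced data: class 1 is the minority (~10% of samples)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

n = 0.9  # hypothetical choice; in practice tune n with cross-validation
clf = LogisticRegression(class_weight={0: 1 - n, 1: n})
clf.fit(X, y)

A common way to pick n is a grid search over candidate values, scored with a metric such as F1 that is sensitive to the minority class.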
Logistic regression computes the probability of an event occurring. It is a generalized linear model for the case where the target variable is categorical: it uses the log of the odds as the dependent variable, and it predicts the probability of a binary event through the logistic (sigmoid) function.
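Concretely (a small sketch with made-up coefficient values), the model computes a linear score and maps it to a probability with the sigmoid, so the log-odds of the prediction are linear in the inputs:

import numpy as np

w, b = np.array([1.5, -2.0]), 0.3  # hypothetical coefficients
x = np.array([0.4, 0.1])

z = w @ x + b                  # linear score, i.e. the log-odds
p = 1.0 / (1.0 + np.exp(-z))   # sigmoid maps log-odds to a probability
print(p, np.log(p / (1 - p)))  # the second value recovers z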
In logistic regression, another technique that comes in handy for imbalanced distributions is to set class weights in accordance with the class distribution. A class weight is the extent to which the algorithm is penalized for a wrong prediction on that class.
Class weights can give all the classes equal importance in gradient updates on average, regardless of how many samples we have from each class in the training data. This prevents the model from predicting the more frequent class more often just because it's more common; sklearn's 'balanced' mode computes such weights automatically.
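For instance (toy labels of my own; the formula is the one sklearn documents for 'balanced'), each class gets the weight n_samples / (n_classes * n_samples_in_class):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 8 + [1] * 2)  # 8 majority vs. 2 minority samples
weights = compute_class_weight(class_weight='balanced',
                               classes=np.array([0, 1]), y=y)
print(weights)  # [0.625 2.5], i.e. 10/(2*8) and 10/(2*2)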
Check the following lines in the source code:
le = LabelEncoder()
if isinstance(class_weight, dict) or multi_class == 'multinomial':
    class_weight_ = compute_class_weight(class_weight, classes, y)
    sample_weight *= class_weight_[le.fit_transform(y)]
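The fancy indexing in the last line expands the per-class weights into per-sample weights. A small sketch with made-up labels and weights:

import numpy as np
from sklearn.preprocessing import LabelEncoder

y = np.array(['a', 'b', 'a', 'a'])
class_weight_ = np.array([0.5, 2.0])  # one weight per class, in label order
le = LabelEncoder()
sample_weight = np.ones(len(y))
sample_weight *= class_weight_[le.fit_transform(y)]  # index by encoded label
print(sample_weight)  # [0.5 2.  0.5 0.5]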
Here is the source code for the compute_class_weight() function:
...
else:
    # user-defined dictionary
    weight = np.ones(classes.shape[0], dtype=np.float64, order='C')
    if not isinstance(class_weight, dict):
        raise ValueError("class_weight must be dict, 'balanced', or None,"
                         " got: %r" % class_weight)
    for c in class_weight:
        i = np.searchsorted(classes, c)
        if i >= len(classes) or classes[i] != c:
            raise ValueError("Class label {} not present.".format(c))
        else:
            weight[i] = class_weight[c]
...
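So for a user-supplied dict, the function simply copies each value into the weight array at the position of its class. A quick check with toy values:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

print(compute_class_weight(class_weight={0: 0.5, 1: 1.0},
                           classes=np.array([0, 1]),
                           y=np.array([0, 0, 1])))  # [0.5 1. ]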
In the snippet above, class_weight is applied to sample_weight, which is used in a few internal functions like _logistic_loss_and_grad, _logistic_loss, etc.:
# Logistic loss is the negative of the log of the logistic function.
out = -np.sum(sample_weight * log_logistic(yz)) + .5 * alpha * np.dot(w, w)
# NOTE: ---> ^^^^^^^^^^^^^^^
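Since class_weight is simply folded into sample_weight, weighting by class should be equivalent to passing the corresponding per-sample weights to fit. A sanity check on toy data (the dataset and the weight values are mine, not from the post):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

a = LogisticRegression(class_weight={0: 0.5, 1: 1.0}).fit(X, y)
b = LogisticRegression().fit(X, y, sample_weight=np.where(y == 1, 1.0, 0.5))
print(np.allclose(a.coef_, b.coef_))  # True, up to solver tolerance

So, in line with the guess in the question, each observation's log-loss term is multiplied by the weight of its class before the penalty is added.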