What is the difference between class_weight=None and class_weight='auto' in scikit-learn's SVM?

In the scikit-learn SVM classifier, what is the difference between class_weight=None and class_weight='auto'?

The documentation describes it as:

Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The ‘auto’ mode uses the values of y to automatically adjust weights inversely proportional to class frequencies.

class sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma=0.0, coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, random_state=None)

But what is the advantage of using 'auto' mode? I couldn't understand its implementation.

dangerous asked Feb 20 '15 04:02

1 Answer

This takes place in the class_weight.py file:

elif class_weight == 'auto':
    # Find the weight of each class as present in y.
    le = LabelEncoder()
    y_ind = le.fit_transform(y)
    if not all(np.in1d(classes, le.classes_)):
        raise ValueError("classes should have valid labels that are in y")

    # inversely proportional to the number of samples in the class
    recip_freq = 1. / bincount(y_ind)
    weight = recip_freq[le.transform(classes)] / np.mean(recip_freq)

This means that each class you have (in classes) gets a weight equal to 1 divided by the number of times that class appears in your data (y), so classes that appear more often will get lower weights. This is then further divided by the mean of all the inverse class frequencies.
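The arithmetic described above can be reproduced outside the library. The following is a minimal sketch of the same formula using plain NumPy, not the library code itself; the function name auto_class_weights and the 90/10 example labels are made up for illustration.

```python
import numpy as np

def auto_class_weights(y):
    """Sketch of the 'auto' formula: inverse class counts divided by their mean.

    Hypothetical helper for illustration only, not part of scikit-learn.
    """
    # Distinct labels and how often each one appears in y.
    classes, counts = np.unique(y, return_counts=True)
    # Inversely proportional to the number of samples in each class.
    recip_freq = 1.0 / counts
    # Divide by the mean inverse frequency, so the weights average to 1.
    weights = recip_freq / recip_freq.mean()
    return dict(zip(classes.tolist(), weights.tolist()))

# Assumed toy data: 90 samples of class 0 and 10 samples of class 1.
y = np.array([0] * 90 + [1] * 10)
weights = auto_class_weights(y)
print(weights)
```

For this toy data the frequent class gets a weight of about 0.2 and the rare class about 1.8. Because of the division by the mean, the weights of the classes present in y always average to 1.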

The advantage is that you no longer have to set the class weights yourself: when the classes are imbalanced, the automatic weights penalize mistakes on the rarer classes more heavily, which is a reasonable default for most applications.

If you look above in the source code, for None, weight is filled with ones, so each class gets equal weight.
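To connect this to the documentation quoted in the question ("Set the parameter C of class i to class_weight[i]*C"), here is a small sketch of the per-class penalty under each setting. The helper effective_C is hypothetical and only mirrors that sentence; it is not a scikit-learn function. The weights 0.2 and 1.8 are the values the 'auto' formula yields for a two-class 90/10 split.

```python
def effective_C(C, classes, class_weight=None):
    """Per-class penalty C_i = class_weight[i] * C.

    Hypothetical helper for illustration only, not part of scikit-learn.
    """
    if class_weight is None:
        # None: every class has weight one, so every class keeps the same C.
        class_weight = {c: 1.0 for c in classes}
    return {c: class_weight.get(c, 1.0) * C for c in classes}

classes = [0, 1]
uniform = effective_C(1.0, classes)                      # class_weight=None
weighted = effective_C(1.0, classes, {0: 0.2, 1: 1.8})   # 'auto'-style weights
print(uniform)
print(weighted)
```

With None, misclassifying a sample costs the same whichever class it belongs to; with the inverse-frequency weights, errors on the rare class are penalized more than errors on the frequent one.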

IVlad answered Sep 28 '22 07:09
