In the scikit-learn SVM classifier, what is the difference between class_weight=None and class_weight='auto'?
The documentation says:
Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The ‘auto’ mode uses the values of y to automatically adjust weights inversely proportional to class frequencies.
class sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma=0.0, coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, random_state=None)
But what is the advantage of using 'auto' mode? I couldn't understand its implementation.
This takes place in the class_weight.py file:
elif class_weight == 'auto':
    # Find the weight of each class as present in y.
    le = LabelEncoder()
    y_ind = le.fit_transform(y)
    if not all(np.in1d(classes, le.classes_)):
        raise ValueError("classes should have valid labels that are in y")
    # inversely proportional to the number of samples in the class
    recip_freq = 1. / bincount(y_ind)
    weight = recip_freq[le.transform(classes)] / np.mean(recip_freq)
This means that each class you have (in classes) gets a weight equal to 1 divided by the number of times that class appears in your data (y), so classes that appear more often get lower weights. Each of these inverse frequencies is then divided by the mean of all the inverse class frequencies, which rescales the weights so they average to one.
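A minimal numpy sketch of that computation, using a made-up imbalanced label vector (six samples of class 0, two of class 1) to show how the rare class ends up with the larger weight:

```python
import numpy as np

# Hypothetical imbalanced labels: class 0 appears 6 times, class 1 twice.
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Inverse class frequencies, as in the snippet above.
recip_freq = 1.0 / np.bincount(y)          # [1/6, 1/2]

# Normalize by the mean of the inverse frequencies.
weight = recip_freq / np.mean(recip_freq)

print(weight)  # [0.5 1.5] -- the rare class gets the larger weight
```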
The advantage is that you no longer have to worry about setting the class weights yourself: for imbalanced data sets, this default is often a reasonable choice.
If you look further up in the source code, for None, weight is filled with ones, so each class gets equal weight.
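Since class_weight[i] multiplies C, the two settings produce different effective per-class penalties. A small sketch, using a hypothetical imbalanced label vector (six samples of class 0, two of class 1):

```python
import numpy as np

C = 1.0
# Hypothetical labels: six samples of class 0, two of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# class_weight=None: weight is filled with ones -> same C for every class.
weight_none = np.ones(2)

# class_weight='auto': inverse frequency, normalized by its mean.
recip = 1.0 / np.bincount(y)
weight_auto = recip / recip.mean()

print(weight_none * C)  # [1. 1.]
print(weight_auto * C)  # [0.5 1.5] -- errors on the rare class cost more
```

Note that in later scikit-learn releases the 'auto' option was renamed 'balanced', which uses n_samples / (n_classes * np.bincount(y)); the relative weighting between classes is the same, only the overall scale differs.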