XGBoost python - classifier class weight option?

2 Answers

For sklearn version < 0.19

Assign each row of your training data the weight of its class. First compute the per-class weights with sklearn's class_weight.compute_class_weight, then map each row to the weight of its class and pass the result as sample_weight when fitting.

I assume here that the training data has a column named class containing the class number, and that the classes are numbered from 1 to nb_classes.

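import numpy as np
from sklearn.utils import class_weight

# One weight per class, in the order of np.unique(train_df['class'])
classes_weights = list(class_weight.compute_class_weight(class_weight='balanced',
                                             classes=np.unique(train_df['class']),
                                             y=train_df['class']))

# Give each row the weight of its class (labels run from 1 to nb_classes)
weights = np.ones(len(train_df), dtype='float')
for i, val in enumerate(train_df['class']):
    weights[i] = classes_weights[val - 1]

# X, y: training features and labels, in the same row order as train_df
xgb_classifier.fit(X, y, sample_weight=weights)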
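
To make the 'balanced' option more concrete, here is a minimal dependency-free sketch of the heuristic it uses, n_samples / (n_classes * class_count), mapped onto each row. The function name balanced_sample_weights and the example labels are hypothetical, chosen only for illustration; the dictionary lookup also avoids the assumption that labels run from 1 to nb_classes.

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Return one weight per label: n_samples / (n_classes * count of that class)."""
    counts = Counter(labels)          # number of rows per class
    n_samples = len(labels)
    n_classes = len(counts)
    # Per-class weight: rarer classes get larger weights
    class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    # Map every row to the weight of its class
    return [class_weight[c] for c in labels]

# Hypothetical imbalanced labels: six rows of class 1, two each of classes 2 and 3
labels = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
weights = balanced_sample_weights(labels)
```

With these weights every class contributes the same total weight (n_samples / n_classes), which is what counteracts the imbalance. The resulting list could be passed as sample_weight in the same way as the weights array above.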

Update for sklearn version >= 0.19

There is a simpler solution:

from sklearn.utils import class_weight

# compute_sample_weight returns one weight per row directly
sample_weights = class_weight.compute_sample_weight(
    class_weight='balanced',
    y=train_df['class']
)

xgb_classifier.fit(X, y, sample_weight=sample_weights)

answered Sep 19 '22 08:09

Firas Omrane

from sklearn.utils.class_weight import compute_sample_weight
xgb_classifier.fit(X, y, sample_weight=compute_sample_weight("balanced", y))

answered Sep 19 '22 08:09

Tianhuang Su
