multiclass classification in xgboost (python)

I can't figure out how to pass the number of classes or the eval metric to xgb.XGBClassifier with the objective 'multi:softmax'.

I looked at a lot of documentation, but it only talks about the sklearn wrapper, which accepts n_class/num_class.

My current setup looks like this:

import xgboost as xgb
from sklearn import cross_validation, metrics  # pre-0.18 sklearn API

# x_data, y_data and the _params dict are defined earlier
kf = cross_validation.KFold(y_data.shape[0],
    n_folds=10, shuffle=True, random_state=30)
err = []  # to hold per-fold accuracy scores
# xgb instance
xgb_model = xgb.XGBClassifier(n_estimators=_params['n_estimators'],
    max_depth=_params['max_depth'], learning_rate=_params['learning_rate'],
    min_child_weight=_params['min_child_weight'],
    subsample=_params['subsample'],
    colsample_bytree=_params['colsample_bytree'],
    objective='multi:softmax', nthread=4)

# cv
for train_index, test_index in kf:
    xgb_model.fit(x_data[train_index], y_data[train_index], eval_metric='mlogloss')
    predictions = xgb_model.predict(x_data[test_index])
    actuals = y_data[test_index]
    err.append(metrics.accuracy_score(actuals, predictions))
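For comparison, with the native xgb.train API I can pass both of them explicitly in the params dict (rough sketch, reusing my data and _params):

dtrain = xgb.DMatrix(x_data, label=y_data)
params = {'objective': 'multi:softmax',
          'num_class': len(set(y_data)),   # explicit class count
          'eval_metric': 'mlogloss',
          'max_depth': _params['max_depth'],
          'eta': _params['learning_rate']}  # native name for learning_rate
bst = xgb.train(params, dtrain, num_boost_round=_params['n_estimators'])

How do I do the equivalent with the sklearn wrapper?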
asked Sep 08 '16 by user3804483

People also ask

Does XGBoost support multiclass classification?

Yes; it is well suited to multi-class classification tasks.

Can we use XGBoost for classification in Python?

XGBoost has interfaces for various languages, including Python, and it integrates nicely with scikit-learn, the machine learning framework commonly used by Python data scientists. It can be used to solve both classification and regression problems, so it is suitable for the vast majority of common data science challenges.
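
For instance, a minimal sketch on synthetic data (names and values here are illustrative):

import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Synthetic 3-class problem, just to show the scikit-learn integration
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, n_classes=3, random_state=0)
clf = xgb.XGBClassifier()          # behaves like any sklearn estimator
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())               # mean cross-validated accuracy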

Is XGBoost good for imbalanced datasets?

XGBoost is an effective machine learning model even on datasets where the class distribution is skewed. Before modifying or tuning the algorithm for imbalanced classification, it is important to test the default XGBoost model and establish a performance baseline.
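
A minimal baseline along those lines (synthetic imbalanced data, untuned defaults; everything here is illustrative):

import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Skewed binary problem: roughly 95% negatives, 5% positives
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

baseline = xgb.XGBClassifier()     # default model first, tune later
baseline.fit(X_tr, y_tr)
print(f1_score(y_te, baseline.predict(X_te)))  # the baseline to beat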


1 Answer

You don't need to set num_class in the scikit-learn API for XGBoost classification. It is done automatically when fit is called. Look at xgboost/sklearn.py at the beginning of the fit method of XGBClassifier:

    evals_result = {}
    self.classes_ = np.unique(y)
    self.n_classes_ = len(self.classes_)

    xgb_options = self.get_xgb_params()

    if callable(self.objective):
        obj = _objective_decorator(self.objective)
        # Use default value. Is it really not used ?
        xgb_options["objective"] = "binary:logistic"
    else:
        obj = None

    if self.n_classes_ > 2:
        # Switch to using a multiclass objective in the underlying XGB instance
        xgb_options["objective"] = "multi:softprob"
        xgb_options['num_class'] = self.n_classes_
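
So for a three-class y, the wrapper switches the underlying objective to multi:softprob and sets num_class itself. A quick sketch to confirm (toy random data; assumes an xgboost version that sets n_classes_ in fit, as above):

import numpy as np
import xgboost as xgb

# Toy 3-class data; num_class is never set by hand
X = np.random.rand(90, 4)
y = np.array([0, 1, 2] * 30)

clf = xgb.XGBClassifier()
clf.fit(X, y)                       # classes_ / n_classes_ inferred here

print(clf.n_classes_)               # 3
print(clf.predict_proba(X).shape)   # (90, 3): one probability per class

As for the eval metric: passing eval_metric='mlogloss' to fit, as in your code, was the supported way at the time; in recent releases it has moved to the XGBClassifier constructor.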
answered Sep 26 '22 by Adrien Renaud