 

XGBoost for multilabel classification?


Is it possible to use XGBoost for multi-label classification? At the moment I use OneVsRestClassifier over GradientBoostingClassifier from sklearn. It works, but uses only one core of my CPU. My data has ~45 features, and the task is to predict about 20 columns of binary (boolean) data. The metric is mean average precision (map@7). If you have a short code example to share, that would be great.
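For reference, the setup described above looks roughly like the sketch below (the toy dataset is a stand-in for the real data). As a side note, OneVsRestClassifier accepts n_jobs=-1, which fits the per-label estimators in parallel and should address the single-core issue.

from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.multiclass import OneVsRestClassifier

# toy stand-in for the data described above: 45 features, 20 binary label columns
X, y = make_multilabel_classification(n_samples=300, n_features=45,
                                      n_classes=20, random_state=0)

# one GradientBoostingClassifier per label column;
# n_jobs=-1 trains the per-label estimators in parallel (the single-core fix)
clf = OneVsRestClassifier(GradientBoostingClassifier(), n_jobs=-1)
clf.fit(X, y)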

asked Dec 01 '16 by user3318023

People also ask

Does XGBoost support multi-label classification?

Starting from version 1.6, XGBoost has experimental support for multi-output regression and multi-label classification in its Python package. Multi-label classification usually refers to targets that have multiple non-exclusive class labels.
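A minimal sketch of that native support, assuming XGBoost >= 1.6 (the dataset is synthetic, and per the docs the experimental multi-output path requires the hist tree method):

import xgboost as xgb
from sklearn.datasets import make_multilabel_classification

# toy multi-label dataset: 45 features, 20 binary label columns
X, y = make_multilabel_classification(n_samples=500, n_features=45,
                                      n_classes=20, random_state=0)

# experimental native multi-label support; y is passed as a 2-D indicator matrix
clf = xgb.XGBClassifier(tree_method="hist")
clf.fit(X, y)
print(clf.predict(X[:5]).shape)  # -> (5, 20), one column per label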

Can we use XGBoost for multi-class classification?

Yes. XGBoost supports multi-class classification natively through the multi:softmax and multi:softprob objectives: the former predicts a single class directly, while the latter returns a per-class probability distribution.
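A minimal sketch of that native multi-class support (the dataset and class count below are illustrative; the sklearn wrapper infers the number of classes from y):

import xgboost as xgb
from sklearn.datasets import make_classification

# toy 4-class problem
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# multi:softprob yields one probability per class
clf = xgb.XGBClassifier(objective="multi:softprob")
clf.fit(X, y)
print(clf.predict_proba(X[:3]).shape)  # -> (3, 4)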

Can XGBoost be used for text classification?

XGBoost is a general-purpose machine learning method: given labeled training data, it can be used to classify many kinds of data. It can be used for text classification too, typically after the raw text has been converted into numeric features (for example, TF-IDF vectors).
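A minimal sketch of that idea with a TF-IDF front end (the tiny corpus and labels below are invented purely for illustration):

import xgboost as xgb
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# hypothetical toy corpus: 1 = spam, 0 = not spam
texts = ["win a free prize now", "schedule the project meeting",
         "free money offer", "notes from today's meeting"]
labels = [1, 0, 1, 0]

# TF-IDF turns raw text into the numeric features XGBoost expects
model = make_pipeline(TfidfVectorizer(), xgb.XGBClassifier())
model.fit(texts, labels)
print(model.predict(["free prize meeting"]))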

Is XGBoost good for binary classification?

Yes. XGBoost performs well on binary classification out of the box, and a modified version that re-weights the classes, referred to as Class-Weighted XGBoost or Cost-Sensitive XGBoost, can offer better performance on binary classification problems with a severe class imbalance.
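A minimal sketch of the cost-sensitive idea using XGBoost's scale_pos_weight parameter (the imbalanced dataset below is synthetic):

import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# imbalanced binary problem: roughly 5% positives
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)

# scale_pos_weight ~ n_negative / n_positive up-weights the rare positive class
ratio = float(np.sum(y == 0)) / np.sum(y == 1)
clf = xgb.XGBClassifier(objective="binary:logistic", scale_pos_weight=ratio)
clf.fit(X, y)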


1 Answer

One possible approach, instead of using OneVsRestClassifier (which is aimed at multi-class tasks), is to use MultiOutputClassifier from the sklearn.multioutput module, which fits one classifier per target column.

Below is a small, reproducible code sample with the number of input features and target outputs requested by the OP.

import xgboost as xgb
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.metrics import accuracy_score

# create sample dataset
X, y = make_multilabel_classification(n_samples=3000, n_features=45, n_classes=20, n_labels=1,
                                      allow_unlabeled=False, random_state=42)

# split dataset into training and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

# create XGBoost instance with default hyper-parameters
xgb_estimator = xgb.XGBClassifier(objective='binary:logistic')

# create MultiOutputClassifier instance with XGBoost model inside
multilabel_model = MultiOutputClassifier(xgb_estimator)

# fit the model
multilabel_model.fit(X_train, y_train)

# evaluate on test data
print('Accuracy on test data: {:.1f}%'.format(accuracy_score(y_test, multilabel_model.predict(X_test))*100))
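Two follow-up notes. First, MultiOutputClassifier also accepts n_jobs=-1 to fit the 20 per-label models in parallel, which addresses the single-core complaint in the question. Second, since the question's metric is mean average precision rather than subset accuracy, a rough way to score it is sketched below, building on the variables above (note that sklearn's average_precision_score is not the cutoff-at-7 map@7 variant):

import numpy as np
from sklearn.metrics import average_precision_score

# predict_proba returns one (n_samples, 2) array per label; keep P(label == 1)
proba = np.stack([p[:, 1] for p in multilabel_model.predict_proba(X_test)], axis=1)

# sample-averaged precision over the 20 binary labels
print('Mean average precision: {:.3f}'.format(
    average_precision_score(y_test, proba, average='samples')))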
answered Sep 28 '22 by Ric S