
XGBoost get predict_contrib using sklearn API?

In Python, XGBoost allows you to train/predict using their Booster class or using their sklearn API (http://xgboost.readthedocs.io/en/latest/python/python_api.html). I'm using the sklearn API, and want to use the pred_contribs capabilities of XGBoost. I would expect this to work, but it doesn't:

model = xgb.XGBClassifier().fit(X_train, y_train)
pred = model.predict_proba(X_test, pred_contribs=True)

It looks like pred_contribs is only a parameter of the Booster class's predict function. How do I use this parameter through the sklearn API? Or is there an easy workaround to get the prediction contributions after training with the sklearn API?

Kewl asked Apr 06 '18 16:04


1 Answer

Once the XGBClassifier has been fitted with training data, you can call its get_booster() method, which returns the underlying Booster object.

After that, simply call predict() on the Booster object with pred_contribs=True.

Example code:

from xgboost import XGBClassifier, DMatrix
from sklearn.datasets import load_iris

iris_data = load_iris()

# Take only the first 100 samples to make this a binary problem;
# otherwise it is multi-class and the shape of the pred_contribs output changes
X, y = iris_data.data[:100], iris_data.target[:100]

# This data has 4 features
print(X.shape)
# Output: (100, 4)


clf = XGBClassifier()
clf.fit(X, y)

# This is what you need
booster = clf.get_booster()


# Using only a single sample for predict; you can pass multiple rows
# (sliced so it stays a 2-D NumPy array)
test_X = X[:1]

# Wrap the test data in a DMatrix, as needed by Booster
predictions = booster.predict(DMatrix(test_X), pred_contribs=True)

print(predictions.shape)
# Output: (1, 5)
# The output has 5 columns: one per feature, and the last one for the bias
Vivek Kumar answered Nov 12 '22 01:11