In Python, XGBoost allows you to train/predict using their Booster class or using their sklearn API (http://xgboost.readthedocs.io/en/latest/python/python_api.html). I'm using the sklearn API, and want to use the pred_contribs capabilities of XGBoost. I would expect this to work, but it doesn't:
model = xgb.XGBClassifier().fit(X_train, y_train)
pred = model.predict_proba(X_test, pred_contribs=True)
It looks like pred_contribs is only a parameter for the Booster class predict function. How do I use this parameter through the sklearn API? Or is there an easy workaround to get the prediction contributors after training using the sklearn API?
You can use the get_booster() method of XGBClassifier, which returns a Booster object once the XGBClassifier has been fitted with training data.
After that, you can simply call predict() on the Booster object with pred_contribs=True.
Example code:
from xgboost import XGBClassifier, DMatrix
from sklearn.datasets import load_iris
iris_data = load_iris()
# Taking only first 100 samples to make this a binary problem,
# else it will be multi-class and shape of pred_contribs will change
X, y = iris_data.data[:100], iris_data.target[:100]
# This data has 4 features
print(X.shape)
# Output: (100, 4)
clf = XGBClassifier()
clf.fit(X, y)
# This is what you need
booster = clf.get_booster()
# Using only a single sample for predict; you can pass multiple rows.
# Slicing with X[:1] keeps a 2-D NumPy array, which DMatrix accepts directly.
test_X = X[:1]
# Wrapping the test X in a DMatrix, as needed by Booster
predictions = booster.predict(DMatrix(test_X), pred_contribs=True)
print(predictions.shape)
# Output has 5 columns: one per feature, plus a final bias column
# Output: (1, 5)