Using a boosting tree to generate features in sklearn


For context, I am referring to the scikit-learn example Feature Transformation using tree ensembles.

Specifically, in the code below from that example, the approach of (1) using a boosting tree to generate features and then training logistic regression (LR) on them outperforms (2) using the boosting tree by itself. My questions:

  1. Is it true in the general case that using a boosting tree to generate features (and another classifier to do the classification) is better than using the boosting tree to classify directly?
  2. And why does using a boosting tree to generate features, then training LR on them, outperform using the boosting tree by itself?

    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import OneHotEncoder

    grd = GradientBoostingClassifier(n_estimators=n_estimator)
    grd_enc = OneHotEncoder()
    grd_lm = LogisticRegression()
    grd.fit(X_train, y_train)
    # apply() returns the leaf index each sample lands in for every tree;
    # one-hot encoding those indices turns them into binary features
    grd_enc.fit(grd.apply(X_train)[:, :, 0])
    # fit LR on the encoded leaves of a second, held-out split (X_train_lr, y_train_lr)
    grd_lm.fit(grd_enc.transform(grd.apply(X_train_lr)[:, :, 0]), y_train_lr)
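For reference, the snippet above assumes two disjoint training splits. In the scikit-learn example they are produced roughly as sketched below; `X` and `y` here are placeholder data standing in for the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the asker's dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

# First split: training data vs. held-out test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
# Second split: one half trains the trees, the other trains LR on the leaf features
X_train, X_train_lr, y_train, y_train_lr = train_test_split(
    X_train, y_train, test_size=0.5)
```

Splitting twice matters: if LR were trained on leaves of the same samples the trees were fit on, the leaf features would already encode those labels and the LR would overfit.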
    
Lin Ma asked May 01 '18

1 Answer

Interesting sources are paper_1 and paper_2, along with the additional references in them.

So to answer your questions:

  1. That is a very general statement; looking at the experimental results in the papers above, there are some exceptions. However, most of the time it does improve the score.
  2. The main idea is to map the samples into a feature space where they are linearly separable. Each one-hot-encoded leaf marks a region of the input space carved out by the trees, so a linear model over those indicators can represent a highly non-linear function of the original features. If the samples really do become linearly separable, linear classifiers shine.
Jan K answered Sep 28 '22