Multiple pipelines that merge within a sklearn Pipeline?

Question

Sometimes I design machine learning pipelines that look something like this:

Example pipeline

Normally I have to hack these "split" pipelines together using my own "Combine Features" function. However, it'd be great if I could fit this into a sklearn Pipeline object. How would I go about doing that? (Pseudo-code is fine.)

Andreas Mueller · Accepted Answer

As long as "Entire Data Set" means the same features, this is exactly what FeatureUnion does:

make_pipeline(make_union(PolynomialFeatures(), PCA()), RFE(RandomForestClassifier()))

If you have two different sets of features that you want to combine, you first need to put them into a single dataset, and then have each branch of the FeatureUnion first select the features it should operate on. [there is currently no ready-made function for this but it's easily achiveable with a FunctionTransformer() for example]

Multiple pipelines that merge within a sklearn Pipeline?

Tags:

python

machine-learning

scikit-learn

Randy Olson

1 Answers

Andreas Mueller

Recent Activity

Donate For Us

Multiple pipelines that merge within a sklearn Pipeline?

Tags:

python

machine-learning

scikit-learn

Randy Olson

1 Answers

Andreas Mueller

Related questions

Recent Activity

Donate For Us