Is it possible to delete or insert a step in a sklearn.pipeline.Pipeline object?
I am trying to run a grid search with and without one step in a Pipeline object, and I am wondering whether I can insert or delete a step in the pipeline. In the Pipeline source code I saw that there is a self.steps list holding all the steps, and that the steps can also be looked up by name through the named_steps attribute. Before modifying it, I want to make sure I do not cause unexpected effects.
Here is an example:
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.decomposition import PCA
estimators = [('reduce_dim', PCA()), ('svm', SVC())]
clf = Pipeline(estimators)
clf
Is it possible to do something like steps = clf.steps, then insert into or delete from this list? Does this cause any undesired effect on the clf object?
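I see that everyone mentioned only deleting a step. In case you also want to insert a step into the pipeline (transformer here is a placeholder for any transformer class):
pipe.steps.append(('step name', transformer()))
pipe.steps works the same way as a list, so you can also insert an item at a specific position:
pipe.steps.insert(1, ('estimator', transformer()))  # insert as the second step
Keep in mind that every step except the last must be a transformer, so appending after a final estimator such as SVC will fail when the pipeline is fitted.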
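To make the insertion concrete, here is a minimal sketch using the pipeline from the question. StandardScaler and the step name "scale" are my own choices for illustration, not something from the original post.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([("reduce_dim", PCA()), ("svm", SVC())])

# Insert a scaler at position 0 so it runs before PCA.
# Each step is a (name, estimator) tuple, and names must be unique.
pipe.steps.insert(0, ("scale", StandardScaler()))

step_names = [name for name, _ in pipe.steps]
print(step_names)

# The modified pipeline fits and predicts like any other pipeline.
pipe.fit(X, y)
predictions = pipe.predict(X)
```

The step list is only validated when the pipeline is fitted, so a bad insertion (for example a duplicate name) would show up at fit time rather than at the insert call.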
Based on rudimentary testing, you can safely remove a step from a scikit-learn pipeline just as you would any list item, with a simple
clf_pipeline.steps.pop(n)
where n is the zero-based position of the step you are trying to remove.
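For the grid-search use case in the question, here is a rough sketch of two options: popping the step in place, and leaving the pipeline untouched while the search swaps the step for the string "passthrough". The second option assumes a reasonably recent scikit-learn release that accepts "passthrough" as a step value; the iris data, n_components=2, and cv=3 are only there to make the example runnable.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Option 1: remove the step in place by position.
popped_pipe = Pipeline([("reduce_dim", PCA(n_components=2)), ("svm", SVC())])
removed_name, removed_step = popped_pipe.steps.pop(0)
popped_pipe.fit(X, y)  # the pipeline now contains only the SVC

# Option 2: keep the pipeline intact and let the grid search toggle the step.
# "passthrough" turns the step into a no-op for that candidate
# (assumption: the installed scikit-learn version supports it).
pipe = Pipeline([("reduce_dim", PCA(n_components=2)), ("svm", SVC())])
param_grid = {"reduce_dim": [PCA(n_components=2), "passthrough"]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

tried = [str(params["reduce_dim"]) for params in search.cv_results_["params"]]
print(removed_name, [name for name, _ in popped_pipe.steps])
print(tried)
print(search.best_params_)
```

Option 2 has the advantage that the original pipe object is never mutated, since the grid search works on clones of it.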
Just chiming in because I feel the other answers covered adding steps to a pipeline really well, but didn't really cover how to delete a step from a pipeline.
Watch out with my approach, though: slicing lists in this instance is a bit weird.
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.preprocessing import PolynomialFeatures
estimators = [('reduce_dim', PCA()), ('poly', PolynomialFeatures()), ('svm', SVC())]
clf = Pipeline(estimators)
If you want to create a pipeline with just the PCA and PolynomialFeatures steps, you can slice the list of steps by index and pass the result to Pipeline:
clf1 = Pipeline(clf.steps[0:2])
Want just steps 2 and 3? Watch out: slice indices are zero-based and the end index is exclusive, so these slices are not always intuitive:
clf2 = Pipeline(clf.steps[1:3])
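Want just steps 1 and 3? Adding the two steps together fails, because clf.steps[0] + clf.steps[2] concatenates two tuples into one flat 4-tuple instead of a list of (name, estimator) pairs. Put them in a list instead:
clf3 = Pipeline([clf.steps[0], clf.steps[2]])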
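As a sketch of picking non-adjacent steps, the snippet below builds a new list of (name, estimator) tuples instead of adding tuples together. The names clf3_independent, keep, and first_two are mine, and slicing the pipeline object directly (clf[:2]) assumes a scikit-learn version that supports it.

```python
from sklearn.base import clone
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVC

clf = Pipeline([("reduce_dim", PCA()), ("poly", PolynomialFeatures()), ("svm", SVC())])

# Steps 1 and 3: put the two (name, estimator) tuples in a list.
clf3 = Pipeline([clf.steps[0], clf.steps[2]])

# Same selection by name, cloning each estimator so the new pipeline
# does not share estimator objects with the original one.
keep = {"reduce_dim", "svm"}
clf3_independent = Pipeline(
    [(name, clone(est)) for name, est in clf.steps if name in keep]
)

# Slicing the pipeline itself returns a new Pipeline
# (assumption: the installed scikit-learn version supports slicing).
first_two = clf[:2]

print([name for name, _ in clf3.steps])       # ['reduce_dim', 'svm']
print([name for name, _ in first_two.steps])  # ['reduce_dim', 'poly']
```

One thing to watch: clf3 reuses the very same PCA and SVC objects as clf, so fitting one affects the other. Cloning, as in clf3_independent, avoids that.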