Getting model attributes from pipeline

I typically get PCA loadings like this:

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_t = pca.fit(X).transform(X)
loadings = pca.components_

If I run PCA using a scikit-learn pipeline:

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

is it possible to get the loadings?

Simply trying loadings = pipeline.components_ fails:

AttributeError: 'Pipeline' object has no attribute 'components_'

(Also interested in extracting attributes like coef_ from pipelines.)

asked Mar 03 '15 01:03

2 Answers

Did you look at the documentation? http://scikit-learn.org/dev/modules/pipeline.html I feel it is pretty clear.

Update: since scikit-learn 0.21 you can simply use square brackets:

pipeline['pca']

or integer indices:

pipeline[1]
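A fuller sketch of the bracket access, assuming scikit-learn 0.21 or later and using synthetic data as a stand-in for X (the array shape and random seed are arbitrary choices, not from the question):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X: 100 samples, 5 features (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

# Indexing by step name or by position returns the same fitted PCA object.
pca_by_name = pipeline['pca']
pca_by_index = pipeline[1]

# components_ has one row per component and one column per input feature.
loadings = pca_by_name.components_
print(loadings.shape)  # (2, 5)
```

Note that these components are expressed in terms of the standardized features, since the scaler runs before the PCA step.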

There are two ways to get to the steps in a pipeline, either using indices or using the string names you gave:

pipeline.named_steps['pca']
pipeline.steps[1][1]

Either one gives you the PCA object, from which you can read components_. With named_steps you can also use attribute access with a dot, which allows autocompletion:

pipeline.named_steps.pca.<tab here gives autocomplete>
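The same access pattern should cover attributes such as coef_ on a final estimator. The following is a minimal sketch, assuming a LogisticRegression classifier, a made-up step name 'clf', and synthetic data; none of these come from the question:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf_pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('clf', LogisticRegression()),  # 'clf' is a hypothetical step name
])
clf_pipeline.fit(X, y)

# Three equivalent ways to reach the fitted classifier.
clf = clf_pipeline.named_steps['clf']
clf_by_attr = clf_pipeline.named_steps.clf
clf_by_pos = clf_pipeline.steps[-1][1]

# Fitted attributes live on the step, not on the Pipeline itself.
coefficients = clf.coef_      # shape (1, 4) for a binary problem
intercept = clf.intercept_    # shape (1,)
print(coefficients.shape, intercept.shape)
```

The coefficients here apply to the scaled features, because the classifier only ever sees the output of the scaling step.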

answered Sep 30 '22 19:09

Using Neuraxle

Working with pipelines is simpler using Neuraxle. For instance, you can do this:

from neuraxle.pipeline import Pipeline

# Create and fit the pipeline:
pipeline = Pipeline([
    StandardScaler(),
    PCA(n_components=2)
])
pipeline, X_t = pipeline.fit_transform(X)

# Get the components:
pca = pipeline[-1]
components = pca.components_

You can access the PCA step in any of these three ways:

pipeline['PCA']
pipeline[-1]
pipeline[1]

Neuraxle is a pipelining library built on top of scikit-learn. It makes it easy to manage spaces of hyperparameter distributions, nested pipelines, saving and reloading, REST API serving, and more. It is also designed to work with deep learning algorithms and to support parallel computing.

Nested pipelines:

You can have pipelines within pipelines, as below.

# Create and fit the pipeline:
pipeline = Pipeline([
    StandardScaler(),
    Identity(),
    Pipeline([
        Identity(),  # Note: an Identity step is a step that does nothing.
        Identity(),  # We use it here for demonstration purposes.
        Identity(),
        Pipeline([
            Identity(),
            PCA(n_components=2)
        ])
    ])
])
pipeline, X_t = pipeline.fit_transform(X)

Then you'd need to do this:

# Get the components:
pca = pipeline["Pipeline"]["Pipeline"][-1]
components = pca.components_
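For comparison, plain scikit-learn also accepts a Pipeline as a step of another Pipeline, and the nested step can be reached by chaining named_steps. This sketch uses scikit-learn rather than Neuraxle. The step names 'inner' and 'identity' are made up, FunctionTransformer() with no function stands in for the Identity step, and the data is synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Synthetic stand-in for X (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Inner pipeline: a do-nothing step followed by PCA.
inner = Pipeline(steps=[
    ('identity', FunctionTransformer()),  # no func given: passes data through
    ('pca', PCA(n_components=2)),
])

# Outer pipeline: scaling, then the inner pipeline as a single step.
outer = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('inner', inner),
])
X_t = outer.fit_transform(X)

# Walk down one level at a time to reach the fitted PCA.
pca = outer.named_steps['inner'].named_steps['pca']
components = pca.components_
print(components.shape)  # (2, 5)
```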

answered Sep 30 '22 20:09

Guillaume Chevalier


Tags: python, scikit-learn, pipeline

Asked by lmart999

First answer by Andreas Mueller
