Scikit-learn zip argument #1 must support iteration

Question

I have the following pipeline to perform machine learning on a corpus. It first extracts text, then uses TfidfVectorizer to extract n-grams, and finally selects the best features. The pipeline works fine without the feature selection step. With it, however, I get:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/pipeline.py", line 90, in __init__
    names, estimators = zip(*steps)
TypeError: zip argument #1 must support iteration

The error is raised at SGDClassifier().

pipeline = Pipeline([
    # Use FeatureUnion to combine the features
    ('features', FeatureUnion(
        transformer_list=[
            # N-GRAMS
            ('ngrams', Pipeline([
                ('extractor', TextExtractor(normalized=True)), # returns a list of strings
                ('vectorizer', TfidfVectorizer(analyzer='word', strip_accents='ascii', use_idf=True, norm="l2", min_df=3, max_df=0.90)),
                ('feature_selection', SelectPercentile(score_func=chi2, percentile=70)),
            ])),
        ],
    )),

    ('clf', Pipeline([
        SGDClassifier(n_jobs=-1, verbose=0)
    ])),
])

David Maust · Accepted Answer

It looks like you missed the step name in your inner Pipeline. Every step has to be a (name, estimator) tuple, but this one is a bare estimator:

('clf', Pipeline([
    SGDClassifier(n_jobs=-1, verbose=0)
])),

It should be:

('clf', Pipeline([
    ('sgd', SGDClassifier(n_jobs=-1, verbose=0))
])),
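To see why the missing name produces this particular error, here is a minimal sketch that reproduces the unpacking line from the traceback without scikit-learn. FakeEstimator and split_steps are hypothetical stand-ins made up for illustration; only the zip(*steps) line comes from the traceback above.

```python
# Hypothetical stand-in for an estimator such as SGDClassifier.
# Like a real estimator, it is not iterable.
class FakeEstimator(object):
    pass


def split_steps(steps):
    # Same unpacking as the line shown in the traceback:
    #     names, estimators = zip(*steps)
    names, estimators = zip(*steps)
    return names, estimators


# Bare estimator in the list: zip() is handed the estimator itself
# as an argument and cannot iterate over it, so it raises TypeError.
try:
    split_steps([FakeEstimator()])
    bare_step_failed = False
except TypeError:
    bare_step_failed = True

# (name, estimator) tuple: zip() iterates over the tuple and the
# result unpacks cleanly into names and estimators.
names, estimators = split_steps([('sgd', FakeEstimator())])
```

Since the inner Pipeline only wraps a single step, passing the classifier directly as ('clf', SGDClassifier(n_jobs=-1, verbose=0)) in the outer Pipeline should work as well.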

Tags:

machine-learning

scikit-learn

feature-selection

Justin D.