A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array

Question

The code is as follows. I am trying to train GBRT (gradient boosted regression trees) on this training data. The same data works fine with other classifiers but raises the above error for GBRT. Please help:

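import scipy.sparse as sp
from sklearn.datasets import load_files
from sklearn.feature_extraction.text import TfidfVectorizer

dataset = load_files('train')
vectorizer = TfidfVectorizer(encoding='latin1')
X_train = vectorizer.fit_transform((open(f).read() for f in dataset.filenames))
assert sp.issparse(X_train)  # fit_transform returns a scipy sparse matrix
print("n_samples: %d, n_features: %d" % X_train.shape)
y_train = dataset.target

def benchmark(clf_class, params, name):
    # Raises the error above when clf_class is a GBRT estimator,
    # because X_train is still sparse at this point.
    clf = clf_class(**params).fit(X_train, y_train)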

Peiqin · Accepted Answer

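I came across the same problem when training a GradientBoostingClassifier on data loaded by load_svmlight_files. I solved it by converting the sparse matrix to a dense numpy array, as the error message suggests. Note that toarray() returns a plain numpy.ndarray, whereas todense() returns a numpy.matrix, which recent scikit-learn versions reject.

X_train = X_train.toarray()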
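As a rough end-to-end illustration of that fix, here is a minimal sketch. The toy corpus, the labels, and the n_estimators value are made up for the example and stand in for the asker's files; only the toarray() call is the actual fix.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the files under 'train'.
docs = [
    "the cat sat on the mat",
    "cats and kittens purr",
    "stock prices fell sharply",
    "the market rallied today",
]
labels = [0, 0, 1, 1]

vectorizer = TfidfVectorizer()
X_sparse = vectorizer.fit_transform(docs)  # scipy sparse matrix

# Convert to a dense numpy.ndarray before handing it to GBRT.
X_dense = X_sparse.toarray()

clf = GradientBoostingClassifier(n_estimators=10, random_state=0)
clf.fit(X_dense, labels)

# New documents need the same conversion before predict().
X_new = vectorizer.transform(["kittens sat on the mat"]).toarray()
pred = clf.predict(X_new)
print(pred)
```

Keep in mind that a dense float64 array needs n_samples * n_features * 8 bytes, so this conversion can be expensive for a large TF-IDF vocabulary.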

Chung-Yen Hung · Answer

This happens because the GBRT implementation in sklearn requires X (the training data) to be array-like, not a sparse matrix: sklearn-gbrt

I hope this could help you!
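If the same benchmark function is used for estimators that accept sparse input and for ones that do not, one option is to convert only when needed. The sketch below assumes a small helper named ensure_dense, which is hypothetical and not part of sklearn; it relies only on scipy.sparse.issparse and toarray().

```python
import numpy as np
import scipy.sparse as sp

def ensure_dense(X):
    """Return X as a numpy.ndarray, converting only if it is sparse.

    Hypothetical helper for illustration; not an sklearn function.
    """
    if sp.issparse(X):
        return X.toarray()
    return np.asarray(X)

# Small sparse matrix standing in for the TF-IDF output.
X_sparse = sp.csr_matrix(np.eye(3))
X_dense = ensure_dense(X_sparse)
print(type(X_dense).__name__, X_dense.shape)
```

Inside benchmark, the call would then look like clf_class(**params).fit(ensure_dense(X_train), y_train) for the estimators that need dense data.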

Tags:

python

scikit-learn

Dhananjay Ambekar
