I am training a Random Forest classifier (a supervised learning algorithm) on my data:
clf = RandomForestClassifier(n_estimators=50, n_jobs=3, random_state=42)
The parameter grid is:
param_grid = {
    'n_estimators': [200, 700],
    'max_features': ['auto', 'sqrt', 'log2'],
    'max_depth': [5, 10],
    'min_samples_split': [5, 10]
}
The classifier "clf" and the parameter grid "param_grid" are passed to GridSearchCV:
clf_rfc = GridSearchCV(estimator=clf, param_grid=param_grid)
When I fit the features with labels using
clf_rfc.fit(X_train, y_train)
I get the error "Too many indices in the array". X_train has shape (204,3) and y_train has shape (204,1).
I also tried clf_rfc.fit(X_train.values, y_train.values), but the error persists.
Any suggestions would be appreciated!
As mentioned in the previous post, the problem appears to be the shape of y_train, which is (204,1). scikit-learn expects the labels as a 1-D array, so the shape should be (204,) instead.
Reshaping y_train should fix it:
c, r = y_train.shape
y_train = y_train.reshape(c,)
If this raises an error such as AttributeError: 'DataFrame' object has no attribute 'reshape', then y_train is a pandas DataFrame, so reshape its underlying NumPy array instead:
c, r = y_train.shape
y_train = y_train.values.reshape(c,)
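Both cases above can be covered with a single conversion, since np.asarray accepts a NumPy array as well as a pandas DataFrame. The following is only a minimal sketch, assuming the labels have exactly one column; the helper name to_1d_labels is hypothetical and not part of any library.

```python
import numpy as np
import pandas as pd


def to_1d_labels(y):
    """Return y as a 1-D NumPy array of shape (n_samples,).

    Hypothetical helper: accepts a NumPy array or a pandas
    DataFrame/Series holding a single column of labels.
    """
    arr = np.asarray(y)
    # Flatten a single-column 2-D input, e.g. (204, 1) -> (204,).
    if arr.ndim == 2 and arr.shape[1] == 1:
        arr = arr.ravel()
    if arr.ndim != 1:
        raise ValueError(f"expected a single label column, got shape {arr.shape}")
    return arr


# Placeholder labels with the shape from the question.
y_array = np.zeros((204, 1), dtype=int)
y_frame = pd.DataFrame(y_array)

print(to_1d_labels(y_array).shape)  # (204,)
print(to_1d_labels(y_frame).shape)  # (204,)
```

The explicit ValueError is a design choice: a label table with several columns is rejected instead of being silently flattened into a longer vector.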
The shape of the y_train DataFrame is not correct. Try one of these (the first assumes the label column is named 0):
clf_rfc.fit(X_train, y_train[0].values)
OR
clf_rfc.fit(X_train, y_train.values.ravel())
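To see the fix end to end, here is a rough sketch on synthetic data with the same shapes as in the question. The random data, the column names f0 to f2, the reduced grid small_grid, and cv=3 are all assumptions made so that the example runs quickly; they are not the asker's actual data or settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-ins for the question's data (assumed, not the real data).
rng = np.random.default_rng(42)
X_train = pd.DataFrame(rng.normal(size=(204, 3)), columns=["f0", "f1", "f2"])
y_train = pd.DataFrame(rng.integers(0, 2, size=(204, 1)))  # one column, named 0

# Flatten the (204, 1) label frame into a 1-D array of shape (204,).
y_flat = y_train.values.ravel()

clf = RandomForestClassifier(random_state=42)
# Deliberately small grid so the search finishes quickly.
small_grid = {"n_estimators": [10, 20], "max_depth": [5, 10]}
clf_rfc = GridSearchCV(estimator=clf, param_grid=small_grid, cv=3)
clf_rfc.fit(X_train, y_flat)

print(y_train.shape, "->", y_flat.shape)
print(clf_rfc.best_params_)
```

Because the synthetic label column is named 0, y_train[0].values yields the same 1-D array here as y_train.values.ravel().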