SKLearn warning "valid feature names" in version 1.0

Question

I'm getting the following warning after upgrading to version 1.0 of scikit-learn:

UserWarning: X does not have valid feature names, but IsolationForest was fitted with feature names

I cannot find anything in the docs about what a "valid feature name" is. How do I deal with this warning?

Andrea NR · Accepted Answer

I got the same warning message with another sklearn model. It showed up because I fitted the model with data in a DataFrame and then passed only the values to predict. Once I fixed that, the warning disappeared.

Here is an example:

model_reg.fit(scaled_x_train, y_train[vp].values)
data_pred = model_reg.predict(scaled_x_test.values)

This first version raised the warning because scaled_x_train is a DataFrame with feature names, while scaled_x_test.values is a plain array of values without feature names. I then changed it to this:

model_reg.fit(scaled_x_train.values, y_train[vp].values)
data_pred = model_reg.predict(scaled_x_test.values)

And now there are no more warnings in my code.

Naresh Nune · Answer

I was getting a very similar warning, but with DecisionTreeClassifier, when calling fit and predict.

Initially I was passing a DataFrame with headers to fit, and I got the warning.

When I dropped the headers and passed only the values, the warning disappeared. Sample code before and after the change:

Code with Warning:

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
model.fit(x, y)  # x is a DataFrame with column headers
predictions = model.predict([
    [20, 1], [20, 0]
])
print(predictions)

Code without Warning:

model = DecisionTreeClassifier()
model.fit(x.values, y)  # x.values holds only the values, without headers
predictions = model.predict([
    [20, 1], [20, 0]
])
print(predictions)

GNETO DOMINIQUE · Answer

I also had the same problem. It happened because I fitted the model with the X train data as a DataFrame (model.fit(X,Y)) and then made a prediction with X test as an array (model.predict([ [20,0] ])). To solve it, I converted the X train DataFrame into an array, as illustrated below.

BEFORE

model = DecisionTreeClassifier()
model.fit(X, Y)  # X train here is a DataFrame
predictions = model.predict([[20, 0]])  # generates the warning

AFTER

model = DecisionTreeClassifier()
X = X.values  # convert X into an array
model.fit(X, Y)
model.predict([[20, 0]])  # now OK, no warning

Ben Reiniger · Answer

The other answers so far recommend (re)training using a numpy array instead of a dataframe for the training data. The warning is a sort of safety feature, to ensure you're passing the data you meant to, so I would suggest passing a dataframe (with correct column labels!) to the predict function instead.

Also, note that it's just a warning, not an error. You can ignore the warning and proceed with the rest of your code without problem; just be sure that the columns are in the same order as in the data the model was trained with!

Bhuvan bhuvi · Answer

I got the same warning while using DataFrames, but after passing only the values it no longer appears.

Use:

reg = reg.fit(x[['data']].values, y)

The warning appears because our DataFrame has feature names; we should instead pass the data as a 2D array (or matrix) of values, both when training and when testing.

[Image: Jupyter notebook code illustrating the above]

Tags: python-3.x, pandas, scikit-learn

Jaume Figueras