Why does calling transform() on test data return an error that the data is not fitted yet?

Question

While performing feature scaling, if I call StandardScaler() directly each time instead of assigning it to a variable, like this:

from sklearn.preprocessing import StandardScaler

x_train = StandardScaler().fit_transform(x_train)

x_test = StandardScaler().transform(x_test)

It gives the following error:

NotFittedError: This StandardScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

whereas the following code works fine (after assigning StandardScaler() to a variable):

from sklearn.preprocessing import StandardScaler

sc_x = StandardScaler()

x_train = sc_x.fit_transform(x_train)

x_test = sc_x.transform(x_test)

Here, x_train is the training dataset and x_test is the test dataset.

Can someone please explain why this is happening?

G. Anderson · Accepted Answer

Each time you call StandardScaler(), you create a new (i.e. unfitted) instance of the StandardScaler class. Whatever fit() learns is stored on that particular instance, so you have to fit an instance before you can transform any data with it.
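To make the "state lives on the instance" point concrete, here is a minimal sketch using a toy class. ToyScaler is a hypothetical stand-in written for illustration only; it is not scikit-learn's implementation, and the attribute names mean_ and scale_ are simply borrowed from StandardScaler.

```python
class ToyScaler:
    """Hypothetical stand-in for a scaler; not scikit-learn's actual code."""

    def fit(self, values):
        # Learn the statistics and store them on THIS instance.
        n = len(values)
        self.mean_ = sum(values) / n
        variance = sum((v - self.mean_) ** 2 for v in values) / n
        self.scale_ = variance ** 0.5 or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # A fresh instance has no mean_ attribute, so it cannot transform.
        if not hasattr(self, "mean_"):
            raise RuntimeError("This ToyScaler instance is not fitted yet.")
        return [(v - self.mean_) / self.scale_ for v in values]


train = [1.0, 2.0, 3.0]
test = [2.0, 4.0]

fitted = ToyScaler().fit(train)
scaled_test = fitted.transform(test)  # works: reuses what was learned from train

try:
    ToyScaler().transform(test)  # brand-new instance, nothing learned yet
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(scaled_test)
print(error_message)
```

The second ToyScaler() call builds an object that has never seen the training data, which mirrors the situation in the question.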

What you "told" the code to do was (pseudocode):

1. Create a new scaler object
2. Fit it to your training data
3. Create another new scaler object
4. Don't fit it to anything, but use it to transform some data

In the second example, you created a single scaler object, fitted it to your training data, then used the same object to transform your test data, which is the correct approach: the test data is scaled with the mean and standard deviation learned from the training data.
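A small sketch of both cases side by side, assuming made-up one-column arrays in place of the asker's real x_train and x_test. It reuses one fitted scaler for both datasets, then shows that a fresh instance raises NotFittedError (importable from sklearn.exceptions).

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import StandardScaler

# Made-up data for illustration; not from the original question.
x_train = np.array([[1.0], [2.0], [3.0]])
x_test = np.array([[2.0], [4.0]])

# One scaler object: fit on the training data, reuse it for the test data.
sc_x = StandardScaler()
x_train_scaled = sc_x.fit_transform(x_train)
x_test_scaled = sc_x.transform(x_test)

# The fitted statistics are stored on the instance itself.
print(sc_x.mean_, sc_x.scale_)

# A second, never-fitted instance has no statistics to apply.
try:
    StandardScaler().transform(x_test)
    raised = False
except NotFittedError:
    raised = True

print(raised)
```

If you prefer not to keep a named scaler around, wrapping the scaler and the model in a sklearn.pipeline.Pipeline is another common way to make sure the same fitted object is used for both training and test data.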

Tags:

python

machine-learning

scikit-learn

keenlearner

1 Answers

G. Anderson
