While performing feature scaling, if I call StandardScaler() inline each time instead of assigning it to a variable, like this:
from sklearn.preprocessing import StandardScaler
x_train = StandardScaler().fit_transform(x_train)
x_test = StandardScaler().transform(x_test)
It gives the following error:
NotFittedError: This StandardScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
However, the following code works fine (after assigning StandardScaler() to a variable):
from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
x_train = sc_x.fit_transform(x_train)
x_test = sc_x.transform(x_test)
Here, x_train is the training dataset and x_test is the test dataset.
Can someone please explain why this is happening?
Every time you call StandardScaler(), you create a new, unfitted object of the StandardScaler class. Before it can transform any data, you have to fit it, because fitting is what computes the mean and standard deviation that the transform uses.
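To see that each call really gives you an independent, unfitted object, here is a minimal sketch. The toy array x_demo and the variable names are made up for illustration, and checking for the mean_ attribute is just one convenient way to tell whether a scaler has been fitted.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration (not from the question)
x_demo = np.array([[1.0], [2.0], [3.0]])

scaler = StandardScaler()
# A brand-new scaler has not learned anything yet, so it has no mean_
fitted_before = hasattr(scaler, "mean_")

scaler.fit(x_demo)
# After fit, the learned statistics are stored on this particular object
fitted_after = hasattr(scaler, "mean_")

# A second call to StandardScaler() creates another, unrelated object
fresh = StandardScaler()
fresh_is_fitted = hasattr(fresh, "mean_")

print(fitted_before, fitted_after, fresh_is_fitted)
```

The fitted state lives on the object itself, so fitting one scaler tells a different scaler nothing.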
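What you "told" the code to do in the first example was: create a scaler, fit it to x_train and transform x_train with it, then create a second, brand-new scaler and try to transform x_test with it without ever fitting it.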
In the second example, you created a single scaler object, fitted it to your training data, and then used the same object to transform your test data. That is the correct approach, because the test data gets scaled with the statistics learned from the training data.
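Both cases can be reproduced side by side with a rough sketch like the one below. The small arrays and the names scaler_a and scaler_b are hypothetical, chosen only to spell out the two separate objects that the one-liner version creates implicitly.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration
x_train = np.array([[0.0], [2.0], [4.0]])
x_test = np.array([[2.0], [6.0]])

# What the first snippet effectively does: two separate scaler objects
scaler_a = StandardScaler()
scaler_a.fit_transform(x_train)      # scaler_a is fitted here
scaler_b = StandardScaler()          # scaler_b is new and unfitted
try:
    scaler_b.transform(x_test)
    raised = False
except NotFittedError:
    raised = True

# What the second snippet does: one scaler, fitted once, reused
sc_x = StandardScaler()
x_train_scaled = sc_x.fit_transform(x_train)
x_test_scaled = sc_x.transform(x_test)   # uses the mean and std of x_train

print("NotFittedError raised:", raised)
print(x_test_scaled.ravel())
```

Note that calling StandardScaler().fit_transform(x_test) would make the error go away, but it would scale the test set with its own statistics instead of the training set's, so reusing the fitted object is the safer habit.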