I am using scikit-learn's StandardScaler() and notice that after I call transform(xtrain) or fit_transform(xtrain), it also changes my xtrain dataframe. Is this supposed to happen? How can I prevent StandardScaler from changing my dataframe? (I have tried using copy=False.)
from sklearn.preprocessing import StandardScaler
xtrain.describe()  # everything looks fine here
scalar = StandardScaler()
xtrain2 = scalar.fit_transform(xtrain)
At this stage, I would expect xtrain to be unchanged and xtrain2 to be a scaled version of xtrain. But when I run describe() on both, they are identical and both have been scaled. Why is that?
I experience the same problem when I do:
scalekey = scalar.fit(xtrain)
xtrain2 = scalekey.transform(xtrain)
You can pass a copy to the scaler and keep the returned result, so that your original df is never touched:
xtrain2 = scalar.fit_transform(xtrain.copy())
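The docs state that the default for StandardScaler is copy=True, so by default it should not modify your df. Passing copy=False asks the scaler to scale in place where possible, which would explain the change you are seeing.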
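As a rough check of the default behaviour described above, here is a minimal sketch. The small DataFrame and the names `original`, `scaled`, `unchanged` and `column_means` are made up for illustration and are not from the question. It assumes the default copy=True and the default NumPy output, then wraps the result back into a DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical training data standing in for the asker's xtrain.
xtrain = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
original = xtrain.copy()  # snapshot used only to compare afterwards

scaler = StandardScaler()  # copy=True is the default
scaled = scaler.fit_transform(xtrain)  # scaled values, separate from xtrain

# Wrap the scaled values in a DataFrame that reuses the original labels.
xtrain2 = pd.DataFrame(scaled, columns=xtrain.columns, index=xtrain.index)

unchanged = xtrain.equals(original)       # whether the input was left alone
column_means = xtrain2.mean().to_numpy()  # should be close to zero after scaling
print(unchanged)
print(column_means)
```

If `unchanged` comes out False in your own session, it is worth checking whether the scaler was constructed with copy=False somewhere earlier.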