Does scikit-learn's fit_transform also transform my original dataframe?

Question

I am using scikit-learn's StandardScaler() and have noticed that after I call transform(xtrain) or fit_transform(xtrain), my xtrain dataframe is changed as well. Is this supposed to happen? How can I prevent StandardScaler from changing my dataframe? (I have tried using copy=False.)

from sklearn.preprocessing import StandardScaler

xtrain.describe()  # everything looks fine here
scalar = StandardScaler()
xtrain2 = scalar.fit_transform(xtrain)

At this stage, I would expect xtrain to be unchanged and xtrain2 to be a scaled version of xtrain. But when I run describe() on both, I see that they are identical and that both have been scaled. Why is that?

I experience the same problem when I do:

scalekey = scalar.fit(xtrain)
xtrain2 = scalekey.transform(xtrain)

EdChum · Accepted Answer

You can make a copy and pass that to the scaler so that your df is not modified:

xtrain_copy = xtrain.copy()  # work on a copy so xtrain is left untouched
xtrain2 = scalar.fit_transform(xtrain_copy)

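The docs state that StandardScaler defaults to copy=True, so with the default settings it should not have modified your df. Passing copy=False asks the scaler to try to scale in place instead, which is what allows the input to be overwritten.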
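As a rough check of the default behaviour, here is a minimal sketch. The toy values and the names `original`, `scaled` and `unchanged` are made up for illustration and are not from the question. It assumes scikit-learn's default output configuration, where fit_transform returns a NumPy array rather than a dataframe, so the result is wrapped in a new dataframe before calling describe().

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the asker's xtrain
xtrain = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                       "b": [10.0, 20.0, 30.0, 40.0]})
original = xtrain.copy()  # snapshot used only for the comparison below

scaler = StandardScaler()              # copy=True is the default
scaled = scaler.fit_transform(xtrain)  # a NumPy array under default settings

# Wrap the array in a new dataframe so describe() works on the scaled data
xtrain2 = pd.DataFrame(scaled, columns=xtrain.columns, index=xtrain.index)

# True when the input dataframe was left untouched
unchanged = xtrain.equals(original)
print(unchanged)
print(xtrain2.describe())
```

If `unchanged` comes out False in your own session, the first things to check are whether the scaler was built with copy=False and whether xtrain2 was assigned from xtrain without a copy.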

Tags: python, pandas, scikit-learn

Jason
