Let's say I have a 10-feature dataset X of shape [100, 10] and a target dataset y of shape [100, 1].
For example, after splitting the two with sklearn.model_selection.train_test_split I obtained:
X_train: [70, 10]
X_test: [30, 10]
y_train: [70, 1]
y_test: [30, 1]
What is the correct way to apply standardization?
I've tried with:
from sklearn import preprocessing
scaler = preprocessing.StandardScaler()
scaler.fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
but then, after fitting a model and predicting, when I try to invert the scaling to look at the MAE, I get an error:
from sklearn import linear_model
lr = linear_model.LinearRegression()
lr.fit(X_train_std, y_train)
y_pred_std = lr.predict(X_test_std)
y_pred = scaler.inverse_transform(y_pred_std) # error here
I also have another question. Since I have the target values, should I use
scaler = preprocessing.StandardScaler()
X_train_std = scaler.fit_transform(X_train, y_train)
X_test_std = scaler.transform(X_test)
instead of the first code block?
Do I also have to apply the transformation to the y_train and y_test datasets? I am a bit confused.
StandardScaler is supposed to be used on the feature matrix X only, so its fit, transform and inverse_transform methods all just need your X. The y_train you pass in fit_transform(X_train, y_train) is accepted only for API compatibility with sklearn pipelines and is ignored, so your two code blocks do exactly the same thing. The error in your first snippet comes from calling inverse_transform on predictions with 1 column while the scaler was fitted on 10 features; and since y_train was never scaled, the predictions are already in the original units, so no inverse transform is needed at all.
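Putting this together, here is a minimal sketch of the corrected workflow (the data below is synthetic, only the shapes match your question): the scaler is fitted on X_train alone, y is never scaled, so the predictions already come out in the original units and no inverse_transform is needed before computing the MAE.

import numpy as np
from sklearn import linear_model, preprocessing
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))  # synthetic stand-in for your features
y = rng.normal(size=(100, 1))   # synthetic stand-in for your target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

scaler = preprocessing.StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learn mean_/scale_ from the training set only
X_test_std = scaler.transform(X_test)        # reuse the training statistics

lr = linear_model.LinearRegression()
lr.fit(X_train_std, y_train)     # y_train stays in its original units
y_pred = lr.predict(X_test_std)  # so the predictions do too

print(mean_absolute_error(y_test, y_pred))  # no inverse_transform needed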
Note that after you fit the scaler, you can access the following attributes:
mean_: the mean of each feature in X_train
scale_: the standard deviation of each feature in X_train
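For instance, reusing the scaler and X_train from the sketch above, both attributes match the per-column statistics of the training set (note that scale_ is the population standard deviation, ddof=0, not the sample one):

import numpy as np

assert np.allclose(scaler.mean_, X_train.mean(axis=0))   # per-feature means
assert np.allclose(scaler.scale_, X_train.std(axis=0))   # per-feature std, ddof=0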
The transform method computes (X[i, col] - mean_[col]) / scale_[col] for each sample i, whereas the inverse_transform method computes X[i, col] * scale_[col] + mean_[col].
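As a quick sanity check of both formulas, again reusing the fitted scaler from the sketch above:

# transform standardizes each column using the fitted statistics...
assert np.allclose(scaler.transform(X_train),
                   (X_train - scaler.mean_) / scaler.scale_)
# ...and inverse_transform undoes it exactly.
assert np.allclose(scaler.inverse_transform(scaler.transform(X_train)), X_train)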