Standardization/Normalization test data in Python

I'm working on an sklearn homework assignment and I don't understand why one should standardize and normalize the test data with the training mean and sd. How can I implement this in Python? Here is my implementation for the train data:

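import sklearn.datasets
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

digits = sklearn.datasets.load_digits()
X = digits.data
Y = digits.target
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, train_size=0.7)
# Fit the scaler on the training data only
std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
#X_test_std=??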

For the train data I think it's correct, but what should I do for the test data?

Paolo Milini asked Dec 06 '17 01:12





1 Answer

Why?

Because your classifier/regressor will be trained on those standardized values. You don't want to use your trained classifier to predict on data that was scaled with different statistics.
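To make the point concrete, here is a toy sketch with made-up one-feature numbers (they are not from the question). It assumes the test data is shifted relative to the training data. Scaling the test set with its own statistics hides that shift, while scaling with the training statistics preserves it:

```python
import numpy as np

# Made-up one-feature data, purely for illustration
train = np.array([0.0, 10.0, 20.0])
test = np.array([100.0, 110.0, 120.0])  # same spread, but shifted by 100

# Statistics learned from the training data (population std, as StandardScaler uses)
train_mean, train_std = train.mean(), train.std()

z_train = (train - train_mean) / train_std

# Wrong: the test set standardized with its own statistics
z_test_own = (test - test.mean()) / test.std()

# Right: the test set standardized with the training statistics
z_test_train = (test - train_mean) / train_std

# With its own statistics the shifted test set looks identical to the train set
print(np.allclose(z_test_own, z_train))
# With the training statistics the shift stays visible to the model
print(z_test_train)
```

In the first case the model would see test values that look like ordinary training points even though the raw data is very different.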

How:

std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
X_test_std  = std_scale.transform(X_test)

Fit once, then transform whatever you need to transform. That's the advantage of the class-based StandardScaler (which you had already chosen) over the scale function, which does not keep the statistics obtained during fit and therefore cannot apply the same transformation at a later time.
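If you want to convince yourself of what transform does with the stored statistics, a minimal check could look like the following. The random_state value and the variable names manual and same are my own choices for the illustration, not part of the question:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, Y = load_digits(return_X_y=True)
# random_state is an arbitrary choice, only there to make the split reproducible
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0
)

# fit() stores the per-feature training mean and scale on the scaler object
std_scale = StandardScaler().fit(X_train)
X_test_std = std_scale.transform(X_test)

# The same transformation done by hand with the stored training statistics
# (scale_ is set to 1 for constant features, so there is no division by zero)
manual = (X_test - std_scale.mean_) / std_scale.scale_

same = np.allclose(X_test_std, manual)
print(same)
```

Note that the standardized test data will generally not have exactly zero mean and unit variance per feature, and that is expected: it was scaled with the training statistics, not its own.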

sascha answered Nov 15 '22 06:11
