I have a sklearn StandardScaler saved from a previous model and am trying to apply it to new data:
scaler = myOldStandardScaler
print("ORIG:", X)
print("CLASS:", X.__class__)
X = scaler.fit_transform(X)
print("SCALED:", X)
I have three observations each with 2000 features. If I run each observation separately I get an output of all zeros.
ORIG: [[ 3.19029839e-04 0.00000000e+00 1.90985485e-06 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]]
CLASS: <class 'numpy.matrixlib.defmatrix.matrix'>
SCALED: [[ 0. 0. 0. ..., 0. 0. 0.]]
But if I append all three observations into one array, I get the results I want
ORIG: [[ 0.00000000e+00 8.69737728e-08 7.53361877e-06 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[ 9.49627142e-04 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[ 3.19029839e-04 0.00000000e+00 1.90985485e-06 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]]
CLASS: <class 'numpy.matrixlib.defmatrix.matrix'>
SCALED: [[-1.07174217 1.41421356 1.37153077 ..., 0. 0. 0. ]
[ 1.33494964 -0.70710678 -0.98439142 ..., 0. 0. 0. ]
[-0.26320747 -0.70710678 -0.38713935 ..., 0. 0. 0. ]]
I've seen these two questions:
neither of which have an accepted answer.
I've tried:
np.float32 and np.float64 (still all zeros)
np.matrix (again, all zeros)
What am I missing? The input to fit_transform is getting the same type, just a different size.
How do I get StandardScaler to work with a single observation?
When you apply the fit_transform method of a StandardScaler object to an array of shape (1, n), you get all zeros. Scaling is done per feature (column), and each column holds a single value, so the mean of that column equals the value itself and subtracting it leaves zero; the standard deviation is zero as well. If you want to scale the values of that array against each other, convert it to an array of shape (n, 1). You can do it this way:
import numpy as np
X = np.array([1, -4, 5, 6, -8, 5]) # here should be your X in np.array format
X_transformed = scaler.fit_transform(X[:, np.newaxis])
In this case you get standard scaling of one object across its own features, which is not what you're looking for.
If you want to scale one feature across 3 objects, pass the fit_transform method an array of shape (3, 1) holding that feature's value for each object:
X = np.array([0.00000000e+00, 9.49627142e-04, 3.19029839e-04])
X_transformed = scaler.fit_transform(X[:, np.newaxis]) # you should get
# array([[-1.07174217], [1.33494964], [-0.26320747]]) you're looking for
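To make the arithmetic above concrete, here is a minimal NumPy sketch of column-wise standardization. It is not sklearn's actual implementation; the helper name standardize_columns is hypothetical, and replacing a zero standard deviation with 1 is an assumption made here to avoid dividing by zero. It shows that a single row collapses to zeros, while a column of three values gives the scaled values from the question.

```python
import numpy as np

def standardize_columns(X):
    """Column-wise (x - mean) / std; a zero std is replaced by 1 (assumption)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)          # one mean per column
    std = X.std(axis=0)            # population std (ddof=0) per column
    std[std == 0.0] = 1.0          # avoid division by zero for constant columns
    return (X - mean) / std

# One observation, shape (1, 3): each column's mean is the value itself.
single = np.array([[3.19029839e-04, 0.0, 1.90985485e-06]])
print(standardize_columns(single))   # every entry is 0

# One feature across three observations, shape (3, 1).
column = np.array([[0.0], [9.49627142e-04], [3.19029839e-04]])
print(standardize_columns(column))
```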
And if you want to work with an already fitted StandardScaler object, you shouldn't use the fit_transform method, because it refits the object on the new data. StandardScaler has a transform method, which works with a single observation:
X = np.array([1, -4, 5, 6, -8, 5]) # here should be your X in np.array format
X_transformed = scaler.transform(X.reshape(1, -1))
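The difference between the two methods can be sketched end to end as follows. The training data and the new observation are made up for illustration; in the question, the scaler would be the one saved from the earlier model and the observation would have 2000 features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: 4 observations, 3 features.
X_train = np.array([[1.0, 10.0, 100.0],
                    [2.0, 20.0, 200.0],
                    [3.0, 30.0, 300.0],
                    [4.0, 40.0, 400.0]])

# Fit once; the scaler stores the per-feature mean and scale.
scaler = StandardScaler().fit(X_train)

# One new observation as a 1-D array (hypothetical values).
x_new = np.array([2.5, 40.0, 100.0])

# transform() reuses the stored statistics; the input must be 2-D,
# so reshape to (1, n_features).
scaled = scaler.transform(x_new.reshape(1, -1))
print(scaled)

# fit_transform() on the same single row refits on that row alone,
# so every feature is centered on itself and the result is all zeros.
refit = StandardScaler().fit_transform(x_new.reshape(1, -1))
print(refit)
```

Here transform gives a meaningful result because the mean and scale come from the training data, not from the single row being scaled.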
I had the same problem. Another (simpler) solution for an array of shape (1, n) is to transpose it, which gives shape (n, 1).
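X = np.array([[0.00000000e+00, 9.49627142e-04, 3.19029839e-04]])  # shape (1, 3); .T does nothing on a 1-D array
X_transformed = scaler.transform(X.T)  # shape (3, 1); the scaler must have been fitted on a single feature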