Standardization of a numpy array

Question

I am trying to standardize a numpy array of shape (M, N) so that each column has a mean of 0. I think I have applied the standardization formula correctly, where x is the random variable and z is the standardized version of x.

z = (x - mean(x)) / std(x)

But the column means of the resulting array are not 0. They are very small numbers, but not exactly zero. Any insight into my misunderstanding or mistake is welcome. Here is my code:

import numpy as np

X = np.load('data/filename.npy').astype('float')
XNormed = (X - np.mean(X, axis=0))/np.std(X, axis=0)
column_mean = np.mean(XNormed, axis=0)
print(column_mean)

Jose Avila · Accepted Answer

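Your code is correct and matches the formula in your question: it subtracts each column's mean and divides by each column's standard deviation, not by the range of the data. The column means come out as very small numbers instead of exactly 0 because floating-point arithmetic rounds at every step, so a residual near machine precision for float64 is expected and can be treated as zero. The same per-column standardization can also be written with the array methods, keeping axis=0 so that the mean and standard deviation are computed per column and not over the whole array:

XNormed = (X - X.mean(axis=0)) / X.std(axis=0)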
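
One way to check a standardized array is to compare the column means against 0 with a tolerance instead of expecting exact equality. The sketch below is only illustrative: the random data, the seed, the array shape, and the tolerance of 1e-12 are assumptions standing in for the contents of data/filename.npy, which are not shown in the question.

```python
import numpy as np

# Synthetic stand-in for the question's data (assumption: the real
# file data/filename.npy is not available here).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))

# Standardize each column: subtract the column mean, divide by the column std.
XNormed = (X - X.mean(axis=0)) / X.std(axis=0)

column_mean = XNormed.mean(axis=0)
column_std = XNormed.std(axis=0)

# Compare with a tolerance; exact equality with 0 is not expected for floats.
means_ok = np.allclose(column_mean, 0.0, atol=1e-12)
stds_ok = np.allclose(column_std, 1.0)
print(column_mean)
print(means_ok, stds_ok)
```

Note that np.allclose needs an explicit atol when comparing against 0, because its relative tolerance scales with the reference value and contributes nothing when that value is zero.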

Tags: python, numpy

Kajaree Das
