
NumPy: zero mean data and standardization

I saw in a tutorial (with no further explanation) that we can shift data to zero mean with x -= np.mean(x, axis=0) and normalize it with x /= np.std(x, axis=0). Can anyone elaborate on these two pieces of code? The only thing I got from the documentation is that np.mean calculates the arithmetic mean along a specified axis and np.std does the same for the standard deviation.

econ asked Aug 23 '17 08:08


People also ask

How do you normalize data to zero mean and unit variance?

You can determine the mean of the signal, and just subtract that value from all the entries. That will give you a zero mean result. To get unit variance, determine the standard deviation of the signal, and divide all entries by that value.
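As a minimal sketch of those two steps, using the exact expressions from the question: the array below is hypothetical data (rows as samples, columns as features), and axis=0 means the mean and standard deviation are computed per column.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) with 2 features (columns).
# The array must be float; the in-place operators below would raise
# a casting error on an integer array.
x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

x -= np.mean(x, axis=0)   # subtract each column's mean -> zero mean per column
x /= np.std(x, axis=0)    # divide by each column's std -> unit variance per column

col_means = x.mean(axis=0)  # approximately 0 for every column
col_stds = x.std(axis=0)    # approximately 1 for every column
```

Because the result of np.mean(x, axis=0) has one entry per column, broadcasting applies it to every row, so each feature is centered and scaled independently.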

How do I normalize data in NumPy?

To normalize a vector to unit length in NumPy, we can use the np.linalg.norm() function, which returns the vector's norm. Dividing each element of the array by that norm gives the normalized array.
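A short sketch of this norm-based scaling, with a made-up vector chosen so the numbers are easy to check. Note that this rescales the vector to length 1; it is a different operation from the zero mean, unit variance standardization asked about in the question.

```python
import numpy as np

v = np.array([3.0, 4.0])    # hypothetical example vector

norm = np.linalg.norm(v)    # Euclidean (L2) norm: sqrt(3**2 + 4**2) = 5.0
unit = v / norm             # [0.6, 0.8], a vector of length 1
```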

How do you find the mean and standard deviation in NumPy?

The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(x)), where x = abs(a - a.mean())**2. The average squared deviation is typically calculated as x.sum() / N, where N = len(x).
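The formula above can be checked by hand against np.std. This is only an illustration on a small made-up array; it assumes the default ddof=0, which is what divides by N rather than N - 1.

```python
import numpy as np

a = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # hypothetical data, mean 5.0

x = np.abs(a - a.mean()) ** 2       # squared deviations from the mean
N = len(x)
manual_std = np.sqrt(x.sum() / N)   # sqrt(32 / 8) = 2.0

numpy_std = np.std(a)               # same value with the default ddof=0
```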


1 Answer

This is also called the z-score.

SciPy has a utility for it:

    >>> from scipy import stats
    >>> stats.zscore([ 0.7972,  0.0767,  0.4383,  0.7866,  0.8091,
    ...                0.1954,  0.6307,  0.6599,  0.1065,  0.0508])
    array([ 1.1273, -1.247 , -0.0552,  1.0923,  1.1664, -0.8559,  0.5786,
            0.6748, -1.1488, -1.3324])
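For comparison, here is a sketch of the same computation with plain NumPy, using the two expressions from the question on the input array above. It assumes the defaults of stats.zscore (axis=0, ddof=0), which match the defaults of np.mean and np.std used here.

```python
import numpy as np

a = np.array([0.7972, 0.0767, 0.4383, 0.7866, 0.8091,
              0.1954, 0.6307, 0.6599, 0.1065, 0.0508])

# The two steps from the question, written without in-place operators:
# subtract the mean, then divide by the standard deviation.
z = (a - np.mean(a, axis=0)) / np.std(a, axis=0)

# z now has mean approximately 0 and standard deviation approximately 1.
```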
Jonas Adler answered Sep 21 '22 17:09