I have a 2D Numpy array, in which I want to normalise each column to zero mean and unit variance. Since I'm primarily used to C++, the method in which I'm doing is to use loops to iterate over elements in a column and do the necessary operations, followed by repeating this for all columns. I wanted to know about a pythonic way to do so.
Let class_input_data
be my 2D array. I can get the column mean as:
column_mean = numpy.sum(class_input_data, axis = 0)/class_input_data.shape[0]
I then subtract the mean from all columns by:
class_input_data = class_input_data - column_mean
By now, the data should be zero mean. However, the value of:
numpy.sum(class_input_data, axis = 0)
isn't equal to 0, implying that I have done something wrong in my normalisation. By isn't equal to 0, I don't mean very small numbers which can be attributed to floating point inaccuracies.
You can determine the mean of the signal, and just subtract that value from all the entries. That will give you a zero mean result. To get unit variance, determine the standard deviation of the signal, and divide all entries by that value.
You can normalize a NumPy array to a unit vector using the sklearn. normalize() method. When using the array of data in machine learning, you can only pass the normalized values to the algorithms to achieve better accuracy. A unit vector is a vector that has a magnitude of 1 .
To normalize a 2D-Array or matrix we need NumPy library. For matrix, general normalization is using The Euclidean norm or Frobenius norm. Here, v is the matrix and |v| is the determinant or also called The Euclidean norm. v-cap is the normalized matrix.
You can normalize data between 0 and 1 range by using the formula (data – np. min(data)) / (np. max(data) – np. min(data)) .
Something like:
import numpy as np
eg_array = 5 + (np.random.randn(10, 10) * 2)
normed = (eg_array - eg_array.mean(axis=0)) / eg_array.std(axis=0)
normed.mean(axis=0)
Out[14]:
array([ 1.16573418e-16, -7.77156117e-17, -1.77635684e-16,
9.43689571e-17, -2.22044605e-17, -6.09234885e-16,
-2.22044605e-16, -4.44089210e-17, -7.10542736e-16,
4.21884749e-16])
normed.std(axis=0)
Out[15]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With